In one of my last posts from a couple of weeks ago I wrote about querying gaps between non-contiguous date ranges in Oracle SQL. This week’s post is about contiguous date ranges.
While non-contiguous date ranges are best represented in a database table with a start_date
and an end_date
column, it is better to represent contiguous date ranges only by one date column, so that we avoid redundancy and do not have to keep the start date of a date range in sync with the end date of the previous date range. In this post I will use the start date:
CREATE TABLE date_ranges ( name VARCHAR2(100), start_date DATE );
The example content of the table is:
NAME START_DATE ---- ---------- A 05/02/2020 B 02/04/2020 C 16/04/2020 D 01/06/2020 E 21/06/2020 F 02/07/2020 G 05/08/2020
This representation means that the date range with the most recent start date does not have an end. The application using this data model can choose whether to interpret this as a date range with an open end or just as the end point for the previous range and not as a date range by itself.
While this is a nice non-redundant representation, it is less convenient for queries where we want to have both a start and an end date per row, for example in order to check wether a given date lies within a date range or not. Luckily, we can transform the ranges with a query:
SELECT date_ranges.*, LEAD(date_ranges.start_date) OVER (ORDER BY start_date) AS end_date FROM date_ranges;
As in the previous post on non-contiguous date ranges the LEAD analytic function allows you to access the following row from the current row without using a self-join. Here’s the result:
NAME START_DATE END_DATE ---- ---------- -------- A 05/02/2020 02/04/2020 B 02/04/2020 16/04/2020 C 16/04/2020 01/06/2020 D 01/06/2020 21/06/2020 E 21/06/2020 02/07/2020 F 02/07/2020 05/08/2020 G 05/08/2020 (null)
By using a WITH clause you can use this query like a view and join it with the another table, for example with the join condition that a date lies within a date range:
WITH ranges AS (SELECT date_ranges.*, LEAD(date_ranges.start_date) OVER (ORDER BY start_date) AS end_date FROM date_ranges) SELECT timeseries.*, ranges.name FROM timeseries LEFT OUTER JOIN ranges ON timeseries.measurement_date BETWEEN ranges.start_date AND ranges.end_date;