Complex SQL database queries often contain subqueries.
SELECT * FROM ...
WHERE name IN (SELECT name
These can quickly become unreadable, especially if multiple subqueries are involved. A nice way to organise such queries with multiple subqueries is called Common Table Expressions (CTE), or colloquially: “WITH queries”. They are supported by most contemporary SQL databases.
When using Common Table Expressions you list all the subqueries after a WITH keyword in front of the actual query and assign a name to each subquery:
<subquery1_name> AS (SELECT ...),
<subquery2_name> AS (SELECT ...),
[ ... ]
[ ... ]
You can now refer to these subqueries in the main query by their names. They are conceptually just like temporary views. You can also refer to a subquery in the main query multiple times, whereas if the subquery was inlined you would have to repeat it. A subquery defined as a Common Table Expression can also refer to the preceding subqueries in the subquery list.
A Common Table Expression can even refer to itself, which is a recursive definition. In some database systems you have to add the RECURSIVE keyword after WITH, in others you can leave it out. Here’s a recursive CTE that calculates factorial numbers:
WITH RECURSIVE factorials (n, fact) AS
(SELECT 0, 1
SELECT n+1, (n+1)*fact FROM factorials
WHERE n < 6)
SELECT * FROM factorials;
You could also use such a recursive query to descend into a tree hierarchy.
While the use cases for recursive queries are less frequent, I find the general concept of Common Table Expressions very useful to make complex queries more readable.
The number of sensors and other things that periodically collect data is ever growing. This advent of the internet of things (IoT) demands a way of storing and analyzing all this so-called time-series data. There are many options for such data – the most prominent being special time-series databases like InfluxDB or well suited, nicely scaling databases like Apache Cassandra.
The problem is you have to tailor your solution to one of these technologies whereas there is SQL with mature database management systems (DBMS) and drivers/bindings for almost any programming language.
Why not use a plain SQL database?
Relational SQL databases are a mature and well understood piece of technology albeit not as sexy as all those new NoSQL databases. Using them for time series data may not be a problem for smaller datasets but sooner or later your ingestion and query performance will degrade massivly. So in general it is not a good option to store all your time-series data in a traditional relational DBMS (RDBMS).