Half table, half view: Generated Columns

Anyone familiar with SQL database basics knows the two fundamental structures of relational databases: tables and views. Data is stored in tables, while views are virtual tables calculated on-the-fly from a SQL query. Additionally, relational database management systems often support materialized views, which, like views, are based on a query from other tables, but their results are actually persisted and only recalculated as needed.

What many don’t know is that the most common SQL databases (PostgreSQL, MySQL, Oracle) nowadays also support something in between: we’re talking about Generated Columns, which will be introduced in this blog post.

So, what are Generated Columns? Generated Columns are columns within a normal database table. But unlike regular columns, their values are not stored as independent individual values; rather, they are computed from other values in the table.

Below is an example of how to define a Generated Column. The example is for PostgreSQL, but the syntax is similar in other popular relational database systems that support this feature.

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
quantity INTEGER,
unit_price DECIMAL(10, 2),
total_price DECIMAL(10, 2) GENERATED ALWAYS
AS (quantity * unit_price) STORED
);

As seen above, a Generated Column is defined with the keywords GENERATED ALWAYS AS. The GENERATED ALWAYS is even optional (you could just write AS), but it clarifies what it’s about. Following AS is the expression that computes the value.

At the end of the column definition, either the keyword STORED or VIRTUAL can be used. In the example above, it says STORED, which means the computed value is physically stored on the disk. The value is recalculated and stored only after an INSERT or UPDATE. In contrast, with VIRTUAL, the value is not stored but always computed on-the-fly. Thus, virtual Generated Columns behave similarly to a view, while STORED is more comparable to a materialized view.

The choice between the two options depends on the specific requirements. Stored Generated Columns consume more disk space, while virtual Generated Columns save space at the expense of performance.

In the expression following AS, other columns of the table can be referenced. Even other Generated Columns can be referenced, as long as they are specified in the table definition before the current column. However, SQL subqueries cannot be used in the expression.

In conclusion, Generated Columns are a useful feature that combines parts of a table with the benefits of a view.

Time travel with Oracle database’s Flashback Queries

Oracle’s database management system offers a feature known as Flashback Queries, allowing users to peek into the past and retrieve data as it existed at a previous point in time. This functionality can eliminate the need for manual data restoration from backups, making it a useful asset for both developers and database administrators.

Enabling Flashback Queries

Before using Flashback Queries, ensure that the database has the required configuration. Firstly, confirm that the database’s DB_FLASHBACK_RETENTION_TARGET parameter is appropriately set. This parameter defines the period for which historical data is retained. Adjust it based on your organization’s data retention policies. Before making changes, you can check its current value:

SHOW PARAMETER DB_FLASHBACK_RETENTION_TARGET;

Use the ALTER SYSTEM command to set the parameter. For example, to set it to retain data for 7 days:

ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=604800 SCOPE=BOTH;

604800 is the retention period in seconds (7 days × 24 hours × 60 minutes × 60 seconds). Please note that setting the parameter to a higher value consumes more space in the flashback recovery area, so consider your storage constraints. SCOPE=BOTH ensures that the change persists across database restarts, i.e. it changes the value both in memory and in the server parameter file.

To enable Flashback Queries for a specific table, execute the ALTER TABLE command with the FLASHBACK option:

ALTER TABLE table_name FLASHBACK ARCHIVE;

This setup allows Oracle to maintain historical changes for the specified table.

Using Flashback Queries

Consider a scenario where an employee accidentally updates critical data in the employees table. With Flashback Queries, you can rectify the mistake:

SELECT * FROM employees AS OF TIMESTAMP TO_TIMESTAMP('2023-11-10 15:00:00', 'YYYY-MM-DD HH24:MI:SS');

This query retrieves the data from the employees table as it existed before the erroneous update.

You can also recover dropped tables if they are still within the retention period:

FLASHBACK TABLE orders TO BEFORE DROP;

This command restores the dropped table and its data.

Flashback Queries offer a mechanism to navigate through time within a database, providing a simple way to recover historical data or inspect changes. They stand as a useful asset in the arsenal of database administrators and developers, fostering greater confidence in managing data.

Materialized views in Oracle

Most relational database management systems (RDBMs) support not only views, but also materialized views. Materialized views and normal views are both database objects used to present data to users, but they work in different ways.

A database view is a virtual table or a named query that presents data from one or more tables in a specific way. They are not physical tables and do not store data directly, but instead retrieve data from the underlying tables based on a specified query.

Materialized views are similar to normal views, but they store pre-computed results. When a materialized view is created, the results of the underlying query are computed and stored in the database. One advantage of materialized views is faster query performance by avoiding the need to compute the same results repeatedly. This is especially useful for complex and time-consuming queries, as the results can be stored and accessed quickly.

The syntax for creating a normal view in an Oracle database is as follows:

CREATE VIEW view_name AS SELECT … FROM … WHERE …;

To create a materialized view instead of a normal view you add the MATERIALIZED keyword:

CREATE MATERIALIZED VIEW view_name AS SELECT … FROM … WHERE …;

When creating a materialized view you should think about and decide on three aspects of materialized views:

  1. the refresh method,
  2. the refresh interval,
  3. and the storage properties

The refresh method

The refresh method determines how the data in the materialized view is updated or refreshed to reflect changes in the base tables.

  • COMPLETE: This one completely rebuilds the materialized view from scratch. It drops the existing contents of the materialized view and then re-executes the query to populate it with fresh data. This method can be resource-intensive and slow, especially for large materialized views.
  • FAST: Updates only the rows in the materialized view that have changed since the last refresh. It uses the materialized view logs on the base tables to identify the changed rows and then applies the changes to the materialized view. It can be much faster than a complete refresh, especially if there are only a few changes to the data.
  • FORCE: Tries to perform a fast refresh if possible, but falls back to a complete refresh if necessary. This method is useful if you want to try to perform a fast refresh, but you’re not sure if it will be possible due to the nature of the data or the query.

You can specify the refresh method when creating the materialized view using the REFRESH keyword:

CREATE MATERIALIZED VIEW view_name
  REFRESH FAST
  AS SELECT ...;

If you do not specify a refresh mode FORCE is the default.

The refresh interval

The refresh interval controls how often the materialized view is automatically refreshed. It determines how frequently the materialized view is updated to reflect changes in the underlying data.

Some refresh interval options in Oracle are:

  • ON COMMIT: The materialized view is refreshed automatically every time a transaction that modifies the underlying data is committed. This interval is useful when you need to keep the materialized view up-to-date in near real-time.
  • ON DEMAND: The materialized view is refreshed only when you explicitly request a refresh using the DBMS_MVIEW.REFRESH.
  • START WITH … NEXT: With this interval, the materialized view is refreshed automatically at regular intervals. It is useful when you want to balance the need for up-to-date data with the resources required to refresh the materialized view.

You can specify the refresh interval when creating the materialized view by adding it to the REFRESH clause when creating the view:

CREATE MATERIALIZED VIEW view_name
  REFRESH FAST ON COMMIT
  AS SELECT ...;

The following materialized view gets refreshed every hour:

CREATE MATERIALIZED VIEW view_name
  REFRESH FAST START WITH SYSDATE NEXT SYSDATE + 1/24
  AS SELECT ...;

Storage properties

Storage properties affect how the data in the materialized view is stored and accessed. In Oracle, some of these are:

  • CACHE: The data is stored in the database buffer cache, which is a portion of memory used to cache frequently accessed data. It improves query performance by reducing disk I/O, but it can consume a significant amount of memory.
  • LOGGING: Changes to the materialized view data are logged in the database redo logs. This property ensures that changes to the materialized view can be recovered in case of a system failure but can result in additional overhead.
  • TABLESPACE: Allows you to specify the tablespace where the materialized view data is stored.

Again, you can specify these properties when creating the materialized view:

CREATE MATERIALIZED VIEW view_name
  CACHE
  LOGGING
  TABLESPACE tablespace_name
AS SELECT ... FROM ... WHERE ...;

Now you know the basics for creating materialized views in an Oracle database when needed. There is still more to learn about them. You can find the full reference here.

Pagination in SQL

Pagination is the task of dividing a data set into subsequent parts of the whole data set. For example, a search engine initially only shows the first 15 results for a search query. The user can then step through the rest of the results the by clicking a “Next” button.

Ideally this feature is also supported by the underlying database system. Otherwise, the application would have to load all matching data records from the database, just to filter out the major part of of them, because the user only wanted to see page 3 of 50. A pagination request has two components: a limit and an offset. If a page contains a maximum of 15 items and page 3 is requested, then the limit would be 15 and the offset would be 30 = (page-1) × limit.

PostgreSQL, MySQL, MariaDB

The database systems PostgreSQL, MySQL and MariaDB have a straight forward syntax for pagination: LIMIT {number} OFFSET {number} . So a simple SQL query with pagination might look like this:

SELECT * FROM users ORDER BY name LIMIT 15 OFFSET 30;

Oracle DB

Oracle DB didn’t have a dedicated syntax for pagination before Oracle 12c, but it was still possible to achieve the same result with other means. With Oracle 12c a new syntax for pagination was introduced under the name “Row limiting clause”. First I’ll show the old method, then the new syntax.

The old method is based on ROWNUM . If you wanted to specify both an offset and a limit, you had to nest multiple queries:

SELECT *
FROM (SELECT *, rownum AS rnum
      FROM (SELECT *
            FROM users
            ORDER BY name)
      WHERE rownum < 45)
WHERE rnum >= 30;

The newer row limiting clause syntax is shorter and looks as follows:

SELECT * FROM users ORDER BY name
  OFFSET 30 ROWS FETCH NEXT 15 ROWS ONLY;

This syntax also allows the option to specify a percentage of rows instead of a fixed number of rows:

SELECT * FROM users ORDER BY name
  FETCH FIRST 20 PERCENT ROWS ONLY;

MS SQL Server

Microsoft’s SQL Server also supports the Oracle-like syntax with OFFSET and FETCH clauses and recommends the usage of this syntax for pagination.