Scheduling Jobs in Oracle Database for Reliable Background Tasks

As a software developer, you often write code that must run reliably in the background: data cleanup, periodic recalculations, imports, exports, and batch processing. Pushing this responsibility to external scripts or application-level schedulers is one way to do it, but it can add complexity and increase failure points.

Oracle’s job scheduling feature lets you move this logic into the database, close to the data it operates on, using plain SQL and PL/SQL. The result is simpler code, fewer moving parts, and background tasks that run predictably without constant supervision.

The Scheduler

Oracle’s job scheduling functionality is provided by the DBMS_SCHEDULER package, which lets you create, run, and monitor jobs using plain PL/SQL.

A basic example is a daily cleanup task. Suppose you have a table called ORDERS and you want to remove records older than five years every night. First, you create a stored procedure that performs the cleanup.

CREATE OR REPLACE PROCEDURE cleanup_old_orders IS
BEGIN
  DELETE FROM orders
  WHERE order_date < ADD_MONTHS(SYSDATE, -60);
  COMMIT;
END;
/

Next, you create a scheduler job that runs this procedure every day at 2 a.m.:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name        => 'cleanup_old_orders_job',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'CLEANUP_OLD_ORDERS',
    start_date      => TIMESTAMP '2026-01-01 02:00:00',
    repeat_interval => 'FREQ=DAILY;BYHOUR=2;BYMINUTE=0;BYSECOND=0',
    enabled         => TRUE
  );
END;
/

Once enabled, Oracle runs this job automatically every night.
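
If you want to verify the job right away instead of waiting for the 2 a.m. run, DBMS_SCHEDULER also lets you trigger it manually. A quick sketch:

BEGIN
  DBMS_SCHEDULER.run_job('cleanup_old_orders_job');
END;
/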

Another common example is running a job at short, repeating intervals. For instance, you may want to refresh a summary table every 10 minutes:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name        => 'refresh_sales_summary_job',
    job_type        => 'PLSQL_BLOCK',
    job_action      => '
    BEGIN
      refresh_sales_summary;
    END;',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=10',
    enabled         => TRUE
  );
END;
/

The scheduler can also be used to run jobs only once. For example, you might need to perform a one-time data fix during a maintenance window:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name   => 'one_time_data_fix_job',
    job_type   => 'PLSQL_BLOCK',
    job_action => '
     BEGIN
       update_customer_status;
     END;',
    start_date => TIMESTAMP '2026-03-15 23:00:00',
    enabled    => TRUE
  );
END;
/

Required privileges

In addition to defining jobs, it is important to understand the required privileges. To create and manage scheduler jobs, a user needs the CREATE JOB privilege. To create jobs in another schema, CREATE ANY JOB is required. Running jobs that use DBMS_SCHEDULER also requires EXECUTE privilege on the DBMS_SCHEDULER package.

If a job runs a stored procedure, the job owner must have direct privileges on the objects used by that procedure, not privileges granted through roles. For example, if a job deletes rows from the ORDERS table, the job owner must have a direct DELETE grant on that table. For external jobs, additional privileges such as CREATE EXTERNAL JOB and specific operating system credentials are required.
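
As a rough sketch of what those grants could look like for a hypothetical APP_USER that owns the cleanup job (the user name is illustrative):

GRANT CREATE JOB TO app_user;
GRANT EXECUTE ON DBMS_SCHEDULER TO app_user;

-- Direct grant (not through a role) on the table the job's procedure modifies
GRANT DELETE ON orders TO app_user;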

These rules ensure that jobs run securely and only perform actions explicitly allowed by the database administrator.

Monitoring

For monitoring, Oracle DB stores job execution details in system views. You can check whether a job is enabled and when it last ran with a simple query:

SELECT job_name, enabled, last_start_date, last_run_duration
  FROM  dba_scheduler_jobs
  WHERE job_name = 'CLEANUP_OLD_ORDERS_JOB';

To see errors and execution history, you can query the job run details:

SELECT job_name, status, actual_start_date, run_duration, error#
  FROM  dba_scheduler_job_run_details
  WHERE job_name = 'CLEANUP_OLD_ORDERS_JOB'
  ORDER BY actual_start_date DESC;

These examples show how Oracle job scheduling works in practice. You define the SQL or PL/SQL logic, attach it to a schedule, and let the database handle execution and logging. This approach keeps automation close to the data, reduces manual intervention, and makes recurring tasks easier to manage and troubleshoot.

Common SQL Performance Gotchas in Application Development

When building apps that use a SQL database, it’s easy to run into performance problems without noticing. Many of these issues come from the way queries are written and used in the code. Below are seven common SQL mistakes developers make, why they happen, and how you can avoid them.

Not Using Prepared Statements

One of the most common mistakes is building SQL queries by concatenating strings. This approach not only introduces the risk of SQL injection but also prevents the database from reusing execution plans. Prepared statements or parameterized queries let the database understand the structure of the query ahead of time, which improves performance and security. They also help avoid subtle bugs caused by incorrect string formatting or escaping.

// Vulnerable and inefficient
String userId = "42";
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("SELECT * FROM users WHERE id = " + userId);

// Safe and performant
String sql = "SELECT * FROM users WHERE id = ?";
PreparedStatement ps = connection.prepareStatement(sql);
ps.setInt(1, 42);
ResultSet rs = ps.executeQuery();

The N+1 Query Problem

The N+1 problem happens when an application fetches a list of items and then runs a separate query for each item to retrieve related data. For example, fetching a list of users and then querying each user’s posts in a loop. This results in one query to fetch the users and N additional queries for their posts. The fix is to restructure the query using joins or batch-fetching strategies, so all the data can be retrieved in fewer queries.
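
As a sketch (assuming hypothetical users and posts tables where posts.user_id references users.id), the per-user queries can be collapsed into a single join:

-- One round-trip instead of 1 + N
SELECT u.id, u.name, p.id AS post_id, p.title
  FROM users u
  LEFT JOIN posts p ON p.user_id = u.id
 ORDER BY u.id;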

We have written about it on our blog before: Understanding, identifying and fixing the N+1 query problem

Missing Indexes

When queries filter or join on columns that do not have indexes, the database may need to scan entire tables to find matching rows. This can be very slow, especially as data grows. Adding the right indexes can drastically improve performance. It’s important to monitor slow queries and check whether indexes exist on the columns used in WHERE clauses, JOINs, and ORDER BY clauses.

Here’s how to create an index on an “orders” table for its “customer_id” column:

CREATE INDEX idx_orders_customer_id ON orders(customer_id);

Once the index is added, the query can efficiently find matching rows without scanning the full table.
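
To confirm that a query actually benefits, you can look at its execution plan. The exact command varies per database system; in PostgreSQL or MySQL it looks roughly like this:

EXPLAIN SELECT * FROM orders WHERE customer_id = 42;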

Retrieving Too Much Data

Using SELECT * to fetch all columns from a table is a common habit, but it often retrieves more data than the application needs. This can increase network load and memory usage. Similarly, not using pagination when retrieving large result sets can lead to long query times and a poor user experience. Always select only the necessary columns and use LIMIT or OFFSET clauses to manage result size.

For example:

String sql = "SELECT id, name, price FROM products LIMIT ? OFFSET ?";
PreparedStatement ps = connection.prepareStatement(sql);
ps.setInt(1, 50);
ps.setInt(2, 0);
ResultSet rs = ps.executeQuery();

Chatty Database Interactions

Some applications make many small queries in a single request cycle, creating high overhead from repeated database access. Each round-trip to the database introduces latency. Here’s an inefficient example:

for (int id : productIds) {
    PreparedStatement ps = connection.prepareStatement(
        "UPDATE products SET price = price * 1.1 WHERE id = ?"
    );
    ps.setInt(1, id);
    ps.executeUpdate();
}

Instead of issuing separate queries, it’s often better to combine them or use batch operations where possible. This reduces the number of database interactions and improves overall throughput:

PreparedStatement ps = connection.prepareStatement(
    "UPDATE products SET price = price * 1.1 WHERE id = ?"
);

for (int id : productIds) {
    ps.setInt(1, id);
    ps.addBatch();
}
ps.executeBatch();

Improper Connection Pooling

Establishing a new connection to the database for every query or request is slow and resource-intensive. Connection pooling allows applications to reuse database connections, avoiding the cost of repeatedly opening and closing them. Applications that do not use pooling efficiently may suffer from connection exhaustion or high latency under load. To avoid this, use a connection pooler and configure it with limits appropriate for your workload.

Unbounded Wildcard Searches

Wildcard searches with a leading wildcard, such as LIKE '%term%', prevent the database from using a regular index, so it typically has to scan the entire table. These searches are expensive and scale poorly. To handle partial matches more efficiently, consider using the full-text search features provided by the database, which are designed for fast text searching. Here’s an example in PostgreSQL:

SELECT * FROM articles
WHERE to_tsvector('english', title) @@ to_tsquery('database');

One of our previous blog posts dives deeper into this topic: Full-text Search with PostgreSQL.

By being mindful of these common pitfalls, you can write SQL that scales well and performs reliably under load. Good database performance isn’t just about writing correct queries – it’s about writing efficient ones.

Have you faced any of these problems before? Every project is different, and we all learn a lot from the challenges we run into. Feel free to share your experiences or tips in the comments. Your story could help someone else improve their app’s performance too.

What are GIN Indexes in PostgreSQL?

If you’ve worked with PostgreSQL and dealt with things like full-text search, arrays, or JSON data, you might have heard about GIN indexes. But what exactly are they, and why are they useful?

GIN stands for Generalized Inverted Index. Most indexes (like the default B-tree index) work best when there’s one clear value per row – like a number or a name. But sometimes, a single column can hold many values. Think of a column that stores a list of tags, words in a document, or key-value data in JSON. That’s where GIN comes in.

Let’s walk through a few examples to see how GIN indexes work and why they’re helpful.

Full-text search example

Suppose you have a table of articles:

CREATE TABLE articles (
  id    serial PRIMARY KEY,
  title text,
  body  text
);

You want to let users search the content of these articles. PostgreSQL has built-in support for full-text search, which works with a special data type called tsvector. To get started, you’d add a column to store this processed version of your article text:

ALTER TABLE articles ADD COLUMN tsv tsvector;
UPDATE articles SET tsv = to_tsvector('english', body);
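
Keep in mind that the UPDATE above fills the column only once; new or edited articles won’t be reflected automatically. A common alternative (assuming PostgreSQL 12 or newer) is to define tsv as a generated column instead of a plain column, sketched here:

ALTER TABLE articles
  ADD COLUMN tsv tsvector
  GENERATED ALWAYS AS (to_tsvector('english', body)) STORED;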

Now, to speed up searches, you create a GIN index:

CREATE INDEX idx_articles_tsv ON articles USING GIN(tsv);

With that in place, you can search for articles quickly:

SELECT * FROM articles WHERE tsv @@ to_tsquery('tonic & water');

This finds all articles that contain both “tonic” and “water”, and thanks to the GIN index, it’s fast – even if you have thousands of articles.

Array example

GIN is also great for columns that store arrays. Let’s say you have a table of photos, and each photo can have several tags:

CREATE TABLE photos (
  id   serial PRIMARY KEY,
  tags text[]
);

You want to find all photos tagged with “capybara”. You can create a GIN index on the tags column:

CREATE INDEX idx_photos_tags ON photos USING GIN(tags);
SELECT * FROM photos WHERE tags @> ARRAY['capybara'];

(The @> operator means “contains” or “is a superset of”.)

The index lets PostgreSQL find matching rows quickly, without scanning the entire table.

JSONB example

PostgreSQL’s jsonb type lets you store flexible key-value data. Imagine a table of users with extra info stored in a jsonb column:

CREATE TABLE users (
  id   serial PRIMARY KEY,
  data jsonb
);

One row might store {"age": 42, "city": "Karlsruhe"}. To find all users from Karlsruhe, you can use:

SELECT * FROM users WHERE data @> '{"city": "Karlsruhe"}';

And again, with a GIN index on the data column, this query becomes much faster:

CREATE INDEX idx_users_data ON users USING GIN(data);

Things to keep in mind

GIN indexes are very powerful, but they come with some tradeoffs. They’re slower to build and can make insert or update operations a bit heavier. So they’re best when you read (search) data often, but don’t write to the table constantly.

In short, GIN indexes are your friend when you’re dealing with columns that contain multiple values – like arrays, full-text data, or JSON. They let PostgreSQL break apart those values and build a fast lookup system. If your queries feel slow and you’re working with these kinds of columns, adding a GIN index might be exactly what you need.

Deferred Constraints in Oracle DB

Foreign key constraints are like rules in your Oracle database that make sure data is linked properly between tables. For example, you can’t add an order for a customer who doesn’t exist – that’s the kind of thing a foreign key will stop. They help enforce data integrity by ensuring that relationships between tables remain consistent. But hidden in the toolbox of Oracle Database is a lesser-known trick: deferred foreign key constraints.

What Are Deferred Constraints?

By default, when you insert or update data that violates a foreign key constraint, Oracle will throw an error immediately. That’s immediate constraint checking.

But with deferred constraints, Oracle lets you temporarily violate a constraint during a transaction – as long as the constraint is satisfied by the time the transaction is committed.

Here’s how you make a foreign key deferrable:

ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id)
  REFERENCES customers(customer_id)
  DEFERRABLE INITIALLY DEFERRED;

That last part – DEFERRABLE INITIALLY DEFERRED – is the secret sauce. Now, the constraint check for fk_orders_customer is deferred until the COMMIT.
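
As a small sketch of what this buys you (column names such as order_id and name are assumed for illustration), the child row can now be inserted before its parent within a single transaction:

INSERT INTO orders (order_id, customer_id) VALUES (1001, 500);  -- customer 500 does not exist yet
INSERT INTO customers (customer_id, name) VALUES (500, 'Acme');
COMMIT;  -- the foreign key is only checked here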

Use Cases

Let’s look at a few situations where this is really helpful.

One use case is circular references between tables. Say you have two tables: one for employees, one for departments. Each employee belongs to a department. But each department also has a manager – who is an employee. You end up in a “chicken and egg” situation. Which do you insert first? With deferred constraints, it doesn’t matter – you can insert them in any order, and Oracle will only check everything after you’re done.

Another use case is the bulk import of data. If you’re importing a bunch of data (like copying from another system), it can be really hard to insert things in the perfect order to keep all the foreign key rules happy. Deferred constraints let you just insert everything, then validate it all at the end with one COMMIT.

Deferred constraints also help when dealing with temporary incomplete data: Let’s say your application creates a draft invoice before all the customer info is ready. Normally, this would break a foreign key rule. But if the constraint is deferred, Oracle gives you time to finish adding all the pieces before checking.

Caution

Using deferred constraints recklessly can lead to runtime surprises. Imagine writing a huge batch job that appears to work fine… until it crashes at COMMIT with a constraint violation error – rolling back the entire transaction. So only defer constraints when you really need to.

One last tip

If you want to check if a constraint is deferrable in your database you can use the following SQL query:

SELECT constraint_name, deferrable, deferred
  FROM user_constraints
 WHERE table_name='ORDERS';

Inline and Implicit Foreign Key Constraints in SQL

Foreign key constraints are a key part of database design, ensuring that relationships between tables are consistent and reliable. They create a relationship between two tables, ensuring that data matches across them. For example, a column in an “Orders” table (like CustomerID) might reference a column in a “Customers” table. This guarantees that every order belongs to a valid customer.

In earlier versions of SQL systems, defining foreign key constraints often required separate ALTER TABLE statements after the table was created:

CREATE TABLE Orders (
  OrderID    INT PRIMARY KEY,
  CustomerID INT NOT NULL,
  OrderDate  DATE
);

ALTER TABLE Orders
ADD CONSTRAINT FK_Customer FOREIGN KEY (CustomerID)
REFERENCES Customers(CustomerID);

This two-step process was prone to errors and required careful management to ensure all constraints were applied correctly.

Inline Foreign Key Constraints

Most of the popular SQL database systems – PostgreSQL, Oracle, SQL Server, and MySQL since version 9.0, released in July 2024 – now support inline foreign key constraints. This means you can define the relationship directly in the column definition, making table creation easier to read:

CREATE TABLE Orders (
  OrderID    INT PRIMARY KEY,
  CustomerID INT NOT NULL REFERENCES Customers(CustomerID),
  OrderDate  DATE
);

Fortunately, this syntax is the same across these systems. However, MySQL 9 additionally supports implicit foreign key constraints:

CREATE TABLE Orders (
  OrderID    INT PRIMARY KEY,
  CustomerID INT NOT NULL REFERENCES Customers,
  OrderDate  DATE
);

By leaving out the (CustomerID) in the REFERENCES clause, MySQL assumes that you want to reference the primary key of the parent table. This shorthand is not portable across all database systems, so you should avoid it if you need to write SQL DDL statements that work on more than one system.

Dependent Subqueries and LATERAL in PostgreSQL

When working with databases, you often need to run a query that depends on results from another query. PostgreSQL offers two main ways to handle this: Dependent Subqueries and LATERAL joins.

A dependent subquery is like a mini-query inside another query. The inner query (subquery) depends on the outer query for its input. For every row in the outer query, the subquery runs separately.

Imagine you have two tables: customers, which holds information about customers (e.g., their id and name), and orders, which holds information about orders (e.g., customer_id, order_date, and amount). Now, you want to find the latest order date for each customer. You can use a dependent subquery like this:

SELECT id AS customer_id,
  name,
  (SELECT order_date
     FROM orders
    WHERE customer_id=customers.id
    ORDER BY order_date DESC
    LIMIT 1) AS latest_order_date
FROM customers;

For each customer in the customers table, the subquery looks for their orders in the orders table. The subquery sorts the orders by date (ORDER BY order_date DESC) and picks the most recent one (LIMIT 1).

This works, but it has a drawback: If you have many customers, this approach can be slow because the subquery runs once for every customer.

LATERAL join

A LATERAL join is a smarter way to solve the same problem. It allows you to write a subquery that depends on the outer query, but instead of repeating the subquery for every row, PostgreSQL handles it more efficiently.

Let’s solve the “latest order date” problem using LATERAL:

SELECT
  c.id AS customer_id,
  c.name,
  o.order_date AS latest_order_date
FROM customers c
LEFT JOIN LATERAL (
  SELECT order_date
   FROM orders
  WHERE orders.customer_id=c.id
  ORDER BY order_date DESC
  LIMIT 1
) o ON TRUE;

For each customer (c.id), the LATERAL subquery finds the latest order in the orders table. The LEFT JOIN ensures that customers with no orders are included in the results, with NULL for the latest_order_date.

It’s easier to read and often faster, especially for large datasets, because PostgreSQL optimizes it better.

Both dependent subqueries and LATERAL joins allow you to handle scenarios where one query depends on the results of another. Dependent subqueries are straightforward and good enough for simple tasks with small datasets. However, you should consider using LATERAL for large datasets where performance matters.

Oracle DB: How to Pick the Right Function for Current Date and Time


When working with date and time in Oracle, you have several functions available to get the current date and time. Three important ones are CURRENT_DATE, CURRENT_TIMESTAMP, and SYSDATE. Let’s see how they are different and when you should use each one.

The CURRENT_DATE function gives you the current date and time based on the time zone of the session you are in. It returns a DATE value, which includes the date and time down to the second but carries no fractional seconds or time zone information. For example, if you run:

SELECT CURRENT_DATE FROM dual;

You might get a result like 29-JUL-24 03:43:19 PM. This shows the current date and time according to your session’s time zone.

You can set the session’s time zone to a specific offset from UTC. For example, to set the time zone to UTC+5:30:

ALTER SESSION SET TIME_ZONE = '+05:30';

Use CURRENT_DATE when you need the date and time for tasks that are specific to a certain time zone. It’s good for simple reports or calculations that don’t need to worry about fractions of a second or different time zones.

The CURRENT_TIMESTAMP function provides more detail. It gives you the current date and time, including fractions of a second and the time zone, as a TIMESTAMP WITH TIME ZONE value. For example, if you run:

SELECT CURRENT_TIMESTAMP FROM dual;

You might see something like 29-JUL-24 03.43.19.123456 PM +01:00. This includes the date, time, fractions of a second, and the time zone offset.

Use CURRENT_TIMESTAMP when you need precise time details, such as for logging events, tracking changes, or working across different time zones. It’s useful when you need to know the exact time down to the fraction of a second and the time zone.

The SYSDATE function gives you the current date and time from the database server’s clock. It’s similar to CURRENT_DATE in that it returns the date and time up to the second but not fractions of a second or time zone information. For example, if you run:

SELECT SYSDATE FROM dual;

You might get 29-JUL-24 03:43:19 PM. This shows the current date and time based on the server’s clock.

Use SYSDATE when you need the current date and time according to the server, not the session. This is helpful for server-side operations, scheduling tasks, and ensuring consistency across database operations that rely on the server’s time.
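
To see the difference in your own environment, you can simply select all three in one statement and compare the results (the output depends on your session and server time zones):

SELECT CURRENT_DATE, CURRENT_TIMESTAMP, SYSDATE FROM dual;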

With this information, you should now be able to choose the right function for your use case.

Defeating TimescaleDB or shooting yourself in the foot

Today, almost everything produces data periodically: your car, cloud-connected smart-home devices, your “smart” refrigerator, and you yourself via a fitness tracker. This data has to be stored in a suitable way to be usable: we want fast access, beautiful charts, and different aggregations over time.

There are several options, both free and commercial, for storing such time-series data, like InfluxDB, Prometheus, or TimescaleDB. They are optimised for this kind of data, taking advantage of several time-series properties:

  • Storing is (mostly) append-only
  • Data points have a (strict) ordering in time
  • Data points often have fixed and variable metadata in addition to the data value itself

In one of our projects we have to store large quantities of data every 200 ms. Knowing PostgreSQL as a powerful relational database management system, we opted for TimescaleDB, which promises time-series features on top of an SQL database.

Ingestion of the data, probably the biggest problem for traditional (SQL) databases, worked as expected from the beginning. We were able to insert new data at a constant rate without degrading performance.

The problem

However, after some time we ran into severe performance problems when querying the data…

So one of the key selling points of a time-series database strangely did not work for us… Fortunately, since Timescale is only an extension to PostgreSQL, we could just use EXPLAIN / EXPLAIN ANALYZE on our slow queries to find out what was going on.

It directly showed us that *all* chunks of our hypertable were queried instead of only the ones containing the data we were looking for. Checking our setup and hypertable definition showed that Timescale itself was working as expected and chunking the data correctly on the time axis.

After a bit more analysis and thought it was clear: the problem was our query! We used something like start_time <= to AND from <= end_time in our WHERE clause to denote the time interval containing the requested data. Here start_time is the partitioning column of our hypertable.

This way we forced the database to look at every chunk from the very beginning of the data up to our end timestamp, because the condition start_time <= to matches all older chunks as well.

The solution

OK, so we had to reformulate our WHERE clause so that only the relevant chunks are queried. Timescale can easily do this when we write something like start_time >= from AND start_time < to, where from and to are the start and end of the desired interval. That way usually only one or a few chunks of the hypertable are searched, and everything is lightning-fast even with billions of rows in the hypertable.
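
As a sketch, assuming a hypothetical hypertable measurements partitioned on start_time (with :from and :to standing in for the interval bounds), the difference looks like this:

-- Slow: start_time <= :to matches every chunk from the beginning of the data
SELECT * FROM measurements
 WHERE start_time <= :to AND end_time >= :from;

-- Fast: chunk exclusion on the partitioning column skips irrelevant chunks
SELECT * FROM measurements
 WHERE start_time >= :from AND start_time < :to;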

Of course the two queries are not 100% semantically equivalent. But you can easily achieve the desired results by correctly calculating the start and end of the time range to fetch.

Conclusion

Time-series databases are powerful tools for today’s data requirements. As with other tools, you need some understanding of the underlying concepts, and sometimes even of the implementation, to avoid accidentally defeating the purpose and features of the tool.

Used in the correct way, they enable you to work with large amounts of data with the performance today’s users expect. We have had similar experiences with full-text search engines like Solr and Elasticsearch, too.

Time Intervals in SQL

SQL not only supports date and timestamp data types but also time intervals.

An interval in SQL represents a period of time, such as days, hours, minutes, or seconds. SQL supports intervals in two main types: INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND.

The two types

YEAR TO MONTH is used to represent a span of years and months. Examples include calculating ages, durations of contracts, or project timelines.

DAY TO SECOND is used to represent a span of days, hours, minutes, and seconds. Examples include scheduling events, logging activities, and tracking precise durations.

-- Represents 1 year and 6 months
INTERVAL '1-6' YEAR TO MONTH

-- Represents 5 days, 12 hours, 30 minutes, and 45 seconds
INTERVAL '5 12:30:45' DAY TO SECOND

This distinction may seem arbitrary at first, but it aligns with real-world use cases. An interval of “2 months” is clearly different from “60 days,” even though they might seem similar. The former is handled as a “YEAR TO MONTH” interval, and the latter as a “DAY TO SECOND” interval, making it clear how operations should be performed. Storing only the relevant units also reduces the amount of space required. For instance, storing a “YEAR TO MONTH” interval doesn’t need to allocate space for hours or seconds, thus saving storage.

Despite its name a “DAY TO SECOND” interval can also include fractional seconds:

INTERVAL '1 12:30:45.123' DAY TO SECOND

There are also shorter forms of interval literals if you only want to use one or several units of the interval:

For “YEAR TO MONTH”:

INTERVAL '3' YEAR   -- 3 years
INTERVAL '5' MONTH  -- 5 months

For “DAY TO SECOND”:

INTERVAL '2' DAY                -- 2 days
INTERVAL '5:30' HOUR TO MINUTE  -- 5 hours and 30 minutes
INTERVAL '15' MINUTE            -- 15 minutes
INTERVAL '30' SECOND            -- 30 seconds
INTERVAL '45.678' SECOND        -- 45.678 seconds

Note that the syntax always uses the singular forms of the unit words.

The two types can be used as data types for table columns:

CREATE TABLE interval_examples (
  id NUMBER,
  a INTERVAL YEAR TO MONTH,
  b INTERVAL DAY TO SECOND
);

Calculations with intervals

Intervals can be added to or subtracted from dates to calculate future or past dates.

-- Adding 10 days to the current date
CURRENT_DATE + INTERVAL '10' DAY

-- Subtracting 3 months from a specific date
DATE '2024-06-14' - INTERVAL '3' MONTH

It’s important to understand that intervals of type YEAR TO MONTH do not have a fixed duration. The number of days added to CURRENT_TIMESTAMP when you add an INTERVAL ‘1’ MONTH depends on the current month and year, as different months have different numbers of days.
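
A quick illustration (PostgreSQL-style; the dates are chosen arbitrarily): adding one month can mean adding 29 days or 31 days, depending on where you start:

SELECT DATE '2024-02-01' + INTERVAL '1' MONTH;  -- 2024-03-01, i.e. 29 days later
SELECT DATE '2024-03-01' + INTERVAL '1' MONTH;  -- 2024-04-01, i.e. 31 days later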

Intervals can also be compared to each other with the usual operators such as =, <, <=, >, >=, <>.

Combining both types

Suppose you need to add an interval of 1 year, 2 months, 3 days, 4 hours, 5 minutes, and 6 seconds to the current timestamp. In this case you need to combine intervals of the two types:

SELECT (CURRENT_TIMESTAMP + INTERVAL '1-2' YEAR TO MONTH) + INTERVAL '3 04:05:06' DAY TO SECOND AS result;

Notes on PostgreSQL

PostgreSQL supports the standard interval syntax shown above, but it also offers its own unified syntax for interval literals:

INTERVAL '1 year 2 months'
INTERVAL '10 days 12 hours 30 minutes 45.123 seconds'
INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds'

In fact, in PostgreSQL, YEAR TO MONTH and DAY TO SECOND are not distinct data types of their own; they are specializations of the broader INTERVAL data type.

It also supports multiplication and division of intervals:

-- Results in an interval of 6 days
SELECT INTERVAL '2 days' * 3 AS multiplied_interval;

-- Results in an interval of 2 hours
SELECT INTERVAL '3 hours' / 1.5 AS divided_interval;

Unveiling the secrets of invisible database columns

After my last blog post, where I wrote about Generated and Virtual Columns, I would like to dedicate this post to another type of database column: Invisible Columns. This feature exists in MySQL since version 8.0 and in Oracle Database since version 12c. PostgreSQL and MS SQL Server do not support this feature.

Invisible columns, as the name suggests, are columns within a table that are hidden from standard query results by default. Unlike traditional columns that are visible and accessible in query results, invisible columns are not included unless explicitly specified in the query.

This feature provides a level of control over data visibility, allowing developers to hide certain columns from applications or other database users while still retaining their functionality within the database.

Defining invisible columns

When creating a table in MySQL or Oracle, you can designate certain columns as invisible by using the INVISIBLE keyword in the column definition. For example:

CREATE TABLE your_table (
  visible_column   INT,
  invisible_column INT INVISIBLE
);

In this example, the invisible_column is marked as invisible, while the visible_column remains visible by default. To alter an existing table and make a column invisible:

ALTER TABLE your_table
  MODIFY COLUMN existing_column_name INVISIBLE;

Replace your_table with the name of your table and existing_column_name with the name of the column you want to make invisible.

When querying your_table, the invisible column will not be included in the result set unless explicitly specified:

SELECT * FROM your_table;

visible_column
--------------
   4
   8
  15

By default, invisible columns are hidden from query results, providing a cleaner and more concise view of the data. However, developers can still access invisible columns when needed by explicitly including them in the query:

SELECT visible_column, invisible_column FROM your_table;

visible_column | invisible_column
---------------------------------
   4           |   16
   8           |   23
  15           |   42
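
The same rule applies when inserting data: an INSERT without a column list only covers the visible columns, so an invisible column receives a value only if you name it explicitly. A short sketch:

-- Works: a value is expected for the visible column only
INSERT INTO your_table VALUES (4);

-- The invisible column must be named explicitly to receive a value
INSERT INTO your_table (visible_column, invisible_column) VALUES (4, 16);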

Unveiling invisible columns

To list the invisible columns of a table in MySQL, you can query the information_schema.columns system view and filter on the EXTRA column, which is marked with INVISIBLE for invisible columns. Here’s a simple SQL query to accomplish this:

SELECT COLUMN_NAME
  FROM information_schema.columns
  WHERE TABLE_SCHEMA = 'your_database'
    AND TABLE_NAME = 'your_table'
    AND EXTRA LIKE '%INVISIBLE%';

In Oracle, you can query the USER_TAB_COLUMNS or ALL_TAB_COLUMNS data dictionary views to list the invisible columns of a table. Here’s how you can do it:

SELECT COLUMN_NAME
  FROM USER_TAB_COLUMNS
  WHERE TABLE_NAME = 'your_table'
    AND INVISIBLE = 'YES';

If you want to list invisible columns from all tables in the current schema, you can use the ALL_TAB_COLUMNS view instead:

SELECT TABLE_NAME, COLUMN_NAME
  FROM ALL_TAB_COLUMNS
  WHERE INVISIBLE = 'YES';

Are invisible columns actually useful?

Invisible columns can make schema evolution easier by providing a flexible mechanism for evolving database schemas over time without disrupting existing applications or queries. You can test new features or data structures without committing to them fully. Invisible columns provide a way to add experimental columns to your tables without exposing them to production environments until they are fully tested and ready for use.

They can create cleaner and more concise views of your data by hiding less relevant columns. This can make it easier for developers, analysts, and users to work with the data without unnecessary clutter. However, I would argue that this is also achievable with normal database views.

The downside of introducing invisible columns is that they add complexity to the database schema, which can make it harder to understand and maintain, especially for developers who are not familiar with the invisible columns feature. They also add potential for confusion: Developers may forget about the presence of invisible columns, leading to unexpected behavior in queries or applications.

You probably shouldn’t use them to hide sensitive data, since invisible columns don’t have any additional access control, and security through obscurity is not a good idea. If you grant SELECT permission on the table to a user, they will be able to query visible and invisible columns alike.

Now that you know about them, you can make your own choice.