Scheduling Jobs in Oracle Database for Reliable Background Tasks

As a software developer, you often write code that must run reliably in the background: data cleanup, periodic recalculations, imports, exports, and batch processing. Pushing this responsibility to external scripts or application-level schedulers is one way to do it, but it can add complexity and increase failure points.

Oracle’s job scheduling feature lets you move this logic into the database, close to the data it operates on, using plain SQL and PL/SQL. The result is simpler code, fewer moving parts, and background tasks that run predictably without constant supervision.

The Scheduler

This feature is provided by the DBMS_SCHEDULER package.

A basic example is a daily cleanup task. Suppose you have a table called ORDERS and you want to remove records older than five years every night. First, you create a stored procedure that performs the cleanup.

CREATE OR REPLACE PROCEDURE cleanup_old_orders IS
BEGIN
  DELETE FROM orders
  WHERE order_date < ADD_MONTHS(SYSDATE, -60);
  COMMIT;
END;
/

Next, you create a scheduler job that runs this procedure every day at 2 a.m.:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name        => 'cleanup_old_orders_job',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'CLEANUP_OLD_ORDERS',
    start_date      => TIMESTAMP '2026-01-01 02:00:00',
    repeat_interval => 'FREQ=DAILY;BYHOUR=2;BYMINUTE=0;BYSECOND=0',
    enabled         => TRUE
  );
END;
/

Once enabled, Oracle runs this job automatically every night.
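While developing and testing a job, it is handy to trigger and manage it manually. Here is a quick sketch using the job name from the example above:

BEGIN
  DBMS_SCHEDULER.run_job('cleanup_old_orders_job');   -- run it immediately
  DBMS_SCHEDULER.disable('cleanup_old_orders_job');   -- pause the schedule
  DBMS_SCHEDULER.enable('cleanup_old_orders_job');    -- resume it
  -- DBMS_SCHEDULER.drop_job('cleanup_old_orders_job');  -- remove it entirely
END;
/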

Another common example is running a job at short, repeating intervals. For instance, you may want to refresh a summary table every 10 minutes:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name        => 'refresh_sales_summary_job',
    job_type        => 'PLSQL_BLOCK',
    job_action      => '
    BEGIN
      refresh_sales_summary;
    END;',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=10',
    enabled         => TRUE
  );
END;
/

The scheduler can also be used to run jobs only once. For example, you might need to perform a one-time data fix during a maintenance window:

BEGIN
  DBMS_SCHEDULER.create_job (
    job_name   => 'one_time_data_fix_job',
    job_type   => 'PLSQL_BLOCK',
    job_action => '
     BEGIN
       update_customer_status;
     END;',
    start_date => TIMESTAMP '2026-03-15 23:00:00',
    enabled    => TRUE
  );
END;
/

Note that the create_job parameter auto_drop defaults to TRUE, so a one-time job like this is removed automatically after it has run.

Required privileges

In addition to defining jobs, it is important to understand the required privileges. To create and manage scheduler jobs, a user needs the CREATE JOB privilege. To create jobs in another schema, CREATE ANY JOB is required. Running jobs that use DBMS_SCHEDULER also requires EXECUTE privilege on the DBMS_SCHEDULER package.

If a job runs a stored procedure, the job owner must have direct privileges on the objects used by that procedure, not privileges granted through roles. For example, if a job deletes rows from the ORDERS table, the job owner must have a direct DELETE grant on that table. For external jobs, additional privileges such as CREATE EXTERNAL JOB and specific operating system credentials are required.
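As a sketch, the grants for a user who should own and run the cleanup job from the beginning of this post might look like this (the user and schema names are assumptions):

GRANT CREATE JOB TO batch_user;
GRANT EXECUTE ON SYS.DBMS_SCHEDULER TO batch_user;

-- Direct grant, not via a role, so the job's procedure may delete:
GRANT DELETE ON shop.orders TO batch_user;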

These rules ensure that jobs run securely and only perform actions explicitly allowed by the database administrator.

Monitoring

For monitoring, Oracle stores job execution details in data dictionary views. Depending on your privileges, use the DBA_, ALL_, or USER_ variants of these views. You can check whether a job is enabled and when it last ran with a simple query:

SELECT job_name, enabled, last_start_date, last_run_duration
  FROM  dba_scheduler_jobs
  WHERE job_name = 'CLEANUP_OLD_ORDERS_JOB';

To see errors and execution history, you can query the job run details:

SELECT job_name, status, actual_start_date, run_duration, error#
  FROM  dba_scheduler_job_run_details
  WHERE job_name = 'CLEANUP_OLD_ORDERS_JOB'
  ORDER BY actual_start_date DESC;

These examples show how Oracle job scheduling works in practice. You define the SQL or PL/SQL logic, attach it to a schedule, and let the database handle execution and logging. This approach keeps automation close to the data, reduces manual intervention, and makes recurring tasks easier to manage and troubleshoot.

Common SQL Performance Gotchas in Application Development

When building apps that use a SQL database, it’s easy to run into performance problems without noticing. Many of these issues come from the way queries are written and used in the code. Below are seven common SQL mistakes developers make, why they happen, and how you can avoid them.

Not Using Prepared Statements

One of the most common mistakes is building SQL queries by concatenating strings. This approach not only introduces the risk of SQL injection but also prevents the database from reusing execution plans. Prepared statements or parameterized queries let the database understand the structure of the query ahead of time, which improves performance and security. They also help avoid subtle bugs caused by incorrect string formatting or escaping.

// Vulnerable and inefficient
String userId = "42";
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery("SELECT * FROM users WHERE id = " + userId);

// Safe and performant
String sql = "SELECT * FROM users WHERE id = ?";
PreparedStatement ps = connection.prepareStatement(sql);
ps.setInt(1, 42);
ResultSet rs = ps.executeQuery();

The N+1 Query Problem

The N+1 problem happens when an application fetches a list of items and then runs a separate query for each item to retrieve related data. For example, fetching a list of users and then querying each user’s posts in a loop. This results in one query to fetch the users and N additional queries for their posts. The fix is to restructure the query using joins or batch-fetching strategies, so all the data can be retrieved in fewer queries.
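As a sketch of the join-based fix, assuming the usual users and posts tables with a foreign key from posts to users, one query replaces the N follow-up queries:

SELECT u.id, u.name, p.id AS post_id, p.title
  FROM users u
  LEFT JOIN posts p ON p.user_id = u.id;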

We have written about it on our blog before: Understanding, identifying and fixing the N+1 query problem

Missing Indexes

When queries filter or join on columns that do not have indexes, the database may need to scan entire tables to find matching rows. This can be very slow, especially as data grows. Adding the right indexes can drastically improve performance. It’s important to monitor slow queries and check whether indexes exist on the columns used in WHERE clauses, JOINs, and ORDER BY clauses.

Here’s how to create an index on an “orders” table for its “customer_id” column:

CREATE INDEX idx_orders_customer_id ON orders(customer_id);

Once the index is added, the query can efficiently find matching rows without scanning the full table.

Retrieving Too Much Data

Using SELECT * to fetch all columns from a table is a common habit, but it often retrieves more data than the application needs. This can increase network load and memory usage. Similarly, not using pagination when retrieving large result sets can lead to long query times and a poor user experience. Always select only the necessary columns and use LIMIT or OFFSET clauses to manage result size.

For example:

String sql = "SELECT id, name, price FROM products LIMIT ? OFFSET ?";
PreparedStatement ps = connection.prepareStatement(sql);
ps.setInt(1, 50);
ps.setInt(2, 0);
ResultSet rs = ps.executeQuery();

Chatty Database Interactions

Some applications make many small queries in a single request cycle, creating high overhead from repeated database access. Each round-trip to the database introduces latency. Here’s an inefficient example:

for (int id : productIds) {
    PreparedStatement ps = connection.prepareStatement(
        "UPDATE products SET price = price * 1.1 WHERE id = ?"
    );
    ps.setInt(1, id);
    ps.executeUpdate();
}

Instead of issuing separate queries, it’s often better to combine them or use batch operations where possible. This reduces the number of database interactions and improves overall throughput:

PreparedStatement ps = connection.prepareStatement(
    "UPDATE products SET price = price * 1.1 WHERE id = ?"
);

for (int id : productIds) {
    ps.setInt(1, id);
    ps.addBatch();
}
ps.executeBatch();

Improper Connection Pooling

Establishing a new connection to the database for every query or request is slow and resource-intensive. Connection pooling allows applications to reuse database connections, avoiding the cost of repeatedly opening and closing them. Applications that do not use pooling efficiently may suffer from connection exhaustion or high latency under load. To avoid this, use a connection pool (for example HikariCP in the Java world) and configure it with appropriate limits for the workload.

Unbounded Wildcard Searches

Using wildcard searches with a leading wildcard, such as '%term%', in a WHERE clause forces the database to scan the entire table, because regular indexes cannot be used for patterns without a fixed prefix. These searches are expensive and scale poorly. To handle partial matches more efficiently, consider using full-text search features provided by the database, which are designed for fast text searching. Here’s an example in PostgreSQL:

SELECT * FROM articles
WHERE to_tsvector('english', title) @@ to_tsquery('database');
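For this predicate to stay fast as the table grows, an expression index is typically added. A sketch for the same articles table:

CREATE INDEX idx_articles_title_fts ON articles
  USING GIN (to_tsvector('english', title));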

One of our previous blog posts dives deeper into this topic: Full-text Search with PostgreSQL.

By being mindful of these common pitfalls, you can write SQL that scales well and performs reliably under load. Good database performance isn’t just about writing correct queries – it’s about writing efficient ones.

Have you faced any of these problems before? Every project is different, and we all learn a lot from the challenges we run into. Feel free to share your experiences or tips in the comments. Your story could help someone else improve their app’s performance too.

Oracle and the materialized view update

Materialized views are powerful. They give us precomputed, queryable snapshots of expensive joins and aggregations. But the moment you start layering other views on top of them, you enter tricky territory.

The Scenario

You define a materialized view to speed up a reporting query. Soon after, others discover it and start building new views on top of it. The structure spreads.

Now imagine: you need to extend the base materialized view. Maybe add a column, or adjust its definition. That’s when the trouble starts.

The Problem

Unlike regular views, materialized views don’t offer a convenient CREATE OR REPLACE, so you can’t just adjust the definition in place. Oracle also doesn’t allow a simple ALTER to add a column or tweak the structure; recreating the materialized view is often the only option.

Things get even more complicated when other views depend on your materialized view. In that case, Oracle won’t even let you drop it. Instead, you’re greeted with an error about dependent objects, leaving you stuck in a dependency lock-in.

The more dependencies there are, the more brittle the setup becomes. What started as a performance optimization can lock you into a rigid structure that resists change.

As a short comparison, other databases are not necessarily more forgiving here. In Postgres, for example, you cannot drop a materialized view that other views depend on either: DROP MATERIALIZED VIEW fails with a dependency error unless you add CASCADE, which then drops the dependent views along with it, so you still end up recreating the whole stack.

What to Do?

That is the hard question. Sometimes you can hide the materialized view behind a stable view, as sketched below. Or you save the SQL of all dependent views, drop them, change the materialized view, and then recreate all dependent views, a process that can be a huge pain.
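Here is a minimal sketch of the stable-view approach (all names are made up). Dependent views are built on the wrapper view, never on the materialized view itself, so a later change only requires recreating the materialized view and re-issuing one CREATE OR REPLACE for the wrapper:

CREATE MATERIALIZED VIEW sales_summary_mv AS
  SELECT product_id, SUM(amount) AS total_amount
    FROM sales
   GROUP BY product_id;

-- The stable indirection layer that everything else builds on:
CREATE OR REPLACE VIEW sales_summary AS
  SELECT product_id, total_amount FROM sales_summary_mv;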

How do you manage changes to materialized views that already have dependent views stacked on top? Do you design around it, fight with rebuild scripts every time, or have another solution?

Tuning Without Dropping: Oracle’s Invisible Indexes

When tuning database performance, removing unused indexes can help reduce write overhead and improve efficiency. But dropping an index outright is risky, especially in production systems, because it’s hard to know for sure whether it’s still needed. Oracle Database offers a practical solution: invisible indexes. This feature allows you to hide an index from the optimizer without deleting it, giving you a safe way to test and fine-tune your indexing strategy.

An invisible index behaves like a regular index in most respects. It is maintained during inserts, updates, and deletes. It consumes storage and has a presence in the data dictionary. However, it is ignored by the optimizer when generating execution plans for SQL statements unless explicitly instructed to consider it.

Creating an Invisible Index

To define an index as invisible at creation time, the INVISIBLE keyword is used:

CREATE INDEX idx_salary ON employees(salary) INVISIBLE;

In this example, the index on the salary column will be maintained, but Oracle’s optimizer will not consider it when evaluating execution plans.

Making an Existing Index Invisible

You can also alter an existing index to become invisible:

ALTER INDEX idx_salary INVISIBLE;

To revert the change and make the index visible again:

ALTER INDEX idx_salary VISIBLE;

This change is instantaneous and does not require the index to be rebuilt. It is also fully reversible.

Verifying Index Visibility

To check the visibility status of an index, query the DBA_INDEXES or USER_INDEXES data dictionary view:

SELECT index_name, visibility
  FROM user_indexes
  WHERE table_name = 'EMPLOYEES';

This will show whether each index is VISIBLE or INVISIBLE.

Forcing the Optimizer to Use Invisible Indexes

By default, the optimizer does not use invisible indexes. However, you can enable their use in a specific session by setting the following parameter:

ALTER SESSION SET OPTIMIZER_USE_INVISIBLE_INDEXES = TRUE;

With this setting in place, the optimizer will consider invisible indexes as if they were visible. This is particularly useful when testing query performance with and without a specific index.

Alternatively, you can use a SQL hint to explicitly direct the optimizer to use a specific index, even if it is invisible:

SELECT /*+ INDEX(employees idx_salary) */ *
  FROM employees
  WHERE salary > 100000;

This gives fine-grained control over execution plans without changing global settings or making the index permanently visible.

Use Cases for Invisible Indexes

Invisible indexes are helpful in scenarios where performance needs to be tested under different indexing strategies. For example, if you suspect that an index is unused or causing performance issues, you can make it invisible and observe how queries behave without it. This avoids the risk of dropping an index that might still be needed.

Invisible indexes also provide a safe way to prepare for index removal in production systems. If no queries rely on the index while it is invisible, it is likely safe to drop it later.
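Put together, a cautious removal workflow might look like this, reusing the idx_salary example from above:

ALTER INDEX idx_salary INVISIBLE;

-- ...observe the workload for a representative period...

-- If queries slowed down, revert instantly:
ALTER INDEX idx_salary VISIBLE;

-- If nothing regressed, the index is likely safe to drop:
DROP INDEX idx_salary;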

They can also be used for temporarily disabling indexes during bulk data loads, without affecting the application logic that relies on the schema.

Exploring Dynamic SQL in Oracle: Counting Specific Values Across Multiple Tables

Imagine you have a large database where multiple tables contain a column named BOOK_ID. Perhaps you’re tasked with finding how many times a particular book (say, with an ID of 12345) appears across all these tables. How would you approach this?

In this post, I’ll explain a small piece of Oracle PL/SQL code that uses dynamic SQL to search for a specific value in any table that has a column with a specific name.

Finding the relevant tables

First, we need to determine which tables in the schema have a column with the desired name. To do this, we must look at metadata that contains information about all the tables in the schema and their columns. Most database systems offer such metadata tables or views, although their names and structures vary greatly between different systems. In Oracle, the data dictionary view relevant to our task is called DBA_TAB_COLUMNS. Therefore, to find the names of all tables that contain a column BOOK_ID, you can use the following query:

SELECT table_name 
  FROM dba_tab_columns 
  WHERE column_name = 'BOOK_ID';

The output might be:

TABLE_NAME
-----------------
BOOKS
LIBRARY_BOOKS
ARCHIVED_BOOKS
TEMP_BOOKS
OLD_BOOKS

Looping through the tables

Now we want to loop through these tables in order to execute an SQL query for each of them. In Oracle we use a PL/SQL FOR loop to do this:

BEGIN
  FOR rec IN (SELECT table_name 
              FROM dba_tab_columns 
              WHERE column_name = 'BOOK_ID')
  LOOP
    -- do something for each record
  END LOOP;
END;
/

Dynamic SQL Construction

We can use the loop variable rec to dynamically create an SQL statement as a string by using the string concatenation operator || and assign it to a variable, in this case v_sql:

v_sql := 'SELECT COUNT(BOOK_ID) FROM ' || rec.table_name || ' WHERE BOOK_ID = :val';

The :val part is a bind variable placeholder that will become relevant later.

Of course, the variable needs to be declared for the PL/SQL code block first, so we add a DECLARE section:

DECLARE
  v_sql VARCHAR2(4000);
BEGIN
  -- ...
END;

How do we execute the SQL statement that we stored in the variable? By using the EXECUTE IMMEDIATE statement and two other variables; let’s call them v_result and v_value:

EXECUTE IMMEDIATE v_sql INTO v_result USING v_value;

This will execute the SQL in v_sql, bind the value in v_value to the :val placeholder, and store the result in v_result. The latter captures the result of our dynamic query, which is the count of occurrences.

Of course, we have to declare these two variables as well. We’ll set v_value to the book ID we are looking for. The whole code so far is:

DECLARE
  v_sql VARCHAR2(4000);
  v_value NUMBER := 12345;  -- Value to search for
  v_result VARCHAR2(4000);
BEGIN
  FOR rec IN (SELECT table_name 
              FROM dba_tab_columns 
              WHERE column_name = 'BOOK_ID')
  LOOP
    v_sql := 'SELECT COUNT(BOOK_ID) FROM ' || rec.table_name || ' WHERE BOOK_ID = :val';

    EXECUTE IMMEDIATE v_sql INTO v_result USING v_value;
  END LOOP;
END;
/

Printing the results

If we execute the code above, we might be a little disappointed because it is accepted and executed without any errors, but nothing is printed out. How can we see the results? For that, we need to include DBMS_OUTPUT.PUT_LINE calls:

DBMS_OUTPUT.PUT_LINE('Found in ' || rec.table_name || ': ' || v_result);

But how do we handle the cases where no record is found or the SQL query raises an error? We’ll wrap it in an EXCEPTION handling block (strictly speaking, NO_DATA_FOUND cannot occur with a COUNT query, which always returns exactly one row, but the handler is a safe default in case the dynamic query is changed later):

BEGIN
  EXECUTE IMMEDIATE v_sql INTO v_result USING v_value;
  DBMS_OUTPUT.PUT_LINE('Found in ' || rec.table_name || ': ' || v_result);
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    NULL;  -- No matching rows found in this table
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('Error in ' || rec.table_name || ': ' || SQLERRM);
END;

There’s still one thing to do. We first have to enable server output to see any of the printed lines:

SET SERVEROUTPUT ON;

This command ensures that any output generated by DBMS_OUTPUT.PUT_LINE will be displayed in your SQL*Plus or SQL Developer session.

Let’s put it all together. Here’s the full code:

SET SERVEROUTPUT ON;

DECLARE
  v_sql VARCHAR2(4000);
  v_value NUMBER := 12345;  -- Value to search for
  v_result VARCHAR2(4000);
BEGIN
  FOR rec IN (SELECT table_name 
              FROM dba_tab_columns 
              WHERE column_name = 'BOOK_ID')
  LOOP
    v_sql := 'SELECT COUNT(BOOK_ID) FROM ' || rec.table_name || ' WHERE BOOK_ID = :val';

    BEGIN
      EXECUTE IMMEDIATE v_sql INTO v_result USING v_value;
      DBMS_OUTPUT.PUT_LINE('Found in ' || rec.table_name || ': ' || v_result);
    EXCEPTION
      WHEN NO_DATA_FOUND THEN
        NULL; -- No rows found in this table
      WHEN OTHERS THEN
        DBMS_OUTPUT.PUT_LINE('Error in ' || rec.table_name || ': ' || SQLERRM);
    END;
  END LOOP;
END;
/

Here’s what the output might look like when the block is executed:

Found in BOOKS: 5
Found in LIBRARY_BOOKS: 2
Found in ARCHIVED_BOOKS: 0

Alternatively, if one of the tables throws an error (for instance, due to a permissions issue or if the table doesn’t exist in the current schema), you might see an output like this:

Found in BOOKS: 5
Error in TEMP_BOOKS: ORA-00942: table or view does not exist
Found in LIBRARY_BOOKS: 2

Conclusion

Dynamic SQL is particularly useful when the structure of your query is not known until runtime. In this case, since the table names come from a data dictionary view (dba_tab_columns), the query must be constructed dynamically.

Instead of writing a separate query for each table, the above code automatically finds and processes every table with a BOOK_ID column. It works on any table with the right column, making it useful for large databases.

Building and running SQL statements on the fly allows you to handle tasks that are not possible with static SQL alone.

Oracle DB: How to Pick the Right Function for Current Date and Time


When working with date and time in Oracle, you have several functions available to get the current date and time. Three important ones are CURRENT_DATE, CURRENT_TIMESTAMP, and SYSDATE. Let’s see how they are different and when you should use each one.

The CURRENT_DATE function gives you the current date and time based on the time zone of the session you are in. It returns this information in a simple DATE format, which includes the date and time up to the second but does not show fractions of a second or the time zone. For example, if you run:

SELECT CURRENT_DATE FROM dual;

You might get a result like 29-JUL-24 03:43:19 PM. This shows the current date and time according to your session’s time zone.

You can set the session’s time zone to a specific offset from UTC. For example, to set the time zone to UTC+5:30:

ALTER SESSION SET TIME_ZONE = '+05:30';

Use CURRENT_DATE when you need the date and time for tasks that are specific to a certain time zone. It’s good for simple reports or calculations that don’t need to worry about fractions of a second or different time zones.

The CURRENT_TIMESTAMP function provides more detail. It gives you the current date and time, including fractions of a second and the time zone. This function returns the value in the TIMESTAMP WITH TIME ZONE format. For example, if you run:

SELECT CURRENT_TIMESTAMP FROM dual;

You might see something like 29-JUL-24 03.43.19.123456 PM +01:00. This includes the date, time, fractions of a second, and the time zone offset.

Use CURRENT_TIMESTAMP when you need precise time details, such as for logging events, tracking changes, or working across different time zones. It’s useful when you need to know the exact time down to the fraction of a second and the time zone.

The SYSDATE function gives you the current date and time from the database server’s clock. It’s similar to CURRENT_DATE in that it returns the date and time up to the second but not fractions of a second or time zone information. For example, if you run:

SELECT SYSDATE FROM dual;

You might get 29-JUL-24 03:43:19 PM. This shows the current date and time based on the server’s clock.

Use SYSDATE when you need the current date and time according to the server, not the session. This is helpful for server-side operations, scheduling tasks, and ensuring consistency across database operations that rely on the server’s time.
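To see the difference between the session and the server perspective, you can compare the two functions side by side. Assuming the server itself runs in UTC, the results differ by the session’s offset:

ALTER SESSION SET TIME_ZONE = '+05:30';

SELECT TO_CHAR(SYSDATE,      'YYYY-MM-DD HH24:MI:SS') AS server_time,
       TO_CHAR(CURRENT_DATE, 'YYYY-MM-DD HH24:MI:SS') AS session_time
  FROM dual;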

With this information, you should now be able to choose the right function for your use case.

Unveiling the secrets of invisible database columns

After my last blog post, where I wrote about Generated and Virtual Columns, I would like to dedicate this post to another type of database column: Invisible Columns. This feature has existed in MySQL since version 8.0.23 and in Oracle Database since version 12c. PostgreSQL and MS SQL Server do not support this feature.

Invisible columns, as the name suggests, are columns within a table that are hidden from standard query results by default. Unlike traditional columns that are visible and accessible in query results, invisible columns are not included unless explicitly specified in the query.

This feature provides a level of control over data visibility, allowing developers to hide certain columns from applications or other database users while still retaining their functionality within the database.

Defining invisible columns

When creating a table in MySQL or Oracle, you can designate certain columns as invisible by using the INVISIBLE keyword in the column definition. For example:

CREATE TABLE your_table (
  visible_column   INT,
  invisible_column INT INVISIBLE
);

In this example, the invisible_column is marked as invisible, while the visible_column remains visible by default. To alter an existing table and make a column invisible:

-- Oracle
ALTER TABLE your_table MODIFY existing_column_name INVISIBLE;

-- MySQL
ALTER TABLE your_table ALTER COLUMN existing_column_name SET INVISIBLE;

Replace your_table with the name of your table and existing_column_name with the name of the column you want to make invisible.

When querying your_table, the invisible column will not be included in the result set unless explicitly specified:

SELECT * FROM your_table;

visible_column
--------------
   4
   8
  15

By default, invisible columns are hidden from query results, providing a cleaner and more concise view of the data. However, developers can still access invisible columns when needed by explicitly including them in the query:

SELECT visible_column, invisible_column FROM your_table;

visible_column | invisible_column
---------------------------------
   4           |   16
   8           |   23
  15           |   42

Unveiling invisible columns

To list the invisible columns of a table in MySQL, you can query the information_schema.columns system view and filter on the EXTRA column, which contains INVISIBLE for invisible columns. Here’s a simple SQL query to accomplish this:

SELECT COLUMN_NAME
  FROM information_schema.columns
  WHERE TABLE_SCHEMA = 'your_database'
    AND TABLE_NAME = 'your_table'
    AND EXTRA LIKE '%INVISIBLE%';

In Oracle, invisible columns are treated as hidden columns and do not show up in the USER_TAB_COLUMNS or ALL_TAB_COLUMNS views at all. Instead, query the USER_TAB_COLS data dictionary view; the USER_GENERATED filter excludes system-generated hidden columns, for example those created for function-based indexes. Note that the dictionary stores identifiers in uppercase:

SELECT COLUMN_NAME
  FROM USER_TAB_COLS
  WHERE TABLE_NAME = 'YOUR_TABLE'
    AND HIDDEN_COLUMN = 'YES'
    AND USER_GENERATED = 'YES';

If you want to list invisible columns from all tables accessible to you, you can use the ALL_TAB_COLS view instead:

SELECT TABLE_NAME, COLUMN_NAME
  FROM ALL_TAB_COLS
  WHERE HIDDEN_COLUMN = 'YES'
    AND USER_GENERATED = 'YES';

Are invisible columns actually useful?

Invisible columns can make schema evolution easier by providing a flexible mechanism for evolving database schemas over time without disrupting existing applications or queries. You can test new features or data structures without committing to them fully. Invisible columns provide a way to add experimental columns to your tables without exposing them to production environments until they are fully tested and ready for use.

They can create cleaner and more concise views of your data by hiding less relevant columns. This can make it easier for developers, analysts, and users to work with the data without unnecessary clutter. However, I would argue that this is also achievable with normal database views.

The downside of introducing invisible columns is that they add complexity to the database schema, which can make it harder to understand and maintain, especially for developers who are not familiar with the invisible columns feature. They also add potential for confusion: Developers may forget about the presence of invisible columns, leading to unexpected behavior in queries or applications.

You probably shouldn’t use them to hide sensitive data, since invisible columns don’t have any additional access control, and security through obscurity is not a good idea. If you grant SELECT permission on the table to a user, they will be able to query visible and invisible columns alike.

Now that you know about them, you can make your own choice.

SQL Database Window Functions

Window functions allow users to perform calculations across a set of rows that are somehow related to the current row. This can include calculations like running totals, moving averages, and ranking without the need to group the entire query into one aggregate result.

Despite their flexibility, window functions are sometimes underutilised, either because users are unaware of them or because they’re considered too complex for everyday tasks. Learning how to effectively use window functions can improve the efficiency and readability of SQL queries, particularly for reporting and data analysis purposes. This article will explore several use cases.

Numbering Rows

The simplest application area for window functions is the numbering of rows. The ROW_NUMBER() function assigns a unique number to each row within the partition of a result set. The numbering is sequential and starts at 1. It’s useful for creating a unique identifier for rows within a partition, even when the rows are identical in terms of data.

Consider the following database table of library checkouts:

book                 | checkout_date | member_id
--------------------------------------------------
The Great Adventure  | 2024-02-15    | 102
The Great Adventure  | 2024-01-10    | 105
Mystery of the Seas  | 2024-01-20    | 103
Mystery of the Seas  | 2024-03-01    | 101
Journey Through Time | 2024-02-01    | 104
Journey Through Time | 2024-02-18    | 102

We want to assign a unique row number to each checkout instance for every book, ordered by the checkout date to analyze the circulation trend:

SELECT
  book,
  checkout_date,
  member_id,
  ROW_NUMBER() OVER (PARTITION BY book ORDER BY checkout_date) AS checkout_order
FROM library_checkouts;

The result:

book                 | checkout_date | member_id | checkout_order
-------------------------------------------------------------------
The Great Adventure  | 2024-01-10    | 105       | 1
The Great Adventure  | 2024-02-15    | 102       | 2
Mystery of the Seas  | 2024-01-20    | 103       | 1
Mystery of the Seas  | 2024-03-01    | 101       | 2
Journey Through Time | 2024-02-01    | 104       | 1
Journey Through Time | 2024-02-18    | 102       | 2

Ranking

In the context of SQL and specifically regarding window functions, “ranking” refers to the process of assigning a unique position or rank to each row within a partition of a result set based on a specified ordering.

The RANK() function provides a ranking for each row within a partition, with gaps in the ranking sequence when there are ties. It’s useful for ranking items that have the same value.

Consider the following database table of scores in a game tournament:

player  | game           | score
---------------------------------
Alice   | Space Invaders | 4200
Bob     | Space Invaders | 5700
Charlie | Space Invaders | 5700
Dana    | Donkey Kong    | 6000
Eve     | Donkey Kong    | 4800
Frank   | Donkey Kong    | 6000
Alice   | Asteroids      | 8500
Bob     | Asteroids      | 9300
Charlie | Asteroids      | 7600

We want to rank the players within each game based on their score, with gaps in rank for ties:

SELECT
  player,
  game,
  score,
  RANK() OVER (PARTITION BY game ORDER BY score DESC) AS rank
FROM scores;

The result looks like this:

player  | game           | score | rank
-----------------------------------------
Bob     | Space Invaders | 5700  | 1
Charlie | Space Invaders | 5700  | 1
Alice   | Space Invaders | 4200  | 3
Dana    | Donkey Kong    | 6000  | 1
Frank   | Donkey Kong    | 6000  | 1
Eve     | Donkey Kong    | 4800  | 3
Bob     | Asteroids      | 9300  | 1
Alice   | Asteroids      | 8500  | 2
Charlie | Asteroids      | 7600  | 3

If you don’t want to have gaps in the ranking sequence when there are ties, you can substitute DENSE_RANK() for RANK().
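For comparison, here is the same query with DENSE_RANK(); in the Space Invaders partition, Alice would then get rank 2 instead of 3:

SELECT
  player,
  game,
  score,
  DENSE_RANK() OVER (PARTITION BY game ORDER BY score DESC) AS rank_without_gaps
FROM scores;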

Cumulative Sum

The SUM() function can be used as a window function to calculate the cumulative sum of a column over a partition of rows.

Example: We are tracking our garden’s vegetable harvest in a database table, and we want to calculate the cumulative yield for each type of vegetable over the harvesting season.

vegetable | harvest_date | yield_kg
------------------------------------
Carrots   | 2024-06-18   | 10
Carrots   | 2024-07-10   | 15
Tomatoes  | 2024-06-15   | 20
Tomatoes  | 2024-07-01   | 30
Tomatoes  | 2024-07-20   | 25
Zucchini  | 2024-06-20   | 15
Zucchini  | 2024-07-05   | 20

We calculate the running total (cumulative yield) for each vegetable type as the season progresses, using the SUM() function:

SELECT
  vegetable,
  harvest_date,
  yield_kg,
  SUM(yield_kg) OVER (PARTITION BY vegetable ORDER BY harvest_date ASC) AS cumulative_yield
FROM garden_harvest;

Now we can see which vegetables are most productive and how yield accumulates throughout the season:

vegetable | harvest_date | yield_kg | cumulative_yield
-------------------------------------------------------
Carrots   | 2024-06-18   | 10       | 10
Carrots   | 2024-07-10   | 15       | 25
Tomatoes  | 2024-06-15   | 20       | 20
Tomatoes  | 2024-07-01   | 30       | 50
Tomatoes  | 2024-07-20   | 25       | 75
Zucchini  | 2024-06-20   | 15       | 15
Zucchini  | 2024-07-05   | 20       | 35

Time travel with Oracle database’s Flashback Queries

Oracle’s database management system offers a feature known as Flashback Queries, allowing users to peek into the past and retrieve data as it existed at a previous point in time. This functionality can eliminate the need for manual data restoration from backups, making it a useful asset for both developers and database administrators.

Enabling Flashback Queries

Before using Flashback Queries, ensure that the database has the required configuration. Plain AS OF TIMESTAMP queries read old row versions from undo data, so the UNDO_RETENTION parameter limits how far back they can look. For the database-wide flashback features, also confirm that the DB_FLASHBACK_RETENTION_TARGET parameter is appropriately set; it defines the period for which flashback logs are retained. Adjust these settings based on your organization’s data retention policies. Before making changes, you can check the current value:

SHOW PARAMETER DB_FLASHBACK_RETENTION_TARGET;

Use the ALTER SYSTEM command to set the parameter. For example, to set it to retain data for 7 days:

ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=10080 SCOPE=BOTH;

10080 is the retention period in minutes (7 days × 24 hours × 60 minutes). Please note that setting the parameter to a higher value consumes more space in the flashback recovery area, so consider your storage constraints. SCOPE=BOTH ensures that the change persists across database restarts, i.e. it changes the value both in memory and in the server parameter file.

To keep the history of a specific table beyond the undo retention, you can put it into a Flashback Data Archive by executing the ALTER TABLE command with the FLASHBACK ARCHIVE option (a default flashback archive must have been created beforehand):

ALTER TABLE table_name FLASHBACK ARCHIVE;

This setup allows Oracle to maintain historical changes for the specified table.

Using Flashback Queries

Consider a scenario where an employee accidentally updates critical data in the employees table. With Flashback Queries, you can rectify the mistake:

SELECT * FROM employees AS OF TIMESTAMP TO_TIMESTAMP('2023-11-10 15:00:00', 'YYYY-MM-DD HH24:MI:SS');

This query retrieves the data from the employees table as it existed before the erroneous update.
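Building on such a query, the old values can even be written back directly. A sketch, assuming an employee_id key column and a single affected row:

UPDATE employees e
   SET e.salary = (SELECT o.salary
                     FROM employees AS OF TIMESTAMP
                          TO_TIMESTAMP('2023-11-10 15:00:00', 'YYYY-MM-DD HH24:MI:SS') o
                    WHERE o.employee_id = e.employee_id)
 WHERE e.employee_id = 4711;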

You can also recover dropped tables as long as they are still in the recycle bin:

FLASHBACK TABLE orders TO BEFORE DROP;

This command restores the dropped table and its data.

Flashback Queries offer a mechanism to navigate through time within a database, providing a simple way to recover historical data or inspect changes. They stand as a useful asset in the arsenal of database administrators and developers, fostering greater confidence in managing data.

Oracle database identity column

A few days ago, I came across the topic of identity columns in Oracle, so I decided to try them out and check whether they would benefit my database tables. I would like to share the results in this article. First, I will explain what identity columns are and how to create them.

General

If you declare a column as an identity column, the value will be automatically populated from an associated sequence generator. It is possible to add an identity column by using either the CREATE TABLE or ALTER TABLE statement. In the example below, I will show how to add it using the CREATE TABLE statement.

The column marked as an identity needs to have a numeric data type such as NUMBER or INTEGER.

CREATE TABLE EXAMPLE (
    id NUMBER GENERATED ALWAYS AS IDENTITY,
    ...
);

In this case, I ALWAYS want a value to be automatically filled in. There is also the option to only fill it if no value was specified (BY DEFAULT) or when the given value is null (BY DEFAULT ON NULL).
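As a short sketch of the difference (table and column names are made up): with BY DEFAULT ON NULL, explicit values are accepted, and missing or NULL values are generated; with GENERATED ALWAYS, the first and third INSERT below would be rejected.

CREATE TABLE example2 (
    id   NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY,
    name VARCHAR2(100)
);

INSERT INTO example2 (id, name) VALUES (42, 'explicit id');      -- accepted as given
INSERT INTO example2 (name) VALUES ('generated id');             -- id comes from the sequence
INSERT INTO example2 (id, name) VALUES (NULL, 'also generated'); -- NULL triggers generation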

Additionally, you can configure the sequence as shown in the following example.

CREATE TABLE EXAMPLE (
    id NUMBER GENERATED ALWAYS AS IDENTITY
    (START WITH 1 INCREMENT BY 2 MAXVALUE 5000 NOCYCLE CACHE 100),
    ...
);

With these settings, the sequence starts with 1 and increments by 2, so the values are 1, 3, 5, and so on. The sequence counts up to the MAXVALUE of 5000 and then stops. If CYCLE is set instead, the sequence starts over at 1.

The CACHE 100 setting lets the database pre-allocate 100 sequence values at a time. Because of the step width of 2, the first cached range covers the values 1 to 199; another database instance (for example in a RAC environment) can then receive the values 201 to 399. As a result, the entries in the table can end up with IDs like 1, 201, 3, 5, 7, 203.

Using with Ruby on Rails / Active Records

My use case is to use the table in Ruby on Rails with Active Record. Here, I want Active Record to auto-increment the ID so that I can retrieve it in my program. However, if I neither use the identity column feature nor specify the name of a manually created sequence, Ruby always returns 0 as the ID. When I add a sequence name, I get the correct ID.

Now I try to change this to work with the identity column. Inserting directly in Oracle with SQL works fine.

The next step is to use it in my Ruby program, and the first problem occurs. Unfortunately, it is not possible to define my own sequence name. Oracle automatically creates sequences with cryptic names like ISEQ$$_12345 for identity columns, which reduces readability in code. Additionally, if I recreate the table, the sequence name changes, which means I have to update my code afterwards. This seems to be the first showstopper.

Nonetheless, I continue and try to create a new entry with it. The entry was created. It works! Or wait a moment: Ruby always shows 0 instead of the right ID. It seems the program cannot use the sequence correctly. I halt my testing here. Maybe it is possible to make Ruby work with identity sequences, maybe it is not. If you have already found a solution, feel free to comment.

In the end, I decide that the manually created sequence still fits my use case better. The main reason is the naming, and I do not want to spend more time searching for a solution to make identity columns work with Ruby.

Conclusion

Although the identity column option does not fit my use case, it is a convenient way to create a table column with auto-generated values in Oracle. For situations where the ID itself is not of interest, it can simplify the work: you do not have to build the logic with sequences and triggers on your own.