JDBC’s wasNull method pitfall

Java’s java.sql package provides a general API for accessing data stored in relational databases. It is part of JDBC (Java Database Connectivity). The API is relatively low-level, and is often used via higher-level abstractions based on JDBC, such as query builders like jOOQ, or object–relational mappers (ORMs) like Hibernate.

If you choose to use JDBC directly you have to be aware that the API relatively old. It was added as part of JDK 1.1 and predates later additions to the language such as generics and optionals. There are also some pitfalls to be avoided. One of these pitfalls is ResultSet’s wasNull method.

The wasNull method

The wasNull method reports whether the database value of the last ‘get’ call for a nullable table column was NULL or not:

int height = resultSet.getInt("height");
if (resultSet.wasNull()) {
    height = defaultHeight;

The wasNull check is necessary, because the return type of getInt is the primitive data type int, not the nullable Integer. This way you can find out whether the actual database value is 0 or NULL.

The problem with this API design is that the ResultSet type is very stateful. Its state does not only change with each row (by calling next method), but also with each ‘get’ method call.

If any other ‘get’ method call is inserted between the original ‘get’ method call and its wasNull check the code will be wrong. Here’s an example. The original code is:

var width = rs.getInt("width");
var height = rs.getInt("height");
var size = new Size(width, rs.wasNull() ? defaultHeight : height);

A developer now wants to add a third dimension to the size:

var width = rs.getInt("width");
var height = rs.getInt("height");
var depth = rs.getInt("depth");
var size = new Size(width, rs.wasNull() ? defaultHeight : height, depth);

It’s easy to overlook the wasNull call, or to wrongly assume that adding another ‘get’ method call is a safe code change. But the wasNull check now refers to “depth” instead of “height”, which breaks the original intention.


So my advice is to wrap the ‘get’ calls for nullable database values in their own methods that return an Optional:

Optional<Integer> getOptionalInt(ResultSet rs, String columnName) {
    final int value = rs.getInt(columnName);
    if (rs.wasNull()) {
        return Optional.empty();
    return Optional.of(value);

Now the default value fallback can be safely applied with the orElse method:

var width = rs.getInt("width");
var height = getOptionalInt(rs, "height").orElse(defaultHeight);
var depth = rs.getInt("depth");
var size = new Size(width, height, depth);

Grammar as a leaky abstraction

Internationalisation, or i18n for short, is the process of making the user interface of a program ready for translation into multiple languages. This usually means to factor out texts from the program source code into separate files, often called translation bundles. These files have a key-value structure. The program code then only refers to keys that are resolved into the actual texts from the translation bundle for the selected target language.

Here’s a simple example of two translation bundles, one for English and one for German:

# translations_en.properties
quit_confirm_message=Do you really want to quit?
# translations_de.properties
quit_confirm_message=Wollen Sie die Anwendung wirklich beenden?

The actual source code might look like this:

var answer = showDialog(

Here the function t looks up the key in the currently active translation bundle and returns the translated message as a string.

How not to do it

Last month I discovered an amusing attempt at internationalisation in a third-party code base. It looked similar to this:

# translations_en.properties
# translations_de.properties

These translation keys were used like this:

"${t('an')} ${t('article')} ${t('is')}
${available ? '' : t('not')} ${t('available')}."

to produce messages like

"An article is available."
"An article is not available."

… or in German:

"Ein Artikel ist verfügbar."
"Ein Artikel ist nicht verfügbar."

Why is this a clumsy attempt at internationalisation?

Because it uses single words as translation units, and it relies on the fact that English and German have the same sentence structure in this particular case. In general, of course, languages do not have the same sentence structure, not even related languages like English and German.

The author of the code also introduced separate translation keys for “a” and “an”. The German translation for both keys was “ein”. The author was lucky so far that all texts with “a” or “an” in this particular program translated to “ein” in German, not “eine”, “einen”, “einem”, “einer”, or “eines”.

How to do it

So what would be the correct way to do it? The internationalisation should have looked like this:

# translations_en.properties
article_available=An article is available.
article_not_available=An article is not available.
# translations_de.properties
article_available=Ein Artikel ist verfügbar.
article_not_available=Ein Artikel ist nicht verfügbar.
available ? t("article_available")
          : t("article_not_available")

By using whole phrases and sentences as translation units the translations into various languages have the freedom to use their own word orders and grammatical structures.

Contiguous date ranges in Oracle SQL

In one of my last posts from a couple of weeks ago I wrote about querying gaps between non-contiguous date ranges in Oracle SQL. This week’s post is about contiguous date ranges.

While non-contiguous date ranges are best represented in a database table with a start_date and an end_date column, it is better to represent contiguous date ranges only by one date column, so that we avoid redundancy and do not have to keep the start date of a date range in sync with the end date of the previous date range. In this post I will use the start date:

CREATE TABLE date_ranges (
name VARCHAR2(100),
start_date DATE

The example content of the table is:

----	----------
A	05/02/2020
B	02/04/2020
C	16/04/2020
D	01/06/2020
E	21/06/2020
F	02/07/2020
G	05/08/2020

This representation means that the date range with the most recent start date does not have an end. The application using this data model can choose whether to interpret this as a date range with an open end or just as the end point for the previous range and not as a date range by itself.

While this is a nice non-redundant representation, it is less convenient for queries where we want to have both a start and an end date per row, for example in order to check wether a given date lies within a date range or not. Luckily, we can transform the ranges with a query:

OVER (ORDER BY start_date)
AS end_date
FROM date_ranges;

As in the previous post on non-contiguous date ranges the LEAD analytic function allows you to access the following row from the current row without using a self-join. Here’s the result:

----	----------	--------
A	05/02/2020	02/04/2020
B	02/04/2020	16/04/2020
C	16/04/2020	01/06/2020
D	01/06/2020	21/06/2020
E	21/06/2020	02/07/2020
F	02/07/2020	05/08/2020
G	05/08/2020	(null)

By using a WITH clause you can use this query like a view and join it with the another table, for example with the join condition that a date lies within a date range:

WITH ranges AS
(SELECT date_ranges.*, LEAD(date_ranges.start_date) OVER (ORDER BY start_date) AS end_date FROM date_ranges)
SELECT timeseries.*, ranges.name
FROM timeseries LEFT OUTER JOIN ranges ON
BETWEEN ranges.start_date AND ranges.end_date;

Changing the keyboard navigation behaviour of form inputs

The default behaviour in HTML forms is that you can move the focus from one input element to the next via the tab key and submit the form via the enter key. This is also how dialogs work on most operating systems when using the native UI components. This behaviour is consistent across all browsers, and changing it messes with the user’s expectations and reduces accessibility. So I would normally advise against changing this behaviour without good reasons.

However, one of our customers wanted a different behaviour for an application developed by us. This application replaced an older application where the enter key did not submit the form, but moved the focus to the next input element. The ‘muscle memory’ effect made users accidentally submit the form by hitting the enter key, causing frustration. Since this application is not a public web site, but merely a web technology based intranet application with a small and specialized user base, changing the default behaviour is acceptable if the users want it.

So here’s how to do it. The following JavaScript function focusNextInputOnEnter takes a form element as a parameter and changes the focus behaviour on the input elements within this form.

function focusNextInputOnEnter(form) {
  var inputs = form.querySelectorAll('input, select, textarea');
  for (var i = 0; i < inputs.length; i++) {
    var input = inputs[i];
    input.addEventListener('keypress', (function(index) {
      return function(event) {
        if (!isEnter(event.which)) {
        var nextIndex = index + 1;
        while (nextIndex < inputs.length) {
          var nextInput = inputs[nextIndex];
          if (nextInput.disabled) {

  function isEnter(keyCode) {
    return keyCode === 13;

It works by handling the keypress events on the input elements and checking the key code for the enter key (code 13). It has an additional check so that disabled input elements are skipped.

To apply this change in behaviour to a form we have to call the function when the DOM content is loaded:

<form id="demo-form">
  <input type="text">
  <input type="text" disabled="disabled">
  <input type="checkbox">
  <input type="text">
  <input type="text">

  document.addEventListener('DOMContentLoaded', function() {

I want to reiterate my warning that you should definitely not do this for public web sites, and elsewhere only if you know that this is what your users want.

Querying gaps between date ranges in Oracle SQL

Let’s say we have a database table with date ranges, each range designated by a RANGE_START and a RANGE_END column:

CREATE TABLE date_ranges (
  range_start DATE,
  range_end   DATE
-----------	---------
05/02/2020	01/04/2020
02/04/2020	15/04/2020
16/04/2020	01/05/2020
01/06/2020	20/06/2020
21/06/2020	01/07/2020
02/07/2020	31/07/2020
05/08/2020	30/08/2020

We are now interested in finding the gaps between these date ranges. If we look at this example data set we can see that there are two gaps:

05/02/2020	01/04/2020
02/04/2020	15/04/2020
16/04/2020	01/05/2020
-- gap --
01/06/2020	20/06/2020
21/06/2020	01/07/2020
02/07/2020	31/07/2020
-- gap --
05/08/2020	30/08/2020

What would be the SQL query to find these automatically? With standard SQL this would be a difficult task. However, there are some special functions in Oracle SQL called analytic functions that greatly help with this task. Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. In this case we will use the analytic functions MAX and LEAD:

      OVER(ORDER BY range_start) + 1 gap_start,
      OVER(ORDER BY range_start) - 1 gap_end
  FROM date_ranges
) WHERE gap_start <= gap_end;

The result of this query are the date range gaps we are interested in:

---------	-------
02/05/2020	31/05/2020
01/08/2020	04/08/2020

Note that the MAX function in the query is the analytic MAX function, not the aggregate MAX function, indicated by the OVER keyword with an analytic clause. It operates on a sliding window. The LEAD analytic function allows you to access the following row from the current row without using a self-join.

Using CSV data as external table in Oracle DB

If you want to import CSV data into an Oracle database you can use the SQL*Loader command line tool. You simple create a control file that describes how to load the data and then call the sqlldr command with the control file name as an argument:


INFILE example.csv
INTO TABLE example_table
> sqlldr username/password example.ctl

But there’s another way to load CSV data into an Oracle database: External tables.

External tables

Oracle’s external tables feature allows you to query data from a file on the filesystem like a regular database table.

First you have to create a directory in the file system and put your CSV file inside:

mkdir -p /path/to/directory



Now connect to the database as “SYS as SYSDBA”, define the directory as a database object and grant read/write access to your user:

  external_tables_dir AS '/path/to/directory';
  external_tables_dir TO example_user;

Now you can connect as example_user and create an external table for the CSV file:

CREATE TABLE example_table (
  id NUMBER(4,0),
  name VARCHAR2(50),
  amount NUMBER(8,0)
  DEFAULT DIRECTORY external_tables_dir
  LOCATION ('example.csv')  

The relevant part here is the ORGANIZATION EXTERNAL block. It references the directory and the CSV file inside the directory and allows you to specify format parameters of the CSV file such as record and field delimiters.

Now you can query the table like a regular table:

SELECT * FROM example_table
-- ----- ------
1  Water 250
2  Beer  500
3  Wine  150

Access information and errors such as bad or discarded records are stored in log files in the specified directory. The default names of these log files consist of the table name and an ID, e.g. example_table_12345.log, example_table_12345.bad and example_table_12345.dsc.

Generating Rows in Oracle Database

Sometimes you want to automatically populate a database table with a number rows. Maybe you need a big table with lots of entries for a performance experiment or some dummy data for development. Unfortunately, there’s no standard SQL statement to achieve this task. There are different possibilities for the various database management systems. For the Oracle database (10g or later) I will show you the simplest one I have encountered so far. It actually “abuses” an unrelated functionality: the CONNECT BY clause for hierarchical queries in combination with the DUAL table.

Here’s how it can be used:

FROM dual

This select creates a result set with the numbers from 1 to 1000. You can combine it with INSERT to populate the following table with rows:

CREATE TABLE example (
  id   NUMBER(5,0),
  name VARCHAR2(200)

INSERT INTO example (id, name)
FROM dual

The resulting table is:

1   Name 1
2   Name 2
3   Name 3
10  Name 10

Of course, you can use the incrementing ROWNUM in more creative ways. The following example populates a table for time series data with a million values forming a sinus curve with equidistant timestamps (in this case 15 minute intervals) starting with a specified time:

CREATE TABLE example (
  id    NUMBER(5,0),
  time  TIMESTAMP,
  value NUMBER

INSERT INTO example (id, time, value)
  TIMESTAMP'2020-05-01 12:00:00'
     + (ROWNUM-1)*(INTERVAL '15' MINUTE),
FROM dual
CONNECT BY LEVEL <= 1000000;
ID  TIME              VALUE
1   2020-05-01 12:00  0.099833
2   2020-05-01 12:15  0.198669
3   2020-05-01 12:30  0.295520

As mentioned at the beginning, there are other row generator techniques to achieve this. But this one is the simplest so far, at least for Oracle.

The Java Cache API and Custom Key Generators

The Java Cache API allows you to add a @CacheResult annotation to a method, which means that calls to the method will be cached:

import javax.cache.annotation.CacheResult;

public String exampleMethod(String a, int b) {
    // ...

The cache will be looked up before the annotated method executes. If a value is found in the cache it is returned and the annotated method is never actually executed.

The cache lookup is based on the method parameters. By default a cache key is generated by a key generator that uses Arrays.deepHashCode(Object[]) and Arrays.deepEquals(Object[], Object[]) on the method parameters. The cache lookup based on this key is similar to a HashMap lookup.

You can define and configure multiple caches in your application and reference them by name via the cacheName parameter of the @CacheResult annotation:

public String exampleMethod(String a, int b) {

If no cache name is given the cache name is based on the fully qualified method name and the types of its parameters, for example in this case: “my.app.Example.exampleMethod(java.lang.String,int)”. This way there will be no conflicts with other cached methods with the same set of parameters.

Custom Key Generators

But what if you actually want to use the same cache for multiple methods without conflicts? The solution is to define and use a custom cache key generator. In the following example both methods use the same cache (“examplecache”), but also use a custom cache key generator (MethodSpecificKeyGenerator):

public String exampleMethodA(String a, int b) {
    // ...

public String exampleMethodB(String a, int b) {
    // ...

Now we have to implement the MethodSpecificKeyGenerator:

import org.infinispan.jcache.annotation.DefaultCacheKey;

import javax.cache.annotation.CacheInvocationParameter;
import javax.cache.annotation.CacheKeyGenerator;
import javax.cache.annotation.CacheKeyInvocationContext;
import javax.cache.annotation.GeneratedCacheKey;

public class MethodSpecificKeyGenerator
  implements CacheKeyGenerator {

  public GeneratedCacheKey generateCacheKey(CacheKeyInvocationContext<? extends Annotation> context) {
    Stream<Object> methodIdentity = Stream.of(context.getMethod());
    Stream<Object> parameterValues = Arrays.stream(context.getKeyParameters()).map(CacheInvocationParameter::getValue);
    return new DefaultCacheKey(Stream.concat(methodIdentity, parameterValues).toArray());

This key generator not only uses the parameter values of the method call but also the identity of the method to generate the key. The call to context.getMethod() returns a java.lang.reflect.Method instance for the called method, which has appropriate hashCode() and equals() implementations. Both this method object and the parameter values are passed to the DefaultCacheKey implementation, which uses deep equality on its parameters, as mentioned above.

By adding the method’s identity to the cache key we have ensured that there will be no conflicts with other methods when using the same cache.

24 hour time format: Difference between JodaTime and java.time

We have been using JodaTime in many projects since before Java got better date and time support with Java 8. We update projects to the newer java.time classes whenever we work on them, but some still use JodaTime. One of these was a utility that imports time series from CSV files. The format for the time stamps is flexible and the user can configure it with a format string like “yyyyMMdd HHmmss”. Recently a user tried to import time series with timestamps like this:

20200101 234500
20200101 240000
20200102 001500

As you can see this is a 24-hour format. However, the first hour of the day is represented as the 24th hour of the previous day if the minutes and seconds are zero, and it is represented as “00” otherwise. When the user tried to import this with the “yyyyMMdd HHmmss” format the application failed with an internal exception:

Cannot parse "20200101 240000": Value 24 for
hourOfDay must be in the range [0,23]

Then he tried “yyyyMMdd kkmmss”, which uses the “kk” format for hours. This format allows the string “24” as hour. But now “20200101 240000” was parsed as 2020-01-01T00:00:00 and not as 2020-01-02T00:00:00, as intended.

I tried to help and find a format string that supported this mixed 24-hour format, but I did not find one, at least not for JodaTime. However, I found out that with java.time the import would work with the “yyyyMMdd HHmmss” format, even though the documentation for “H” simply says “hour-of-day (0-23)”, without mentioning 24.

The import tool was finally updated to java.time and the user was able to import the time series file.

Working with JSON data in Oracle databases

In my last post I showed how to work with JSON data in PostgreSQL. This time I want show how it is done with an Oracle database for comparison. I will use the same example scenario: a table named “events” where application events are stored in JSON format.

JSON data types

In Oracle there is no special data type for JSON data. You can use character string datatypes like VARCHAR2 or CLOB. However, you can add a special CHECK constraint to a column in order to ensure that only valid JSON is inserted:

  CONSTRAINT event_is_json CHECK (event IS JSON)

If you try to insert something other than JSON you will get a constraint violaiton error:

INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, 'This is not JSON.');

ORA-02290: check constraint (EVENT_IS_JSON) violated

Let’s insert some valid JSON data:

INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, '{"type": "add_shelf", "payload": {"id": 1}}');
INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, '{"type": "add_book", "payload": {"title": "Ulysses", "shelf": 1}}');
INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, '{"type": "add_book", "payload": {"title": "Moby Dick", "shelf": 1}}');
INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, '{"type": "add_shelf", "payload": {"id": 2}}');
INSERT INTO events (datetime, event) VALUES
  (CURRENT_TIMESTAMP, '{"type": "add_book", "payload": {"title": "Don Quixote", "shelf": 2}}');


In Oracle you use the JSON_VALUE function to select a value from a JSON structure. It uses a special path syntax for navigating JSON objects where the object root is represented as ‘$’ and properties are accessed via dot notation. This function can be used both in the SELECT clause and the WHERE clause:

SELECT JSON_VALUE(event, '$.type') AS type
  FROM events;
SELECT event FROM events
  WHERE JSON_VALUE(event, '$.type')='add_book'
    AND JSON_VALUE(event, '$.payload.shelf')=1;
{"type":"add_book","payload":{"shelf":1,"title":"Moby Dick"}}

Constructing JSON objects

JSON objects can be constructed from values via the JSON_OBJECT and JSON_ARRAY functions:

  'id' VALUE 1,
  'name' VALUE 'tree',
  'isPlant' VALUE 'true' FORMAT JSON,
  'colors' VALUE JSON_ARRAY('green', 'brown')
) FROM dual;

Note that you have to use string values with the additional FORMAT JSON clause for boolean values.


Modifying JSON object fields has become feasible with the introduction of the JSON_MERGEPATCH function in Oracle 19c. It takes two JSON parameters:

1) the original JSON data
2) a JSON “patch” snippet that will be merged into the original JSON data. This can either add or update JSON properties.

It can be used in combination with JSON_VALUE and JSON_OBJECT. In this example we convert all the event “type” fields from lower case to upper case:

  JSON_OBJECT('type' VALUE UPPER(JSON_VALUE(event, '$.type')))

Oracle provides a lot more functions for working with JSON data. This post only covered the most basic ones. See the Oracle JSON reference for more.