Schemas, naming and search path in PostgreSQL

Modern (SQL) databases provide a multi-level hierarchy of separated contexts.

A database management system (DBMS) can manage multiple databases.

A database can contain multiple schemas.

A schema usually contains multiple tables, views, sequences and other objects.

Most of the time we developers only care about our database and the objects it contains – ignoring schemas and the DBMS.

In PostgreSQL instances this means we are using the default schema – usually called public – without mentioning it anywhere. Neither in the connection string or in any queries.

Ok, and why does all of the above even matter?

Special situations

Sometimes customers or operators require us to use special schemas in certain databases. When we do not have full control over our database deployment and usage we have to adapt to the situation we encounter.

Camel-casing

One thing I really advise against is using upper case names for tables, columns and so on. SQL itself is not case sensitive but many DBMSes differentiate case when it comes to naming. So they require you to use double-quotes to reference mixed-case objects like schemas and tables, e.g. "MyImportantTable". Imho it is far better to use only lower case letters and use snake_case for all names.

Multiple schemas in one database

I do not endorse using multiple names schemas in one database. Most of the time you do not need this level of separation and can simply use multiple databases in the same DBMS with their default schemas. That way you do not have to specify the schema in your queries.

What if you are required to use a non-default schema? At least for PostgreSQL you can modify your login role to change the search path. So you can leave all your queries and maybe other deployments untouched and without some schema name scattered all around your project:

-- instead of specifying schema name like
select * from "MyTargetSchema".my_table;

-- you can alter the search path of your role
alter role my_login set search_path = public,"MyTargetSchema";

-- to only write
select * from my_table;

Wrapping it up

Today’s software systems are quite complex and the problems we are trying to solve with them have their own, ever growing complexity. So when using DBMSes follow common advice like the above to reduce complexity in your solution even if it seems to be mandated.

Sometimes there are easy to follow conventions or configuration options that can reduce your burden as a developer even if you do not have complete control over the relevant parts of the deployment environment.

What’s your time, database?

Time is a difficult subject. Especially time zones and daylight saving time. Adding layers makes things worse. Ask your database.

Time is a difficult subject. Especially time zones and daylight saving time. Sounds easy? Well, take a look.
Adding layers in software development complicates the issue and every layer has its own view of time. Let’s start with an example: we write a simple application which stores time based data in a SQL database, e.g. Oracle. The table has a column named ‘at’. Since we don’t want to mess around with timezones, we use a column type without timezone information, in Oracle this would be ‘Date’ if we do not need milliseconds and ‘Timestamp’ if we need them. In Java with plain JDBC we can extract it with a call to ‘getTimestamp’:

Date timestamp = resultSet.getTimestamp("at");

The problem is now we have a timestamp in our local timezone. Where is it converted? Oracle itself has two timezone settings: for the database and for the session. We can query them with:

select DBTIMEZONE from dual;

and

select SESSIONTIMEZONE from dual;

First Oracle uses the time zone set in the session, then the database one. The results from those queries are interesting though: some return a named timezone like ‘Europe/Berlin’, the other return an offset ‘+01:00’. Here a first subtle detail is important: the named timezone uses the offset and the daylight saving time from the respective timezone, the offset setting only uses the offset and no daylight saving. So ‘+01:00’ would just add 1 hour to UTC regardless of the date.
In our example changing these two settings does not change our time conversion. The timezone settings are for another column type: timestamp with (local) timezone.
Going up one layer the JDBC API reveals an interesting tidbit:

Timestamp getTimestamp(int columnIndex)
throws SQLException

Retrieves the value of the designated column in the current row of this ResultSet object as a java.sql.Timestamp object in the Java programming language.

Sounds about right, but wait there’s another method:

Timestamp getTimestamp(int columnIndex,
Calendar cal)
throws SQLException


Retrieves the value of the designated column in the current row of this ResultSet object as a java.sql.Timestamp object in the Java programming language. This method uses the given calendar to construct an appropriate millisecond value for the timestamp if the underlying database does not store timezone information.

Just as in Oracle we can use a named timezone or an offset:

Date timestamp = resultSet.getTimestamp("at", Calendar.getInstance(TimeZone.getTimeZone("GMT+1:00")));

This way we have control over what and how the time is extracted from the database. The next time you work with time based information take a close look. And if you work with Java use Joda Time.