Using GENERATED AS IDENTITY Instead of SERIAL in PostgreSQL

In PostgreSQL, the SERIAL keyword is commonly used to create auto-incrementing primary keys. While it remains supported and functional, newer versions of PostgreSQL (version 10 and later) offer a more standardized and flexible alternative: the GENERATED … AS IDENTITY syntax.

Limitations of SERIAL

When you define a column as SERIAL, PostgreSQL automatically creates and links a sequence to that column behind the scenes. But this linkage is not explicitly part of the table definition. This can complicate schema management and make the behavior of the column less transparent.

The SERIAL keyword is also not part of the official SQL standard, which may be a concern in environments where cross-database compatibility is important. Additionally, the column remains writable, meaning it’s possible to insert values manually, potentially leading to inconsistencies or conflicts.

Identity Columns

The GENERATED … AS IDENTITY syntax addresses these concerns by making the auto-increment behavior explicit and standards-compliant. An identity column is defined as follows:

CREATE TABLE users (
  id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  username TEXT NOT NULL
);

This syntax makes it clear that the column is managed by the system. PostgreSQL offers two modes for identity columns:

GENERATED ALWAYS: PostgreSQL always generates a value. Manual insertion requires an override.

GENERATED BY DEFAULT: The application can supply a value, or PostgreSQL will use the next sequence value automatically.

To insert a value manually into an ALWAYS identity column, you must use the OVERRIDING SYSTEM VALUE clause:

INSERT INTO users (id, username)
  VALUES (999, 'admin') OVERRIDING SYSTEM VALUE;

Managing Sequences

Since identity columns integrate the sequence into the column definition, managing them is more straightforward. For example, to reset the sequence:

ALTER TABLE users ALTER COLUMN id RESTART WITH 1000;

The sequence is tied to the column, making it easier to inspect, back up, and restore using tools like pg_dump. This helps avoid issues that can arise with the implicit sequences used by SERIAL.

Conclusion

The GENERATED AS IDENTITY syntax offers clearer semantics, better standards compliance, and more predictable behavior than SERIAL. For new database designs, it is generally the preferred choice. While SERIAL continues to be supported, identity columns provide more transparency and control, especially in environments where portability and schema clarity are important.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.