Partial Indexes in PostgreSQL: Index Only What Matters

Indexes are one of the most effective tools for improving database performance. However, they come at a cost: they consume disk space, slow down write operations, and require maintenance. In many cases, a full index contains a lot of entries that are never used by the queries we want to optimize.

This is where PostgreSQL’s partial indexes become useful. A partial index contains only the rows that satisfy a specified condition. Instead of indexing an entire table, we can index only the subset of data that is relevant for our queries.

Consider a simple user table:

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    username TEXT NOT NULL,
    active BOOLEAN NOT NULL
);

Suppose that most users are inactive, but our application frequently searches for active users:

SELECT *
FROM users
WHERE active = true
  AND username = 'alice';

A conventional index would cover all rows:

CREATE INDEX idx_users_username
ON users (username);

If only a small fraction of users are active, this index contains many entries that will never help this query.

A partial index can be defined as:

CREATE INDEX idx_active_users_username
ON users (username)
WHERE active = true;

Now the index contains only active users. The PostgreSQL query planner can use this index whenever it detects that the query condition implies the index predicate:

SELECT *
FROM users
WHERE active = true
  AND username = 'alice';

Because the query explicitly restricts the result set to active users, the planner knows that every matching row must be present in the partial index.

Why use partial indexes?

The obvious benefit is size. Imagine a table with ten million users, but only five percent are active. A conventional index stores ten million entries, while the partial index stores only five hundred thousand.

Smaller indexes provide several advantages: less disk usage, reduced memory consumption, faster index scans, lower maintenance overhead during INSERT and UPDATE operations.

In workloads where the indexed subset is significantly smaller than the table, these benefits can be substantial.

A common use case

Soft deletion is a frequent pattern in business applications:

CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    deleted BOOLEAN NOT NULL DEFAULT false
);

Most queries ignore deleted records:

SELECT *
FROM orders
WHERE deleted = false
  AND customer_id = 42;

Instead of indexing all rows, we can focus on the rows that are actually queried:

CREATE INDEX idx_active_orders_customer
ON orders (customer_id)
WHERE deleted = false;

As the number of logically deleted rows grows over time, the index remains compact.

Limitations

Partial indexes are not a universal solution. The query must contain a condition that allows PostgreSQL to infer the index predicate. For example, the index

WHERE active = true

cannot be used for a query that only filters by username:

SELECT *
FROM users
WHERE username = 'alice';

The planner cannot assume that the result should contain only active users.

Another consideration is changing data distributions. A partial index is most effective when the indexed subset remains relatively small. If almost all rows eventually satisfy the predicate, the advantage largely disappears.

Conclusion

By indexing only the rows that are relevant to specific queries, they can reduce index size, and improve query performance. Whenever you notice that your queries consistently target a small subset of a large table, a partial index may be worth considering.