If you’ve worked with PostgreSQL and dealt with things like full-text search, arrays, or JSON data, you might have heard about GIN indexes. But what exactly are they, and why are they useful?
GIN stands for Generalized Inverted Index. Most indexes (like the default B-tree index) work best when there’s one clear value per row – like a number or a name. But sometimes, a single column can hold many values. Think of a column that stores a list of tags, words in a document, or key-value data in JSON. That’s where GIN comes in.
Let’s walk through a few examples to see how GIN indexes work and why they’re helpful.
Full-text search example
Suppose you have a table of articles:
CREATE TABLE articles (
id serial PRIMARY KEY,
title text,
body text
);
You want to let users search the content of these articles. PostgreSQL has built-in support for full-text search, which works with a special data type called tsvector. To get started, you’d add a column to store this processed version of your article text:
ALTER TABLE articles ADD COLUMN tsv tsvector;
UPDATE articles SET tsv = to_tsvector('english', body);
Now, to speed up searches, you create a GIN index:
CREATE INDEX idx_articles_tsv ON articles USING GIN(tsv);
With that in place, you can search for articles quickly:
SELECT * FROM articles WHERE tsv @@ to_tsquery('tonic & water');
This finds all articles that contain both “tonic” and “water”, and thanks to the GIN index, it’s fast – even if you have thousands of articles.
Array example
GIN is also great for columns that store arrays. Let’s say you have a table of photos, and each photo can have several tags:
CREATE TABLE photos (
id serial PRIMARY KEY,
tags text[]
);
You want to find all photos tagged with “capybara”. You can create a GIN index on the tags column:
CREATE INDEX idx_photos_tags ON photos USING GIN(tags);
SELECT * FROM photos WHERE tags @> ARRAY['capybara'];
(The @> operator means “contains” or “is a superset of”.)
The index lets PostgreSQL find matching rows quickly, without scanning the entire table.
JSONB example
PostgreSQL’s jsonb type lets you store flexible key-value data. Imagine a table of users with extra info stored in a jsonb column:
CREATE TABLE users (
id serial PRIMARY KEY,
data jsonb
);
One row might store {"age": 42, "city": "Karlsruhe"}. To find all users from New York, you can use:
SELECT * FROM users WHERE data @> '{"city": "Karlsruhe"}';
And again, with a GIN index on the data column, this query becomes much faster:
CREATE INDEX idx_users_data ON users USING GIN(data);
Things to keep in mind
GIN indexes are very powerful, but they come with some tradeoffs. They’re slower to build and can make insert or update operations a bit heavier. So they’re best when you read (search) data often, but don’t write to the table constantly.
In short, GIN indexes are your friend when you’re dealing with columns that contain multiple values – like arrays, full-text data, or JSON. They let PostgreSQL break apart those values and build a fast lookup system. If your queries feel slow and you’re working with these kinds of columns, adding a GIN index might be exactly what you need.