Blogs / Postgres / When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

Posted at 15-July-2025 / Written by Rohit Bhatt

30-sec summary

PostgreSQL doesn’t just update rows — it leaves the old ones lying around like forgotten leftovers. It’s called MVCC, and while it’s great for concurrency, it can make your indexes bloated and your queries slow. This blog walks through what really happens inside your DB, how to spot bloat, and what to do about it.

Let’s Clear the Air: Updates Don’t Overwrite Rows

Here’s a surprise I wish someone had told me early: when you run an `UPDATE` or `DELETE` in PostgreSQL, the row doesn’t go away. Postgres just marks it as 'dead' and leaves it there. Why? Because of something called MVCC (Multi-Version Concurrency Control), which is how Postgres handles simultaneous reads and writes without locking everything up like a traffic jam.

Inside PostgreSQL: Pages, Tuples, and Metadata

Postgres stores your data in 8KB pages. These pages live inside heap files, and inside each page are rows — or 'tuples' as Postgres likes to call them. Every tuple has a header attached with hidden columns like:

1.xmin: Transaction ID that created the row.
2.xmax: Transaction ID that deleted or updated it.
3.ctid: Physical location of the row (page number + offset).
4.null bitmap: Optimized way to keep track of NULL columns without wasting space.

MVCC: Why PostgreSQL is So Dang Concurrent

MVCC lets Postgres show different versions of the same row to different users. When you update a row, Postgres inserts a brand-new version and marks the old one as dead. That way, if another transaction started before your update, it still sees the old row. It's genius — until you realize those dead rows don’t vanish automatically.

Where the Trash Piles Up: Dead Tuples

Postgres leaves dead rows lying around in both the heap and the indexes. That’s like deleting a file from your computer but it still shows up in your storage — until something (or someone) cleans it up. Over time, you end up with a lot of invisible junk. This makes queries slower and inflates your database size.

Meet VACUUM: PostgreSQL’s Cleanup Crew

VACUUM is the janitor that Postgres sends in to clear out dead tuples. It figures out which rows are no longer needed and marks that space as reusable. But here’s the kicker: regular VACUUM doesn’t make the file smaller. That requires VACUUM FULL, which rewrites the entire table and locks it while doing so. So:

1.VACUUM: Cleans the mess quietly, no locking, but doesn’t shrink the file.
2.VACUUM FULL: Actually frees disk space — but blocks access during cleanup.

The Index Story: Fast Until It’s Not

You add indexes to speed up lookups, right? Great! But here’s the trap — indexes store pointers to specific row versions. So every time you update a row, Postgres adds a new pointer and keeps the old one hanging around. It doesn’t remove that old pointer until you rebuild the index. Over time, the index grows and grows… even if your data doesn’t.

This Is Called Index Bloat

Dead pointers in your indexes make them fat and slow. You still have to scan through them, even if half the entries point to dead rows. VACUUM won’t remove them — it just marks them as unusable. To fix that, you need REINDEX.

Let’s See Bloat in Action

This creates 1 million dead rows and 1 million dead index entries — just from a single update. That’s how easy it is to bloat your DB without realizing it.

1CREATE TABLE users(id SERIAL, name TEXT);
2CREATE INDEX idx_name ON users(name);
3
4INSERT INTO users(name)
5SELECT 'alice' FROM generate_series(1, 1000000);
6
7UPDATE users SET name = 'bob' WHERE name = 'alice';

How to Fix and Avoid Index Bloat

1.Make Sure Autovacuum Is Doing Its Job: Postgres runs autovacuum automatically — but you can (and should) tune the settings for high-write tables.
2.Use REINDEX CONCURRENTLY: Rebuild indexes in the background without locking the table. Super useful for production systems.
3.Don’t Update Unless You Have To: Even `UPDATE users SET status = 'active' WHERE status = 'active'` creates a new version. Avoid it.
4.Use fillfactor on indexes: Leave space in index pages for future updates. Example: `CREATE INDEX idx_name ON users(name) WITH (fillfactor = 80);`

How to Monitor Index Size and Bloat

You can use built-in views to keep an eye on your indexes:

1SELECT
2  relname AS index_name,
3  pg_size_pretty(pg_relation_size(indexrelid)) AS size
4FROM
5  pg_stat_user_indexes
6ORDER BY
7  pg_relation_size(indexrelid) DESC;

Want more detailed bloat info? Install the `pgstattuple` extension — it gives you stats like how much of your index is dead weight.

Wrap-Up: Your Database Has a Body — Keep It Clean

PostgreSQL is amazing, but it won’t clean up after itself unless you ask. MVCC gives you concurrency, but it leaves behind clutter. Indexes make reads fast, but without a little care, they become junk drawers. Schedule regular VACUUMs, monitor index sizes, use REINDEX, and don’t do unnecessary writes. Your future self (and your query planner) will thank you.

To share this blog on social media click on the icons below :

Also Read : Understanding PostgreSQL MVCC: How Postgres Handles Concurrency Gracefully

Blogs / Postgres / When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

Posted at 15-July-2025 / Written by Rohit Bhatt

30-sec summary

Let’s Clear the Air: Updates Don’t Overwrite Rows

Inside PostgreSQL: Pages, Tuples, and Metadata

1.xmin: Transaction ID that created the row.
2.xmax: Transaction ID that deleted or updated it.
3.ctid: Physical location of the row (page number + offset).
4.null bitmap: Optimized way to keep track of NULL columns without wasting space.

MVCC: Why PostgreSQL is So Dang Concurrent

Where the Trash Piles Up: Dead Tuples

Meet VACUUM: PostgreSQL’s Cleanup Crew

1.VACUUM: Cleans the mess quietly, no locking, but doesn’t shrink the file.
2.VACUUM FULL: Actually frees disk space — but blocks access during cleanup.

The Index Story: Fast Until It’s Not

This Is Called Index Bloat

Let’s See Bloat in Action

This creates 1 million dead rows and 1 million dead index entries — just from a single update. That’s how easy it is to bloat your DB without realizing it.

1CREATE TABLE users(id SERIAL, name TEXT);
2CREATE INDEX idx_name ON users(name);
3
4INSERT INTO users(name)
5SELECT 'alice' FROM generate_series(1, 1000000);
6
7UPDATE users SET name = 'bob' WHERE name = 'alice';

How to Fix and Avoid Index Bloat

1.Make Sure Autovacuum Is Doing Its Job: Postgres runs autovacuum automatically — but you can (and should) tune the settings for high-write tables.
2.Use REINDEX CONCURRENTLY: Rebuild indexes in the background without locking the table. Super useful for production systems.
3.Don’t Update Unless You Have To: Even `UPDATE users SET status = 'active' WHERE status = 'active'` creates a new version. Avoid it.
4.Use fillfactor on indexes: Leave space in index pages for future updates. Example: `CREATE INDEX idx_name ON users(name) WITH (fillfactor = 80);`

How to Monitor Index Size and Bloat

You can use built-in views to keep an eye on your indexes:

1SELECT
2  relname AS index_name,
3  pg_size_pretty(pg_relation_size(indexrelid)) AS size
4FROM
5  pg_stat_user_indexes
6ORDER BY
7  pg_relation_size(indexrelid) DESC;

Want more detailed bloat info? Install the `pgstattuple` extension — it gives you stats like how much of your index is dead weight.

Wrap-Up: Your Database Has a Body — Keep It Clean

To share this blog on social media click on the icons below :

Also Read : Understanding PostgreSQL MVCC: How Postgres Handles Concurrency Gracefully

When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

30-sec summary

Let’s Clear the Air: Updates Don’t Overwrite Rows

Inside PostgreSQL: Pages, Tuples, and Metadata

MVCC: Why PostgreSQL is So Dang Concurrent

Where the Trash Piles Up: Dead Tuples

Meet VACUUM: PostgreSQL’s Cleanup Crew

The Index Story: Fast Until It’s Not

This Is Called Index Bloat

Let’s See Bloat in Action

How to Fix and Avoid Index Bloat

How to Monitor Index Size and Bloat

Wrap-Up: Your Database Has a Body — Keep It Clean

To share this blog on social media click on the icons below :

Loading...

When Rows Don’t Die: MVCC, Index Bloat & How PostgreSQL Stores Your Data

30-sec summary

Let’s Clear the Air: Updates Don’t Overwrite Rows

Inside PostgreSQL: Pages, Tuples, and Metadata

MVCC: Why PostgreSQL is So Dang Concurrent

Where the Trash Piles Up: Dead Tuples

Meet VACUUM: PostgreSQL’s Cleanup Crew

The Index Story: Fast Until It’s Not

This Is Called Index Bloat

Let’s See Bloat in Action

How to Fix and Avoid Index Bloat

How to Monitor Index Size and Bloat

Wrap-Up: Your Database Has a Body — Keep It Clean

To share this blog on social media click on the icons below :

Recent Blogs

Monthly Archive

Category