After you delete records from a table in PostgreSQL, the vacuum command can reclaim the space occupied by the dead tuples. So how does vacuum reclaim that space? And is normal use of the table's indexes affected afterwards?
To answer these questions, we first need to know how space is allocated inside a heap page in PostgreSQL. Every tuple on a page is reached through a line pointer (lp), whose content is the OFFSET of that tuple within the page. Line pointers have a fixed length and sit in an array directly behind the fixed-size page header. Keeping them at fixed positions makes tuples easy to find (for example, the second part of a ctid is the lp number).
Line pointers are allocated from the page head, tuples are allocated from the page tail, and each lp points to the starting position of its tuple.
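The two-ended allocation can be sketched with a small Python model. This is only an illustration, not PostgreSQL code: the function name `fill_page` is hypothetical, the 24-byte page header and 4-byte line pointer are the standard sizes, and the 544-byte tuple length is taken from the spacing of the offsets shown in the example below (7648 - 7104).

```python
PAGE_SIZE = 8192      # default PostgreSQL block size
PAGE_HEADER = 24      # fixed page header size in bytes
LP_SIZE = 4           # size of one line pointer in bytes


def fill_page(tuple_len, special=PAGE_SIZE):
    """Add equal-sized tuples until the next one no longer fits."""
    lower = PAGE_HEADER   # end of the line pointer array
    upper = special       # start of the tuple area
    offsets = []          # offsets[i] is lp_off for lp number i + 1
    # A new tuple needs room for its data plus one more line pointer.
    while upper - lower >= tuple_len + LP_SIZE:
        upper -= tuple_len    # tuples grow down from the page tail
        lower += LP_SIZE      # line pointers grow up from the header
        offsets.append(upper)
    return lower, upper, offsets


lower, upper, offsets = fill_page(544)
print(lower, upper, len(offsets))   # 80 576 14
print(offsets[:3])                  # [7648, 7104, 6560]
```

Under these assumptions the model stops after 14 tuples with lower = 80 and upper = 576, which matches the page_header output in the example.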
Examples:
1. Create a test table
bill@bill=>create table c (id int, info text);
CREATE TABLE
2. Install the pageinspect plugin
bill@bill=>create extension pageinspect ;
CREATE EXTENSION
3. Insert data
bill@bill=>insert into c select id, repeat(md5(random()::text), 16) from generate_series(1,20) t(id);
INSERT 0 20
4. View the size of the variable-length field. Each value in the info column takes 516 bytes.
bill@bill=>select pg_column_size(repeat(md5(random()::text), 16));
pg_column_size
----------------
516
(1 row)
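As a rough cross-check, the 516 bytes and the resulting on-page tuple length can be worked out by hand. The sketch below rests on assumptions that are not stated in the text: a 23-byte heap tuple header padded to 24 bytes, no null bitmap, a 4-byte varlena header, 8-byte maximum alignment, and a value stored inline without compression. The helper names `align` and `heap_tuple_len` are hypothetical.

```python
MAXALIGN = 8            # assumed maximum alignment on a 64-bit platform
HEAP_TUPLE_HEADER = 23  # heap tuple header size before padding
VARLENA_HEADER = 4      # length word in front of a long text value


def align(n, boundary):
    """Round n up to the next multiple of boundary."""
    return (n + boundary - 1) // boundary * boundary


def heap_tuple_len(text_bytes):
    """Estimated length of an (int, text) tuple with no NULL columns."""
    off = align(HEAP_TUPLE_HEADER, MAXALIGN)            # user data starts at 24
    off = align(off, 4) + 4                             # the int column
    off = align(off, 4) + VARLENA_HEADER + text_bytes   # the text column
    return align(off, MAXALIGN)


text_bytes = 32 * 16    # md5() returns 32 hex characters, repeated 16 times
print(VARLENA_HEADER + text_bytes)   # 516
print(heap_tuple_len(text_bytes))    # 544
```

The 544-byte result agrees with the distance between neighboring lp_off values in the heap_page_items output further down.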
5. Check the header of the first page. The line pointer array has grown to byte 80 (lower) and the tuple area starts at byte 576 (upper), so only 496 bytes of free space are left on the page, which is not enough for another 544-byte tuple.
bill@bill=>SELECT * FROM page_header(get_raw_page('c', 0));
lsn | checksum | flags | lower | upper | special | pagesize | version | prune_xid
-------------+----------+-------+-------+-------+---------+----------+---------+-----------
19/1621C910 | 0 | 0 | 80 | 576 | 8192 | 8192 | 4 | 0
(1 row)
6. View the offset of each record tuple within the page
bill@bill=>select lp,lp_off from heap_page_items(get_raw_page('c', 0));
lp | lp_off
----+--------
1 | 7648
2 | 7104
3 | 6560
4 | 6016
5 | 5472
6 | 4928
7 | 4384
8 | 3840
9 | 3296
10 | 2752
11 | 2208
12 | 1664
13 | 1120
14 | 576
(14 rows)
7. Delete 13 records
bill@bill=>delete from c where ctid not in ('(0,1)','(0,3)','(0,5)','(0,7)','(0,9)','(0,11)','(0,13)');
DELETE 13
8. View the lp and lp offset information again. Nothing has changed, because no garbage collection has taken place yet.
bill@bill=>select lp,lp_off from heap_page_items(get_raw_page('c', 0));
lp | lp_off
----+--------
1 | 7648
2 | 7104
3 | 6560
4 | 6016
5 | 5472
6 | 4928
7 | 4384
8 | 3840
9 | 3296
10 | 2752
11 | 2208
12 | 1664
13 | 1120
14 | 576
(14 rows)
9. Reclaim the space with vacuum
bill@bill=>vacuum VERBOSE c;
INFO: vacuuming "bill.c"
INFO: "c": removed 13 row versions in 2 pages
INFO: "c": found 13 removable, 7 nonremovable row versions in 2 out of 2 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 12076871
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "c": truncated 2 to 1 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: vacuuming "pg_toast.pg_toast_226382"
INFO: index "pg_toast_226382_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "pg_toast_226382": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 12076872
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
10. View the lp and lp offset information again after garbage collection. The line pointers are still there, but note that the offsets stored in them have changed: the surviving tuples have been moved, so they are now compact and there are no holes between them.
The offsets of the line pointers that belonged to the deleted tuples have all become 0.
bill@bill=>select lp,lp_off from heap_page_items(get_raw_page('c', 0));
lp | lp_off
----+--------
1 | 7648
2 | 0
3 | 7104
4 | 0
5 | 6560
6 | 0
7 | 6016
8 | 0
9 | 5472
10 | 0
11 | 4928
12 | 0
13 | 4384
14 | 0
(14 rows)
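The new offsets can be reproduced with a simplified model of the compaction step. This is a sketch under stated assumptions, not PostgreSQL's actual routine: the function name `compact` is hypothetical, every tuple is taken to be 544 bytes long, and survivors are repacked from the page tail in their original physical order.

```python
PAGE_SIZE = 8192


def compact(lp_off, lp_len, dead):
    """Return (new lp_off list, new upper) after removing dead tuples."""
    new_off = [0] * len(lp_off)   # dead line pointers keep offset 0
    upper = PAGE_SIZE
    live = [i for i in range(len(lp_off)) if (i + 1) not in dead]
    # Repack survivors starting with the one closest to the page tail,
    # so their relative order on the page is preserved.
    for i in sorted(live, key=lambda i: lp_off[i], reverse=True):
        upper -= lp_len[i]
        new_off[i] = upper
    return new_off, upper


before = [7648 - 544 * i for i in range(14)]   # offsets before vacuum
dead = {2, 4, 6, 8, 10, 12, 14}                # lp numbers deleted on page 0
after, upper = compact(before, [544] * 14, dead)
print(after)
print(upper)
```

In this model the line pointer array keeps all 14 entries, only the stored offsets change, and the tuple area now starts at 4384 instead of 576.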
11. Insert a new record and check the page again. lp number 2 is reused, and its OFFSET is 2432 bytes.
bill@bill=>insert into c select 100, repeat(md5(random()::text), 60);
INSERT 0 1
bill@bill=>select lp,lp_off from heap_page_items(get_raw_page('c', 0));
lp | lp_off
----+--------
1 | 7648
2 | 2432
3 | 7104
4 | 0
5 | 6560
6 | 0
7 | 6016
8 | 0
9 | 5472
10 | 0
11 | 4928
12 | 0
13 | 4384
14 | 0
(14 rows)
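The offset of 2432 follows from the size of the new tuple. The sketch below assumes the tuple is stored inline and uncompressed with a 24-byte header, a 4-byte int and a 4-byte varlena header, and that the first unused line pointer is the one reused, as the output above suggests. The function name `insert_tuple` is hypothetical.

```python
def insert_tuple(lp_off, upper, tuple_len):
    """Place a tuple below upper, reusing the first unused line pointer."""
    new_off = upper - tuple_len
    for i, off in enumerate(lp_off):
        if off == 0:                  # an lp freed by vacuum
            lp_off[i] = new_off
            return i + 1, new_off
    lp_off.append(new_off)            # no free lp: extend the array
    return len(lp_off), new_off


lp_off = [7648, 0, 7104, 0, 6560, 0, 6016, 0, 5472, 0, 4928, 0, 4384, 0]
upper = min(off for off in lp_off if off)   # lowest tuple start: 4384
tuple_len = 24 + 4 + (4 + 32 * 60)          # header + int + text = 1952
lp, off = insert_tuple(lp_off, upper, tuple_len)
print(lp, off)   # 2 2432
```

Because a free line pointer was available, the pointer array did not have to grow for this insert.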
Summary:
1. During garbage collection, the line pointers themselves stay where they are, so index entries that reference them do not need to change.
2. Moving a row within the page only changes the offset stored in its lp. The advantage is that the free space inside the page is not left fragmented, which improves space utilization.
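Why the index is unaffected can be shown with a toy lookup. Everything here is hypothetical: the test table has no index, so the dictionary `index` stands in for an imagined index on id whose entries hold a ctid (page number, lp number), and the tuple contents are placeholders.

```python
# Hypothetical index on id: key -> ctid (page number, lp number).
index = {1: (0, 1), 3: (0, 3), 5: (0, 5)}

# Line pointer arrays (lp number -> offset) before and after vacuum.
lp_before = {1: 7648, 3: 6560, 5: 5472}
lp_after = {1: 7648, 3: 7104, 5: 6560}

# Tuple area (offset -> row) before and after the rows were moved.
tuples_before = {7648: "row 1", 6560: "row 3", 5472: "row 5"}
tuples_after = {7648: "row 1", 7104: "row 3", 6560: "row 5"}


def fetch(key, line_pointers, tuples):
    """Follow index entry -> line pointer -> tuple offset."""
    _page, lp = index[key]
    return tuples[line_pointers[lp]]


for key in index:
    # The index is never modified, yet every lookup still resolves.
    print(key, fetch(key, lp_before, tuples_before),
          fetch(key, lp_after, tuples_after))
```

The index only knows lp numbers, so the extra level of indirection absorbs the move of the tuples.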