Re: In-placre persistance change of a relation
Kyotaro Horiguchi <horikyota.ntt@gmail.com>
From: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
To: sfrost@snowman.net
Cc: osumi.takamichi@fujitsu.com, masao.fujii@oss.nttdata.com,
ashutosh.bapat.oss@gmail.com, tsunakawa.takay@fujitsu.com,
pgsql-hackers@lists.postgresql.org
Date: 2020-11-13T04:22:01Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
pg_dump: Refactor getIndexes()
- e2c52beecdea 15.0 cited
-
Optimize DropRelFileNodesAllBuffers() for recovery.
- bea449c635c0 14.0 cited
Attachments
- v3-0001-In-place-table-persistence-change.patch (text/x-patch)
Hello. Before posting the next version, I'd like to explain what this patch is. 1. The Issue Bulk data loading is a long-time taking, I/O consuming task. Many DBAs want that task is faster, even at the cost of increasing risk of data-loss. wal_level=minimal is an answer to such a request. Data-loading onto a table that is created in the current transaction omits WAL-logging and synced at commit. However, the optimization doesn't benefit the case where the data-loading is performed onto existing tables. There are quite a few cases where data is loaded into tables that already contains a lot of data. Those cases don't take benefit of the optimization. Another possible solution for bulk data-loading is UNLOGGED tables. But when we switch LOGGED/UNLOGGED of a table, all the table content is copied to a newly created heap, which is costly. 2. Proposed Solutions. There are two proposed solutions are discussed on this mailing list. One is wal_level = none (*1), which omits WAL-logging almost at all. Another is extending the existing optimization to the ALTER TABLE SET LOGGED/UNLOGGED cases, which is to be discussed in this new thread. 3. In-place Persistence Change So the attached is a PoC patch of the "another" solution. When we want to change table persistence in-place, basically we need to do the following steps. (the talbe is exclusively locked) (1) Flip BM_PERMANENT flag of all shared buffer blocks for the heap. (2) Create or delete the init fork for existing heap. (3) Flush all buffers of the relation to file system. (4) Sync heap files. (5) Make catalog changes. 4. Transactionality The 1, 2 and 5 above need to be abort-able. 5 is rolled back by existing infrastructure, and rolling-back of 1 and 2 are achieved by piggybacking on the pendingDeletes mechanism. 5. Replication Furthermore, that changes ought to be replicable to standbys. Catalog changes are replicated as usual. On-the-fly creation of the init fork leads to recovery mess. Even though it is removed at abort, if the server crashed before transaction end, the file is left alone and corrupts database in the next recovery. I sought a way to create the init fork in smgrPendingDelete but that needs relcache and relcache is not available at that late of commit. Finally, I introduced the fifth fork kind "INITTMP"(_itmp) only to signal that the init file is not committed. I don't like that way but it seems working fine... 6. SQL Command The second file in the patchset adds a syntax that changes persistence of all tables in a tablespace. ALTER TABLE ALL IN TABLESPACE <tsp> SET LOGGED/UNLOGGED [ NOWAIT ]; 7. Testing I tried to write TAP test for this, but IPC::Run::harness (or interactive_psql) doesn't seem to work for me. I'm not sure what exactly is happening but pty redirection doesn't work. $in = "ls\n"; $out = ""; run ["/usr/bin/bash"], \$in, \$out; print $out; works but $in = "ls\n"; $out = ""; run ["/usr/bin/bash"], '<pty<', \$in, '>pty>', \$out; print $out; doesn't respond. The patch is attached. regards. -- Kyotaro Horiguchi NTT Open Source Software Center