Re: Sequence Access Methods, round two
Michael Paquier <michael@paquier.xyz>
From: Michael Paquier <michael@paquier.xyz>
To: Peter Eisentraut <peter@eisentraut.org>
Cc: Tomas Vondra <tomas.vondra@enterprisedb.com>, Peter Smith <smithpb2250@gmail.com>, Postgres hackers <pgsql-hackers@lists.postgresql.org>
Date: 2024-04-26T06:21:29Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
Refactor init_params() in sequence.c to not use FormData_pg_sequence_data
- ba3d93b2e806 19 (unreleased) landed
-
Fix comment thinko in sequence.c
- 17a3f79f812c 17.0 landed
-
Group more closely cache updates for backends in sequence.c
- 6e951bf98e2e 17.0 landed
-
Introduce sequence_*() access functions
- 449e798c77ed 17.0 landed
Attachments
- v5-0001-Switch-pg_sequence_last_value-to-report-a-tuple-a.patch (text/x-diff) patch v5-0001
- v5-0002-Remove-FormData_pg_sequence_data-from-init_params.patch (text/x-diff) patch v5-0002
- v5-0003-Integrate-addition-of-attributes-for-sequences-wi.patch (text/x-diff) patch v5-0003
- v5-0004-Refactor-code-for-in-core-local-sequences.patch (text/x-diff) patch v5-0004
- v5-0005-Sequence-access-methods-backend-support.patch (text/x-diff) patch v5-0005
- v5-0006-Sequence-access-methods-dump-restore-support.patch (text/x-diff) patch v5-0006
- v5-0007-Sequence-access-methods-core-documentation.patch (text/x-diff) patch v5-0007
- v5-0008-snowflake-Add-sequence-AM-based-on-it.patch (text/x-diff) patch v5-0008
On Fri, Apr 19, 2024 at 04:00:28PM +0900, Michael Paquier wrote: > I have plans to rework this patch set for the next commit fest, > and this includes some investigation about custom data types that > could be plugged into these AMs. So, I have worked more on this patch set, and finished by reorganizing it more, with more things: - The in-core sequence access method is split into more files: -- One for its callbacks, called seqlocalam.c. -- The WAL replay routines are moved into their own file. - As asked, Implementation of a contrib module that introduces a sequence access method for snowflake IDs, to demonstrate what can be done using the APIs of the patch. The data of such sequences is stored in an unlogged table, based on the assumption that the timestamps and the machine IDs ensure the unicity of the IDs for the sequences. The advantage of what's presented here is that support for lastval(), nextval() and currval() is straight-forward. Identity columns are able to feed on that. cache is handled by sequence.c, not the AM. WAL-logging is needed for the init fork, it goes through the generic WAL APIs like bloom to log full pages. Some docs are included. This is still a rough WIP, though, and the buffer handling is not optimal, and could be made transactional this time (assuming autovacuum is able to process them at some point, or perhaps the sequence AMs should offer a way for autovacuum to know if such sequences should be cleaned up or not). After having done that, I've also found about a module developed by pgEdge, that copies a bunch of the code from sequence.c, though it is not able to handle the sequence cache: https://github.com/pgEdge/snowflake The approach this module uses is quite similar to what I have here, but it is smarter regarding clock ticking, where the internal sequence counter is bumped only when we fetch the same timestamp as a previous attempt. The module presented could be switched to do something similar by storing into the heap table used by the sequence a bit more data than just the sequence counter. Well, the point I want to make at this stage is what can be done with sequence AMs, so let's discuss about that later. Finally, custom types, where I have come up with a list of open questions: - Catalog representation. pg_sequence and pg_sequences switch to something else than int64. - The existing functions are also interesting to consider here. nextval() & co would not be usable as they are for sequence AMs that use more than int64. Note that the current design taken in the patch has a strong dependency on int64 (see sequenceam.h). So the types would need to reflect. With identity columns, the change would not be that hard as the executor has NextValueExpr. Perhaps each sequence AM should just have callback equivalents for currval(), nextval() and lastval(). This hits with the fact that this makes sequence AMs less transparent to applications because custom data types means different functions than the native ones. - Option representation. - I have polled Twitter and Fosstodon with four choices: -- No need for that, 64b representation is enough. -- Possibility to have integer-like types (MSSQL does something like that). -- Support for 128b or larger (UUIDs, etc, with text-like representation or varlenas). -- Support for binary representation, which comes down to the possibility of having sequence values even larger than 128b. Based on the first estimations, 50%-ish of people mentioned than 64b is more than enough, while Jelte mentioned that Citus has tackled this problem with an equivalent of 128b (64b for the sequence values, some more for machine states). Then there's a trend of 25%-ish in favor of 128b and 25%-ish for more than that. The results are far from being final, but that's something. My own take after pondering about it is that 64b is still more than enough for the clustering cases I've seen in the past 15 years or so, while offering room for implementations even if it comes to thousands of nodes. So there's some margin depending on the number of bits reserved for the "machine" part of the sequence IDs when used in clusters. The next plan is to hopefully be able to trigger a discussion at the next pgconf.dev at the end of May, but let's see how it goes. Thanks, -- Michael