Re: Trying out read streams in pgvector (an extension)
From: Melanie Plageman <melanieplageman@gmail.com>
To: Thomas Munro <thomas.munro@gmail.com>
Cc: Peter Geoghegan <pg@bowt.ie>, Nazir Bilal Yavuz <byavuz81@gmail.com>, "Jonathan S. Katz" <jkatz@postgresql.org>,
pgsql-hackers <pgsql-hackers@postgresql.org>
Date: 2025-12-09T21:42:21Z
Lists: pgsql-hackers
Commits
- Add read_stream_{pause,resume}(): 38229cb90516, 19 (unreleased), landed

On Mon, Dec 8, 2025 at 10:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> I think it'd be better if that were the consumer's choice. I don't
> want the consumer to be required to drain the stream before resuming,
> as that'd be an unprincipled stall. For example, if new WAL arrives
> over the network then I think it should be possible for recovery's
> WAL-powered stream of heap pages to resume looking ahead even if
> recovery hasn't drained the existing stream completely.
>
> 1. read_stream_resume() as before, but with a new explicit
> read_stream_pause(): if a block number callback would like to report a
> temporary lack of information, it should return
> read_stream_pause(stream), not InvalidBlockNumber. Then after
> read_stream_resume(stream) is called, the next
> read_stream_next_buffer() enters the lookahead loop again. While
> paused, if the consumer drains all the existing buffers in the stream
> and then one more, it will receive InvalidBuffer, but if the _resume()
> call is made sooner, the consumer won't ever know about the temporary
> lack of buffers in the stream.

I like this new interface. If the user does want to exhaust the stream
(as was the case with earlier pgvector read stream user code), I assume
you would want to do:

    read_stream_pause()
    read_stream_reset()
    read_stream_resume()

- Melanie
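
For anyone following along, here is a rough standalone model of the pause/resume behaviour described in the quoted text. It is only a sketch and does not use the PostgreSQL headers: every name in it (ToyStream, ToySource, toy_stream_pause(), toy_stream_resume(), toy_stream_next(), TOY_INVALID_BLOCK) is hypothetical, and plain ints stand in for block numbers and buffers. The assumption is that a callback returning the "invalid" value while paused means "nothing for now", whereas returning it without pausing ends the stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_INVALID_BLOCK (-1)   /* hypothetical stand-in for InvalidBlockNumber */
#define TOY_QUEUE_SIZE 8         /* hypothetical lookahead distance */

typedef struct ToyStream ToyStream;
typedef int (*ToyNextBlockCB)(ToyStream *stream, void *arg);

struct ToyStream {
    ToyNextBlockCB next_block;   /* block number callback */
    void *arg;                   /* callback private data */
    int queue[TOY_QUEUE_SIZE];   /* blocks already looked ahead */
    int head, count;
    bool paused;                 /* callback reported a temporary lack of blocks */
    bool ended;                  /* callback reported a permanent end */
};

/* Called from the callback: mark the stream paused and hand back "no block". */
static int toy_stream_pause(ToyStream *s) {
    s->paused = true;
    return TOY_INVALID_BLOCK;
}

/* Called by the consumer once more blocks may be available. */
static void toy_stream_resume(ToyStream *s) {
    s->paused = false;
}

static void toy_stream_init(ToyStream *s, ToyNextBlockCB cb, void *arg) {
    assert(cb != NULL);
    s->next_block = cb;
    s->arg = arg;
    s->head = s->count = 0;
    s->paused = s->ended = false;
}

/* Look ahead unless paused or ended, then return the oldest queued block. */
static int toy_stream_next(ToyStream *s) {
    while (!s->paused && !s->ended && s->count < TOY_QUEUE_SIZE) {
        int blk = s->next_block(s, s->arg);
        if (blk == TOY_INVALID_BLOCK) {
            if (!s->paused)
                s->ended = true; /* invalid without a pause: real end */
            break;
        }
        s->queue[(s->head + s->count) % TOY_QUEUE_SIZE] = blk;
        s->count++;
    }
    if (s->count == 0)
        return TOY_INVALID_BLOCK; /* drained while paused, or ended */
    int blk = s->queue[s->head];
    s->head = (s->head + 1) % TOY_QUEUE_SIZE;
    s->count--;
    return blk;
}

/* Hypothetical producer: blocks [0, available) are known now, total is final. */
typedef struct ToySource { int next; int available; int total; } ToySource;

static int toy_source_cb(ToyStream *stream, void *arg) {
    ToySource *src = arg;
    if (src->next < src->available)
        return src->next++;
    if (src->next < src->total)
        return toy_stream_pause(stream); /* more expected later */
    return TOY_INVALID_BLOCK;            /* nothing more, ever */
}
```

With a ToySource of { 0, 2, 4 }, the first toy_stream_next() call queues blocks 0 and 1 and then pauses. If the consumer raises available to 4 and calls toy_stream_resume() before draining the queue, it reads 0, 1, 2, 3 in order and never sees the gap, which is the case the quoted text describes.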