Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Andrei Lepikhov <lepihov@gmail.com>
From: Andrei Lepikhov <lepihov@gmail.com>
To: Aleksander Alekseev <aleksander@timescale.com>,
PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Cc: wenhui qiu <qiuwenhuifx@gmail.com>
Date: 2025-05-15T09:32:35Z
Lists: pgsql-hackers
On 15/5/2025 11:17, Aleksander Alekseev wrote: >> What kind of optimisation trick may the optimiser use here to provide an >> optimal plan? As I see it, it will need to think that all the tuples >> should be returned from the subquery. The only profit is to skip sorting >> the massive sample. > > Doesn't look like a generic optimization trick will help us. I was > thinking about a custom aggregate function, e.g. `SELECT sample(*, 10) > ...`. However I doubt that aggregate functions are flexible enough. Or > alternatively a rewrite rule. I never dealt with those before so I > have no idea what I'm talking about :D A custom SRF seems great to me. You may propose such an aggregate in the core - it seems it doesn't even need any syntax changes. For example: SELECT * FROM (SELECT sample(q, 10, <type>) FROM (SELECT ...) AS q); or something like that. -- regards, Andrei Lepikhov