Re: Should we optimize the `ORDER BY random() LIMIT x` case?

Andrei Lepikhov <lepihov@gmail.com>

From: Andrei Lepikhov <lepihov@gmail.com>
To: Aleksander Alekseev <aleksander@timescale.com>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Cc: Nico Williams <nico@cryptonector.com>, Tom Lane <tgl@sss.pgh.pa.us>, wenhui qiu <qiuwenhuifx@gmail.com>, Vik Fearing <vik@postgresfriends.org>
Date: 2025-05-19T18:04:06Z
Lists: pgsql-hackers

On 5/19/25 12:25, Aleksander Alekseev wrote:
> ```
> -- imagine replacing inefficient array_sample(array_agg(t), 10)
> -- with more efficient array_sample_reservoir(t, 10)
> SELECT (unnest(agg)).* AS k FROM
> (  SELECT array_sample(array_agg(t), 10) AS agg FROM (
>     ... here goes the subquery ...
>     ) AS t
> );
> ```
> 
> ... if only we supported such a column expansion for not registered
> records. Currently such a query fails with:
> 
> ```
> ERROR:  record type has not been registered
> ```

I know about this issue. I have resolved it in a limited number of local
cases (such as FDW push-down of row types), but I still do not have a
universal solution worth proposing upstream. Do you have a public
implementation of array_sample_reservoir that I could play with?
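
For context, here is a minimal sketch of the single-pass technique that
the name array_sample_reservoir suggests: classic reservoir sampling
(Algorithm R). It is illustrative Python, not PostgreSQL code;
array_sample_reservoir is only a name proposed in this thread, and
reservoir_sample, rows, k and rng below are hypothetical names chosen
for the example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    rows: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items drawn uniformly at random from rows.

    Hypothetical helper: one pass over the input, O(k) memory, so the
    whole input never has to be collected into an array first.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Row i (0-based) replaces a random slot with probability
            # k / (i + 1); randint is inclusive on both ends.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir
```

Calling reservoir_sample(range(1000), 10) returns 10 distinct values
from the range without materializing it, which is the property that
would make such an aggregate cheaper than array_sample(array_agg(t), 10).
The returned order is not shuffled, so a caller that needs a random
order would have to shuffle the result separately.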

-- 
regards, Andrei Lepikhov