Re: [PATCH] Generate random dates/times in a specified range

Damien Clochard <damien@dalibo.info>

From: Damien Clochard <damien@dalibo.info>
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers@lists.postgresql.org
Date: 2025-07-11T10:09:09Z
Lists: pgsql-hackers

Commits

Same data as JSON: GET /api/v1/messages/:b64id/commits the thread's linked commits as JSON, with link sources. API reference →
  1. doc: Improve description of new random(min, max) functions.

  2. Add date and timestamp variants of random(min, max).

Le 10.07.2025 00:14, Tom Lane a écrit :
> Damien Clochard <damien@dalibo.info> writes:
>> So this adds 5 new variants of the random() function:
> 
>>      random(min date, max date) returns date
>>      random(min time, max time) returns time
>>      random(min time, max time, zone text) returns timetz
>>      random(min timestamp, max timestamp) returns timestamp
>>      random(min timestamptz, max timestamptz) returns timestamptz
> 
> I'm a little uncomfortable with this proposal, mainly because it
> overloads the random() function name to the point where I'm afraid
> of "ambiguous function" failures in SQL code that used to be fine.
> 

Hi

Thanks for the feedback !

I agree with this, I overloaded the random() function because this is 
what was done previously with `random(int,int)` and I did the same like 
the good sheep that I am :)
but i'm fine with renaming this functions to daterandom, timerandom or 
whatever....

> The traditional way of achieving these results would be something like
> 
>     select now() + random() * interval '10 days';
> 
> and I'm not convinced that the use-case is so large as to justify
> adding built-in forms of that.
> 


 From my experience, when users are writing a set of masking rules, they 
tend to anonymize the dates with "a random date between start_date and 
end_date"

Which can be trasnlated like this

SELECT start_date+(random()*(end_date-start_date))::interval;

But when you have hundreds of masking rules, the meaning of this one is 
not so clear.


Now with PostgreSQL 18, we can write

SELECT random(start_date::int, end_date::int)::date;


Which is more explicit, but we could extend that logic to:

SELECT daterandom(start_date,end_date);


I agree this is merely syntactic sugar for the developers, but I don't 
see why it is ok to provide random(int,int) or random(numeric,numeric) 
and why random(date,date) is not.

Regards,

-- 
Damien Clochard