Thread

  1. array_random

    jian he <jian.universality@gmail.com> — 2025-06-30T15:04:25Z

    hi.
    
    context: [1].
    the attached patch already posted in [1].
    I don't want to hijack another thread. so I post it in a separate thread.
    
    The attached patch introduces a new function: array_random.
    array_random description:
    Returns an array filled with random values in the range min <= x <= max,
    having dimensions of the lengths specified by dims. The optional lbounds
    argument supplies lower-bound values for each dimension (which default
     to all 1).
    
    array_random function signature:
    array_random(min int4, max int4, dims int[] [, lbounds int[]]) -> int[]
    array_random(min int8, max int8, dims int[] [, lbounds int[]]) -> int8[]
    array_random(min numeric, max numeric, dims int[] [, lbounds int[]])
    -> numeric[]
    
    demo:
    SELECT array_random(1, 6, array[2,5], array[2,4]);
                 array_random
    --------------------------------------
     [2:3][4:8]={{6,2,2,5,4},{4,5,6,4,6}}
    
    reasons for adding array_random is:
    1. This is better than array_fill. This can fill random and constant
    values (random, min and max the same).
    2.  Building a multi-dimensional PL/pgSQL function equivalent to
    array_random is not efficient and is also not easier.
    
    [1] https://www.postgresql.org/message-id/CACJufxGRCP19Rm66%3DTSBwmEuVr92FwL_e6YFjmCpJrgu6Km9hQ%40mail.gmail.com
    
  2. Re: array_random

    jian he <jian.universality@gmail.com> — 2025-07-05T03:35:28Z

    On Mon, Jun 30, 2025 at 11:04 PM jian he <jian.universality@gmail.com> wrote:
    >
    > demo:
    > SELECT array_random(1, 6, array[2,5], array[2,4]);
    >              array_random
    > --------------------------------------
    >  [2:3][4:8]={{6,2,2,5,4},{4,5,6,4,6}}
    >
    > reasons for adding array_random is:
    > 1. This is better than array_fill. This can fill random and constant
    > values (random, min and max the same).
    > 2.  Building a multi-dimensional PL/pgSQL function equivalent to
    > array_random is not efficient and is also not easier.
    >
    > [1] https://www.postgresql.org/message-id/CACJufxGRCP19Rm66%3DTSBwmEuVr92FwL_e6YFjmCpJrgu6Km9hQ%40mail.gmail.com
    
    i didn't use ./src/include/catalog/duplicate_oids
    then there is some duplicated oid issue.
    the attached patch fixes this issue.
    
  3. Re: array_random

    Vik Fearing <vik@postgresfriends.org> — 2025-07-05T07:32:24Z

    On 30/06/2025 17:04, jian he wrote:
    > reasons for adding array_random is:
    > 1. This is better than array_fill. This can fill random and constant
    > values (random, min and max the same).
    > 2.  Building a multi-dimensional PL/pgSQL function equivalent to
    > array_random is not efficient and is also not easier.
    
    
    I am not against this at all, but what is the actual use case?
    
    -- 
    
    Vik Fearing
    
  4. Re: array_random

    Dean Rasheed <dean.a.rasheed@gmail.com> — 2025-07-07T07:53:46Z

    On Sat, 5 Jul 2025 at 08:32, Vik Fearing <vik@postgresfriends.org> wrote:
    >
    > On 30/06/2025 17:04, jian he wrote:
    >
    > reasons for adding array_random is:
    > 1. This is better than array_fill. This can fill random and constant
    > values (random, min and max the same).
    > 2.  Building a multi-dimensional PL/pgSQL function equivalent to
    > array_random is not efficient and is also not easier.
    
    I took a quick look at this and there are a number of things that I
    don't like about the implementation. However, ...
    
    > I am not against this at all, but what is the actual use case?
    
    I think that's a reasonable question to ask.
    
    The original use case [1] was to produce arrays of random numbers for
    testing gin indexes.
    
    [1] https://www.postgresql.org/message-id/87plssezpc.fsf%40163.com
    
    However, this is adding quite a lot of C code for something that's
    actually pretty easy to do as a user-defined function in SQL. Maybe
    that's OK, if it's something that there's a lot of demand for, but
    it's worth asking that question.
    
    Regards,
    Dean
    
    
    
    
  5. Re: array_random

    jian he <jian.universality@gmail.com> — 2025-07-08T07:48:07Z

    On Sat, Jul 5, 2025 at 3:32 PM Vik Fearing <vik@postgresfriends.org> wrote:
    >
    > On 30/06/2025 17:04, jian he wrote:
    >
    > reasons for adding array_random is:
    > 1. This is better than array_fill. This can fill random and constant
    > values (random, min and max the same).
    > 2.  Building a multi-dimensional PL/pgSQL function equivalent to
    > array_random is not efficient and is also not easier.
    >
    >
    > I am not against this at all, but what is the actual use case?
    >
    > --
    
    it seems not trivial to wrap up all the generated random values into a specific
    multi-dimensional array (more than 2 dimensions).
    for example, say we generated 24 random values and wanted to arrange them into a
    3-dimensional array with shape [4, 3, 2].
    we can easily use:
    SELECT array_random(1, 6, array[4,3, 2]);
    
    of course, we can use plpgsql to do it, but the c function would be
    more convenient.
    does this make sense?
    
    
    
    
  6. Re: array_random

    Aleksander Alekseev <aleksander@tigerdata.com> — 2025-07-08T08:45:44Z

    Hi,
    
    > it seems not trivial to wrap up all the generated random values into a specific
    > multi-dimensional array (more than 2 dimensions).
    > for example, say we generated 24 random values and wanted to arrange them into a
    > 3-dimensional array with shape [4, 3, 2].
    > we can easily use:
    > SELECT array_random(1, 6, array[4,3, 2]);
    >
    > of course, we can use plpgsql to do it, but the c function would be
    > more convenient.
    > does this make sense?
    
    The proposed function seems to do two things at a time - generating
    random values and transforming them into an array of desired
    dimensions. Generally we try to avoid such interfaces. Can you think
    of something like array_transform() / array_reshape() that takes an
    arbitrary single-dimension array and modifies it?
    
    
    
    
  7. Re: array_random

    Dean Rasheed <dean.a.rasheed@gmail.com> — 2025-07-11T07:49:09Z

    On Tue, 8 Jul 2025 at 15:26, Aleksander Alekseev
    <aleksander@tigerdata.com> wrote:
    >
    > The proposed function seems to do two things at a time - generating
    > random values and transforming them into an array of desired
    > dimensions. Generally we try to avoid such interfaces. Can you think
    > of something like array_transform() / array_reshape() that takes an
    > arbitrary single-dimension array and modifies it?
    
    That's a good point. Arguably, creating a simple 1-D array of random
    values is trivial enough to leave to users, and there isn't sufficient
    demand to justify creating core functions for it.
    
    OTOH, creating multi-dimensional arrays (of any values, not just
    random) is harder. Perhaps there is an argument for in-core functions
    to make that easier, though I'm not sure how many people actually use
    multi-dimensional arrays. In any case, that seems like a separate
    discussion for a new thread, if there's demand.
    
    Regards,
    Dean
    
    
    
    
  8. Re: array_random

    jian he <jian.universality@gmail.com> — 2025-07-21T03:15:36Z

    On Fri, Jul 11, 2025 at 3:49 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
    >
    > On Tue, 8 Jul 2025 at 15:26, Aleksander Alekseev
    > <aleksander@tigerdata.com> wrote:
    > >
    > > The proposed function seems to do two things at a time - generating
    > > random values and transforming them into an array of desired
    > > dimensions. Generally we try to avoid such interfaces. Can you think
    > > of something like array_transform() / array_reshape() that takes an
    > > arbitrary single-dimension array and modifies it?
    >
    > That's a good point. Arguably, creating a simple 1-D array of random
    > values is trivial enough to leave to users, and there isn't sufficient
    > demand to justify creating core functions for it.
    >
    use sql, we generally do something like:
    select array_agg(random(1, 10)) from generate_series(1, 2) g;
    but its performance is worse than array_random.
    
    Does performance and other factors justify adding array_random?
    
    
    
    
  9. Re: array_random

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-07-21T03:49:12Z

    jian he <jian.universality@gmail.com> writes:
    > Does performance and other factors justify adding array_random?
    
    There hasn't really been field demand for such a function, AFAIR.
    There's an infinite amount of stuff that would be faster if coded
    in C than if written in SQL or plpgsql.  We can't support it all,
    so we need some fairly high bar to accepting what functionality
    we'll support.
    
    In the case at hand, "fill a 1-D array with random values" is
    quite trivial to do in either SQL or plpgsql.  However, "reshape
    an array to have such-and-such dimensions" isn't.  So in the abstract
    I'd be more sympathetic to inventing a function that can do reshaping.
    But there is still the question of whether anybody really needs that
    functionality.
    
    			regards, tom lane