Thread

  1. Possible performance issues: DISTINCT ON + ORDER BY and JIT with aggregation

    Jhonathan Cruz <jhonathancruz48@gmail.com> — 2025-12-17T00:25:08Z

    Hello,
    
    I would like to report two query execution behaviors that may indicate
    performance issues or regressions. I am not certain whether these are
    known or expected behaviors, so I would appreciate guidance.
    
    ----------------------------------------------------------------------
    1) DISTINCT ON combined with ORDER BY
    ----------------------------------------------------------------------
    
    I observed cases where queries using DISTINCT ON together with ORDER BY
    produce execution plans with explicit sorting steps, even when the ordering
    requirements are clearly defined.
    
    Minimal reproduction:
    
    CREATE TABLE t (
        id bigint,
        grp int,
        created_at timestamp
    );
    
    INSERT INTO t
    SELECT
        g,
        g % 100,
        now() - (g || ' seconds')::interval
    FROM generate_series(1, 500000) g;
    
    ANALYZE t;
    
    Query:
    
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT DISTINCT ON (grp)
           grp, id, created_at
    FROM t
    ORDER BY grp, created_at DESC;
    
    Example EXPLAIN ANALYZE output:
    
    Unique  (cost=45231.12..47731.12 rows=100 width=24)
            (actual time=182.413..198.721 rows=100 loops=1)
      Buffers: shared hit=12845
      ->  Sort  (cost=45231.12..46481.12 rows=500000 width=24)
                (actual time=182.410..191.256 rows=500000 loops=1)
            Sort Key: grp, created_at DESC
            Sort Method: quicksort  Memory: 51200kB
            Buffers: shared hit=12845
            ->  Seq Scan on t  (cost=0.00..11231.00 rows=500000 width=24)
                  (actual time=0.012..38.117 rows=500000 loops=1)
                  Buffers: shared hit=12845
    Planning Time: 0.213 ms
    Execution Time: 202.981 ms
    
    Observed behavior:
    - Explicit Sort node before Unique
    - Full sort over all rows
    - Execution time higher than expected
    
    Question:
    - Is this planner behavior expected for DISTINCT ON queries, or could
      this case potentially avoid a full sort?
    
    ----------------------------------------------------------------------
    2) Possible JIT performance regression with aggregation and light filtering
    ----------------------------------------------------------------------
    
    On PostgreSQL 16.x, I observed that queries performing aggregation with
    light filtering may run slower with JIT enabled compared to JIT disabled.
    
    Query:
    
    SET jit = on;
    
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT grp, count(*)
    FROM t
    WHERE id > 100
    GROUP BY grp;
    
    Example EXPLAIN ANALYZE output (JIT enabled):
    
    HashAggregate  (cost=12431.22..12441.22 rows=100 width=12)
                   (actual time=156.234..156.941 rows=100 loops=1)
      Group Key: grp
      Buffers: shared hit=12483
      ->  Seq Scan on t  (cost=0.00..11231.00 rows=499900 width=4)
            (actual time=0.014..52.731 rows=499900 loops=1)
            Filter: (id > 100)
            Rows Removed by Filter: 100
            Buffers: shared hit=12483
    JIT:
      Functions: 5
      Options: Inlining true, Optimization true, Expressions true, Deforming
    true
      Timing: Generation 2.312 ms, Inlining 18.942 ms,
              Optimization 94.117 ms, Emission 22.884 ms, Total 138.255 ms
    Planning Time: 0.184 ms
    Execution Time: 159.214 ms
    
    For comparison, with JIT disabled:
    
    SET jit = off;
    
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT grp, count(*)
    FROM t
    WHERE id > 100
    GROUP BY grp;
    
    Example EXPLAIN ANALYZE output (JIT disabled):
    
    HashAggregate  (cost=12431.22..12441.22 rows=100 width=12)
                   (actual time=41.312..42.117 rows=100 loops=1)
      Group Key: grp
      Buffers: shared hit=12483
      ->  Seq Scan on t  (cost=0.00..11231.00 rows=499900 width=4)
            (actual time=0.011..35.912 rows=499900 loops=1)
            Filter: (id > 100)
            Rows Removed by Filter: 100
            Buffers: shared hit=12483
    Planning Time: 0.121 ms
    Execution Time: 44.287 ms
    
    Observed behavior:
    - JIT compilation time dominates query execution
    - Query runs significantly faster with JIT disabled
    - This differs from behavior observed on PostgreSQL 14.x for similar queries
    
    Environment:
    - PostgreSQL version: 16.x
    - OS: <fill if needed>
    - CPU: <fill if needed>
    - jit: on/off
    - shared_buffers: <fill if needed>
    
    I am unsure whether these behaviors are expected, regressions, or already
    tracked issues, and would appreciate any feedback or direction.
    
    Regards,
    Jhonathan Cruz