Thread

  1. Re: Relaxing constraints on BitmapAnd eligibility?

    Dmytro Astapov <dastapov@gmail.com> — 2025-02-26T20:44:50Z

    Hi
    
    And here is a proposed code change, alternative to the doc change.
    
    I hope that it is ok to relax the restriction like so, as
    `cost_bitmap_node_and` is already resigned to inputs not being independent:
    * We estimate AND selectivity on the assumption that the inputs are
    * independent.  This is probably often wrong, but we don't have the info
    * to do better.
    
    I've ran "make check", and one test have changed (the last one in the "test
    extraction of restriction OR clauses from join OR clause" group - in an
    expected way, IMO).
    I am attaching the regression.diff for convenience.
    
    Is this a generally desirable change to pursue?
    
    Best regards, Dmytro
    
    
    On Wed, Feb 26, 2025 at 7:31 PM Dmytro Astapov <dastapov@gmail.com> wrote:
    
    > Hi!
    >
    > I am (still) very unsure if the code change I mentioned will make sense,
    > but documentation chage could perhaps look like something along these lines?
    >
    >
    >
    > Best regards, Dmytro
    >
    >
    > On Tue, Feb 25, 2025 at 9:14 PM Dmytro Astapov <dastapov@gmail.com> wrote:
    >
    >> Hi!
    >>
    >> I've been investigating why postgres does not do BitmapAnd of two
    >> well-suited indexes, and reading indxpath.c
    >>
    >> In my case, there is a table (d date, col1 int, col2 int) -- types not
    >> really important -- and there are two indices on (d,col1) and (d, col2).
    >>
    >> For queries that do WHERE d>=X AND col1=Y AND col2=Z postgres will never
    >> BitmapAnd those two indices because both indexes include (d) and we have a
    >> condition on (d). Here is a full example, which could also be seen here:
    >> https://www.db-fiddle.com/f/uPLx5bRtDEoZw3Dx4kkwKh/0:
    >>
    >> begin;
    >>
    >> CREATE TABLE test_table (
    >>     d date,
    >>     col1 int,
    >>     col2 int
    >> );
    >>
    >> INSERT INTO test_table (d, col1, col2)
    >> SELECT
    >>     d.date,
    >>     c1.val as col1,
    >>     c2.val as col2
    >> FROM
    >>     generate_series(
    >>         '2023-01-01'::date,
    >>         '2023-12-31'::date,
    >>         '1 day'::interval
    >>     ) as d(date),
    >>     generate_series(1, 1000) as c1(val),
    >>     generate_series(1, 1000) as c2(val)
    >> WHERE
    >>     random() < 0.001;
    >>
    >> create index on test_table(col1,d);
    >> create index on test_table(col2,d);
    >>
    >> -- This uses BitmapAnd
    >> explain select * from test_table where col1=123 and col2=321;
    >>
    >> -- This does not use BitmapAnd, even though it could!
    >> explain select * from test_table where col1=123 and col2=321 and d >=
    >> '2023-05-05';
    >>
    >> I checked that BitmapAnd is rejected by this
    >> <https://github.com/postgres/postgres/blob/master/src/backend/optimizer/path/indxpath.c#L1878>
    >> line in choose_bitmap_and:
    >>
    >>    if (bms_overlap(pathinfo->clauseids, clauseidsofar))
    >>       continue; /* consider it redundant */
    >>
    >> There is a comment on choose_bitmap_and that explains the rationale of
    >> this check, but reading it I can't help but feel that what the comment
    >> describes is this condition:
    >>
    >>    if (bms_is_subset(pathinfo->clauseids, clauseidsofar))
    >>       continue; /* consider it redundant */
    >>
    >> And indeed, in my (admittedly not super-extensive) testing changing
    >> bms_overlap to bms_is_subset leads to better faster execution plans.
    >>
    >> Is it possible that this condition could thus be relaxed?
    >>
    >> Even if I am wrong, and the condition absolutely should be bms_overlap, I
    >> feel that this restriction is very very hard to discover and perhaps
    >> https://www.postgresql.org/docs/current/indexes-bitmap-scans.html should
    >> mention that compound indexes that have columns in common will never be
    >> combined?
    >>
    >> Best regards, Dmytro
    >>
    >