Thread

  1. Questions About TODO: Issuing NOTICEs for row count differences in EXPLAIN ANALYZE

    KAZAR Ayoub <ma_kazar@esi.dz> — 2024-10-28T23:43:16Z

    Hello Hackers,
    I am currently looking into the following TODO item, "Have EXPLAIN ANALYZE
    issue NOTICE messages when the estimated and actual row counts differ by a
    specified percentage."
    What's the current status of this TODO, and is there any prior discussion
    or rationale behind it ?
    Specifically:
    - How can we come up with a percentage for row count differences (fixed
    value or dynamic)?
    - Should we consider a configurable param for users ?
    - Is there anything extra to consider ?
    
    Thank you.
    
  2. Re: Questions About TODO: Issuing NOTICEs for row count differences in EXPLAIN ANALYZE

    David Rowley <dgrowleyml@gmail.com> — 2024-10-29T00:39:13Z

    On Tue, 29 Oct 2024 at 12:43, KAZAR Ayoub <ma_kazar@esi.dz> wrote:
    > I am currently looking into the following TODO item, "Have EXPLAIN ANALYZE issue NOTICE messages when the estimated and actual row counts differ by a specified percentage."
    > What's the current status of this TODO, and is there any prior discussion or rationale behind it ?
    
    The status is that we don't have anything like that and I don't recall
    it being mentioned that anyone is working on it.  Normally these items
    only get added when there has been some discussion about it, but
    normally that discussion gets linked along with the todo item. Clearly
    that's not been done in this case.  I imagine the rationale is to make
    it more clear when the estimates are off from the actual execution.
    
    You might need to do some digging into the history of who added that
    todo item and see if you can find any relevant discussion on hackers
    around the time it was added.
    
    > Specifically:
    > - How can we come up with a percentage for row count differences (fixed value or dynamic)?
    > - Should we consider a configurable param for users ?
    > - Is there anything extra to consider ?
    
    The biggest thing to consider is if we'd want anything like this in
    core PostgreSQL. It feels more like something additional tooling such
    as explain.depesz.com would concern themselves with. I could also
    imagine features along those lines in some sort of statistics advisor
    contrib module. My personal view is that it would feel like a very
    misplaced feature if we were to add only what the todo item describes
    into core PostgreSQL. In any case, adding a NOTICE for this seems
    horrible. Doing it that way means the information about the row
    estimate's accuracy is very disconnected from the EXPLAIN line that it
    belongs to.
    
    Additionally, there are cases where we expect the actual and estimates
    to be off, even with perfect statistics. Consider the Seq Scan in the
    following:
    
    postgres=# explain analyze select * from pg_class limit 1;
                                                     QUERY PLAN
    -------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.00..0.04 rows=1 width=273) (actual time=0.035..0.036
    rows=1 loops=1)
       ->  Seq Scan on pg_class  (cost=0.00..18.15 rows=415 width=273)
    (actual time=0.033..0.033 rows=1 loops=1)
    
    I don't think we'd want false alarms for cases like that.
    
    David