Thread

  1. Explain returns different number of rows

    Vince McMahon <sippingonesandzeros@gmail.com> — 2022-10-20T11:58:29Z

    > I executed the following statements 3 times
    >  explain(analyze, buffet) select * from table1
    >
    > The number of rows are different.  Is the table corrupted?  How to confirm
    > and how to fix it?
    >
    
  2. Explain returns different number of rows

    Vince McMahon <sippingonesandzeros@gmail.com> — 2022-10-20T16:52:12Z

    I did get reply so I am trying again.
    
    I executed the following statements 3 times
    >  explain(analyze, buffet) select * from table1
    >
    > The number of rows are different.  Is the table corrupted?  How to confirm
    > and how to fix it?
    >
    
  3. Re: Explain returns different number of rows

    Christophe Pettus <xof@thebuild.com> — 2022-10-20T16:56:23Z

    
    > On Oct 20, 2022, at 09:52, Vince McMahon <sippingonesandzeros@gmail.com> wrote:
    > The number of rows are different. 
    
    This isn't unexpected.  EXPLAIN does not actually run the query and determine how many rows are returned; it calculates an estimate based on the current system statistics, which vary constantly depending on activity in the database.
    
    
    
  4. Re: Explain returns different number of rows

    Peter J. Holzer <hjp-pgsql@hjp.at> — 2022-10-22T09:32:32Z

    On 2022-10-20 09:56:23 -0700, Christophe Pettus wrote:
    > On Oct 20, 2022, at 09:52, Vince McMahon <sippingonesandzeros@gmail.com> wrote:
    > > The number of rows are different. 
    > 
    > This isn't unexpected.  EXPLAIN does not actually run the query and
    > determine how many rows are returned; it calculates an estimate based
    > on the current system statistics, which vary constantly depending on
    > activity in the database.
    
    EXPLAIN ANALYZE (which is what he did) does run the query and return the
    actual number of rows:
    
    #v+
    wdsah=> explain (analyze, buffers) select * from facttable_eurostat_comext_cpa2_1 ;
    ╔══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
    ║                                                                    QUERY PLAN                                                                    ║
    ╟──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
    ║ Seq Scan on facttable_eurostat_comext_cpa2_1  (cost=0.00..1005741.32 rows=39633432 width=85) (actual time=0.396..6541.701 rows=39633591 loops=1) ║
    ║   Buffers: shared read=609407                                                                                                                    ║
    ║ Planning Time: 1.650 ms                                                                                                                          ║
    ║ Execution Time: 7913.027 ms                                                                                                                      ║
    ╚══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
    (4 rows)
    #v-
    
    The first tuple (cost=0.00..1005741.32 rows=39633432 width=85) is an
    estimate used to plan the query. But the second one
    (actual time=0.396..6541.701 rows=39633591 loops=1)
    contains measurements from actually running the query.
    
    I think it's possible that the rows estimate in the first tuple changes
    without any actual data change (although the only reason I can think of
    right now would be an ANALYZE (in another session or by autovacuum)).
    But the actual rows definitely shouldn't change.
    
            hp
    
    -- 
       _  | Peter J. Holzer    | Story must make more sense than reality.
    |_|_) |                    |
    | |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
    __/   | http://www.hjp.at/ |       challenge!"
    
  5. Re: Explain returns different number of rows

    Vince McMahon <sippingonesandzeros@gmail.com> — 2022-10-24T14:31:44Z

    Thanks for the clarification, Peter.
    
    
    
    On Sat, Oct 22, 2022, 05:32 Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
    
    > On 2022-10-20 09:56:23 -0700, Christophe Pettus wrote:
    > > On Oct 20, 2022, at 09:52, Vince McMahon <sippingonesandzeros@gmail.com>
    > wrote:
    > > > The number of rows are different.
    > >
    > > This isn't unexpected.  EXPLAIN does not actually run the query and
    > > determine how many rows are returned; it calculates an estimate based
    > > on the current system statistics, which vary constantly depending on
    > > activity in the database.
    >
    > EXPLAIN ANALYZE (which is what he did) does run the query and return the
    > actual number of rows:
    >
    > #v+
    > wdsah=> explain (analyze, buffers) select * from
    > facttable_eurostat_comext_cpa2_1 ;
    >
    > ╔══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
    > ║                                                                    QUERY
    > PLAN                                                                    ║
    >
    > ╟──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
    > ║ Seq Scan on facttable_eurostat_comext_cpa2_1  (cost=0.00..1005741.32
    > rows=39633432 width=85) (actual time=0.396..6541.701 rows=39633591 loops=1)
    > ║
    > ║   Buffers: shared read=609407
    >                                                                         ║
    > ║ Planning Time: 1.650 ms
    >                                                                         ║
    > ║ Execution Time: 7913.027 ms
    >                                                                         ║
    >
    > ╚══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
    > (4 rows)
    > #v-
    >
    > The first tuple (cost=0.00..1005741.32 rows=39633432 width=85) is an
    > estimate used to plan the query. But the second one
    > (actual time=0.396..6541.701 rows=39633591 loops=1)
    > contains measurements from actually running the query.
    >
    > I think it's possible that the rows estimate in the first tuple changes
    > without any actual data change (although the only reason I can think of
    > right now would be an ANALYZE (in another session or by autovacuum)).
    > But the actual rows definitely shouldn't change.
    >
    >         hp
    >
    > --
    >    _  | Peter J. Holzer    | Story must make more sense than reality.
    > |_|_) |                    |
    > | |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
    > __/   | http://www.hjp.at/ |       challenge!"
    >