Thread

  1. Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-05T20:27:21Z

    Fellow Hackers,
    
    I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of the various bits of each plan node and their types for the table. Here's what I've got so far:
    
        "Node Type"           TEXT,
        "Strategy"            TEXT,
        "Startup Cost"        NUMERIC,
        "Total Cost"          NUMERIC,
        "Plan Rows"           INTEGER,
        "Plan Width"          INTEGER,
        "Actual Startup Time" NUMERIC,
        "Actual Total Time"   NUMERIC,
        "Actual Rows"         INTEGER,
        "Actual Width"        INTEGER,
        "Actual Loops"        INTEGER,
        "Parent Relationship" TEXT
        "Sort Key"            TEXT[],  
        "Sort Method"         TEXT,
        "Sort-Space-Used"     TEXT,
        "Sort-Space-Type"     TEXT,
        "Join Type"           TEXT,
        "Join Filter"         TEXT,
        "Hash Cond"           TEXT,
        "Relation Name"       NAME,
        "Alias"               NAME,
        "Scan Direction"      TEXT,
        "Index Name"          NAME,
        "Index Cond"          TEXT,
        "Subplan Name"        TEXT,
        "Function Name"       TEXT,
        "Filter"              TEXT
    
    Questions:
    
    * Am I missing any?
    * Are the data types correct?
    * Is there some place in the source with a canonical list?
    
    Thanks,
    
    David
    
    
    
  2. Re: Query Plan Columns

    Andrew Dunstan <andrew@dunslane.net> — 2010-11-05T20:36:34Z

    
    On 11/05/2010 04:27 PM, David E. Wheeler wrote:
    > Fellow Hackers,
    >
    > I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of the various bits of each plan node and their types for the table. Here's what I've got so far:
    >
    >      "Node Type"           TEXT,
    >      "Strategy"            TEXT,
    >      "Startup Cost"        NUMERIC,
    >      "Total Cost"          NUMERIC,
    >      "Plan Rows"           INTEGER,
    >      "Plan Width"          INTEGER,
    >      "Actual Startup Time" NUMERIC,
    >      "Actual Total Time"   NUMERIC,
    >      "Actual Rows"         INTEGER,
    >      "Actual Width"        INTEGER,
    >      "Actual Loops"        INTEGER,
    >      "Parent Relationship" TEXT
    >      "Sort Key"            TEXT[],
    >      "Sort Method"         TEXT,
    >      "Sort-Space-Used"     TEXT,
    >      "Sort-Space-Type"     TEXT,
    >      "Join Type"           TEXT,
    >      "Join Filter"         TEXT,
    >      "Hash Cond"           TEXT,
    >      "Relation Name"       NAME,
    >      "Alias"               NAME,
    >      "Scan Direction"      TEXT,
    >      "Index Name"          NAME,
    >      "Index Cond"          TEXT,
    >      "Subplan Name"        TEXT,
    >      "Function Name"       TEXT,
    >      "Filter"              TEXT
    >
    > Questions:
    >
    > * Am I missing any?
    > * Are the data types correct?
    > * Is there some place in the source with a canonical list?
    
    
    src/backend/commands/explain.c
    
    Of course, there are containers too, which are not in your list at all. 
    How do you intend to represent the tree-ish structure in a flat table?
    
    cheers
    
    andrew
    
    
  3. Re: Query Plan Columns

    Dimitri Fontaine <dimitri@2ndquadrant.fr> — 2010-11-05T20:38:58Z

    "David E. Wheeler" <david@kineticode.com> writes:
    > * Is there some place in the source with a canonical list?
    
    It seems that you need to read through ExplainNode in
    src/backend/commands/explain.c:
    
      http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
    
    Regards,
    -- 
    Dimitri Fontaine
    http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
    
    
  4. Re: Query Plan Columns

    Josh Berkus <josh@agliodbs.com> — 2010-11-05T20:39:07Z

    > Of course, there are containers too, which are not in your list at all.
    > How do you intend to represent the tree-ish structure in a flat table?
    
    Andrew: we'll use a proximity tree.
    
    
    -- 
                                      -- Josh Berkus
                                         PostgreSQL Experts Inc.
                                         http://www.pgexperts.com
    
    
  5. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-05T20:40:43Z

    On Nov 5, 2010, at 1:36 PM, Andrew Dunstan wrote:
    
    > Of course, there are containers too, which are not in your list at all. How do you intend to represent the tree-ish structure in a flat table?
    
    So far I see only two containers: Subplans and Sot Keys. The latter is represented as an array. The former I'm going to represent simply by having a parent_id column to point to a parent from a child node.
    
    Are there others?
    
    Thanks,
    
    David
    
    
    
  6. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-05T20:42:20Z

    On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:
    
    > It seems that you need to read through ExplainNode in
    > src/backend/commands/explain.c:
    > 
    >  http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
    
    Ah, great, thanks.
    
    David
    
    
    
  7. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-05T20:51:53Z

    On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:
    
    > On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:
    > 
    >> It seems that you need to read through ExplainNode in
    >> src/backend/commands/explain.c:
    >> 
    >> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
    > 
    > Ah, great, thanks.
    > 
    > David
    
    Hrm…
    
     960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
    
    Should rows really be a float?
    
    Best,
    
    David
    
    
    
  8. Re: Query Plan Columns

    Josh Berkus <josh@agliodbs.com> — 2010-11-05T20:53:46Z

    >  960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
    > 
    > Should rows really be a float?
    
    Yeah, since PropertyInteger is 32-bit.
    
    
    -- 
                                      -- Josh Berkus
                                         PostgreSQL Experts Inc.
                                         http://www.pgexperts.com
    
    
  9. Re: Query Plan Columns

    Tom Lane <tgl@sss.pgh.pa.us> — 2010-11-05T22:03:20Z

    Josh Berkus <josh@agliodbs.com> writes:
    >> 960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
    >> 
    >> Should rows really be a float?
    
    > Yeah, since PropertyInteger is 32-bit.
    
    Yes, plan_rows is a float.  We might need to represent very large
    values, and we do not need high accuracy, so it's really the right type.
    
    			regards, tom lane
    
    
  10. Re: Query Plan Columns

    David Fetter <david@fetter.org> — 2010-11-06T17:17:05Z

    On Fri, Nov 05, 2010 at 01:39:07PM -0700, Josh Berkus wrote:
    > 
    > > Of course, there are containers too, which are not in your list at all.
    > > How do you intend to represent the tree-ish structure in a flat table?
    > 
    > Andrew: we'll use a proximity tree.
    
    Adjacency list?
    
    If so, in my experience, it's best to separate the node descriptions
    from the adjacency list that links them together.
    
    Cheers,
    David.
    -- 
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
    Skype: davidfetter      XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
    
    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
    
    
  11. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-06T18:44:54Z

    On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:
    
    >> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
    > 
    > Ah, great, thanks.
    
    So based on this, I've come up with:
    
        "Node Type"             TEXT,
        "Strategy"              TEXT,
        "Operation"             TEXT,
        "Startup Cost"          FLOAT,
        "Total Cost"            FLOAT,
        "Plan Rows"             FLOAT,
        "Plan Width"            INTEGER,
        "Actual Startup Time"   FLOAT,
        "Actual Total Time"     FLOAT,
        "Actual Rows"           FLOAT,
        "Actual Loops"          FLOAT,
        "Parent Relationship"   TEXT,
        "Sort Key"              TEXT[],
        "Sort Method"           TEXT[],
        "Sort Space Used"       BIGINT,
        "Sort Space Type"       TEXT,
        "Join Type"             TEXT,
        "Join Filter"           TEXT,
        "Hash Cond"             TEXT,
        "Relation Name"         NAME,
        "Alias"                 NAME,
        "Scan Direction"        TEXT,
        "Index Name"            TEXT,
        "Index Cond"            TEXT,
        "Recheck Cond"          TEXT,
        "TID Cond"              TEXT,
        "Merge Cond"            TEXT,
        "Subplan Name"          TEXT,
        "Function Name"         TEXT,
        "Function Call"         TEXT,
        "Filter"                TEXT,
        "One-Time Filter"       TEXT,
        "Command"               TEXT,
        "Shared Hit Blocks"     BIGINT,
        "Shared Read Blocks"    BIGINT,
        "Shared Written Blocks" BIGINT,
        "Local Hit Blocks"      BIGINT,
        "Local Read Blocks"     BIGINT,
        "Local Written Blocks"  BIGINT,
        "Temp Read Blocks"      BIGINT,
        "Temp Written Blocks"   BIGINT,
        "Output"                TEXT[],
        "Hash Buckets"          BIGINT,
        "Hash Batches"          BIGINT,
        "Original Hash Batches" BIGINT,
        "Peak Memory Usage"     BIGINT,
        "Schema"                TEXT,
        "CTE Name"              TEXT
    
    Does that seem reasonable? Am I missing anything obvious?
    
    Thanks,
    
    David
    
  12. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-06T22:24:16Z

    On Nov 6, 2010, at 11:44 AM, David E. Wheeler wrote:
    
    > On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:
    > 
    >>> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
    >> 
    >> Ah, great, thanks.
    > 
    > So based on this, I've come up with:
    
    Updated:
    
    CREATE TEMPORARY TABLE plans (
        "Node Type"             TEXT NOT NULL,
        "Strategy"              TEXT,
        "Operation"             TEXT,
        "Startup Cost"          FLOAT,
        "Total Cost"            FLOAT,
        "Plan Rows"             FLOAT,
        "Plan Width"            INTEGER,
        "Actual Startup Time"   FLOAT,
        "Actual Total Time"     FLOAT,
        "Actual Rows"           FLOAT,
        "Actual Loops"          FLOAT,
        "Parent Relationship"   TEXT,
        "Sort Key"              TEXT[],
        "Sort Method"           TEXT[],
        "Sort Space Used"       BIGINT,
        "Sort Space Type"       TEXT,
        "Join Type"             TEXT,
        "Join Filter"           TEXT,
        "Hash Cond"             TEXT,
        "Relation Name"         TEXT,
        "Alias"                 TEXT,
        "Scan Direction"        TEXT,
        "Index Name"            TEXT,
        "Index Cond"            TEXT,
        "Recheck Cond"          TEXT,
        "TID Cond"              TEXT,
        "Merge Cond"            TEXT,
        "Subplan Name"          TEXT,
        "Function Name"         TEXT,
        "Function Call"         TEXT,
        "Filter"                TEXT,
        "One-Time Filter"       TEXT,
        "Command"               TEXT,
        "Shared Hit Blocks"     BIGINT,
        "Shared Read Blocks"    BIGINT,
        "Shared Written Blocks" BIGINT,
        "Local Hit Blocks"      BIGINT,
        "Local Read Blocks"     BIGINT,
        "Local Written Blocks"  BIGINT,
        "Temp Read Blocks"      BIGINT,
        "Temp Written Blocks"   BIGINT,
        "Output"                TEXT[],
        "Hash Buckets"          BIGINT,
        "Hash Batches"          BIGINT,
        "Original Hash Batches" BIGINT,
        "Peak Memory Usage"     BIGINT,
        "Schema"                TEXT,
        "CTE Name"              TEXT
    );
    
    Would I be right that "Node Type" is the only column can be NOT NULL?
    
    Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are expressed in milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying…
    
    Thanks,
    
    David
    
    
    
  13. Re: Query Plan Columns

    Roberto Mello <roberto.mello@gmail.com> — 2010-11-07T13:24:06Z

    On Sat, Nov 6, 2010 at 6:24 PM, David E. Wheeler <david@kineticode.com> wrote:
    >
    > Would I be right that "Node Type" is the only column can be NOT NULL?
    >
    > Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are expressed in milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying…
    
    Yes, but I am wondering whether you should just stick to what would
    come out of a normal explain, for consistency sake. Maybe provide
    another function, or parameter that would cast the results to
    intervals?
    
    Not sure.
    
    Roberto
    
    
  14. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-08T23:12:35Z

    On Nov 7, 2010, at 5:24 AM, Roberto Mello wrote:
    
    > Yes, but I am wondering whether you should just stick to what would
    > come out of a normal explain, for consistency sake. Maybe provide
    > another function, or parameter that would cast the results to
    > intervals?
    
    I think it's more convenient to have intervals.
    
    So my final list:
    
        CREATE TABLE plans (
            planned_at              TIMESTAMPTZ,
            node_id                 TEXT PRIMARY KEY,
            parent_id               TEXT REFERENCES plans(node_id),
            node_type               TEXT NOT NULL,
            total_runtime           INTERVAL,
            strategy                TEXT,
            operation               TEXT,
            startup_cost            FLOAT,
            total_cost              FLOAT,
            plan_rows               FLOAT,
            plan_width              INTEGER,
            actual_startup_time     INTERVAL,
            actual_total_time       INTERVAL,
            actual_rows             FLOAT,
            actual_loops            FLOAT,
            parent_relationship     TEXT,
            sort_key                TEXT[],
            sort_method             TEXT[],
            sort_space_used         BIGINT,
            sort_space_type         TEXT,
            join_type               TEXT,
            join_filter             TEXT,
            hash_cond               TEXT,
            relation_name           TEXT,
            alias                   TEXT,
            scan_direction          TEXT,
            index_name              TEXT,
            index_cond              TEXT,
            recheck_cond            TEXT,
            tid_cond                TEXT,
            merge_cond              TEXT,
            subplan_name            TEXT,
            function_name           TEXT,
            function_call           TEXT,
            filter                  TEXT,
            one_time_filter         TEXT,
            command                 TEXT,
            shared_hit_blocks       BIGINT,
            shared_read_blocks      BIGINT,
            shared_written_blocks   BIGINT,
            local_hit_blocks        BIGINT,
            local_read_blocks       BIGINT,
            local_written_blocks    BIGINT,
            temp_read_blocks        BIGINT,
            temp_written_blocks     BIGINT,
            output                  TEXT[],
            hash_buckets            BIGINT,
            hash_batches            BIGINT,
            original_hash_batches   BIGINT,
            peak_memory_usage       BIGINT,
            schema                  TEXT,
            cte_name                TEXT,       
            triggers                trigger_plan[]
        );
    
    planned_at is just the current time (from NOW()).
    
    node_id is simply `md5( pg_backend_pid() || clock_timestamp() )`, which is run just before each node is parsed.
    
    trigger_plan is a composite type:
    
        CREATE TYPE trigger_plan AS (
            trigger_name    TEXT,
            constraint_name TEXT,
            relation        TEXT,
            time            INTERVAL,
            calls           FLOAT
        );
    
    I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:
    
        EXPLAIN (format table) SELECT * FROM bar;
    
    It could output a table like the above. FWIW, The function I've written works like this:
    
        SELECT plan('SELECT * FROM bar');
    
    Which is an okay workaround. Anyone else think that this might be useful?
    
    Best,
    
    David
    
    
    
  15. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-08T23:15:13Z

    On Nov 8, 2010, at 3:12 PM, David E. Wheeler wrote:
    
    > It could output a table like the above. FWIW, The function I've written works like this:
    > 
    >    SELECT plan('SELECT * FROM bar');
    
    Sorry, that's
    
        SELECT * FROM plan('SELECT * FROM bar');
    
    Best,
    
    David
    
    
    
  16. Re: Query Plan Columns

    Dimitri Fontaine <dimitri@2ndquadrant.fr> — 2010-11-09T08:12:06Z

    "David E. Wheeler" <david@kineticode.com> writes:
    > I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:
    >
    >     EXPLAIN (format table) SELECT * FROM bar;
    
    +1 from me here, as it happens "parsing" a table-like resultset is
    already implemented everywhere :)
    
    I'd add another +1 if it were easily usable as a "normal" relation (or
    SRF) from inside a query, e.g. in a WITH table expression.
    
    WITH plan AS (
      EXPLAIN (format table) SELECT * FROM bar
    )
    SELECT some_plan_analysis_query_here;
    
    Or even:
    
    WITH plan AS (
      EXPLAIN (format table) SELECT * FROM bar
    )
    INSERT INTO plan_audit
    SELECT * FROM plan
     WHERE actual_total_time > 12 * interval '100 ms';
    
    Regards,
    -- 
    Dimitri Fontaine
    http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
    
    
  17. Re: Query Plan Columns

    David Wheeler <david@kineticode.com> — 2010-11-09T08:19:23Z

    On Nov 9, 2010, at 12:12 AM, Dimitri Fontaine wrote:
    
    > WITH plan AS (
    >  EXPLAIN (format table) SELECT * FROM bar
    > )
    > INSERT INTO plan_audit
    > SELECT * FROM plan
    > WHERE actual_total_time > 12 * interval '100 ms';
    
    Yeah, that would be nice, but my current implementation has a row for each node, and a single explain can have many nodes. With this, you'd only get the top-level node (and not even that, as you didn't do EXPLAIN ANALYZE, so actual_total_time would be blank!).
    
    But I do like the idea…
    
    Best,
    
    David