Thread

Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-05T20:27:21Z

Fellow Hackers,

I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of the various bits of each plan node and their types for the table. Here's what I've got so far:

    "Node Type"           TEXT,
    "Strategy"            TEXT,
    "Startup Cost"        NUMERIC,
    "Total Cost"          NUMERIC,
    "Plan Rows"           INTEGER,
    "Plan Width"          INTEGER,
    "Actual Startup Time" NUMERIC,
    "Actual Total Time"   NUMERIC,
    "Actual Rows"         INTEGER,
    "Actual Width"        INTEGER,
    "Actual Loops"        INTEGER,
    "Parent Relationship" TEXT
    "Sort Key"            TEXT[],  
    "Sort Method"         TEXT,
    "Sort-Space-Used"     TEXT,
    "Sort-Space-Type"     TEXT,
    "Join Type"           TEXT,
    "Join Filter"         TEXT,
    "Hash Cond"           TEXT,
    "Relation Name"       NAME,
    "Alias"               NAME,
    "Scan Direction"      TEXT,
    "Index Name"          NAME,
    "Index Cond"          TEXT,
    "Subplan Name"        TEXT,
    "Function Name"       TEXT,
    "Filter"              TEXT

Questions:

* Am I missing any?
* Are the data types correct?
* Is there some place in the source with a canonical list?

Thanks,

David

Re: Query Plan Columns

Andrew Dunstan <andrew@dunslane.net> — 2010-11-05T20:36:34Z


On 11/05/2010 04:27 PM, David E. Wheeler wrote:
> Fellow Hackers,
>
> I'm writing a function to turn an EXPLAIN plan into a table with columns. As such, I need to have a complete list of the various bits of each plan node and their types for the table. Here's what I've got so far:
>
>      "Node Type"           TEXT,
>      "Strategy"            TEXT,
>      "Startup Cost"        NUMERIC,
>      "Total Cost"          NUMERIC,
>      "Plan Rows"           INTEGER,
>      "Plan Width"          INTEGER,
>      "Actual Startup Time" NUMERIC,
>      "Actual Total Time"   NUMERIC,
>      "Actual Rows"         INTEGER,
>      "Actual Width"        INTEGER,
>      "Actual Loops"        INTEGER,
>      "Parent Relationship" TEXT
>      "Sort Key"            TEXT[],
>      "Sort Method"         TEXT,
>      "Sort-Space-Used"     TEXT,
>      "Sort-Space-Type"     TEXT,
>      "Join Type"           TEXT,
>      "Join Filter"         TEXT,
>      "Hash Cond"           TEXT,
>      "Relation Name"       NAME,
>      "Alias"               NAME,
>      "Scan Direction"      TEXT,
>      "Index Name"          NAME,
>      "Index Cond"          TEXT,
>      "Subplan Name"        TEXT,
>      "Function Name"       TEXT,
>      "Filter"              TEXT
>
> Questions:
>
> * Am I missing any?
> * Are the data types correct?
> * Is there some place in the source with a canonical list?


src/backend/commands/explain.c

Of course, there are containers too, which are not in your list at all. 
How do you intend to represent the tree-ish structure in a flat table?

cheers

andrew

Re: Query Plan Columns

Dimitri Fontaine <dimitri@2ndquadrant.fr> — 2010-11-05T20:38:58Z

"David E. Wheeler" <david@kineticode.com> writes:
> * Is there some place in the source with a canonical list?

It seems that you need to read through ExplainNode in
src/backend/commands/explain.c:

  http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Query Plan Columns

Josh Berkus <josh@agliodbs.com> — 2010-11-05T20:39:07Z

> Of course, there are containers too, which are not in your list at all.
> How do you intend to represent the tree-ish structure in a flat table?

Andrew: we'll use a proximity tree.


-- 
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-05T20:40:43Z

On Nov 5, 2010, at 1:36 PM, Andrew Dunstan wrote:

> Of course, there are containers too, which are not in your list at all. How do you intend to represent the tree-ish structure in a flat table?

So far I see only two containers: Subplans and Sot Keys. The latter is represented as an array. The former I'm going to represent simply by having a parent_id column to point to a parent from a child node.

Are there others?

Thanks,

David

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-05T20:42:20Z

On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:

> It seems that you need to read through ExplainNode in
> src/backend/commands/explain.c:
> 
>  http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617

Ah, great, thanks.

David

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-05T20:51:53Z

On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:

> On Nov 5, 2010, at 1:38 PM, Dimitri Fontaine wrote:
> 
>> It seems that you need to read through ExplainNode in
>> src/backend/commands/explain.c:
>> 
>> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
> 
> Ah, great, thanks.
> 
> David

Hrm…

 960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);

Should rows really be a float?

Best,

David

Re: Query Plan Columns

Josh Berkus <josh@agliodbs.com> — 2010-11-05T20:53:46Z

>  960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
> 
> Should rows really be a float?

Yeah, since PropertyInteger is 32-bit.


-- 
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Query Plan Columns

Tom Lane <tgl@sss.pgh.pa.us> — 2010-11-05T22:03:20Z

Josh Berkus <josh@agliodbs.com> writes:
>> 960             ExplainPropertyFloat("Plan Rows", plan->plan_rows, 0, es);
>> 
>> Should rows really be a float?

> Yeah, since PropertyInteger is 32-bit.

Yes, plan_rows is a float.  We might need to represent very large
values, and we do not need high accuracy, so it's really the right type.

			regards, tom lane

Re: Query Plan Columns

David Fetter <david@fetter.org> — 2010-11-06T17:17:05Z

On Fri, Nov 05, 2010 at 01:39:07PM -0700, Josh Berkus wrote:
> 
> > Of course, there are containers too, which are not in your list at all.
> > How do you intend to represent the tree-ish structure in a flat table?
> 
> Andrew: we'll use a proximity tree.

Adjacency list?

If so, in my experience, it's best to separate the node descriptions
from the adjacency list that links them together.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-06T18:44:54Z

On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:

>> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
> 
> Ah, great, thanks.

So based on this, I've come up with:

    "Node Type"             TEXT,
    "Strategy"              TEXT,
    "Operation"             TEXT,
    "Startup Cost"          FLOAT,
    "Total Cost"            FLOAT,
    "Plan Rows"             FLOAT,
    "Plan Width"            INTEGER,
    "Actual Startup Time"   FLOAT,
    "Actual Total Time"     FLOAT,
    "Actual Rows"           FLOAT,
    "Actual Loops"          FLOAT,
    "Parent Relationship"   TEXT,
    "Sort Key"              TEXT[],
    "Sort Method"           TEXT[],
    "Sort Space Used"       BIGINT,
    "Sort Space Type"       TEXT,
    "Join Type"             TEXT,
    "Join Filter"           TEXT,
    "Hash Cond"             TEXT,
    "Relation Name"         NAME,
    "Alias"                 NAME,
    "Scan Direction"        TEXT,
    "Index Name"            TEXT,
    "Index Cond"            TEXT,
    "Recheck Cond"          TEXT,
    "TID Cond"              TEXT,
    "Merge Cond"            TEXT,
    "Subplan Name"          TEXT,
    "Function Name"         TEXT,
    "Function Call"         TEXT,
    "Filter"                TEXT,
    "One-Time Filter"       TEXT,
    "Command"               TEXT,
    "Shared Hit Blocks"     BIGINT,
    "Shared Read Blocks"    BIGINT,
    "Shared Written Blocks" BIGINT,
    "Local Hit Blocks"      BIGINT,
    "Local Read Blocks"     BIGINT,
    "Local Written Blocks"  BIGINT,
    "Temp Read Blocks"      BIGINT,
    "Temp Written Blocks"   BIGINT,
    "Output"                TEXT[],
    "Hash Buckets"          BIGINT,
    "Hash Batches"          BIGINT,
    "Original Hash Batches" BIGINT,
    "Peak Memory Usage"     BIGINT,
    "Schema"                TEXT,
    "CTE Name"              TEXT

Does that seem reasonable? Am I missing anything obvious?

Thanks,

David

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-06T22:24:16Z

On Nov 6, 2010, at 11:44 AM, David E. Wheeler wrote:

> On Nov 5, 2010, at 1:42 PM, David E. Wheeler wrote:
> 
>>> http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/commands/explain.c;h=f494ec98e510c23120e072bd5ee8821ea12738a4;hb=HEAD#l617
>> 
>> Ah, great, thanks.
> 
> So based on this, I've come up with:

Updated:

CREATE TEMPORARY TABLE plans (
    "Node Type"             TEXT NOT NULL,
    "Strategy"              TEXT,
    "Operation"             TEXT,
    "Startup Cost"          FLOAT,
    "Total Cost"            FLOAT,
    "Plan Rows"             FLOAT,
    "Plan Width"            INTEGER,
    "Actual Startup Time"   FLOAT,
    "Actual Total Time"     FLOAT,
    "Actual Rows"           FLOAT,
    "Actual Loops"          FLOAT,
    "Parent Relationship"   TEXT,
    "Sort Key"              TEXT[],
    "Sort Method"           TEXT[],
    "Sort Space Used"       BIGINT,
    "Sort Space Type"       TEXT,
    "Join Type"             TEXT,
    "Join Filter"           TEXT,
    "Hash Cond"             TEXT,
    "Relation Name"         TEXT,
    "Alias"                 TEXT,
    "Scan Direction"        TEXT,
    "Index Name"            TEXT,
    "Index Cond"            TEXT,
    "Recheck Cond"          TEXT,
    "TID Cond"              TEXT,
    "Merge Cond"            TEXT,
    "Subplan Name"          TEXT,
    "Function Name"         TEXT,
    "Function Call"         TEXT,
    "Filter"                TEXT,
    "One-Time Filter"       TEXT,
    "Command"               TEXT,
    "Shared Hit Blocks"     BIGINT,
    "Shared Read Blocks"    BIGINT,
    "Shared Written Blocks" BIGINT,
    "Local Hit Blocks"      BIGINT,
    "Local Read Blocks"     BIGINT,
    "Local Written Blocks"  BIGINT,
    "Temp Read Blocks"      BIGINT,
    "Temp Written Blocks"   BIGINT,
    "Output"                TEXT[],
    "Hash Buckets"          BIGINT,
    "Hash Batches"          BIGINT,
    "Original Hash Batches" BIGINT,
    "Peak Memory Usage"     BIGINT,
    "Schema"                TEXT,
    "CTE Name"              TEXT
);

Would I be right that "Node Type" is the only column can be NOT NULL?

Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are expressed in milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying…

Thanks,

David

Re: Query Plan Columns

Roberto Mello <roberto.mello@gmail.com> — 2010-11-07T13:24:06Z

On Sat, Nov 6, 2010 at 6:24 PM, David E. Wheeler <david@kineticode.com> wrote:
>
> Would I be right that "Node Type" is the only column can be NOT NULL?
>
> Also, I'm thinking of making the "Actual Startup Time" and ""Actual Total Time" columns into INTERVALs. The times are expressed in milliseconds, yes? I'm wondering if INTERVAL would be more convenient for querying…

Yes, but I am wondering whether you should just stick to what would
come out of a normal explain, for consistency sake. Maybe provide
another function, or parameter that would cast the results to
intervals?

Not sure.

Roberto

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-08T23:12:35Z

On Nov 7, 2010, at 5:24 AM, Roberto Mello wrote:

> Yes, but I am wondering whether you should just stick to what would
> come out of a normal explain, for consistency sake. Maybe provide
> another function, or parameter that would cast the results to
> intervals?

I think it's more convenient to have intervals.

So my final list:

    CREATE TABLE plans (
        planned_at              TIMESTAMPTZ,
        node_id                 TEXT PRIMARY KEY,
        parent_id               TEXT REFERENCES plans(node_id),
        node_type               TEXT NOT NULL,
        total_runtime           INTERVAL,
        strategy                TEXT,
        operation               TEXT,
        startup_cost            FLOAT,
        total_cost              FLOAT,
        plan_rows               FLOAT,
        plan_width              INTEGER,
        actual_startup_time     INTERVAL,
        actual_total_time       INTERVAL,
        actual_rows             FLOAT,
        actual_loops            FLOAT,
        parent_relationship     TEXT,
        sort_key                TEXT[],
        sort_method             TEXT[],
        sort_space_used         BIGINT,
        sort_space_type         TEXT,
        join_type               TEXT,
        join_filter             TEXT,
        hash_cond               TEXT,
        relation_name           TEXT,
        alias                   TEXT,
        scan_direction          TEXT,
        index_name              TEXT,
        index_cond              TEXT,
        recheck_cond            TEXT,
        tid_cond                TEXT,
        merge_cond              TEXT,
        subplan_name            TEXT,
        function_name           TEXT,
        function_call           TEXT,
        filter                  TEXT,
        one_time_filter         TEXT,
        command                 TEXT,
        shared_hit_blocks       BIGINT,
        shared_read_blocks      BIGINT,
        shared_written_blocks   BIGINT,
        local_hit_blocks        BIGINT,
        local_read_blocks       BIGINT,
        local_written_blocks    BIGINT,
        temp_read_blocks        BIGINT,
        temp_written_blocks     BIGINT,
        output                  TEXT[],
        hash_buckets            BIGINT,
        hash_batches            BIGINT,
        original_hash_batches   BIGINT,
        peak_memory_usage       BIGINT,
        schema                  TEXT,
        cte_name                TEXT,       
        triggers                trigger_plan[]
    );

planned_at is just the current time (from NOW()).

node_id is simply `md5( pg_backend_pid() || clock_timestamp() )`, which is run just before each node is parsed.

trigger_plan is a composite type:

    CREATE TYPE trigger_plan AS (
        trigger_name    TEXT,
        constraint_name TEXT,
        relation        TEXT,
        time            INTERVAL,
        calls           FLOAT
    );

I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:

    EXPLAIN (format table) SELECT * FROM bar;

It could output a table like the above. FWIW, The function I've written works like this:

    SELECT plan('SELECT * FROM bar');

Which is an okay workaround. Anyone else think that this might be useful?

Best,

David

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-08T23:15:13Z

On Nov 8, 2010, at 3:12 PM, David E. Wheeler wrote:

> It could output a table like the above. FWIW, The function I've written works like this:
> 
>    SELECT plan('SELECT * FROM bar');

Sorry, that's

    SELECT * FROM plan('SELECT * FROM bar');

Best,

David

Re: Query Plan Columns

Dimitri Fontaine <dimitri@2ndquadrant.fr> — 2010-11-09T08:12:06Z

"David E. Wheeler" <david@kineticode.com> writes:
> I'm wondering if there's any reason why we couldn't have EXPLAIN do something like this itself in core:
>
>     EXPLAIN (format table) SELECT * FROM bar;

+1 from me here, as it happens "parsing" a table-like resultset is
already implemented everywhere :)

I'd add another +1 if it were easily usable as a "normal" relation (or
SRF) from inside a query, e.g. in a WITH table expression.

WITH plan AS (
  EXPLAIN (format table) SELECT * FROM bar
)
SELECT some_plan_analysis_query_here;

Or even:

WITH plan AS (
  EXPLAIN (format table) SELECT * FROM bar
)
INSERT INTO plan_audit
SELECT * FROM plan
 WHERE actual_total_time > 12 * interval '100 ms';

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support

Re: Query Plan Columns

David Wheeler <david@kineticode.com> — 2010-11-09T08:19:23Z

On Nov 9, 2010, at 12:12 AM, Dimitri Fontaine wrote:

> WITH plan AS (
>  EXPLAIN (format table) SELECT * FROM bar
> )
> INSERT INTO plan_audit
> SELECT * FROM plan
> WHERE actual_total_time > 12 * interval '100 ms';

Yeah, that would be nice, but my current implementation has a row for each node, and a single explain can have many nodes. With this, you'd only get the top-level node (and not even that, as you didn't do EXPLAIN ANALYZE, so actual_total_time would be blank!).

But I do like the idea…

Best,

David