Thread

  1. Re: [SQL/PGQ] Early pruning for GRAPH_TABLE path generation

    Henson Choi <assam258@gmail.com> — 2026-05-13T16:32:25Z

    Hi Junwang,
    
    Thanks for the quick turnaround on v2. Walked through it against the
    prior notes -- everything reads well; one small thing on 0002 to fix
    before the next spin, inline below.
    
    > 1. A long chain pattern that survives pruning but used to enumerate
    > >    the full N^K product (the g5 chain from your benchmark is a
    > >    natural fit). Not currently covered in
    > >    src/test/regress/sql/graph_table.sql -- worth adding.
    >
    > Added as v2-0002.
    >
    
    The g5 chain is a good fit. One nit: the comment above the SELECT
    has a U+2192 arrow in both the .sql and the .out -- unless a regress
    case is specifically exercising encoding, these stay ASCII-only, so
    please replace with a plain `->` in both files.
    
    
    > I added the following paragraph to the commit message, feel free to
    > recommend wording.
    >
    >     The cyclic case where a closing edge has both endpoints already
    >     in the prefix is already exercised by the existing same-variable
    >     loop patterns in graph_table.sql (e.g. (a)-[b]->(a)-[b]->(a)),
    >     because repeated vertex names are merged into a single path factor
    >     before DFS. Likewise, the "all edge candidates pruned" path into
    >     generate_query_for_empty_path_pattern() is already hit by the
    >     MATCH (o IS orders)-[IS customer_orders]->(c IS customers) case,
    >     where no catalog edge matches the declared direction; pruning just
    >     makes those branches easier to reach. Neither case needs a
    >     dedicated new test beyond what is already there.
    >
    
    Good.
    
    
    > >
    > > ## Nits (all stylistic -- feel free to skip)
    > >
    > > 1. In graph_path_prefix_is_feasible_with_new_element(), the
    > >    `if (!IS_EDGE_PATTERN(prev_pe->path_factor->kind)) return true;`
    > >    branch is unreachable -- gram.y enforces V-E alternation, and
    > >    the path-factor linkage code in
    > >    generate_queries_for_path_pattern() re-asserts it. Tightening
    > >    to `Assert(IS_EDGE_PATTERN(...))` would match the rest of the
    > >    file.
    >
    > Yeah, for a pattern such as (a)-[e1]-(b)-[e2]-(a)-[e3]-(c), the
    > path_factors
    > is V(a), E(e1), V(b), E(e2), E(e3), V(c), though not totally V-E
    > alternation,
    > for a Vertex, we can sure that the prev_pe is always Edge,  so changed
    > to Assert.
    >
    
    Thanks -- I hadn't realized path_factors isn't strictly V-E
    alternating; your E(e2), E(e3) example pins down the actual
    invariant the Assert relies on.
    
    
    > >
    > > 2. `normal_ok` / `reverse_ok` reads slightly off from this file's
    > >    diction. Consider `feasible` / `rev_feasible` -- the asymmetric
    > >    `rev_` prefix matches the existing `edge_qual` / `rev_edge_qual`
    > >    in generate_query_for_graph_path(), and echoes the function name
    > >    itself.
    >
    > Changed.
    >
    > >
    > > 3. A one-line comment above the V-E alternation check (whether
    > >    kept as `if` or tightened per nit 1) citing parse analysis would
    > >    save the next reader a trip through gram.y.
    >
    > I added the following, feel free to suggest wording.
    >
    > /*
    > * Merged duplicate vertices only drop redundant factors from path_factors,
    > * not from the DFS path; preceding slot is always an edge for a vertex.
    > */
    >
    
    +1, and actually a better framing than the gram.y pointer I
    originally suggested -- this is the invariant the Assert relies on.
    
    
    >
    > >
    > > 4. In the prune branch of generate_queries_for_path_pattern_recurse(),
    > >    a one-liner noting that an exhausted level returns an empty list,
    > >    which the caller routes to generate_query_for_empty_path_pattern(),
    > >    would help -- pre-existing behavior, but far more reachable after
    > >    the patch.
    >
    > Added, feel free to suggest wording.
    >
    
    Good.
    
    
    >
    > >
    > > ## On the trailing edge_qual guard
    > >
    > > The `if (edge_qual == NULL) return NULL;` in
    > > generate_query_for_graph_path() becomes unreachable after the patch,
    > > but worth keeping as a defensive backstop -- it is the literal dual
    > > of the helper's check, and removing it would only tighten coupling
    > > for no measurable gain. A short comment marking it as such would help.
    >
    > Added.
    >
    
    Thanks.
    
    With the ASCII fix in 0002, no further comments from my side.
    
    Regards,
    Henson