Thread

  1. Re: Fixing some ancient errors in hash join costing

    Tom Lane <tgl@sss.pgh.pa.us> — 2025-12-28T20:15:56Z

    I wrote:
    > I remain a bit confused by the change in postgres_fdw.out though.
    > It's deciding to push an ORDER BY down to the remote side when
    > it didn't before, which seems like an improvement; but I fail to
    > see how a marginal change in hash join costing would lead to that.
    > Perhaps that is worth looking into more closely.
    
    I dug into that bit today, and concluded that it's a "nothing to
    see here" case.  We are comparing the costs of doing a sort step
    locally vs remotely --- but if the remote server is identically
    configured, which it surely is in this test, then cost_sort()
    will produce the same answers on both sides, and we are comparing
    path costs that are the same to within roundoff error and
    cost-quantization effects.  My patch does move the underlying
    semijoin's cost just a hair, and that results in changes in what
    add_path does with the locally-sorted versus remotely-sorted paths,
    but really there's no reason to prefer one over the other.
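
    To illustrate the "same to within roundoff" point, here is a minimal
    sketch of a fuzzy cost comparison.  It is not the planner's actual
    code: the names CostCmp and compare_total_cost_fuzzily are
    hypothetical, and the 1% factor is an assumption chosen to resemble
    the fuzz that add_path applies.  It shows only that two costs closer
    together than the fuzz compare as equal, so the choice between them
    falls to tie-breaking rather than to any real cost difference.

    ```c
    /* Hypothetical fuzz factor: costs within 1% are treated as equal. */
    #define FUZZ_FACTOR 1.01

    typedef enum
    {
        COSTS_EQUAL,    /* neither cost is better by more than the fuzz */
        COSTS_BETTER1,  /* first cost is clearly cheaper */
        COSTS_BETTER2   /* second cost is clearly cheaper */
    } CostCmp;

    /*
     * Simplified stand-in for a planner-style fuzzy comparison of two
     * total costs.  A cost only "wins" if the other one exceeds it by
     * more than the fuzz factor.
     */
    static CostCmp
    compare_total_cost_fuzzily(double cost1, double cost2)
    {
        if (cost1 > cost2 * FUZZ_FACTOR)
            return COSTS_BETTER2;
        if (cost2 > cost1 * FUZZ_FACTOR)
            return COSTS_BETTER1;
        return COSTS_EQUAL;
    }
    ```

    With these assumptions, costs of 100.00 and 100.50 compare as
    COSTS_EQUAL, while 100.00 versus 102.00 yields COSTS_BETTER1.  When
    the result is COSTS_EQUAL, a hair's difference elsewhere can change
    which path survives.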
    
    I was amused to notice that the postgres_fdw.out change made in my
    patch reverts one made in aa86129e1 (which also affected semijoin
    costing).  So we've had trouble before with that test case being
    fundamentally unstable.  I wonder if we shouldn't do something to try
    to stabilize it?  I see that the test immediately before this one
    forces the matter by turning off enable_sort (which'd affect only
    the local side, not the remote).  That's a hack all right, but maybe
    we should extend it to this test.
    
    			regards, tom lane