Re: Bug in pg_stat_statements
Sami Imseih <samimseih@gmail.com>
From: Sami Imseih <samimseih@gmail.com>
To: Dmitry Dolgov <9erthalion6@gmail.com>
Cc: Konstantin Knizhnik <knizhnik@garret.ru>, PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Date: 2025-10-24T21:04:20Z
Lists: pgsql-hackers
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
pg_stat_statements: Fix handling of duplicate constant locations
- b1635c166698 18.1 landed
- 16edc1b94fc2 19 (unreleased) landed
Attachments
- v3-0001-pg_stat_statements-Fix-duplicate-constant-locatio.patch (application/octet-stream) patch v3-0001
- v3-0002-pg_stat_statements-move-out-jumble-specific-routi.patch (application/octet-stream) patch v3-0002
> > > > Having said > > > > that it seems that another solution would be to check for duplicated constants > > > > in fill_in_constant_length > > > > > > Yes, I started thinking along these lines as well, where we check for > > > duplicates > > > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we > > > have a squashable list, which is what I have attached with updated test cases. > > > > > > This means we do need to scan the locations one extra time during jumbling, > > > but I don't see that as a problem. Maybe there is another better way? > > > > Why? What I hand in mind is something like this, after a quick test it seems to > > be able to address both the original case and the one you've discovered. > > ahh, what you have works because clocations is sorted by location before > the loop starts in `fill_in_constant_lengths` , so comparing locations > makes sense in this case. I am OK with this. v3-0001 does what Dimiti suggests with some slight tweaks: 1/ Moving "last_loc = loc;" earlier in the loop 2/ removing a comment for fill_in_constant_lengths that was an oversight from 0f65f3eec, as we no longer assume that the constant location is -1 when entering that routine. Also added the test cases that were discovered, besides the one from the original report. > I was really hoping that the fix could be inside of query jumbling, as I think > pg_stat_statements or any consumers of query jumbling don't need to > care more about squashing ( or other internals of jumbling ). > So now I also wonder if we should also move: > > ``` > static char *generate_normalized_query(JumbleState *jstate, const char *query, > int query_loc, int *query_len_p); > > static void fill_in_constant_lengths(JumbleState *jstate, const char *query, > int query_loc); > ``` > > from pg_stat_statements.c to queryjumblefuncs.c? > > v3-0002 moved both generate_normalized_query and fill_in_constant_lengths to queryjumblefuncs.c, as these routines deal with the internal of jumbling and should not be a concern of pg_stat_statements. I think this is a good cleanup, but others may have different opinions on this. -- Sami Imseih Amazon Web Services (AWS)