Thread
Commits
Same data as JSON:
GET /api/v1/messages/:b64id/commits
the thread's linked commits as JSON, with link sources.
API reference →
-
injection_points: Add proper locking when reporting fixed-variable stats
- b5f898944d12 18.1 landed
- 3cc689f833b1 19 (unreleased) landed
-
Fix locking issue with fixed-size stats template in injection_points
Michael Paquier <michael@paquier.xyz> — 2025-09-29T00:48:44Z
Hi all, While doing some concurrency benchmarking with injection_points stats enabled in a server, I have been able to trigger an assertion failure in pgstat_begin_changecount_write(): #4 0x0000564917fdc816 in ExceptionalCondition (conditionName=0x7f60af0dc5c8 "(*cc & 1) == 0", fileName=0x7f60af0dc598 "../../../../src/include/utils/pgstat_internal.h", lineNumber=831) at assert.c:65 #5 0x00007f60af0da3ef in pgstat_begin_changecount_write (cc=0x7f60acbb8e10) at ../../../../src/include/utils/pgstat_internal.h:831 #6 0x00007f60af0db1d9 in pgstat_report_inj_fixed (numattach=0, numdetach=0, numrun=1, numcached=0, numloaded=0) at injection_stats_fixed.c:155 #7 0x00007f60af0d8b5c in injection_points_run (fcinfo=0x564931b66588) at injection_points.c:429 This can be reproduced as follows. First, postgresql.conf: shared_preload_libraries = 'injection_points' injection_points.stats = on Then something like the following command: $ cat create_inj.sql \set id random(1,100000) select injection_points_attach('popo:id', 'notice'); select injection_points_run('popo:id'); select injection_points_detach('popo:id'); $ pgbench -n -T 300 -f create_inj.sql -c 10 The failure is not surprising, because the stats reports can happen in a concurrent fashion when a point is run for example, contrary to other fixed-sized stats kind where the reports are only done by a single process (archiver, bgwriter, checkpointer). So this is just a matter of acquiring a lock that was forgotten, to make sure that the changes are consistent. Far from critical as this is template code, still embarrassing. Thoughts or comments? -- Michael -
Re: Fix locking issue with fixed-size stats template in injection_points
Chao Li <li.evan.chao@gmail.com> — 2025-09-29T01:46:05Z
> On Sep 29, 2025, at 08:48, Michael Paquier <michael@paquier.xyz> wrote: > > > The failure is not surprising, because the stats reports can happen in > a concurrent fashion when a point is run for example, contrary to > other fixed-sized stats kind where the reports are only done by a > single process (archiver, bgwriter, checkpointer). So this is just a > matter of acquiring a lock that was forgotten, to make sure that the > changes are consistent. Far from critical as this is template code, > still embarrassing. > > Thoughts or comments? I saw pg_state_begin_changecount_write() is called multiple places, as you mention, for example bgwriter. But there are not the same lock acquired in other places, for example, in bgwriter: void pgstat_report_bgwriter(void) { PgStatShared_BgWriter *stats_shmem = &pgStatLocal.shmem->bgwriter; Assert(!pgStatLocal.shmem->is_shutdown); pgstat_assert_is_up(); /* * This function can be called even if nothing at all has happened. In * this case, avoid unnecessarily modifying the stats entry. */ if (pg_memory_is_all_zeros(&PendingBgWriterStats, sizeof(struct PgStat_BgWriterStats))) return; pgstat_begin_changecount_write(&stats_shmem->changecount); #define BGWRITER_ACC(fld) stats_shmem->stats.fld += PendingBgWriterStats.fld BGWRITER_ACC(buf_written_clean); BGWRITER_ACC(maxwritten_clean); BGWRITER_ACC(buf_alloc); #undef BGWRITER_ACC pgstat_end_changecount_write(&stats_shmem->changecount); Only adding the lock in pg_report_inj_fixed() won’t prevent the race conditions from bgwriter. So I wonder, do we need to add the same lock in the other places? Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/ -
Re: Fix locking issue with fixed-size stats template in injection_points
Michael Paquier <michael@paquier.xyz> — 2025-09-29T06:27:56Z
On Mon, Sep 29, 2025 at 09:46:05AM +0800, Chao Li wrote: > I saw pg_state_begin_changecount_write() is called multiple places, > as you mention, for example bgwriter. I've mentioned that in my first email, and put in details: - pgstat_report_bgwriter() is called once, by the bgwriter. - pgstat_report_checkpointer() is called three time, all by the checkpointer. - pgstat_report_archiver() is called twice, all by pgarch.c. So all of them don't have a problem, two calls cannot happen concurrently. -- Michael
-
Re: Fix locking issue with fixed-size stats template in injection_points
Bertrand Drouvot <bertranddrouvot.pg@gmail.com> — 2025-09-29T06:52:31Z
Hi, On Mon, Sep 29, 2025 at 09:48:44AM +0900, Michael Paquier wrote: > Then something like the following command: > $ cat create_inj.sql > \set id random(1,100000) > select injection_points_attach('popo:id', 'notice'); > select injection_points_run('popo:id'); > select injection_points_detach('popo:id'); > $ pgbench -n -T 300 -f create_inj.sql -c 10 > > The failure is not surprising, because the stats reports can happen in > a concurrent fashion when a point is run for example, contrary to > other fixed-sized stats kind where the reports are only done by a > single process (archiver, bgwriter, checkpointer). So this is just a > matter of acquiring a lock that was forgotten, to make sure that the > changes are consistent. Far from critical as this is template code, > still embarrassing. > > Thoughts or comments? Patch LGTM. Remark: I like the "popo" prefix in your test ;-) Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com -
Re: Fix locking issue with fixed-size stats template in injection_points
Chao Li <li.evan.chao@gmail.com> — 2025-09-29T07:29:25Z
> On Sep 29, 2025, at 14:27, Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Sep 29, 2025 at 09:46:05AM +0800, Chao Li wrote: >> I saw pg_state_begin_changecount_write() is called multiple places, >> as you mention, for example bgwriter. > > I've mentioned that in my first email, and put in details: > - pgstat_report_bgwriter() is called once, by the bgwriter. > - pgstat_report_checkpointer() is called three time, all by the > checkpointer. > - pgstat_report_archiver() is called twice, all by pgarch.c. > > So all of them don't have a problem, two calls cannot happen > concurrently. > -- > Michael Thanks for the clarification. Then the patch looks good to me. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
-
Re: Fix locking issue with fixed-size stats template in injection_points
wenhui qiu <qiuwenhuifx@gmail.com> — 2025-09-29T07:54:39Z
HI This patch looks good to me. The issue is clear: unlike other fixed-size stats kinds (archiver, bgwriter, checkpointer), the injection_points stats can be updated concurrently by multiple backends. Without synchronization, this can lead to inconsistent changecount state and assertion failures in pgstat_begin_changecount_write(), as shown in your reproduction. Thanks On Mon, Sep 29, 2025 at 3:29 PM Chao Li <li.evan.chao@gmail.com> wrote: > > > On Sep 29, 2025, at 14:27, Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Sep 29, 2025 at 09:46:05AM +0800, Chao Li wrote: > > I saw pg_state_begin_changecount_write() is called multiple places, > as you mention, for example bgwriter. > > > I've mentioned that in my first email, and put in details: > - pgstat_report_bgwriter() is called once, by the bgwriter. > - pgstat_report_checkpointer() is called three time, all by the > checkpointer. > - pgstat_report_archiver() is called twice, all by pgarch.c. > > So all of them don't have a problem, two calls cannot happen > concurrently. > -- > Michael > > > Thanks for the clarification. Then the patch looks good to me. > > Best regards, > -- > Chao Li (Evan) > HighGo Software Co., Ltd. > https://www.highgo.com/ > > > > >
-
Re: Fix locking issue with fixed-size stats template in injection_points
Michael Paquier <michael@paquier.xyz> — 2025-09-29T08:01:37Z
On Mon, Sep 29, 2025 at 06:52:31AM +0000, Bertrand Drouvot wrote: > Remark: I like the "popo" prefix in your test ;-) It's that or popopop. "glop" and "pas-glop" were my other candidates, but I doubt we can use them freely, and you may be the only one around here to know what this refers to. :D -- Michael
-
Re: Fix locking issue with fixed-size stats template in injection_points
Michael Paquier <michael@paquier.xyz> — 2025-09-30T00:08:32Z
On Mon, Sep 29, 2025 at 05:01:37PM +0900, Michael Paquier wrote: > It's that or popopop. "glop" and "pas-glop" were my other candidates, > but I doubt we can use them freely, and you may be the only one around > here to know what this refers to. :D Putting that aside, fixed down to v18. -- Michael