Thread

  1. Re: Add wait events for server logging destination writes

    신성준 <shinsj4653@gmail.com> — 2026-05-31T10:42:41Z

    Hi,
    
    cfbot caught a build failure on v1, in the SanityCheck task on Linux
    and Windows: elog.c uses pgstat_report_wait_start()/end() and the
    WAIT_EVENT_* constants but didn't include utils/wait_event.h. It only
    built here because of an accidental transitive include on my machine;
    on the CI images the declarations weren't visible.
    
    v2 fixes that by adding the missing #include "utils/wait_event.h" to
    elog.c, folded into 0001 so that patch builds on its own. No other
    changes; the wait events and the reported write paths are the same as
    in v1.
    
    v2-0001 adds the two events and covers the write(2) paths.
    v2-0002 covers the Windows WriteConsoleW() path, split out as before.
    
    Applies cleanly on current master; full build passes locally.
    
    Thanks,
    Seongjun Shin
    
    2026년 5월 31일 (일) 오후 5:50, 신성준 <shinsj4653@gmail.com>님이 작성:
    >
    > Hi hackers,
    >
    > The write(2) calls that flush server log output aren't covered by wait
    > events. When a backend logs something, the writes go out in:
    >
    >   - write_pipe_chunks(): write(2) to the syslogger pipe
    >   - write_console(): write(2) to stderr (WriteConsoleW() on Windows)
    >
    > If one of those blocks -- syslogger pipe full, slow console, slow log
    > device -- pg_stat_activity just shows wait_event = NULL until it
    > returns. Since NULL usually reads as "on CPU", a backend stuck writing
    > logs looks like it's doing work, so logging-related stalls are easy to
    > miss.
    >
    > Attached is a short series that adds two WaitEventIO events and reports
    > them around those writes:
    >
    >   IO / SysloggerWrite - write(2) to the syslogger pipe
    >   IO / StderrWrite - write(2) to stderr, and WriteConsoleW()
    >
    > 0001 adds the events and covers the write(2) paths. 0002 does the
    > Windows WriteConsoleW() path, split out since it's platform-specific.
    >
    > It only wraps the leaf write call and uses the existing
    > pgstat_report_wait_start()/end() helpers, so it stays allocation-free
    > and safe to call from inside the error-reporting path.
    >
    > I did a quick before/after to make sure the events show up: 8 backends
    > each emitting large RAISE LOG lines, sampling wait_event from
    > pg_stat_activity every 50 ms for 20 s.
    >
    >   - logging_collector = on (syslogger pipe):
    >     master:  NULL                100.0%  (2184/2184)
    >     patched: IO/SysloggerWrite     99.1%  (2204/2224), NULL 0.9%
    >
    >   - logging_collector = off (stderr):
    >     master:  NULL                100.0%  (2144/2144)
    >     patched: IO/StderrWrite        90.7%  (1952/2152), NULL 9.3%
    >
    > On master that wait time is just invisible; with the patch it lands on
    > the new events. I can send the scripts and raw samples if anyone wants
    > to reproduce it.
    >
    > Applies on current master. A couple of things I'm unsure about and
    > would appreciate input on: whether the event names fit the surrounding
    > conventions, and whether splitting the Windows path into its own patch
    > is the right call.
    >
    > Thanks,
    > Seongjun Shin