Re: Logical replication is missing block of rows when sending initial sync?

From: Amit Kapila <amit.kapila16@gmail.com>
To: depesz@depesz.com
Cc: Kyotaro Horiguchi <horikyota.ntt@gmail.com>, kuroda.hayato@fujitsu.com, pgsql-bugs@lists.postgresql.org
Date: 2023-11-03T03:39:12Z
Lists: pgsql-bugs

On Thu, Nov 2, 2023 at 4:53 PM hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> On Thu, Nov 02, 2023 at 10:17:13AM +0900, Kyotaro Horiguchi wrote:
> > At Mon, 30 Oct 2023 07:10:35 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote in
> > > I've tried, but I could not reproduce the failure. PSA the script what I did.
> >
> > I'm not well-versed in the details of logical replication, but does
> > logical replication inherently operate in such a way that it fully
> > maintains relationships between tables? If not, isn't it possible that
> > the issue in question is not about missing referenced data, but merely
> > a temporary delay?
>
> The problem is that data that appeared *later* was visible on the
> subscriber. Data that came earlier was visible too. Just some block of
> data got, for some reason, skipped.
>

Quite strange. To narrow down such a problem, the first thing to
figure out is whether the data was skipped by the initial sync or by
later replication. To find that out, you can check the remote_lsn
value in pg_replication_origin_status for the origin used in the
initial sync once the relation reaches the 'ready' state. Then, on the
publisher side, you can use pg_waldump to see whether the missing rows
were written before or after that remote_lsn. That should help us
narrow down the problem and could give us some clues for the next
steps.
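
As a rough illustration of the comparison, here is a minimal Python
sketch. It assumes you have already noted two LSNs by hand: the
remote_lsn read from pg_replication_origin_status on the subscriber,
and the LSN of the WAL record for a missing row as printed by
pg_waldump on the publisher. The helper names (parse_lsn,
classify_missing_row), the query string, and the sample LSNs are
hypothetical and only for illustration; the "initial sync" versus
"later replication" labels follow the reasoning above and are a hint
for where to look next, not a diagnosis.

```python
# Query to run on the subscriber once the relation is 'ready'.
# Shown only as a string; this sketch does not connect to a database.
ORIGIN_STATUS_QUERY = (
    "SELECT external_id, remote_lsn FROM pg_replication_origin_status;"
)


def parse_lsn(text: str) -> int:
    """Convert an LSN in 'XXXXXXXX/YYYYYYYY' hex form to a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def classify_missing_row(record_lsn: str, remote_lsn: str) -> str:
    """Say on which side of remote_lsn the missing row's WAL record falls.

    Assumption: a record at or before remote_lsn points at the initial
    sync, and a record after it points at later replication.
    """
    if parse_lsn(record_lsn) <= parse_lsn(remote_lsn):
        return "initial sync"
    return "later replication"


if __name__ == "__main__":
    # Hypothetical values, not taken from the reported system.
    print(classify_missing_row("16/B374D848", "16/B3750000"))
```

The LSNs are converted to integers because comparing them as plain
strings gives wrong answers when the hex parts differ in length. If you
prefer to stay in SQL, the pg_lsn type can be compared directly with
the usual operators.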

-- 
With Regards,
Amit Kapila.