Re: Online base backup from the hot-standby

Fujii Masao <masao.fujii@gmail.com>

From: Fujii Masao <masao.fujii@gmail.com>
To: Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>
Cc: ssinger_pg@sympatico.ca, simon@2ndquadrant.com, pgsql-hackers@postgresql.org, magnus@hagander.net, robertmhaas@gmail.com, cedric.villemain.debian@gmail.com, heikki.linnakangas@enterprisedb.com
Date: 2011-10-12T07:53:52Z
Lists: pgsql-hackers
2011/10/12 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
> > ERROR: full_page_writes on master is set invalid at least once since
> > latest checkpoint
> >
> > I think this error should be rewritten as
> > ERROR: full_page_writes on master has been off at some point since
> > latest checkpoint
> >
> > We should be using 'off' instead of 'invalid' since that is what is what
> > the user sets it to.
>
> Sure.

What about the following message? It sounds more precise to me.

ERROR: WAL generated with full_page_writes=off was replayed since last
restartpoint

> I updated to patch corresponded above-comments.

Thanks for updating the patch! Here are the comments:

 	 * don't yet have the insert lock, forcePageWrites could change under us,
 	 * but we'll recheck it once we have the lock.
 	 */
-	doPageWrites = fullPageWrites || Insert->forcePageWrites;
+	doPageWrites = Insert->fullPageWrites || Insert->forcePageWrites;

The source comment needs to be modified.

 	 * just turned off, we could recompute the record without full pages, but
 	 * we choose not to bother.)
 	 */
-	if (Insert->forcePageWrites && !doPageWrites)
+	if ((Insert->fullPageWrites || Insert->forcePageWrites) && !doPageWrites)

Same as above.

+	LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
+	XLogCtl->Insert.fullPageWrites = fullPageWrites;
+	LWLockRelease(WALInsertLock);

I don't think WALInsertLock needs to be hold here because there is no
concurrently running process which can access Insert.fullPageWrites.
For example, Insert->currpos and Insert->LogwrtResult are also changed
without the lock there.

The source comment of XLogReportParameters() needs to be modified.

XLogReportParameters() should skip writing WAL if full_page_writes has not been
changed by SIGHUP.

XLogReportParameters() should skip updating pg_control if any parameter related
to hot standby has not been changed.

+	if (!fpw_manager)
+	{
+		LWLockAcquire(WALInsertLock, LW_EXCLUSIVE);
+		fpw = XLogCtl->Insert.fullPageWrites;
+		LWLockRelease(WALInsertLock);

It's safe to take WALInsertLock with shared mode here.

In checkpoint, XLogReportParameters() is called only when wal_level is
hot_standby.
OTOH, in walwriter, it's always called even when wal_level is not hot_standby.
Can't we skip calling XLogReportParameters() whenever wal_level is not
hot_standby?

In do_pg_start_backup() and do_pg_stop_backup(), the spinlock must be held to
see XLogCtl->lastFpwDisabledLSN.

+	/* check whether the master's FPW is 'off' since pg_start_backup. */
+	if (recovery_in_progress && XLByteLE(startpoint, XLogCtl->lastFpwDisabledLSN))
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+			  errmsg("full_page_writes on master has been off at some point
during online backup")));

What about changing the error message to:
ERROR: WAL generated with full_page_writes=off was replayed during online backup

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center