Thread
-
Re: Proposal: Conflict log history table for Logical Replication
Peter Smith <smithpb2250@gmail.com> — 2025-12-23T06:49:21Z
Hi Dilip. Here are some review comments after a first pass of patch v15-0001. ====== Commit Message 1. If user choose to log into the table the table will automatically created while creating the subscription with internal name i.e. conflict_log_table_$subid$. The table will be created in the current search path and table would be automatically dropped while dropping the subscription. English: /If user choose/ /the table the table/ /and table would/ ====== src/backend/commands/subscriptioncmds.c 2. +#define SUBOPT_CONFLICT_LOG_DESTINATION 0x00040000 For the values, you are using DEST instead of DESTINATION. You can do the same here to keep the macro name a bit shorter. ~~~ parse_subscription_options: 3. + dest = GetLogDestination(val); + + if (dest == CONFLICT_LOG_DEST_INVALID) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("unrecognized conflict_log_destination value: \"%s\"", val), + errhint("Valid values are \"log\", \"table\", and \"all\"."))); I don't think CONFLICT_LOG_DEST_INVALID should even exist as an enum value. Instead, the validation and the ereport(ERROR) should all be done within GetLogDestination function. So, it should only return valid values, else give an error. ~~~ CreateSubscription: 4. + /* Always set the destination, default will be log. */ + values[Anum_pg_subscription_sublogdestination - 1] = + CStringGetTextDatum(ConflictLogDestLabels[opts.logdest]); + + /* + * If the conflict log destination includes 'table', generate an internal + * name using the subscription OID and determine the target namespace based + * on the current search path. Store the namespace OID and the conflict log + * format in the pg_subscription catalog tuple., then physically create + * the table. + */ 4a. When referring to these parameter values, you should always consistently quote them. Currently, there is a mix of lots of formats. (e.g. log (unquoted), 'table' (single-quoted), "log" (double-quoted)). Pick one style, and make them all the same. Check for the same everywhere. ~ 4b. Typo "tuple.," ~~~ 5. + if (opts.logdest == CONFLICT_LOG_DEST_TABLE || + opts.logdest == CONFLICT_LOG_DEST_ALL) IIUC, you are effectively treating these parameter values like bits that can be OR-ed together. And if in the future a "list" is supported, then that's exactly what you will be doing. So, IMO, they should be defined that way. See a review comment later in this post. e.g. this condition would be written more like: if ((opts.logdest & CONFLICT_LOG_DEST_TABLE) != 0) or, using the macro if (IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE)) ~~~ AlterSubscription: 6. + if (opts.logdest != old_dest) + { + bool want_table = + (opts.logdest == CONFLICT_LOG_DEST_TABLE || + opts.logdest == CONFLICT_LOG_DEST_ALL); + bool has_oldtable = + (old_dest == CONFLICT_LOG_DEST_TABLE || + old_dest == CONFLICT_LOG_DEST_ALL); + This is more of the same kind of logic that convinces me the code should be using bitmasks. SUGGESTION bool want_table = IsSet(opts.logdest, CONFLICT_LOG_DEST_TABLE); bool has_oldtable = IsSet(olddest, CONFLICT_LOG_DEST_TABLE); ~~~ create_conflict_log_table: 7. +/* + * Create conflict log table. + * + * The subscription owner becomes the owner of this table and has all + * privileges on it. + */ +static Oid +create_conflict_log_table(Oid subid, char *subname, Oid namespaceId, + char *conflictrel) I felt something like 'relname' is a better name for the char * conflictrel param. It clearly is the name of the conflict relation because of the name of the function. ~~~ 8. + /* Add a comments for the conflict log table. */ + snprintf(comment, sizeof(comment), + "Conflict log table for subscription \"%s\"", subname); + CreateComments(relid, RelationRelationId, 0, comment); + 8a. typo /Add a comments/Add a comment/ ~ 8b. My (previous review) suggestion for adding a table comment/description made more sense when the CLT was some arbitrary name chosen by the user. But, now that the CLT is a name like "conflict_log_table_%u", the idea for a comment seems redundant. ~~~ 9. +/* + * Format the standardized internal conflict log table name for a subscription + * + * Use the OID to prevent collisions during rename operations. + */ +void +GetConflictLogTableName(char *dest, Oid subid) +{ + snprintf(dest, NAMEDATALEN, "conflict_log_table_%u", subid); +} + 9a. To emphasise that this is an "internal" table, IMO there should be a "pg_" prefix for this table name. ~ 9b. Since it is internal anyway, why not make the tablename descriptive to clarify what that number means? e.g. "pg_conflict_log_table_for_subid_%u" BTW, since it is already a TABLE, then why is "table" even part of this name? Why not just "pg_conflict_log_for_subid_%u" ~~~ 10. +/* + * GetLogDestination + * + * Convert string to enum by comparing against standardized labels. + */ +ConflictLogDest +GetLogDestination(const char *dest) +{ + /* Empty string or NULL defaults to LOG. */ + if (dest == NULL || dest[0] == '\0') + return CONFLICT_LOG_DEST_LOG; + + for (int i = CONFLICT_LOG_DEST_LOG; i <= CONFLICT_LOG_DEST_ALL; i++) + { + if (pg_strcasecmp(dest, ConflictLogDestLabels[i]) == 0) + return (ConflictLogDest) i; + } + + /* Unrecognized string. */ + return CONFLICT_LOG_DEST_INVALID; +} Mentioned previously: I think there should be no such thing as CONFLICT_LOG_DEST_INVALID. I also think this function should be responsible for the ereport(ERROR). ====== src/include/catalog/pg_subscription.h 11. + /* + * Strategy for logging replication conflicts: + * log - server log only, + * table - internal table only, + * all - both log and table. + */ + text sublogdestination; + SUGGEST 'subconflictlogdest' (see next review comment #12 for why) ~~~ 12. + Oid conflictrelid; /* conflict log table Oid */ char *conninfo; /* Connection string to the publisher */ char *slotname; /* Name of the replication slot */ char *synccommit; /* Synchronous commit setting for worker */ List *publications; /* List of publication names to subscribe to */ char *origin; /* Only publish data originating from the * specified origin */ + char *logdestination; /* Conflict log destination */ } Subscription; These don't seem very good member names: Maybe 'conflictrelid' -> 'conflictlogrelid' (because it's rel of the log; not the conflict) Maybe 'logdestination' -> 'conflictlogdest' (because in future there might be other kinds of subscription logs) ====== src/include/replication/conflict.h 13. +typedef enum ConflictLogDest +{ + CONFLICT_LOG_DEST_INVALID = 0, + CONFLICT_LOG_DEST_LOG, /* "log" (default) */ + CONFLICT_LOG_DEST_TABLE, /* "table" */ + CONFLICT_LOG_DEST_ALL /* "all" */ +} ConflictLogDest; + I didn't like this enum much. Suggest removing CONFLICT_LOG_DEST_INVALID. And use bits for the other values. And you can still have a default enum if you want. SUGGESTION typedef enum ConflictLogDest { CONFLICT_LOG_DEST_LOG = 0x001, CONFLICT_LOG_DEST_TABLE = 0x010, CONFLICT_LOG_DEST_DEFAULT = CONFLICT_LOG_DEST_LOG, CONFLICT_LOG_DEST_ALL = CONFLICT_LOG_DEST_LOG | CONFLICT_LOG_DEST_TABLE, } ConflictLogDest; BTW, there are only a few values that the array won't exceed length 0x11, so I guess you can still keep your same designated initialiser for the dest labels. ====== Kind Regards, Peter Smith. Fujitsu Australia