Thread

  1. RE: libpq debug log

    Aya Iwata (Fujitsu) <iwata.aya@fujitsu.com> — 2021-03-17T15:06:14Z

    Hi Tsunakawa san, Alvaro san,
    
    Thank you very much for your review. It helped me a lot to make the patch better.
    I update patch to v26. 
    This patch has been updated according to Tsunakawa san and Alvaro san review comments.
    
    The output is following;
    ```
    2021-03-17 14:43:16.411238  >   Terminate   4                                         
    2021-03-17 14:43:16.441775  >   Query   155 "CREATE TABLESPACE regress_tblspacewith LOCATION '/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH (some_nonexistent_parameter = true);"
    2021-03-17 14:43:16.442935  <   ErrorResponse   124 S "ERROR" V "ERROR" C "22023" M "unrecognized parameter "some_nonexistent_parameter"" F "reloptions.c" L "1447" R "parseRelOptionsInternal" \x00 "Z"
    2021-03-17 14:43:16.442977  <   ReadyForQuery   5 I                                   
    2021-03-17 14:43:16.443042  >   Query   144 "CREATE TABLESPACE regress_tblspacewith LOCATION '/home/iwata/pgsql/postgres/src/test/regress/testtablespace' WITH (random_page_cost = 3.0);"
    2021-03-17 14:43:16.444774  <   CommandComplete 22 "CREATE TABLESPACE"                
    2021-03-17 14:43:16.444807  <   ReadyForQuery   5 I                             
    2021-03-17 14:43:16.444878  >   Query   81 "SELECT spcoptions FROM pg_tablespace WHERE spcname = 'regress_tblspacewith';"
    2021-03-17 14:43:16.448117  <   RowDescription  35 1 "spcoptions" 1213 5 1009 65535 -1 0
    2021-03-17 14:43:16.448157  <   DataRow 32 1 22 '{random_page_cost=3.0}'              
    2021-03-17 14:43:16.448171  <   CommandComplete 13 "SELECT 1"       
    ```
    
    > -----Original Message-----
    > From: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa.takay@fujitsu.com>
    > Sent: Tuesday, March 16, 2021 12:03 PM
     
     
    > (1)
    > -      Enables  tracing of the client/server communication to a debugging
    > file stream.
    > +      Enables tracing of the client/server communication to a debugging file
    > +      stream.
    > +      Only available when protocol version 3 or higher is used.
    > 
    > The last line is unnecessary now that v2 is not supported.
    
    I removed last comment from documentation file.
    
    > (2)
    > @@ -113,6 +114,7 @@ install: all installdirs install-lib
    >  	$(INSTALL_DATA) $(srcdir)/libpq-fe.h '$(DESTDIR)$(includedir)'
    >  	$(INSTALL_DATA) $(srcdir)/libpq-events.h '$(DESTDIR)$(includedir)'
    >  	$(INSTALL_DATA) $(srcdir)/libpq-int.h
    > '$(DESTDIR)$(includedir_internal)'
    > +	$(INSTALL_DATA) $(srcdir)/libpq-trace.h
    > '$(DESTDIR)$(includedir_internal)'
    >  	$(INSTALL_DATA) $(srcdir)/pqexpbuffer.h
    > '$(DESTDIR)$(includedir_internal)'
    >  	$(INSTALL_DATA) $(srcdir)/pg_service.conf.sample
    > '$(DESTDIR)$(datadir)/pg_service.conf.sample'
    > 
    > Why is this necessary?
    
    I think it is not necessary. I removed it.
    
    > (3) libpq-trace.h
    > +#ifdef __cplusplus
    > +extern "C"
    > +{
    > 
    > This is unnecessary because pqTraceOutputMessage() is for libpq's internal
    > use.  This extern is for the user's C++ application to call public libpq
    > functions.
    > 
    > +#include "libpq-fe.h"
    > +#include "libpq-int.h"
    > 
    > I think these can be removed by declaring the struct and function as follows:
    > 
    > struct pg_conn;
    > extern void pqTraceOutputMessage(struct pg_conn *conn, const char
    > *message, bool toServer);
    > 
    > But... after doing the above, only this declaration is left in libpq-trace.h.  Why
    > don't you put your original declaration using PGconn in libpq-int.h and remove
    > libpq-trace.h?
    
    I remove this file.
    I was wondering whether to delete this file during the development of v25 patch. I leave it because it keep scalability.
    If tracing process become update and it have a lot of extern function, leave this header file is meaningful.
    However I think it does not happen. So I remove it.
    
    > (4)
    > +	/* Trace message only when there is first 1 byte */
    > +	if (conn->Pfdebug && (conn->outCount < conn->outMsgStart))
    > +		pqTraceOutputMessage(conn, conn->outBuffer +
    > conn->outCount, true);
    > 
    > () surrounding the condition after && can be removed.
    
    I removed this (). And This if () statement has been modified according to the review comment (14).
    
    > (5)
    > +static const char *pqGetProtocolMsgType(unsigned char c,
    > +
    > 	bool toServer);
    > 
    > This is unnecessary because the function definition precedes its only one call
    > site.
    
    Yes, I removed this definition.
    
    > (6)
    > + * We accumulate frontend message pieces in an array as the libpq code
    > + writes
    > + * them, and log the complete message when pqTraceOutputFeMsg is called.
    > + * For backend, we print the pieces as soon as we receive them from the
    > server.
    > 
    > The first paragraph is no longer true.  I think this entire comment is
    > unnecessary.
    
    I removed this explanation. 
    
    > (7)
    > +static const char *
    > +pqGetProtocolMsgType(unsigned char c, bool toServer) {
    > +	if (toServer == true &&
    > +		c < lengthof(protocol_message_type_f) &&
    > +		protocol_message_type_f[c] != NULL)
    > +		return protocol_message_type_f[c];
    > +
    > +	if (toServer == false &&
    > +		c < lengthof(protocol_message_type_b) &&
    > +		protocol_message_type_b[c] != NULL)
    > +		return protocol_message_type_b[c];
    > +
    > +	return "UnknownMessage:";
    > +}
    > 
    > "== true/false" can be removed.  libpq doesn't appear to use such style.
    
    Yes, I removed it.
    
    > 
    > Why don't we embed this processing in pqTraceOutputMessage() because this
    > function is short and called only once?  The added benefit would be that you
    > can print an invalid message type byte like this:
    > 
    > 	fprintf(..., "Unknown message: %02x\n", ...);
    
    Sure. I fixed this message output.
    And I removed switch-case default fprintf which output `Invalidate Protocol`. 
    Because Unknown...message covers this message meaning.
    
    ...
    > (8)
    > +	char 		id = '\0';
    > 
    > This initialization is not required because id will always be assigned a value
    > shortly.
    
    I see, I removed this initialization.
    
    > (9)
    > +static int
    > +pqTraceOutputInt32(const char *data, FILE *pfdebug) {
    > +	int			result;
    > +
    > +	memcpy(&result, data, 4);
    > +	result = (int) pg_ntoh32(result);
    > +	fputc(' ', pfdebug);
    > +	fprintf(pfdebug, "%d", result);
    > +
    > +	return result;
    > +}
    > 
    > fputc() and fprintf() can be merged into one fprintf() call for efficiency.
    
    The reason I separated this is because I thought it would be easy to handle any changes 
    that I wanted to make, such as changing spaces to tabs. 
    However this appear only 5 times. So I merged fprintf and fputc.
    
    
    > (10)
    > +/* BackendKeyData */
    > +static void
    > +pqTraceOutputK(const char *message, FILE *f) {
    > +	int	cursor = 0;
    > +
    > +	pqTraceOutputInt32(message + cursor, f);
    > +	cursor += 4;
    > +	pqTraceOutputInt32(message + cursor, f); }
    > 
    > I don't think you need to always use a cursor variable in order to align with
    > more complex messages.  That is, you can just write:
    > 
    > +	pqTraceOutputInt32(message, f);
    > +	pqTraceOutputInt32(message + 4, f);
    
    To follow Alvaro san's suggestion which is * number 4, 
    I left this cursor.
    
    > (11)
    > +		default:
    > +			fprintf(conn->Pfdebug, "Invalid Protocol:%c\n", id);
    > +			break;
    > +
    > 
    > (This is related to (7))
    > You can remove this default label if you exit the function before the switch
    > statement when the message type is unknown.  Make sure to output \n in
    > that case.
    
    I removed the default.
    
    > (12) pqTraceOutputB
    > +	for (i = 0; i < nparams; i++)
    > +	{
    > +		nbytes = pqTraceOutputInt32(message + cursor, f);
    > +		cursor += 4;
    > +		if (nbytes == -1)
    > +			break;
    > +		pqTraceOutputNchar(message + cursor, f, nbytes);
    > +		cursor += nbytes;
    > +	}
    > 
    > Not break but continue, because non-NULL parameters may follow a NULL
    > parameter.
    
    I changed break to continue.
    
    > (13) pqTraceOutputB
    > +	pqTraceOutputInt16(message + cursor, f);
    > +	cursor += 4;
    > +	pqTraceOutputInt16(message + cursor, f); }
    > 
    > This part doesn't seem to match the following description.
    > 
    > ----------
    > After the last parameter, the following fields appear:
    > 
    > Int16
    > The number of result-column format codes that follow (denoted R below).
    > This can be zero to indicate that there are no result columns or that the result
    > columns should all use the default format (text); or one, in which case the
    > specified format code is applied to all result columns (if any); or it can equal
    > the actual number of result columns of the query.
    > 
    > Int16[R]
    > The result-column format codes. Each must presently be zero (text) or one
    > (binary).
    > ---------
    
    Thank you. I fixed it to use for loop.
    
    
    > (14)
    > The processing for CancelRequest message is missing?
    
    This message does not have first 1 byte. So I remove it from my patch.
    I create new function pqTraceOutputNoTypeByteMessage() and prepare it in the function.
    
    > (15)
    > +/* CopyInResponse */
    > +static void
    > +pqTraceOutputG(const char *message, int end, FILE *f) {
    > +	int	cursor = 0;
    > +
    > +	pqTraceOutputByte1(message + cursor, f);
    > +	cursor++;
    > +
    > +	while (end > cursor)
    > +	{
    > +		pqTraceOutputInt16(message + cursor, f);
    > +		cursor += 2;
    > +	}
    > +}
    > +
    > 
    > According to the following description, for loop should be used.
    > 
    > ----------
    > Int16
    > The number of columns in the data to be copied (denoted N below).
    > 
    > Int16[N]
    > The format codes to be used for each column. Each must presently be zero
    > (text) or one (binary). All must be zero if the overall copy format is textual.
    > ----------
    
    I fixed it.
    
    > (16)
    > (15) is also true for the processing of 'H' message.
    
    Thank you. I fixed it.
    
    > (17) pqTraceOutputD
    > +		for (i = 0; i < nfields; i++)
    > +		{
    > +			len = pqTraceOutputInt32(message + cursor, f);
    > +			cursor += 4;
    > +			if (len == -1)
    > +				break;
    > +			pqTraceOutputNchar(message + cursor, f, len);
    > +			cursor += len;
    > +		}
    > 
    > Not break but continue, because non-NULL columns may follow a NULL
    > column.
    
    I fixed it.
    
    > (18)
    > +		case 'E':	/* Execute(F) or Error Return(B) */
    > +			pqTraceOutputE(message + LogCursor, LogEnd,
    > conn->Pfdebug, toServer);
    > +			break;
    > 
    > Error Return -> ErrorResponse
    
    I fixed it.
    
    > (19) pqTraceOutputF
    > Check the protocol again.  Two for loops should be required, one for format
    > codes and another for arguments.
    
    Thank you. I fixed it.
    
    > (20)
    > +	if ( len != -1)
    > 
    > Remove extra space before len.
    
    I remove this space.
    
    > (21)
    > I guess you intentionally omitted the following messages because they are
    > only used during connection establishment.  I'm fine with it.  I wrote this just
    > in case you missed them.
    > 
    > GSSENCRequest (F)
    > Int32(8)
    > 
    > GSSResponse (F)
    > Byte1('p')
    > 
    > PasswordMessage (F)
    > Byte1('p')
    > 
    > SASLInitialResponse (F)
    > Byte1('p')
    > 
    > SASLResponse (F)
    > Byte1('p')
    
    Yes, I think it does not appear.
    
    > (22)
    > +/* NotificationResponse */
    > +static void
    > +pqTraceOutputA(const char *message, int end, FILE *f) {
    > +	int	cursor = 0;
    > +
    > +	pqTraceOutputInt16(message + cursor, f);
    > +	cursor += 2;
    > +	pqTraceOutputString(message + cursor, f);
    > 
    > Not Int16 but Int32.
    
    I fixed it.
    
    > From: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa.takay@fujitsu.com>
    > Sent: Wednesday, March 17, 2021 11:10 AM
    
    > > * I would make the pqTraceOutputInt16() function and siblings advance
    > >   the cursor themselves, actually.  I think something like this:
    > >   static int
    > >   pqTraceOutputInt16(const char *data, int *cursor, FILE *pfdebug)
    > >   {
    > > 	  uint16		tmp;
    > > 	  int			result;
    > >
    > > 	  memcpy(&tmp, data + *cursor, 2);
    > > 	  *cursor += 2;
    > > 	  result = (int) pg_ntoh16(tmp);
    > > 	  fprintf(pfdebug, " #%d", result);
    > >
    > > 	  return result;
    > >   }
    > >   (So the caller has to pass the original "data" pointer, not
    > >   "data+cursor").  This means the caller no longer has to do "cursor+=N"
    > >   at each place.  Also, you get rid of the moveStrCursor() which does
    > >   not look good.
    > 
    > That makes sense, because in fact the patch mistakenly added 4 when it
    > should add 2.  Also, the code would become smaller.
    
    I changed 5 message data type function this style.
    
    > > * I'm not fond of the idea of prefixing "#" for 16bit ints and no
    > >   prefix for 32bit ints.  Seems obscure and the output looks weird.
    > >   I would use a one-letter prefix for each type, "w" for 32-bit ints
    > >   (stands for "word") and "h" for 16-bit ints (stands for "half-word").
    > >   Message length goes unadorned.  Then you'd have lines like this
    > >
    > > 2021-03-15 08:10:44.324296  <   RowDescription  35 h1 "spcoptions"
    > > w1213 h5 w1009 h65535 w-1 h0
    > > 2021-03-15 08:10:44.324335  <   DataRow 32 h1 w22
    > > '{random_page_cost=3.0}'
    > 
    > Yes, actually I felt something similar.  Taking a second thought, I think we
    > don't have to have any prefix because it doesn't help users.  So we're
    > removing the prefix.  We don't have any opinion on adding some prefix.
    
    Having only # can be confusing for users. So I removed #. 
    
    
    > > * I don't like that pqTraceOutputS/H print nothing when !toServer.  I
    > > think they should complain.
    > 
    > Yes, the caller should not call the function if there's no message content.
    > That way, the function doesn't need the toServer argument.
    
    I moved pqTraceOutputS/H's `if(!toServer)` check to pqTraceOutputMessage() switch statement.
    
    
    Regards,
    Aya Iwata