Thread

  1. Profiling the backend (gprof output) [current devel]

    Mattias Kregert <matti@algonet.se> — 1998-01-22T16:36:16Z

    Here is the top part of my gprof output from a simple session, creating
    two tables, inserting some rows, creating an index and doing a couple
    of simple selects (one minute of typing):
    ----------
      %   cumulative   self              self     total           
     time   seconds   seconds    calls  ms/call  ms/call  name    
     39.74     12.39    12.39                             mcount (profiler overhead)
      7.86     14.84     2.45   964885     0.00     0.00  fastgetattr
      2.79     15.71     0.87   906153     0.00     0.00  fastgetiattr
      2.44     16.47     0.76                             _psort_cmp
      2.08     17.12     0.65   400783     0.00     0.00  _bt_compare
      1.60     17.62     0.50   125987     0.00     0.01  hash_search
      1.48     18.08     0.46   128756     0.00     0.01  SearchSysCache
      1.28     18.48     0.40   120307     0.00     0.00  SpinAcquire
      1.25     18.87     0.39  1846682     0.00     0.00  fmgr_faddr
      1.06     19.20     0.33   253022     0.00     0.00  StrategyTermEvaluate
      1.03     19.52     0.32    31578     0.01     0.04  heapgettup
      0.99     19.83     0.31   128842     0.00     0.00  CatalogCacheComputeHashIndex
    ----------  
    Fastgetattr() doesn't seem to be so fast, after all... or perhaps it would be
    best to try to reduce the number of calls to it? Almost one million calls to read
    attributes out of tuples seems extreme to me when we are talking about fewer
    than one hundred rows.
    
    Perhaps it would be better to add a new function 'fastgetattrlist' to retrieve
    multiple attributes at once, instead of calling a macro wrapped around another
    bunch of macros, calling 'fastgetattr' for each attribute to retrieve?
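
    A minimal sketch of what such a 'fastgetattrlist' might look like. All names
    here (toy_getattrlist, toy_varsize, the plain attlen array) are hypothetical,
    and the sketch ignores NULLs and alignment padding, which the real code has
    to handle. It walks the tuple data once and hands back a pointer for each
    wanted attribute, instead of restarting the walk for every attribute:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: a variable-length value starts with an
 * int32 holding its total size in bytes, header included. */
static int32_t toy_varsize(const char *p)
{
    int32_t n;
    memcpy(&n, p, sizeof n);
    return n;
}

/*
 * attlen[i] > 0 is a fixed width in bytes, attlen[i] == -1 marks a
 * variable-length attribute.  want[] lists zero-based attribute numbers
 * in strictly ascending order; out[w] receives a pointer to the start
 * of attribute want[w].
 */
static void toy_getattrlist(const char *tp, const int *attlen,
                            const int *want, int nwant, const char **out)
{
    size_t off = 0;
    int    i = 0;
    int    w;

    for (w = 0; w < nwant; w++)
    {
        assert(w == 0 || want[w] > want[w - 1]);

        /* Skip forward from where the previous lookup stopped. */
        for (; i < want[w]; i++)
            off += (attlen[i] == -1) ? (size_t) toy_varsize(tp + off)
                                     : (size_t) attlen[i];
        out[w] = tp + off;
    }
}
```

    Fetching attributes 1 and 3 of a four-attribute tuple then costs one pass
    over attributes 0..2 rather than two separate passes.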
    
    Or perhaps the tuples could be fitted with a "lookup table" when being stored
    in the backend cache? It could take .000005 second or so to build the table and
    attach it to the tuple, but it would definitely speed up retrieval of attributes
    from that tuple. If the same tuple is searched for its attributes lots of times (as
    seems to be the case) then this would be faster in the end.
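
    Roughly what I have in mind, as a sketch only: ToyTuple, TOY_MAXATTS and the
    function names are made up for illustration, and NULLs and alignment are
    ignored. The offset table is built on the first lookup and reused by every
    later lookup on the same tuple:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAXATTS 16              /* hypothetical limit, sketch only */

typedef struct
{
    const char *data;               /* raw attribute bytes */
    int         natts;              /* number of attributes */
    int         have_offs;          /* 0 until the table has been built */
    size_t      offs[TOY_MAXATTS];  /* start offset of each attribute */
} ToyTuple;

/* attlen[i] > 0: fixed width; attlen[i] == -1: variable length, with an
 * int32 total-size header (an assumption of this sketch). */
static void toy_build_offsets(ToyTuple *t, const int *attlen)
{
    size_t off = 0;
    int    i;

    assert(t->natts <= TOY_MAXATTS);
    for (i = 0; i < t->natts; i++)
    {
        t->offs[i] = off;
        if (attlen[i] == -1)
        {
            int32_t sz;

            memcpy(&sz, t->data + off, sizeof sz);
            off += (size_t) sz;
        }
        else
            off += (size_t) attlen[i];
    }
    t->have_offs = 1;
}

/* Constant-time lookup once the table exists. */
static const char *toy_getattr(ToyTuple *t, const int *attlen, int attnum)
{
    if (!t->have_offs)
        toy_build_offsets(t, attlen);
    return t->data + t->offs[attnum];
}
```

    The cost is one extra array per cached tuple, so it only pays off if the
    tuple really is read many times.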
    
    Can we afford not to optimize this? I just hate those MySQL people showing their
    performance figures. PostgreSQL should be the best...
    
    
    How about this (seemingly) unnecessarily complex part of
    access/common/heaptuple.c [fastgetattr] ...
    ----------
    switch (att[i]->attlen)
    {
    	case sizeof(char):
    		off++;		<-- why not 'sizeof(char)'?
    		break;
    	case sizeof(int16):
    		off += sizeof(int16);
    		break;
    	case sizeof(int32):
    		off += sizeof(int32);
    		break;
    	case -1:
    		usecache = false;
    		off += VARSIZE(tp + off);
    		break;
    	default:
    		off += att[i]->attlen;
    		break;
    }
    ----------
    
    Would it not be faster *and* easier to read if written as:
    ----------
    off += (att[i]->attlen == -1 ? (usecache=false,VARSIZE(tp+off)) : att[i]->attlen);
    ----------
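
    To check that the two forms at least compute the same thing, here is a small
    stand-alone comparison. The helper names and the int32 size header read by
    toy_varsize are assumptions of the sketch, not the backend's definitions. It
    says nothing about which form is faster, since a compiler may well generate
    similar code for both:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for VARSIZE: total size stored in a leading int32. */
static int32_t toy_varsize(const char *p)
{
    int32_t n;
    memcpy(&n, p, sizeof n);
    return n;
}

/* The switch form: returns how far 'off' advances for one attribute. */
static size_t advance_switch(int attlen, const char *p, int *usecache)
{
    size_t off = 0;

    switch (attlen)
    {
        case sizeof(char):
            off++;
            break;
        case sizeof(int16_t):
            off += sizeof(int16_t);
            break;
        case sizeof(int32_t):
            off += sizeof(int32_t);
            break;
        case -1:
            *usecache = 0;
            off += (size_t) toy_varsize(p);
            break;
        default:
            off += (size_t) attlen;
            break;
    }
    return off;
}

/* The one-line form; only -1 or a positive length is expected. */
static size_t advance_ternary(int attlen, const char *p, int *usecache)
{
    assert(attlen == -1 || attlen > 0);
    return (attlen == -1) ? (*usecache = 0, (size_t) toy_varsize(p))
                          : (size_t) attlen;
}
```

    Since sizeof(char) is 1 by definition, the first three cases of the switch
    each add exactly attlen, which is what the default case does anyway.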
    
    ...or is this some kind of magic which I should not worry about? There are almost
    no comments in this code, and most of the stuff is totally incomprehensible to me.
    
    Would it be a good idea to try to optimize things like this, or will these
    functions be replaced sometime anyway?
    
    /* m */