Thread

  1. Re: [HACKERS] Inefficiencies in COPY command

    Tom Lane <tgl@sss.pgh.pa.us> — 1999-08-07T15:01:15Z

    Wayne Piekarski <wayne@senet.com.au> writes:
    > So I made some changes to CopyAttributeOut so that it escapes the string
    > initially into a temporary buffer (allocated onto the stack) and then
    > feeds the whole string to the CopySendData which is a lot more efficient
    > because it can blast the whole string in one go, saving about 1/3 to 1/4
    > the number of memcpy and so on.
    
    copy.c is pretty much of a hack job to start with, IMHO.  If you can
    speed it up without making it even uglier, have at it!  However, it
    also has to be portable, and what you've done here:
    
    >  CopyAttributeOut(FILE *fp, char *server_string, char *delim, int is_array)
    >  {
    >  	char	   *string;
    >  	char		c;
    > +	char __buffer [(strlen (server_string) * 2) + 4]; /* Use +4 for safety */
    
    is not portable --- variable-sized local arrays are a gcc-ism.  (I use
    'em a lot too, but not in code intended for public release...)  Also,
    be careful that you don't introduce any assumptions about maximum
    field or tuple width; we want to get rid of those, not add more.
    
    > to also make changes to remove all use of sprintf when numbers
    > and floats are being converted into strings.
      ^^^^^^^^^^
    
    While formatting an int is pretty simple, formatting a float is not so
    simple.  I'd be leery of replacing sprintf with quick-hack float
    conversion code.  OTOH, if someone wanted to go to the trouble of doing
    it *right*, using our own code would tend to yield more consistent
    results across different OSes, which would be a Good Thing.  I'm not
    sure it'd be any faster than the typical sprintf, but it might be worth
    doing anyway.
    
    (My idea of *right* is the guaranteed-inverse float<=>ASCII routines
    published a few years ago in some SIGPLAN proceedings ... I've got the
    proceedings, and I even know approximately where they are, but I don't
    feel like excavating for them right now...)
    
    			regards, tom lane
    
    
  2. Re: [HACKERS] Inefficiencies in COPY command

    Wayne Piekarski <wayne@senet.com.au> — 1999-08-21T06:33:08Z

    Tom Lane wrote -
    > Wayne Piekarski <wayne@senet.com.au> writes:
    > > So I made some changes to CopyAttributeOut so that it escapes the string
    > > initially into a temporary buffer (allocated onto the stack) and then
    > > feeds the whole string to the CopySendData which is a lot more efficient
    > > because it can blast the whole string in one go, saving about 1/3 to 1/4
    > > the number of memcpy and so on.
    > 
    > copy.c is pretty much of a hack job to start with, IMHO.  If you can
    > speed it up without making it even uglier, have at it!  However, it
    > also has to be portable, and what you've done here:
    
    Ok, well I will write up a proper patch for CopyAttributeOut so it is not
    such a hack (using all those #defines and stuff wasn't very "elegant") and
    then submit a proper patch for it.... This was pretty straight forward to
    fix up.
    
    > While formatting an int is pretty simple, formatting a float is not so
    > simple.  I'd be leery of replacing sprintf with quick-hack float
    > conversion code.  OTOH, if someone wanted to go to the trouble of doing
    > it *right*, using our own code would tend to yield more consistent
    > results across different OSes, which would be a Good Thing.  I'm not
    > sure it'd be any faster than the typical sprintf, but it might be worth
    > doing anyway.
    
    I understand there are issues to do with not being able to use GPL code
    with Postgres, because its BSD license is not compatible, but would it be
    acceptable to extract code from BSD style code? If so, my FreeBSD here has
    libc code and includes the internals used by sprintf for rendering
    integers (and floats) and so we could include that code in, and should
    hopefully be portable at the same time as well. 
    
    This would be a lot faster than going via sprintf and lots of other
    functions, and would make not just COPY, but I think any SELECT query runs
    faster as well (because they get rewritten to strings by the output
    functions don't they). I guess other advantages would be improvements in
    the regression tests maybe, for problem types like int8 which in the past
    have had trouble under some BSDs.
    
    Does what I've written above sound ok? If so I'll go and work up something
    and come back with a patch.
    
    bye,
    Wayne
    
    ------------------------------------------------------------------------------
    Wayne Piekarski                               Tel:     (08) 8221 5221
    Research & Development Manager                Fax:     (08) 8221 5220
    SE Network Access Pty Ltd                     Mob:     0407 395 889
    222 Grote Street                              Email:   wayne@senet.com.au
    Adelaide SA 5000                              WWW:     http://www.senet.com.au
    
    
    
  3. Re: [HACKERS] Inefficiencies in COPY command

    Bruce Momjian <maillist@candle.pha.pa.us> — 1999-08-24T04:13:17Z

    > Tom Lane wrote -
    > > Wayne Piekarski <wayne@senet.com.au> writes:
    > > > So I made some changes to CopyAttributeOut so that it escapes the string
    > > > initially into a temporary buffer (allocated onto the stack) and then
    > > > feeds the whole string to the CopySendData which is a lot more efficient
    > > > because it can blast the whole string in one go, saving about 1/3 to 1/4
    > > > the number of memcpy and so on.
    > > 
    > > copy.c is pretty much of a hack job to start with, IMHO.  If you can
    > > speed it up without making it even uglier, have at it!  However, it
    > > also has to be portable, and what you've done here:
    > 
    > Ok, well I will write up a proper patch for CopyAttributeOut so it is not
    > such a hack (using all those #defines and stuff wasn't very "elegant") and
    > then submit a proper patch for it.... This was pretty straight forward to
    > fix up.
    
    Great.
    
    > 
    > > While formatting an int is pretty simple, formatting a float is not so
    > > simple.  I'd be leery of replacing sprintf with quick-hack float
    > > conversion code.  OTOH, if someone wanted to go to the trouble of doing
    > > it *right*, using our own code would tend to yield more consistent
    > > results across different OSes, which would be a Good Thing.  I'm not
    > > sure it'd be any faster than the typical sprintf, but it might be worth
    > > doing anyway.
    > 
    > I understand there are issues to do with not being able to use GPL code
    > with Postgres, because its BSD license is not compatible, but would it be
    > acceptable to extract code from BSD style code? If so, my FreeBSD here has
    > libc code and includes the internals used by sprintf for rendering
    > integers (and floats) and so we could include that code in, and should
    > hopefully be portable at the same time as well. 
    > 
    > This would be a lot faster than going via sprintf and lots of other
    > functions, and would make not just COPY, but I think any SELECT query runs
    > faster as well (because they get rewritten to strings by the output
    > functions don't they). I guess other advantages would be improvements in
    > the regression tests maybe, for problem types like int8 which in the past
    > have had trouble under some BSDs.
    
    Does using the FreeBSD sprintf conversion functions really make it
    faster than just calling sprintf?  How?
    
    
    -- 
      Bruce Momjian                        |  http://www.op.net/~candle
      maillist@candle.pha.pa.us            |  (610) 853-3000
      +  If your life is a hard drive,     |  830 Blythe Avenue
      +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
    
    
  4. Re: [HACKERS] Inefficiencies in COPY command

    Bruce Momjian <maillist@candle.pha.pa.us> — 1999-09-27T18:04:42Z

    Yes, I too would be interested in any code that would speed up COPY
    without loosing modularity or portability.
    
    Please let us know if you get a patch we can include in our source tree.
    
    > Wayne Piekarski <wayne@senet.com.au> writes:
    > > So I made some changes to CopyAttributeOut so that it escapes the string
    > > initially into a temporary buffer (allocated onto the stack) and then
    > > feeds the whole string to the CopySendData which is a lot more efficient
    > > because it can blast the whole string in one go, saving about 1/3 to 1/4
    > > the number of memcpy and so on.
    > 
    > copy.c is pretty much of a hack job to start with, IMHO.  If you can
    > speed it up without making it even uglier, have at it!  However, it
    > also has to be portable, and what you've done here:
    > 
    > >  CopyAttributeOut(FILE *fp, char *server_string, char *delim, int is_array)
    > >  {
    > >  	char	   *string;
    > >  	char		c;
    > > +	char __buffer [(strlen (server_string) * 2) + 4]; /* Use +4 for safety */
    > 
    > is not portable --- variable-sized local arrays are a gcc-ism.  (I use
    > 'em a lot too, but not in code intended for public release...)  Also,
    > be careful that you don't introduce any assumptions about maximum
    > field or tuple width; we want to get rid of those, not add more.
    > 
    > > to also make changes to remove all use of sprintf when numbers
    > > and floats are being converted into strings.
    >   ^^^^^^^^^^
    > 
    > While formatting an int is pretty simple, formatting a float is not so
    > simple.  I'd be leery of replacing sprintf with quick-hack float
    > conversion code.  OTOH, if someone wanted to go to the trouble of doing
    > it *right*, using our own code would tend to yield more consistent
    > results across different OSes, which would be a Good Thing.  I'm not
    > sure it'd be any faster than the typical sprintf, but it might be worth
    > doing anyway.
    > 
    > (My idea of *right* is the guaranteed-inverse float<=>ASCII routines
    > published a few years ago in some SIGPLAN proceedings ... I've got the
    > proceedings, and I even know approximately where they are, but I don't
    > feel like excavating for them right now...)
    > 
    > 			regards, tom lane
    > 
    > 
    
    
    -- 
      Bruce Momjian                        |  http://www.op.net/~candle
      maillist@candle.pha.pa.us            |  (610) 853-3000
      +  If your life is a hard drive,     |  830 Blythe Avenue
      +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026