Thread

  1. Massimo patch

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-02-15T02:31:46Z

    Here is a description of the major patch from Massimo.  The irony of
    this is that the mail message is dated January 27th, when application of
    the patch would have been easier because we were not in beta testing.  I
    have a copy of the patch here on my machine.
    
    What do people want to do with this?  I have reviewed the patch, and it
    looks good, but it may take work to merge in because it is against
    6.2.1p7, not 6.3 beta, and it does introduce quite a bit of new code.
    
    ---------------------------------------------------------------------------
    
    Forwarded message:
    > From owner-pgsql-patches@hub.org Sat Feb 14 09:31:23 1998
    > From: Massimo Dal Zotto <dz@cs.unitn.it>
    > Message-Id: <199802132233.XAA01003@nikita.wizard.it>
    > Subject: [PATCHES] patches for 6.2.1p6
    > To: pgsql-patches@postgreSQL.org (Pgsql Patches)
    > Date: Tue, 27 Jan 1998 22:46:38 +0100 (MET)
    > X-Mailer: ELM [version 2.4 PL24 ME4]
    > MIME-Version: 1.0
    > Content-Type: multipart/mixed; boundary=%#%record%#%
    > Sender: owner-pgsql-patches@hub.org
    > Precedence: bulk
    > 
    > --%#%record%#%
    > Content-Type: text/plain; charset=iso-8859-1
    > Content-Transfer-Encoding: 8bit
    > Content-Length: 6638      
    > 
    > Hi hackers,
    > 
    > I have old patches for version 6.2.1p6 which fix some problems and add
    > new features. Here is a short description of each patch file:
    > 
    > 
    > assert.patch
    > 
    > 	adds a switch to turn on/off the assert checking if enabled at compile
    > 	time. You can now compile postgres with assert checking and disable it
    > 	at runtime in a production environment.
    > 
    > async-unlisten.patch
    > 
    > 	declares Async_Unlisten() external so that it can be called by user
    > 	modules.
    > 
    > exec-limit.patch
    > 
    > 	removes the #ifdef NOT_USED around ExecutorLimit(). It is used.
    > 
    > exitpg.patch
    > 
    > 	limits recursive calls to exitpg() preventing an infinite loop
    > 	if an error is found inside exitpg.
    > 
    > libpgtcl-listen.patch
    > 
    > 	Just a change from upper to lowercase of an sql command in libpgtcl,
    > 	totally harmless.
    > 
    > new-locks.patch
    > 
    > 	After long studying and many debugging sessions I have finally
    > 	understood how the low level locks work.
    > 	I have completely rewritten lock.c cleaning up the code and adding
    > 	better assert checking. I have also added some fields to the lock
    > 	and xid tags for better support of user locks. This patch includes
    > 	also a patch submitted by Bruce Momjian which changes the handling
    > 	of lock priorities. It can however be disabled if an option is set
    > 	in pg_options, see tprintf.patch (Bruce patch works by building
    > 	the queue in reverse priority order, my old patch kept the queue in
    > 	decreasing order and traversed it from the other side).
    > 
    > pg-flush.patch
    > 
    > 	removes an unnecessary flush in libpq reducing network traffic and
    > 	increasing performance.
    > 
    > relname.patch
    > 
    > 	an utility which returns the relname corresponding to a given oid.
    > 	Useful for debug messages (see vacum.patch).
    > 
    > sequence.patch
    > 
    > 	added a setval() function which enables othe owner of a sequence
    > 	to set its value without need to delete and recreate it.
    > 
    > sinval.patch
    > 
    > 	fixes a problem in SI cache which causes table overflow if some
    > 	backend is idle for a long time while other backends keep adding
    > 	entries.
    > 	It uses the new signal handling implemented in tprintf.patch.
    > 	I have also increacasesed the max number of backends from 32 to 64 and
    > 	the table size from 1000 to 5000.
    > 
    > spin-lock.patch
    > 
    > 	I'm not sure if this is really useful, but it seems stupid to have
    > 	a backend wasting cpu cycles in a busy loop while the process which
    > 	should release the lock is waiting for the cpu. So I added a call
    > 	to process_yield() if the spin lock can't obtained.
    > 	This has been implemented and tested only on Linux. I don't know if
    > 	other OS have process_yield(). If someone can check please do it.
    > 
    > tprintf.patch
    > 
    > 	adds functions and macros which implement a conditional trace package
    > 	with the ability to change flags and numeric options of running
    > 	backends at runtime.
    > 	Options/flags can be specified in the command line and/or read from
    > 	the file pg_options in the data directory.
    > 	Running backends can be forced to update their options from this file
    > 	by sending them a SIGHUP signal (this is the convention used by most
    > 	unix daemons so I changed the meaning of SIGHUP).
    > 	Options can be debugging flags used by the trace package or any other
    > 	numeric	value used by the backend, for example the deadlock_timeout.
    > 	Having flags and options specified at runtime and changed while the
    > 	backends are running can greatly simplify the debugging and tuning
    > 	of the database. New options can be defined in utils/misc/trace.c and
    > 	include/utils/trace.h.  As an example of the usage of this package
    > 	see lock.c and proc.c which make use of new runtime options.
    > 
    > 	Old files using int flags or variables can be easily changed to
    > 	use the new package by substituting the old variable with a #define
    > 	like in the following example:
    > 	
    > 	  /* int my_flag = 0; */
    > 	  #include "trace.h"
    >  	  #define my_flag pg_options[OPT_MYFLAG]
    > 
    > 	I have done it in postgres.c and some other files and now I can turn
    > 	on/off any single debug flag on the fly with a simple shell script.
    > 	I have removed the IpcConfigTip() from ipc.c, it should better be
    > 	described in the postgres manual instead of being printed on stderr.
    > 
    > 	This patch provides also a new format of debugging messages which
    > 	are always in a single line with a timestamp and the backend pid:
    > 
    > 	  #timestamp          #pid    #message
    > 	  980127.17:52:14.173 [29271] StartTransactionCommand
    > 	  980127.17:52:14.174 [29271] ProcessUtility:  drop table t;
    > 	  980127.17:52:14.186 [29271] SIIncNumEntries: table is 70% full
    > 	  980127.17:52:14.186 [29286] Async_NotifyHandler
    > 	  980127.17:52:14.186 [29286] Waking up sleeping backend process
    > 	  980127.19:52:14.292 [29286] Async_NotifyFrontEnd
    > 	  980127.19:52:14.413 [29286] Async_NotifyFrontEnd done
    > 	  980127.19:52:14.466 [29286] Async_NotifyHandler done
    > 
    > 	This improves the readability of the log and allows one to understand
    > 	exactly which backend is doing what and at which time. It also makes
    > 	easier to write simple awk or perl scripts which monitor the log to
    > 	detect database errors or problem, or to compute transaction times.
    > 
    > 	The patch changes also the meaning of signals used by postgres, as
    > 	described by the following table:
    > 
    > 		    postmaster				backend
    > 
    > 	SIGHUP      kill(*,sighup)			read_pg_options
    > 	SIGINT	    kill(*,sigint), die			die
    > 	SIGCHLD	    reaper				-
    > 	SIGTTIN	    ignored				-
    > 	SIGTTOU	    ignored				-
    > 	SIGQUIT	    die					handle_warn
    > 	SIGTERM	    kill(*,sigterm), kill(*,9), die	die
    > 	SIGCONT	    dumpstatus				-
    > 	SIGPIPE	    ignored				die
    > 	SIGFPE	    -					FloatExceptionHandler
    > 	SIGTSTP	    -					ignored (alive test)
    > 	SIGUSR1	    kill(*,sigusr1), die		quickdie
    > 	SIGUSR2	    kill(*,sigusr2)			Async_NotifyHandler
    > 							(also SI buffer flush)
    > 
    > 	The main changes to the old implementation are SIGQUIT instead of
    > 	SIGHUP to handle warns, SIGHUP to reread pg_options and redirection
    > 	to all backends of SIGHUP, SIGINT, SIGTERM, SIGUSR1 and SIGUSR2.
    > 	In this way some of the signals sent to the postmaster can be sent
    > 	automatically to all the backends. To shut down postgres one needs
    > 	only to send a SIGTERM to postmaster and it will stop automatically
    > 	all the backends. This new signal handling mechanism is also used
    > 	to prevent SI cache table overflows: when a backend detects the SI
    > 	table full at 70% it simply sends a signal to the postmaster which
    > 	will wake up all idle backends and make them flush the cache.
    > 
    > vacuum.patch
    > 
    > 	adds a debug message to vacuum that prints the name of a table or
    > 	index *before* vacuuming it, if the verbose keyword is set.
    > 	This is useful to know which table is causing troubles if a
    > 	vacuum all crashes. Currently table information is printed only
    > 	at the end of each vacuum operation and is never printed if the
    > 	vacuum crashes.
    > 
    > -- 
    > Massimo Dal Zotto
    
    
    -- 
    Bruce Momjian
    maillist@candle.pha.pa.us
    
    
    -- 
    Bruce Momjian
    maillist@candle.pha.pa.us
    
    
  2. Re: [HACKERS] Massimo patch

    Marc G. Fournier <scrappy@hub.org> — 1998-02-15T04:29:06Z

    On Sat, 14 Feb 1998, Bruce Momjian wrote:
    
    > Here is a description of the major patch from Massimo.  The irony of
    > this is that the mail message is dated January 27th, when application of
    > the patch would have been easier because we were not in beta testing.  I
    > have a copy of the patch here on my machine.
    > 
    > What do people want to do with this?  I have reviewed the patch, and it
    > looks good, but it may take work to merge in because it is against
    > 6.2.1p7, not 6.3 beta, and it does introduce quite a bit of new code.
    
    	I sent Massimo and email (CC'd to the list) explaining the fact
    that we are in Beta mode, and that some of these patches are currently
    questionable for v6.3 (the lock patch being one)...with another 2 weeks of
    beta testing before release, my opinion is that these involve changes that
    will not have sufficient time to test prior to release, especially with at
    least one major bug still existing...
    
    	The other question that I do have is how appropriate some of these
    patches are to v6.3...2 weeks, in my opinion, isn't a suitable amount of
    time, in which to accurately test these, and should wait until after the
    release...
    
    	Unless anyone can really argue this, those that add features
    (assert.patch) should be omitted...those that fix a bug are appropriate.
    Not that I haven't looked at the patch itself, only at the descriptions
    presented...
    
    
    
    
    > > assert.patch
    > > 
    > > 	adds a switch to turn on/off the assert checking if enabled at compile
    > > 	time. You can now compile postgres with assert checking and disable it
    > > 	at runtime in a production environment.
    
    	New feature...not a bug fix...
    
    > > async-unlisten.patch
    > > 
    > > 	declares Async_Unlisten() external so that it can be called by user
    > > 	modules.
    
    	New feature...not a bug fix...
    
    > > exec-limit.patch
    > > 
    > > 	removes the #ifdef NOT_USED around ExecutorLimit(). It is used.
    
    	if it has a "#ifdef NOT_USED" around it, then how is it used? 
    
    > > exitpg.patch
    > > 
    > > 	limits recursive calls to exitpg() preventing an infinite loop
    > > 	if an error is found inside exitpg.
    
    	Potential bug...but how is it triggered?
    
    > > libpgtcl-listen.patch
    > > 
    > > 	Just a change from upper to lowercase of an sql command in libpgtcl,
    > > 	totally harmless.
    	
    	??
    
    > > new-locks.patch
    > > 
    > > 	After long studying and many debugging sessions I have finally
    > > 	understood how the low level locks work.
    > > 	I have completely rewritten lock.c cleaning up the code and adding
    > > 	better assert checking. I have also added some fields to the lock
    > > 	and xid tags for better support of user locks. This patch includes
    > > 	also a patch submitted by Bruce Momjian which changes the handling
    > > 	of lock priorities. It can however be disabled if an option is set
    > > 	in pg_options, see tprintf.patch (Bruce patch works by building
    > > 	the queue in reverse priority order, my old patch kept the queue in
    > > 	decreasing order and traversed it from the other side).
    
    	I don't understand this one, but it sounds like its negates the
    work you just did?  Again, sounds like a feature change, not a bug fix,
    since you have recently re-written this...
    
    > > pg-flush.patch
    > > 
    > > 	removes an unnecessary flush in libpq reducing network traffic and
    > > 	increasing performance.
    
    	Didn't we just do a protocol rewrite?  
    
    > > relname.patch
    > > 
    > > 	an utility which returns the relname corresponding to a given oid.
    > > 	Useful for debug messages (see vacum.patch).
    
    	Is this a new program?  src/bin/relname?
    
    > > sequence.patch
    > > 
    > > 	added a setval() function which enables othe owner of a sequence
    > > 	to set its value without need to delete and recreate it.
    
    	Feature change, not a bug...
    
    > > sinval.patch
    > > 
    > > 	fixes a problem in SI cache which causes table overflow if some
    > > 	backend is idle for a long time while other backends keep adding
    > > 	entries.
    > > 	It uses the new signal handling implemented in tprintf.patch.
    > > 	I have also increacasesed the max number of backends from 32 to 64 and
    > > 	the table size from 1000 to 5000.
    
    	Bug fix...
    
    > > spin-lock.patch
    > > 
    > > 	I'm not sure if this is really useful, but it seems stupid to have
    > > 	a backend wasting cpu cycles in a busy loop while the process which
    > > 	should release the lock is waiting for the cpu. So I added a call
    > > 	to process_yield() if the spin lock can't obtained.
    > > 	This has been implemented and tested only on Linux. I don't know if
    > > 	other OS have process_yield(). If someone can check please do it.
    
    	Sounds like a bug fix to me...
    
    > > tprintf.patch
    > > 
    > > 	adds functions and macros which implement a conditional trace package
    > > 	with the ability to change flags and numeric options of running
    > > 	backends at runtime.
    > > 	Options/flags can be specified in the command line and/or read from
    > > 	the file pg_options in the data directory.
    > > 	Running backends can be forced to update their options from this file
    > > 	by sending them a SIGHUP signal (this is the convention used by most
    > > 	unix daemons so I changed the meaning of SIGHUP).
    > > 	Options can be debugging flags used by the trace package or any other
    > > 	numeric	value used by the backend, for example the deadlock_timeout.
    > > 	Having flags and options specified at runtime and changed while the
    > > 	backends are running can greatly simplify the debugging and tuning
    > > 	of the database. New options can be defined in utils/misc/trace.c and
    > > 	include/utils/trace.h.  As an example of the usage of this package
    > > 	see lock.c and proc.c which make use of new runtime options.
    
    	Definitely have to say new feature...
    
    > > vacuum.patch
    > > 
    > > 	adds a debug message to vacuum that prints the name of a table or
    > > 	index *before* vacuuming it, if the verbose keyword is set.
    > > 	This is useful to know which table is causing troubles if a
    > > 	vacuum all crashes. Currently table information is printed only
    > > 	at the end of each vacuum operation and is never printed if the
    > > 	vacuum crashes.
    
    	Again, a feature more then a bug fix...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  3. Re: [HACKERS] Massimo patch

    Thomas Lockhart <lockhart@alumni.caltech.edu> — 1998-02-15T04:37:03Z

    > Here is a description of the major patch from Massimo.  The irony of
    > this is that the mail message is dated January 27th, when application of
    > the patch would have been easier because we were not in beta testing.  I
    > have a copy of the patch here on my machine.
    >
    > What do people want to do with this?  I have reviewed the patch, and it
    > looks good, but it may take work to merge in because it is against
    > 6.2.1p7, not 6.3 beta, and it does introduce quite a bit of new code.
    
    Jeesh. Sure looks nice. Without knowing how fundamental the code changes are, it
    is hard to say for sure, but I would think we would want to put this in, unless
    it leads to major breakage. Someone would need to take the patches, merge them
    all in as a group, and validate the regression tests to see where we are on them.
    I'd even think it would be worth doing if it delays release a couple of weeks,
    which it may not.
    
    More comments??
    
                                                          - Tom
    
    
    
  4. spin locks

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-02-15T05:27:17Z

    > spin-lock.patch
    > 
    > 	I'm not sure if this is really useful, but it seems stupid to have
    > 	a backend wasting cpu cycles in a busy loop while the process which
    > 	should release the lock is waiting for the cpu. So I added a call
    > 	to process_yield() if the spin lock can't obtained.
    > 	This has been implemented and tested only on Linux. I don't know if
    > 	other OS have process_yield(). If someone can check please do it.
    
    Massimo brings up a good point.  Most of our s_lock.h locking does asm
    mutex loops looking for a lock.  Unless we are using a multi-cpu
    machine, there is no way this is going to change while we are spinning.
    
    Linux has process_yield(), but most OS's don't.  Is there a
    platform-independent way to relinquish the cpu if the first attempt at
    the spinlock fails?  Would a select() of 1 microsecond work?
    
    -- 
    Bruce Momjian
    maillist@candle.pha.pa.us
    
    
  5. Re: [HACKERS] spin locks

    Marc G. Fournier <scrappy@hub.org> — 1998-02-15T06:11:04Z

    On Sun, 15 Feb 1998, Bruce Momjian wrote:
    
    > > spin-lock.patch
    > > 
    > > 	I'm not sure if this is really useful, but it seems stupid to have
    > > 	a backend wasting cpu cycles in a busy loop while the process which
    > > 	should release the lock is waiting for the cpu. So I added a call
    > > 	to process_yield() if the spin lock can't obtained.
    > > 	This has been implemented and tested only on Linux. I don't know if
    > > 	other OS have process_yield(). If someone can check please do it.
    > 
    > Massimo brings up a good point.  Most of our s_lock.h locking does asm
    > mutex loops looking for a lock.  Unless we are using a multi-cpu
    > machine, there is no way this is going to change while we are spinning.
    
    	I'm not quite sure I follow this...in a multi-cpu environment,
    would process_yield() introduce a problem? *raised eyebrow*
    
    > Linux has process_yield(), but most OS's don't.  Is there a
    > platform-independent way to relinquish the cpu if the first attempt at
    > the spinlock fails?  Would a select() of 1 microsecond work?
    
    	There is nothing wrong with introducing an OS specific
    optimization to the code...we can add a HAVE_PROCESS_YIELD to config.h and
    if a system has it, use it...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  6. Re: [HACKERS] Massimo patch

    Massimo Dal Zotto <dz@cs.unitn.it> — 1998-02-16T09:43:46Z

    > 
    > On Sat, 14 Feb 1998, Bruce Momjian wrote:
    > 
    > > Here is a description of the major patch from Massimo.  The irony of
    > > this is that the mail message is dated January 27th, when application of
    > > the patch would have been easier because we were not in beta testing.  I
    > > have a copy of the patch here on my machine.
    > > 
    > > What do people want to do with this?  I have reviewed the patch, and it
    > > looks good, but it may take work to merge in because it is against
    > > 6.2.1p7, not 6.3 beta, and it does introduce quite a bit of new code.
    > 
    > 	I sent Massimo and email (CC'd to the list) explaining the fact
    > that we are in Beta mode, and that some of these patches are currently
    > questionable for v6.3 (the lock patch being one)...with another 2 weeks of
    > beta testing before release, my opinion is that these involve changes that
    > will not have sufficient time to test prior to release, especially with at
    > least one major bug still existing...
    > 
    > 	The other question that I do have is how appropriate some of these
    > patches are to v6.3...2 weeks, in my opinion, isn't a suitable amount of
    > time, in which to accurately test these, and should wait until after the
    > release...
    > 
    > 	Unless anyone can really argue this, those that add features
    > (assert.patch) should be omitted...those that fix a bug are appropriate.
    > Not that I haven't looked at the patch itself, only at the descriptions
    > presented...
    > 
    > 
    > 
    > 
    > > > assert.patch
    > > > 
    > > > 	adds a switch to turn on/off the assert checking if enabled at compile
    > > > 	time. You can now compile postgres with assert checking and disable it
    > > > 	at runtime in a production environment.
    > 
    > 	New feature...not a bug fix...
    
    Old feature, I submitted this patch more than six months ago, I'm using it
    from that date without problems. It simply adds a global variable to a macro.
    
    > > > async-unlisten.patch
    > > > 
    > > > 	declares Async_Unlisten() external so that it can be called by user
    > > > 	modules.
    > 
    > 	New feature...not a bug fix...
    
    Just declaring external an existing function shouldn't break anything.
    
    > 
    > > > exec-limit.patch
    > > > 
    > > > 	removes the #ifdef NOT_USED around ExecutorLimit(). It is used.
    > 
    > 	if it has a "#ifdef NOT_USED" around it, then how is it used? 
    
    It is used by a loadable module in contrib.
    
    > 
    > > > exitpg.patch
    > > > 
    > > > 	limits recursive calls to exitpg() preventing an infinite loop
    > > > 	if an error is found inside exitpg.
    > 
    > 	Potential bug...but how is it triggered?
    
    I don't know exactly how it is triggered but I have seen it happen. This
    patch tries to limit damages.
    
    > 
    > > > libpgtcl-listen.patch
    > > > 
    > > > 	Just a change from upper to lowercase of an sql command in libpgtcl,
    > > > 	totally harmless.
    > 	
    > 	??
    
    Not really important, it just changes 'LISTEN' to 'listen' so that when I
    look at the postgres log the query is shown in lowercase like the others.
    Totally harmless.
    
    > 
    > > > new-locks.patch
    > > > 
    > > > 	After long studying and many debugging sessions I have finally
    > > > 	understood how the low level locks work.
    > > > 	I have completely rewritten lock.c cleaning up the code and adding
    > > > 	better assert checking. I have also added some fields to the lock
    > > > 	and xid tags for better support of user locks. This patch includes
    > > > 	also a patch submitted by Bruce Momjian which changes the handling
    > > > 	of lock priorities. It can however be disabled if an option is set
    > > > 	in pg_options, see tprintf.patch (Bruce patch works by building
    > > > 	the queue in reverse priority order, my old patch kept the queue in
    > > > 	decreasing order and traversed it from the other side).
    > 
    > 	I don't understand this one, but it sounds like its negates the
    > work you just did?  Again, sounds like a feature change, not a bug fix,
    > since you have recently re-written this...
    
    I rewrote part of the work I did one year ago, cleaned up the original lock
    code and did some small changes. I admit that this should be handled with
    care, but, again, I'm using it from months and it seems to work well.
    
    > 
    > > > pg-flush.patch
    > > > 
    > > > 	removes an unnecessary flush in libpq reducing network traffic and
    > > > 	increasing performance.
    > 
    > 	Didn't we just do a protocol rewrite?  
    
    I don't know about it, but Bruce approved the patch some months ago, so it
    has probably been included in the new protocol.
    
    > 
    > > > relname.patch
    > > > 
    > > > 	an utility which returns the relname corresponding to a given oid.
    > > > 	Useful for debug messages (see vacum.patch).
    > 
    > 	Is this a new program?  src/bin/relname?
    
    It is just a new utility function which returns the name of a relation given
    its oid. It may be used to show relname information in user messages and
    traces.
    
    > 
    > > > sequence.patch
    > > > 
    > > > 	added a setval() function which enables othe owner of a sequence
    > > > 	to set its value without need to delete and recreate it.
    > 
    > 	Feature change, not a bug...
    > 
    > > > sinval.patch
    > > > 
    > > > 	fixes a problem in SI cache which causes table overflow if some
    > > > 	backend is idle for a long time while other backends keep adding
    > > > 	entries.
    > > > 	It uses the new signal handling implemented in tprintf.patch.
    > > > 	I have also increacasesed the max number of backends from 32 to 64 and
    > > > 	the table size from 1000 to 5000.
    > 
    > 	Bug fix...
    
    Yes, and very nasty.
    
    > 
    > > > spin-lock.patch
    > > > 
    > > > 	I'm not sure if this is really useful, but it seems stupid to have
    > > > 	a backend wasting cpu cycles in a busy loop while the process which
    > > > 	should release the lock is waiting for the cpu. So I added a call
    > > > 	to process_yield() if the spin lock can't obtained.
    > > > 	This has been implemented and tested only on Linux. I don't know if
    > > > 	other OS have process_yield(). If someone can check please do it.
    > 
    > 	Sounds like a bug fix to me...
    
    No, it is just a performance optimization. And I'm not sure how much we could
    gain from this patch, so I ask a discussion about it.
    
    > 
    > > > tprintf.patch
    > > > 
    > > > 	adds functions and macros which implement a conditional trace package
    > > > 	with the ability to change flags and numeric options of running
    > > > 	backends at runtime.
    > > > 	Options/flags can be specified in the command line and/or read from
    > > > 	the file pg_options in the data directory.
    > > > 	Running backends can be forced to update their options from this file
    > > > 	by sending them a SIGHUP signal (this is the convention used by most
    > > > 	unix daemons so I changed the meaning of SIGHUP).
    > > > 	Options can be debugging flags used by the trace package or any other
    > > > 	numeric	value used by the backend, for example the deadlock_timeout.
    > > > 	Having flags and options specified at runtime and changed while the
    > > > 	backends are running can greatly simplify the debugging and tuning
    > > > 	of the database. New options can be defined in utils/misc/trace.c and
    > > > 	include/utils/trace.h.  As an example of the usage of this package
    > > > 	see lock.c and proc.c which make use of new runtime options.
    > 
    > 	Definitely have to say new feature...
    
    Yes, but very handy. I highly recommend it, maybe after 6.3.
    
    > 
    > > > vacuum.patch
    > > > 
    > > > 	adds a debug message to vacuum that prints the name of a table or
    > > > 	index *before* vacuuming it, if the verbose keyword is set.
    > > > 	This is useful to know which table is causing troubles if a
    > > > 	vacuum all crashes. Currently table information is printed only
    > > > 	at the end of each vacuum operation and is never printed if the
    > > > 	vacuum crashes.
    > 
    > 	Again, a feature more then a bug fix...
    
    Just a debug message, should not break anything.
    
    
    I would like also to add a new patch for the oracle_compat patch added in
    version 6.2.1p7, which is definitely broken. Try this and you will see:
    
    test=> create table t (x text);
    CREATE
    test=> insert into t values('1');
    INSERT 17422 1
    test=> insert into t values('12');
    INSERT 17423 1
    test=> insert into t values('123');
    INSERT 17424 1
    test=> insert into t values('1234');
    INSERT 17425 1
    test=> insert into t values('12345');
    INSERT 17426 1
    test=> insert into t values('123456');
    INSERT 17427 1
    test=> insert into t values('1234567');
    INSERT 17428 1
    test=> insert into t values('12345678');
    INSERT 17429 1
    test=> insert into t values('123456789');
    INSERT 17430 1
    test=> insert into t values('1234567890');
    INSERT 17431 1
    test=> insert into t values('12345678901');
    INSERT 17432 1
    test=> insert into t values('123456789012');
    INSERT 17433 1
    test=> insert into t values('1234567890123');
    INSERT 17434 1
    test=> insert into t values('12345678901234');
    INSERT 17435 1
    test=> insert into t values('123456789012345');
    INSERT 17436 1
    test=> insert into t values('1234567890123456');
    INSERT 17437 1
    test=> select x from t;
                   x
    ----------------
                   1
                  12
                 123
                1234
               12345
              123456
             1234567
            12345678
           123456789
          1234567890
         12345678901
        123456789012
       1234567890123
      12345678901234
     123456789012345
    1234567890123456
    (16 rows)
    
    test=> select substr(x,1) from t;
    substr          
    ----------------
    1               
    12              
    123             
    12347           
    12345           
    123456          
    1234567         
    12345678        
    123456789       
    1234567890      
    12345678901     
    123456789012?   
    1234567890123   
    12345678901234  
    123456789012345 
    1234567890123456
    (16 rows)
    
    Note the extra character after 1234 and 123456789012. The bug is random and
    is present also in the latest version sent me by Bruce. The following patch
    corrects the problem. BTW, why are the results aligned differently by psql?
    
    *** src/backend/utils/adt/oracle_compat-621p7.c	Wed Jan 28 00:11:16 1998
    --- src/backend/utils/adt/oracle_compat.c	Tue Feb  3 14:48:42 1998
    ***************
    *** 510,526 ****
      			   *ptr_ret;
      	int			len;
      
      	if ((string == (text *) NULL) ||
    ! 		(m <= 0) || (n <= 0) ||
    ! 		((len = VARSIZE(string) - VARHDRSZ - m + 1) <= 0))
      		return string;
      
    ! 	len = len + 1 < n ? len + 1 : n;
      
      	ret = (text *) palloc(VARHDRSZ + len);
      	VARSIZE(ret) = VARHDRSZ + len;
      
    ! 	ptr = VARDATA(string) + m - 1;
      	ptr_ret = VARDATA(ret);
      
      	while (len--)
    --- 510,529 ----
      			   *ptr_ret;
      	int			len;
      
    + 	/* Make offset 0-based */
    + 	m--;
    + 
      	if ((string == (text *) NULL) ||
    ! 		(m < 0) || (n <= 0) ||
    ! 		((len = VARSIZE(string) - VARHDRSZ - m) <= 0))
      		return string;
      
    ! 	len = (len < n) ? len : n;
      
      	ret = (text *) palloc(VARHDRSZ + len);
      	VARSIZE(ret) = VARHDRSZ + len;
      
    ! 	ptr = VARDATA(string) + m;
      	ptr_ret = VARDATA(ret);
      
      	while (len--)
    
    
    -- 
    Massimo Dal Zotto
    
    +----------------------------------------------------------------------+
    |  Massimo Dal Zotto                e-mail:  dz@cs.unitn.it            |
    |  Via Marconi, 141                 phone:  ++39-461-534251            |
    |  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
    |  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
    +----------------------------------------------------------------------+
    
    
  7. Re: [HACKERS] Massimo patch

    Marc Howard Zuckman <marc@fallon.classyad.com> — 1998-02-16T16:57:33Z

    Peanut Gallery Vote on tprintf patch:
    
    I would vote to put this into 6.3.
    
    RATIONALE:  It seems to offer database _users_ the potential of
    providing more useful information about (potential) bugs to
    the development team.  Seems like it could help speed up
    the development process.  Getting it into the code base
    sooner rather than later seems desireable.
    
    Marc Zuckman
    marc@fallon.classyad.com
    
    
    
  8. Re: [HACKERS] Massimo patch

    Marc G. Fournier <scrappy@hub.org> — 1998-02-16T17:10:09Z

    On Mon, 16 Feb 1998, Marc Howard Zuckman wrote:
    
    > Peanut Gallery Vote on tprintf patch:
    > 
    > I would vote to put this into 6.3.
    > 
    > RATIONALE:  It seems to offer database _users_ the potential of
    > providing more useful information about (potential) bugs to
    > the development team.  Seems like it could help speed up
    > the development process.  Getting it into the code base
    > sooner rather than later seems desireable.
    
    	Except, it is a new feature, not a bug fix...except for a *very*
    select few features that overlapped the beta freeze (subselects being the
    only one that I can thnk of), nothing new is going into this release...
    
    	To add to it, the patches are against v6.2.1, which would mean
    having to spend time away from current bug fixes to safely be able to
    merge them in.
    
    	As I said before, all of the patches look good and I want to get
    them in, but not until after v6.3 is released.
    
    	Massimo, it would really help things if you could submit a new
    patch, after v6.3 is released, against v6.3...I've kept the original
    patch, and will manually apply it if I have to, but...:)
    
    
    
    
    
  9. Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-16T03:07:20Z

    > 
    > Hi hackers,
    > 
    > I have old patches for version 6.2.1p6 which fix some problems and add
    > new features. Here is a short description of each patch file:
    > 
    > 
    > assert.patch
    > 
    > 	adds a switch to turn on/off the assert checking if enabled at compile
    > 	time. You can now compile postgres with assert checking and disable it
    > 	at runtime in a production environment.
    > 
    > async-unlisten.patch
    > 
    > 	declares Async_Unlisten() external so that it can be called by user
    > 	modules.
    > 
    > exec-limit.patch
    > 
    > 	removes the #ifdef NOT_USED around ExecutorLimit(). It is used.
    > 
    > exitpg.patch
    > 
    > 	limits recursive calls to exitpg() preventing an infinite loop
    > 	if an error is found inside exitpg.
    > 
    > libpgtcl-listen.patch
    > 
    > 	Just a change from upper to lowercase of an sql command in libpgtcl,
    > 	totally harmless.
    > 
    > new-locks.patch
    > 
    > 	After long studying and many debugging sessions I have finally
    > 	understood how the low level locks work.
    > 	I have completely rewritten lock.c cleaning up the code and adding
    > 	better assert checking. I have also added some fields to the lock
    > 	and xid tags for better support of user locks. This patch includes
    > 	also a patch submitted by Bruce Momjian which changes the handling
    > 	of lock priorities. It can however be disabled if an option is set
    > 	in pg_options, see tprintf.patch (Bruce patch works by building
    > 	the queue in reverse priority order, my old patch kept the queue in
    > 	decreasing order and traversed it from the other side).
    > 
    > pg-flush.patch
    > 
    > 	removes an unnecessary flush in libpq reducing network traffic and
    > 	increasing performance.
    > 
    > relname.patch
    > 
    > 	an utility which returns the relname corresponding to a given oid.
    > 	Useful for debug messages (see vacum.patch).
    > 
    > sequence.patch
    > 
    > 	added a setval() function which enables othe owner of a sequence
    > 	to set its value without need to delete and recreate it.
    > 
    > sinval.patch
    > 
    > 	fixes a problem in SI cache which causes table overflow if some
    > 	backend is idle for a long time while other backends keep adding
    > 	entries.
    > 	It uses the new signal handling implemented in tprintf.patch.
    > 	I have also increacasesed the max number of backends from 32 to 64 and
    > 	the table size from 1000 to 5000.
    > 
    > spin-lock.patch
    > 
    > 	I'm not sure if this is really useful, but it seems stupid to have
    > 	a backend wasting cpu cycles in a busy loop while the process which
    > 	should release the lock is waiting for the cpu. So I added a call
    > 	to process_yield() if the spin lock can't obtained.
    > 	This has been implemented and tested only on Linux. I don't know if
    > 	other OS have process_yield(). If someone can check please do it.
    > 
    > tprintf.patch
    > 
    > 	adds functions and macros which implement a conditional trace package
    > 	with the ability to change flags and numeric options of running
    > 	backends at runtime.
    > 	Options/flags can be specified in the command line and/or read from
    > 	the file pg_options in the data directory.
    > 	Running backends can be forced to update their options from this file
    > 	by sending them a SIGHUP signal (this is the convention used by most
    > 	unix daemons so I changed the meaning of SIGHUP).
    > 	Options can be debugging flags used by the trace package or any other
    > 	numeric	value used by the backend, for example the deadlock_timeout.
    > 	Having flags and options specified at runtime and changed while the
    > 	backends are running can greatly simplify the debugging and tuning
    > 	of the database. New options can be defined in utils/misc/trace.c and
    > 	include/utils/trace.h.  As an example of the usage of this package
    > 	see lock.c and proc.c which make use of new runtime options.
    > 
    > 	Old files using int flags or variables can be easily changed to
    > 	use the new package by substituting the old variable with a #define
    > 	like in the following example:
    > 	
    > 	  /* int my_flag = 0; */
    > 	  #include "trace.h"
    >  	  #define my_flag pg_options[OPT_MYFLAG]
    > 
    > 	I have done it in postgres.c and some other files and now I can turn
    > 	on/off any single debug flag on the fly with a simple shell script.
    > 	I have removed the IpcConfigTip() from ipc.c, it should better be
    > 	described in the postgres manual instead of being printed on stderr.
    > 
    > 	This patch provides also a new format of debugging messages which
    > 	are always in a single line with a timestamp and the backend pid:
    > 
    > 	  #timestamp          #pid    #message
    > 	  980127.17:52:14.173 [29271] StartTransactionCommand
    > 	  980127.17:52:14.174 [29271] ProcessUtility:  drop table t;
    > 	  980127.17:52:14.186 [29271] SIIncNumEntries: table is 70% full
    > 	  980127.17:52:14.186 [29286] Async_NotifyHandler
    > 	  980127.17:52:14.186 [29286] Waking up sleeping backend process
    > 	  980127.19:52:14.292 [29286] Async_NotifyFrontEnd
    > 	  980127.19:52:14.413 [29286] Async_NotifyFrontEnd done
    > 	  980127.19:52:14.466 [29286] Async_NotifyHandler done
    > 
    > 	This improves the readability of the log and allows one to understand
    > 	exactly which backend is doing what and at which time. It also makes
    > 	easier to write simple awk or perl scripts which monitor the log to
    > 	detect database errors or problem, or to compute transaction times.
    > 
    > 	The patch changes also the meaning of signals used by postgres, as
    > 	described by the following table:
    > 
    > 		    postmaster				backend
    > 
    > 	SIGHUP      kill(*,sighup)			read_pg_options
    > 	SIGINT	    kill(*,sigint), die			die
    > 	SIGCHLD	    reaper				-
    > 	SIGTTIN	    ignored				-
    > 	SIGTTOU	    ignored				-
    > 	SIGQUIT	    die					handle_warn
    > 	SIGTERM	    kill(*,sigterm), kill(*,9), die	die
    > 	SIGCONT	    dumpstatus				-
    > 	SIGPIPE	    ignored				die
    > 	SIGFPE	    -					FloatExceptionHandler
    > 	SIGTSTP	    -					ignored (alive test)
    > 	SIGUSR1	    kill(*,sigusr1), die		quickdie
    > 	SIGUSR2	    kill(*,sigusr2)			Async_NotifyHandler
    > 							(also SI buffer flush)
    > 
    > 	The main changes to the old implementation are SIGQUIT instead of
    > 	SIGHUP to handle warns, SIGHUP to reread pg_options and redirection
    > 	to all backends of SIGHUP, SIGINT, SIGTERM, SIGUSR1 and SIGUSR2.
    > 	In this way some of the signals sent to the postmaster can be sent
    > 	automatically to all the backends. To shut down postgres one needs
    > 	only to send a SIGTERM to postmaster and it will stop automatically
    > 	all the backends. This new signal handling mechanism is also used
    > 	to prevent SI cache table overflows: when a backend detects the SI
    > 	table full at 70% it simply sends a signal to the postmaster which
    > 	will wake up all idle backends and make them flush the cache.
    > 
    > vacuum.patch
    > 
    > 	adds a debug message to vacuum that prints the name of a table or
    > 	index *before* vacuuming it, if the verbose keyword is set.
    > 	This is useful to know which table is causing troubles if a
    > 	vacuum all crashes. Currently table information is printed only
    > 	at the end of each vacuum operation and is never printed if the
    > 	vacuum crashes.
    > 
    > -- 
    > Massimo Dal Zotto
    
    Massimo, now that 6.3 is released, any chance of getting these patches
    against the 6.3 source code?
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  10. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-16T23:27:02Z

    > > Hi hackers,
    > > 
    > > I have old patches for version 6.2.1p6 which fix some problems and add
    > > new features. Here is a short description of each patch file:
    
    > > spin-lock.patch
    > > 
    > > 	I'm not sure if this is really useful, but it seems stupid to have
    > > 	a backend wasting cpu cycles in a busy loop while the process which
    > > 	should release the lock is waiting for the cpu. So I added a call
    > > 	to process_yield() if the spin lock can't obtained.
    > > 	This has been implemented and tested only on Linux. I don't know if
    > > 	other OS have process_yield(). If someone can check please do it.
    
    The generic way to do this is
    
        select( NULL_FDSET, NULL_FDSET, NULL_FDSET, &delaytime, NULL);
    
    Delay time may be 0, but a random value between 0 and say 30 msec seems
    to be optimal. Hard busy wait spinlocks cause huge performance problems with
    heavily loaded systems and lots of postgres backends. Basically one backend
    ends up with the lock and gets scheduled out holding it, every else queues
    up busywaiting behind this one. But the backend holding the lock cannot
    release it until all the other backeds waiting on the lock exhaust a full
    timeslice busywaiting. Get 20 of these guys going (like on a busy website) and
    the system pretty much stops doing any work at all.
    
    I say we should get this in as soon as we can.
    
    > > -- 
    > > Massimo Dal Zotto
    > 
    > Massimo, now that 6.3 is released, any chance of getting these patches
    > against the 6.3 source code?
    > 
    > -- 
    > Bruce Momjian                          |  830 Blythe Avenue
    > maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
    
    
    -dg
    
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
  11. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Marc G. Fournier <scrappy@hub.org> — 1998-03-17T02:47:57Z

    On Mon, 16 Mar 1998, David Gould wrote:
    
    > The generic way to do this is
    > 
    >     select( NULL_FDSET, NULL_FDSET, NULL_FDSET, &delaytime, NULL);
    > 
    > Delay time may be 0, but a random value between 0 and say 30 msec seems
    > to be optimal. Hard busy wait spinlocks cause huge performance problems with
    > heavily loaded systems and lots of postgres backends. Basically one backend
    > ends up with the lock and gets scheduled out holding it, every else queues
    > up busywaiting behind this one. But the backend holding the lock cannot
    > release it until all the other backeds waiting on the lock exhaust a full
    > timeslice busywaiting. Get 20 of these guys going (like on a busy website) and
    > the system pretty much stops doing any work at all.
    > 
    > I say we should get this in as soon as we can.
    
    Can you submit an appropriate patch that can be included in the mega-patch
    to be created on Sunday?
    
    Thanks...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  12. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-17T03:20:48Z

    > 
    > On Mon, 16 Mar 1998, David Gould wrote:
    > 
    > > The generic way to do this is
    > > 
    > >     select( NULL_FDSET, NULL_FDSET, NULL_FDSET, &delaytime, NULL);
    > > 
    > > Delay time may be 0, but a random value between 0 and say 30 msec seems
    > > to be optimal. Hard busy wait spinlocks cause huge performance problems with
    > > heavily loaded systems and lots of postgres backends. Basically one backend
    > > ends up with the lock and gets scheduled out holding it, every else queues
    > > up busywaiting behind this one. But the backend holding the lock cannot
    > > release it until all the other backeds waiting on the lock exhaust a full
    > > timeslice busywaiting. Get 20 of these guys going (like on a busy website) and
    > > the system pretty much stops doing any work at all.
    > > 
    > > I say we should get this in as soon as we can.
    > 
    > Can you submit an appropriate patch that can be included in the mega-patch
    > to be created on Sunday?
    
    Just a warning that this is not going to be easy.  We have OS-specific
    code for spinlocks in include/storage/s_lock.h and
    backend/storage/buffer/s_lock.c.  So each S_LOCK macro call has to have
    its test-and-set logic de-coupled with its while-lock-fail-try-again
    logic.  Most of them are easy, but some like VAX:
    
    #define S_LOCK(addr)        __asm__("1: bbssi $0,(%0),1b": :"r"(addr))
    
    are hard to de-couple.  Now, I did not know we supported NetBSD on VAX. 
    Does it work, anyone?  Can I remove it?
    
    This is going to be pretty tough to test on every platform we support,
    so if it is done now, it will have to be done carefully.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  13. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Thomas Lockhart <lockhart@alumni.caltech.edu> — 1998-03-17T03:46:30Z

    > > Can you submit an appropriate patch that can be included in the 
    > > mega-patch to be created on Sunday?
    > 
    > Just a warning that this is not going to be easy.  We have OS-specific
    > code for spinlocks in include/storage/s_lock.h and
    > backend/storage/buffer/s_lock.c.  So each S_LOCK macro call has to
    > have its test-and-set logic de-coupled with its 
    > while-lock-fail-try-again logic. 
    > Most of them are easy, but some like VAX:
    > 
    > #define S_LOCK(addr)        __asm__("1: bbssi $0,(%0),1b": :"r"(addr))
    > 
    > are hard to de-couple.  Now, I did not know we supported NetBSD on 
    > VAX. Does it work, anyone?  Can I remove it?
    
    NetBSD on VAX in on our supported list, and was verified for v6.3 by Tom
    Helbekkmo.
    
    > This is going to be pretty tough to test on every platform we support,
    > so if it is done now, it will have to be done carefully.
    
    Is this behavior in v6.2.x? In any case, if it is anything but minimally
    trivial, it should be given a test on every supported platform, since it
    hits the heart of the platform-specific code, doesn't it? Seems like it
    should be put into the CVS tree and shaken out until the next release...
    
                     - Tom
    
    
  14. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Marc G. Fournier <scrappy@hub.org> — 1998-03-17T03:49:04Z

    On Tue, 17 Mar 1998, Thomas G. Lockhart wrote:
    
    > > This is going to be pretty tough to test on every platform we support,
    > > so if it is done now, it will have to be done carefully.
    > 
    > Is this behavior in v6.2.x? In any case, if it is anything but minimally
    > trivial, it should be given a test on every supported platform, since it
    > hits the heart of the platform-specific code, doesn't it? Seems like it
    > should be put into the CVS tree and shaken out until the next release...
    
    	Not realizing what was involved, I have to agree here...*after* I
    get a post-release patch out on Sunday? :)
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  15. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-17T03:54:53Z

    > 
    > > > Can you submit an appropriate patch that can be included in the 
    > > > mega-patch to be created on Sunday?
    
    > > are hard to de-couple.  Now, I did not know we supported NetBSD on 
    > > VAX. Does it work, anyone?  Can I remove it?
    > 
    > NetBSD on VAX in on our supported list, and was verified for v6.3 by Tom
    > Helbekkmo.
    > 
    > > This is going to be pretty tough to test on every platform we support,
    > > so if it is done now, it will have to be done carefully.
    > 
    > Is this behavior in v6.2.x? In any case, if it is anything but minimally
    > trivial, it should be given a test on every supported platform, since it
    > hits the heart of the platform-specific code, doesn't it? Seems like it
    > should be put into the CVS tree and shaken out until the next release...
    
    Yea, that is what I was hinting at.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  16. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-17T04:04:23Z

    > > > > Can you submit an appropriate patch that can be included in the 
    > > > > mega-patch to be created on Sunday?
    > 
    > > > are hard to de-couple.  Now, I did not know we supported NetBSD on 
    > > > VAX. Does it work, anyone?  Can I remove it?
    > > 
    > > NetBSD on VAX in on our supported list, and was verified for v6.3 by Tom
    > > Helbekkmo.
    > > 
    > > > This is going to be pretty tough to test on every platform we support,
    > > > so if it is done now, it will have to be done carefully.
    > > 
    > > Is this behavior in v6.2.x? In any case, if it is anything but minimally
    > > trivial, it should be given a test on every supported platform, since it
    > > hits the heart of the platform-specific code, doesn't it? Seems like it
    > > should be put into the CVS tree and shaken out until the next release...
    > 
    > Yea, that is what I was hinting at.
    > 
    > -- 
    > Bruce Momjian                          |  830 Blythe Avenue
    > maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
    
    
    I tend to agree but am willing to compromise.
    
    Can we do only the easy platforms at this time and then fix the others later?
    
    Since S_LOCK is a macro, it could be
    
    #define S_LOCK  s_lock_with_backoff
    
    on the easy platforms and
    
    #define S_LOCK  original_definition
    
    on the tricky or hard to test platforms
    
    If this will work, I am willing to hack this together tomorrow.
    What is the time frame for accepting a patch like this?
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
    
  17. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Thomas Lockhart <lockhart@alumni.caltech.edu> — 1998-03-17T04:16:10Z

    David Gould wrote:
    > 
    > > > > > Can you submit an appropriate patch that can be included in the
    > > > > > mega-patch to be created on Sunday?
    > >
    > > > > are hard to de-couple.  Now, I did not know we supported NetBSD on
    > > > > VAX. Does it work, anyone?  Can I remove it?
    > > >
    > > > NetBSD on VAX in on our supported list, and was verified for v6.3 by Tom
    > > > Helbekkmo.
    > > >
    > > > > This is going to be pretty tough to test on every platform we support,
    > > > > so if it is done now, it will have to be done carefully.
    > > >
    > > > Is this behavior in v6.2.x? In any case, if it is anything but minimally
    > > > trivial, it should be given a test on every supported platform, since it
    > > > hits the heart of the platform-specific code, doesn't it? Seems like it
    > > > should be put into the CVS tree and shaken out until the next release...
    > >
    > > Yea, that is what I was hinting at.
    > >
    > > --
    > > Bruce Momjian                          |  830 Blythe Avenue
    > > maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
    > 
    > I tend to agree but am willing to compromise.
    > 
    > Can we do only the easy platforms at this time
    > If this will work, I am willing to hack this together tomorrow.
    > What is the time frame for accepting a patch like this?
    
    What do you think it would take to exercise the patch heavily enough on
    any of the affected platforms to ensure that it behaves no worse than
    the current code? Is a test case easy to set up and run? If so, you can
    probably get ~5 platforms tested in the next few days, and reduce the
    risk of including this in the mega-patch.
    
    Alternatively, we could do this the day _after_ the mega-patch, so that
    the two are decoupled, and have a nice "slock patch" a few days later.
    
    At the moment, there aren't any large backend patches waiting to go (I
    think Vadim is still sleeping to recover from v6.3 :)
    
                            - Tom
    
    
  18. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Marc G. Fournier <scrappy@hub.org> — 1998-03-17T04:51:23Z

    On Mon, 16 Mar 1998, David Gould wrote:
    
    > If this will work, I am willing to hack this together tomorrow.
    > What is the time frame for accepting a patch like this?
    
    	Assuming that its *clean* (clean meaning that Bruce fully approves
    of it, as this is his area of the code...well, one of them
    *grin*)...tomorrow would be great :)  If Bruce has *any* doubts though, it
    doesn't go in until after I do the patch I want to do...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  19. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-17T06:38:14Z

    > David Gould wrote:
    > > 
    > > > > > > Can you submit an appropriate patch that can be included in the
    > > > > > > mega-patch to be created on Sunday?
    > > >
    > > > > > are hard to de-couple.  Now, I did not know we supported NetBSD on
    > > > > > VAX. Does it work, anyone?  Can I remove it?
    > > > >
    > > > > NetBSD on VAX in on our supported list, and was verified for v6.3 by Tom
    > > > > Helbekkmo.
    > > > >
    > > > > > This is going to be pretty tough to test on every platform we support,
    > > > > > so if it is done now, it will have to be done carefully.
    > > > >
    > > > > Is this behavior in v6.2.x? In any case, if it is anything but minimally
    > > > > trivial, it should be given a test on every supported platform, since it
    > > > > hits the heart of the platform-specific code, doesn't it? Seems like it
    > > > > should be put into the CVS tree and shaken out until the next release...
    > > >
    > > > Yea, that is what I was hinting at.
    > > >
    > > > --
    > > > Bruce Momjian                          |  830 Blythe Avenue
    > > > maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
    > > 
    > > I tend to agree but am willing to compromise.
    > > 
    > > Can we do only the easy platforms at this time
    > > If this will work, I am willing to hack this together tomorrow.
    > > What is the time frame for accepting a patch like this?
    > 
    > What do you think it would take to exercise the patch heavily enough on
    > any of the affected platforms to ensure that it behaves no worse than
    > the current code? Is a test case easy to set up and run? If so, you can
    > probably get ~5 platforms tested in the next few days, and reduce the
    > risk of including this in the mega-patch.
    
    Simple but clumsy test:
    
    Multiple backends all doing single row inserts into private tables. Run it
    for a fixed time period and add up the total number of rows. Do this for both
    a small (eg 2) and large (eg 20 or 40) number of backends.
    
    psuedo perl /sh / sql follows
    # driver <number of users> <delay time> <run time>
    synchtime = timenow
    starttime = synctime + delaytime;
    endtime = starttime + runtime
    for user 1 to number of users
       singleuser synchtime delay_time run_time
    
    # singleuser <userno> <starttime> <endtime>
    exec sql 'create table user$user (i int);'
    
    sleep until start time
    n = 1;
    while timenow < endtime
       exec sql "insert into user$user values($n);"
    
    print "user $user inserted $n"
    
    The figure of merit is the total number of inserts. Oh and that the system
    does not fall over.
    
    But, it may be best to leave this until after the mega patch. I am not sure
    I want to share the blame ;-).
    
    Just to fill me in, where does the mega patch fall in with the next release
    snapshot? That is, if this misses the mega patch is it waiting until 6.4?
    
    -dg
    
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
  20. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Marc G. Fournier <scrappy@hub.org> — 1998-03-17T07:40:44Z

    On Mon, 16 Mar 1998, David Gould wrote:
    
    > Just to fill me in, where does the mega patch fall in with the next release
    > snapshot? That is, if this misses the mega patch is it waiting until 6.4?
    
    	That is pretty much up to you, actually.  There is nothing wrong
    with a clean patch for this being placed in the patches directory, if it
    can be done shortly after the post-release patch.  Basically, starting
    April 1st (or so), development basically starts up again, so backtracking
    a patch from v6.4-alpha to v6.3 release can prove difficult :(
    
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  21. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-17T14:20:00Z

    > 
    > On Mon, 16 Mar 1998, David Gould wrote:
    > 
    > > If this will work, I am willing to hack this together tomorrow.
    > > What is the time frame for accepting a patch like this?
    > 
    > 	Assuming that its *clean* (clean meaning that Bruce fully approves
    > of it, as this is his area of the code...well, one of them
    > *grin*)...tomorrow would be great :)  If Bruce has *any* doubts though, it
    > doesn't go in until after I do the patch I want to do...
    
    David, go for it.  The code is all local in two files, and I think you
    can basically change all the do{test-and-set} while(lock-is-false)
    loops to:
    
    	do{test-and-set} while(lock-is-false && select ())
    
    Pretty easy.  No need to test multiple platforms.  The ones where the
    loop is integrated into the asm(), leave them for later.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  22. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Thomas Lockhart <lockhart@alumni.caltech.edu> — 1998-03-17T16:45:10Z

    > But, it may be best to leave this until after the mega patch. I am not 
    > sure I want to share the blame ;-).
    > 
    > Just to fill me in, where does the mega patch fall in with the next 
    > release snapshot? That is, if this misses the mega patch is it waiting 
    > until 6.4?
    
    I would guess that we could post a separate patch any time soon,
    especially since the changes are apparently isolated to only a few
    places in the code. In the last release, we posted ~7 patches, each
    independent of the others, and generated on our local source trees. Each
    of the patches was, however, fairly simple, quite often only one or a
    few lines of change, and were intended as bug fixes. Also, they were
    easily tested. In any case, at a minimum the regression test should be
    run (and passed!).
    
                          - Tom
    
    
  23. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Tom Ivar Helbekkmo <tih@hamartun.priv.no> — 1998-03-18T21:50:35Z

    * Bruce Momjian
    |
    | Just a warning that this is not going to be easy.  We have OS-specific
    | code for spinlocks in include/storage/s_lock.h and
    | backend/storage/buffer/s_lock.c.  So each S_LOCK macro call has to have
    | its test-and-set logic de-coupled with its while-lock-fail-try-again
    | logic.  Most of them are easy, but some like VAX:
    | 
    | #define S_LOCK(addr)        __asm__("1: bbssi $0,(%0),1b": :"r"(addr))
    | 
    | are hard to de-couple.  Now, I did not know we supported NetBSD on VAX. 
    | Does it work, anyone?  Can I remove it?
    
    Yes, it works.  No, please don't break it.  Heck, I only just got it
    in in time for 6.3!  :-) The not-so-busy-waiting-spinlock stuff can be
    put in on a platform at a time -- I'll expand the VAX version to do
    the right thing once someone has done another platform, so I can see
    what's the preferred way of doing it.
    
    -tih
    -- 
    Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"
    
    
  24. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-18T23:51:51Z

    > * Bruce Momjian
    > | Just a warning that this is not going to be easy.  We have OS-specific
    > | code for spinlocks in include/storage/s_lock.h and
    > | backend/storage/buffer/s_lock.c.  So each S_LOCK macro call has to have
    > | its test-and-set logic de-coupled with its while-lock-fail-try-again
    > | logic.  Most of them are easy, but some like VAX:
    > | 
    > | #define S_LOCK(addr)        __asm__("1: bbssi $0,(%0),1b": :"r"(addr))
    > | 
    > | are hard to de-couple.  Now, I did not know we supported NetBSD on VAX. 
    > | Does it work, anyone?  Can I remove it?
    > 
    > Yes, it works.  No, please don't break it.  Heck, I only just got it
    > in in time for 6.3!  :-) The not-so-busy-waiting-spinlock stuff can be
    > put in on a platform at a time -- I'll expand the VAX version to do
    > the right thing once someone has done another platform, so I can see
    > what's the preferred way of doing it.
    >
    > -tih
    
    I won't. I hope.
    
    Seriously, if you want to, please create a function to emulate the following:
    
    /*
     * tas(lock)
     *
     * Access to platform specific test_and_set functionality. Given pointer to
     * lock attempts to acquire the lock atomically.
     *
     * Returns 0 for success, nonzero for failure.
     */
    typedef slock_t unsigned char;		/* or whatever works on the platform */
    
    int tas(slock_t *lock)
    {
        slock_t	tmp;
    
        /* atomic, interlocked */
        tmp = *lock;
        *lock = -1; 			/* any nonzero will do here */
    
        return (tmp != 0);
    }
    
    Given this, I can fold the VAX right into the grand scheme, just like a
    normal computer (;-)).
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
    
    
  25. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-19T02:35:56Z

    > 
    > * Bruce Momjian
    > |
    > | Just a warning that this is not going to be easy.  We have OS-specific
    > | code for spinlocks in include/storage/s_lock.h and
    > | backend/storage/buffer/s_lock.c.  So each S_LOCK macro call has to have
    > | its test-and-set logic de-coupled with its while-lock-fail-try-again
    > | logic.  Most of them are easy, but some like VAX:
    > | 
    > | #define S_LOCK(addr)        __asm__("1: bbssi $0,(%0),1b": :"r"(addr))
    > | 
    > | are hard to de-couple.  Now, I did not know we supported NetBSD on VAX. 
    > | Does it work, anyone?  Can I remove it?
    > 
    > Yes, it works.  No, please don't break it.  Heck, I only just got it
    > in in time for 6.3!  :-) The not-so-busy-waiting-spinlock stuff can be
    > put in on a platform at a time -- I'll expand the VAX version to do
    > the right thing once someone has done another platform, so I can see
    > what's the preferred way of doing it.
    
    OK, now I know that the VAX stuff is still used and supported.  Good. 
    We don't have good platform-specific information on NetBSD and Linux
    ports.
    
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  26. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Thomas Lockhart <lockhart@alumni.caltech.edu> — 1998-03-19T03:24:25Z

    > OK, now I know that the VAX stuff is still used and supported.  Good.
    > We don't have good platform-specific information on NetBSD and Linux
    > ports.
    
    ?? 
    
    Have you checked the hardcopy or html versions of the "Supported
    Platforms" section in the Administrator's Guide? _Every_ entry in this
    was updated and refreshed at the end of February.
    
    What other info should we be collecting for this?
    
                      - Tom
    
    
  27. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-19T04:33:49Z

    > 
    > > OK, now I know that the VAX stuff is still used and supported.  Good.
    > > We don't have good platform-specific information on NetBSD and Linux
    > > ports.
    > 
    > ?? 
    > 
    > Have you checked the hardcopy or html versions of the "Supported
    > Platforms" section in the Administrator's Guide? _Every_ entry in this
    > was updated and refreshed at the end of February.
    > 
    > What other info should we be collecting for this?
    > 
    >                   - Tom
    > 
    
    Sorry, haven't looked there yet.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  28. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Massimo Dal Zotto <dz@cs.unitn.it> — 1998-03-19T17:25:04Z

    > 
    > > 
    > > Hi hackers,
    > > 
    > > I have old patches for version 6.2.1p6 which fix some problems and add
    > > new features. Here is a short description of each patch file:
    > > 
    > > ...
    > > 
    > > -- 
    > > Massimo Dal Zotto
    > 
    > Massimo, now that 6.3 is released, any chance of getting these patches
    > against the 6.3 source code?
    
    I have already applied all my old patches against 6.3 except the lock
    patch. The lock code has changed and I have to verify if my patches are
    still compatible. It will take some time and I haven't very much.
    I found also an interesting bug in the notify code. I will post the new
    patches when I find some spare time to verify them.
    Could you please send me some documentation on the new lock and deadlock
    code in the meantime ?
    
    Massimo Dal Zotto
    
    +----------------------------------------------------------------------+
    |  Massimo Dal Zotto                e-mail:  dz@cs.unitn.it            |
    |  Via Marconi, 141                 phone:  ++39-461-534251            |
    |  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
    |  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
    +----------------------------------------------------------------------+
    
    
  29. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-19T19:47:10Z

    > I have already applied all my old patches against 6.3 except the lock
    > patch. The lock code has changed and I have to verify if my patches are
    
    Great.
    
    > still compatible. It will take some time and I haven't very much.
    > I found also an interesting bug in the notify code. I will post the new
    > patches when I find some spare time to verify them.
    > Could you please send me some documentation on the new lock and deadlock
    > code in the meantime ?
    
    Here is some comments from storage/lmgr/lock.c:DeadLockCheck().  This
    was a real mind-bender for me.  It finds holders of the lock it has been
    waiting on, and checks to see if any of those holders is waiting on the
    lock I own.  If not, it then checks all the holders of locks these new
    processes are waiting on, and checks to see if they are waiting on my
    lock, and it continues until it has traced all backend process id's
    related to my lock.  I have a static checked_procs[] array that keeps
    track of what I have checked so I don't loop.  The code is recursive.
    
    The new wait queue handling is documented in storage/lmgr/lock.c:ProcSleep().
    
    ---------------------------------------------------------------------------
    
     * This code takes a list of locks a process holds, and the lock that
     * the process is sleeping on, and tries to find if any of the processes
     * waiting on its locks hold the lock it is waiting for.  If no deadlock
     * is found, it goes on to look at all the processes waiting on their locks.
     *
     * We have already locked the master lock before being called.
     */
    bool
    DeadLockCheck(SHM_QUEUE *lockQueue, LOCK *findlock, bool skip_check)
    
    
    ---------------------------------------------------------------------------
    
            /*
             * This is our only check to see if we found the lock we want.
             *
             * The lock we are waiting for is already in MyProc->lockQueue so we
             * need to skip it here.  We are trying to find it in someone
             * else's lockQueue.
             */
            if (lock == findlock && !skip_check)
    
     
    ---------------------------------------------------------------------------
    
                        /*
                         * For findlock's wait queue, we are interested in
                         * procs who are blocked waiting for a write-lock on
                         * the table we are waiting on, and already hold a
                         * lock on it. We first check to see if there is an
                         * escalation deadlock, where we hold a readlock and
                         * want a writelock, and someone else holds readlock
                         * on the same table, and wants a writelock.
                         *
                         * Basically, the test is, "Do we both hold some lock on
                         * findlock, and we are both waiting in the lock
                         * queue?"
                         */
    
    ---------------------------------------------------------------------------
    
                             * For non-MyProc entries, we are looking only
                             * waiters, not necessarily people who already
                             * hold locks and are waiting. Now we check for
                             * cases where we have two or more tables in a
                             * deadlock.  We do this by continuing to search
                             * for someone holding a lock
                             */
                            if (DeadLockCheck(&(proc->lockQueue), findlock, false))
    
    ---------------------------------------------------------------------------
    
        /*
         * If the first entries in the waitQueue have a greater priority than
         * we have, we must be a reader, and they must be a writers, and we
         * must be here because the current holder is a writer or a reader but
         * we don't share shared locks if a writer is waiting. We put
         * ourselves after the writers.  This way, we have a FIFO, but keep
         * the readers together to give them decent priority, and no one
         * starves.  Because we group all readers together, a non-empty queue
         * only has a few possible configurations:
         *
         * [readers] [writers] [readers][writers] [writers][readers]
         * [writers][readers][writers]
         *
         * In a full queue, we would have a reader holding a lock, then a writer
         * gets the lock, then a bunch of readers, made up of readers who
         * could not share the first readlock because a writer was waiting,
         * and new readers arriving while the writer had the lock.
         *
         */
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  30. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Massimo Dal Zotto <dz@cs.unitn.it> — 1998-03-20T10:59:01Z

    > 
    > > 
    > > On Mon, 16 Mar 1998, David Gould wrote:
    > > 
    > > > If this will work, I am willing to hack this together tomorrow.
    > > > What is the time frame for accepting a patch like this?
    > > 
    > > 	Assuming that its *clean* (clean meaning that Bruce fully approves
    > > of it, as this is his area of the code...well, one of them
    > > *grin*)...tomorrow would be great :)  If Bruce has *any* doubts though, it
    > > doesn't go in until after I do the patch I want to do...
    > 
    > David, go for it.  The code is all local in two files, and I think you
    > can basically change all the do{test-and-set} while(lock-is-false)
    > loops to:
    > 
    > 	do{test-and-set} while(lock-is-false && select ())
    > 
    > Pretty easy.  No need to test multiple platforms.  The ones where the
    > loop is integrated into the asm(), leave them for later.
    > 
    > -- 
    > Bruce Momjian                          |  830 Blythe Avenue
    > maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
    >   +  If your life is a hard drive,     |  (610) 353-9879(w)
    >   +  Christ can be your backup.        |  (610) 853-3000(h)
    > 
    > 
    > 
    
    I'am against a generic patch using select(). If we have sched_yield() on an
    architecture I don't see why dont't use it. Here is the patch for Linux.
    It has been tested for two months by 100 users without any problem.
    The only thing I would add is a more general configuration test in configure
    to include the proper include files.
    
    *** src/include/storage/s_lock.h.orig	Sat Oct 18 22:39:21 1997
    --- src/include/storage/s_lock.h	Wed Nov 19 23:11:14 1997
    ***************
    *** 294,300 ****
    --- 294,314 ----
       */
      
      #if defined(NEED_I386_TAS_ASM)
    + #include <unistd.h>
    + #include <sched.h>
      
    + #ifdef _POSIX_PRIORITY_SCHEDULING
    + #define	S_LOCK(lock)	do \
    + 						{ \
    + 							slock_t		_res; \
    + 							do \
    + 							{ \
    + 								__asm__("xchgb %0,%1": "=q"(_res), \
    + 										"=m"(*lock):"0"(0x1)); \
    + 								if (_res) sched_yield(); \
    + 							} while (_res != 0); \
    + 						} while (0)
    + #else
      #define	S_LOCK(lock)	do \
      						{ \
      							slock_t		_res; \
    ***************
    *** 303,308 ****
    --- 317,323 ----
      				__asm__("xchgb %0,%1": "=q"(_res), "=m"(*lock):"0"(0x1)); \
      							} while (_res != 0); \
      						} while (0)
    + #endif
      
      #define	S_UNLOCK(lock)	(*(lock) = 0)
      
    
    Massimo Dal Zotto
    
    +----------------------------------------------------------------------+
    |  Massimo Dal Zotto                e-mail:  dz@cs.unitn.it            |
    |  Via Marconi, 141                 phone:  ++39-461-534251            |
    |  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
    |  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
    +----------------------------------------------------------------------+
    
    
  31. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-20T17:06:55Z

    > I'am against a generic patch using select(). If we have sched_yield() on an
    > architecture I don't see why dont't use it. Here is the patch for Linux.
    > It has been tested for two months by 100 users without any problem.
    > The only thing I would add is a more general configuration test in configure
    > to include the proper include files.
    
    I understand your issue.  Unfortunately, only Linux has sched_yield(),
    as far as I know.  Perhaps we can implement sched_yield/select based on
    the platform.  However, if someone is holding a spinlock, does
    sched_yield() give the other process enough time to finish with the
    spinlock before we start checking it again.  Seems select() allows us to
    control the time we wait before checking again.
    
    Also, it looks like the s_lock.h file is going to change pretty
    radically from David's change, so when he is done, we can put some
    OS-specific stuff if you wish.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  32. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-20T19:15:47Z

    Massimo Dal Zotto writes:
    > > David, go for it.  The code is all local in two files, and I think you
    > > can basically change all the do{test-and-set} while(lock-is-false)
    > > loops to:
    > > 
    > > 	do{test-and-set} while(lock-is-false && select ())
    > > 
    > > Pretty easy.  No need to test multiple platforms.  The ones where the
    > > loop is integrated into the asm(), leave them for later.
    > > 
    > > -- 
    > > Bruce Momjian                          |  830 Blythe Avenue
    > 
    > I'am against a generic patch using select(). If we have sched_yield() on an
    > architecture I don't see why dont't use it. Here is the patch for Linux.
    > It has been tested for two months by 100 users without any problem.
    > The only thing I would add is a more general configuration test in configure
    > to include the proper include files.
    > 
    > *** src/include/storage/s_lock.h.orig	Sat Oct 18 22:39:21 1997
    > --- src/include/storage/s_lock.h	Wed Nov 19 23:11:14 1997
    > ***************
    > *** 294,300 ****
    > --- 294,314 ----
    >    */
    >   
    >   #if defined(NEED_I386_TAS_ASM)
    > + #include <unistd.h>
    > + #include <sched.h>
    >   
    > + #ifdef _POSIX_PRIORITY_SCHEDULING
    > + #define	S_LOCK(lock)	do \
    > + 					{ \
    > + 						slock_t		_res; \
    > + 						do \
    > + 						{ \
    > + 							__asm__("xchgb %0,%1": "=q"(_res), \
    > + 									"=m"(*lock):"0"(0x1)); \
    > + 							if (_res) sched_yield(); \
    > + 						} while (_res != 0); \
    > + 					} while (0)
    > + #else
    >   #define	S_LOCK(lock)	do \
    >   					{ \
    >   						slock_t		_res; \
    > ***************
    > *** 303,308 ****
    > --- 317,323 ----
    >   			__asm__("xchgb %0,%1": "=q"(_res), "=m"(*lock):"0"(0x1)); \
    >   						} while (_res != 0); \
    >   					} while (0)
    > + #endif
    >   
    >   #define	S_UNLOCK(lock)	(*(lock) = 0)
    > 
    > +----------------------------------------------------------------------+
    > |  Massimo Dal Zotto                e-mail:  dz@cs.unitn.it            |
    > |  Via Marconi, 141                 phone:  ++39-461-534251            |
    > |  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
    > |  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
    > +----------------------------------------------------------------------+
    
    
    
    I am perfectly happy to use sched_yield() on Linux. My goal however is to
    make the concept work on all platforms.
    
    There was a recent thread on comp.os.linux.system on context switch times
    of Linux vs NTthat revealed that sched_yield() is fairly costly as it causes
    the scheduler to do a full scan of the process table and recalculate the
    the priorities of all processes. Probably not a problem, but it should
    be it should probably be benchmarked both ways.
    
    Finally, even though this appears to work, there is a possible stability
    problem with both approaches. Here is some of the discussion I had about
    that with Bruce:
    
    -----------------------------------------------------------------------
    David:
    > Right now, there is not much chance of catching a signal while waiting for
    > a spinlock. This is good, cause the code that waits for spinlocks tends to
    > be doing funny things with buffers and locks and shared stuff like that.
    > We don't catch signals because we don't make syscalls. But, once this goes in,
    > we will be calling select() a _lot_ and so it kinda becomes the likely place
    > for a signal to get delivered. Without thinking about it and looking at the
    > code a bit longer, I am not sure it is prudent to rush this in. I still want
    > it in as soon a possible, but possible includes free from harmful side effects.
    Bruce:
    > Well, signals are handled in the backend by tcop/postgres.c.  In
    > most/all? cases, a signal causes a longjump() out of the code and
    > releases locks and aborts the transaction.
    
    David:
    > I was afraid that would be the answer. Basically, this never worked. The
    > problem is that in an indeterminate number of places the code manipulates
    > multiple distinct but related structures in a sequence. To leave the server
    > in a consistant state all these updates need to be done in the right order
    > so that if the sequence in interrupted by a signal the partially updated
    > state is consistant. Unhappily, the original coders did not always have this
    > in mind. An example:
    > 
    >  cleaning up after a scan
    >   1 - release buffer pointed by scan descriptor
    >   2 - release scan descriptor
    > 
    > If a signal is taken just after one, the abort code will see the scan
    > descriptor and call the cleanup for it resulting in:
    > 
    > cleaning up after a scan (take 2)
    >   1 - release buffer pointed by scan descriptor
    >     - Whoopsie, buffer already released!
    >   2 - release scan descriptor
    > 
    > These sequences either _all_ have to identified and fixed, or made atomic
    > somehow, which is a biggish job.
    > 
    > Or the system has to acknowledge signals at only clearly defined points.
    > My preference is to have signal handlers only set a global flag to indicate
    > that a signal was seen. Then one just sprinkles tests of the flag in
    > all the likely places: step to next node, fetch next page, call function, etc.
    > 
    > The way this shows up in real life is strange unreproduceable errors on busy
    > systems, especially when backends are killed or transactions are forced to
    > abort.
    > 
    > Fixing this is a bit of a project, but the result is that a lot of mystery
    > bugs go away.
    
    --------------------------------------------------------------------------
    
    -dg
    
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
  33. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Marc G. Fournier <scrappy@hub.org> — 1998-03-20T23:27:52Z

    On Fri, 20 Mar 1998, Bruce Momjian wrote:
    
    > > I'am against a generic patch using select(). If we have sched_yield() on an
    > > architecture I don't see why dont't use it. Here is the patch for Linux.
    > > It has been tested for two months by 100 users without any problem.
    > > The only thing I would add is a more general configuration test in configure
    > > to include the proper include files.
    > 
    > I understand your issue.  Unfortunately, only Linux has sched_yield(),
    > as far as I know.  Perhaps we can implement sched_yield/select based on
    > the platform.  
    
    	What's the possibility of doing this similar to how we do some of
    the other functions (dl_open comes immediately to mind)...make a
    pg_sched_yield function and use that, which is built based on the various
    platforms?
    
    	Right now, I don't believe we have *anything* in place, so have
    pg_sched_yield() return 0 (or an equivalent) for every platform except for
    Linux...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  34. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-03-21T01:05:32Z

    > 
    > On Fri, 20 Mar 1998, Bruce Momjian wrote:
    > 
    > > > I'am against a generic patch using select(). If we have sched_yield() on an
    > > > architecture I don't see why dont't use it. Here is the patch for Linux.
    > > > It has been tested for two months by 100 users without any problem.
    > > > The only thing I would add is a more general configuration test in configure
    > > > to include the proper include files.
    > > 
    > > I understand your issue.  Unfortunately, only Linux has sched_yield(),
    > > as far as I know.  Perhaps we can implement sched_yield/select based on
    > > the platform.  
    > 
    > 	What's the possibility of doing this similar to how we do some of
    > the other functions (dl_open comes immediately to mind)...make a
    > pg_sched_yield function and use that, which is built based on the various
    > platforms?
    > 
    > 	Right now, I don't believe we have *anything* in place, so have
    > pg_sched_yield() return 0 (or an equivalent) for every platform except for
    > Linux...
    > 
    > Marc G. Fournier                                
    > Systems Administrator @ hub.org 
    > primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    I appreciate all the help, but I think I have a solution for this. Details
    next week...
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
  35. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-21T03:40:54Z

    > 	What's the possibility of doing this similar to how we do some of
    > the other functions (dl_open comes immediately to mind)...make a
    > pg_sched_yield function and use that, which is built based on the various
    > platforms?
    > 
    > 	Right now, I don't believe we have *anything* in place, so have
    > pg_sched_yield() return 0 (or an equivalent) for every platform except for
    > Linux...
    > 
    
    Probably even easier.  Just use #ifdef linux around the select or
    sched_yield.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  36. Re: sched_yield()

    Marc G. Fournier <scrappy@hub.org> — 1998-03-22T01:33:16Z

    On Sun, 22 Mar 1998, Mattias Kregert wrote:
    
    > The Hermit Hacker wrote:
    > >
    > > What's the possibility of doing this similar to how we do some of
    > > the other functions (dl_open comes immediately to mind)...make a
    > > pg_sched_yield function and use that, which is built based on the various
    > > platforms?
    > > 
    > >         Right now, I don't believe we have *anything* in place, so have
    > > pg_sched_yield() return 0 (or an equivalent) for every platform except for
    > > Linux...
    > 
    > But sched_yield() is not Linux-specific:
    > -- The sched_yield() function relinquishes the processor for the
    > -- running process.
    > -- IEEE Std 1003.1b-1993, 13.3.5. (POSIX real-time standard 1003.lb)
    > 
    > Except from Linux, I can find references to sched_yield() in LynxOS,
    > DECthreads thread library, AIX 4.1 and up (libc), Solaris (thread.h
    > (c)1994 Sun
    > Microsystems), Unix98, GNU, C EXECUTIVE(r) and PSX(tm) real time kernels
    > ...
    > This is just a quick search.
    > 
    > Perhaps we should enable sched_yield() for every OS except for... well,
    > what's the
    > name of that OS which does not have sched_yield()...  FreeBSD ;)
    > 
    > After all, sched_yield() is five years old. Any reasonable OS should
    > have it.
    
    	You missed my point...so far as I've heard, there are two ways of
    doing what is being proposed...either using sched_yield() on those systems
    that support it, or select() on those that don't.  If you are going to
    build a patch for this, it should look something like:
    
    #ifdef HAVE_SCHED_YIELD
    	<insert sched_yield() code here>
    #else
    	<insert select() code here>
    #endif
    
    	Totally 'configure' configurable, and non-system dependent :)
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  37. Re: sched_yield()

    Mattias Kregert <matti@algonet.se> — 1998-03-22T01:50:12Z

    The Hermit Hacker wrote:
    >
    > What's the possibility of doing this similar to how we do some of
    > the other functions (dl_open comes immediately to mind)...make a
    > pg_sched_yield function and use that, which is built based on the various
    > platforms?
    > 
    >         Right now, I don't believe we have *anything* in place, so have
    > pg_sched_yield() return 0 (or an equivalent) for every platform except for
    > Linux...
    
    But sched_yield() is not Linux-specific:
    -- The sched_yield() function relinquishes the processor for the
    -- running process.
    -- IEEE Std 1003.1b-1993, §13.3.5. (POSIX real-time standard 1003.lb)
    
    Except from Linux, I can find references to sched_yield() in LynxOS,
    DECthreads thread library, AIX 4.1 and up (libc), Solaris (thread.h
    (c)1994 Sun
    Microsystems), Unix98, GNU, C EXECUTIVE(r) and PSX(tm) real time kernels
    ...
    This is just a quick search.
    
    Perhaps we should enable sched_yield() for every OS except for... well,
    what's the
    name of that OS which does not have sched_yield()...  FreeBSD ;)
    
    After all, sched_yield() is five years old. Any reasonable OS should
    have it.
    
    /* m */
    
    
  38. Re: [HACKERS] Re: sched_yield()

    David Gould <dg@illustra.com> — 1998-03-22T03:02:56Z

    > On Sun, 22 Mar 1998, Mattias Kregert wrote:
    > 
    > > The Hermit Hacker wrote:
    > > >
    > > > What's the possibility of doing this similar to how we do some of
    > > > the other functions (dl_open comes immediately to mind)...make a
    > > > pg_sched_yield function and use that, which is built based on the various
    > > > platforms?
    > > > 
    > > >         Right now, I don't believe we have *anything* in place, so have
    > > > pg_sched_yield() return 0 (or an equivalent) for every platform except for
    > > > Linux...
    > > 
    > > But sched_yield() is not Linux-specific:
    > > -- The sched_yield() function relinquishes the processor for the
    > > -- running process.
    > > -- IEEE Std 1003.1b-1993, 13.3.5. (POSIX real-time standard 1003.lb)
    > > 
    > > Except from Linux, I can find references to sched_yield() in LynxOS,
    > > DECthreads thread library, AIX 4.1 and up (libc), Solaris (thread.h
    > > (c)1994 Sun
    > > Microsystems), Unix98, GNU, C EXECUTIVE(r) and PSX(tm) real time kernels
    > > ...
    > > This is just a quick search.
    > > 
    > > Perhaps we should enable sched_yield() for every OS except for... well,
    > > what's the
    > > name of that OS which does not have sched_yield()...  FreeBSD ;)
    > > 
    > > After all, sched_yield() is five years old. Any reasonable OS should
    > > have it.
    > 
    > 	You missed my point...so far as I've heard, there are two ways of
    > doing what is being proposed...either using sched_yield() on those systems
    > that support it, or select() on those that don't.  If you are going to
    > build a patch for this, it should look something like:
    > 
    > #ifdef HAVE_SCHED_YIELD
    > 	<insert sched_yield() code here>
    > #else
    > 	<insert select() code here>
    > #endif
    > 
    > 	Totally 'configure' configurable, and non-system dependent :)
    > 
    > Marc G. Fournier                                
    > Systems Administrator @ hub.org 
    > primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    Ok, I will add a configuration option
    
      USE_SCHED_YIELD
    
    to the select patch I am working on. This can be enabled by configure.
    Assuming someone can find the header files needed on all the platforms.
    
    However, we should not assume that sched_yield() even where available is
    the automatic "right thing". It might be, but...
    
    The situation that either the select() or the sched_yield() style of
    spinlock back off is meant to help is when there are a number of processes
    busywaiting on the same spinlock.
    
    On Linux, sched_yield() triggers the scheduler to to a full priority re-calc
    for all processses. This is slightly expensive especially with a long run
    queue. Having a bunch of processes pound on sched_yield() might be actually
    worse than to use select(). At the very least it needs testing.
    
    Secondly, the select() backoff patch I am working on starts out with a zero
    timeout and backs off incrementally by increasing the timeout value on
    subsequent iterations. The idea is to break up convoys and avoid big piles of
    processes pounding on a spinlock. This cannot be done with sched_yield().
    
    Which is better? Well, golly gosh, I have no idea. I know that the select()
    flavor effectively solves the problem caused by hard loop busy waiting.
    Without some testing it is kinda hard to say more than that. I will try to
    fit in some testing, but if someone has a favorite many process workload
    and would like to try comparing both flavors it would be useful.
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - I realize now that irony has no place in business communications.
    
    
    
  39. Re: sched_yield()

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-22T03:59:03Z

    > 
    > 	You missed my point...so far as I've heard, there are two ways of
    > doing what is being proposed...either using sched_yield() on those systems
    > that support it, or select() on those that don't.  If you are going to
    > build a patch for this, it should look something like:
    > 
    > #ifdef HAVE_SCHED_YIELD
    > 	<insert sched_yield() code here>
    > #else
    > 	<insert select() code here>
    > #endif
    > 
    > 	Totally 'configure' configurable, and non-system dependent :)
    
    Yep, I like it.
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  40. Re: [HACKERS] Re: sched_yield()

    Marc G. Fournier <scrappy@hub.org> — 1998-03-22T04:00:01Z

    On Sat, 21 Mar 1998, David Gould wrote:
    
    > Ok, I will add a configuration option
    > 
    >   USE_SCHED_YIELD
    > 
    > to the select patch I am working on. This can be enabled by configure.
    > Assuming someone can find the header files needed on all the platforms.
    > 
    > However, we should not assume that sched_yield() even where available is
    > the automatic "right thing". It might be, but...
    > 
    > The situation that either the select() or the sched_yield() style of
    > spinlock back off is meant to help is when there are a number of processes
    > busywaiting on the same spinlock.
    > 
    > On Linux, sched_yield() triggers the scheduler to to a full priority re-calc
    > for all processses. This is slightly expensive especially with a long run
    > queue. Having a bunch of processes pound on sched_yield() might be actually
    > worse than to use select(). At the very least it needs testing.
    > 
    > Secondly, the select() backoff patch I am working on starts out with a zero
    > timeout and backs off incrementally by increasing the timeout value on
    > subsequent iterations. The idea is to break up convoys and avoid big piles of
    > processes pounding on a spinlock. This cannot be done with sched_yield().
    > 
    > Which is better? Well, golly gosh, I have no idea. I know that the select()
    > flavor effectively solves the problem caused by hard loop busy waiting.
    > Without some testing it is kinda hard to say more than that. I will try to
    > fit in some testing, but if someone has a favorite many process workload
    > and would like to try comparing both flavors it would be useful.
    
    	Okay, we have two differing viewpoints here...from what I've been
    able to read, the select() solution will work on *all* platforms, while
    the sched_yield() will work on *some* systems, but not all.
    
    	I personally like the "work on all platform" solution, but that's
    just me :)
    
    	I may have missed it, but I'm curious as to under what
    circumstance sched_yield() is better then the select() solution?  The
    "con", as I see it, is that sched_yield() doesn't work everywhere...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  41. Re: sched_yield()

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-22T04:01:12Z

    > Perhaps we should enable sched_yield() for every OS except for... well,
    > what's the
    > name of that OS which does not have sched_yield()...  FreeBSD ;)
    > 
    > After all, sched_yield() is five years old. Any reasonable OS should
    > have it.
    
    Gee, I just checked and BSDI has it.  However, it appears to work only
    on threaded applications:
    
         #include <pthread.h>
    ...
         If other threads are ready to run, the sched_yield() function forces the
         current thread to suspend itself temporarily and let them execute.
    
    
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  42. Re: [HACKERS] Re: sched_yield()

    Bruce Momjian <maillist@candle.pha.pa.us> — 1998-03-22T04:03:41Z

    > Secondly, the select() backoff patch I am working on starts out with a zero
    > timeout and backs off incrementally by increasing the timeout value on
    > subsequent iterations. The idea is to break up convoys and avoid big piles of
    > processes pounding on a spinlock. This cannot be done with sched_yield().
    
    Hard to beat the backoff argument.  I vote we only use select().
    
    -- 
    Bruce Momjian                          |  830 Blythe Avenue
    maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
      +  If your life is a hard drive,     |  (610) 353-9879(w)
      +  Christ can be your backup.        |  (610) 853-3000(h)
    
    
  43. Re: [HACKERS] Re: sched_yield()

    Marc G. Fournier <scrappy@hub.org> — 1998-03-22T05:44:50Z

    On Sat, 21 Mar 1998, Bruce Momjian wrote:
    
    > > Secondly, the select() backoff patch I am working on starts out with a zero
    > > timeout and backs off incrementally by increasing the timeout value on
    > > subsequent iterations. The idea is to break up convoys and avoid big piles of
    > > processes pounding on a spinlock. This cannot be done with sched_yield().
    > 
    > Hard to beat the backoff argument.  I vote we only use select().
    
    	I haven't heard any compelling arguments so far as to why
    sched_yield() is better then select(), so I tend to vote the same way...
    
    Marc G. Fournier                                
    Systems Administrator @ hub.org 
    primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
    
    
    
  44. Re: [HACKERS] Re: sched_yield()

    Tom Samplonius <tom@sdf.com> — 1998-03-22T06:32:15Z

    On Sun, 22 Mar 1998, Mattias Kregert wrote:
    
    > Except from Linux, I can find references to sched_yield() in LynxOS,
    > DECthreads thread library, AIX 4.1 and up (libc), Solaris (thread.h
    > (c)1994 Sun
    > Microsystems), Unix98, GNU, C EXECUTIVE(r) and PSX(tm) real time kernels
    > ...
    > This is just a quick search.
    
      This seems to be part of Posix threads, which means that sched_yield
    should only be callable by a thread....  I'm surprised that normal
    processes can call sched_yield.  In fact, in most of the environments
    you've mentioned, it certainly can't be used outside a thread.
    
    Tom
    
    
    
  45. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Tom Ivar Helbekkmo <tih@hamartun.priv.no> — 1998-03-31T21:20:40Z

    David Gould wrote:
    
    > Seriously, if you want to, please create a function to emulate the following:
    > 
    > /*
    >  * tas(lock)
    >  *
    >  * Access to platform specific test_and_set functionality. Given pointer to
    >  * lock attempts to acquire the lock atomically.
    >  *
    >  * Returns 0 for success, nonzero for failure.
    >  */
    > typedef slock_t unsigned char;		/* or whatever works on the platform */
    > 
    > int tas(slock_t *lock)
    > {
    >     slock_t	tmp;
    > 
    >     /* atomic, interlocked */
    >     tmp = *lock;
    >     *lock = -1; 			/* any nonzero will do here */
    > 
    >     return (tmp != 0);
    > }
    > 
    > Given this, I can fold the VAX right into the grand scheme, just like a
    > normal computer (;-)).
    
    Hmpf!  The true worth of a computer is a function of its weight!  :-)
    
    Sorry this took a while, but anyway, this should do it for the VAX (in
    fact, it's more or less the version of the code that I figured I'd use
    until Bruce asked me to bum it down maximally for performance, only
    now with the return values from tas() swapped).  I include the macros
    that would fit the current (6.3) locking scheme:
    
    typedef unsigned char slock_t;
    
    int tas(slock_t *lock) {
    	register ret;
    
    asm("	movl $1, r0
    	bbssi $0,(%1),1f
    	clrl r0
    1:	movl r0,%0"
    	: "=r"(ret)	/* return value, in register */
    	: "r"(lock)	/* argument, 'lock pointer', in register */
    	: "r0");	/* inline code uses this register */
    
    	return ret;
    }
    
    #define	S_LOCK(addr)		do { while (tas(addr)) ; } while (0)
    #define	S_UNLOCK(addr)		(*(addr) = 0)
    #define	S_INIT_LOCK(addr)	(*(addr) = 0)
    
    -tih
    -- 
    Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"
    
    
  46. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-04-01T19:56:56Z

    > David Gould wrote:
    > 
    > >Seriously, if you want to, please create a function to emulate the following:
    > > 
    > > /*
    > >  * tas(lock)
    > >  *
    > >  * Access to platform specific test_and_set functionality. Given pointer to
    > >  * lock attempts to acquire the lock atomically.
    > >  *
    > >  * Returns 0 for success, nonzero for failure.
    > >  */
    > > typedef slock_t unsigned char;		/* or whatever works on the platform */
    > > 
    > > int tas(slock_t *lock)
    > > {
    > >     slock_t	tmp;
    > > 
    > >     /* atomic, interlocked */
    > >     tmp = *lock;
    > >     *lock = -1; 			/* any nonzero will do here */
    > > 
    > >     return (tmp != 0);
    > > }
    > > 
    > > Given this, I can fold the VAX right into the grand scheme, just like a
    > > normal computer (;-)).
    > 
    > Hmpf!  The true worth of a computer is a function of its weight!  :-)
    > 
    > Sorry this took a while, but anyway, this should do it for the VAX (in
    > fact, it's more or less the version of the code that I figured I'd use
    > until Bruce asked me to bum it down maximally for performance, only
    > now with the return values from tas() swapped).  I include the macros
    
    What do you mean "now with the return values from tas() swapped"? I think
    your code looks ok, but just want to be sure we are following the same
    grand plan...
    
    > that would fit the current (6.3) locking scheme:
    > 
    > typedef unsigned char slock_t;
    > 
    > int tas(slock_t *lock) {
    > 	register ret;
    > 
    > asm("	movl $1, r0
    > 	bbssi $0,(%1),1f
    > 	clrl r0
    > 1:	movl r0,%0"
    > 	: "=r"(ret)	/* return value, in register */
    > 	: "r"(lock)	/* argument, 'lock pointer', in register */
    > 	: "r0");	/* inline code uses this register */
    > 
    > 	return ret;
    > }
    > 
    > #define	S_LOCK(addr)		do { while (tas(addr)) ; } while (0)
    > #define	S_UNLOCK(addr)		(*(addr) = 0)
    > #define	S_INIT_LOCK(addr)	(*(addr) = 0)
    > 
    > -tih
    > -- 
    
    Thanks, this is just what I was looking for. I will fold it in to my changes.
    I have gotten a little snowed under with other tasks, but I expect to finalize
    my patch next week and will post it.
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - Linux. Not because it is free. Because it is better.
    
    
    
  47. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    Tom Ivar Helbekkmo <tih@hamartun.priv.no> — 1998-04-02T04:52:33Z

    * David Gould
    |
    | What do you mean "now with the return values from tas() swapped"? I think
    | your code looks ok, but just want to be sure we are following the same
    | grand plan...
    
    I just meant that my original code (which has been posted before) had
    the tas() function implemented so that it returned 0 on failure, not
    on success, as you asked for.  Thus, I had to swap the sense of the
    return value.  In practice, I changed
    
    	clrl r0			; clear register r0
    	bbssi $0,(%1),1f	; branch on bit set else set
    	incl r0			; increment register r0
    1:	movl r0,%0		; return register r0
    
    	[...]
    
    #define	S_LOCK(addr)		do { while (!tas(addr)) ; } while (0)
    
    ...into...
    
    	movl $1, r0		; set register r0 to 1
    	bbssi $0,(%1),1f	; branch on bit set else set
    	clrl r0			; clear register r0
    1:	movl r0,%0		; return register r0
    
    	[...]
    
    #define	S_LOCK(addr)		do { while (tas(addr)) ; } while (0)
    
    -tih
    -- 
    Popularity is the hallmark of mediocrity.  --Niles Crane, "Frasier"
    
    
  48. Re: [HACKERS] Re: [PATCHES] patches for 6.2.1p6

    David Gould <dg@illustra.com> — 1998-04-02T07:06:36Z

    Tom Ivar Helbekkmo:
    > * David Gould
    > |
    > | What do you mean "now with the return values from tas() swapped"? I think
    > | your code looks ok, but just want to be sure we are following the same
    > | grand plan...
    > 
    > I just meant that my original code (which has been posted before) had
    > the tas() function implemented so that it returned 0 on failure, not
    > on success, as you asked for.  Thus, I had to swap the sense of the
    > return value.  In practice, I changed
    > 
    > 	clrl r0			; clear register r0
    > 	bbssi $0,(%1),1f	; branch on bit set else set
    > 	incl r0			; increment register r0
    > 1:	movl r0,%0		; return register r0
    > 
    > 	[...]
    > 
    > #define	S_LOCK(addr)		do { while (!tas(addr)) ; } while (0)
    > 
    > ...into...
    > 
    > 	movl $1, r0		; set register r0 to 1
    > 	bbssi $0,(%1),1f	; branch on bit set else set
    > 	clrl r0			; clear register r0
    > 1:	movl r0,%0		; return register r0
    > 
    > 	[...]
    > 
    > #define	S_LOCK(addr)		do { while (tas(addr)) ; } while (0)
    > 
    
    Thats what I thought, but it has been a few years (not saying how many ;-) )
    since I wrote any Macro-32 so I figured I should check.
    
    The tas() definition is not to return success or failure so much as to
    return the _previous_state_ of the lock. So if you test_and_set the lock
    the test part returns true if it was previously locked and false if it was
    unlocked. In either case, it is locked after the tas() (the set part). 
    Only, if it was previously unlocked, someone else owns the lock so we
    have to wait for them to unlock it.
    
    -dg
    
    David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
    Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
     - Linux. Not because it is free. Because it is better.