WIP pgindent replacement

Andrew Dunstan <andrew@dunslane.net>

From: Andrew Dunstan <andrew@dunslane.net>
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
Date: 2011-06-22T00:27:45Z
Lists: pgsql-hackers

Attached is a WIP possible replacement for pgindent. Instead of a shell
script invoking a mishmash of awk and sed, some of which is pretty
impenetrable, it uses a single engine (perl) to do all the pre-indent
and post-indent processing. Of course, if your regex-fu and perl-fu are
not up to scratch this too might be impenetrable, but all but a couple
of the recipes are reduced to single lines, and I'd argue that they are
all at least as comprehensible as what they replace.

Also attached is a diff file showing what it does differently from the
existing script. I think that these are all things where the new script
is more correct than the existing script. Most of the changes fall into
two categories:

    * places where the existing script fails to combine the function
      return type and the function name on a single line in function
      prototypes.
    * places where unwanted blank lines are removed by the new script
      but not by the existing script.
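
To make the two categories concrete, here is a minimal sketch in Python (not the perl from the attached script) of what such post-indent fixes could look like. Both regexes and both function names are assumptions made for illustration: the first joins a return type left alone on its own line with a following "name(args);" prototype, and the second drops blank lines directly after an opening brace, which is only one guess at what counts as an "unwanted" blank line.

```python
import re

# Hypothetical rule 1: a return type alone on a line, followed by
# "name(args);" on the next line, is a prototype split across two lines.
# Function definitions are left alone because they have no ";" after ")".
SPLIT_PROTOTYPE = re.compile(
    r"^([A-Za-z_][\w \t*]*?)[ \t]*\n([A-Za-z_]\w*[ \t]*\([^{};]*\);)",
    re.MULTILINE,
)

# Hypothetical rule 2: blank lines directly after an opening brace.
BLANK_AFTER_BRACE = re.compile(r"(\{[ \t]*\n)(?:[ \t]*\n)+")


def join_prototypes(source):
    """Put the return type and the function name of a prototype on one line."""
    return SPLIT_PROTOTYPE.sub(r"\1 \2", source)


def drop_blank_lines_after_brace(source):
    """Remove blank lines that immediately follow an opening brace."""
    return BLANK_AFTER_BRACE.sub(r"\1", source)
```

Joining with a single space is itself an assumption; the real script may well use a tab or some other separator there.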

Features include:

    * command line compatibility with the existing script, so you can do:
      find ../../.. -name '*.[ch]' -type f -print |
          egrep -v -f exclude_file_patterns |
          xargs -n100 ./pgindent.pl typedefs.list
    * a new way of doing the same thing much more nicely:
      ./pgindent.pl --search-base=../../.. --typedefs=typedefs.list \
          --excludes=exclude_file_patterns
    * only passes relevant typedefs to indent, not the whole huge list
    * should in principle be runnable on Windows, unlike the existing
      script (I haven't tested this yet)
    * no semantic tab literals; tabs are only generated using \t and
      tested for using \t, \h or \s as appropriate. This makes debugging
      the script much less frustrating. If something looks like a space
      it should be a space.
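
As an illustration of the "only relevant typedefs" item in the list above, the following Python sketch (again, not the attached perl) keeps only those typedef names that actually occur as identifiers in a given source file and turns them into -T arguments, which is how BSD indent is told about a type name. The function names are made up here, and whether the attached script builds its indent arguments exactly this way is an assumption.

```python
import re

# A C identifier: letter or underscore, then letters, digits, underscores.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def relevant_typedefs(source, typedefs):
    """Return the typedef names that appear as identifiers in source, sorted."""
    used = set(IDENTIFIER.findall(source))
    return sorted(used & set(typedefs))


def typedef_args(source, typedefs):
    """Build one -Tname argument per relevant typedef (hypothetical helper)."""
    return ["-T" + name for name in relevant_typedefs(source, typedefs)]
```

Matching whole identifiers rather than substrings matters here: a typedef called List should not be picked up just because the file mentions ListCell.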

In one case I used perl's extended regex mode to comment a fairly hairy 
regex. This should probably be done a bit more, maybe for all of them.
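
For anyone who hasn't seen the style, this is roughly what a commented regex looks like. The sketch is in Python, whose re.VERBOSE flag plays the same role as perl's /x modifier, and the rule itself (dropping blank lines in front of #else, #elif and #endif) is a hypothetical example rather than a recipe taken from the attached script. Tabs are spelled \t throughout, in the spirit of the "no tab literals" point.

```python
import re

BLANK_BEFORE_CPP = re.compile(
    r"""
    \n                          # end of the last non-blank line
    (?: [ \t]* \n )+            # one or more lines holding only spaces or tabs
    (?= [ \t]* \# [ \t]*        # lookahead: a preprocessor line follows ...
        (?: else | elif | endif ) \b   # ... and it continues or closes an if-block
    )
    """,
    re.VERBOSE,
)


def drop_blank_before_cpp(source):
    """Remove blank lines that directly precede #else, #elif or #endif."""
    return BLANK_BEFORE_CPP.sub("\n", source)
```

In verbose mode unescaped whitespace and everything after an unescaped # is ignored, which is why the literal # of the preprocessor directive has to be written as \#.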

If anybody is so inclined, this could be used as a basis for removing
the use of BSD indent altogether, as has been suggested before, as well
as the external entab/detab.

Comments welcome.


cheers

andrew