Thread

  1. Modules

    Mattias Kregert <matti@algonet.se> — 1998-03-28T13:05:11Z

    David Gould wrote:
    >
    > To load a module into a kernel all you need to do is read the code in,
    > resolve the symbols, and maybe call an initialization routine. This is
    > merely a variation on loading a shared object (.so) file into a program.
    > 
    > To add a type and related stuff to a database is really a much harder problem.
    
    I don't agree.
    
    > You need to be able to
    >   - add one or more type descriptions                   types table
    >   - add input and output functions                      types, functions tables
    >   - add cast functions                                  casts, functions tables
    >   - add any datatype specific behavior functions        functions table
    >   - add access method operators (maybe)                 amops, functions tables
    >   - add aggregate operators                             aggregates, functions
    >   - add operators                                       operators, functions
    >   - provide statistics functions
    >   - provide destroy operators
    >   - provide .so files for C functions, SQL for sql functions
    >     (note this is the part needed for a unix kernel module)
    >   - do all the above within a particular schema
    > 
    > You may also need to create and populate data tables, rules, defaults, etc
    > required by the implementation of the new type.
    
    All of this would be done by the init function of the module you load.
    What we need is a set of functions callable by modules, like
    
      module_register_type(name, descr, func*, textin*, textout*, whatever ...)
      module_register_smgr(name, descr, .....)
      module_register_command(....
    
    Casts would be done by converting to a common format (text) and then to
    the desired type, using textin/textout. No special cast functions would
    have to exist. Why doesn't it work this way already? Wouldn't that
    solve all casting problems?
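    
    A minimal sketch of how such a registration interface and a text-based
    cast could look. Only the name module_register_type comes from the
    message above; the signature, the type table, cast_via_text, the toy
    int4/float8 functions and example_module_init are assumptions made up
    for illustration, not an existing PostgreSQL API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback types: both return 0 on success, -1 on failure. */
typedef int (*textin_fn)(const char *text, void *value);
typedef int (*textout_fn)(const void *value, char *buf, size_t len);

struct type_entry {
    const char *name;
    const char *descr;
    textin_fn   textin;   /* text -> internal representation */
    textout_fn  textout;  /* internal representation -> text */
};

#define MAX_TYPES 32
static struct type_entry type_table[MAX_TYPES];
static int type_count = 0;

/* Called from a module's init function to announce a new type. */
int module_register_type(const char *name, const char *descr,
                         textin_fn in, textout_fn out)
{
    if (type_count == MAX_TYPES || name == NULL || in == NULL || out == NULL)
        return -1;
    type_table[type_count] = (struct type_entry){ name, descr, in, out };
    type_count++;
    return 0;
}

static const struct type_entry *find_type(const char *name)
{
    for (int i = 0; i < type_count; i++)
        if (strcmp(type_table[i].name, name) == 0)
            return &type_table[i];
    return NULL;
}

/* Cast by going through text: textout of the source, textin of the target. */
int cast_via_text(const char *from, const void *src,
                  const char *to, void *dst)
{
    char buf[64];
    const struct type_entry *f = find_type(from);
    const struct type_entry *t = find_type(to);

    assert(src != NULL && dst != NULL);
    if (f == NULL || t == NULL)
        return -1;
    if (f->textout(src, buf, sizeof buf) != 0)
        return -1;
    return t->textin(buf, dst);
}

/* Two toy types, standing in for what a real module would provide. */
static int int4_in(const char *text, void *value)
{
    char *end;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;               /* not a plain integer */
    *(int *)value = (int)v;
    return 0;
}

static int int4_out(const void *value, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%d", *(const int *)value);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

static int float8_in(const char *text, void *value)
{
    char *end;
    double v = strtod(text, &end);
    if (end == text || *end != '\0')
        return -1;
    *(double *)value = v;
    return 0;
}

static int float8_out(const void *value, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%g", *(const double *)value);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* The module's init function: registers everything the module offers. */
int example_module_init(void)
{
    if (module_register_type("int4", "32-bit integer", int4_in, int4_out) != 0)
        return -1;
    return module_register_type("float8", "double precision",
                                float8_in, float8_out);
}
```

    In this sketch, casting the int4 value 42 to float8 succeeds because
    "42" is valid float8 input, while casting the float8 value 2.5 to int4
    fails because int4_in rejects "2.5". A text-only cast therefore works
    only where one type's output happens to be valid input for the other.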
    
    
    > To unload a type requires undoing all the above. But there is a wrinkle: first
    > you have to check if there are any dependencies. That is, if the user has
    > created a table with one of the new types, you have to drop that table
    > (including column defs, indexes, rules, triggers, defaults etc) before
    > you can drop the type. Of course the user may not want to drop their tables
    > which brings us to the next problem.
    
    Dependencies are checked by the OS kernel when you try to unload
    modules. You cannot unload slhc without first unloading ppp, for
    example. What's the difference?
    If you have Mod4X running with /dev/dsp open, then you can't unload
    the sound driver, because it is in use. You cannot unload the a.out
    module while a non-ELF program is running. You can also see the
    refcount on every module, and so on. This would be no different in
    an SQL server.
    If you have a cursor open that accesses IP types, then you cannot
    unload the IP-types module. Close the cursor, and you can unload the
    module if you want to.
    You don't have to drop tables containing new types just because you
    unload the module. If you want to SELECT from such a table, the
    module would be loaded automagically when it is needed.
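    
    The reference counting described here can be sketched as follows. The
    struct layout and the names module_load, module_unload, cursor_open and
    cursor_close are hypothetical; this is neither the Linux kernel's module
    code nor anything in PostgreSQL, just the idea in miniature with a
    single dependency per module.

```c
#include <assert.h>
#include <stddef.h>

struct module {
    const char    *name;
    struct module *depends_on;  /* module this one requires, or NULL */
    int            refcount;    /* dependants plus open cursors */
    int            loaded;
};

/* Load a module, loading and pinning its dependency first. */
int module_load(struct module *m)
{
    if (m->loaded)
        return 0;
    if (m->depends_on != NULL) {
        if (module_load(m->depends_on) != 0)
            return -1;
        m->depends_on->refcount++;
    }
    m->loaded = 1;
    return 0;
}

/* Refuse to unload while anything still refers to the module. */
int module_unload(struct module *m)
{
    if (!m->loaded)
        return 0;
    if (m->refcount > 0)
        return -1;                   /* still in use */
    if (m->depends_on != NULL)
        m->depends_on->refcount--;   /* release our pin on the dependency */
    m->loaded = 0;
    return 0;
}

/* Opening a cursor on a type loads its module on demand and pins it. */
int cursor_open(struct module *m)
{
    if (module_load(m) != 0)
        return -1;
    m->refcount++;
    return 0;
}

void cursor_close(struct module *m)
{
    assert(m->refcount > 0);
    m->refcount--;
}
```

    With slhc and ppp modelled this way, unloading slhc fails until ppp has
    been unloaded, and an IP-types module cannot be unloaded between
    cursor_open and cursor_close. Tables on disk are not involved at all,
    which is why unloading would not require dropping them.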
    
    
    > When this gets really hard is when it is time to upgrade an existing database
    > to a new version. Suppose you add a new column to a type in the new version.
    > How does a user with lots of data in dozens of tables using the old type
    > install the new module?
    > 
    > What about restoring a dump from an old version into a system with the new
    > version installed?
    
    Suppose you change TIMESTAMP to 64 bits of time and 16 bits of userid...
    How do you solve that problem? You would probably have to make the
    textin/textout functions for the type recognize the old format and make
    the appropriate conversions. Perhaps add a zero userid, or default to
    the postmaster userid?
    This would not be any different if TIMESTAMP were in a separate module.
    
    For the internal storage format, every type could have its own way
    of recognizing different versions of the data. For example, say you have
    an IPv4 module and insert millions of IP addresses, then you upgrade
    to an IPv6 module. It would then be able to look at the data and see
    whether it is an IPv4 or IPv6 address. Of course, you would have problems
    if you tried to downgrade and had lots of IPv6 addresses inserted.
    MyOwnType could use the first few bits of the data to decide which
    version it is, and later releases of the MyOwnType module would be able
    to recognize the older formats.
    This way, types could be upgraded without a dump-and-load procedure.
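    
    One way to picture the "first few bits" idea is a stored value whose
    leading byte carries a format version in its high four bits. The layout,
    the constants and the function names below are assumptions chosen for
    illustration; no existing on-disk format is being described.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical format versions, kept in the high 4 bits of byte 0. */
enum { IP_FMT_V4 = 1, IP_FMT_V6 = 2 };

struct ip_value {
    int     family;    /* 4 or 6 */
    uint8_t addr[16];  /* first 4 bytes used for IPv4 */
};

/* Encode as: one version byte, then 4 or 16 address bytes.
 * Returns the number of bytes written, or 0 on error. */
size_t ip_encode(const struct ip_value *v, uint8_t *out, size_t outlen)
{
    size_t n;

    assert(v != NULL && out != NULL);
    n = (v->family == 4) ? 4 : (v->family == 6) ? 16 : 0;
    if (n == 0 || outlen < n + 1)
        return 0;
    out[0] = (uint8_t)(((v->family == 4) ? IP_FMT_V4 : IP_FMT_V6) << 4);
    memcpy(out + 1, v->addr, n);
    return n + 1;
}

/* Decode by inspecting the version bits first.
 * Returns 0 on success, -1 for truncated or unknown formats. */
int ip_decode(const uint8_t *in, size_t inlen, struct ip_value *v)
{
    size_t n;

    assert(in != NULL && v != NULL);
    if (inlen < 1)
        return -1;
    switch (in[0] >> 4) {
    case IP_FMT_V4: v->family = 4; n = 4;  break;
    case IP_FMT_V6: v->family = 6; n = 16; break;
    default:
        return -1;  /* written by a newer release than this decoder */
    }
    if (inlen < n + 1)
        return -1;
    memset(v->addr, 0, sizeof v->addr);
    memcpy(v->addr, in + 1, n);
    return 0;
}
```

    A newer module release keeps the old cases in ip_decode, so rows written
    by the old release stay readable in place. The default branch is the
    downgrade problem mentioned above: an older decoder has no case for a
    format it has never seen.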
    
    
    > Or how about migrating to a different platform? Can we move data from
    > a little endian platform (x86) to a big endian platform (sparc)? Obviously
    > the .so files will be different, but what about copying the data out and
    > reloading it?
    
    Is this a problem right now? Dump and reload, how can it fail?
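    
    The reason a dump and reload can survive a change of byte order is that
    the dump is text rather than raw memory. A small sketch, assuming the
    dump goes through textout-style functions; dump_int32 and load_int32 are
    hypothetical names, not part of any existing dump tool.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write the value as decimal text: the same characters on x86 and sparc. */
int dump_int32(int32_t v, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "%ld", (long)v);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse the text back into the loading machine's native representation. */
int load_int32(const char *text, int32_t *v)
{
    char *end;
    long x = strtol(text, &end, 10);

    if (end == text || *end != '\0' || x < INT32_MIN || x > INT32_MAX)
        return -1;
    *v = (int32_t)x;
    return 0;
}
```

    Copying the four raw bytes of the integer would not have this property.
    The same holds for a user-defined type only if its text form does not
    itself expose raw bytes.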
    
    
    > Just to belabor this, it is perfectly reasonable to add a set of types and
    > functions that have no 'C' implementation. The 'loadable module' analogy
    > misses a lot of the real requirements.
    
    Why would someone want a type without an implementation?
    OK, let the module's init function register a type marked as
    "non-existent"? Null function?
    
    /* m */