Thread

  1. SQL queries accessing tables in more than one db

    De Clarke <de@ucolick.org> — 1998-06-22T23:02:00Z

    RE:  ability to access tables from different DB in same SQL query
    
    I have a good example right now of why this is a good feature, 
    and why I suspect it is impractical or impossible to implement 
    one of my bread-n-butter sybase-based apps using PG.  This is 
    rather a fraught question, because NOT migrating to PG may cost 
    us thousands of dollars.  More like tens of thousands, since we 
    would need to bring our old licenses up to date (I just found a 
    horrendous bug in Sybase SQL Server 4.9 date arithmetic) and 
    buy a couple of new licenses for our project.  So a lot is
    riding on our ability to port existing apps to PG.
    
    I don't think this feature is in SQL92, so I'm making a plea for
    an "extra", another way in which PG can be "better than" the bare
    standard.
    
    -------
    
    There is a telemetry gathering daemon (several actually) dumping
    telemetry into tables.  The tables are large (within the limits
    of Sybase, which are 250 fields per record and 3700-some bytes).
    Say about 220 fields per record, in the high hundreds of bytes,
    all ints and floats, and the tables are ever growing in length.
    Say between 10K and 250K records depending on how long the logger
    has been running and at what frequency telemetry has been gathered.
    
    The engineering staff person wants to do some analysis on these
    data.  The front end app provides an easy, friendly way to select
    a date range (or other RSE) from the huge mass of telemetry,
    and to refine the FSE.
    
    The selected data are then copied into a temp table using a 
    SELECT INTO.
    
    Here's the point, then.  The user's temp table wants to live in
    a DB with generous permissions:  ordinary users can create and
    delete tables!  But the original telemetry data want to live in
    a very protected DB where users absolutely cannot mess with the
    original tables OR go creating tables of their own that compete
    with the originals for storage space.
    
    This is where two important features of Sybase come in handy, and
    I don't think Oracle does this (correct me, O Oracle users, if I'm
    slandering the product):  Different DB can be located on different
    partitions, or different inviolable physical chunks of one
    partition.  One server can "see" multiple DB.  And Sybase SQL queries 
    can span databases, that is, the database name is part of the FQON 
    (fully qual object name).  So it's as easy as
    
    	select * into sandbox.guest.DMyn_de897082595_D1 from
    		telem.dbo.hires_Log_1 where logstamp between
    		'Jun 19 1998 03:00' and 'Jun 19 1988 08:00'
    
    The user has no privs other than 'select' anywhere in the telem DB.
    He/she now works freely with the smaller, lighter table in the
    sandbox DB (*not* repeating expensive queries against the potentially
    very, very large table in 'telem').  The user could, of course, 
    specify a very large range of data and create quite a large temp table 
    (there are some safety limits in the app, but they are generous).  
    The worst result of this would be that the user would get bored waiting 
    for the query to return :-), the dataset would be too big to plot easily, 
    and other users of the sandbox DB would be annoyed when all the disk 
    space was used up.
    
    Now, as I understand PG, the user would have to create the temp table 
    in the incoming telemetry db, because the SELECT query could not
    reference tables in 2 different DB.   So he/she would have to be granted 
    table create/delete in a place where creation of large tables could 
    easily interfere with the essential job of logging the incoming telemetry.
    Not acceptable.  The user playpen has to be separate from the incoming
    production data.
    
    If PG offers some other way of 
    
    	guaranteeing inviolable space for tables that grow with a 
    		realtime feed, 
    	yet making those tables accessible -- via SQL query --
    		to nonpriv users who want to grab chunks of the data into 
    		temp tables created on the fly, 
    
    I would very much like to know how to do that.  Yeah, I could do it 
    by buffering the data row by row in the app and then inserting it from 
    the app into the new table, but what the heck is SQL for if not to do 
    that job more efficiently and concisely?  SELECT INTO is the right
    syntax.  But the limitations of PG prevent my using it.
    
    Yeah, I could mirror the whole telemetry db to a "public" server 
    periodically, and let the users query that.  But the engineers want 
    up-to-the-last-sample data to analyze;  they can change the sampling 
    speed in real time, if something interesting is happening;  they don't 
    want to wait for the 24-hourly or 12-hourly flush to the warehouse.
    
    Solutions using psql special commands are not acceptable;  it has to
    be SQL.
    
    I hope this makes the case for cross-DB queries, which I have agitated
    feebly for in the past but never really justified.  If not (if there is
    a good way to achieve the same result in PG without x-db queries) then
    please do tell!
    
    de
    
    
    
  2. Re: [HACKERS] SQL queries accessing tables in more than one db

    ocie@paracel.com — 1998-06-22T23:24:39Z

    De Clarke wrote:
    > 
    > RE:  ability to access tables from different DB in same SQL query
    
    [snip]
    
    > Here's the point, then.  The user's temp table wants to live in
    > a DB with generous permissions:  ordinary users can create and
    > delete tables!  But the original telemetry data want to live in
    > a very protected DB where users absolutely cannot mess with the
    > original tables OR go creating tables of their own that compete
    > with the originals for storage space.
    > 
    > 	select * into sandbox.guest.DMyn_de897082595_D1 from
    > 		telem.dbo.hires_Log_1 where logstamp between
    > 		'Jun 19 1998 03:00' and 'Jun 19 1988 08:00'
    
    A couple of comments on this.  Sybase does have quite an elegant
    system for this <db>.<owner>.<table>.<column> for simple queries,
    table can be omitted, when not ambiguous, user can be omitted and the
    full form can be used in all queries if the permissions are right.  So
    you could just as easily say:
    
    select * from telem..hires_Log_1 where logstamp=sandbox..table1.timestamp
    
    Now we could take it one step further and put the name of the database
    server before all of this, so you could say:
    
    select * from office1.sales..saleslog,
                  office2.sales..saleslog,
    ... etc
    
    I wouldn't vouch for performance in this case though.
    
    Ocie