Thread

  1. Re: TS: Limited cover density ranking

    Oleg Bartunov <oleg@sai.msu.su> — 2012-01-28T19:04:31Z

    I suggest you work on a more general approach; see
    http://www.sai.msu.su/~megera/wiki/2009-08-12 for an example.
    
    BTW, I don't like that you changed the ts_rank_cd arguments.
    
    Oleg
    On Fri, 27 Jan 2012, karavelov@mail.bg wrote:
    
    > Hello,
    >
    > I have developed a variation of the cover density ranking functions that counts only covers smaller than a specified limit. It is useful for finding combinations of terms that appear near one another. Here is an example of usage:
    >
    > -- normal cover density ranking : not changed
    > luben=> select ts_rank_cd(to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
    > ts_rank_cd
    > ------------
    >  0.0333333
    > (1 row)
    >
    > -- limited to 2
    > luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
    > ts_rank_cd
    > ------------
    >          0
    > (1 row)
    >
    > luben=> select ts_rank_cd(2, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
    > ts_rank_cd
    > ------------
    >        0.1
    > (1 row)
    >
    > -- limited to 3
    > luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k'), to_tsquery('a&d'));
    > ts_rank_cd
    > ------------
    >  0.0333333
    > (1 row)
    >
    > luben=> select ts_rank_cd(3, to_tsvector('a b c d e g h i j k a d'), to_tsquery('a&d'));
    > ts_rank_cd
    > ------------
    >   0.133333
    > (1 row)
    >
    > Please find attached a patch against the 9.1.2 sources. I preferred to make a patch rather than a separate extension because it is only a one-statement change in the calc_rank_cd function. If I had to make an extension, a lot of code would be duplicated between backend/utils/adt/tsrank.c and the extension.
    >
    > I have some questions:
    >
    > 1. Would it be of interest to develop it further (documentation, cleanup, etc.) for inclusion in one of the next versions? If so, there are some further questions:
    >
    > - should I overload ts_rank_cd (as in the examples above and the patch) or should I define a new set of functions, for example ts_rank_lcd?
    > - should I define these new SQL-level functions in core, or should I go only with this two-line change in calc_rank_cd() and define the new functions as an extension? If we prefer the latter, could I overload core functions with functions defined in extensions?
    > - and finally there is always the possibility to duplicate the code and make an independent extension.
    >
    > 2. If I run the patched version on a cluster that was initialized with an unpatched server, is there a way to register the new functions in the system catalog without reinitializing the cluster?
    >
    > Best regards
    > luben
    >
    > --
    > Luben Karavelov
    
     	Regards,
     		Oleg
    _____________________________________________________________
    Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
    Sternberg Astronomical Institute, Moscow University, Russia
    Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
    phone: +007(495)939-16-83, +007(495)939-23-83
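
The limited ranking described in the quoted message can be illustrated with a small Python sketch. This is not the attached patch (which is not shown in the thread) and not PostgreSQL's C code. It is a simplified model that assumes an AND query over plain terms, every token indexed at its 1-based word position, the default weight of 0.1 for every lexeme, and the default normalization. The names query_hits, covers and rank_cd are hypothetical, and the limit test (q - p <= limit) is only inferred from the sample outputs above.

```python
from typing import Iterator, List, Optional, Set, Tuple

DEFAULT_WEIGHT = 0.1  # assumed: default weight of an unlabeled lexeme


def query_hits(document: str, terms: Set[str]) -> List[Tuple[int, str]]:
    """Return (position, term) for each query term, positions starting at 1."""
    tokens = enumerate(document.split(), start=1)
    return [(pos, tok) for pos, tok in tokens if tok in terms]


def covers(hits: List[Tuple[int, str]], terms: Set[str]) -> Iterator[Tuple[int, int]]:
    """Yield (begin, end) indexes into hits for each minimal cover of all terms."""
    start = 0
    while start < len(hits):
        # Scan forward until every query term has been seen.
        seen: Set[str] = set()
        end = None
        for i in range(start, len(hits)):
            seen.add(hits[i][1])
            if seen == terms:
                end = i
                break
        if end is None:
            return
        # Scan backward from end to find the latest begin that still covers all terms.
        seen = set()
        begin = end
        for i in range(end, start - 1, -1):
            seen.add(hits[i][1])
            if seen == terms:
                begin = i
                break
        yield begin, end
        start = begin + 1


def rank_cd(document: str, terms: Set[str], limit: Optional[int] = None) -> float:
    """Sum cover contributions, skipping covers wider than limit (if given)."""
    hits = query_hits(document, terms)
    rank = 0.0
    for begin, end in covers(hits, terms):
        p, q = hits[begin][0], hits[end][0]
        if limit is not None and q - p > limit:
            continue  # assumed form of the limit test, inferred from the examples
        n_hits = end - begin + 1
        # Number of hits divided by the sum of inverse weights.
        cpos = n_hits / (n_hits * (1.0 / DEFAULT_WEIGHT))
        # Positions inside the cover that are not query hits count as noise.
        noise = (q - p) - (end - begin)
        rank += cpos / (1 + noise)
    return rank


if __name__ == "__main__":
    doc = "a b c d e g h i j k a d"
    for lim in (None, 2, 3):
        print(lim, round(rank_cd(doc, {"a", "d"}, lim), 6))
```

Under these assumptions the sketch gives the same values as the quoted session for query a&d: about 0.0333333 for the first document without a limit, 0 and 0.1 with limit 2, and about 0.0333333 and 0.133333 with limit 3.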