Thread
-
Volunteer: Large Tuples / Tuple chaining
Christof Petig <christof.petig@wtal.de> — 1999-12-09T22:13:51Z
Hello,

I'll donate some (read: all freely available) of my spare time to
implementing tuple chaining. It looks like this feature is most wanted, and
it would be a pity to hold it until after 7.0. Personally I don't need it
yet ... but I will definitely find a use for it once available ;-) And it
looks like a good start for hacking on pgsql.

I have already dived into the depths of pgsql's page and tuple structures,
and it looks possible. But before I start coding I would like to hear some
more experienced opinions on how to implement it.

Have you already discussed technical matters about the implementation? How
can I catch up on that? (Simply browse the mailing list archives?)

Here's how I imagine the work:

What is needed:
- lay out a tuple continuation structure
- split a tuple into multiple chunks when it is stored in pages, and
  reassemble it when it is loaded from disk
    (how to continue a tuple - need a structure)
    how is a tuple (read: page item) addressed? ItemPointerData
    I imagine storing a continuation address as the last bytes of the
    tuple unless it fits into one page.
    I need to mark large tuples (how? just one flag in the tuple?)
    How to tell a last block of maximum possible size from a continued
    one (which carries a pointer to the next one at its end)?
    Or don't care: make the item continued and put the last 6(?) bytes
    into a new block
- note that the continued tuples are not referenced directly (vacuum?);
  mark them as used. I hope vacuum operates on a tuple basis and has no
  concept of pages
- I guess that the tuple pointer points into page memory. If multiple
  pages are concatenated for a tuple, these pages need not reside in
  memory, but the full tuple's memory must be allocated (from memory
  similar to pages) (shared mem?)
- should be possible for memory-only pages
    see PageGetPageSize, but od_pagesize is 16 bit!
    Reuse another variable?
    Another type of page? (32 bit od_pagesize)

Very fascinated by this large beast of ancient code to explore
Christof

PS: I think the documentation on page layout is far outdated (or points
into the future, since it speaks about ItemContinuationData structures).
Should I update it?
The table doesn't match the actual structure components. At least I don't
understand what it's about. The source code mentions a different page
layout.

PPS: Do not pity me, I have ten+ years of coding experience in C.

PPPS: Could someone tell me in a few words what an access method is (a
tuple is an access method, log pages are another?)
-
RE: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Hiroshi Inoue <inoue@tpf.co.jp> — 1999-12-10T15:33:36Z
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Christof Petig
>
> Hello,
>
> I'll donate some (read all freely available) of my spare time to
> implementing tuple chaining. It looks like this feature is most wanted
> and it would be a pity to hold this until post 7.0. Personally I don't
> need it, yet ... But I will definitely find a use for it once available
> ;-) And it looks like a good start for hacking on pgsql.
>
> I already dived into the depth of pgsql's page and tuple structures and
> it looks like it is possible. But before I start coding I would like to
> hear some more experienced opinions on how to implement it.
>

Will you put a long tuple into a long logical page (continued across
multiple physical pages)?
I'm suspicious of an approach that allows non-page-formatted pages.

Anyway, it would need a big change around bufmgr/smgr etc.
Could someone estimate the influence/danger before going forward?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Christof Petig <christof.petig@wtal.de> — 1999-12-13T21:59:27Z
Hiroshi Inoue wrote:

> Will you put a long tuple into a long logical page (continued across
> multiple physical pages)?
> I'm suspicious of an approach that allows non-page-formatted pages.
>
> Anyway, it would need a big change around bufmgr/smgr etc.
> Could someone estimate the influence/danger before going forward?
>

I planned to use as many of PostgreSQL's data structures unaltered as
possible. Storing one Tuple in multiple Items should not pose too much
danger to bufmgr and smgr unless they access tuple internals. (I haven't
checked that yet.) This would mean that on-disk Items no longer
correspond to Tuples. (Several of them might form one tuple.)

I dropped the plan of unformatted pages very soon. But the issue of
in-memory tuple storage remains (I don't know the internals of
allocating/freeing yet).

Christof
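
The "one Tuple in multiple Items" idea can be sketched as a toy model. This is Python for brevity, not PostgreSQL's C, and it shows nothing of how the backend actually stores items. Everything named here is an assumption made for illustration: the 6-byte pointer encoding (32-bit block number plus 16-bit item offset), the one-byte `FLAG_CONTINUED` marker, and the helper names `split_tuple` and `read_tuple`. The flag is one possible answer to the question from the first message of how to tell a full-size last chunk from a continued one.

```python
import struct

# Assumed encoding of a continuation address: 32-bit block number plus
# 16-bit item offset, 6 bytes in total (hypothetical, for illustration).
PTR_FMT = "<IH"
PTR_SIZE = struct.calcsize(PTR_FMT)
FLAG_CONTINUED = 0x01  # hypothetical "more chunks follow" marker


def split_tuple(data, max_item, addresses):
    """Split data into items of at most max_item bytes each.

    addresses lists the (block, offset) slots to use, in chain order.
    Returns a dict mapping each used slot to its item bytes.
    """
    if max_item <= 1 + PTR_SIZE:
        raise ValueError("max_item too small for flag, data and pointer")
    store = {}
    pos = 0
    for i, addr in enumerate(addresses):
        if len(data) - pos <= max_item - 1:
            # Final chunk: flag byte plus the remaining data, no pointer.
            store[addr] = bytes([0]) + data[pos:]
            return store
        if i + 1 == len(addresses):
            break  # more data left, but no slot for the next chunk
        # Continued chunk: flag, data, then the address of the next chunk
        # stored as the last bytes of the item.
        take = max_item - 1 - PTR_SIZE
        pointer = struct.pack(PTR_FMT, *addresses[i + 1])
        store[addr] = bytes([FLAG_CONTINUED]) + data[pos:pos + take] + pointer
        pos += take
    raise ValueError("not enough slots for this tuple")


def read_tuple(store, head):
    """Reassemble a tuple by following continuation pointers from head."""
    out = bytearray()
    addr = head
    while True:
        item = store[addr]
        if item[0] & FLAG_CONTINUED:
            out += item[1:-PTR_SIZE]
            addr = struct.unpack(PTR_FMT, item[-PTR_SIZE:])
        else:
            out += item[1:]
            return bytes(out)


if __name__ == "__main__":
    slots = [(10, 1), (11, 1), (12, 1), (13, 1)]
    payload = bytes(range(256)) * 100
    stored = split_tuple(payload, 8000, slots)
    print(len(stored), read_tuple(stored, slots[0]) == payload)
```

Only the head item would be referenced from outside; the continuation items are reachable solely through the pointer in their predecessor, which is why the later messages worry about vacuum.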
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Bruce Momjian <pgman@candle.pha.pa.us> — 1999-12-14T01:59:38Z
Thanks. Seems like Jan is going to be doing this.

> Hello,
>
> I'll donate some (read all freely available) of my spare time to
> implementing tuple chaining.
[...]

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
-
RE: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Hiroshi Inoue <inoue@tpf.co.jp> — 1999-12-14T08:58:57Z
> -----Original Message-----
> From: christof@to.wtal.de [mailto:christof@to.wtal.de]On Behalf Of
> Christof Petig
>
> Hiroshi Inoue wrote:
>
> > Will you put a long tuple into a long logical page (continued across
> > multiple physical pages)?
> > I'm suspicious of an approach that allows non-page-formatted pages.
> >
> > Anyway, it would need a big change around bufmgr/smgr etc.
> > Could someone estimate the influence/danger before going forward?
> >
>
> I planned to use as many of PostgreSQL's data structures unaltered as
> possible. Storing one Tuple in multiple Items should not pose too much
> danger to bufmgr and smgr unless they access tuple internals. (I haven't
> checked that yet.) This would mean that on-disk Items no longer
> correspond to Tuples. (Several of them might form one tuple.)
>

Hmm, we have discussed LONG.
The LONG change is transparent to users and would mostly resolve
the big tuple problem.
I doubt that tuple chaining is worth the work now.

At least a consensus is needed before going ahead, I think.
A bad design would only introduce confusion.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Bruce Momjian <pgman@candle.pha.pa.us> — 1999-12-14T16:25:40Z
> > I planned to use as many of PostgreSQL's data structures unaltered as
> > possible. Storing one Tuple in multiple Items should not pose too much
> > danger to bufmgr and smgr unless they access tuple internals. (I haven't
> > checked that yet.) This would mean that on-disk Items no longer
> > correspond to Tuples. (Several of them might form one tuple.)
> >
>
> Hmm, we have discussed LONG.
> The LONG change is transparent to users and would mostly resolve
> the big tuple problem.
> I doubt that tuple chaining is worth the work now.
>
> At least a consensus is needed before going ahead, I think.
> A bad design would only introduce confusion.

Agreed.
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Jan Wieck <wieck@debis.com> — 1999-12-14T18:45:11Z
Bruce Momjian wrote:

> > > I planned to use as many of PostgreSQL's data structures unaltered as
> > > possible. Storing one Tuple in multiple Items should not pose too much
> > > danger to bufmgr and smgr unless they access tuple internals. (I haven't
> > > checked that yet.) This would mean that on-disk Items no longer
> > > correspond to Tuples. (Several of them might form one tuple.)
> > >
> >
> > Hmm, we have discussed LONG.
> > The LONG change is transparent to users and would mostly resolve
> > the big tuple problem.
> > I doubt that tuple chaining is worth the work now.
> >
> > At least a consensus is needed before going ahead, I think.
> > A bad design would only introduce confusion.
>
> Agreed.

    Me too.

    I think that only a combination of LONG attributes and split
    tuples will be a complete solution.

    What I'm worried about is making the segments of a large
    tuple specialized things in the main table. The reliability
    of vacuum is one of the most important things for any system
    in production. While the general operation of vacuum seems to
    be well known, its requirements for the atomicity of some
    actions appear to be less so. The more chunks a tuple consists
    of, the more likely an abort of vacuum in the middle of moving
    them becomes. So keeping the links of chained tuples intact in
    a fail-safe way is IMHO an issue that is a little
    underestimated in this discussion.

    Maybe we can split tuples in another way; I must think about
    it for another hour - 'til later.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
-
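
Jan's worry about an abort in the middle of moving a chained tuple can be made concrete with a toy in-memory model. It is Python, not PostgreSQL code, and it is not a description of how vacuum actually works. The chain layout (`address -> (payload, next_address)`), the three primitive steps and all function names are assumptions for illustration. The sketch only shows that the order of "copy", "repoint the predecessor" and "free the old slot" decides whether a crash can leave a dangling link. It assumes each step is atomic by itself, which is exactly what is hard to guarantee on disk.

```python
from itertools import islice


def chain_addresses(store, head):
    """Follow next-pointers from head; return None if a link dangles."""
    seen = []
    addr = head
    while addr is not None:
        if addr not in store:
            return None
        seen.append(addr)
        addr = store[addr][1]
    return seen


def relocate(store, head, new_addrs, free_before_repoint):
    """Move each continuation chunk to a new slot, one step per yield."""
    pred = head
    for new in new_addrs:
        old = store[pred][1]
        if old is None:
            return
        store[new] = store[old]                  # step: copy the chunk
        yield
        if free_before_repoint:
            del store[old]                       # step: free old slot first
            yield
            store[pred] = (store[pred][0], new)  # step: repoint predecessor
            yield
        else:
            store[pred] = (store[pred][0], new)  # step: repoint predecessor
            yield
            del store[old]                       # step: free old slot last
            yield
        pred = new


def survives_crash(store, head, new_addrs, steps, free_before_repoint):
    """Run only the first `steps` primitive steps, then check the chain."""
    work = dict(store)  # values are immutable tuples, a shallow copy suffices
    mover = relocate(work, head, new_addrs, free_before_repoint)
    for _ in islice(mover, steps):
        pass
    return chain_addresses(work, head) is not None


if __name__ == "__main__":
    chain = {(1, 1): (b"a", (2, 1)), (2, 1): (b"b", (3, 1)), (3, 1): (b"c", None)}
    for n in range(7):
        print(n,
              survives_crash(chain, (1, 1), [(7, 1), (8, 1)], n, True),
              survives_crash(chain, (1, 1), [(7, 1), (8, 1)], n, False))
```

In this model the "free last" ordering leaves at worst an orphaned copy, while the "free first" ordering has windows in which the head points at a freed slot. Every extra chunk adds more such windows, which matches the point that longer chains make an interrupted move more likely to matter.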
RE: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Hiroshi Inoue <inoue@tpf.co.jp> — 1999-12-15T02:43:51Z
> -----Original Message-----
> From: Jan Wieck [mailto:wieck@debis.com]
> Sent: Wednesday, December 15, 1999 3:45 AM
>
> Bruce Momjian wrote:
>
> > > > I planned to use as many of PostgreSQL's data structures unaltered as
> > > > possible. Storing one Tuple in multiple Items should not pose too much
> > > > danger to bufmgr and smgr unless they access tuple internals. (I haven't
> > > > checked that yet.) This would mean that on-disk Items no longer
> > > > correspond to Tuples. (Several of them might form one tuple.)
> > > >
> > >
> > > Hmm, we have discussed LONG.
> > > The LONG change is transparent to users and would mostly resolve
> > > the big tuple problem.
> > > I doubt that tuple chaining is worth the work now.
> > >
> > > At least a consensus is needed before going ahead, I think.
> > > A bad design would only introduce confusion.
> >
> > Agreed.
>
>     Me too.
>
>     I think that only a combination of LONG attributes and split
>     tuples will be a complete solution.
>
>     What I'm worried about is making the segments of a large
>     tuple specialized things in the main table. The reliability
>     of vacuum is one of the most important things for any system
>     in production. While the general operation of vacuum seems to
>     be well known, its requirements for the atomicity of some
>     actions appear to be less so. The more chunks a tuple consists
>     of, the more likely an abort of vacuum in the middle of moving
>     them becomes. So keeping the links of chained tuples intact in
>     a fail-safe way is IMHO an issue that is a little
>     underestimated in this discussion.
>

There is another, related problem.
Vacuum can hardly move big tuples if some tuples on each page
are long-lived. We have to move a long tuple all at once, but there
won't be many clean pages. In some cases vacuum probably couldn't
move even an 8K tuple.
The problem already exists, more or less.
But it seems very difficult to solve without giving up on
preserving consistency in case of a crash.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
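
The free-space side of this problem can be shown with a small calculation, sketched here in Python under stated assumptions. The 8192-byte page size, the per-chunk overhead of 7 bytes and the function names are illustrative choices, not values taken from the PostgreSQL source. The sketch compares "is there one page with enough room for the whole tuple?" with "is there enough room in total if the tuple may be split?".

```python
PAGE_SIZE = 8192  # assumed page size in bytes


def fits_whole(free_per_page, size):
    """True if some single page has room for the entire tuple."""
    return any(free >= size for free in free_per_page)


def fits_chunked(free_per_page, size, chunk_overhead):
    """True if the tuple fits when split across pages.

    chunk_overhead is a hypothetical per-chunk cost (flag plus pointer).
    Charging it on every page makes the estimate slightly conservative.
    """
    usable = sum(max(free - chunk_overhead, 0) for free in free_per_page)
    return usable >= size


if __name__ == "__main__":
    # Ten pages, each still holding about 1200 bytes of long-lived tuples.
    free = [PAGE_SIZE - 1200] * 10
    print(fits_whole(free, 8000), fits_chunked(free, 8000, 7))
```

The chunked check only counts bytes. It says nothing about moving all chunks as one atomic action, which is the crash-consistency difficulty raised in the message above.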
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Bruce Momjian <pgman@candle.pha.pa.us> — 1999-12-15T02:52:14Z
Remember, chaining tuples had all sorts of performance, vacuum, code
handling, and UPDATE problems. They buy us very little, and almost
nothing if we have LONG tables.
-
Re: [HACKERS] Volunteer: Large Tuples / Tuple chaining
Christof Petig <christof.petig@wtal.de> — 1999-12-15T08:27:39Z
Bruce Momjian wrote:

> Remember, chaining tuples had all sorts of performance, vacuum, code
> handling, and UPDATE problems. They buy us very little, and almost
> nothing if we have LONG tables.
>

I have already contacted Jan by private email. Since we share a country,
native language and time zone, this is also the most comfortable way.

I agree with the concerns you raised and will (most likely) start helping
Jan implement LONG.

I saw your LONG discussion only _after_ my original post, so this was a
strange coincidence. But I have been following it with interest.

Christof