Thread

  1. Re: Speed up COPY FROM text/CSV parsing using SIMD

    Manni Wood <manni.wood@enterprisedb.com> — 2025-12-29T17:03:17Z

    On Wed, Dec 24, 2025 at 9:08 AM KAZAR Ayoub <ma_kazar@esi.dz> wrote:
    
    > Hello,
    > Following the same path of optimizing COPY FROM using SIMD, I found that
    > COPY TO can also benefit from this.
    >
    > I attached a small patch that uses SIMD to skip over data until the first
    > special character is found, then falls back to scalar processing for that
    > character and re-enters the SIMD path again...
    > There are two ways to do this:
    > 1) Essentially we do SIMD until we find a special character, then continue
    > on the scalar path without re-entering SIMD again.
    > - This gives speedups of 10% to 30%, depending on the proportion of special
    > characters in the attribute. We don't lose anything here, since it advances
    > with SIMD until it can't (using the previous scripts: 1/3, 2/3 special
    > chars).
    >
    > 2) Use the SIMD path, switch to the scalar path when we hit a special
    > character, and keep re-entering the SIMD path each time.
    > - This is equivalent to the COPY FROM story: we'll need to find the same
    > heuristic to use for both COPY FROM/TO to reduce the regressions (same
    > regressions: around 20% to 30% with 1/3, 2/3 special chars).
    >
    > Something else to note is that the scalar path for COPY TO isn't as heavy
    > as the state machine in COPY FROM.
    >
    > So if we find the sweet spot for the heuristic, doing the same for COPY TO
    > will be trivial and always beneficial.
    > Attached is 0004 which is option 1 (SIMD without re-entering), 0005 is the
    > second one.
    >
    >
    > Regards,
    > Ayoub
    >
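
For readers following the thread, option 1 above (SIMD until the first special character, then scalar to the end of the attribute with no re-entry) might look roughly like the sketch below. This is not the code from 0004 or 0005: the names `is_special`, `plain_prefix_len` and `escape_text_attr` are hypothetical, the escaping rules are a simplified stand-in for the real text-format rules, and the SSE2 path is only compiled on GCC/Clang targets that define `__SSE2__` (a plain scalar loop is used otherwise).

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__) && defined(__GNUC__)
#include <emmintrin.h>
#endif

/* Assumption: "special" means a control byte (< 0x20), a backslash,
 * or the delimiter. The real COPY TO rules are more detailed. */
int is_special(unsigned char c, char delim)
{
    return c < 0x20 || c == '\\' || c == (unsigned char) delim;
}

/* Length of the leading run of bytes that need no escaping. */
size_t plain_prefix_len(const char *s, size_t len, char delim)
{
    size_t i = 0;
#if defined(__SSE2__) && defined(__GNUC__)
    const __m128i v_bslash = _mm_set1_epi8('\\');
    const __m128i v_delim = _mm_set1_epi8(delim);
    const __m128i v_ctl = _mm_set1_epi8(0x1F);

    for (; i + 16 <= len; i += 16)
    {
        __m128i chunk = _mm_loadu_si128((const __m128i *) (s + i));
        /* unsigned min(chunk, 0x1F) == chunk  <=>  chunk <= 0x1F */
        __m128i ctl = _mm_cmpeq_epi8(_mm_min_epu8(chunk, v_ctl), chunk);
        __m128i hit = _mm_or_si128(ctl,
                        _mm_or_si128(_mm_cmpeq_epi8(chunk, v_bslash),
                                     _mm_cmpeq_epi8(chunk, v_delim)));
        int mask = _mm_movemask_epi8(hit);

        if (mask != 0)      /* lowest set bit = first special byte */
            return i + (size_t) __builtin_ctz((unsigned) mask);
    }
#endif
    /* Scalar tail (or the whole input when SSE2 is unavailable). */
    for (; i < len; i++)
        if (is_special((unsigned char) s[i], delim))
            break;
    return i;
}

/* Option 1: bulk-copy the plain prefix once, then stay on the scalar
 * path. dst must have room for 2 * len bytes. Returns bytes written. */
size_t escape_text_attr(const char *src, size_t len, char delim, char *dst)
{
    size_t n = plain_prefix_len(src, len, delim);
    size_t out = n;

    memcpy(dst, src, n);
    for (size_t i = n; i < len; i++)
    {
        char c = src[i];

        if (c == '\n')      { dst[out++] = '\\'; dst[out++] = 'n'; }
        else if (c == '\r') { dst[out++] = '\\'; dst[out++] = 'r'; }
        else if (c == '\t') { dst[out++] = '\\'; dst[out++] = 't'; }
        else
        {
            if (c == '\\' || c == delim)
                dst[out++] = '\\';
            dst[out++] = c;
        }
    }
    return out;
}
```

Option 2 would differ only in calling something like `plain_prefix_len` again after each escaped byte, which is where a heuristic for when to re-enter the SIMD path would come in.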
    
    Hello, Nazir and Ayoub!
    
    Nazir, sorry for the late reply, I am on holiday. :-) I wanted to thank you
    for the tips on using cpupower to get less variance in my test results.
    
    Ayoub, I suppose it was inevitable the SIMD patch would work for copying
    out as well as copying in!
    
    I am back at work on 5 Jan 2026, so I will try to carve out time to test
    this then, using Nazir's tips.
    
    Happy Holidays!
    
    -Manni
    -- 
    -- Manni Wood EDB: https://www.enterprisedb.com