Thread
-
Re: Speed up COPY FROM text/CSV parsing using SIMD
Nazir Bilal Yavuz <byavuz81@gmail.com> — 2025-12-18T07:35:44Z
Hi,

On Sat, 13 Dec 2025 at 02:09, Manni Wood <manni.wood@enterprisedb.com> wrote:
>
> Hello, Everyone!
>
> I have attached two files: 1) the shell script that Mark and I have been using to get our test results, and 2) a screenshot of a spreadsheet of my latest test results. (Please let me know if there's a different format than a screenshot that I could share my spreadsheet in.)
>
> I took greater care this time to compile all three variants of Postgres (master at bfb335df, master at bfb335df with v4.2 patches installed, master at bfb335df with v3 patches installed) with the same gcc optimization flags that would be used to build Postgres packages. To the best of my knowledge, the two gcc flags of greatest interest would be -g and -O2. I built all three variants of Postgres using meson like so:
>
> BRANCH=$(git branch --show-current)
> meson setup build --prefix=/home/mwood/compiled-pg-instances/${BRANCH} --buildtype=debugoptimized
>
> It occurred to me that end users care only about 1) wall clock time (is the speedup noticeable in "real time", or is it just technically faster / using less CPU?) and 2) Postgres binaries compiled with the same optimization level one would get when installing Postgres from packages like .deb or .rpm; in other words, will the user see speedups without having to manually compile Postgres?
>
> My interesting finding, on my laptop (ThinkPad P14s Gen 1 running Ubuntu 24.04.3), is different from Mark Wong's. On my laptop, using three Postgres installations all compiled with the -O2 optimization flag, I see speedups with the v4.2 patch except for a 2% slowdown with CSV with 1/3rd quotes. But with Nazir's proposed v3 patch, I see improvements across the board. So even for a text file with 1/3rd escape characters, and even with a CSV file with 1/3rd quotes, I see speedups of 11% and 26% respectively.
>
> The format of these test files originally comes from Ayoub Kazar's test scripts; all Mark and I have done in playing with them is make them much larger: 5,000,000 rows, based on the assumption that longer tests are better tests.
>
> I find my results interesting enough that I'd be curious to know if anybody else can reproduce them. It is very interesting that Mark's results are noticeably different from mine.

Thank you for sharing the benchmark script! I ran the benchmarks using your script with --buildtype=debugoptimized. My results are below:

master: 85ddcc2f4c
text, no special: 102294
text, 1/3 special: 108946
csv, no special: 121831
csv, 1/3 special: 140063

v3
text, no special: 88890 (13.1% speedup)
text, 1/3 special: 110463 (1.4% regression)
csv, no special: 89781 (26.3% speedup)
csv, 1/3 special: 147094 (5.0% regression)

v4.2
text, no special: 87785 (14.2% speedup)
text, 1/3 special: 127008 (16.6% regression)
csv, no special: 88093 (27.7% speedup)
csv, 1/3 special: 164487 (17.4% regression)

One thing I noticed is that your benchmark timings appear to have some variance. In my runs, I did not observe differences greater than one second between runs. It is possible that this variance is affecting your results. Before running the benchmarks, I use these commands [1] to improve result stability; they might be helpful if you are not already using something similar.

I ran this benchmark on my local machine; its specs are Intel i5 13600k, 32GB memory, and a SATA SSD.

[1]
sudo cpupower frequency-set --governor=performance
sudo cpupower idle-set -D 0 # disable idle
echo "1" | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo # Intel only

--
Regards,
Nazir Bilal Yavuz
Microsoft
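
For anyone who wants to recompute the percentages from the raw timings, here is a minimal sketch. It assumes the convention (master - patched) / master, with a positive value counted as a speedup; that convention is not stated in the message, but it reproduces the figures reported above. The names percent_change and describe are hypothetical helpers, not part of the benchmark script.

```python
# Timings copied from the results in this message (units as printed by the script).
MASTER = {
    "text, no special": 102294,
    "text, 1/3 special": 108946,
    "csv, no special": 121831,
    "csv, 1/3 special": 140063,
}
V3 = {
    "text, no special": 88890,
    "text, 1/3 special": 110463,
    "csv, no special": 89781,
    "csv, 1/3 special": 147094,
}
V4_2 = {
    "text, no special": 87785,
    "text, 1/3 special": 127008,
    "csv, no special": 88093,
    "csv, 1/3 special": 164487,
}


def percent_change(baseline, patched):
    """Relative change versus baseline; positive means the patched run was faster."""
    return (baseline - patched) / baseline * 100.0


def describe(baseline, patched):
    """Format the change the way the results list does, e.g. '13.1% speedup'."""
    change = percent_change(baseline, patched)
    kind = "speedup" if change >= 0 else "regression"
    return f"{abs(change):.1f}% {kind}"


if __name__ == "__main__":
    for name, results in (("v3", V3), ("v4.2", V4_2)):
        print(name)
        for case, timing in results.items():
            print(f"{case}: {timing} ({describe(MASTER[case], timing)})")
```

The same helper can be pointed at timings from another machine to compare against these numbers, provided the baseline and patched builds use the same commit and build type.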