Thread
-
Re: [Proposal] Adding TRIM_SPACE option to COPY
Tatsuya Kawata <kawatatatsuya0913@gmail.com> — 2025-11-26T16:00:41Z
Hi all, Thank you for the feedback and for taking the time to review this patch. ## Performance Measurement Results > I also agree that this feature probably won't add noticeable overhead to > COPY when it isn't used, but it would still be good to measure > the performance impact of the patch. I conducted simple performance measurements with 5 million rows to assess the overhead when TRIM_SPACE is not used: **BEFORE patch (master branch):** Time: 5565.584 ms (00:05.566) Time: 5626.593 ms (00:05.627) **AFTER patch (TRIM_SPACE disabled):** Time: 5840.472 ms (00:05.840) Time: 5523.806 ms (00:05.524) The overhead appears to be approximately 1.5%, which seems to be within an acceptable range for typical COPY operations. ## Regarding COPY PROGRAM as an Alternative I understand Tom's concern about adding features to core that could be implemented externally. > As for TRIM_SPACE, it seems possible to implement it as an external module > and call it via COPY PROGRAM. Is this true? You're right that TRIM_SPACE can technically be achieved using COPY PROGRAM with an external script. I'd like to share some observations about this approach: **Example implementation with COPY PROGRAM:** ```python #!/usr/bin/env python3 import sys import csv reader = csv.reader(sys.stdin) writer = csv.writer(sys.stdout) for row in reader: trimmed_row = [field.strip() for field in row] writer.writerow(trimmed_row) ``` ```sql COPY users FROM PROGRAM 'python3 /tmp/trim_csv.py' WITH (FORMAT csv, HEADER true); ``` **Some considerations:** While the COPY PROGRAM approach works, it does have some practical limitations: it requires elevated privileges (`pg_execute_server_program`), may not be available on all platforms (particularly Windows), and requires users to handle parsing correctly. That said, I understand the concern about where to draw the line for features in core, especially given that options like FORCE_NULL follow a similar pattern. ## My Thoughts > So the question is which ones belong in core. On second thought, > perhaps features that are difficult or impossible to implement outside the core > are the ones that can be considered for COPY itself. Otherwise, it might be > better to avoid expanding COPY unnecessarily. Anyway I'd like to hear more > opinons on this. I can see the validity of the concern about feature creep. I'm open to the community's perspective on whether this belongs in core. If the general feeling is that this functionality should remain external, I'm happy to withdraw the patch. Thank you again for your time and valuable feedback. This discussion has been very educational for me. Best regards, Tatsuya Kawata