Re: PATCH: jsonpath string methods: lower, upper, initcap, l/r/btrim, replace, split_part
From: "David E. Wheeler" <david@justatheory.com>
To: Chao Li <li.evan.chao@gmail.com>
Cc: Álvaro Herrera <alvherre@kurilemu.de>, Florents Tselai <florents.tselai@gmail.com>, Tom Lane <tgl@sss.pgh.pa.us>, Peter Eisentraut <peter@eisentraut.org>, Robert Haas <robertmhaas@gmail.com>, Alexander Korotkov <aekorotkov@gmail.com>, pgsql-hackers <pgsql-hackers@lists.postgresql.org>, Andrew Dunstan <andrew@dunslane.net>
Date: 2025-10-28T19:38:47Z
Lists: pgsql-hackers
Attachments
- v14-0001-Rename-jsonpath-method-arg-tokens.patch (application/octet-stream)
- v14-0002-Add-additional-jsonpath-string-methods.patch (application/octet-stream)
- (unnamed) (text/plain)
On Oct 22, 2025, at 22:43, Chao Li <li.evan.chao@gmail.com> wrote:
> I wonder if there is some consideration for the order? It feels like jpiStrLtrim and the following jpiStrXXX should be placed above jpiTimeXXX.
I wouldn’t think the order would matter.
> I know the “b” in “btrim” stands for “both”, but I am curious why the function that trims both sides is named “btrim()”. In most programming languages I am aware of, trim() is the choice.
This patch uses existing Postgres functions, of which btrim is one[1].
> ```
> + default:
> + ;
> + /* can't happen */
> + }
> ```
>
> As “default” clause has a comment “can’t happen”, I believe “break” is missing in the case clause.
>
> Also, do we want to add an assert in default to report a message in case it happens?
Good call, will change.
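For illustration only, here is a minimal standalone sketch of the shape being discussed: every case ends with an explicit `break`, and the `default` branch asserts rather than doing nothing. The enum and function names (`StrOp`, `str_op_name`) are hypothetical and not taken from the patch, and plain C `assert` stands in for whatever assertion mechanism the real code would use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few jsonpath string item types. */
typedef enum
{
    OP_LOWER,
    OP_UPPER,
    OP_INITCAP
} StrOp;

/*
 * Map an operation to its method name.  Each case ends with an explicit
 * break, and the default branch asserts instead of silently doing nothing.
 */
static const char *
str_op_name(StrOp op)
{
    const char *name = NULL;

    switch (op)
    {
        case OP_LOWER:
            name = "lower";
            break;
        case OP_UPPER:
            name = "upper";
            break;
        case OP_INITCAP:
            name = "initcap";
            break;
        default:
            /* can't happen */
            assert(0 && "unexpected string operation");
            break;
    }

    return name;
}
```

With assertions compiled out, an unexpected value would leave `name` as NULL, so a caller would still need to handle that case.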
> 6 - jsonpath_exec.c
> ```
> + resStr = TextDatumGetCString(DirectFunctionCall3Coll(replace_text,
> + C_COLLATION_OID,
> + CStringGetTextDatum(tmp),
> + CStringGetTextDatum(from_str),
> + CStringGetTextDatum(to_str)));
> ```
>
> For the trim functions, DEFAULT_COLLATION_OID is used. Why is C_COLLATION_OID used for replace and split_part? I don’t see anything mentioned in your changes to the doc (func-json.sgml).
Intuitively that makes sense to me. Tests pass if I change it. Will update the patch.
> 7 - jsonpath_exec.c
> ```
> + if (!(jb = getScalar(jb, jbvString)))
> + RETURN_ERROR(ereport(ERROR,
> + (errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
> + errmsg("jsonpath item method .%s() can only be applied to a string",
> + jspOperationName(jsp->type)))));
> ```
>
> ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION seems wrong; this is a string function, not a date time function.
Yes. Maybe `ERRCODE_INVALID_PARAMETER_VALUE`? There’s also `ERRCODE_INVALID_JSON_TEXT`, but I think that’s about invalid bytes in a JSON string.
> The two nested “switch (jsp->type)” are quite redundant. We can pull up the second one, and simplify the first one, something like:
Well, they assign different values to `func`: ltrim, rtrim, btrim when there is no argument vs. ltrim1, rtrim1, btrim1 when there is an argument.
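To make the point concrete, here is a rough standalone sketch, not the patch's code, of how the choice depends on both the method and the presence of an argument. All names here (`TrimOp`, `TrimFn`, `pick_trim_fn`, and the `_WS`/`_CHARS` suffixes) are hypothetical placeholders for the two variants of each trim function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three trim item methods. */
typedef enum
{
    OP_LTRIM,
    OP_RTRIM,
    OP_BTRIM
} TrimOp;

/* Hypothetical identifiers for the function variant to call. */
typedef enum
{
    FN_LTRIM_WS,        /* no argument: trim whitespace */
    FN_RTRIM_WS,
    FN_BTRIM_WS,
    FN_LTRIM_CHARS,     /* argument given: trim those characters */
    FN_RTRIM_CHARS,
    FN_BTRIM_CHARS
} TrimFn;

/* Pick the variant from the method and whether an argument was supplied. */
static TrimFn
pick_trim_fn(TrimOp op, bool has_arg)
{
    switch (op)
    {
        case OP_LTRIM:
            return has_arg ? FN_LTRIM_CHARS : FN_LTRIM_WS;
        case OP_RTRIM:
            return has_arg ? FN_RTRIM_CHARS : FN_RTRIM_WS;
        case OP_BTRIM:
            return has_arg ? FN_BTRIM_CHARS : FN_BTRIM_WS;
    }

    /* can't happen */
    assert(0 && "unexpected trim operation");
    return FN_BTRIM_WS;
}
```

A single switch like this can cover both cases, but it still has to encode six distinct outcomes, which is why the two switches are not simply duplicates of each other.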
> 9 - jsonpath_exec.c
> ```
> + if (elem.type != jpiString)
> + elog(ERROR, "invalid jsonpath item type for .replace() from");
> +
> + from_str = jspGetString(&elem, &from_len);
> +
> + jspGetRightArg(jsp, &elem);
> + if (elem.type != jpiString)
> + elog(ERROR, "invalid jsonpath item type for .replace() to");
> ```
>
> In these two elog(), do we want to log the invalid type? As I see in the “default” clause, jsp->type is logged:
> ```
> + default:
> + elog(ERROR, "unsupported jsonpath item type: %d", jsp->type);
> ```
I think it’s going on precedents such as
```
if (elem.type != jpiNumeric)
    elog(ERROR, "invalid jsonpath item type for .decimal() precision");
```
And also the date time method execution:
```
(errcode(ERRCODE_INVALID_ARGUMENT_FOR_SQL_JSON_DATETIME_FUNCTION),
 errmsg("jsonpath item method .%s() can only be applied to a string",
        jspOperationName(jsp->type)))));
```
I see types mentioned only in the context of failed numeric conversions (ERRCODE_NON_NUMERIC_SQL_JSON_ITEM).
Updated patches attached.
Best,
David