Thread
-
Fix some ubsan/asan related issues
Tristan Partin <tristan@neon.tech> — 2024-01-30T15:57:24Z
Patch 1: Passing NULL as a second argument to memcpy breaks ubsan, and there didn't seem to be anything preventing that in the LogLogicalMessage() codepath. Here is a preventative measure in LogLogicalMessage() and an Assert() in CopyXLogRecordToWAL(). Patch 2: Support building with -Db_sanitize=address in Meson. Various executables are leaky which can cause the builds to fail and tests to fail, when we are fine leaking this memory. Personally, I am a big stickler for always freeing my memory in executables even if it gets released at program exit because it makes the leak sanitizer much more effective. However (!), I am not here to change the philosophy of memory management in one-off executables, so I have just silenced memory leaks in various executables for now. Patch 3: THIS DOESN'T WORK. DO NOT COMMIT. PROPOSING FOR DISCUSSION! In my effort to try to see if the test suite would pass with asan enabled, I ran into a max_stack_depth issue. I tried maxing it out (hence, the patch), but that still didn't remedy my issue. I tried to look on the list for any relevant emails, but nothing turned up. Maybe others have not attempted this before? Seems doubtful. Not entirely sure how to fix this issue. I personally find asan extremely effective, even more than valgrind, so it would be great if I could run Postgres with asan enabled to catch various stupid C issues I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't seem to leave enough stack space for Postgres. --- I would like to see patch 1 reviewed and committed. Patch 2 honestly doesn't matter unless asan support can be fixed. I can also add a patch that errors out the Meson build if asan support is requested. That way others don't spend time heading down a dead end. -- Tristan Partin Neon (https://neon.tech)
-
Re: Fix some ubsan/asan related issues
Tristan Partin <tristan@neon.tech> — 2024-01-30T15:59:25Z
Spend so much time writing out the email, I once again forget attachments...UGH. -- Tristan Partin Neon (https://neon.tech)
-
Re: Fix some ubsan/asan related issues
Heikki Linnakangas <hlinnaka@iki.fi> — 2024-01-30T20:05:28Z
On 30/01/2024 17:57, Tristan Partin wrote: > Patch 1: > > Passing NULL as a second argument to memcpy breaks ubsan, and there > didn't seem to be anything preventing that in the LogLogicalMessage() > codepath. Here is a preventative measure in LogLogicalMessage() and an > Assert() in CopyXLogRecordToWAL(). For the audience: We ran into this one with the neon extension. The solution is to call LogLogicalMessage("", 0, ...) instead of LogLogicalMessage(NULL, 0, ...). . But it's true that it's pointlessfor LogLogicalMessage to call XLogRegisterData() if the message is empty. If we do this, we should check for 'size == 0' rather than 'message == NULL'. > Patch 2: > > Support building with -Db_sanitize=address in Meson. Various executables > are leaky which can cause the builds to fail and tests to fail, when we > are fine leaking this memory. > > Personally, I am a big stickler for always freeing my memory in > executables even if it gets released at program exit because it makes > the leak sanitizer much more effective. However (!), I am not here to > change the philosophy of memory management in one-off executables, so > I have just silenced memory leaks in various executables for now. > > Patch 3: > > THIS DOESN'T WORK. DO NOT COMMIT. PROPOSING FOR DISCUSSION! > > In my effort to try to see if the test suite would pass with asan > enabled, I ran into a max_stack_depth issue. I tried maxing it out > (hence, the patch), but that still didn't remedy my issue. I tried to > look on the list for any relevant emails, but nothing turned up. Maybe > others have not attempted this before? Seems doubtful. > > Not entirely sure how to fix this issue. I personally find asan > extremely effective, even more than valgrind, so it would be great if > I could run Postgres with asan enabled to catch various stupid C issues > I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres > just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't > seem to leave enough stack space for Postgres. I'm a bit confused by these. We already run with sanitizer in the cirrus CI. What does this enable that we're not already doing? -- Heikki Linnakangas Neon (https://neon.tech) -
Re: Fix some ubsan/asan related issues
Andres Freund <andres@anarazel.de> — 2024-01-30T21:23:04Z
Hi, On 2024-01-30 22:05:28 +0200, Heikki Linnakangas wrote: > On 30/01/2024 17:57, Tristan Partin wrote: > > In my effort to try to see if the test suite would pass with asan > > enabled, I ran into a max_stack_depth issue. I tried maxing it out > > (hence, the patch), but that still didn't remedy my issue. I tried to > > look on the list for any relevant emails, but nothing turned up. Maybe > > others have not attempted this before? Seems doubtful. > > > > Not entirely sure how to fix this issue. I personally find asan > > extremely effective, even more than valgrind, so it would be great if > > I could run Postgres with asan enabled to catch various stupid C issues > > I might create. On my system, ulimit -a returns 8192 kbytes, so Postgres > > just lets me set 7680 kbytes as the max. Whatever asan is doing doesn't > > seem to leave enough stack space for Postgres. > > I'm a bit confused by these. We already run with sanitizer in the cirrus CI. > What does this enable that we're not already doing? The reason asan fails is that it uses a "shadow stack" to track stack variable lifetimes. These confuse our stack depth check. CI doesn't have the issue because the compiler doesn't yet enable the feature, locally I get around it by using ASAN_OPTIONS=detect_stack_use_after_return=0:... The checks are actually quite useful, so making our stack depth check work with asan would be worthwhile. I discussed this in a bit more in https://postgr.es/m/20231129193920.4vphw7dqxjzf5v5b%40awork3.anarazel.de Greetings, Andres Freund
-
Re: Fix some ubsan/asan related issues
Andres Freund <andres@anarazel.de> — 2024-01-30T21:58:12Z
Hi, On 2024-01-30 09:59:25 -0600, Tristan Partin wrote: > From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001 > From: Tristan Partin <tristan@neon.tech> > Date: Mon, 29 Jan 2024 18:03:39 -0600 > Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if > NULL > If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks > the API contract of memcpy in glibc. The two pointer arguments are > marked as nonnull, even in the event the amount to copy is 0 bytes. It seems pretty odd to call LogLogicalMessage() with a NULL argument. Why is that something useful? > From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001 > From: Tristan Partin <tristan@neon.tech> > Date: Wed, 24 Jan 2024 17:07:01 -0600 > Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address > > The ecpg is parser is extremely leaky, so we need to silence leak > detection. This does stuff beyond epcg... > +if get_option('b_sanitize').contains('address') > + cdata.set('USE_ADDRESS_SANITIZER', 1) > +endif > > ############################################################### > # NLS / Gettext > diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c > index ac409b0006..e18e716d9c 100644 > --- a/src/bin/initdb/initdb.c > +++ b/src/bin/initdb/initdb.c > @@ -338,6 +338,17 @@ do { \ > output_failed = true, output_errno = errno; \ > } while (0) > > +#ifdef USE_ADDRESS_SANITIZER When asan is used __SANITIZE_ADDRESS__ is defined, so we don't need to implement this ourselves. > +const char *__asan_default_options(void); > + > +const char *__asan_default_options(void) > +{ > + return "detect_leaks=0"; > +} > + > +#endif Wonder if we should move this into some static library and link it into all binaries that don't want leak detection? It doesn't seem great to have to adjust this in a bunch of files if we want to adjust the options... Greetings, Andres Freund -
Re: Fix some ubsan/asan related issues
Alexander Lakhin <exclusion@gmail.com> — 2024-01-31T04:00:00Z
Hello, 30.01.2024 18:57, Tristan Partin wrote: > Patch 1: > > Passing NULL as a second argument to memcpy breaks ubsan, ... Maybe you would like to fix also one more similar place, reached with: create extension xml2; select xslt_process('<x/>', $$<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="@*|node()"> </xsl:template> </xsl:stylesheet>$$); varlena.c:201:26: runtime error: null pointer passed as argument 2, which is declared to never be null There is also an issue with pg_bsd_indent, I stumble upon when doing `make check-world` with the sanitizers enabled: https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com 31.01.2024 00:23, Andres Freund wrote: > The reason asan fails is that it uses a "shadow stack" to track stack variable > lifetimes. These confuse our stack depth check. CI doesn't have the issue > because the compiler doesn't yet enable the feature, locally I get around it > by using ASAN_OPTIONS=detect_stack_use_after_return=0:... Even with detect_stack_use_after_return=0, clang-18's asan makes the test 012_subtransactions.pl fail: 2024-01-31 03:24:25.691 UTC [4112455] 012_subtransactions.pl LOG: statement: SELECT hs_subxids(201); 2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl ERROR: stack depth limit exceeded 2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl HINT: Increase the configuration parameter max_stack_depth (currently 2048kB), after ensuring the platform's stack depth limit is adequate. (All the other tests pass.) Though the same test passes when I use clang-16. Best regards, Alexander -
Re: Fix some ubsan/asan related issues
Tristan Partin <tristan@neon.tech> — 2024-01-31T18:16:52Z
On Tue Jan 30, 2024 at 10:00 PM CST, Alexander Lakhin wrote: > Hello, > > 30.01.2024 18:57, Tristan Partin wrote: > > Patch 1: > > > > Passing NULL as a second argument to memcpy breaks ubsan, ... > > Maybe you would like to fix also one more similar place, reached with: > create extension xml2; > select xslt_process('<x/>', > $$<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> > <xsl:template match="@*|node()"> > </xsl:template> > </xsl:stylesheet>$$); > > varlena.c:201:26: runtime error: null pointer passed as argument 2, which is declared to never be null > > There is also an issue with pg_bsd_indent, I stumble upon when doing > `make check-world` with the sanitizers enabled: > https://www.postgresql.org/message-id/591971ce-25c1-90f3-0526-5f54e3ebb32e%40gmail.com > > 31.01.2024 00:23, Andres Freund wrote: > > The reason asan fails is that it uses a "shadow stack" to track stack variable > > lifetimes. These confuse our stack depth check. CI doesn't have the issue > > because the compiler doesn't yet enable the feature, locally I get around it > > by using ASAN_OPTIONS=detect_stack_use_after_return=0:... > > Even with detect_stack_use_after_return=0, clang-18's asan makes the test > 012_subtransactions.pl fail: > 2024-01-31 03:24:25.691 UTC [4112455] 012_subtransactions.pl LOG: statement: SELECT hs_subxids(201); > 2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl ERROR: stack depth limit exceeded > 2024-01-31 03:24:25.714 UTC [4112455] 012_subtransactions.pl HINT: Increase the configuration parameter max_stack_depth > (currently 2048kB), after ensuring the platform's stack depth limit is adequate. > > (All the other tests pass.) > Though the same test passes when I use clang-16. Thanks Alexander! I will try and take a look at these. -- Tristan Partin Neon (https://neon.tech) -
Re: Fix some ubsan/asan related issues
Tristan Partin <tristan@neon.tech> — 2024-02-06T03:53:40Z
On Tue Jan 30, 2024 at 3:58 PM CST, Andres Freund wrote: > Hi, > > On 2024-01-30 09:59:25 -0600, Tristan Partin wrote: > > From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001 > > From: Tristan Partin <tristan@neon.tech> > > Date: Mon, 29 Jan 2024 18:03:39 -0600 > > Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if > > NULL > > > If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks > > the API contract of memcpy in glibc. The two pointer arguments are > > marked as nonnull, even in the event the amount to copy is 0 bytes. > > It seems pretty odd to call LogLogicalMessage() with a NULL argument. Why is > that something useful? Dropped. Will change on the Neon side. Should we add an assert somewhere for good measure? Near the memcpy in question? > > From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001 > > From: Tristan Partin <tristan@neon.tech> > > Date: Wed, 24 Jan 2024 17:07:01 -0600 > > Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address > > > > The ecpg is parser is extremely leaky, so we need to silence leak > > detection. > > This does stuff beyond epcg... Dropped. > > +if get_option('b_sanitize').contains('address') > > + cdata.set('USE_ADDRESS_SANITIZER', 1) > > +endif > > > > ############################################################### > > # NLS / Gettext > > diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c > > index ac409b0006..e18e716d9c 100644 > > --- a/src/bin/initdb/initdb.c > > +++ b/src/bin/initdb/initdb.c > > @@ -338,6 +338,17 @@ do { \ > > output_failed = true, output_errno = errno; \ > > } while (0) > > > > +#ifdef USE_ADDRESS_SANITIZER > > When asan is used __SANITIZE_ADDRESS__ is defined, so we don't need to > implement this ourselves. Thanks! > > +const char *__asan_default_options(void); > > + > > +const char *__asan_default_options(void) > > +{ > > + return "detect_leaks=0"; > > +} > > + > > +#endif > > Wonder if we should move this into some static library and link it into all > binaries that don't want leak detection? It doesn't seem great to have to > adjust this in a bunch of files if we want to adjust the options... See attached patches. Here is what I found to be necessary to get a -Db_sanitize=address,undefined build to successfully make it through tests. I do have a few concerns about the patch. 1. For whatever reason, __SANITIZE_LEAK__ is not defined when the leak sanitizer is enabled. So you will see me use this, to make some include directives work. I don't like this as a final solution because someone could use -fsanitize=leak. 2. I tracked down what seems to be a valid leak in adt/xml.c. Attached (test.sql) is a fairly minimal reproduction of what is needed to show the leak. I didn't spend too much time tracking it down. Might get to it later, who knows. Below you will find the backtrace, and whoever wants to try their hand at fixing it will need to comment out xmlNewNode in the leak.supp file. 3. I don't love the new library name. Maybe it should be name more lsan specific. 4. Should pg_attribute_no_asan be renamed to pg_attribute_no_sanitize_address? That would match pg_attribute_no_sanitize_alignment. I will also attach a Meson test log for good measure. I didn't try testing anything that requires PG_TEST_EXTRA, but I suspect that everything will be fine. Alexander, I haven't yet gotten to the things you pointed out in the sibling thread. ==221848==ERROR: LeakSanitizer: detected memory leaks Direct leak of 120 byte(s) in 1 object(s) allocated from: #0 0x7fac4a6d92ef in malloc (/lib64/libasan.so.8+0xd92ef) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c) #1 0x7fac4a48427d in xmlNewNode (/lib64/libxml2.so.2+0x5d27d) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) #2 0x22107a6 in xmltotext_with_options ../src/backend/utils/adt/xml.c:754 #3 0xead047 in ExecEvalXmlExpr ../src/backend/executor/execExprInterp.c:4020 #4 0xe8c119 in ExecInterpExpr ../src/backend/executor/execExprInterp.c:1537 #5 0xe91f2c in ExecInterpExprStillValid ../src/backend/executor/execExprInterp.c:1881 #6 0x109632d in ExecEvalExprSwitchContext ../src/include/executor/executor.h:355 #7 0x109655d in ExecProject ../src/include/executor/executor.h:389 #8 0x1097186 in ExecResult ../src/backend/executor/nodeResult.c:136 #9 0xf0f90c in ExecProcNodeFirst ../src/backend/executor/execProcnode.c:464 #10 0xec9bec in ExecProcNode ../src/include/executor/executor.h:273 #11 0xed875c in ExecutePlan ../src/backend/executor/execMain.c:1670 #12 0xecbee0 in standard_ExecutorRun ../src/backend/executor/execMain.c:365 #13 0xecb529 in ExecutorRun ../src/backend/executor/execMain.c:309 #14 0x1ae89f6 in PortalRunSelect ../src/backend/tcop/pquery.c:924 #15 0x1ae7c06 in PortalRun ../src/backend/tcop/pquery.c:768 #16 0x1ad1b43 in exec_simple_query ../src/backend/tcop/postgres.c:1273 #17 0x1adf8de in PostgresMain ../src/backend/tcop/postgres.c:4653 #18 0x170dbce in BackendRun ../src/backend/postmaster/postmaster.c:4464 #19 0x170bf70 in BackendStartup ../src/backend/postmaster/postmaster.c:4140 #20 0x170263f in ServerLoop ../src/backend/postmaster/postmaster.c:1776 #21 0x1701052 in PostmasterMain ../src/backend/postmaster/postmaster.c:1475 #22 0x11a5f67 in main ../src/backend/main/main.c:198 #23 0x7fac48e46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) #24 0x7fac48e4620a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) #25 0x49c5d4 in _start (/home/tristan957/Projects/work/postgresql/build/tmp_install/home/tristan957/.opt/postgresql/bin/postgres+0x49c5d4) (BuildId: c8ca341e1303be0f9dc0b0271c55c4b9e42af89b) Indirect leak of 13 byte(s) in 1 object(s) allocated from: #0 0x7fac4a6d92ef in malloc (/lib64/libasan.so.8+0xd92ef) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c) #1 0x7fac4a4e106f in xmlStrndup (/lib64/libxml2.so.2+0xba06f) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) #2 0x7fac4a4842c0 in xmlNewNode (/lib64/libxml2.so.2+0x5d2c0) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) #3 0x22107a6 in xmltotext_with_options ../src/backend/utils/adt/xml.c:754 #4 0xead047 in ExecEvalXmlExpr ../src/backend/executor/execExprInterp.c:4020 #5 0xe8c119 in ExecInterpExpr ../src/backend/executor/execExprInterp.c:1537 #6 0xe91f2c in ExecInterpExprStillValid ../src/backend/executor/execExprInterp.c:1881 #7 0x109632d in ExecEvalExprSwitchContext ../src/include/executor/executor.h:355 #8 0x109655d in ExecProject ../src/include/executor/executor.h:389 #9 0x1097186 in ExecResult ../src/backend/executor/nodeResult.c:136 #10 0xf0f90c in ExecProcNodeFirst ../src/backend/executor/execProcnode.c:464 #11 0xec9bec in ExecProcNode ../src/include/executor/executor.h:273 #12 0xed875c in ExecutePlan ../src/backend/executor/execMain.c:1670 #13 0xecbee0 in standard_ExecutorRun ../src/backend/executor/execMain.c:365 #14 0xecb529 in ExecutorRun ../src/backend/executor/execMain.c:309 #15 0x1ae89f6 in PortalRunSelect ../src/backend/tcop/pquery.c:924 #16 0x1ae7c06 in PortalRun ../src/backend/tcop/pquery.c:768 #17 0x1ad1b43 in exec_simple_query ../src/backend/tcop/postgres.c:1273 #18 0x1adf8de in PostgresMain ../src/backend/tcop/postgres.c:4653 #19 0x170dbce in BackendRun ../src/backend/postmaster/postmaster.c:4464 #20 0x170bf70 in BackendStartup ../src/backend/postmaster/postmaster.c:4140 #21 0x170263f in ServerLoop ../src/backend/postmaster/postmaster.c:1776 #22 0x1701052 in PostmasterMain ../src/backend/postmaster/postmaster.c:1475 #23 0x11a5f67 in main ../src/backend/main/main.c:198 #24 0x7fac48e46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) #25 0x7fac48e4620a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) #26 0x49c5d4 in _start (/home/tristan957/Projects/work/postgresql/build/tmp_install/home/tristan957/.opt/postgresql/bin/postgres+0x49c5d4) (BuildId: c8ca341e1303be0f9dc0b0271c55c4b9e42af89b) SUMMARY: AddressSanitizer: 133 byte(s) leaked in 2 allocation(s). -- Tristan Partin Neon (https://neon.tech) -
Re: Fix some ubsan/asan related issues
Junwang Zhao <zhjwpku@gmail.com> — 2024-09-16T13:29:21Z
Hi Tristan, On Tue, Feb 6, 2024 at 11:53 AM Tristan Partin <tristan@neon.tech> wrote: > > On Tue Jan 30, 2024 at 3:58 PM CST, Andres Freund wrote: > > Hi, > > > > On 2024-01-30 09:59:25 -0600, Tristan Partin wrote: > > > From 331cec1c9db6ff60dcc6d9ba62a9c8be4e5e95ed Mon Sep 17 00:00:00 2001 > > > From: Tristan Partin <tristan@neon.tech> > > > Date: Mon, 29 Jan 2024 18:03:39 -0600 > > > Subject: [PATCH v1 1/3] Refuse to register message in LogLogicalMessage if > > > NULL > > > > > If this occurs, the memcpy of rdata_data in CopyXLogRecordToWAL breaks > > > the API contract of memcpy in glibc. The two pointer arguments are > > > marked as nonnull, even in the event the amount to copy is 0 bytes. > > > > It seems pretty odd to call LogLogicalMessage() with a NULL argument. Why is > > that something useful? > > Dropped. Will change on the Neon side. Should we add an assert > somewhere for good measure? Near the memcpy in question? > > > > From dc9488f3fdee69b981b52c985fb77106d7d301ff Mon Sep 17 00:00:00 2001 > > > From: Tristan Partin <tristan@neon.tech> > > > Date: Wed, 24 Jan 2024 17:07:01 -0600 > > > Subject: [PATCH v1 2/3] meson: Support compiling with -Db_sanitize=address > > > > > > The ecpg is parser is extremely leaky, so we need to silence leak > > > detection. > > > > This does stuff beyond epcg... > > Dropped. > > > > +if get_option('b_sanitize').contains('address') > > > + cdata.set('USE_ADDRESS_SANITIZER', 1) > > > +endif > > > > > > ############################################################### > > > # NLS / Gettext > > > diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c > > > index ac409b0006..e18e716d9c 100644 > > > --- a/src/bin/initdb/initdb.c > > > +++ b/src/bin/initdb/initdb.c > > > @@ -338,6 +338,17 @@ do { \ > > > output_failed = true, output_errno = errno; \ > > > } while (0) > > > > > > +#ifdef USE_ADDRESS_SANITIZER > > > > When asan is used __SANITIZE_ADDRESS__ is defined, so we don't need to > > implement this ourselves. > > Thanks! > > > > +const char *__asan_default_options(void); > > > + > > > +const char *__asan_default_options(void) > > > +{ > > > + return "detect_leaks=0"; > > > +} > > > + > > > +#endif > > > > Wonder if we should move this into some static library and link it into all > > binaries that don't want leak detection? It doesn't seem great to have to > > adjust this in a bunch of files if we want to adjust the options... > > See attached patches. Here is what I found to be necessary to get > a -Db_sanitize=address,undefined build to successfully make it through > tests. I do have a few concerns about the patch. > > 1. For whatever reason, __SANITIZE_LEAK__ is not defined when the leak > sanitizer is enabled. So you will see me use this, to make some > include directives work. I don't like this as a final solution > because someone could use -fsanitize=leak. > 2. I tracked down what seems to be a valid leak in adt/xml.c. Attached > (test.sql) is a fairly minimal reproduction of what is needed to show > the leak. I didn't spend too much time tracking it down. Might get to > it later, who knows. Below you will find the backtrace, and whoever > wants to try their hand at fixing it will need to comment out > xmlNewNode in the leak.supp file. > 3. I don't love the new library name. Maybe it should be name more lsan > specific. > 4. Should pg_attribute_no_asan be renamed to > pg_attribute_no_sanitize_address? That would match > pg_attribute_no_sanitize_alignment. > > I will also attach a Meson test log for good measure. I didn't try > testing anything that requires PG_TEST_EXTRA, but I suspect that > everything will be fine. > > Alexander, I haven't yet gotten to the things you pointed out in the > sibling thread. > > ==221848==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 120 byte(s) in 1 object(s) allocated from: > #0 0x7fac4a6d92ef in malloc (/lib64/libasan.so.8+0xd92ef) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c) > #1 0x7fac4a48427d in xmlNewNode (/lib64/libxml2.so.2+0x5d27d) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) > #2 0x22107a6 in xmltotext_with_options ../src/backend/utils/adt/xml.c:754 > #3 0xead047 in ExecEvalXmlExpr ../src/backend/executor/execExprInterp.c:4020 > #4 0xe8c119 in ExecInterpExpr ../src/backend/executor/execExprInterp.c:1537 > #5 0xe91f2c in ExecInterpExprStillValid ../src/backend/executor/execExprInterp.c:1881 > #6 0x109632d in ExecEvalExprSwitchContext ../src/include/executor/executor.h:355 > #7 0x109655d in ExecProject ../src/include/executor/executor.h:389 > #8 0x1097186 in ExecResult ../src/backend/executor/nodeResult.c:136 > #9 0xf0f90c in ExecProcNodeFirst ../src/backend/executor/execProcnode.c:464 > #10 0xec9bec in ExecProcNode ../src/include/executor/executor.h:273 > #11 0xed875c in ExecutePlan ../src/backend/executor/execMain.c:1670 > #12 0xecbee0 in standard_ExecutorRun ../src/backend/executor/execMain.c:365 > #13 0xecb529 in ExecutorRun ../src/backend/executor/execMain.c:309 > #14 0x1ae89f6 in PortalRunSelect ../src/backend/tcop/pquery.c:924 > #15 0x1ae7c06 in PortalRun ../src/backend/tcop/pquery.c:768 > #16 0x1ad1b43 in exec_simple_query ../src/backend/tcop/postgres.c:1273 > #17 0x1adf8de in PostgresMain ../src/backend/tcop/postgres.c:4653 > #18 0x170dbce in BackendRun ../src/backend/postmaster/postmaster.c:4464 > #19 0x170bf70 in BackendStartup ../src/backend/postmaster/postmaster.c:4140 > #20 0x170263f in ServerLoop ../src/backend/postmaster/postmaster.c:1776 > #21 0x1701052 in PostmasterMain ../src/backend/postmaster/postmaster.c:1475 > #22 0x11a5f67 in main ../src/backend/main/main.c:198 > #23 0x7fac48e46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) > #24 0x7fac48e4620a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) > #25 0x49c5d4 in _start (/home/tristan957/Projects/work/postgresql/build/tmp_install/home/tristan957/.opt/postgresql/bin/postgres+0x49c5d4) (BuildId: c8ca341e1303be0f9dc0b0271c55c4b9e42af89b) > > Indirect leak of 13 byte(s) in 1 object(s) allocated from: > #0 0x7fac4a6d92ef in malloc (/lib64/libasan.so.8+0xd92ef) (BuildId: 7fcb7759bc17ef47f9682414b6d99732d6a6ab0c) > #1 0x7fac4a4e106f in xmlStrndup (/lib64/libxml2.so.2+0xba06f) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) > #2 0x7fac4a4842c0 in xmlNewNode (/lib64/libxml2.so.2+0x5d2c0) (BuildId: 3074681c8fa9b17e0cbed09bc61c25ada5c28e7c) > #3 0x22107a6 in xmltotext_with_options ../src/backend/utils/adt/xml.c:754 > #4 0xead047 in ExecEvalXmlExpr ../src/backend/executor/execExprInterp.c:4020 > #5 0xe8c119 in ExecInterpExpr ../src/backend/executor/execExprInterp.c:1537 > #6 0xe91f2c in ExecInterpExprStillValid ../src/backend/executor/execExprInterp.c:1881 > #7 0x109632d in ExecEvalExprSwitchContext ../src/include/executor/executor.h:355 > #8 0x109655d in ExecProject ../src/include/executor/executor.h:389 > #9 0x1097186 in ExecResult ../src/backend/executor/nodeResult.c:136 > #10 0xf0f90c in ExecProcNodeFirst ../src/backend/executor/execProcnode.c:464 > #11 0xec9bec in ExecProcNode ../src/include/executor/executor.h:273 > #12 0xed875c in ExecutePlan ../src/backend/executor/execMain.c:1670 > #13 0xecbee0 in standard_ExecutorRun ../src/backend/executor/execMain.c:365 > #14 0xecb529 in ExecutorRun ../src/backend/executor/execMain.c:309 > #15 0x1ae89f6 in PortalRunSelect ../src/backend/tcop/pquery.c:924 > #16 0x1ae7c06 in PortalRun ../src/backend/tcop/pquery.c:768 > #17 0x1ad1b43 in exec_simple_query ../src/backend/tcop/postgres.c:1273 > #18 0x1adf8de in PostgresMain ../src/backend/tcop/postgres.c:4653 > #19 0x170dbce in BackendRun ../src/backend/postmaster/postmaster.c:4464 > #20 0x170bf70 in BackendStartup ../src/backend/postmaster/postmaster.c:4140 > #21 0x170263f in ServerLoop ../src/backend/postmaster/postmaster.c:1776 > #22 0x1701052 in PostmasterMain ../src/backend/postmaster/postmaster.c:1475 > #23 0x11a5f67 in main ../src/backend/main/main.c:198 > #24 0x7fac48e46149 in __libc_start_call_main (/lib64/libc.so.6+0x28149) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) > #25 0x7fac48e4620a in __libc_start_main_impl (/lib64/libc.so.6+0x2820a) (BuildId: 7ea8d85df0e89b90c63ac7ed2b3578b2e7728756) > #26 0x49c5d4 in _start (/home/tristan957/Projects/work/postgresql/build/tmp_install/home/tristan957/.opt/postgresql/bin/postgres+0x49c5d4) (BuildId: c8ca341e1303be0f9dc0b0271c55c4b9e42af89b) > > SUMMARY: AddressSanitizer: 133 byte(s) leaked in 2 allocation(s). > > -- > Tristan Partin > Neon (https://neon.tech) I tried your v1-0002, it works at compile phase but failed to run initdb with the following leak detected: ================================================================= ==64983==ERROR: LeakSanitizer: detected memory leaks Direct leak of 248 byte(s) in 1 object(s) allocated from: #0 0x7fc7729df9cf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x55bff5480e8b in save_ps_display_args ../postgres/src/backend/utils/misc/ps_status.c:190 #2 0x55bff4a5a298 in main ../postgres/src/backend/main/main.c:90 #3 0x7fc771924249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 Indirect leak of 19 byte(s) in 1 object(s) allocated from: #0 0x7fc77299777b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x55bff5480f41 in save_ps_display_args ../postgres/src/backend/utils/misc/ps_status.c:198 #2 0x55bff4a5a298 in main ../postgres/src/backend/main/main.c:90 #3 0x7fc771924249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 I worked around by moving *new_environ* as a global variable, I think this is false positive report so maybe this deserves a patch? I tried to apply your v2 patch set but v2-0004 seems out of date. -- Regards Junwang Zhao -
Re: Fix some ubsan/asan related issues
Tristan Partin <tristan@partin.io> — 2024-10-02T04:02:27Z
On Mon Sep 16, 2024 at 9:29 AM EDT, Junwang Zhao wrote: > I tried your v1-0002, it works at compile phase but failed to run initdb > with the following leak detected: > > ================================================================= > ==64983==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 248 byte(s) in 1 object(s) allocated from: > #0 0x7fc7729df9cf in __interceptor_malloc > ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 > #1 0x55bff5480e8b in save_ps_display_args > ../postgres/src/backend/utils/misc/ps_status.c:190 > #2 0x55bff4a5a298 in main ../postgres/src/backend/main/main.c:90 > #3 0x7fc771924249 in __libc_start_call_main > ../sysdeps/nptl/libc_start_call_main.h:58 > > Indirect leak of 19 byte(s) in 1 object(s) allocated from: > #0 0x7fc77299777b in __interceptor_strdup > ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 > #1 0x55bff5480f41 in save_ps_display_args > ../postgres/src/backend/utils/misc/ps_status.c:198 > #2 0x55bff4a5a298 in main ../postgres/src/backend/main/main.c:90 > #3 0x7fc771924249 in __libc_start_call_main > ../sysdeps/nptl/libc_start_call_main.h:58 > > I worked around by moving *new_environ* as a global variable, I think this is > false positive report so maybe this deserves a patch? > > I tried to apply your v2 patch set but v2-0004 seems out of date. Thanks for giving it a try Junwang. I'll rebase this, hopefully tomorrow and give it another push. -- Tristan Partin Neon (https://neon.tech)