Re: PMA leak in functions
From: arnold
Subject: Re: PMA leak in functions
Date: Wed, 26 Apr 2023 09:37:51 -0600
User-agent: Heirloom mailx 12.5 7/5/10
Hello.
I ran your program with valgrind and that gave me a clue as to what's
happening.
The behavior you've found is an artifact of your program.
PROCINFO["identifiers"] is an array indexed by all the global identifiers
in the program indicating what they are (array, variable, function, etc.).
You have a million functions. PROCINFO["identifiers"] is being rebuilt each
time gawk runs, using up memory that isn't freed. If I delete the PROCINFO
array just before main() exits, heap file usage grows very slowly.
In a normal program, this wouldn't be a problem at all.
At the moment, I don't see a reason to make any code changes. If you
wish to test how gawk manages memory used by function calls, it can
be done with one or two recursive functions; there's no need for
a million functions that all do the same thing.
Thanks,
Arnold
"KochWilfried@t-online.de" <KochWilfried@t-online.de> wrote:
> -----Original Message-----
> Subject: PMA leak in functions
> Date: 2023-04-23T08:53:14+0200
> From: "KochWilfried@t-online.de" <KochWilfried@t-online.de>
> To: "bug-gawk@gnu.org" <bug-gawk@gnu.org>
>
> Configuration Information [Automatically generated, do not change]:
> Machine: x86_64
> OS: linux-gnu
> Compiler: gcc
> Compilation CFLAGS: -g -O2 -DNDEBUG
> uname output: Linux katan 6.2.0-20-generic #20-Ubuntu SMP PREEMPT_DYNAMIC
> Thu Apr 6 07:48:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
> Machine Type: x86_64-pc-linux-gnu
> Gawk Version: 5.2.1c
> Attestation 1:
> I have read https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
> Yes / No [ Choose one. If "No", then why haven't you? ]
> Attestation 2:
> I have not modified the sources before building gawk.
> True / False
> [ Choose one. If "False", then please explain what you did and why. ]
> Description:
> 5.2.1c doesn't have PMA leaks anymore on deleted arrays.
> Thank you for that.
> But there are PMA leaks on functions or their call stack.
>
> [Detailed description of the problem, suggestion, or complaint.]
> The bash script below (between the ####### lines) creates the awk program
> fun.awk with 1 million functions that call each other in a chain, each
> increasing the parameter i by 1.
> The expected output is 1000001.
> In non-PMA mode it cycles 50 times without problems.
> gawk -f fun.awk
> runs about 5 minutes, mostly busy with parsing.
>
>
> In PMA Mode the 8GB heap file is exhausted after 15 runs.
> rm heap.pma; truncate -s 8192000000 heap.pma
> (export GAWK_PERSIST_FILE=heap.pma && /usr/bin/time gawk -f fun.awk)
> for i in $(seq 1 1000); do
>   (export GAWK_PERSIST_FILE=heap.pma && echo $i &&
>    /usr/bin/time gawk 'BEGIN{print @fp_fun1000(1)}'); rc=$?
>   if [[ $rc != 0 ]]; then break; fi
> done
> 57.29user 1.94system 0:59.31elapsed 99%CPU (0avgtext+0avgdata
> 2102272maxresident)k
> 0inputs+6418328outputs (16427major+786025minor)pagefaults 0swaps
> 1
> 1000001
> 3.09user 1.54system 0:04.64elapsed 99%CPU (0avgtext+0avgdata
> 2518272maxresident)k
> 0inputs+2494448outputs (5758major+535259minor)pagefaults 0swaps
> 2
> 1000001
> 2.02user 0.69system 0:02.72elapsed 99%CPU (0avgtext+0avgdata
> 2518272maxresident)k
> 0inputs+2480464outputs (3067major+443710minor)pagefaults 0swaps
> 3
> 1000001
> 2.14user 0.67system 0:02.81elapsed 100%CPU (0avgtext+0avgdata
> 2528256maxresident)k
> 0inputs+782832outputs (3066major+446319minor)pagefaults 0swaps
> ...........
> 14
> 1000001
> 2.00user 0.78system 0:02.79elapsed 100%CPU (0avgtext+0avgdata
> 2529280maxresident)k
> 0inputs+2415624outputs (3065major+446339minor)pagefaults 0swaps
> 15
> gawk: fatal: node.c:1075:more_blocks: freep: cannot allocate 4800 bytes of
> memory: Invalid or incomplete multibyte or wide character
> Command exited with non-zero status 2
> 0.78user 0.39system 0:01.18elapsed 99%CPU (0avgtext+0avgdata
> 1897216maxresident)k
> 0inputs+141944outputs (559major+285563minor)pagefaults 0swaps
>
> #######
> export MX=1000000
> awk '
> BEGIN {
>     MX = ENVIRON["MX"]
>     fx = length(MX)
>     # Emit fun1 .. funMX; each adds 1 to i and calls the next one indirectly.
>     for (i = MX; i > 0; i--) {
>         print "function fun" i "(i){" > "fun.awk"
>         print " i=i+1" > "fun.awk"
>         if (i != MX) {
>             print " return @fp_fun" (i + 1) "(i)" > "fun.awk"
>         } else {
>             # The last function ends the chain.
>             print " return i" > "fun.awk"
>         }
>         print "}" > "fun.awk"
>     }
>     # Emit a BEGIN rule that stores each function name in a variable fp_funN.
>     print "BEGIN{" > "fun.awk"
>     for (i = MX; i > 0; i--) {
>         print "fp_fun" i "=\"fun" i "\"" > "fun.awk"
>     }
>     # In PMA mode only define things; otherwise call the chain in a loop.
>     print "if(\"GAWK_PERSIST_FILE\" in ENVIRON){" > "fun.awk"
>     print "}else{" > "fun.awk"
>     print " for(i=1;i<50;i++){" > "fun.awk"
>     print " print @fp_fun1(1)" > "fun.awk"
>     print " }" > "fun.awk"
>     print "}" > "fun.awk"
>     print "}" > "fun.awk"
>     exit(0)
> }
> '
> #######
>
> Repeat-By:
> [Describe the sequence of events that causes the problem to occur.]
> Fix:
> [Description of how to fix the problem. If you don't know a
> fix for the problem, don't include this section.]
>
>