Re: [bug-gawk] Interpretation of escape sequences in variable content
From: Aharon Robbins
Subject: Re: [bug-gawk] Interpretation of escape sequences in variable content
Date: Wed, 18 Nov 2015 06:28:16 +0200
User-agent: Heirloom mailx 12.5 6/20/10
Hello.
> Date: Tue, 17 Nov 2015 18:25:14 +0100
> From: Steffen Nurpmeso <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] Interpretation of escape sequences in variable content
> Status: R
>
> Hello!
>
> Long story: after switching my machine i regularly work with
> GNU/Linux again (first time since shortly after Debian Woody) and
> finally got frustrated about a dramatic performance impact that
> can be seen when running the configuration script of the MUA
> i maintain. I found out that reading a file of 432 lines from
> redirected input with a loop like
>
> while read line; do
> line="`echo ${line} |\
> ${sed} -e '/^[ ]*#/d' -e '/^$/d' -e 's/[ ]*$//'`"
> [ -z "${line}" ] && continue
>
> took over 14 seconds in LC_ALL=C environment. This is GNU sed.
Something's really wrong.
> Now that frustration was finally big enough i've switched all that
> to awk(1), changing the loop to
>
> while read line; do
> line=`${awk} -v LINE="${line}" 'BEGIN{
> gsub(/^[[:space:]]+/, "", LINE);\
> gsub(/[[:space:]]+$/, "", LINE);\
> if(index(LINE, "#") == 1)\
> LINE = "";\
> print LINE
> }'`
> [ -z "${line}" ] && continue
>
> That is faster. :P With nawk it is once again >~40% faster,
> so i could halve the time to 7.7 seconds for me. (mawk is
> comparable.)
>
> But now back to GNU awk :-).
> The problem manifests in a message like
>
> gawk: warning: escape sequence `\$' treated as plain `$'
>
> that happens to happen if LINE a.k.a. ${line} looks like
>
> #@ To embed a shell variable unexpanded, use two: "XY=\${HOME}".
>
> This however also happens with --traditional, and i think that
> should not happen, that seems to be a bug. And neither nawk nor
> mawk complain.
Gawk is better than nawk and mawk. :-) The warning became permanent
when I myself used a bad escape sequence.
The warning isn't going to go away, although perhaps (perhaps!) it is
a minor bug to issue it for command line assignments.
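For reference, the warning comes from the escape-sequence processing that awk applies to -v assignments (and to var=value operands); values read from ENVIRON are taken verbatim. A minimal sketch of passing the line through the environment instead, assuming the same cleanup as the quoted BEGIN block; the helper name strip_line is hypothetical:

```shell
# Hypothetical helper: same cleanup as the BEGIN block in the quoted loop,
# but the text arrives via the environment rather than -v, so awk does
# no escape processing on it.
strip_line() {
	LINE="$1" awk 'BEGIN {
		s = ENVIRON["LINE"]
		gsub(/^[[:space:]]+/, "", s)   # strip leading whitespace
		gsub(/[[:space:]]+$/, "", s)   # strip trailing whitespace
		if (index(s, "#") == 1)        # comment line: emit nothing useful
			s = ""
		print s
	}'
}

strip_line '  XY=\${HOME}  '    # backslash is preserved, no warning
```

This still starts one awk process per line, so it only addresses the warning, not the run time.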
You have a few options.
1. Switch to nawk or mawk
2. Use awk '...' 2> /dev/null to throw the warning away
3. Recommended: Let awk parse the whole file for you instead of using
the shell to read it one line at a time. That should speed up the
script substantially:
    awk '{
        gsub(/^[[:space:]]+/, "")
        gsub(/[[:space:]]+$/, "")
        if (index($0, "#") == 1)
            next
        if ($0 != "")
            print $0
    }' your-input-file
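If the rest of the loop body still has to run in the shell, the filter can feed the loop through a pipe, so awk is started once and not once per line. A sketch under that assumption; the function name clean_config and the sample input are hypothetical, and the printf in the loop stands in for the real per-line work:

```shell
# Hypothetical wrapper around the awk filter above; reads the named
# file(s), or standard input when called without arguments.
clean_config() {
	awk '{
		gsub(/^[[:space:]]+/, "")
		gsub(/[[:space:]]+$/, "")
		if (index($0, "#") == 1)
			next
		if ($0 != "")
			print $0
	}' "$@"
}

# Sample input: a comment, an empty line, and one real setting.
printf '%s\n' '  # a comment' '' '  XY=\${HOME}  ' |
clean_config |
while read -r line; do          # -r keeps backslashes in the data intact
	printf '[%s]\n' "$line"     # placeholder for the real per-line work
done
```

In many shells the loop at the end of a pipeline runs in a subshell, so variables set inside it are not visible afterwards; if that matters, write the filtered output to a temporary file and redirect the loop's input from it.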
HTH,
Arnold