Re: [bug-gawk] Interpretation of escape sequences in variable content

From:	Aharon Robbins
Subject:	Re: [bug-gawk] Interpretation of escape sequences in variable content
Date:	Wed, 18 Nov 2015 06:28:16 +0200
User-agent:	Heirloom mailx 12.5 6/20/10

Hello.

> Date: Tue, 17 Nov 2015 18:25:14 +0100
> From: Steffen Nurpmeso <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] Interpretation of escape sequences in variable content
> Status: R
>
> Hello!
>
> Long story: after switching my machine i regularly work with
>   GNU/Linux again (first time since shortly after Debian Woody) and
>   finally got frustrated about a dramatic performance impact that
>   can be seen when running the configuration script of the MUA
>   i maintain.  I found out that reading a file of 432 lines from
>   redirected input in a loop like
>
>     while read line; do
>        line="`echo ${line} |\
>              ${sed} -e '/^[         ]*#/d' -e '/^$/d' -e 's/[       ]*$//'`"
>        [ -z "${line}" ] && continue
>
>   took over 14 seconds in LC_ALL=C environment.  This is GNU sed.

Something's really wrong.

>   Now that frustration was finally big enough i've switched all that
>   to awk(1) changing the loop to
>
>     while read line; do
>        line=`${awk} -v LINE="${line}" 'BEGIN{
>           gsub(/^[[:space:]]+/, "", LINE);\
>           gsub(/[[:space:]]+$/, "", LINE);\
>           if(index(LINE, "#") == 1)\
>              LINE = "";\
>           print LINE
>        }'`
>        [ -z "${line}" ] && continue
>
>   That is faster.  :P  With nawk it is once again >~40%
>   faster, so i could halve the time to 7.7 seconds for me.  (mawk is
>   comparable.)
>
> But now back to GNU awk :-).
> The problem manifests in a message like
>
>   gawk: warning: escape sequence `\$' treated as plain `$'
>
> that happens to happen if LINE a.k.a. ${line} looks like
>
>   #@ To embed a shell variable unexpanded, use two: "XY=\${HOME}".
>
> This however also happens with --traditional, and i think that
> should not happen, that seems to be a bug.  And neither nawk nor
> mawk complain.

Gawk is better than nawk and mawk. :-)  The warning became permanent
when I myself used a bad escape sequence.

The warning isn't going to go away, although perhaps (perhaps!) it is
a minor bug to issue it for command line assignments.
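
For illustration only, here is a minimal sketch of why the value matters
at all: the text of a -v assignment goes through escape-sequence
processing, the same as a string constant in the program would. Gawk
issues the warning quoted above when it meets a sequence it does not
know, such as \$. The sample value and the ENVIRON variant are
assumptions of this sketch, not something proposed in this thread.
Values read from ENVIRON are not escape-processed.

```shell
# Hypothetical sample value containing a backslash sequence.
line='a\tb'

# A -v assignment is escape-processed: \t becomes a real tab.
via_v=$(awk -v LINE="$line" 'BEGIN { print LINE }')

# A value read from ENVIRON is taken verbatim: backslash and t survive.
via_env=$(LINE="$line" awk 'BEGIN { print ENVIRON["LINE"] }')

printf '%s\n' "$via_v" "$via_env"
```

Passing the line through the environment still starts one awk process
per input line, so it does nothing for the speed problem.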

You have a few options.

1. Switch to nawk or mawk

2. Use awk '...' 2> /dev/null to throw the warning away

3. Recommended: Let awk parse the whole file for you instead of using
the shell to read it one line at a time. That should speed up the
script substantially:

        awk '{
                gsub(/^[[:space:]]+/, "")
                gsub(/[[:space:]]+$/, "")
                if (index($0, "#") == 1)
                        next
                if ($0 != "")
                        print $0
        }' your-input-file
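
As a rough sketch of how option 3 could replace the shell loop: a
single awk process filters the file, and the shell reads only the lines
that survive. Input records are not escape-processed, so a line
containing \$ does not trigger the warning here. The temporary file,
its sample contents, and the loop body are assumptions made for
illustration.

```shell
# Hypothetical input file standing in for the real one.
input=$(mktemp)
printf '%s\n' '  # a comment' '' 'OPT_A=1   ' \
        '#@ use two: "XY=\${HOME}"' '   OPT_B=2' > "$input"

# One awk process trims and filters every line of the file.
kept=$(awk '{
        gsub(/^[[:space:]]+/, "")
        gsub(/[[:space:]]+$/, "")
        if (index($0, "#") == 1)
                next
        if ($0 != "")
                print $0
}' "$input")

# The shell loop now sees only non-empty, non-comment lines.
printf '%s\n' "$kept" | while IFS= read -r line; do
        printf 'got: %s\n' "$line"
done

rm -f "$input"
```

The [ -z "${line}" ] test from the original loop is no longer needed,
since awk has already dropped empty lines.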

HTH,

Arnold
