Re: [bug-gawk] v4.1.3 (run on OSX 10.11.3): potential gsub() bug
From: Aharon Robbins
Subject: Re: [bug-gawk] v4.1.3 (run on OSX 10.11.3): potential gsub() bug
Date: Sun, 31 Jan 2016 21:21:27 +0200
User-agent: Heirloom mailx 12.5 6/20/10
Hi.
Thanks for the notes. The current code base should not dump core,
although I do see a core dump with stock 4.1.3.
The questions raised are messy. I don't have good answers. I think
that if you use [...] with real UTF-8 encoded characters as the
start and end points of the ranges, things will work OK. But I'm not
sure.
There is no intent to support things like \x10f7ff. If such a thing
works it's by accident and it won't last; the master branch was changed
to accept no more than two hex digits after \x.
I am not in a rush to add things like \uXXXX to gawk.
For now, you are probably best off avoiding things like [\x80-\xFF] in
Unicode locales, or using LC_ALL=C.
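As a rough sketch of the LC_ALL=C route: forcing the C locale makes gawk
treat its input as plain bytes, so a range like [\x80-\xFF] covers single
bytes rather than multibyte characters. The function name strip_high_bytes
and the sample input are made up for illustration, and this assumes gawk is
installed under that name.

```shell
# Hypothetical helper: delete every byte in the range 0x80-0xFF.
# LC_ALL=C applies only to this one gawk invocation, so the rest of
# the script keeps whatever locale it already had.
strip_high_bytes() {
    LC_ALL=C gawk '{ gsub(/[\x80-\xFF]/, ""); print }'
}

# "café" encoded as UTF-8: the final character is the two bytes
# 0xC3 0xA9 (octal 303 251), both of which fall inside the range.
printf 'caf\303\251\n' | strip_high_bytes
```

Note that this works on bytes, not characters: every byte of a multibyte
UTF-8 sequence is removed individually, which is fine for stripping
non-ASCII data but not for matching specific Unicode characters.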
Thanks,
Arnold
P.S. I'm curious what current GNU grep does with such things. Thanks.