--- Begin Message ---
Subject: Re: case insensitivity (rather a wish than a bug)
Date: Sun, 8 Sep 2002 16:20:29 -0400
User-agent: Mutt/1.4i
Bastian --
...and then Bastian Fuchs said...
%
% Hello,
% it would be nice if there were an option to sort alphabetically WITHOUT
% regard to upper or lower case.
%
% This is the standard output of ls -l:
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 FIRST_EXAMPLE
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 SECOND_EXAMPLE
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 first_example
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 second_example
%
% It would be nice if, for example, ls -l --sort=caseinsensitivity would show:
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 FIRST_EXAMPLE
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 first_example
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 SECOND_EXAMPLE
% -rw-r--r-- 1 bastiaf users 0 Sep 8 01:36 second_example
As a matter of fact, since fileutils 4.1, that is the standard behavior
when your environment is not set to POSIX. Check out your
LANG
LC_CTYPE
LC_NUMERIC
LC_TIME
LC_COLLATE
LC_MONETARY
LC_MESSAGES
LC_PAPER
LC_NAME
LC_ADDRESS
LC_TELEPHONE
LC_MEASUREMENT
LC_IDENTIFICATION
LC_ALL
variables as well as the output of
locale
(which should spit all of that out) and
ls --version
to see that you're actually running 4.1 or later and we're not debugging
the wrong problem. The short form, pruned from how it was explained to
me just a few months ago, looks like:
... I am not pleased with Redhat and
Gnome. Those two groups together have caused a huge problem out of
this really simple and good feature. Both of them by default set
LANG=en_US ...
Whenever the discussion of localization and internationalization has
come up in the long history of the Internet and standards
organizations, the US software body has always had a black eye, since we
traditionally ignored anyone who did not speak English ...
... Eventually it was agreed upon only after very,
very, VERY much debate how i18n (internationalization) was to be
implemented. ...
Therefore everyone implements i18n support in their code by calling
strcoll(3) routines instead of strcmp(3) routines. The libc takes
care of everything for you. The traditional unix user sees no
difference at all. New variables never before seen in a unix
environment such as LANG and LC_ALL were implemented to switch on this
new feature but it would be switched on only if the user specifically
asked for it to be switched on. Full compatibility is preserved.
That is the way it should work.
Then along comes Redhat. They decide that users really want
dictionary sorting order and set LANG=en_US ...
... They don't change the bug reporting address and
so GNU gets the bug reports. ... I just wish the disgruntlement
was directed toward RH and not to GNU ...
Of course you are hosed if a vendor sets that variable for you. Then
you do have to know to clean it out of your environment first. Thank
you for knowing what is best for me. Phooey!
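To make the strcmp(3) versus strcoll(3) distinction above concrete, here is
a minimal Python sketch (not anything taken from fileutils itself). Python's
locale.strcoll wraps the C library routine. Since an en_US locale may not be
installed everywhere, the "folded" ordering below is only my stand-in for
dictionary order and is an assumption, not what the real collation tables do.

```python
import functools
import locale

# File names taken from the listing quoted earlier in this message.
names = ["first_example", "SECOND_EXAMPLE", "second_example", "FIRST_EXAMPLE"]

# strcmp(3)-style ordering: plain code point comparison, so every
# upper-case ASCII letter sorts before every lower-case one.
byte_order = sorted(names)

# strcoll(3)-style ordering: locale.strcoll consults LC_COLLATE.
# The "C" locale is always available and gives the same order as above.
locale.setlocale(locale.LC_COLLATE, "C")
c_collation = sorted(names, key=functools.cmp_to_key(locale.strcoll))

# ASSUMPTION: a stand-in for dictionary order that folds case first and
# uses the original spelling only as a tie-breaker. Real en_US collation
# tables are more elaborate than this.
folded = sorted(names, key=lambda s: (s.lower(), s))

print(byte_order)
print(c_collation)
print(folded)
```

The first two lists come out in the "standard output" order from the
original report, and the third matches the order Bastian asked for.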
Anyway, in my case I had
LANG=
but
LC_CTYPE=en_US.ISO8859-1
and so I was getting case-insensitive dictionary sort order; the fix was
either to set everything to POSIX, to unset everything, or at least to set
LC_CTYPE and LC_COLLATE to POSIX. You're probably in the same boat.
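If it helps with the checking, here is a rough Python sketch of how the
variables listed above are consulted for one category: LC_ALL wins, then the
category's own variable, then LANG, and a variable set to the empty string
counts as unset. The function name effective_locale is made up for this
example, and the "POSIX" fallback is an assumption, since the default is
implementation-defined.

```python
import os


def effective_locale(category, env=None):
    """Return (variable, value) that decides `category`, e.g. "LC_COLLATE"."""
    env = os.environ if env is None else env
    # POSIX precedence: LC_ALL, then the category's own variable, then LANG.
    # A variable that is set to the empty string is treated as unset.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return name, value
    # ASSUMPTION: the fallback is implementation-defined; "POSIX" is used here.
    return None, "POSIX"


# Report which variable currently decides the two categories discussed above.
for category in ("LC_CTYPE", "LC_COLLATE"):
    print(category, effective_locale(category))
```

Passing a dictionary instead of the real environment, for example
effective_locale("LC_COLLATE", {"LANG": "en_US"}), lets you try out a
vendor default without touching your shell.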
%
% - Bastian
HTH & HAND
:-D
--
David T-G * It's easier to fight for one's principles
(play) address@hidden * than to live up to them. -- fortune cookie
(work) address@hidden
http://www.justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
--- End Message ---
--- Begin Message ---
Subject: Re: ls sort order bug
Date: Fri, 15 Nov 2002 06:04:29 -0500
User-agent: Mutt/1.4i
Matthew --
...and then Matthew Vanecek said...
%
% The ls man page advertises that ls will "Sort entries alphabetically if
What version of ls on what operating system, please?
% none of -cftuSUX" are specified. -a is supposed to show the .
% directories/files.
Right.
%
% The bug is that the . files/directories are intermixed with the other
% files/directories, and that lower case and upper case files/directories
% are intermixed. To sort alphabetically, the . files must come first,
% followed by Upper case files, followed by lower case files.
% I cannot find a combination of options that outputs the expected
% listing.
It sounds to me like you have a 4.1 or later version of fileutils (run
ls --version
to check) and do not have the proper localization environment variables
set for the results you want. I assure you that ls can be directed to
either fold or honor case, and I've never seen . and .. anywhere except
at the top of the listing.
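As a rough illustration of the two orderings in play, here is a small Python
sketch. The file names are hypothetical, and the "dictionary-like" key is
only my approximation (ignore leading dots, fold case); it is an assumption
and not the actual en_US collation rules.

```python
# Hypothetical directory contents, chosen to show the effect.
names = [".bashrc", "Makefile", "README", "a.out", "notes.txt"]

# POSIX/C ordering compares code points: "." (46) sorts before the
# upper-case letters (65-90), which sort before the lower-case ones (97-122).
byte_order = sorted(names)


def rough_dictionary_key(name):
    # ASSUMPTION: approximate a dictionary-style collation by ignoring
    # leading dots and case, with the original name as a tie-breaker.
    return (name.lstrip(".").lower(), name)


dictionary_like = sorted(names, key=rough_dictionary_key)

print(byte_order)
print(dictionary_like)
```

The first list is the "dot files, then upper case, then lower case" order
described in the report; the second shows how dot files and mixed case end
up interleaved once punctuation and case are ignored.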
To make a painfully long story short, here is some paraphrased background
purely from memory (and thus subject to inaccuracy on many counts, but I
hope at least a start):
- When *NIX utilities were first written, program behavior, user
interface, and error messages were all coded directly within the
program. Easy and simple, but no support for other languages.
- When the POSIX standard was developed, support for internationalization
was designed in from the start, allowing hooks to language and locale
specs so that a program could talk to the user the way the user wants
to hear it. As of 4.1, the GNU FileUtils are fully POSIX compliant.
- Unfortunately, certain operating system vendors (I will not mention
specifically one known for its crimson fedora, since I do not use it,
but it is my understanding that they are a very common source of this
problem) assume that their users will want a certain behavior -- say,
to fold case as in MS Windows Explorer -- and set the LANG* and LC_*
variables accordingly -- WITHOUT TELLING THE USER who is setting up the
system.
- The GNU FileUtils team then gets the "bug report" when, in fact, all
they've done is write good code and move to a more universal standard.
As I said, I've never seen . and .. anywhere but up front; if you can
document aberrant behavior, I'm sure we'd be interested in seeing your
results. If that's not the case, then you'll need to first check your
LANG
LC_CTYPE
LC_NUMERIC
LC_TIME
LC_COLLATE
LC_MONETARY
LC_MESSAGES
LC_PAPER
LC_NAME
LC_ADDRESS
LC_TELEPHONE
LC_MEASUREMENT
LC_IDENTIFICATION
LC_ALL
environment variables to make sure that at least LANG, LC_CTYPE, and
LC_COLLATE are not set to en_US. A quote from when *I* went through the
investigation of this back in June 2002:
So although I didn't have LANG or LC_ALL set I did have LC_CTYPE and/or
LC_COLLATE and those screwed me until I either set LANG and/or LC_ALL to
POSIX or unset LC_CTYPE and/or LC_COLLATE. Ouch.
So who the hell decided that en_US would want to sort without case
sensitivity?? Damned Microsofties! ;-)
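For scripts that need the traditional order no matter what a vendor has put
in the environment, one option is to override the locale for just the one
command. A minimal Python sketch follows; the helper names posix_env and
list_directory are invented for this example. Note that setting LC_ALL
overrides every category, messages included, which may be more than you
want.

```python
import os
import subprocess


def posix_env(env=None):
    """Return a copy of the environment with LC_ALL forced to POSIX."""
    clean = dict(os.environ if env is None else env)
    # LC_ALL takes precedence over LANG and every individual LC_* variable.
    clean["LC_ALL"] = "POSIX"
    return clean


def list_directory(path="."):
    """Run `ls -a` on `path` under the POSIX locale and return its lines."""
    result = subprocess.run(
        ["ls", "-a", path],
        env=posix_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Calling list_directory("/some/dir") then returns the names in byte order,
and your login environment is left alone because only a copy is modified.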
%
% --
% Matthew Vanecek <address@hidden>
HTH & HAND
:-D
--
David T-G * There is too much animal courage in
(play) address@hidden * society and not sufficient moral courage.
(work) address@hidden -- Mary Baker Eddy, "Science and Health"
http://www.justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
--- End Message ---