Re: whitespace vs. POSIX
From: Eric Blake
Subject: Re: whitespace vs. POSIX
Date: Mon, 23 Oct 2006 21:33:17 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Thunderbird/1.5.0.7 Mnenhy/0.7.4.666
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 10/21/2006 7:56 PM:
>
> $ printf 'len(\f)
> ' | m4
>
> should output 1 per a strict reading of POSIX, but outputs 0 on every
> implementation I have access to. I will wait until the Austin group makes
> a ruling on my aardvark before deciding whether it is worth patching GNU
> m4 to use isblank() instead of isspace() when POSIXLY_CORRECT.
In the meantime, it was worth adding a testcase. I'm attaching the patch
for the branch, but the patch for head is similar.
2006-10-23 Eric Blake <address@hidden>
* doc/m4.texinfo (Macro Arguments): Document that leading space
in argument collection stops at macro expansion.
(Incompatibilities): Document POSIX whitespace wording issue.
- --
Life is short - so eat dessert first!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFFPYkH84KuGfSFAYARArwUAKCHJKaLtn7Y/eznTtuZnHy4I+knMwCgtZr7
NP+VdCrA1G2dDtLPik0mJco=
=g9lM
-----END PGP SIGNATURE-----
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.1.1.1.2.91
diff -u -p -r1.1.1.1.2.91 m4.texinfo
--- doc/m4.texinfo 21 Oct 2006 02:55:56 -0000 1.1.1.1.2.91
+++ doc/m4.texinfo 24 Oct 2006 03:32:28 -0000
@@ -1113,9 +1113,32 @@ If the name is followed by an opening pa
collected before the macro is called. If too few arguments are
supplied, the missing arguments are taken to be the empty string.
However, some builtins are documented to behave differently for a
-missing optional argument than for an explicit empty string. If
-there are too many arguments, the excess arguments are ignored.
-Unquoted leading whitespace is stripped off all arguments.
+missing optional argument than for an explicit empty string. If there
+are too many arguments, the excess arguments are ignored. Unquoted
+leading whitespace is stripped off all arguments, but whitespace
+generated by a macro expansion or occurring after a macro that expanded
+to an empty string remains intact. Whitespace includes space, tab,
+newline, carriage return, vertical tab, and formfeed.
+
+@example
+define(`macro', `$1')
+@result{}
+macro( unquoted leading space lost)
+@result{}unquoted leading space lost
+macro(` quoted leading space kept')
+@result{} quoted leading space kept
+macro(
+ divert `unquoted space kept after expansion')
+@result{} unquoted space kept after expansion
+macro(macro(`
+')`whitespace from expansion kept')
+@result{}
+@result{}whitespace from expansion kept
+macro(`unquoted trailing whitespace kept'
+)
+@result{}unquoted trailing whitespace kept
+@result{}
+@end example
Normally @code{m4} will issue warnings if a builtin macro is called
with an inappropriate number of arguments, but it can be suppressed with
@@ -1136,7 +1159,7 @@ bar(a foo, d)
@noindent
is a macro call with four arguments, which are @samp{a }, @samp{b},
@samp{c} and @samp{d}. To understand why the first argument contains
-whitespace, remember that leading unquoted whitespace is never part
+whitespace, remember that unquoted leading whitespace is never part
of an argument, but trailing whitespace always is.
It is possible for a macro's definition to change during argument
@@ -1168,7 +1191,7 @@ define(
@cindex quoted macro arguments
@cindex macros, quoted arguments to
@cindex arguments, quoted macro
-Each argument has leading unquoted whitespace removed. Within each
+Each argument has unquoted leading whitespace removed. Within each
argument, all unquoted parentheses must match. For example, if
@var{foo} is a macro,
@@ -5388,6 +5411,14 @@ each character of the second and third a
variables of @env{LANG}, @env{LC_ALL}, @env{LC_CTYPE},
@env{LC_MESSAGES}, and @env{NLSPATH}, but this has not yet been
implemented in @acronym{GNU} @code{m4}.
+
+@item
+@acronym{POSIX} states that only unquoted leading newlines and blanks
+(that is, space and tab) are ignored when collecting macro arguments.
+However, this appears to be a bug in @acronym{POSIX}, since most
+traditional implementations also ignore all whitespace (formfeed,
+carriage return, and vertical tab). @acronym{GNU} @code{m4} follows
+tradition and ignores all leading unquoted whitespace.
@end itemize
@node Other Incompatibilities