m4-patches

Re: Another POSIX incompatibility


From: Eric Blake
Subject: Re: Another POSIX incompatibility
Date: Wed, 01 Nov 2006 06:44:08 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Thunderbird/1.5.0.7 Mnenhy/0.7.4.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 10/31/2006 8:43 AM:
> The Austin group just released its first draft of the next revision of POSIX;
> you have to be a member of the Austin group to read it (registration is
> free).  In comparing the proposed changes against current POSIX, I found
> another GNU m4 bug:
> 
> $ echo hi > foo
> $ /usr/xpg4/bin/m4 -Dhi=1 foo -Dhi=2 foo
> 1
> 2
> $ m4 -Dhi=1 foo -Dhi=2 foo
> 2
> 2
> 
> The proposed POSIX wording adds "the order of the −D and −U options shall be
> significant, and options can be interspersed with operands", validating the 
> Solaris behavior.  Patch to follow soon (I guess it's a good thing I wasn't 
> able to release m4 1.4.8 last weekend, after all).
> 

Here's the patch for the branch.  I'm thinking that for the head, maybe I
should turn --synclines into an option with an optional argument, so that
you can turn synclines off between files as well as on by using -s0 (and
keeping -s as a synonym for -s1).  And maybe it is time to think about
adding command line options --pushdef, --popdef, and --traceoff (of those,
only --pushdef might be a candidate for a short option -p), for even more
inter-file operational capability.  Head also would support -m and -r
between files (mainly because it was easiest to use the existing deferred
argument handling mechanism to achieve this patch, by treating intermixed
files as deferred arguments to option '\1').
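
To illustrate the '\1' trick outside of m4, here is a minimal sketch,
assuming GNU getopt_long (glibc or gnulib).  The helper name
record_in_order and the "F:" tag are made up for this example and are
not part of the patch; the only behavior it relies on is that a leading
'-' in the option string makes getopt_long return each file name, in
command line order, as the argument of option code 1.

```c
#include <assert.h>
#include <getopt.h>
#include <stdio.h>
#include <string.h>

/* Illustration only (hypothetical helper, not from the patch): record
   each command line element in order, as "D:hi=1 F:foo ...", in OUT.
   The leading '-' in the option string asks GNU getopt_long to hand
   back non-option arguments as option code 1 instead of permuting
   them to the end of argv.  */
static void
record_in_order (int argc, char **argv, char *out, size_t outsize)
{
  static const struct option no_long_options[] = { { NULL, 0, NULL, 0 } };
  int c;

  assert (outsize > 0);
  out[0] = '\0';
  optind = 0;                   /* GNU extension: restart the scan.  */
  while ((c = getopt_long (argc, argv, "-D:U:s", no_long_options,
                           NULL)) != -1)
    {
      size_t used = strlen (out);
      const char *sep = used ? " " : "";
      int n;

      if (c == 1)               /* a file name, in its original place */
        n = snprintf (out + used, outsize - used, "%sF:%s", sep, optarg);
      else if (c == 'D' || c == 'U')
        n = snprintf (out + used, outsize - used, "%s%c:%s", sep, c, optarg);
      else if (c == 's')
        n = snprintf (out + used, outsize - used, "%ss", sep);
      else                      /* unrecognized option */
        n = snprintf (out + used, outsize - used, "%s?", sep);

      if (n < 0 || (size_t) n >= outsize - used)
        {
          out[used] = '\0';     /* drop the partial entry and stop */
          break;
        }
    }
}
```

Given the argv of 'm4 -Dhi=1 foo -Dhi=2 foo', this should record
"D:hi=1 F:foo D:hi=2 F:foo", which is the order the deferred list needs
to be replayed in.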

2006-11-01  Eric Blake  <address@hidden>

        * doc/m4.texinfo (Invoking m4): Update according to POSIX 200x
        draft wording.
        * src/m4.h (m4_path_search): Tweak signature.
        * src/path.c (m4_path_search): Likewise.
        * src/builtin.c (include): Update caller.
        * src/m4.c (main): Allow -D, -U, -t, and -s to be interspersed
        with file names.  Don't write to **argv.
        (process_file): New helper method.
        * NEWS: Document this fix.
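
For the "Don't write to **argv" item, the idea is simply to split
NAME=VALUE in a private copy rather than in place.  A rough standalone
sketch follows; split_definition is a hypothetical name, and it uses
plain malloc where the patch itself uses xstrdup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustration only (hypothetical helper, not from the patch): split
   ARG, of the form "NAME" or "NAME=VALUE", without modifying ARG.
   *NAME receives a freshly allocated copy holding NAME; the return
   value points at VALUE inside that copy, or is NULL when ARG has no
   '='.  The caller frees *NAME, which also releases VALUE.  */
static const char *
split_definition (const char *arg, char **name)
{
  size_t len;
  char *copy;
  char *value;

  assert (arg != NULL && name != NULL);
  len = strlen (arg) + 1;
  copy = malloc (len);
  if (copy == NULL)
    abort ();
  memcpy (copy, arg, len);

  value = strchr (copy, '=');
  if (value != NULL)
    *value++ = '\0';            /* terminate NAME, step past the '=' */
  *name = copy;
  return value;
}
```

Because the split happens in the copy, the same -D argument can be
replayed safely and argv can stay 'char *const *'.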

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFSKSn84KuGfSFAYARAhOhAJ9VN4T7hPGYP3bWfRd4korQaHFzNgCfSR4e
qK4E6aqw1HrFpN/aocuHm4s=
=MJrB
-----END PGP SIGNATURE-----
Index: NEWS
===================================================================
RCS file: /sources/m4/m4/NEWS,v
retrieving revision 1.1.1.1.2.78
diff -u -p -r1.1.1.1.2.78 NEWS
--- NEWS        31 Oct 2006 14:13:41 -0000      1.1.1.1.2.78
+++ NEWS        1 Nov 2006 13:34:21 -0000
@@ -44,6 +44,9 @@ Version 1.4.8 - ?? ??? 2006, by ??  (CVS
   argument the same as if it were missing, rather than using the empty
   string and making it impossible to end a comment or quote.
 * The `translit' macro now operates in linear instead of quadratic time.
+* The `-D', `-U', `-s', and `-t' command line options now take effect
+  after any files encountered earlier on the command line, rather than up
+  front, as is done in traditional implementations and required by POSIX.
 
 Version 1.4.7 - 25 September 2006, by Eric Blake  (CVS version 1.4.6a)
 
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.1.1.1.2.96
diff -u -p -r1.1.1.1.2.96 m4.texinfo
--- doc/m4.texinfo      31 Oct 2006 14:13:41 -0000      1.1.1.1.2.96
+++ doc/m4.texinfo      1 Nov 2006 13:34:25 -0000
@@ -495,19 +495,22 @@ The format of the @code{m4} command is:
 @cindex @env{POSIXLY_CORRECT}
 All options begin with @samp{-}, or if long option names are used, with
 @samp{--}.  A long option name need not be written completely, any
-unambiguous prefix is sufficient.  Unless @env{POSIXLY_CORRECT} is set
-in the environment, options may be intermixed with files.  The argument
-@option{--} is a marker to denote the end of options.
+unambiguous prefix is sufficient.  @acronym{POSIX} requires @code{m4} to
+recognize arguments intermixed with files, even when
+@env{POSIXLY_CORRECT} is set in the environment.  Most options take
+effect at startup regardless of their position, but some are documented
+below as taking effect after any files that occurred earlier in the
+command line.  The argument @option{--} is a marker to denote the end of
+options.
 
 With short options, options that do not take arguments may be combined
 into a single command line argument with subsequent options, options
 with mandatory arguments may be provided either as a single command line
 argument or as two arguments, and options with optional arguments must
-be provided as a single argument.  In other words, without
-@env{POSIXLY_CORRECT}, @kbd{m4 -QPDfoo -d a -d+f} is equivalent to
+be provided as a single argument.  In other words,
+@kbd{m4 -QPDfoo -d a -d+f} is equivalent to
 @kbd{m4 -Q -P -D foo -d -d+f -- ./a}, although the latter form is
-considered canonical.  (With @env{POSIXLY_CORRECT}, it is equivalent to
-@kbd{m4 -Q -P -D foo -d -- ./a ./-d+f}).
+considered canonical.
 
 With long options, options with mandatory arguments may be provided with
 an equal sign (@samp{=}) in a single argument, or as two arguments, and
@@ -596,8 +599,9 @@ This enters @var{NAME} into the symbol t
 read.  If @samp{=@var{VALUE}} is missing, the value is taken to be the
 empty string.  The @var{VALUE} can be any string, and the macro can be
 defined to take arguments, just as if it was defined from within the
-input.  This option may be given more than once; order is significant,
-and redefining the same @var{NAME} loses the previous value.
+input.  This option may be given more than once; order with respect to
+file names is significant, and redefining the same @var{NAME} loses the
+previous value.
 
 @item -I @var{DIRECTORY}
 @itemx --include=@var{DIRECTORY}
@@ -608,7 +612,8 @@ details.  This option may be given more 
 @item -s
 @itemx --synclines
 Generate synchronization lines, for use by the C preprocessor or other
-similar tools.  This is useful, for example, when @code{m4} is used as a
+similar tools.  Order is significant with respect to file names.  This
+option is useful, for example, when @code{m4} is used as a
 front end to a compiler.  Source file name and line number information
 is conveyed by directives of the form @samp{#line @var{linenum}
 "@var{file}"}, which are inserted as needed into the middle of the
@@ -627,7 +632,8 @@ until the beginning of the next generate
 This deletes any predefined meaning @var{NAME} might have.  Obviously,
 only predefined macros can be deleted in this way.  This option may be
 given more than once; undefining a @var{NAME} that does not have a
-definition is silently ignored.
+definition is silently ignored.  Order is significant with respect to
+file names.
 @end table
 
 @node Limits control
@@ -752,7 +758,8 @@ unlimited.  @xref{Debug Levels}, for mor
 @itemx --trace=@var{NAME}
 This enables tracing for the macro @var{NAME}, at any point where it is
 defined.  @var{NAME} need not be defined when this option is given.
-This option may be given more than once.  @xref{Trace}, for more details.
+This option may be given more than once, and order is significant with
+respect to file names.  @xref{Trace}, for more details.
 @end table
 
 @node Command line files
@@ -772,6 +779,11 @@ terminal or other special file type.  It
 ends in the middle of argument collection, a comment, or a quoted
 string.
 
+The options @option{--define} (@option{-D}), @option{--undefine}
+(@option{-U}), @option{--synclines} (@option{-s}), and @option{--trace}
+(@option{-t}) only take effect after processing input from any file
+names that occur earlier on the command line.
+
 If none of the input files invoked @code{m4exit} (@pxref{M4exit}), the
 exit status of @code{m4} will be 0 for success, 1 for general failure
 (such as problems with reading an input file), and 63 for version
Index: src/builtin.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/builtin.c,v
retrieving revision 1.1.1.1.2.48
diff -u -p -r1.1.1.1.2.48 builtin.c
--- src/builtin.c       31 Oct 2006 14:13:41 -0000      1.1.1.1.2.48
+++ src/builtin.c       1 Nov 2006 13:34:25 -0000
@@ -1199,7 +1199,7 @@ static void
 include (int argc, token_data **argv, boolean silent)
 {
   FILE *fp;
-  const char *name;
+  char *name;
 
   if (bad_argc (argv[0], argc, 2, 2))
     return;
@@ -1214,7 +1214,7 @@ include (int argc, token_data **argv, bo
     }
 
   push_file (fp, name, TRUE);
-  free ((char *) name);
+  free (name);
 }
 
 /*------------------------------------------------.
Index: src/m4.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/m4.c,v
retrieving revision 1.1.1.1.2.35
diff -u -p -r1.1.1.1.2.35 m4.c
--- src/m4.c    26 Oct 2006 21:11:56 -0000      1.1.1.1.2.35
+++ src/m4.c    1 Nov 2006 13:34:26 -0000
@@ -65,8 +65,8 @@ const char *program_name;
 struct macro_definition
 {
   struct macro_definition *next;
-  int code;                    /* D, U or t */
-  const char *macro;
+  int code;                    /* D, U, s, t, or '\1' */
+  const char *arg;
 };
 typedef struct macro_definition macro_definition;
 
@@ -257,10 +257,49 @@ static const struct option long_options[
    where we try to continue execution in the meantime.  */
 int retcode;
 
+/* Process a command line file NAME, and return TRUE only if it was
+   stdin.  */
+static boolean
+process_file (const char *name)
+{
+  boolean result = FALSE;
+  if (strcmp (name, "-") == 0)
+    {
+      /* If stdin is a terminal, we want to allow 'm4 - file -'
+        to read input from stdin twice, like GNU cat.  Besides,
+        there is no point closing stdin before wrapped text, to
+        minimize bugs in syscmd called from wrapped text.  */
+      push_file (stdin, "stdin", FALSE);
+      result = TRUE;
+    }
+  else
+    {
+      char *full_name;
+      FILE *fp = m4_path_search (name, &full_name);
+      if (fp == NULL)
+       {
+         error (0, errno, "%s", name);
+         /* Set the status to EXIT_FAILURE, even though we
+            continue to process files after a missing file.  */
+         retcode = EXIT_FAILURE;
+         return FALSE;
+       }
+      push_file (fp, full_name, TRUE);
+      free (full_name);
+    }
+  expand_input ();
+  return result;
+}
+
+/* POSIX requires only -D, -U, and -s; and says that the first two
+   must be recognized when interspersed with file names.  Traditional
+   behavior also handles -s between files.  Starting OPTSTRING with
+   '-' forces getopt_long to hand back file names as arguments to opt
+   '\1', rather than reordering the command line.  */
 #ifdef ENABLE_CHANGEWORD
-#define OPTSTRING "B:D:EF:GH:I:L:N:PQR:S:T:U:W:d::eil:o:st:"
+#define OPTSTRING "-B:D:EF:GH:I:L:N:PQR:S:T:U:W:d::eil:o:st:"
 #else
-#define OPTSTRING "B:D:EF:GH:I:L:N:PQR:S:T:U:d::eil:o:st:"
+#define OPTSTRING "-B:D:EF:GH:I:L:N:PQR:S:T:U:d::eil:o:st:"
 #endif
 
 int
@@ -268,13 +307,13 @@ main (int argc, char *const *argv, char 
 {
   macro_definition *head;      /* head of deferred argument list */
   macro_definition *tail;
-  macro_definition *new;
+  macro_definition *defn;
   int optchar;                 /* option character */
 
   macro_definition *defines;
-  FILE *fp;
   boolean read_stdin = FALSE;
   boolean interactive = FALSE;
+  boolean seen_file = FALSE;
   const char *debugfile = NULL;
   const char *frozen_file_to_read = NULL;
   const char *frozen_file_to_write = NULL;
@@ -293,9 +332,8 @@ main (int argc, char *const *argv, char 
 
   head = tail = NULL;
 
-  while (optchar = getopt_long (argc, (char **) argv, OPTSTRING,
-                               long_options, NULL),
-        optchar != EOF)
+  while ((optchar = getopt_long (argc, (char **) argv, OPTSTRING,
+                                long_options, NULL)) != -1)
     switch (optchar)
       {
       default:
@@ -319,20 +357,21 @@ main (int argc, char *const *argv, char 
 
       case 'D':
       case 'U':
+      case 's':
       case 't':
-
+      case '\1':
        /* Arguments that cannot be handled until later are accumulated.  */
 
-       new = (macro_definition *) xmalloc (sizeof (macro_definition));
-       new->code = optchar;
-       new->macro = optarg;
-       new->next = NULL;
+       defn = (macro_definition *) xmalloc (sizeof (macro_definition));
+       defn->code = optchar;
+       defn->arg = optarg;
+       defn->next = NULL;
 
        if (head == NULL)
-         head = new;
+         head = defn;
        else
-         tail->next = new;
-       tail = new;
+         tail->next = defn;
+       tail = defn;
 
        break;
 
@@ -412,10 +451,6 @@ main (int argc, char *const *argv, char 
        debugfile = optarg;
        break;
 
-      case 's':
-       sync_output = 1;
-       break;
-
       case VERSION_OPTION:
        printf ("%s\n", PACKAGE_STRING);
        fputs ("\
@@ -449,33 +484,54 @@ Written by Rene' Seindal.\n\
   else
     builtin_init ();
 
+  /* Interactive mode means unbuffered output, and interrupts ignored.  */
+
+  if (interactive)
+    {
+      signal (SIGINT, SIG_IGN);
+      setbuf (stdout, (char *) NULL);
+    }
+
   /* Handle deferred command line macro definitions.  Must come after
-     initialisation of the symbol table.  */
+     initialization of the symbol table.  */
 
   while (defines != NULL)
     {
       macro_definition *next;
-      char *macro_value;
       symbol *sym;
 
       switch (defines->code)
        {
        case 'D':
-         macro_value = strchr (defines->macro, '=');
-         if (macro_value)
-           *macro_value++ = '\0';
-         define_user_macro (defines->macro, macro_value, SYMBOL_INSERT);
+         {
+           /* defines->arg is read-only, so we need a copy.  */
+           char *macro_name = xstrdup (defines->arg);
+           char *macro_value = strchr (macro_name, '=');
+           if (macro_value)
+             *macro_value++ = '\0';
+           define_user_macro (macro_name, macro_value, SYMBOL_INSERT);
+           free (macro_name);
+         }
          break;
 
        case 'U':
-         lookup_symbol (defines->macro, SYMBOL_DELETE);
+         lookup_symbol (defines->arg, SYMBOL_DELETE);
          break;
 
        case 't':
-         sym = lookup_symbol (defines->macro, SYMBOL_INSERT);
+         sym = lookup_symbol (defines->arg, SYMBOL_INSERT);
          SYMBOL_TRACED (sym) = TRUE;
          break;
 
+       case 's':
+         sync_output = 1;
+         break;
+
+       case '\1':
+         seen_file = TRUE;
+         read_stdin |= process_file (defines->arg);
+         break;
+
        default:
          M4ERROR ((warning_status, 0,
                    "INTERNAL ERROR: bad code in deferred arguments"));
@@ -487,55 +543,14 @@ Written by Rene' Seindal.\n\
       defines = next;
     }
 
-  /* Interactive mode means unbuffered output, and interrupts ignored.  */
-
-  if (interactive)
-    {
-      signal (SIGINT, SIG_IGN);
-      setbuf (stdout, (char *) NULL);
-    }
-
-  /* Handle the various input files.  Each file is pushed on the input,
+  /* Handle remaining input files.  Each file is pushed on the input,
      and the input read.  Wrapup text is handled separately later.  */
 
-  if (optind == argc)
-    {
-      /* No point closing stdin until after wrapped text is
-        processed.  */
-      push_file (stdin, "stdin", FALSE);
-      read_stdin = TRUE;
-      expand_input ();
-    }
+  if (optind == argc && !seen_file)
+    read_stdin = process_file ("-");
   else
     for (; optind < argc; optind++)
-      {
-       if (strcmp (argv[optind], "-") == 0)
-         {
-           /* If stdin is a terminal, we want to allow 'm4 - file -'
-              to read input from stdin twice, like GNU cat.  Besides,
-              there is no point closing stdin before wrapped text, to
-              minimize bugs in syscmd called from wrapped text.  */
-           push_file (stdin, "stdin", FALSE);
-           read_stdin = TRUE;
-         }
-       else
-         {
-           const char *name;
-           fp = m4_path_search (argv[optind], &name);
-           if (fp == NULL)
-             {
-               error (0, errno, "%s", argv[optind]);
-               /* Set the status to EXIT_FAILURE, even though we
-                  continue to process files after a missing file.  */
-               retcode = EXIT_FAILURE;
-               continue;
-             }
-           push_file (fp, name, TRUE);
-           free ((char *) name);
-         }
-       expand_input ();
-      }
-#undef NEXTARG
+      read_stdin |= process_file (argv[optind]);
 
   /* Now handle wrapup text.  */
 
Index: src/m4.h
===================================================================
RCS file: /sources/m4/m4/src/m4.h,v
retrieving revision 1.1.1.1.2.31
diff -u -p -r1.1.1.1.2.31 m4.h
--- src/m4.h    14 Oct 2006 04:15:27 -0000      1.1.1.1.2.31
+++ src/m4.h    1 Nov 2006 13:34:26 -0000
@@ -422,7 +422,7 @@ const builtin *find_builtin_by_name (con
 void include_init (void);
 void include_env_init (void);
 void add_include_directory (const char *);
-FILE *m4_path_search (const char *, const char **);
+FILE *m4_path_search (const char *, char **);
 
 /* File: eval.c  --- expression evaluation.  */
 
Index: src/path.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/path.c,v
retrieving revision 1.1.1.1.2.12
diff -u -p -r1.1.1.1.2.12 path.c
--- src/path.c  13 Oct 2006 22:25:32 -0000      1.1.1.1.2.12
+++ src/path.c  1 Nov 2006 13:34:26 -0000
@@ -111,7 +111,7 @@ add_include_directory (const char *dir)
    respect to the current working directory.  */
 
 FILE *
-m4_path_search (const char *file, const char **result)
+m4_path_search (const char *file, char **result)
 {
   FILE *fp;
   includes *incl;
