Re: M4 syntax $11 vs. ${11}
From: Gary V. Vaughan
Subject: Re: M4 syntax $11 vs. ${11}
Date: Tue, 13 Mar 2007 05:47:57 +0000
Hi Eric,
On 1 Mar 2007, at 04:18, Eric Blake wrote:
Here (finally) is a patch for head, that both implements ${1}, as well as
forward ports --warn-macro-sequence from the branch. The features are
intertwined enough that I didn't see any good reason to separate this into
multiple patches (other than the earlier patches I already committed today).
Here (finally) is the review of the patch for HEAD :-) It's been a crazy
couple of weeks, apologies for the delay.
For ease of review and future CVS archaeology it would be better to keep
patches as small and self-contained as possible though... i.e. this is
at least 3 patches: --warn-macro-sequence; POSIX $ syntax; regex function
movement. Unless the split for this patch falls naturally out of reworking
it, don't worry too much in this particular case though.
However, I would like a review; so it is not applied yet. In particular, in
macro.c, this patch does not do a deprecation period for $10, as I originally
thought above, but flat out went with POSIX syntax. I did this on the
principle that relatively few uses of $10 in the wild have been discovered, and
that --warn-macro-sequence can be used to detect even those uses.
I think this is okay for the dev branch, so long as it also contains a TODO
to make sure that the default build of released M4 will maintain 100% bugwards
compatibility with the GNU syntax of 1.4.x.
Still to be written:
1) I want to implement a new builtin m4macroseq([regex], [resyntax]), which
behaves like the command-line --warn-macro-sequence (and that also means adding
a command line --m4macroseq for symmetry). With no arguments, it enables the
default warning sequence, with one empty argument, it disables warnings, and
since it uses regular expressions, it takes an optional resyntax argument to
override the current changeresyntax.
I think this functionality belongs either in a module (I've been wanting to
create a way for modules to add and parse command line options for some time),
or maybe a separate helper script for upgraders. Either way, there is no need
to complicate the core with any of the above.
2) I have promised to implement changeextarg(start,[stop]), which allows
multi-character extended arguments, so that autoconf can reserve ${1} for shell
output and ${{1}} for the day that autoconf 3.0 depends on m4 2.0. I will
model it on changequote, including how it interacts with single-character quote
syntax in changesyntax, except that an argument always must be supplied (to go
with the policy that macros not beginning with "__" or "m4" must be blind, to
avoid risk of inadvertent expansion).
I don't think this is necessary. I explained in an earlier post that I would
like to make changesyntax support multi-character elements, and remove as many
of the changexxxx macros as possible, rather than introducing more.
3) I would like to implement ideas from sh, such as ${1-default} expanding to
the first argument if supplied, or `default' if omitted.
Nice :-)
I think that 2) is the only thing that should be completed before I feel
comfortable baselining m4-1.9b for wider test exposure on alpha.gnu.org.
Or rather, the POSIX syntax should be made a build-time option until we can
modularise it enough that switching between POSIX and GNU syntax at run time
is possible.
+*** The GNU 1.4.x extension of recognizing the sequence `$10' in macro
+ definitions as the tenth positional parameter is withdrawn, as it is
+ incompatible with POSIX. The sequence `$10' now correctly refers to
+ the first positional parameter concatenated with 0. To directly access
+ the tenth parameter, you must now use extended arguments (you can also
+ portably access the tenth argument indirectly using the `shift'
+ builtin). To detect places in existing scripts that might be affected
+ by this change in behavior, you can use the `--warn-macro-sequence'
+ command-line option.
Please add a FIXME: before 1.9b, GNU syntax should be the default.
+*** POSIX allows implementations to assign arbitrary behavior to the sequence
+ `${' in macro definitions. All earlier versions of GNU M4 just treated
+ it as literal output, but this version introduces extended arguments.
+ By default, the sequence `${<digits>}' now represents the extended
+ argument referring to a positional parameter, so that it is still
+ possible to directly refer to more than nine arguments. If the older
+ 1.4.x behavior of literal output is desired, the new `changesyntax' or
+ `changeextarg' builtins can be used to cripple extended arguments. To
+ detect places in existing scripts that might be affected by this change
+ in behavior, you can use the `--warn-macro-sequence' command-line
+ option.
Let's not mention changeextarg, unless we reach a point where we find there
is no cleaner way to implement it.
address@hidden address@hidden@address@hidden
address@hidden TODO: add m4macroseq builtin, and alias --m4macroseq
+Issue a warning if the regular expression @var{REGEXP} has a non-empty
+match in any macro definition (either by @code{define} or
address@hidden). Empty matches are ignored; therefore, supplying the
+empty string as @var{REGEXP} disables any warning. Otherwise,
address@hidden is compiled according to the current regular expression
+syntax. If the optional @var{REGEXP} is not supplied, then a default
+regular expression is used, equivalent to
address@hidden(@address@hidden@}\|[0-9][0-9]+\)} in the @code{GNU_M4} regular
+expression flavor (a literal @samp{$} followed by multiple digits or by
+an open brace). The default expression is chosen to detect the
+sequences that changed semantics in the default operation of
address@hidden M4 2.0 compared to earlier versions of GNU M4
+(@pxref{Extended Arguments}). Providing an alternate regular expression
+can provide a useful reverse lookup feature of finding where a macro is
+defined to have a given definition, or accommodate uses of
address@hidden that intentionally alter extended argument syntax.
Again, let's move this into a module, or an upgraders' helper script.
address@hidden requires that if multiple digits appear after @samp{$},
+the first digit is used to select the parameter, and the remaining
+digits are concatenated as literal text. Earlier versions of
address@hidden M4 had an incompatible extension that would use all of
+the digits to reference beyond the ninth argument, but this was changed
+in M4 2.0. @xref{Extended Arguments}, for more details on this change.
+
address@hidden
+define(`foo', `$11')
address@hidden
+define(`a1', `hello')
address@hidden
+foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
address@hidden
address@hidden example
Add this to a POSIX compliance module subsection instead. Other adjustments
as necessary to accommodate this idea... I won't point out the other places
in the patch that need to take this into account.
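For illustration only, here is a minimal C sketch of the two substitution rules under discussion: under the POSIX rule a `$` followed by digits consumes exactly one digit, while `${N}` uses every digit between the braces. The helper names `append_arg` and `expand_params` are hypothetical and do not come from the patch or from macro.c. The sketch does not rescan its output (so it shows `a1`, not the `hello` that m4 would produce after expanding `a1`), and `$0` simply expands to nothing here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append the N-th argument (1-based) to OUT,
   silently ignoring arguments that were not supplied.  */
static void append_arg(char *out, size_t outsize, long n,
                       int argc, const char *const *argv)
{
    if (n >= 1 && n <= argc)
        strncat(out, argv[n - 1], outsize - strlen(out) - 1);
}

/* Hypothetical helper: substitute positional parameters in BODY.
     $N    exactly one digit selects the argument (POSIX rule)
     ${N}  all digits between the braces select the argument
   Everything else is copied literally; output is truncated to fit.  */
static void expand_params(const char *body, int argc,
                          const char *const *argv,
                          char *out, size_t outsize)
{
    assert(out != NULL && outsize > 0);
    out[0] = '\0';

    for (const char *p = body; *p != '\0'; ) {
        if (p[0] == '$' && isdigit((unsigned char) p[1])) {
            /* Only the first digit is part of the reference.  */
            append_arg(out, outsize, p[1] - '0', argc, argv);
            p += 2;
            continue;
        }
        if (p[0] == '$' && p[1] == '{' && isdigit((unsigned char) p[2])) {
            char *end;
            long n = strtol(p + 2, &end, 10);
            if (*end == '}') {
                append_arg(out, outsize, n, argc, argv);
                p = end + 1;
                continue;
            }
            /* Not a well-formed ${N}: fall through and copy literally.  */
        }
        size_t len = strlen(out);
        if (len + 1 < outsize) {
            out[len] = *p;
            out[len + 1] = '\0';
        }
        p++;
    }
}
```

Called with the twelve arguments `a` through `l` from the example above, `expand_params` turns `$11` into `a1` and `${11}` into `k`.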
Index: m4/m4module.h
===================================================================
+/* The default sequence detects multi-digit parameters (obsolete after
+ 1.4.x), and any use of extended arguments with the default ${}
+ syntax (new in 2.0). */
+#define M4_DEFAULT_MACRO_SEQUENCE "\\$\\({[^}]*}\\|[0-9][0-9]+\\)"
+
+extern void m4_macro_expand_input (m4 *);
+extern void m4_macro_call (m4 *, m4_symbol_value *,
+ m4_obstack *, int,
+ m4_symbol_value **);
+extern void m4_set_macro_sequence (m4 *, const char *, int,
+ const char *);
+extern void m4_free_macro_sequence (m4 *);
+extern m4_symbol_value *m4_macro_define (m4 *, const char *, const char *,
+ bool);
+extern void m4_check_macro_sequence (m4 *, const char *, const char *,
+ const char *);
Put the new functions in a loadable module instead.
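As a rough sketch of what the default sequence detects, the same pattern can be tried with the POSIX `<regex.h>` API. This is an assumption-laden illustration, not the patch's code: the patch compiles the pattern in the GNU_M4 regular expression flavor through its own `m4_regexp_compile`, whereas the version below rewrites it as a POSIX extended regular expression. The names `MACRO_SEQUENCE_ERE` and `count_macro_sequences` are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* POSIX ERE rendering (an assumption) of the default sequence: a
   literal '$' followed by a braced group or by two or more digits.  */
#define MACRO_SEQUENCE_ERE "\\$(\\{[^}]*\\}|[0-9][0-9]+)"

/* Hypothetical helper: count the non-empty matches of the sequence in
   DEFINITION.  Returns -1 if the pattern fails to compile.  */
static int count_macro_sequences(const char *definition)
{
    regex_t re;
    regmatch_t m;
    int count = 0;

    assert(definition != NULL);
    if (regcomp(&re, MACRO_SEQUENCE_ERE, REG_EXTENDED) != 0)
        return -1;

    while (*definition != '\0'
           && regexec(&re, definition, 1, &m, 0) == 0) {
        if (m.rm_eo > m.rm_so) {
            /* Non-empty match: this is where a warning would go.  */
            count++;
            definition += m.rm_eo;
        } else if (definition[m.rm_so] == '\0') {
            break;                      /* empty match at end of text */
        } else {
            definition += m.rm_so + 1;  /* skip an empty match */
        }
    }

    regfree(&re);
    return count;
}
```

With this pattern a definition such as `$1 $2 $3` yields no matches, while `$11` and `${1}` each yield one, which are the two spellings whose meaning differs between 1.4.x and the proposed 2.0 behavior.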
+/* The regs_allocated field in an re_pattern_buffer refers to the
+ state of the re_registers struct used in successive matches with
+ the same compiled pattern. */
+
+typedef struct {
+ struct re_pattern_buffer pat; /* compiled regular expression */
+ struct re_registers regs; /* match registers */
+} m4_pattern_buffer;
+
extern const char * m4_regexp_syntax_decode (int);
extern int m4_regexp_syntax_encode (const char *);
+extern m4_pattern_buffer *m4_regexp_compile (m4 *, const char *,
+ const char *, int,
+ bool, m4_pattern_buffer *);
+extern void m4_regexp_free (m4_pattern_buffer *);
Please don't do that! There is module entry point export/import code in
the m4 module API... by moving the macro_sequence stuff into a module,
we can keep the regex code out of the core (important for people who
would like to build a tiny non-GNU m4). Worst case, the regex code might
need to go in a module of its own so that either the gnu module or my
proposed new macro_sequence module can each require it independently.
@@ -463,19 +501,18 @@
break;
default:
- if (m4_get_posixly_correct_opt (context)
- || !VALUE_ARG_SIGNATURE (value))
- {
- obstack_1grow (obs, ch);
- }
- else
+ if (VALUE_ARG_SIGNATURE (value))
{
+ /* TODO - VALUE_ARG_SIGNATURE is not fully implemented.
+ Is it worth killing this as dead code, and figuring
+ out how to use extended arguments to do what was
+ originally envisioned by VALUE_ARG_SIGNATURE? */
Yes, possibly -- or integrating the two. For the record, VALUE_ARG_SIGNATURE
accesses a hash table of parameter names to values, built when the macro is
defined and referenced when ${argname} is expanded. That is, however, a
different patch.
We need to be sure that defn correctly passes the contents of the macro
signature around too.
Another thing this was leading towards is maintaining enough details about
the macro arguments here that m4_define'd macros would also be able to
take advantage of the automatic checking for insufficient or excess arguments
that builtins currently have.
The implementation looks fine (location aside ;-). Please add some
thorough tests to strictly define how the feature is supposed to work,
especially in the corner cases you mentioned.
Cheers,
Gary
--
())_. Email me: address@hidden
( '/ Read my blog: http://blog.azazil.net
/ )= ...and my book: http://sources.redhat.com/autobook
`(_~)_ Join my AGLOCO Network: http://www.agloco.com/r/BBBS7912