[4/18] argv_ref speedup: make argv struct opaque
From: Eric Blake
Subject: [4/18] argv_ref speedup: make argv struct opaque
Date: Thu, 29 Nov 2007 22:36:45 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071031 Thunderbird/2.0.0.9 Mnenhy/0.7.5.666
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Another patch ported. This patch makes the argv struct opaque to all but
the input engine; this part should have no impact on memory and only a
slight impact, if any, on speed. It also optimizes ifelse to skip strcmp
if lengths are different, so the patch provides a net speedup. I had to
rework the handling of the obstacks in expand_macro, and a later patch
will rework it yet again.
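
For readers unfamiliar with the two ideas in the paragraph above, here is a
minimal sketch, not the code from the patch. It assumes a simplified argument
object; every name in it (demo_args, demo_arg_equal, and so on) is
hypothetical. Callers see only an incomplete type plus accessors, and the
equality accessor compares cached lengths before falling back to strcmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* "Public header" part: callers see only an incomplete type plus
   accessors, so the layout can change without touching them.
   All names are hypothetical, not the m4 API.  */
typedef struct demo_args demo_args;
demo_args *demo_args_new (const char *const *texts, unsigned int count);
bool demo_arg_equal (const demo_args *args, unsigned int a, unsigned int b);
void demo_args_free (demo_args *args);

/* "Private" part: only this translation unit knows the layout.  */
struct demo_arg
{
  const char *text;	/* NUL-terminated, borrowed from the caller.  */
  size_t len;		/* Cached strlen (text).  */
};

struct demo_args
{
  unsigned int count;
  struct demo_arg array[];	/* C99 flexible array member.  */
};

demo_args *
demo_args_new (const char *const *texts, unsigned int count)
{
  demo_args *args = malloc (sizeof *args + count * sizeof args->array[0]);
  unsigned int i;

  if (!args)
    return NULL;
  args->count = count;
  for (i = 0; i < count; i++)
    {
      args->array[i].text = texts[i];
      args->array[i].len = strlen (texts[i]);
    }
  return args;
}

/* Compare cached lengths first; strcmp runs only when they match.  */
bool
demo_arg_equal (const demo_args *args, unsigned int a, unsigned int b)
{
  assert (a < args->count && b < args->count);
  return (args->array[a].len == args->array[b].len
	  && strcmp (args->array[a].text, args->array[b].text) == 0);
}

void
demo_args_free (demo_args *args)
{
  free (args);
}
```

Since the lengths are already cached, the length check costs one integer
comparison and lets mismatched arguments be rejected without reading either
string.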
2007-11-29 Eric Blake <address@hidden>
Stage 4: route indir, builtin through ref; make argv opaque.
* src/m4.h (obstack_regrow): Borrow definition from head.
(struct token_chain): Add flatten and len members.
(arg_equal, arg_empty, make_argv_ref): New prototypes.
(struct macro_arguments): Move...
* src/macro.c (struct macro_arguments): ...here, making it
opaque. Add has_ref member.
(empty_token): New placeholder, for optimizing comparison with
empty string.
(collect_arguments): Change signature, and populate new fields.
(expand_macro): Alter handling of obstacks.
(arg_token): New helper method.
(arg_equal, arg_empty, make_argv_ref): New methods.
(arg_type, arg_text, arg_len, arg_func): Use new methods.
* src/builtin.c (m4_ifelse, m4_builtin, m4_indir, m4_eval):
Likewise.
* src/format.c (format): Likewise.
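
As a rough illustration of what routing indir and builtin "through ref" means,
the sketch below builds a view onto an existing argument list instead of
copying it. It is a simplification under stated assumptions: the demo_* types
and functions are hypothetical, the view is returned by value rather than
allocated on an obstack, and nested views are resolved by recursion rather
than collapsed to the original as the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol value: text or a builtin.  */
typedef struct demo_value
{
  bool is_func;		/* True for a builtin token.  */
  const char *text;	/* Text contents when !is_func.  */
} demo_value;

typedef struct demo_argv demo_argv;
struct demo_argv
{
  unsigned int argc;		/* One more than the highest index.  */
  const char *argv0;		/* Macro name.  */
  const demo_argv *ref;		/* NULL when values are stored directly.  */
  unsigned int ref_index;	/* Index in ref that our argument 1 maps to.  */
  bool flatten;			/* Treat builtins as the empty string.  */
  const demo_value *values;	/* Used when !ref; values[i - 1] is arg i.  */
};

/* Shared placeholder for the empty string.  */
static const demo_value demo_empty = { false, "" };

/* Return argument INDEX (non-zero) of ARGV, following a reference
   to the underlying argument list when there is one.  */
const demo_value *
demo_arg (const demo_argv *argv, unsigned int index)
{
  const demo_value *value;

  assert (index != 0);
  if (argv->argc <= index)
    return &demo_empty;
  if (!argv->ref)
    return &argv->values[index - 1];
  value = demo_arg (argv->ref, argv->ref_index - 1 + index);
  if (argv->flatten && value->is_func)
    return &demo_empty;
  return value;
}

/* Build a view of ORIG named ARGV0.  If SKIP, argument 1 of the view
   is argument 2 of ORIG; otherwise all arguments are shared.  */
demo_argv
demo_make_ref (const demo_argv *orig, const char *argv0, bool skip,
	       bool flatten)
{
  unsigned int index = skip ? 2 : 1;
  demo_argv view;

  view.argv0 = argv0;
  view.ref = orig;
  view.ref_index = index;
  view.flatten = flatten;
  view.values = NULL;
  view.argc = index < orig->argc ? orig->argc - (index - 1) : 1;
  return view;
}
```

With an original call such as builtin(`len', `abc'), a view made with SKIP set
sees `abc' as its argument 1 without any pointer array being copied, and the
FLATTEN flag decides at lookup time whether a builtin token reads as empty.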
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHT6Fs84KuGfSFAYARAi99AJ46B4ofdmuEFgiquroBzBwrAcdxewCePlYD
/cscYIKrycCpW8ZR3lNsRz8=
=Rhmo
-----END PGP SIGNATURE-----
From 6128996ce3111b5fe1f7d879e11af37c44ee3a92 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Thu, 29 Nov 2007 21:26:22 -0700
Subject: [PATCH] Stage 4: route indir, builtin through ref; make argv opaque.
* m4/system_.h (obstack_regrow): Fix precedence.
* m4/m4module.h (m4_arg_equal, m4_arg_empty, m4_make_argv_ref):
New prototypes.
(struct m4_macro_args): Move...
* m4/m4private.h (struct m4_macro_args): ...here, making it opaque
to modules. Add has_ref member.
(bool_bitfield): New helper typedef.
(struct m4_symbol_chain): Add flatten and len members.
* m4/macro.c (empty_symbol): New placeholder, for optimizing
comparison with empty string.
(m4_macro_expand_input): Initialize it.
(collect_arguments): Alter signature, and populate new fields.
(trace_pre, trace_post): Remove redundant parameter.
(expand_macro): Alter handling of obstacks.
(m4_arg_symbol): Account for wrapped argv.
(m4_arg_equal, m4_arg_empty, m4_make_argv_ref): New methods.
(m4_arg_text, m4_arg_len, m4_arg_func): Use new methods.
* modules/m4.c (ifelse, syscmd): Likewise.
* modules/evalparse.c (m4_evaluate): Likewise.
(undefine, popdef, m4_dump_symbols): Optimize.
* modules/gnu.c (builtin, indir, esyscmd, debugfile): Use new
methods.
(changesyntax, regexp): Optimize.
* m4/output.c (diversion_storage): Use typedef.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 26 ++++++
m4/m4module.h | 25 +----
m4/m4private.h | 56 +++++++++---
m4/macro.c | 246 ++++++++++++++++++++++++++++++++++++++++-----------
m4/output.c | 2 +-
m4/system_.h | 6 +-
modules/evalparse.c | 3 +-
modules/gnu.c | 62 ++++---------
modules/m4.c | 38 ++++----
9 files changed, 312 insertions(+), 152 deletions(-)
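
A note on the "Fix precedence" entry for obstack_regrow: the change wraps the
SIZE parameter in parentheses. The callers visible in this patch pass plain
variables, so nothing should change for them, but an unparenthesized macro
parameter can misparse when the argument is an expression. The sketch below
shows the general hazard with integer arithmetic; the macro and function names
are hypothetical and are not part of m4.

```c
#include <assert.h>
#include <stddef.h>

/* Parameter used bare: an argument containing a low-precedence
   operator is split by the surrounding '+'.  */
#define END_UNSAFE(START, SIZE) ((START) + SIZE)

/* Parameter parenthesized: the argument is evaluated as a unit.  */
#define END_SAFE(START, SIZE) ((START) + (SIZE))

/* Expands to (start + wide) ? 8 : 4, so the sum is only used as
   the condition of the conditional expression.  */
size_t
end_unsafe (size_t start, int wide)
{
  return END_UNSAFE (start, wide ? 8 : 4);
}

/* Expands to start + (wide ? 8 : 4), which is what was meant.  */
size_t
end_safe (size_t start, int wide)
{
  return END_SAFE (start, wide ? 8 : 4);
}
```

For start 10 and wide 0, end_safe returns 14 while end_unsafe returns 8, since
the nonzero sum selects the first branch of the conditional.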
diff --git a/ChangeLog b/ChangeLog
index 695720d..55ee2ea 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,31 @@
2007-11-29 Eric Blake <address@hidden>
+ Stage 4: route indir, builtin through ref; make argv opaque.
+ * m4/system_.h (obstack_regrow): Fix precedence.
+ * m4/m4module.h (m4_arg_equal, m4_arg_empty, m4_make_argv_ref):
+ New prototypes.
+ (struct m4_macro_args): Move...
+ * m4/m4private.h (struct m4_macro_args): ...here, making it opaque
+ to modules. Add has_ref member.
+ (bool_bitfield): New helper typedef.
+ (struct m4_symbol_chain): Add flatten and len members.
+ * m4/macro.c (empty_symbol): New placeholder, for optimizing
+ comparison with empty string.
+ (m4_macro_expand_input): Initialize it.
+ (collect_arguments): Alter signature, and populate new fields.
+ (trace_pre, trace_post): Remove redundant parameter.
+ (expand_macro): Alter handling of obstacks.
+ (m4_arg_symbol): Account for wrapped argv.
+ (m4_arg_equal, m4_arg_empty, m4_make_argv_ref): New methods.
+ (m4_arg_text, m4_arg_len, m4_arg_func): Use new methods.
+ * modules/m4.c (ifelse, syscmd): Likewise.
+ * modules/evalparse.c (m4_evaluate): Likewise.
+ (undefine, popdef, m4_dump_symbols): Optimize.
+ * modules/gnu.c (builtin, indir, esyscmd, debugfile): Use new
+ methods.
+ (changesyntax, regexp): Optimize.
+ * m4/output.c (diversion_storage): Use typedef.
+
Stage 3b: cache length, rather than computing it, in modules.
* m4/hash.c (m4_hash_remove): Avoid double free on remove
failure.
diff --git a/m4/m4module.h b/m4/m4module.h
index 7ffaffd..8f3f590 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -77,26 +77,6 @@ struct m4_macro
const char *value;
};
-/* FIXME - make this struct opaque. */
-struct m4_macro_args
-{
- /* One more than the highest actual argument. May be larger than
- arraylen since the array can refer to multiple arguments via a
- single $@ reference. */
- unsigned int argc;
- /* False unless the macro expansion refers to $@; determines whether
- this object can be freed at end of macro expansion or must wait
- until all references have been rescanned. */
- bool inuse;
- const char *argv0; /* The macro name being expanded. */
- size_t argv0_len; /* Length of argv0. */
- size_t arraylen; /* True length of allocated elements in array. */
- /* Used as a variable-length array, storing information about each
- argument. */
- m4_symbol_value *array[FLEXIBLE_ARRAY_MEMBER];
-};
-
-
#define M4BUILTIN(name)                                         \
static void CONC (builtin_, name) \
(m4 *context, m4_obstack *obs, unsigned int argc, m4_macro_args *argv);
@@ -320,8 +300,13 @@ extern m4_symbol_value *m4_arg_symbol (m4_macro_args *, unsigned int);
extern bool m4_is_arg_text (m4_macro_args *, unsigned int);
extern bool m4_is_arg_func (m4_macro_args *, unsigned int);
extern const char *m4_arg_text (m4_macro_args *, unsigned int);
+extern bool m4_arg_equal (m4_macro_args *, unsigned int,
+ unsigned int);
+extern bool m4_arg_empty (m4_macro_args *, unsigned int);
extern size_t m4_arg_len (m4_macro_args *, unsigned int);
extern m4_builtin_func *m4_arg_func (m4_macro_args *, unsigned int);
+extern m4_macro_args *m4_make_argv_ref (m4_macro_args *, const char *, size_t,
+ bool, bool);
/* --- RUNTIME DEBUGGING --- */
diff --git a/m4/m4private.h b/m4/m4private.h
index 84e7157..8e23e00 100644
--- a/m4/m4private.h
+++ b/m4/m4private.h
@@ -41,6 +41,15 @@ typedef enum {
#define BIT_SET(flags, bit) ((flags) |= (bit))
#define BIT_RESET(flags, bit) ((flags) &= ~(bit))
+/* Gnulib's stdbool doesn't work with bool bitfields. For nicer
+ debugging, use bool when we know it works, but use the more
+ portable unsigned int elsewhere. */
+#if __GNUC__ > 2
+typedef bool bool_bitfield;
+#else
+typedef unsigned int bool_bitfield;
+#endif /* !__GNUC__ */
+
/* --- CONTEXT MANAGEMENT --- */
@@ -176,17 +185,19 @@ typedef struct m4_symbol_chain m4_symbol_chain;
struct m4_symbol
{
- bool traced;
- m4_symbol_value * value;
+ bool traced; /* True if this symbol is traced. */
+ m4_symbol_value *value; /* Linked list of pushdef'd values. */
};
/* Composite symbols are built of a linked list of chain objects. */
struct m4_symbol_chain
{
m4_symbol_chain *next;/* Pointer to next link of chain. */
- char *str; /* NUL-terminated string if text, else NULL. */
+ char *str; /* NUL-terminated string if text, or NULL. */
+ size_t len; /* Length of str, or 0. */
m4_macro_args *argv; /* Reference to earlier $@. */
- unsigned int index; /* Index within argv to start reading from. */
+ unsigned int index; /* Argument index within argv. */
+ bool flatten; /* True to treat builtins as text. */
};
/* A symbol value is used both for values associated with a macro
@@ -215,6 +226,29 @@ struct m4_symbol_value
} u;
};
+/* Structure describing all arguments to a macro, including the macro
+ name at index 0. */
+struct m4_macro_args
+{
+ /* One more than the highest actual argument. May be larger than
+ arraylen since the array can refer to multiple arguments via a
+ single $@ reference. */
+ unsigned int argc;
+ /* False unless the macro expansion refers to $@; determines whether
+ this object can be freed at end of macro expansion or must wait
+ until all references have been rescanned. */
+ bool_bitfield inuse : 1;
+ /* False if all arguments are just text or func, true if this argv
+ refers to another one. */
+ bool_bitfield has_ref : 1;
+ const char *argv0; /* The macro name being expanded. */
+ size_t argv0_len; /* Length of argv0. */
+ size_t arraylen; /* True length of allocated elements in array. */
+ /* Used as a variable-length array, storing information about each
+ argument. */
+ m4_symbol_value *array[FLEXIBLE_ARRAY_MEMBER];
+};
+
#define VALUE_NEXT(T) ((T)->next)
#define VALUE_MODULE(T) ((T)->module)
#define VALUE_FLAGS(T) ((T)->flags)
@@ -223,13 +257,13 @@ struct m4_symbol_value
#define VALUE_MAX_ARGS(T) ((T)->max_args)
#define VALUE_PENDING(T) ((T)->pending_expansions)
-#define SYMBOL_NEXT(S) (VALUE_NEXT ((S)->value))
-#define SYMBOL_MODULE(S) (VALUE_MODULE ((S)->value))
-#define SYMBOL_FLAGS(S) (VALUE_FLAGS ((S)->value))
-#define SYMBOL_ARG_SIGNATURE(S) (VALUE_ARG_SIGNATURE ((S)->value))
-#define SYMBOL_MIN_ARGS(S) (VALUE_MIN_ARGS ((S)->value))
-#define SYMBOL_MAX_ARGS(S) (VALUE_MAX_ARGS ((S)->value))
-#define SYMBOL_PENDING(S) (VALUE_PENDING ((S)->value))
+#define SYMBOL_NEXT(S) (VALUE_NEXT ((S)->value))
+#define SYMBOL_MODULE(S) (VALUE_MODULE ((S)->value))
+#define SYMBOL_FLAGS(S) (VALUE_FLAGS ((S)->value))
+#define SYMBOL_ARG_SIGNATURE(S) (VALUE_ARG_SIGNATURE ((S)->value))
+#define SYMBOL_MIN_ARGS(S) (VALUE_MIN_ARGS ((S)->value))
+#define SYMBOL_MAX_ARGS(S) (VALUE_MAX_ARGS ((S)->value))
+#define SYMBOL_PENDING(S) (VALUE_PENDING ((S)->value))
/* Fast macro versions of symbol table accessor functions,
that also have an identically named function exported in m4module.h. */
diff --git a/m4/macro.c b/m4/macro.c
index 5769f99..25fc7e7 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -30,8 +30,7 @@
#include "intprops.h"
static m4_macro_args *collect_arguments (m4 *, const char *, size_t,
- m4_symbol *, m4_obstack *,
- unsigned int, m4_obstack *);
+ m4_symbol *, m4_obstack *);
static void expand_macro (m4 *, const char *, size_t, m4_symbol *);
static void expand_token (m4 *, m4_obstack *, m4__token_type,
m4_symbol_value *, int);
@@ -42,9 +41,9 @@ static void process_macro (m4 *, m4_symbol_value *, m4_obstack *, int,
static void trace_prepre (m4 *, const char *, size_t,
m4_symbol_value *);
-static void trace_pre (m4 *, const char *, size_t, m4_macro_args *);
-static void trace_post (m4 *, const char *, size_t,
- m4_macro_args *, m4_input_block *, bool);
+static void trace_pre (m4 *, size_t, m4_macro_args *);
+static void trace_post (m4 *, size_t, m4_macro_args *,
+ m4_input_block *, bool);
static void trace_format (m4 *, const char *, ...)
M4_GNUC_PRINTF (2, 3);
@@ -63,13 +62,17 @@ static size_t macro_call_id = 0;
argv_stack. This stack can be used simultaneously by multiple
macro calls, using obstack_regrow to handle partial objects
embedded in the stack. */
-static struct obstack argc_stack;
+static m4_obstack argc_stack;
/* The shared stack of pointers to collected arguments for macro
calls. This object is never finished; we exploit the fact that
obstack_blank is documented to take a negative size to reduce the
size again. */
-static struct obstack argv_stack;
+static m4_obstack argv_stack;
+
+/* A placeholder symbol value representing the empty string, used to
+ optimize checks for emptiness. */
+static m4_symbol_value empty_symbol;
/* This function reads all input, and expands each token, one at a time. */
void
@@ -82,6 +85,8 @@ m4_macro_expand_input (m4 *context)
obstack_init (&argc_stack);
obstack_init (&argv_stack);
+ m4_set_symbol_value_text (&empty_symbol, "", 0);
+
while ((type = m4__next_token (context, &token, &line, NULL))
!= M4_TOKEN_EOF)
expand_token (context, (m4_obstack *) NULL, type, &token, line);
@@ -251,7 +256,8 @@ expand_argument (m4 *context, m4_obstack *obs, m4_symbol_value *argp,
static void
expand_macro (m4 *context, const char *name, size_t len, m4_symbol *symbol)
{
- char *argc_base = NULL; /* Base of argc_stack on entry. */
+ void *argc_base = NULL; /* Base of argc_stack on entry. */
+ void *argv_base = NULL; /* Base of argv_stack on entry. */
unsigned int argc_size; /* Size of argc_stack on entry. */
unsigned int argv_size; /* Size of argv_stack on entry. */
m4_macro_args *argv;
@@ -296,17 +302,14 @@ recursion limit of %zu exceeded, use -L<N> to change it"),
argc_size = obstack_object_size (&argc_stack);
argv_size = obstack_object_size (&argv_stack);
- if (0 < argc_size)
- argc_base = obstack_finish (&argc_stack);
+ argc_base = obstack_finish (&argc_stack);
+ if (0 < argv_size)
+ argv_base = obstack_finish (&argv_stack);
if (traced && m4_is_debug_bit (context, M4_DEBUG_TRACE_CALL))
trace_prepre (context, name, my_call_id, value);
- argv = collect_arguments (context, name, len, symbol, &argv_stack,
- argv_size, &argc_stack);
- /* Calling collect_arguments invalidated name, but we copied it as
- argv[0]. */
- name = argv->argv0;
+ argv = collect_arguments (context, name, len, symbol, &argc_stack);
loc_close_file = m4_get_current_file (context);
loc_close_line = m4_get_current_line (context);
@@ -314,14 +317,14 @@ recursion limit of %zu exceeded, use -L<N> to change it"),
m4_set_current_line (context, loc_open_line);
if (traced)
- trace_pre (context, name, my_call_id, argv);
+ trace_pre (context, my_call_id, argv);
expansion = m4_push_string_init (context);
m4_macro_call (context, value, expansion, argv->argc, argv);
expanded = m4_push_string_finish ();
if (traced)
- trace_post (context, name, my_call_id, argv, expanded, trace_expansion);
+ trace_post (context, my_call_id, argv, expanded, trace_expansion);
m4_set_current_file (context, loc_close_file);
m4_set_current_line (context, loc_close_line);
@@ -335,20 +338,21 @@ recursion limit of %zu exceeded, use -L<N> to change it"),
if (0 < argc_size)
obstack_regrow (&argc_stack, argc_base, argc_size);
else
- obstack_free (&argc_stack, (void *) name);
- obstack_blank (&argv_stack, argv_size - obstack_object_size (&argv_stack));
+ obstack_free (&argc_stack, argc_base);
+ if (0 < argv_size)
+ obstack_regrow (&argv_stack, argv_base, argv_size);
+ else
+ obstack_free (&argv_stack, argv);
}
/* Collect all the arguments to a call of the macro SYMBOL (called
NAME, with length LEN). The arguments are stored on the obstack
ARGUMENTS and a table of pointers to the arguments on the obstack
- ARGPTR. ARGPTR is an incomplete object, currently occupying
- ARGV_BASE bytes. Return the object describing all of the macro
+ argv_stack. Return the object describing all of the macro
arguments. */
static m4_macro_args *
collect_arguments (m4 *context, const char *name, size_t len,
- m4_symbol *symbol, m4_obstack *argptr,
- unsigned int argv_base, m4_obstack *arguments)
+ m4_symbol *symbol, m4_obstack *arguments)
{
m4_symbol_value token;
m4_symbol_value *tokenp;
@@ -361,10 +365,13 @@ collect_arguments (m4 *context, const char *name, size_t len,
args.argc = 1;
args.inuse = false;
+ args.has_ref = false;
+ /* FIXME - add accessor to symtab that returns name from the hash
+ table, so we don't have to copy it here. */
args.argv0 = (char *) obstack_copy0 (arguments, name, len);
args.argv0_len = len;
args.arraylen = 0;
- obstack_grow (argptr, &args, offsetof (m4_macro_args, array));
+ obstack_grow (&argv_stack, &args, offsetof (m4_macro_args, array));
name = args.argv0;
if (m4__next_token_is_open (context))
@@ -374,20 +381,20 @@ collect_arguments (m4 *context, const char *name, size_t len,
{
more_args = expand_argument (context, arguments, &token, name);
- if (!groks_macro_args && m4_is_symbol_value_func (&token))
- {
- VALUE_MODULE (&token) = NULL;
- m4_set_symbol_value_text (&token, "", 0);
- }
- tokenp = (m4_symbol_value *) obstack_copy (arguments, &token,
- sizeof token);
- obstack_ptr_grow (argptr, tokenp);
+ if ((m4_is_symbol_value_text (&token)
+ && !m4_get_symbol_value_len (&token))
+ || (!groks_macro_args && m4_is_symbol_value_func (&token)))
+ tokenp = &empty_symbol;
+ else
+ tokenp = (m4_symbol_value *) obstack_copy (arguments, &token,
+ sizeof *tokenp);
+ obstack_ptr_grow (&argv_stack, tokenp);
args.arraylen++;
args.argc++;
}
while (more_args);
}
- argv = (m4_macro_args *) ((char *) obstack_base (argptr) + argv_base);
+ argv = (m4_macro_args *) obstack_finish (&argv_stack);
argv->argc = args.argc;
argv->arraylen = args.arraylen;
return argv;
@@ -536,11 +543,11 @@ process_macro (m4 *context, m4_symbol_value *value, m4_obstack *obs,
-/* The rest of this file contains the functions for macro tracing output.
- All tracing output for a macro call is collected on an obstack TRACE,
- and printed whenever the line is complete. This prevents tracing
- output from interfering with other debug messages generated by the
- various builtins. */
+/* The next portion of this file contains the functions for macro
+ tracing output. All tracing output for a macro call is collected
+ on an obstack TRACE, and printed whenever the line is complete.
+ This prevents tracing output from interfering with other debug
+ messages generated by the various builtins. */
/* Tracing output is formatted here, by a simplified printf-to-obstack
function trace_format (). Understands only %s, %d, %zu (size_t
@@ -653,13 +660,13 @@ trace_prepre (m4 *context, const char *name, size_t id, m4_symbol_value *value)
/* Format the parts of a trace line, that can be made before the macro is
actually expanded. Used from expand_macro (). */
static void
-trace_pre (m4 *context, const char *name, size_t id, m4_macro_args *argv)
+trace_pre (m4 *context, size_t id, m4_macro_args *argv)
{
unsigned int i;
unsigned int argc = m4_arg_argc (argv);
trace_header (context, id);
- trace_format (context, "%s", name);
+ trace_format (context, "%s", M4ARG (0));
if (1 < argc && m4_is_debug_bit (context, M4_DEBUG_TRACE_ARGS))
{
@@ -686,9 +693,8 @@ trace_pre (m4 *context, const char *name, size_t id, m4_macro_args *argv)
/* Format the final part of a trace line and print it all. Used from
expand_macro (). */
static void
-trace_post (m4 *context, const char *name, size_t id,
- m4_macro_args *argv, m4_input_block *expanded,
- bool trace_expansion)
+trace_post (m4 *context, size_t id, m4_macro_args *argv,
+ m4_input_block *expanded, bool trace_expansion)
{
if (trace_expansion)
{
@@ -703,12 +709,43 @@ trace_post (m4 *context, const char *name, size_t id,
/* Accessors into m4_macro_args. */
/* Given ARGV, return the symbol value at the specified INDEX, which
- must be non-zero and less than argc. */
+ must be non-zero. */
m4_symbol_value *
m4_arg_symbol (m4_macro_args *argv, unsigned int index)
{
- assert (index && index < argv->argc);
- return argv->array[index - 1];
+ unsigned int i;
+ m4_symbol_value *value;
+
+ assert (index);
+ if (argv->argc <= index)
+ return &empty_symbol;
+
+ if (!argv->has_ref)
+ return argv->array[index - 1];
+ /* Must cycle through all array slots until we find index, since
+ wrappers can contain multiple arguments. */
+ for (i = 0; i < argv->arraylen; i++)
+ {
+ value = argv->array[i];
+ if (value->type == M4_SYMBOL_COMP)
+ {
+ m4_symbol_chain *chain = value->u.chain;
+ /* TODO - for now we support only a single $@ chain. */
+ assert (!chain->next && !chain->str);
+ if (index < chain->argv->argc - (chain->index - 1))
+ {
+ value = m4_arg_symbol (chain->argv, chain->index - 1 + index);
+ if (chain->flatten && m4_is_symbol_value_func (value))
+ value = &empty_symbol;
+ break;
+ }
+ index -= chain->argv->argc - chain->index;
+ }
+ else if (--index == 0)
+ break;
+ }
+ assert (value->type != M4_SYMBOL_COMP);
+ return value;
}
/* Given ARGV, return true if argument INDEX is text. Index 0 is
@@ -737,13 +774,45 @@ m4_is_arg_func (m4_macro_args *argv, unsigned int index)
const char *
m4_arg_text (m4_macro_args *argv, unsigned int index)
{
+ m4_symbol_value *value;
+
if (index == 0)
return argv->argv0;
if (argv->argc <= index)
return "";
- if (!m4_is_symbol_value_text (argv->array[index - 1]))
+ value = m4_arg_symbol (argv, index);
+ if (!m4_is_symbol_value_text (value))
return NULL;
- return m4_get_symbol_value_text (argv->array[index - 1]);
+ return m4_get_symbol_value_text (value);
+}
+
+/* Given ARGV, compare text arguments INDEXA and INDEXB for equality.
+ Both indices must be non-zero. Return true if the arguments
+ contain the same contents; often more efficient than
+ !strcmp (m4_arg_text (argv, indexa), m4_arg_text (argv, indexb)). */
+bool
+m4_arg_equal (m4_macro_args *argv, unsigned int indexa, unsigned int indexb)
+{
+ m4_symbol_value *sa = m4_arg_symbol (argv, indexa);
+ m4_symbol_value *sb = m4_arg_symbol (argv, indexb);
+
+ if (sa == &empty_symbol || sb == &empty_symbol)
+ return sa == sb;
+ /* TODO - allow builtin tokens in the comparison? */
+ assert (m4_is_symbol_value_text (sa) && m4_is_symbol_value_text (sb));
+ return (m4_get_symbol_value_len (sa) == m4_get_symbol_value_len (sb)
+ && strcmp (m4_get_symbol_value_text (sa),
+ m4_get_symbol_value_text (sb)) == 0);
+}
+
+/* Given ARGV, return true if argument INDEX is the empty string.
+ This gives the same result as comparing m4_arg_len against 0, but
+ is often faster. */
+bool
+m4_arg_empty (m4_macro_args *argv, unsigned int index)
+{
+ return (index ? m4_arg_symbol (argv, index) == &empty_symbol
+ : !argv->argv0_len);
}
/* Given ARGV, return the length of argument INDEX, or SIZE_MAX if the
@@ -751,13 +820,16 @@ m4_arg_text (m4_macro_args *argv, unsigned int index)
size_t
m4_arg_len (m4_macro_args *argv, unsigned int index)
{
+ m4_symbol_value *value;
+
if (index == 0)
return argv->argv0_len;
if (argv->argc <= index)
return 0;
- if (!m4_is_symbol_value_text (argv->array[index - 1]))
+ value = m4_arg_symbol (argv, index);
+ if (!m4_is_symbol_value_text (value))
return SIZE_MAX;
- return m4_get_symbol_value_len (argv->array[index - 1]);
+ return m4_get_symbol_value_len (value);
}
/* Given ARGV, return the builtin function referenced by argument
@@ -766,10 +838,78 @@ m4_arg_len (m4_macro_args *argv, unsigned int index)
m4_builtin_func *
m4_arg_func (m4_macro_args *argv, unsigned int index)
{
- if (index == 0 || argv->argc <= index
- || !m4_is_symbol_value_func (argv->array[index - 1]))
+ m4_symbol_value *value;
+
+ if (index == 0 || argv->argc <= index)
+ return NULL;
+ value = m4_arg_symbol (argv, index);
+ if (!m4_is_symbol_value_func (value))
return NULL;
- return m4_get_symbol_value_func (argv->array[index - 1]);
+ return m4_get_symbol_value_func (value);
+}
+
+/* Create a new argument object using the same obstack as ARGV; thus,
+ the new object will automatically be freed when the original is
+ freed. Explicitly set the macro name (argv[0]) from ARGV0 with
+ length ARGV0_LEN. If SKIP, set argv[1] of the new object to
+ argv[2] of the old, otherwise the objects share all arguments. If
+ FLATTEN, any builtins in ARGV are flattened to an empty string when
+ referenced through the new object. */
+m4_macro_args *
+m4_make_argv_ref (m4_macro_args *argv, const char *argv0, size_t argv0_len,
+ bool skip, bool flatten)
+{
+ m4_macro_args *new_argv;
+ m4_symbol_value *value;
+ m4_symbol_chain *chain;
+ unsigned int index = skip ? 2 : 1;
+
+ assert (obstack_object_size (&argv_stack) == 0);
+ /* When making a reference through a reference, point to the
+ original if possible. */
+ if (argv->has_ref)
+ {
+ /* TODO for now we support only a single-length $@ chain. */
+ assert (argv->arraylen == 1 && argv->array[0]->type == M4_SYMBOL_COMP);
+ chain = argv->array[0]->u.chain;
+ assert (!chain->next && !chain->str);
+ argv = chain->argv;
+ index += chain->index - 1;
+ }
+ if (argv->argc <= index)
+ {
+ new_argv = (m4_macro_args *) obstack_alloc (&argv_stack,
+ offsetof (m4_macro_args,
+ array));
+ new_argv->arraylen = 0;
+ new_argv->has_ref = false;
+ }
+ else
+ {
+ new_argv = (m4_macro_args *) obstack_alloc (&argv_stack,
+ (offsetof (m4_macro_args,
+ array)
+ + sizeof value));
+ value = (m4_symbol_value *) obstack_alloc (&argv_stack, sizeof *value);
+ chain = (m4_symbol_chain *) obstack_alloc (&argv_stack, sizeof *chain);
+ new_argv->arraylen = 1;
+ new_argv->array[0] = value;
+ new_argv->has_ref = true;
+ value->type = M4_SYMBOL_COMP;
+ value->u.chain = chain;
+ chain->next = NULL;
+ chain->str = NULL;
+ chain->len = 0;
+ chain->argv = argv;
+ chain->index = index;
+ chain->flatten = flatten;
+ }
+ /* TODO - should argv->inuse be set? */
+ new_argv->argc = argv->argc - (index - 1);
+ new_argv->inuse = false;
+ new_argv->argv0 = argv0;
+ new_argv->argv0_len = argv0_len;
+ return new_argv;
}
/* Define these last, so that earlier uses can benefit from the macros
diff --git a/m4/output.c b/m4/output.c
index ed2a451..8089073 100644
--- a/m4/output.c
+++ b/m4/output.c
@@ -83,7 +83,7 @@ static m4_diversion div0;
static m4_diversion *free_list;
/* Obstack from which diversion storage is allocated. */
-static struct obstack diversion_storage;
+static m4_obstack diversion_storage;
/* Total size of all in-memory buffer sizes. */
static size_t total_buffer_size;
diff --git a/m4/system_.h b/m4/system_.h
index e014d75..64ca73c 100644
--- a/m4/system_.h
+++ b/m4/system_.h
@@ -53,9 +53,9 @@
of an object on the stack. Reopen OBJECT (previously returned by
obstack_alloc or obstack_finish) with SIZE for additional growth,
freeing all objects that occur later in the stack. */
-#define obstack_regrow(OBS, OBJECT, SIZE) \
- (obstack_free (OBS, (char *)(OBJECT) + SIZE), \
- (OBS)->object_base = (char *)(OBJECT))
+#define obstack_regrow(OBS, OBJECT, SIZE) \
+ (obstack_free (OBS, (char *) (OBJECT) + (SIZE)), \
+ (OBS)->object_base = (char *) (OBJECT))
/* In addition to EXIT_SUCCESS and EXIT_FAILURE, m4 can fail with version
mismatch when trying to load a frozen file produced by a newer m4 than
diff --git a/modules/evalparse.c b/modules/evalparse.c
index e21a081..39b0d41 100644
--- a/modules/evalparse.c
+++ b/modules/evalparse.c
@@ -896,7 +896,8 @@ m4_evaluate (m4 *context, m4_obstack *obs, unsigned int argc,
eval_token et;
eval_error err = NO_ERROR;
- if (*M4ARG (2) && !m4_numeric_arg (context, me, M4ARG (2), &radix))
+ if (!m4_arg_empty (argv, 2)
+ && !m4_numeric_arg (context, me, M4ARG (2), &radix))
return;
if (radix < 1 || radix > 36)
diff --git a/modules/gnu.c b/modules/gnu.c
index bc34692..3c772c5 100644
--- a/modules/gnu.c
+++ b/modules/gnu.c
@@ -444,26 +444,11 @@ M4BUILTIN_HANDLER (builtin)
bp->min_args, bp->max_args,
(bp->flags & M4_BUILTIN_SIDE_EFFECT) != 0))
{
- unsigned int i;
- /* TODO - make use of $@ reference. */
- /* TODO - add accessor that performs this construction. */
m4_macro_args *new_argv;
- new_argv = xmalloc (offsetof (m4_macro_args, array)
- + ((argc - 2) * sizeof (m4_symbol_value *)));
- new_argv->argc = argc - 1;
- new_argv->inuse = false;
- new_argv->argv0 = name;
- new_argv->argv0_len = m4_arg_len (argv, 1);
- new_argv->arraylen = argc - 2;
- memcpy (&new_argv->array[0], &argv->array[1],
- (argc - 2) * sizeof (m4_symbol_value *));
- if ((bp->flags & M4_BUILTIN_GROKS_MACRO) == 0)
- for (i = 2; i < argc; i++)
- if (!m4_is_arg_text (argv, i))
- m4_set_symbol_value_text (m4_arg_symbol (new_argv, i - 1),
- "", 0);
+ bool flatten = (bp->flags & M4_BUILTIN_GROKS_MACRO) == 0;
+ new_argv = m4_make_argv_ref (argv, name, m4_arg_len (argv, 1),
+ true, flatten);
bp->func (context, obs, argc - 1, new_argv);
- free (new_argv);
}
free (value);
}
@@ -508,6 +493,7 @@ M4BUILTIN_HANDLER (changeresyntax)
**/
M4BUILTIN_HANDLER (changesyntax)
{
+ const char *me = M4ARG (0);
M4_MODULE_IMPORT (m4, m4_expand_ranges);
if (m4_expand_ranges)
@@ -533,7 +519,7 @@ M4BUILTIN_HANDLER (changesyntax)
}
if (m4_set_syntax (M4SYNTAX, key, action,
key ? m4_expand_ranges (spec, obs) : "") < 0)
- m4_warn (context, 0, M4ARG (0), _("undefined syntax code: `%c'"),
+ m4_warn (context, 0, me, _("undefined syntax code: `%c'"),
key);
}
}
@@ -554,7 +540,7 @@ M4BUILTIN_HANDLER (debugfile)
if (argc == 1)
m4_debug_set_output (context, me, NULL);
- else if (m4_get_safer_opt (context) && *M4ARG (1))
+ else if (m4_get_safer_opt (context) && !m4_arg_empty (argv, 1))
m4_error (context, 0, 0, me, _("disabled by --safer"));
else if (!m4_debug_set_output (context, me, M4ARG (1)))
m4_error (context, 0, errno, me, _("cannot set debug file `%s'"),
@@ -613,6 +599,7 @@ M4BUILTIN_HANDLER (debugmode)
M4BUILTIN_HANDLER (esyscmd)
{
+ const char *me = M4ARG (0);
M4_MODULE_IMPORT (m4, m4_set_sysval);
M4_MODULE_IMPORT (m4, m4_sysval_flush);
@@ -623,12 +610,12 @@ M4BUILTIN_HANDLER (esyscmd)
if (m4_get_safer_opt (context))
{
- m4_error (context, 0, 0, M4ARG (0), _("disabled by --safer"));
+ m4_error (context, 0, 0, me, _("disabled by --safer"));
return;
}
/* Optimize the empty command. */
- if (*M4ARG (1) == '\0')
+ if (m4_arg_empty (argv, 1))
{
m4_set_sysval (0);
return;
@@ -639,14 +626,14 @@ M4BUILTIN_HANDLER (esyscmd)
pin = popen (M4ARG (1), "r");
if (pin == NULL)
{
- m4_error (context, 0, errno, M4ARG (0),
+ m4_error (context, 0, errno, me,
_("cannot open pipe to command `%s'"), M4ARG (1));
m4_set_sysval (-1);
}
else
{
while ((ch = getc (pin)) != EOF)
- obstack_1grow (obs, (char) ch);
+ obstack_1grow (obs, ch);
m4_set_sysval (pclose (pin));
}
}
@@ -690,27 +677,12 @@ M4BUILTIN_HANDLER (indir)
m4_warn (context, 0, me, _("undefined macro `%s'"), name);
else
{
- unsigned int i;
- /* TODO - make use of $@ reference. */
- /* TODO - add accessor that performs this construction. */
m4_macro_args *new_argv;
- new_argv = xmalloc (offsetof (m4_macro_args, array)
- + ((argc - 2) * sizeof (m4_symbol_value *)));
- new_argv->argc = argc - 1;
- new_argv->inuse = false;
- new_argv->argv0 = name;
- new_argv->argv0_len = m4_arg_len (argv, 1);
- new_argv->arraylen = argc - 2;
- memcpy (&new_argv->array[0], &argv->array[1],
- (argc - 2) * sizeof (m4_symbol_value *));
- if (!m4_symbol_groks_macro (symbol))
- for (i = 2; i < argc; i++)
- if (!m4_is_arg_text (argv, i))
- m4_set_symbol_value_text (m4_arg_symbol (new_argv, i - 1),
- "", 0);
+ bool flatten = !m4_symbol_groks_macro (symbol);
+ new_argv = m4_make_argv_ref (argv, name, m4_arg_len (argv, 1), true,
+ flatten);
m4_macro_call (context, m4_get_symbol_value (symbol), obs,
argc - 1, new_argv);
- free (new_argv);
}
}
}
@@ -793,6 +765,7 @@ M4BUILTIN_HANDLER (patsubst)
M4BUILTIN_HANDLER (regexp)
{
const char *me; /* name of this macro */
+ const char *victim; /* string to search */
const char *pattern; /* regular expression */
const char *replace; /* optional replacement string */
m4_pattern_buffer *buf; /* compiled regular expression */
@@ -845,8 +818,9 @@ M4BUILTIN_HANDLER (regexp)
if (!buf)
return;
+ victim = M4ARG (1);
len = m4_arg_len (argv, 1);
- startpos = regexp_search (buf, M4ARG (1), len, 0, len, replace == NULL);
+ startpos = regexp_search (buf, victim, len, 0, len, replace == NULL);
if (startpos == -2)
{
@@ -858,7 +832,7 @@ M4BUILTIN_HANDLER (regexp)
if (replace == NULL)
m4_shipout_int (obs, startpos);
else if (startpos >= 0)
- substitute (context, obs, me, M4ARG (1), replace, buf);
+ substitute (context, obs, me, victim, replace, buf);
}
diff --git a/modules/m4.c b/modules/m4.c
index 827fabb..f9d65ed 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -178,13 +178,14 @@ M4BUILTIN_HANDLER (define)
M4BUILTIN_HANDLER (undefine)
{
+ const char *me = M4ARG (0);
unsigned int i;
for (i = 1; i < argc; i++)
{
const char *name = M4ARG (i);
if (!m4_symbol_lookup (M4SYMTAB, name))
- m4_warn (context, 0, M4ARG (0), _("undefined macro `%s'"), name);
+ m4_warn (context, 0, me, _("undefined macro `%s'"), name);
else
m4_symbol_delete (M4SYMTAB, name);
}
@@ -209,13 +210,14 @@ M4BUILTIN_HANDLER (pushdef)
M4BUILTIN_HANDLER (popdef)
{
+ const char *me = M4ARG (0);
unsigned int i;
for (i = 1; i < argc; i++)
{
const char *name = M4ARG (i);
if (!m4_symbol_lookup (M4SYMTAB, name))
- m4_warn (context, 0, M4ARG (0), _("undefined macro `%s'"), name);
+ m4_warn (context, 0, me, _("undefined macro `%s'"), name);
else
m4_symbol_popdef (M4SYMTAB, name);
}
@@ -240,10 +242,7 @@ M4BUILTIN_HANDLER (ifelse)
/* The valid ranges of argc for ifelse is discontinuous, we cannot
rely on the regular mechanisms. */
- if (argc == 2)
- return;
-
- if (m4_bad_argc (context, argc, me, 3, -1, false))
+ if (argc == 2 || m4_bad_argc (context, argc, me, 3, -1, false))
return;
else if (argc % 3 == 0)
/* Diagnose excess arguments if 5, 8, 11, etc., actual arguments. */
@@ -254,7 +253,7 @@ M4BUILTIN_HANDLER (ifelse)
while (1)
{
- if (strcmp (M4ARG (index), M4ARG (index + 1)) == 0)
+ if (m4_arg_equal (argv, index, index + 1))
{
obstack_grow (obs, M4ARG (index + 2), m4_arg_len (argv, index + 2));
return;
@@ -317,6 +316,7 @@ void
m4_dump_symbols (m4 *context, m4_dump_symbol_data *data, unsigned int argc,
m4_macro_args *argv, bool complain)
{
+ const char *me = M4ARG (0);
assert (obstack_object_size (data->obs) == 0);
data->size = obstack_room (data->obs) / sizeof (const char *);
@@ -329,12 +329,12 @@ m4_dump_symbols (m4 *context, m4_dump_symbol_data *data, unsigned int argc,
for (i = 1; i < argc; i++)
{
- symbol = m4_symbol_lookup (M4SYMTAB, M4ARG (i));
+ const char *name = M4ARG (i);
+ symbol = m4_symbol_lookup (M4SYMTAB, name);
if (symbol != NULL)
- dump_symbol_CB (NULL, M4ARG (i), symbol, data);
+ dump_symbol_CB (NULL, name, symbol, data);
else if (complain)
- m4_warn (context, 0, M4ARG (0), _("undefined macro `%s'"),
- M4ARG (i));
+ m4_warn (context, 0, me, _("undefined macro `%s'"), name);
}
}
@@ -508,14 +508,14 @@ m4_sysval_flush (m4 *context, bool report)
M4BUILTIN_HANDLER (syscmd)
{
- if (m4_get_safer_opt (context))
- {
- m4_error (context, 0, 0, M4ARG (0), _("disabled by --safer"));
- return;
- }
-
- /* Optimize the empty command. */
- if (*M4ARG (1) == '\0')
+ if (m4_get_safer_opt (context))
+ {
+ m4_error (context, 0, 0, M4ARG (0), _("disabled by --safer"));
+ return;
+ }
+
+ /* Optimize the empty command. */
+ if (m4_arg_empty (argv, 1))
{
m4_set_sysval (0);
return;
--
1.5.3.5
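
The m4_arg_equal and m4_arg_empty calls above rely on two ideas that the branch patch below also uses: every empty argument shares one placeholder token, and text comparison checks cached lengths before touching the bytes. The following is only a minimal sketch of that idea, not the code from either tree. The names text_token, intern_token, token_empty and token_equal are hypothetical, and memcmp stands in for the strcmp that the patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a text token with a cached length.  */
struct text_token
{
  const char *text;
  size_t len;
};

/* One shared instance represents every empty argument.  */
static const struct text_token empty_token = { "", 0 };

/* Return the shared placeholder for empty TEXT; otherwise fill in
   STORAGE and return it.  */
static const struct text_token *
intern_token (struct text_token *storage, const char *text)
{
  size_t len = strlen (text);
  if (len == 0)
    return &empty_token;
  storage->text = text;
  storage->len = len;
  return storage;
}

/* Emptiness becomes a pointer comparison.  */
static bool
token_empty (const struct text_token *t)
{
  return t == &empty_token;
}

/* Equality: settle the empty case by identity, then compare lengths,
   and only then compare bytes.  */
static bool
token_equal (const struct text_token *a, const struct text_token *b)
{
  if (a == &empty_token || b == &empty_token)
    return a == b;
  return a->len == b->len && memcmp (a->text, b->text, a->len) == 0;
}
```

With this layout, comparing "foo" against "foobar" returns false without reading either string, which is the case the ifelse change is meant to speed up.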
From 9db4aab8ae052e044e1e5b024421d75e91f45b92 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Fri, 19 Oct 2007 21:45:38 -0600
Subject: [PATCH] Stage 4: route indir, builtin through ref; make argv opaque.
* src/m4.h (obstack_regrow): Borrow definition from head.
(struct token_chain): Add flatten and len members.
(arg_equal, arg_empty, make_argv_ref): New prototypes.
(struct macro_arguments): Move...
* src/macro.c (struct macro_arguments): ...here, making it
opaque. Add has_ref member.
(empty_token): New placeholder, for optimizing comparison with
empty string.
(collect_arguments): Change signature, and populate new fields.
(expand_macro): Alter handling of obstacks.
(arg_token): New helper method.
(arg_equal, arg_empty, make_argv_ref): New methods.
(arg_type, arg_text, arg_len, arg_func): Use new methods.
* src/builtin.c (m4_ifelse, m4_builtin, m4_indir, m4_eval):
Likewise.
* src/format.c (format): Likewise.
(cherry picked from commit ab7d5ea40dd30e38cdafdfa69e868390ff6f72ab)
Signed-off-by: Eric Blake <address@hidden>
---
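
For readers unfamiliar with the opaque-struct pattern this stage applies: the header keeps only an incomplete type and accessor prototypes, while the layout lives in one implementation file. The sketch below shows both halves in a single file for brevity. It is an illustration under assumptions, not the m4 code: arg_vec and its functions are hypothetical names, it uses malloc rather than obstacks, and it handles text arguments only. The out-of-range behaviour (empty string, length 0) mirrors what arg_text and arg_len do in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* What a header would expose: an incomplete type plus accessors.  */
typedef struct arg_vec arg_vec;
arg_vec *arg_vec_new (const char *name, unsigned int nargs,
                      const char *const *args);
const char *arg_vec_text (const arg_vec *v, unsigned int index);
size_t arg_vec_len (const arg_vec *v, unsigned int index);
void arg_vec_free (arg_vec *v);

/* What the implementation file would keep private.  */
struct arg_vec
{
  unsigned int argc;    /* Argument count, including the name.  */
  const char *argv0;    /* The macro name, index 0.  */
  size_t argv0_len;     /* Cached length of argv0.  */
  const char *array[];  /* Arguments 1 .. argc - 1.  */
};

arg_vec *
arg_vec_new (const char *name, unsigned int nargs, const char *const *args)
{
  unsigned int i;
  arg_vec *v = malloc (sizeof *v + nargs * sizeof v->array[0]);
  if (v == NULL)
    return NULL;
  v->argc = nargs + 1;
  v->argv0 = name;
  v->argv0_len = strlen (name);
  for (i = 0; i < nargs; i++)
    v->array[i] = args[i];
  return v;
}

const char *
arg_vec_text (const arg_vec *v, unsigned int index)
{
  if (index == 0)
    return v->argv0;
  if (index >= v->argc)
    return "";  /* Missing arguments read as empty.  */
  return v->array[index - 1];
}

size_t
arg_vec_len (const arg_vec *v, unsigned int index)
{
  if (index == 0)
    return v->argv0_len;
  if (index >= v->argc)
    return 0;
  return strlen (v->array[index - 1]);
}

void
arg_vec_free (arg_vec *v)
{
  free (v);
}
```

Because callers can no longer reach into the struct, the representation behind the accessors is free to change, which is what allows a later argument to be backed by a reference instead of a copied pointer.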
ChangeLog | 18 +++
src/builtin.c | 88 +++------------
src/format.c | 2 +-
src/m4.h | 37 +++----
src/macro.c | 326 ++++++++++++++++++++++++++++++++++++++++++++++-----------
5 files changed, 317 insertions(+), 154 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 7cd6fc8..0662337 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,23 @@
2007-11-29 Eric Blake <address@hidden>
+ Stage 4: route indir, builtin through ref; make argv opaque.
+ * src/m4.h (obstack_regrow): Borrow definition from head.
+ (struct token_chain): Add flatten and len members.
+ (arg_equal, arg_empty, make_argv_ref): New prototypes.
+ (struct macro_arguments): Move...
+ * src/macro.c (struct macro_arguments): ...here, making it
+ opaque. Add has_ref member.
+ (empty_token): New placeholder, for optimizing comparison with
+ empty string.
+ (collect_arguments): Change signature, and populate new fields.
+ (expand_macro): Alter handling of obstacks.
+ (arg_token): New helper method.
+ (arg_equal, arg_empty, make_argv_ref): New methods.
+ (arg_type, arg_text, arg_len, arg_func): Use new methods.
+ * src/builtin.c (m4_ifelse, m4_builtin, m4_indir, m4_eval):
+ Likewise.
+ * src/format.c (format): Likewise.
+
Stage 3: cache length, rather than computing it.
* src/input.c (next_token): Grab length from obstack rather than
calling strlen.
diff --git a/src/builtin.c b/src/builtin.c
index e719cdd..b4053f1 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -758,16 +758,10 @@ m4_ifdef (struct obstack *obs, int argc, macro_arguments *argv)
static void
m4_ifelse (struct obstack *obs, int argc, macro_arguments *argv)
{
- const char *result;
- const char *me;
+ const char *me = ARG (0);
int index;
- size_t len = 0;
-
- if (argc == 2)
- return;
- me = ARG (0);
- if (bad_argc (me, argc, 3, -1))
+ if (argc == 2 || bad_argc (me, argc, 3, -1))
return;
else if (argc % 3 == 0)
/* Diagnose excess arguments if 5, 8, 11, etc., actual arguments. */
@@ -776,17 +770,13 @@ m4_ifelse (struct obstack *obs, int argc, macro_arguments *argv)
index = 1;
argc--;
- result = NULL;
- while (result == NULL)
-
- if (arg_len (argv, index) == arg_len (argv, index + 1)
- && strcmp (ARG (index), ARG (index + 1)) == 0)
- {
- result = ARG (index + 2);
- len = arg_len (argv, index + 2);
- }
-
- else
+ while (true)
+ {
+ if (arg_equal (argv, index, index + 1))
+ {
+ obstack_grow (obs, ARG (index + 2), arg_len (argv, index + 2));
+ return;
+ }
switch (argc)
{
case 3:
@@ -794,16 +784,14 @@ m4_ifelse (struct obstack *obs, int argc, macro_arguments *argv)
case 4:
case 5:
- result = ARG (index + 3);
- len = arg_len (argv, index + 3);
- break;
+ obstack_grow (obs, ARG (index + 3), arg_len (argv, index + 3));
+ return;
default:
argc -= 3;
index += 3;
}
-
- obstack_grow (obs, result, len);
+ }
}
/*---------------------------------------------------------------------.
@@ -944,29 +932,9 @@ m4_builtin (struct obstack *obs, int argc, macro_arguments *argv)
m4_warn (0, me, _("undefined builtin `%s'"), name);
else
{
- int i;
- /* TODO make use of $@ reference, instead of copying argv. */
- /* TODO make accessor in macro.c that performs this
- construction, so that argv can be opaque type. */
- macro_arguments *new_argv = xmalloc (offsetof (macro_arguments, array)
- + ((argc - 2)
- * sizeof (token_data *)));
- new_argv->argc = argc - 1;
- new_argv->inuse = false;
- new_argv->argv0 = name;
- new_argv->argv0_len = arg_len (argv, 1);
- new_argv->arraylen = argc - 2;
- memcpy (&new_argv->array[0], &argv->array[1],
- (argc - 2) * sizeof (token_data *));
- if (!bp->groks_macro_args)
- for (i = 2; i < argc; i++)
- if (arg_type (argv, i) != TOKEN_TEXT)
- {
- TOKEN_DATA_TYPE (new_argv->array[i - 2]) = TOKEN_TEXT;
- TOKEN_DATA_TEXT (new_argv->array[i - 2]) = (char *) "";
- }
+ macro_arguments *new_argv = make_argv_ref (argv, name, arg_len (argv, 1),
+ true, !bp->groks_macro_args);
bp->func (obs, argc - 1, new_argv);
- free (new_argv);
}
}
@@ -998,29 +966,9 @@ m4_indir (struct obstack *obs, int argc, macro_arguments *argv)
m4_warn (0, me, _("undefined macro `%s'"), name);
else
{
- int i;
- /* TODO make use of $@ reference, instead of copying argv. */
- /* TODO make accessor in macro.c that performs this
- construction, so that argv can be opaque type. */
- macro_arguments *new_argv = xmalloc (offsetof (macro_arguments, array)
- + ((argc - 2)
- * sizeof (token_data *)));
- new_argv->argc = argc - 1;
- new_argv->inuse = false;
- new_argv->argv0 = name;
- new_argv->argv0_len = arg_len (argv, 1);
- new_argv->arraylen = argc - 2;
- memcpy (&new_argv->array[0], &argv->array[1],
- (argc - 2) * sizeof (token_data *));
- if (!SYMBOL_MACRO_ARGS (s))
- for (i = 2; i < argc; i++)
- if (arg_type (argv, i) != TOKEN_TEXT)
- {
- TOKEN_DATA_TYPE (new_argv->array[i - 2]) = TOKEN_TEXT;
- TOKEN_DATA_TEXT (new_argv->array[i - 2]) = (char *) "";
- }
+ macro_arguments *new_argv = make_argv_ref (argv, name, arg_len (argv, 1),
+ true, !SYMBOL_MACRO_ARGS (s));
call_macro (s, argc - 1, new_argv, obs);
- free (new_argv);
}
}
@@ -1191,7 +1139,7 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
if (bad_argc (me, argc, 1, 3))
return;
- if (*ARG (2) && !numeric_arg (me, ARG (2), &radix))
+ if (!arg_empty (argv, 2) && !numeric_arg (me, ARG (2), &radix))
return;
if (radix < 1 || radix > 36)
@@ -1208,7 +1156,7 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
return;
}
- if (!*ARG (1))
+ if (arg_empty (argv, 1))
m4_warn (0, me, _("empty string treated as 0"));
else if (evaluate (me, ARG (1), &value))
return;
diff --git a/src/format.c b/src/format.c
index 7fc8fb1..20b3e28 100644
--- a/src/format.c
+++ b/src/format.c
@@ -51,7 +51,7 @@
void
format (struct obstack *obs, int argc, macro_arguments *argv)
{
- const char *me = argv->argv0;
+ const char *me = arg_text (argv, 0);
const char *f; /* format control string */
const char *fmt; /* position within f */
char fstart[] = "%'+- 0#*.*hhd"; /* current format spec */
diff --git a/src/m4.h b/src/m4.h
index 3a6acc3..ac81998 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -87,7 +87,15 @@ typedef struct string STRING;
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free
-/* Those must come first. */
+/* glibc's obstack left out the ability to suspend and resume growth
+ of an object on the stack. Reopen OBJECT (previously returned by
+ obstack_alloc or obstack_finish) with SIZE for additional growth,
+ freeing all objects that occur later in the stack. */
+#define obstack_regrow(OBS, OBJECT, SIZE) \
+ (obstack_free (OBS, (char *) (OBJECT) + (SIZE)), \
+ (OBS)->object_base = (char *) (OBJECT))
+
+/* These must come first. */
typedef struct token_data token_data;
typedef struct macro_arguments macro_arguments;
typedef void builtin_func (struct obstack *, int, macro_arguments *);
@@ -272,8 +280,10 @@ struct token_chain
{
token_chain *next; /* Pointer to next link of chain. */
char *str; /* NUL-terminated string if text, else NULL. */
+ size_t len; /* Length of str, else 0. */
macro_arguments *argv;/* Reference to earlier $@. */
unsigned int index; /* Argument index within argv. */
+ bool flatten; /* True to treat builtins as text. */
};
/* The content of a token or macro argument. */
@@ -303,27 +313,6 @@ struct token_data
u;
};
-/* TODO - make this struct opaque, and move definition to macro.c. */
-/* Opaque structure describing all arguments to a macro, including the
- macro name at index 0. */
-struct macro_arguments
-{
- /* Number of arguments owned by this object, may be larger than
- arraylen since the array can refer to multiple arguments via a
- single $@ reference. */
- unsigned int argc;
- /* False unless the macro expansion refers to $@, determines whether
- this object can be freed at end of macro expansion or must wait
- until next byte read from file. */
- bool inuse;
- const char *argv0; /* The macro name being expanded. */
- size_t argv0_len; /* Length of argv0. */
- size_t arraylen; /* True length of allocated elements in array. */
- /* Used as a variable-length array, storing information about each
- argument. */
- token_data *array[FLEXIBLE_ARRAY_MEMBER];
-};
-
#define TOKEN_DATA_TYPE(Td) ((Td)->type)
#define TOKEN_DATA_LEN(Td) ((Td)->u.u_t.len)
#define TOKEN_DATA_TEXT(Td) ((Td)->u.u_t.text)
@@ -442,8 +431,12 @@ void call_macro (symbol *, int, macro_arguments *, struct obstack *);
unsigned int arg_argc (macro_arguments *);
token_data_type arg_type (macro_arguments *, unsigned int);
const char *arg_text (macro_arguments *, unsigned int);
+bool arg_equal (macro_arguments *, unsigned int, unsigned int);
+bool arg_empty (macro_arguments *, unsigned int);
size_t arg_len (macro_arguments *, unsigned int);
builtin_func *arg_func (macro_arguments *, unsigned int);
+macro_arguments *make_argv_ref (macro_arguments *, const char *, size_t,
+ bool, bool);
/* File: builtin.c --- builtins. */
diff --git a/src/macro.c b/src/macro.c
index 320727d..e257485 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -24,6 +24,29 @@
#include "m4.h"
+/* Opaque structure describing all arguments to a macro, including the
+ macro name at index 0. */
+struct macro_arguments
+{
+ /* Number of arguments owned by this object, may be larger than
+ arraylen since the array can refer to multiple arguments via a
+ single $@ reference. */
+ unsigned int argc;
+ /* False unless the macro expansion refers to $@, determines whether
+ this object can be freed at end of macro expansion or must wait
+ until next byte read from file. */
+ bool_bitfield inuse : 1;
+ /* False if all arguments are just text or func, true if this argv
+ refers to another one. */
+ bool_bitfield has_ref : 1;
+ const char *argv0; /* The macro name being expanded. */
+ size_t argv0_len; /* Length of argv0. */
+ size_t arraylen; /* True length of allocated elements in array. */
+ /* Used as a variable-length array, storing information about each
+ argument. */
+ token_data *array[FLEXIBLE_ARRAY_MEMBER];
+};
+
static void expand_macro (symbol *);
static void expand_token (struct obstack *, token_type, token_data *, int);
@@ -35,24 +58,24 @@ static int macro_call_id = 0;
/* The shared stack of collected arguments for macro calls; as each
argument is collected, it is finished and its location stored in
- argv_stack. Normally, this stack can be used simultaneously by
- multiple macro calls; the exception is when an outer macro has
- generated some text, then calls a nested macro, in which case the
- nested macro must use a local stack to leave the unfinished text
- alone. Too bad obstack.h does not provide an easy way to reopen a
- finished object for further growth, but in practice this does not
- hurt us too much. */
+ argv_stack. This stack can be used simultaneously by multiple
+ macro calls, using obstack_regrow to handle partial objects
+ embedded in the stack. */
static struct obstack argc_stack;
/* The shared stack of pointers to collected arguments for macro
- calls. This object is never finished; we exploit the fact that
- obstack_blank is documented to take a negative size to reduce the
- size again. */
+ calls. This stack can be used simultaneously by multiple macro
+ calls, using obstack_regrow to handle partial objects embedded in
+ the stack. */
static struct obstack argv_stack;
-/*----------------------------------------------------------------------.
-| This function read all input, and expands each token, one at a time. |
-`----------------------------------------------------------------------*/
+/* The empty string token. */
+static token_data empty_token;
+
+/*----------------------------------------------------------------.
+| This function reads all input, and expands each token, one at a |
+| time. |
+`----------------------------------------------------------------*/
void
expand_input (void)
@@ -64,6 +87,13 @@ expand_input (void)
obstack_init (&argc_stack);
obstack_init (&argv_stack);
+ TOKEN_DATA_TYPE (&empty_token) = TOKEN_TEXT;
+ TOKEN_DATA_TEXT (&empty_token) = "";
+ TOKEN_DATA_LEN (&empty_token) = 0;
+#ifdef ENABLE_CHANGEWORD
+ TOKEN_DATA_ORIG_TEXT (&empty_token) = "";
+#endif
+
while ((t = next_token (&td, &line, NULL)) != TOKEN_EOF)
expand_token ((struct obstack *) NULL, t, &td, line);
@@ -237,12 +267,11 @@ expand_argument (struct obstack *obs, token_data *argp, const char *caller)
/*-------------------------------------------------------------------------.
| Collect all the arguments to a call of the macro SYM. The arguments are |
| stored on the obstack ARGUMENTS and a table of pointers to the arguments |
-| on the obstack ARGPTR. |
+| on the obstack argv_stack. |
`-------------------------------------------------------------------------*/
static macro_arguments *
-collect_arguments (symbol *sym, struct obstack *argptr, unsigned int argv_base,
- struct obstack *arguments)
+collect_arguments (symbol *sym, struct obstack *arguments)
{
token_data td;
token_data *tdp;
@@ -253,10 +282,11 @@ collect_arguments (symbol *sym, struct obstack *argptr, unsigned int argv_base,
args.argc = 1;
args.inuse = false;
+ args.has_ref = false;
args.argv0 = SYMBOL_NAME (sym);
args.argv0_len = strlen (args.argv0);
args.arraylen = 0;
- obstack_grow (argptr, &args, offsetof (macro_arguments, array));
+ obstack_grow (&argv_stack, &args, offsetof (macro_arguments, array));
if (peek_token () == TOKEN_OPEN)
{
@@ -265,20 +295,18 @@ collect_arguments (symbol *sym, struct obstack *argptr, unsigned int argv_base,
{
more_args = expand_argument (arguments, &td, SYMBOL_NAME (sym));
- if (!groks_macro_args && TOKEN_DATA_TYPE (&td) == TOKEN_FUNC)
- {
- TOKEN_DATA_TYPE (&td) = TOKEN_TEXT;
- TOKEN_DATA_TEXT (&td) = (char *) "";
- TOKEN_DATA_LEN (&td) = 0;
- }
- tdp = (token_data *) obstack_copy (arguments, &td, sizeof td);
- obstack_ptr_grow (argptr, tdp);
+ if ((TOKEN_DATA_TYPE (&td) == TOKEN_TEXT && !TOKEN_DATA_LEN (&td))
+ || (!groks_macro_args && TOKEN_DATA_TYPE (&td) == TOKEN_FUNC))
+ tdp = &empty_token;
+ else
+ tdp = (token_data *) obstack_copy (arguments, &td, sizeof td);
+ obstack_ptr_grow (&argv_stack, tdp);
args.arraylen++;
args.argc++;
}
while (more_args);
}
- argv = (macro_arguments *) ((char *) obstack_base (argptr) + argv_base);
+ argv = (macro_arguments *) obstack_finish (&argv_stack);
argv->argc = args.argc;
argv->arraylen = args.arraylen;
return argv;
@@ -327,11 +355,11 @@ call_macro (symbol *sym, int argc, macro_arguments *argv,
static void
expand_macro (symbol *sym)
{
- struct obstack arguments; /* Alternate obstack if argc_stack is busy. */
- unsigned int argv_base; /* Size of argv_stack on entry. */
- void *argc_start; /* Start of argc_stack, else NULL if unsafe. */
+ void *argc_base = NULL; /* Base of argc_stack on entry. */
+ void *argv_base = NULL; /* Base of argv_stack on entry. */
+ unsigned int argc_size; /* Size of argc_stack on entry. */
+ unsigned int argv_size; /* Size of argv_stack on entry. */
macro_arguments *argv;
- int argc;
struct obstack *expansion;
const char *expanded;
bool traced;
@@ -360,24 +388,16 @@ expand_macro (symbol *sym)
traced = (debug_level & DEBUG_TRACE_ALL) || SYMBOL_TRACED (sym);
- argv_base = obstack_object_size (&argv_stack);
- if (obstack_object_size (&argc_stack) > 0)
- {
- /* We cannot use argc_stack if this is a nested invocation, and an
- outer invocation has an unfinished argument being
- collected. */
- obstack_init (&arguments);
- argc_start = NULL;
- }
- else
- argc_start = obstack_finish (&argc_stack);
+ argc_size = obstack_object_size (&argc_stack);
+ argv_size = obstack_object_size (&argv_stack);
+ argc_base = obstack_finish (&argc_stack);
+ if (0 < argv_size)
+ argv_base = obstack_finish (&argv_stack);
if (traced && (debug_level & DEBUG_TRACE_CALL))
trace_prepre (SYMBOL_NAME (sym), my_call_id);
- argv = collect_arguments (sym, &argv_stack, argv_base,
- argc_start ? &argc_stack : &arguments);
- argc = argv->argc;
+ argv = collect_arguments (sym, &argc_stack);
loc_close_file = current_file;
loc_close_line = current_line;
@@ -385,14 +405,14 @@ expand_macro (symbol *sym)
current_line = loc_open_line;
if (traced)
- trace_pre (SYMBOL_NAME (sym), my_call_id, argc, argv);
+ trace_pre (SYMBOL_NAME (sym), my_call_id, argv->argc, argv);
expansion = push_string_init ();
- call_macro (sym, argc, argv, expansion);
+ call_macro (sym, argv->argc, argv, expansion);
expanded = push_string_finish ();
if (traced)
- trace_post (SYMBOL_NAME (sym), my_call_id, argc, argv, expanded);
+ trace_post (SYMBOL_NAME (sym), my_call_id, argv->argc, argv, expanded);
current_file = loc_close_file;
current_line = loc_close_line;
@@ -404,11 +424,50 @@ expand_macro (symbol *sym)
free_symbol (sym);
/* TODO pay attention to argv->inuse, in case someone is depending on
$@. */
- if (argc_start)
- obstack_free (&argc_stack, argc_start);
+ if (0 < argc_size)
+ obstack_regrow (&argc_stack, argc_base, argc_size);
+ else
+ obstack_free (&argc_stack, argc_base);
+ if (0 < argv_size)
+ obstack_regrow (&argv_stack, argv_base, argv_size);
else
- obstack_free (&arguments, NULL);
- obstack_blank (&argv_stack, argv_base - obstack_object_size (&argv_stack));
+ obstack_free (&argv_stack, argv);
+}
+
+/* Given ARGV, return the token_data that contains argument INDEX;
+ INDEX must be > 0, < argv->argc. */
+static token_data *
+arg_token (macro_arguments *argv, unsigned int index)
+{
+ unsigned int i;
+ token_data *token;
+
+ assert (index && index < argv->argc);
+ if (!argv->has_ref)
+ return argv->array[index - 1];
+ /* Must cycle through all tokens, until we find index, since a ref
+ may occupy multiple indices. */
+ for (i = 0; i < argv->arraylen; i++)
+ {
+ token = argv->array[i];
+ if (TOKEN_DATA_TYPE (token) == TOKEN_COMP)
+ {
+ token_chain *chain = token->u.chain;
+ /* TODO - for now we support only a single-length $@ chain. */
+ assert (!chain->next && !chain->str);
+ if (index < chain->argv->argc - (chain->index - 1))
+ {
+ token = arg_token (chain->argv, chain->index - 1 + index);
+ if (chain->flatten && TOKEN_DATA_TYPE (token) == TOKEN_FUNC)
+ token = &empty_token;
+ break;
+ }
+ index -= chain->argv->argc - chain->index;
+ }
+ else if (--index == 0)
+ break;
+ }
+ return token;
}
@@ -424,9 +483,15 @@ arg_argc (macro_arguments *argv)
token_data_type
arg_type (macro_arguments *argv, unsigned int index)
{
+ token_data_type type;
+ token_data *token;
+
if (index == 0 || index >= argv->argc)
return TOKEN_TEXT;
- return TOKEN_DATA_TYPE (argv->array[index - 1]);
+ token = arg_token (argv, index);
+ type = TOKEN_DATA_TYPE (token);
+ assert (type != TOKEN_COMP);
+ return type;
}
/* Given ARGV, return the text at argument INDEX, or NULL if the
@@ -435,13 +500,59 @@ arg_type (macro_arguments *argv, unsigned int index)
const char *
arg_text (macro_arguments *argv, unsigned int index)
{
+ token_data *token;
+
if (index == 0)
return argv->argv0;
if (index >= argv->argc)
return "";
- if (TOKEN_DATA_TYPE (argv->array[index - 1]) != TOKEN_TEXT)
- return NULL;
- return TOKEN_DATA_TEXT (argv->array[index - 1]);
+ token = arg_token (argv, index);
+ switch (TOKEN_DATA_TYPE (token))
+ {
+ case TOKEN_TEXT:
+ return TOKEN_DATA_TEXT (token);
+ case TOKEN_FUNC:
+ return NULL;
+ case TOKEN_COMP:
+ /* TODO - how to concatenate multiple arguments? For now, we expect
+ only one element in the chain, and arg_token dereferences it. */
+ default:
+ break;
+ }
+ assert (!"arg_text");
+ abort ();
+}
+
+/* Given ARGV, compare text arguments INDEXA and INDEXB for equality.
+ Both indices must be non-zero and less than argc. Return true if
+ the arguments contain the same contents; often more efficient than
+ strcmp (arg_text (argv, indexa), arg_text (argv, indexb)) == 0. */
+bool
+arg_equal (macro_arguments *argv, unsigned int indexa, unsigned int indexb)
+{
+ token_data *ta = arg_token (argv, indexa);
+ token_data *tb = arg_token (argv, indexb);
+
+ if (ta == &empty_token || tb == &empty_token)
+ return ta == tb;
+ /* TODO - allow builtin tokens in the comparison? */
+ assert (TOKEN_DATA_TYPE (ta) == TOKEN_TEXT
+ && TOKEN_DATA_TYPE (tb) == TOKEN_TEXT);
+ return (TOKEN_DATA_LEN (ta) == TOKEN_DATA_LEN (tb)
+ && strcmp (TOKEN_DATA_TEXT (ta), TOKEN_DATA_TEXT (tb)) == 0);
+}
+
+/* Given ARGV, return true if argument INDEX is the empty string.
+ This gives the same result as comparing arg_len against 0, but is
+ often faster. */
+bool
+arg_empty (macro_arguments *argv, unsigned int index)
+{
+ if (index == 0)
+ return argv->argv0_len == 0;
+ if (index >= argv->argc)
+ return true;
+ return arg_token (argv, index) == &empty_token;
}
/* Given ARGV, return the length of argument INDEX, or SIZE_MAX if the
@@ -449,13 +560,28 @@ arg_text (macro_arguments *argv, unsigned int index)
size_t
arg_len (macro_arguments *argv, unsigned int index)
{
+ token_data *token;
+
if (index == 0)
return argv->argv0_len;
if (index >= argv->argc)
return 0;
- if (TOKEN_DATA_TYPE (argv->array[index - 1]) != TOKEN_TEXT)
- return SIZE_MAX;
- return TOKEN_DATA_LEN (argv->array[index - 1]);
+ token = arg_token (argv, index);
+ switch (TOKEN_DATA_TYPE (token))
+ {
+ case TOKEN_TEXT:
+ assert ((token == &empty_token) == (TOKEN_DATA_LEN (token) == 0));
+ return TOKEN_DATA_LEN (token);
+ case TOKEN_FUNC:
+ return SIZE_MAX;
+ case TOKEN_COMP:
+ /* TODO - how to concatenate multiple arguments? For now, we expect
+ only one element in the chain, and arg_token dereferences it. */
+ default:
+ break;
+ }
+ assert (!"arg_len");
+ abort ();
}
/* Given ARGV, return the builtin function referenced by argument
@@ -464,8 +590,86 @@ arg_len (macro_arguments *argv, unsigned int index)
builtin_func *
arg_func (macro_arguments *argv, unsigned int index)
{
- if (index == 0 || index >= argv->argc
- || TOKEN_DATA_TYPE (argv->array[index - 1]) != TOKEN_FUNC)
+ token_data *token;
+
+ if (index == 0 || index >= argv->argc)
return NULL;
- return TOKEN_DATA_FUNC (argv->array[index - 1]);
+ token = arg_token (argv, index);
+ switch (TOKEN_DATA_TYPE (token))
+ {
+ case TOKEN_FUNC:
+ return TOKEN_DATA_FUNC (token);
+ case TOKEN_TEXT:
+ return NULL;
+ case TOKEN_COMP:
+ /* TODO - how to concatenate multiple arguments? For now, we expect
+ only one element in the chain. */
+ default:
+ break;
+ }
+ assert(!"arg_func");
+ abort ();
+}
+
+/* Create a new argument object using the same obstack as ARGV; thus,
+ the new object will automatically be freed when the original is
+ freed. Explicitly set the macro name (argv[0]) from ARGV0 with
+ length ARGV0_LEN. If SKIP, set argv[1] of the new object to
+ argv[2] of the old, otherwise the objects share all arguments. If
+ FLATTEN, any non-text in ARGV is flattened to an empty string when
+ referenced through the new object. */
+macro_arguments *
+make_argv_ref (macro_arguments *argv, const char *argv0, size_t argv0_len,
+ bool skip, bool flatten)
+{
+ macro_arguments *new_argv;
+ token_data *token;
+ token_chain *chain;
+ unsigned int index = skip ? 2 : 1;
+
+ assert (obstack_object_size (&argv_stack) == 0);
+ /* When making a reference through a reference, point to the
+ original if possible. */
+ if (argv->has_ref)
+ {
+ /* TODO - for now we support only a single-length $@ chain. */
+ assert (argv->arraylen == 1
+ && TOKEN_DATA_TYPE (argv->array[0]) == TOKEN_COMP);
+ chain = argv->array[0]->u.chain;
+ assert (!chain->next && !chain->str);
+ argv = chain->argv;
+ index += chain->index - 1;
+ }
+ if (argv->argc <= index)
+ {
+ new_argv = (macro_arguments *)
+ obstack_alloc (&argv_stack, offsetof (macro_arguments, array));
+ new_argv->arraylen = 0;
+ new_argv->has_ref = false;
+ }
+ else
+ {
+ new_argv = (macro_arguments *)
+ obstack_alloc (&argv_stack,
+ offsetof (macro_arguments, array) + sizeof token);
+ token = (token_data *) obstack_alloc (&argv_stack, sizeof *token);
+ chain = (token_chain *) obstack_alloc (&argv_stack, sizeof *chain);
+ new_argv->arraylen = 1;
+ new_argv->array[0] = token;
+ new_argv->has_ref = true;
+ TOKEN_DATA_TYPE (token) = TOKEN_COMP;
+ token->u.chain = chain;
+ chain->next = NULL;
+ chain->str = NULL;
+ chain->len = 0;
+ chain->argv = argv;
+ chain->index = index;
+ chain->flatten = flatten;
+ }
+ /* TODO - should argv->inuse be set? */
+ new_argv->argc = argv->argc - (index - 1);
+ new_argv->inuse = false;
+ new_argv->argv0 = argv0;
+ new_argv->argv0_len = argv0_len;
+ return new_argv;
}
--
1.5.3.5
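
The index arithmetic shared by arg_token and make_argv_ref can be hard to follow inside the diff, so here is a reduced model of it. This is a sketch under assumptions rather than the patch's implementation: arg_list, arg_lookup and make_ref are hypothetical names, a view holds a direct pointer instead of a token chain, and flattening of builtin tokens and obstack allocation are left out. What it keeps is the mapping "argument INDEX of the view is argument ref_index - 1 + INDEX of the original", and the rule that a reference to a reference points straight at the original.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified argument list: it either owns an array of
   strings, or is a view onto another list starting at ref_index.  */
struct arg_list
{
  unsigned int argc;           /* Argument count, including the name.  */
  const char *const *array;    /* array[i] is argument i + 1, if owned.  */
  const struct arg_list *ref;  /* Non-NULL when this list is a view.  */
  unsigned int ref_index;      /* Index in ref that becomes argument 1.  */
};

/* Return argument INDEX (1-based) of ARGV, following a view if needed.  */
static const char *
arg_lookup (const struct arg_list *argv, unsigned int index)
{
  assert (index > 0 && index < argv->argc);
  if (argv->ref == NULL)
    return argv->array[index - 1];
  return arg_lookup (argv->ref, argv->ref_index - 1 + index);
}

/* Build a view of ARGV.  If SKIP, argument 1 of the view is argument 2
   of ARGV; otherwise the two share all arguments.  */
static struct arg_list
make_ref (const struct arg_list *argv, bool skip)
{
  struct arg_list view;
  unsigned int index = skip ? 2 : 1;

  /* When referring through a view, point at the original instead.  */
  if (argv->ref != NULL)
    {
      index += argv->ref_index - 1;
      argv = argv->ref;
    }
  assert (index <= argv->argc);
  view.argc = argv->argc - (index - 1);
  view.array = NULL;
  view.ref = argv;
  view.ref_index = index;
  return view;
}
```

For a call shaped like indir(name, a, b), a view built with SKIP set has argc 3 and yields "a" and "b" as its arguments 1 and 2, with no pointers copied. The original list must outlive any view made from it.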