From: Eric Blake
Subject: [16/18] argv_ref speedup: cache frequently used quotes
Date: Thu, 21 Feb 2008 07:08:17 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071031 Thunderbird/2.0.0.9 Mnenhy/0.7.5.666
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Next in the series. The speedup is modest, but large enough to justify
the change. Rather than always copying string pairs, this patch caches
the frequently used ones. In particular, since a nonzero quote_age
encodes the current 1-byte quote delimiters, it can double as the cache
without any extra per-reference storage. I also consolidated the various
argument printing routines, so that a tweak in one function gives
consistent results to all the callers, rather than requiring a hunt
through the multiple places where an argv object needs printing (when
doing tracing, when an embedded $@ ref is flattened, when doing builtins
such as errprint...). I had to add a test of -l behavior, since an
earlier version of the patch caused a regression in how trace output was
displayed when -l was active. Finally, I implemented a TODO in input.c to
allow more efficient m4wrap text collection in place, rather than copying
a flattened string; this will be handy in later patches when m4wrap is
converted to POSIX FIFO ordering.
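The quote_age trick described above can be sketched roughly as follows.
This is an illustration only, not the code from the patch: the struct and
function names (quote_pair, quote_state, encode_age, quotes_from_age) are
hypothetical stand-ins, and the sketch assumes the two delimiter bytes
live in the low 16 bits of the age.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a left/right quote pair.  */
struct quote_pair
{
  const char *str1;             /* left quote */
  size_t len1;
  const char *str2;             /* right quote */
  size_t len2;
};

/* Hypothetical stand-in for the part of a syntax table that backs
   the reusable single-byte pair.  */
struct quote_state
{
  char lquote[2];               /* one delimiter byte plus NUL */
  char rquote[2];
  struct quote_pair simple;     /* points into lquote and rquote */
};

/* Wire the reusable pair to its backing storage, once.  */
static void
quote_state_init (struct quote_state *s)
{
  s->lquote[0] = s->lquote[1] = '\0';
  s->rquote[0] = s->rquote[1] = '\0';
  s->simple.str1 = s->lquote;
  s->simple.len1 = 1;
  s->simple.str2 = s->rquote;
  s->simple.len2 = 1;
}

/* Pack two single-byte delimiters into an age (assumed layout:
   left quote in bits 8-15, right quote in bits 0-7).  */
static unsigned int
encode_age (char lq, char rq)
{
  return ((unsigned int) (unsigned char) lq << 8) | (unsigned char) rq;
}

/* Rebuild the pair from a nonzero AGE without copying any strings.
   Return NULL when AGE is zero, meaning the caller must fall back to
   a stored copy of the quotes.  The result is overwritten by the
   next call, so it must be used immediately.  */
static const struct quote_pair *
quotes_from_age (struct quote_state *s, unsigned int age)
{
  if (age == 0)
    return NULL;
  s->lquote[0] = (char) ((age >> 8) & 0xff);
  s->rquote[0] = (char) (age & 0xff);
  return &s->simple;
}
```

A caller that remembers only the age at the time a $@ reference was
pushed can later call quotes_from_age to get the delimiters back, even
if the active quotes have changed in between.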
2008-02-21 Eric Blake <address@hidden>
Stage 16: cache quotes and improve arg_print.
Cache rather than always copying quotes when pushing $@ refs; in
particular, reconstruct single-byte quotes on the fly. Allow NUL
through m4wrap. Improve sharing of code that prints arguments.
Memory impact: slight improvement, due to cached quotes.
Speed impact: slight improvement, due to less copying.
* src/m4.h (push_wrapup_init, push_wrapup_finish, quote_cache)
(func_print): New prototypes.
(arg_print): Adjust prototype.
* src/builtin.c (func_print): New function.
(define_user_macro): Slight cleanup.
(dump_args): Delete, no longer used.
(m4_errprint): Use arg_print.
(m4_m4wrap): Handle embedded NUL.
* src/debug.c (trace_pre): Use arg_print.
* src/input.c (cached_quote): New variable.
(push_wrapup): Split...
(push_wrapup_init, push_wrapup_finish): ...into these.
(input_print): Use arg_print.
(quote_cache): New function.
(pop_input, next_char_1, append_quote_token, set_quote_age):
Adjust users.
* src/macro.c (arg_text, make_argv_ref_token): Adjust users.
(arg_print): Add parameters.
* examples/null.m4: Test for NUL in m4wrap.
* examples/null.out: Update expected output.
* doc/m4.texinfo (Debug Levels): Test --arglength truncation.
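
As a rough illustration of the per-argument truncation that the
--arglength test exercises, the sketch below prints a list of arguments
with a separator, truncating each argument independently and not
counting the quotes against the limit. It is a simplified stand-in, not
the arg_print implementation: print_args and append are hypothetical
names, output goes to a fixed buffer instead of an obstack, and a limit
of 0 is assumed to mean "no limit".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append at most N bytes of S to the NUL-terminated string in OUT,
   never writing past CAP bytes in total.  */
static void
append (char *out, size_t cap, const char *s, size_t n)
{
  size_t used = strlen (out);
  if (used + 1 >= cap)
    return;
  if (n > cap - 1 - used)
    n = cap - 1 - used;
  memcpy (out + used, s, n);
  out[used + n] = '\0';
}

/* Print ARGC arguments into OUT, each wrapped in LQ and RQ and
   separated by SEP (a comma if SEP is NULL).  Each argument is
   truncated on its own to MAX_LEN bytes, followed by "..."; the
   quotes and separator do not count against MAX_LEN.  */
static void
print_args (char *out, size_t cap, const char *const *args, size_t argc,
            const char *lq, const char *rq, const char *sep,
            size_t max_len)
{
  size_t i;
  if (cap == 0)
    return;
  out[0] = '\0';
  if (!sep)
    sep = ",";
  for (i = 0; i < argc; i++)
    {
      size_t len = strlen (args[i]);
      if (i > 0)
        append (out, cap, sep, strlen (sep));
      append (out, cap, lq, strlen (lq));
      if (max_len && len > max_len)
        {
          /* The limit is applied afresh to every argument.  */
          append (out, cap, args[i], max_len);
          append (out, cap, "...", 3);
        }
      else
        append (out, cap, args[i], len);
      append (out, cap, rq, strlen (rq));
    }
}
```

With a limit of 6 and a ", " separator, the arguments 1 and
"long string" come out in the same shape as the argument list in the
trace line of the updated documentation example.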
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHvYXR84KuGfSFAYARAmV6AJ9dSxzjV8GsM04XD03cm5gRO9aPuACfc8uA
TQ43YMozSg5cSf7YvAhSSQg=
=KamW
-----END PGP SIGNATURE-----
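
The message above mentions collecting m4wrap text in place so that
embedded NUL bytes survive. A minimal sketch of that begin/grow/finish
shape follows, under stated assumptions: it uses a plain malloc-backed
buffer rather than an obstack, all names (wrap_stack, wrap_begin,
wrap_grow, wrap_finish) are hypothetical, and unlike the patch it only
creates a stack entry at finish time. The key point is that the length
is tracked explicitly instead of being recovered with strlen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One saved piece of wrapup text; LEN is explicit so that STR may
   contain NUL bytes.  */
struct wrap_text
{
  char *str;
  size_t len;
  struct wrap_text *prev;
};

/* Stack of saved text plus a scratch area for the entry in progress.  */
struct wrap_stack
{
  struct wrap_text *top;
  char *pending;
  size_t pending_len;
  size_t pending_cap;
};

/* Start collecting a new entry.  */
static void
wrap_begin (struct wrap_stack *ws)
{
  ws->pending_len = 0;
}

/* Append N raw bytes of DATA to the entry in progress.
   Return 0 on success, -1 on allocation failure.  */
static int
wrap_grow (struct wrap_stack *ws, const char *data, size_t n)
{
  if (n == 0)
    return 0;
  if (ws->pending_len + n > ws->pending_cap)
    {
      size_t cap = ws->pending_cap ? ws->pending_cap : 16;
      char *p;
      while (cap < ws->pending_len + n)
        cap *= 2;
      p = (char *) realloc (ws->pending, cap);
      if (!p)
        return -1;
      ws->pending = p;
      ws->pending_cap = cap;
    }
  memcpy (ws->pending + ws->pending_len, data, n);
  ws->pending_len += n;
  return 0;
}

/* Finish the entry.  Return 1 if an entry was pushed, 0 if nothing
   was collected (so nothing is pushed), -1 on allocation failure.  */
static int
wrap_finish (struct wrap_stack *ws)
{
  struct wrap_text *t;
  if (ws->pending_len == 0)
    return 0;
  t = (struct wrap_text *) malloc (sizeof *t);
  if (!t)
    return -1;
  t->str = (char *) malloc (ws->pending_len);
  if (!t->str)
    {
      free (t);
      return -1;
    }
  memcpy (t->str, ws->pending, ws->pending_len);
  t->len = ws->pending_len;
  t->prev = ws->top;
  ws->top = t;
  ws->pending_len = 0;
  return 1;
}
```

Skipping the push when nothing was collected mirrors the empty-text
check in the finish step of the patch; the rest is simplified.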
From def1f82375ed7f310bdd8d8e1ce0c2cd9c64e2c6 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Wed, 20 Feb 2008 21:37:30 -0700
Subject: [PATCH] Stage 16: cache quotes and improve m4_arg_print.
* m4/m4module.h (m4_symbol_value_print, m4_symbol_print)
(m4_arg_print): Adjust prototypes.
(m4_dump_args): Delete.
(m4_push_wrapup): Split...
(m4_push_wrapup_init, m4_push_wrapup_finish): ...into these
prototypes.
* m4/m4private.h (struct m4_syntax_table): Add cached_quote
member.
(m4__quote_cache, m4__quote_uncache): New prototypes.
* m4/syntax.c (m4_syntax_create): Initialize the cache.
(m4__quote_cache): New function.
(m4_set_syntax): Update caller.
* m4/symtab.c (m4_symbol_value_print): Add parameter.
(m4_symbol_print, dump_symbol_CB): Adjust all callers.
* m4/utility.c (m4_dump_args): Delete; callers should use
m4_arg_print instead.
* m4/input.c (m4_push_wrapup_init, m4_push_wrapup_finish): Split
implementation, and allow embedded NUL.
(m4_print_token, pop_input, composite_print, composite_peek):
(composite_read, append_quote_token): Adjust all callers.
* m4/macro.c (trace_prepre, m4_arg_text, make_argv_ref):
Likewise.
(m4_arg_print): Add parameters.
(trace_pre): Rewrite in terms of m4_arg_print.
* modules/m4.c (errprint): Likewise.
(m4wrap): Rewrite to allow embedded NUL.
(dumpdef): Adjust caller.
* doc/m4.texinfo (Debuglen): Enhance debuglen test.
* tests/null.m4: Test for NUL in m4wrap.
* tests/null.out: Update expected output.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 37 +++++++++++++++++++++
doc/m4.texinfo | 18 +++++++---
m4/input.c | 77 +++++++++++++++++++++++++++++--------------
m4/m4module.h | 19 +++++------
m4/m4private.h | 16 +++++++++
m4/macro.c | 99 +++++++++++++++++++++++++++-----------------------------
m4/symtab.c | 78 +++++++++++++++++++++++++++++++-------------
m4/syntax.c | 49 +++++++++++++++++++++++++++
m4/utility.c | 25 --------------
modules/m4.c | 15 ++++----
tests/null.m4 | 5 +--
tests/null.out | 2 +-
12 files changed, 290 insertions(+), 150 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index d0f4e27..b5ae203 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,42 @@
2008-02-20 Eric Blake <address@hidden>
+ Stage 16: cache quotes and improve m4_arg_print.
+ Cache rather than always copying quotes when pushing $@ refs; in
+ particular, reconstruct single-byte quotes on the fly. Allow NUL
+ through m4wrap. Improve sharing of code that prints arguments.
+ Memory impact: slight improvement, due to cached quotes.
+ Speed impact: slight improvement, due to less copying.
+ * m4/m4module.h (m4_symbol_value_print, m4_symbol_print)
+ (m4_arg_print): Adjust prototypes.
+ (m4_dump_args): Delete.
+ (m4_push_wrapup): Split...
+ (m4_push_wrapup_init, m4_push_wrapup_finish): ...into these
+ prototypes.
+ * m4/m4private.h (struct m4_syntax_table): Add cached_quote
+ member.
+ (m4__quote_cache, m4__quote_uncache): New prototypes.
+ * m4/syntax.c (m4_syntax_create): Initialize the cache.
+ (m4__quote_cache): New function.
+ (m4_set_syntax): Update caller.
+ * m4/symtab.c (m4_symbol_value_print): Add parameter.
+ (m4_symbol_print, dump_symbol_CB): Adjust all callers.
+ * m4/utility.c (m4_dump_args): Delete; callers should use
+ m4_arg_print instead.
+ * m4/input.c (m4_push_wrapup_init, m4_push_wrapup_finish): Split
+ implementation, and allow embedded NUL.
+ (m4_print_token, pop_input, composite_print, composite_peek):
+ (composite_read, append_quote_token): Adjust all callers.
+ * m4/macro.c (trace_prepre, m4_arg_text, make_argv_ref):
+ Likewise.
+ (m4_arg_print): Add parameters.
+ (trace_pre): Rewrite in terms of m4_arg_print.
+ * modules/m4.c (errprint): Likewise.
+ (m4wrap): Rewrite to allow embedded NUL.
+ (dumpdef): Adjust caller.
+ * doc/m4.texinfo (Debuglen): Enhance debuglen test.
+ * tests/null.m4: Test for NUL in m4wrap.
+ * tests/null.out: Update expected output.
+
Fix out-of-bounds read for sanitized macro names, from 2008-02-06.
* m4/utility.c (m4_verror_at_line): Properly terminate the string.
Reported by Ralf Wildenhues.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 5ed20ea..ffc1949 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -3860,17 +3860,25 @@ parsed as an integer.
The macro @code{debuglen} is recognized only with parameters.
@end deffn
-@comment options: -l4 -techo
+The following example demonstrates the behavior of length truncation.
+Note that each argument and the final result are individually truncated.
+Also, the special tokens for builtin functions are not truncated.
+
+@comment options: -l6 -techo -tdefn
@example
-$ @kbd{m4 -d -l 4 -t echo}
+$ @kbd{m4 -d -l 6 -t echo -t defn}
debuglen(`oops')
@error{}m4:stdin:1: Warning: debuglen: non-numeric argument `oops'
@result{}
define(`echo', `$@@')
@result{}
-echo(`long string')
-@error{}m4trace: -1- echo(`long...') -> ``lon...'
-@result{}long string
+echo(`1', `long string')
+@error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
+@result{}1,long string
+echo(defn(`changequote'))
+@error{}m4trace: -2- defn(`change...') -> <changequote>
+@error{}m4trace: -1- echo(`') -> ``''
+@result{}
debuglen
@result{}debuglen
debuglen(`0')
diff --git a/m4/input.c b/m4/input.c
index b5d50a1..381f38d 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -60,7 +60,7 @@
"wrapup_stack" to "current_input" can continue indefinitely, even
generating infinite loops (e.g. "define(`f',`m4wrap(`f')')f"),
without memory leaks. Adding wrapped data is done through
- m4_push_wrapup().
+ m4_push_wrapup_init/m4_push_wrapup_finish().
Pushing new input on the input stack is done by m4_push_file(), the
conceptual m4_push_string(), and m4_push_builtin() (for builtin
@@ -124,7 +124,8 @@ static bool consume_syntax (m4 *,
m4_obstack *, unsigned int);
#ifdef DEBUG_INPUT
# include "quotearg.h"
-static int m4_print_token (const char *, m4__token_type, m4_symbol_value *);
+static int m4_print_token (m4 *, const char *, m4__token_type,
+ m4_symbol_value *);
#endif
/* Vtable of callbacks for each input method. */
@@ -753,7 +754,10 @@ composite_peek (m4_input_block *me, m4 *context, bool allow_argv)
argv. */
m4_push_string_init (context);
m4__push_arg_quote (context, current_input, chain->u.u_a.argv,
- chain->u.u_a.index, chain->u.u_a.quotes);
+ chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL,
+ chain->quote_age,
+ chain->u.u_a.quotes));
chain->u.u_a.index++;
chain->u.u_a.comma = true;
m4_push_string_finish ();
@@ -804,7 +808,10 @@ composite_read (m4_input_block *me, m4 *context, bool allow_quote, bool safe)
argv. */
m4_push_string_init (context);
m4__push_arg_quote (context, current_input, chain->u.u_a.argv,
- chain->u.u_a.index, chain->u.u_a.quotes);
+ chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL,
+ chain->quote_age,
+ chain->u.u_a.quotes));
chain->u.u_a.index++;
chain->u.u_a.comma = true;
m4_push_string_finish ();
@@ -898,8 +905,12 @@ composite_print (m4_input_block *me, m4 *context, m4_obstack *obs)
break;
case M4__CHAIN_ARGV:
assert (!chain->u.u_a.comma);
- if (m4_arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, &maxlen, module))
+ if (m4_arg_print (context, obs, chain->u.u_a.argv,
+ chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL, chain->quote_age,
+ chain->u.u_a.quotes),
+ chain->u.u_a.flatten, NULL, &maxlen, false,
+ module))
done = true;
break;
default:
@@ -965,16 +976,13 @@ m4_input_print (m4 *context, m4_obstack *obs, m4_input_block *input)
}
}
-/* The function m4_push_wrapup () pushes a string on the wrapup stack.
- When the normal input stack gets empty, the wrapup stack will become
- the input stack, and m4_push_string () and m4_push_file () will
- operate on wrapup_stack. M4_push_wrapup should be done as
- m4_push_string (), but this will suffice, as long as arguments to
- m4_m4wrap () are moderate in size.
+/* The function m4_push_wrapup_init () returns an obstack ready for
+ direct expansion of wrapup text, and should be followed by
+ m4_push_wrapup_finish ().
FIXME - we should allow pushing builtins as well as text. */
-void
-m4_push_wrapup (m4 *context, const char *s)
+m4_obstack *
+m4_push_wrapup_init (m4 *context)
{
m4_input_block *i;
@@ -984,11 +992,25 @@ m4_push_wrapup (m4 *context, const char *s)
i->funcs = &string_funcs;
i->file = m4_get_current_file (context);
i->line = m4_get_current_line (context);
-
- i->u.u_s.len = strlen (s);
- i->u.u_s.str = obstack_copy (wrapup_stack, s, i->u.u_s.len);
-
wsp = i;
+ return wrapup_stack;
+}
+
+/* After pushing wrapup text, this completes the bookkeeping. */
+void
+m4_push_wrapup_finish (void)
+{
+ m4_input_block *i = wsp;
+ if (obstack_object_size (wrapup_stack) == 0)
+ {
+ wsp = i->prev;
+ obstack_free (wrapup_stack, i);
+ }
+ else
+ {
+ i->u.u_s.len = obstack_object_size (wrapup_stack);
+ i->u.u_s.str = (char *) obstack_finish (wrapup_stack);
+ }
}
@@ -1010,6 +1032,7 @@ pop_input (m4 *context, bool cleanup)
if (tmp != NULL)
{
obstack_free (current_input, isp);
+ m4__quote_uncache (M4SYNTAX);
next = NULL; /* might be set in m4_push_string_init () */
}
@@ -1099,8 +1122,11 @@ append_quote_token (m4 *context, m4_obstack *obs, m4_symbol_value *value)
/* TODO preserve $@ through quotes. */
if (src_chain->type == M4__CHAIN_ARGV)
{
- m4_arg_print (obs, src_chain->u.u_a.argv, src_chain->u.u_a.index,
- src_chain->u.u_a.quotes, NULL, false);
+ m4_arg_print (context, obs, src_chain->u.u_a.argv,
+ src_chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL, src_chain->quote_age,
+ src_chain->u.u_a.quotes),
+ src_chain->u.u_a.flatten, NULL, NULL, false, false);
m4__arg_adjust_refcount (context, src_chain->u.u_a.argv, false);
return;
}
@@ -1484,7 +1510,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
init_builtin_token (context, token);
next_char (context, false, true);
#ifdef DEBUG_INPUT
- m4_print_token ("next_token", M4_TOKEN_MACDEF, token);
+ m4_print_token (context, "next_token", M4_TOKEN_MACDEF, token);
#endif
return M4_TOKEN_MACDEF;
}
@@ -1492,7 +1518,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
{
init_argv_symbol (context, obs, token);
#ifdef DEBUG_INPUT
- m4_print_token ("next_token", M4_TOKEN_ARGV, token);
+ m4_print_token (context, "next_token", M4_TOKEN_ARGV, token);
#endif
return M4_TOKEN_ARGV;
}
@@ -1720,7 +1746,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
VALUE_MAX_ARGS (token) = -1;
#ifdef DEBUG_INPUT
- m4_print_token ("next_token", type, token);
+ m4_print_token (context, "next_token", type, token);
#endif
return type;
@@ -1751,7 +1777,8 @@ m4__next_token_is_open (m4 *context)
#ifdef DEBUG_INPUT
int
-m4_print_token (const char *s, m4__token_type type, m4_symbol_value *token)
+m4_print_token (m4 *context, const char *s, m4__token_type type,
+ m4_symbol_value *token)
{
m4_obstack obs;
size_t len;
@@ -1802,7 +1829,7 @@ m4_print_token (const char *s, m4__token_type type, m4_symbol_value *token)
if (token)
{
obstack_init (&obs);
- m4_symbol_value_print (token, &obs, NULL, NULL, true);
+ m4_symbol_value_print (context, token, &obs, NULL, false, NULL, true);
len = obstack_object_size (&obs);
xfprintf (stderr, "%s\n", quotearg_style_mem (c_maybe_quoting_style,
obstack_finish (&obs),
diff --git a/m4/m4module.h b/m4/m4module.h
index a807e70..13f5b4b 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -126,8 +126,6 @@ struct m4_string_pair
extern bool m4_bad_argc (m4 *, int, const char *, size_t, size_t,
bool);
extern bool m4_numeric_arg (m4 *, const char *, const char *, int *);
-extern void m4_dump_args (m4 *, m4_obstack *, size_t,
- m4_macro_args *, const char *, bool);
extern bool m4_parse_truth_arg (m4 *, const char *, const char *, bool);
/* Error handling. */
@@ -248,10 +246,10 @@ extern m4_symbol_value *m4_get_symbol_value (m4_symbol*);
extern bool m4_get_symbol_traced (m4_symbol*);
extern bool m4_set_symbol_name_traced (m4_symbol_table*,
const char *, bool);
-extern bool m4_symbol_value_print (m4_symbol_value *, m4_obstack *,
- const m4_string_pair *, size_t *,
- bool);
-extern void m4_symbol_print (m4_symbol *, m4_obstack *,
+extern bool m4_symbol_value_print (m4 *, m4_symbol_value *, m4_obstack *,
+ const m4_string_pair *, bool,
+ size_t *, bool);
+extern void m4_symbol_print (m4 *, m4_symbol *, m4_obstack *,
const m4_string_pair *, bool, size_t,
bool);
extern bool m4_symbol_value_groks_macro (m4_symbol_value *);
@@ -327,9 +325,9 @@ extern bool m4_arg_empty (m4_macro_args *, size_t);
extern size_t m4_arg_len (m4_macro_args *, size_t);
extern m4_builtin_func *m4_arg_func (m4_macro_args *, size_t);
extern m4_obstack *m4_arg_scratch (m4 *);
-extern bool m4_arg_print (m4_obstack *, m4_macro_args *,
- size_t, const m4_string_pair *,
- size_t *, bool);
+extern bool m4_arg_print (m4 *, m4_obstack *, m4_macro_args *,
+ size_t, const m4_string_pair *, bool,
+ const char *, size_t *, bool, bool);
extern m4_macro_args *m4_make_argv_ref (m4 *, m4_macro_args *, const char *,
size_t, bool, bool);
extern void m4_push_arg (m4 *, m4_obstack *, m4_macro_args *,
@@ -466,7 +464,8 @@ extern void m4_push_file (m4 *, FILE *, const char *, bool);
extern void m4_push_builtin (m4 *, m4_symbol_value *);
extern m4_obstack *m4_push_string_init (m4 *);
extern m4_input_block *m4_push_string_finish (void);
-extern void m4_push_wrapup (m4 *, const char *);
+extern m4_obstack *m4_push_wrapup_init (m4 *);
+extern void m4_push_wrapup_finish (void);
extern bool m4_pop_wrapup (m4 *);
extern void m4_input_print (m4 *, m4_obstack *, m4_input_block *);
diff --git a/m4/m4private.h b/m4/m4private.h
index 2201703..ee6dc6a 100644
--- a/m4/m4private.h
+++ b/m4/m4private.h
@@ -437,6 +437,14 @@ struct m4_syntax_table {
these can alter the rescan of a prior parameter in a quoted
context. */
unsigned int quote_age;
+
+ /* Track a cached quote pair on the input obstack. */
+ m4_string_pair *cached_quote;
+
+ /* Storage for a simple cached quote that can be recreated on the fly. */
+ char cached_lquote[2];
+ char cached_rquote[2];
+ m4_string_pair cached_simple;
};
/* Fast macro versions of syntax table accessor functions,
@@ -462,6 +470,14 @@ struct m4_syntax_table {
age will give the same parse. */
#define m4__safe_quotes(S) (((S)->quote_age & 0xffff) != 0)
+/* Set or refresh the cached quote. */
+extern const m4_string_pair *m4__quote_cache (m4_syntax_table *,
+ m4_obstack *obs, unsigned int,
+ const m4_string_pair *);
+
+/* Clear the cached quote. */
+#define m4__quote_uncache(S) ((S)->cached_quote = NULL)
+
/* --- MACRO MANAGEMENT --- */
diff --git a/m4/macro.c b/m4/macro.c
index d6f81d8..2288dea 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -903,8 +903,8 @@ trace_prepre (m4 *context, const char *name, size_t id, m4_symbol_value *value)
quotes = m4_get_syntax_quotes (M4SYNTAX);
trace_header (context, id);
trace_format (context, "%s ... = ", name);
- m4_symbol_value_print (value, &context->trace_messages, quotes, &arg_length,
- module);
+ m4_symbol_value_print (context, value, &context->trace_messages, quotes,
+ false, &arg_length, module);
trace_flush (context);
}
@@ -913,13 +913,10 @@ trace_prepre (m4 *context, const char *name, size_t id, m4_symbol_value *value)
static void
trace_pre (m4 *context, size_t id, m4_macro_args *argv)
{
- size_t i;
- size_t argc = m4_arg_argc (argv);
-
trace_header (context, id);
trace_format (context, "%s", M4ARG (0));
- if (1 < argc && m4_is_debug_bit (context, M4_DEBUG_TRACE_ARGS))
+ if (1 < m4_arg_argc (argv) && m4_is_debug_bit (context, M4_DEBUG_TRACE_ARGS))
{
const m4_string_pair *quotes = NULL;
size_t arg_length = m4_get_max_debug_arg_length_opt (context);
@@ -928,16 +925,8 @@ trace_pre (m4 *context, size_t id, m4_macro_args *argv)
if (m4_is_debug_bit (context, M4_DEBUG_TRACE_QUOTE))
quotes = m4_get_syntax_quotes (M4SYNTAX);
trace_format (context, "(");
- for (i = 1; i < argc; i++)
- {
- size_t len = arg_length;
- if (i != 1)
- trace_format (context, ", ");
-
- m4_symbol_value_print (m4_arg_symbol (argv, i),
- &context->trace_messages, quotes, &len,
- module);
- }
+ m4_arg_print (context, &context->trace_messages, argv, 1, quotes, false,
+ ", ", &arg_length, true, module);
trace_format (context, ")");
}
}
@@ -1062,8 +1051,8 @@ arg_mark (m4_macro_args *argv)
Return TOKEN when successful, NULL when wrapping ARGV is trivially
empty. */
static m4_symbol_value *
-make_argv_ref (m4_symbol_value *value, m4_obstack *obs, size_t level,
- m4_macro_args *argv, size_t index, bool flatten,
+make_argv_ref (m4 *context, m4_symbol_value *value, m4_obstack *obs,
+ size_t level, m4_macro_args *argv, size_t index, bool flatten,
const m4_string_pair *quotes)
{
m4__symbol_chain *chain;
@@ -1093,19 +1082,8 @@ make_argv_ref (m4_symbol_value *value, m4_obstack *obs, size_t level,
chain->u.u_a.flatten = flatten;
chain->u.u_a.comma = false;
chain->u.u_a.skip_last = false;
- if (quotes)
- {
- /* Clone the quotes into the obstack, since changequote can
- occur before this $@ is rescanned. */
- /* TODO - optimize when quote_age is nonzero? */
- m4_string_pair *tmp = (m4_string_pair *) obstack_copy (obs, quotes,
- sizeof *quotes);
- tmp->str1 = (char *) obstack_copy0 (obs, quotes->str1, quotes->len1);
- tmp->str2 = (char *) obstack_copy0 (obs, quotes->str2, quotes->len2);
- chain->u.u_a.quotes = tmp;
- }
- else
- chain->u.u_a.quotes = NULL;
+ chain->u.u_a.quotes = m4__quote_cache (M4SYNTAX, obs, chain->quote_age,
+ quotes);
return value;
}
@@ -1218,8 +1196,10 @@ m4_arg_text (m4 *context, m4_macro_args *argv, size_t index)
obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
break;
case M4__CHAIN_ARGV:
- m4_arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, NULL, false);
+ m4_arg_print (context, obs, chain->u.u_a.argv, chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL, chain->quote_age,
+ chain->u.u_a.quotes),
+ chain->u.u_a.flatten, NULL, NULL, false, false);
break;
default:
assert (!"m4_arg_text");
@@ -1373,32 +1353,48 @@ m4_arg_func (m4_macro_args *argv, size_t index)
/* Dump a representation of ARGV to the obstack OBS, starting with
argument INDEX. If QUOTES is non-NULL, each argument is displayed
- with those quotes. If MAX_LEN is non-NULL, truncate the output
- after *MAX_LEN bytes are output and return true; otherwise, return
- false, and reduce *MAX_LEN by the number of bytes output. If
- MODULE, print any details about originating modules. QUOTES count
- against the truncation length, but not module names. */
+ with those quotes. If FLATTEN, builtins are ignored. Separate
+ arguments with SEP, which defaults to a comma. If MAX_LEN is
+ non-NULL, truncate the output after *MAX_LEN bytes are output and
+ return true; otherwise, return false, and reduce *MAX_LEN by the
+ number of bytes output. If QUOTE_EACH, the truncation length is
+ reset for each argument, quotes do not count against length, and
+ all arguments are printed; otherwise, quotes count against the
+ length and trailing arguments may be discarded. If MODULE, print
+ any details about originating modules; modules do not count against
+ truncation length. */
bool
-m4_arg_print (m4_obstack *obs, m4_macro_args *argv, size_t index,
- const m4_string_pair *quotes, size_t *max_len, bool module)
+m4_arg_print (m4 *context, m4_obstack *obs, m4_macro_args *argv, size_t index,
+ const m4_string_pair *quotes, bool flatten, const char *sep,
+ size_t *max_len, bool quote_each, bool module)
{
size_t len = max_len ? *max_len : SIZE_MAX;
size_t i;
- bool comma = false;
+ bool use_sep = false;
+ size_t sep_len;
+ size_t *plen = quote_each ? NULL : &len;
+ if (!sep)
+ sep = ",";
+ sep_len = strlen (sep);
for (i = index; i < argv->argc; i++)
{
- if (comma && m4_shipout_string_trunc (obs, ",", 1, NULL, &len))
+ if (quote_each && max_len)
+ len = *max_len;
+ if (use_sep && m4_shipout_string_trunc (obs, sep, sep_len, NULL, plen))
return true;
- comma = true;
- if (quotes && m4_shipout_string_trunc (obs, quotes->str1, quotes->len1,
- NULL, &len))
+ use_sep = true;
+ if (quotes && !quote_each
+ && m4_shipout_string_trunc (obs, quotes->str1, quotes->len1, NULL,
+ plen))
return true;
- if (m4_symbol_value_print (m4_arg_symbol (argv, i), obs, NULL, &len,
+ if (m4_symbol_value_print (context, m4_arg_symbol (argv, i), obs,
+ quote_each ? quotes : NULL, flatten, &len,
module))
return true;
- if (quotes && m4_shipout_string_trunc (obs, quotes->str2, quotes->len2,
- NULL, &len))
+ if (quotes && !quote_each
+ && m4_shipout_string_trunc (obs, quotes->str2, quotes->len2, NULL,
+ plen))
return true;
}
if (max_len)
@@ -1424,8 +1420,8 @@ m4_make_argv_ref (m4 *context, m4_macro_args *argv, const char *argv0,
m4_obstack *obs = m4_arg_scratch (context);
new_value = (m4_symbol_value *) obstack_alloc (obs, sizeof *value);
- value = make_argv_ref (new_value, obs, context->expansion_level - 1, argv,
- index, flatten, NULL);
+ value = make_argv_ref (context, new_value, obs, context->expansion_level - 1,
+ argv, index, flatten, NULL);
if (!value)
{
obstack_free (obs, new_value);
@@ -1527,7 +1523,8 @@ m4_push_args (m4 *context, m4_obstack *obs, m4_macro_args *argv, bool skip,
}
/* TODO allow shift, $@, to push builtins without flatten. */
- value = make_argv_ref (&tmp, obs, -1, argv, i, true, quote ? quotes : NULL);
+ value = make_argv_ref (context, &tmp, obs, -1, argv, i, true,
+ quote ? quotes : NULL);
assert (value == &tmp);
if (len)
{
diff --git a/m4/symtab.c b/m4/symtab.c
index 9636f9d..dc05c1b 100644
--- a/m4/symtab.c
+++ b/m4/symtab.c
@@ -533,15 +533,16 @@ m4_set_symbol_name_traced (m4_symbol_table *symtab, const char *name,
}
/* Grow OBS with a text representation of VALUE. If QUOTES, then use
- it to surround a text definition. If MAXLEN, then truncate text
+ it to surround a text definition. If FLATTEN, then flatten builtin
+ macros to the empty string. If MAXLEN, then truncate text
definitions to *MAXLEN, and adjust by how many characters are
printed. If MODULE, then include which module defined a builtin.
Return true if the output was truncated. QUOTES and MODULE do not
count against the truncation length. */
bool
-m4_symbol_value_print (m4_symbol_value *value, m4_obstack *obs,
- const m4_string_pair *quotes, size_t *maxlen,
- bool module)
+m4_symbol_value_print (m4 *context, m4_symbol_value *value, m4_obstack *obs,
+ const m4_string_pair *quotes, bool flatten,
+ size_t *maxlen, bool module)
{
const char *text;
const m4_builtin *bp;
@@ -558,18 +559,42 @@ m4_symbol_value_print (m4_symbol_value *value, m4_obstack *obs,
result = true;
break;
case M4_SYMBOL_FUNC:
- bp = m4_get_symbol_value_builtin (value);
- obstack_1grow (obs, '<');
- obstack_grow (obs, bp->name, strlen (bp->name));
- obstack_1grow (obs, '>');
+ if (flatten)
+ {
+ if (quotes)
+ {
+ obstack_grow (obs, quotes->str1, quotes->len1);
+ obstack_grow (obs, quotes->str2, quotes->len2);
+ }
+ module = false;
+ }
+ else
+ {
+ bp = m4_get_symbol_value_builtin (value);
+ obstack_1grow (obs, '<');
+ obstack_grow (obs, bp->name, strlen (bp->name));
+ obstack_1grow (obs, '>');
+ }
break;
case M4_SYMBOL_PLACEHOLDER:
- text = m4_get_symbol_value_placeholder (value);
- obstack_1grow (obs, '<');
- obstack_1grow (obs, '<');
- obstack_grow (obs, text, strlen (text));
- obstack_1grow (obs, '>');
- obstack_1grow (obs, '>');
+ if (flatten)
+ {
+ if (quotes)
+ {
+ obstack_grow (obs, quotes->str1, quotes->len1);
+ obstack_grow (obs, quotes->str2, quotes->len2);
+ }
+ module = false;
+ }
+ else
+ {
+ text = m4_get_symbol_value_placeholder (value);
+ obstack_1grow (obs, '<');
+ obstack_1grow (obs, '<');
+ obstack_grow (obs, text, strlen (text));
+ obstack_1grow (obs, '>');
+ obstack_1grow (obs, '>');
+ }
break;
case M4_SYMBOL_COMP:
chain = value->u.u_c.chain;
@@ -585,8 +610,13 @@ m4_symbol_value_print (m4_symbol_value *value, m4_obstack *obs,
result = true;
break;
case M4__CHAIN_ARGV:
- if (m4_arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, &len, module))
+ if (m4_arg_print (context, obs, chain->u.u_a.argv,
+ chain->u.u_a.index,
+ m4__quote_cache (M4SYNTAX, NULL,
+ chain->quote_age,
+ chain->u.u_a.quotes),
+ chain->u.u_a.flatten, NULL, &len, false,
+ module))
result = true;
break;
default:
@@ -623,7 +653,7 @@ m4_symbol_value_print (m4_symbol_value *value, m4_obstack *obs,
MODULE, then include which module defined a builtin. QUOTES and
MODULE do not count toward truncation. */
void
-m4_symbol_print (m4_symbol *symbol, m4_obstack *obs,
+m4_symbol_print (m4 *context, m4_symbol *symbol, m4_obstack *obs,
const m4_string_pair *quotes, bool stack, size_t arg_length,
bool module)
{
@@ -634,7 +664,7 @@ m4_symbol_print (m4_symbol *symbol, m4_obstack *obs,
assert (obs);
value = m4_get_symbol_value (symbol);
- m4_symbol_value_print (value, obs, quotes, &len, module);
+ m4_symbol_value_print (context, value, obs, quotes, false, &len, module);
if (stack)
{
value = VALUE_NEXT (value);
@@ -643,7 +673,8 @@ m4_symbol_print (m4_symbol *symbol, m4_obstack *obs,
obstack_1grow (obs, ',');
obstack_1grow (obs, ' ');
len = arg_length;
- m4_symbol_value_print (value, obs, quotes, &len, module);
+ m4_symbol_value_print (context, value, obs, quotes, false, &len,
+ module);
value = VALUE_NEXT (value);
}
}
@@ -820,15 +851,16 @@ m4_set_symbol_value_placeholder (m4_symbol_value *value, const char *text)
static void *dump_symbol_CB (m4_symbol_table *symtab, const char *name,
m4_symbol *symbol, void *userdata);
static M4_GNUC_UNUSED void *
-symtab_dump (m4_symbol_table *symtab)
+symtab_dump (m4 *context, m4_symbol_table *symtab)
{
- return m4_symtab_apply (symtab, true, dump_symbol_CB, NULL);
+ return m4_symtab_apply (symtab, true, dump_symbol_CB, context);
}
static void *
dump_symbol_CB (m4_symbol_table *symtab, const char *name,
- m4_symbol *symbol, void *ignored)
+ m4_symbol *symbol, void *ptr)
{
+ m4 * context = (m4 *) ptr;
m4_symbol_value *value = m4_get_symbol_value (symbol);
int flags = value ? SYMBOL_FLAGS (symbol) : 0;
m4_module * module = value ? SYMBOL_MODULE (symbol) : NULL;
@@ -845,7 +877,7 @@ dump_symbol_CB (m4_symbol_table *symtab, const char *name,
{
m4_obstack obs;
obstack_init (&obs);
- m4_symbol_value_print (value, &obs, NULL, NULL, true);
+ m4_symbol_value_print (context, value, &obs, NULL, false, NULL, true);
xfprintf (stderr, "%s", (char *) obstack_finish (&obs));
obstack_free (&obs, NULL);
}
diff --git a/m4/syntax.c b/m4/syntax.c
index 115884e..7479388 100644
--- a/m4/syntax.c
+++ b/m4/syntax.c
@@ -160,6 +160,10 @@ m4_syntax_create (void)
/* Set up current table to match default. */
m4_set_syntax (syntax, '\0', '\0', NULL);
+ syntax->cached_simple.str1 = syntax->cached_lquote;
+ syntax->cached_simple.len1 = 1;
+ syntax->cached_simple.str2 = syntax->cached_rquote;
+ syntax->cached_simple.len2 = 1;
return syntax;
}
@@ -420,6 +424,7 @@ m4_set_syntax (m4_syntax_table *syntax, char key, char action,
assert (false);
}
set_quote_age (syntax, false, true);
+ m4__quote_uncache (syntax);
return code;
}
@@ -758,6 +763,50 @@ set_quote_age (m4_syntax_table *syntax, bool reset, bool change)
syntax->quote_age = 0;
}
+/* Interface for caching frequently used quote pairs, independently of
+ the current quote delimiters (for example, consider a text macro
+ expansion that includes several copies of $@), and using AGE for
+ optimization. If QUOTES is NULL, don't use quoting. If OBS is
+ non-NULL, AGE should be the current quote age, and QUOTES should be
+ m4_get_syntax_quotes; the return value will be a cached quote pair,
+ where the pointer is valid at least as long as OBS is not reset,
+ but whose contents are only guaranteed until the next changequote
+ or quote_cache. Otherwise, OBS is NULL, AGE should be the same as
+ before, and QUOTES should be a previously returned cache value;
+ used to refresh the contents of the result. */
+const m4_string_pair *
+m4__quote_cache (m4_syntax_table *syntax, m4_obstack *obs, unsigned int age,
+ const m4_string_pair *quotes)
+{
+ /* Implementation - if AGE is non-zero, then the implementation of
+ set_quote_age guarantees that we can recreate the return value on
+ the fly; so we use static storage, and the contents must be used
+ immediately. If AGE is zero, then we must copy QUOTES onto OBS,
+ but we might as well cache that copy. */
+ if (!quotes)
+ return NULL;
+ if (age)
+ {
+ *syntax->cached_lquote = (age >> 8) & 0xff;
+ *syntax->cached_rquote = age & 0xff;
+ return &syntax->cached_simple;
+ }
+ if (!obs)
+ return quotes;
+ assert (quotes == &syntax->quote);
+ if (!syntax->cached_quote)
+ {
+ assert (obstack_object_size (obs) == 0);
+ syntax->cached_quote = (m4_string_pair *) obstack_copy (obs, quotes,
+ sizeof *quotes);
+ syntax->cached_quote->str1 = (char *) obstack_copy0 (obs, quotes->str1,
+ quotes->len1);
+ syntax->cached_quote->str2 = (char *) obstack_copy0 (obs, quotes->str2,
+ quotes->len2);
+ }
+ return syntax->cached_quote;
+}
+
/* Define these functions at the end, so that calls in the file use the
faster macro version from m4module.h. */
diff --git a/m4/utility.c b/m4/utility.c
index 89b4083..0a3296b 100644
--- a/m4/utility.c
+++ b/m4/utility.c
@@ -95,31 +95,6 @@ m4_numeric_arg (m4 *context, const char *caller, const char *arg, int *valuep)
return true;
}
-
-/* Print arguments from the table ARGV to obstack OBS, starting at
- index START, separated by SEP, and quoted by the current quotes, if
- QUOTED is true. */
-void
-m4_dump_args (m4 *context, m4_obstack *obs, size_t start, m4_macro_args *argv,
- const char *sep, bool quoted)
-{
- size_t i;
- size_t len = strlen (sep);
- bool need_sep = false;
- size_t argc = m4_arg_argc (argv);
-
- for (i = start; i < argc; i++)
- {
- if (need_sep)
- obstack_grow (obs, sep, len);
- else
- need_sep = true;
-
- m4_shipout_string (context, obs, M4ARG (i), M4ARGLEN (i), quoted);
- }
-}
-
-
/* Parse ARG as a truth value. If unrecognized, issue a warning on
behalf of ME and return PREVIOUS; otherwise return the parsed
value. */
diff --git a/modules/m4.c b/modules/m4.c
index afb9d0c..eb6540b 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -358,7 +358,8 @@ M4BUILTIN_HANDLER (dumpdef)
obstack_grow (obs, data.base[0], strlen (data.base[0]));
obstack_1grow (obs, ':');
obstack_1grow (obs, '\t');
- m4_symbol_print (symbol, obs, quotes, stack, arg_length, module);
+ m4_symbol_print (context, symbol, obs, quotes, stack, arg_length,
+ module);
obstack_1grow (obs, '\n');
}
@@ -792,7 +793,7 @@ M4BUILTIN_HANDLER (errprint)
size_t len;
assert (obstack_object_size (obs) == 0);
- m4_dump_args (context, obs, 1, argv, " ", false);
+ m4_arg_print (context, obs, argv, 1, NULL, true, " ", NULL, false, false);
m4_sysval_flush (context, false);
len = obstack_object_size (obs);
/* The close_stdin module makes it safe to skip checking the return
@@ -845,13 +846,13 @@ M4BUILTIN_HANDLER (m4exit)
version only the first. */
M4BUILTIN_HANDLER (m4wrap)
{
- assert (obstack_object_size (obs) == 0);
+ obs = m4_push_wrapup_init (context);
if (m4_get_posixly_correct_opt (context))
- m4_shipout_string (context, obs, M4ARG (1), M4ARGLEN (1), false);
+ obstack_grow (obs, M4ARG (1), M4ARGLEN (1));
else
- m4_dump_args (context, obs, 1, argv, " ", false);
- obstack_1grow (obs, '\0');
- m4_push_wrapup (context, obstack_finish (obs));
+ /* TODO allow pushing builtins. */
+ m4_arg_print (context, obs, argv, 1, NULL, true, " ", NULL, false, false);
+ m4_push_wrapup_finish ();
}
/* Enable tracing of all specified macros, or all, if none is specified.
diff --git a/tests/null.m4 b/tests/null.m4
index 2fa38dd..83e7c34 100644
--- a/tests/null.m4
+++ b/tests/null.m4
@@ -77,9 +77,8 @@ dnl Passed through len:
dnl Test m4exit separately from m4wrap; see above.
dnl Undefined macro name in m4symbols: not tested yet, needs to warn
dnl Defined macro name in m4symbols: not tested yet
-dnl Passed through m4wrap: not working yet
-m4wrap(``m4wrap:' -
- -
+dnl Passed through m4wrap:
+m4wrap(``m4wrap:' - -
')dnl
dnl Warning from maketemp: not tested yet. No file name includes NUL, needs to warn
dnl Warning from mkdtemp: not tested yet. No file name includes NUL, needs to warn
diff --git a/tests/null.out b/tests/null.out
index c42e03c..aca4b78 100644
--- a/tests/null.out
+++ b/tests/null.out
@@ -20,4 +20,4 @@ shift: - -,- -
substr: - -
traceon: strange: - -
undefine: ok
-m4wrap: -
+m4wrap: - -
--
1.5.4
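
As an aside on the AGE branch of m4__quote_cache above: when both
delimiters are a single byte, the pair can be packed into an integer and
rebuilt later without keeping a copy. The sketch below only illustrates
that idea. struct quote_pair, encode_age and decode_age are hypothetical
names, not part of the patch, and the sketch ignores anything else that
set_quote_age checks or folds into the age before settling on a nonzero
value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for m4's string pair type.  */
struct quote_pair
{
  char str1[2];
  size_t len1;
  char str2[2];
  size_t len2;
};

/* Pack two single-byte delimiters into a nonzero age.  Return 0 when
   the pair cannot be encoded (empty or multi-byte delimiters), which
   tells the caller to fall back to copying.  */
static unsigned int
encode_age (const char *lq, const char *rq)
{
  if (strlen (lq) != 1 || strlen (rq) != 1)
    return 0;
  return (((unsigned int) (*lq & 0xff)) << 8) | (unsigned int) (*rq & 0xff);
}

/* Rebuild the pair from a nonzero age, with no copy of the original
   delimiters needed.  */
static struct quote_pair
decode_age (unsigned int age)
{
  struct quote_pair q = { { 0, 0 }, 1, { 0, 0 }, 1 };
  q.str1[0] = (char) ((age >> 8) & 0xff);
  q.str2[0] = (char) (age & 0xff);
  return q;
}
```

With the default quotes, encode_age ("`", "'") gives a nonzero age and
decode_age recovers both bytes; a pair such as "<<" and ">>" encodes to
0, which corresponds to the branch that copies QUOTES onto OBS.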
From d59ecd1edf7a7f56a4f15e2a378a7871a746bc7f Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 13 Nov 2007 06:55:27 -0700
Subject: [PATCH] Stage 16: cache quotes and improve arg_print.
* src/m4.h (push_wrapup_init, push_wrapup_finish, quote_cache)
(func_print): New prototypes.
(arg_print): Adjust prototype.
* src/builtin.c (func_print): New function.
(define_user_macro): Slight cleanup.
(dump_args): Delete, no longer used.
(m4_errprint): Use arg_print.
(m4_m4wrap): Handle embedded NUL.
* src/debug.c (trace_pre): Use arg_print.
* src/input.c (cached_quote): New variable.
(push_wrapup): Split...
(push_wrapup_init, push_wrapup_finish): ...into these.
(input_print): Use arg_print.
(quote_cache): New function.
(pop_input, next_char_1, append_quote_token, set_quote_age):
Adjust users.
* src/macro.c (arg_text, make_argv_ref_token): Adjust users.
(arg_print): Add parameters.
* examples/null.m4: Test for NUL in m4wrap.
* examples/null.out: Update expected output.
* doc/m4.texinfo (Debug Levels): Test --arglength truncation.
(cherry picked from commit 44740d89961c48b712562dfc650dc0cb57898aa0)
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 28 +++++++++++
doc/m4.texinfo | 26 ++++++++++-
examples/null.m4 | 5 +-
examples/null.out | 2 +-
src/builtin.c | 91 +++++++++++++++++-------------------
src/debug.c | 38 ++-------------
src/input.c | 132 ++++++++++++++++++++++++++++++++++++++++------------
src/m4.h | 8 ++-
src/macro.c | 82 +++++++++++++++++++--------------
9 files changed, 257 insertions(+), 155 deletions(-)
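
A note on the "Handle embedded NUL" item: the old push_wrapup measured
its argument with strlen, so anything after a NUL byte was lost, whereas
collecting directly into an obstack carries the length along. The sketch
below shows the same idea with a plain growable buffer. struct buf,
buf_grow and join_args are hypothetical helpers written for this note,
not functions from m4 or from the obstack API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for an obstack.  */
struct buf
{
  char *data;
  size_t len;
  size_t cap;
};

/* Append LEN bytes of S, which may contain NUL.  Return 0 on success,
   -1 if memory runs out.  */
static int
buf_grow (struct buf *b, const char *s, size_t len)
{
  if (b->len + len > b->cap)
    {
      size_t cap = b->cap ? b->cap : 16;
      char *p;
      while (cap < b->len + len)
        cap *= 2;
      p = realloc (b->data, cap);
      if (!p)
        return -1;
      b->data = p;
      b->cap = cap;
    }
  if (len)
    memcpy (b->data + b->len, s, len);
  b->len += len;
  return 0;
}

/* Join ARGC counted strings with a single space, using the explicit
   lengths in LENS rather than relying on a terminating NUL.  */
static int
join_args (struct buf *b, const char *const *args, const size_t *lens,
           size_t argc)
{
  size_t i;
  for (i = 0; i < argc; i++)
    {
      if (i && buf_grow (b, " ", 1))
        return -1;
      if (buf_grow (b, args[i], lens[i]))
        return -1;
    }
  return 0;
}
```

Joining "m4wrap:" with the three-byte argument "-", NUL, "-" this way
keeps all eleven bytes, while strlen on that second argument would
report a length of one.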
diff --git a/ChangeLog b/ChangeLog
index 0f4e496..6578264 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,33 @@
2008-02-21 Eric Blake <address@hidden>
+ Stage 16: cache quotes and improve arg_print.
+ Cache rather than always copying quotes when pushing $@ refs; in
+ particular, reconstruct single-byte quotes on the fly. Allow NUL
+ through m4wrap. Improve sharing of code that prints arguments.
+ Memory impact: slight improvement, due to cached quotes.
+ Speed impact: slight improvement, due to less copying.
+ * src/m4.h (push_wrapup_init, push_wrapup_finish, quote_cache)
+ (func_print): New prototypes.
+ (arg_print): Adjust prototype.
+ * src/builtin.c (func_print): New function.
+ (define_user_macro): Slight cleanup.
+ (dump_args): Delete, no longer used.
+ (m4_errprint): Use arg_print.
+ (m4_m4wrap): Handle embedded NUL.
+ * src/debug.c (trace_pre): Use arg_print.
+ * src/input.c (cached_quote): New variable.
+ (push_wrapup): Split...
+ (push_wrapup_init, push_wrapup_finish): ...into these.
+ (input_print): Use arg_print.
+ (quote_cache): New function.
+ (pop_input, next_char_1, append_quote_token, set_quote_age):
+ Adjust users.
+ * src/macro.c (arg_text, make_argv_ref_token): Adjust users.
+ (arg_print): Add parameters.
+ * examples/null.m4: Test for NUL in m4wrap.
+ * examples/null.out: Update expected output.
+ * doc/m4.texinfo (Debug Levels): Test --arglength truncation.
+
Fix out-of-bounds read for sanitized macro names, from 2008-02-06.
* src/m4.c (m4_verror_at_line): Properly terminate the string.
Reported by Ralf Wildenhues.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 56445c0..2a549dd 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -3466,7 +3466,8 @@ following:
In trace output, show the actual arguments that were collected before
invoking the macro. This applies to all macro calls if the @samp{t}
flag is used, otherwise only the macros covered by calls of
-@code{traceon}.
+@code{traceon}. Arguments are subject to length truncation specified by
+the command line option @option{--arglength} (or @option{-l}).
@item c
In trace output, show several trace lines for each macro call. A line
@@ -3477,7 +3478,9 @@ after the call has completed.
@item e
In trace output, show the expansion of each macro call, if it is not
void. This applies to all macro calls if the @samp{t} flag is used,
-otherwise only the macros covered by calls of @code{traceon}.
+otherwise only the macros covered by calls of @code{traceon}. The
+expansion is subject to length truncation specified by the command line
+option @option{--arglength} (or @option{-l}).
@item f
In debug and trace output, include the name of the current input file in
@@ -3557,6 +3560,25 @@ foo
@result{}FOO
@end example
+The following example demonstrates the behavior of length truncation,
+when specified on the command line. Note that each argument and the
+final result are individually truncated. Also, the special tokens for
+builtin functions are not truncated.
+
+@comment options: -l6
+@example
+$ @kbd{m4 -d -l 6}
+define(`echo', `$@@')debugmode(`+t')
+@result{}
+echo(`1', `long string')
+@error{}m4trace: -1- echo(`1', `long s...') -> ``1',`l...'
+@result{}1,long string
+indir(`echo', defn(`changequote'))
+@error{}m4trace: -2- defn(`change...')
+@error{}m4trace: -1- indir(`echo', <changequote>) -> ``''
+@result{}
+@end example
+
@node Debug Output
@section Saving debugging output
diff --git a/examples/null.m4 b/examples/null.m4
index 2632522..79f4715 100644
--- a/examples/null.m4
+++ b/examples/null.m4
@@ -73,9 +73,8 @@ dnl Other arguments of indir:
dnl Passed through len:
`len:' len( ) len(- -)
dnl Test m4exit separately from m4wrap; see above.
-dnl Passed through m4wrap: not working yet
-m4wrap(``m4wrap:' -
- -
+dnl Passed through m4wrap:
+m4wrap(``m4wrap:' - -
')dnl
dnl Warning from maketemp: not tested yet. No file name includes NUL, needs to warn
dnl Warning from mkstemp: not tested yet. No file name includes NUL, needs to warn
diff --git a/examples/null.out b/examples/null.out
index c42e03c..aca4b78 100644
--- a/examples/null.out
+++ b/examples/null.out
@@ -20,4 +20,4 @@ shift: - -,- -
substr: - -
traceon: strange: - -
undefine: ok
-m4wrap: -
+m4wrap: - -
diff --git a/src/builtin.c b/src/builtin.c
index d4a0fee..09322ea 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -198,6 +198,28 @@ find_builtin_by_name (const char *name)
return bp;
return bp + 1;
}
+
+/*------------------------------------------------------------------.
+| Print a representation of FUNC to OBS. If FLATTEN, output QUOTES |
+| around an empty string instead. |
+`------------------------------------------------------------------*/
+void
+func_print (struct obstack *obs, const builtin *func, bool flatten,
+ const string_pair *quotes)
+{
+ assert (func);
+ if (flatten && quotes)
+ {
+ obstack_grow (obs, quotes->str1, quotes->len1);
+ obstack_grow (obs, quotes->str2, quotes->len2);
+ }
+ else if (!flatten)
+ {
+ obstack_1grow (obs, '<');
+ obstack_grow (obs, func->name, strlen (func->name));
+ obstack_1grow (obs, '>');
+ }
+}
/*-------------------------------------------------------------------------.
| Install a builtin macro with name NAME, bound to the C function given in |
@@ -398,14 +420,15 @@ free_regex (void)
}
}
-/*-------------------------------------------------------------------------.
-| Define a predefined or user-defined macro, with name NAME, and expansion |
-| TEXT. MODE destinguishes between the "define" and the "pushdef" case. |
-| It is also used from main (). |
-`-------------------------------------------------------------------------*/
+/*-----------------------------------------------------------------.
+| Define a predefined or user-defined macro, with name NAME of |
+| length NAME_LEN, and expansion TEXT. MODE is SYMBOL_INSERT for |
+| "define" or SYMBOL_PUSHDEF for "pushdef". This function is also |
+| used from main (). |
+`-----------------------------------------------------------------*/
void
-define_user_macro (const char *name, size_t len, const char *text,
+define_user_macro (const char *name, size_t name_len, const char *text,
symbol_lookup mode)
{
symbol *s;
@@ -422,24 +445,23 @@ define_user_macro (const char *name, size_t len, const char *text,
if (macro_sequence_inuse && text)
{
regoff_t offset = 0;
- len = strlen (defn);
+ struct re_registers *regs = &macro_sequence_regs;
+ size_t len = strlen (defn);
while (offset < len
&& (offset = re_search (&macro_sequence_buf, defn, len, offset,
- len - offset, &macro_sequence_regs)) >= 0)
+ len - offset, regs)) >= 0)
{
/* Skip empty matches. */
- if (macro_sequence_regs.start[0] == macro_sequence_regs.end[0])
+ if (regs->start[0] == regs->end[0])
offset++;
else
{
- char tmp;
- offset = macro_sequence_regs.end[0];
- tmp = defn[offset];
- defn[offset] = '\0';
- m4_warn (0, NULL, _("definition of `%s' contains sequence `%s'"),
- name, defn + macro_sequence_regs.start[0]);
- defn[offset] = tmp;
+ offset = regs->end[0];
+ m4_warn (0, NULL,
+ _("definition of `%s' contains sequence `%.*s'"),
+ name, regs->end[0] - regs->start[0],
+ defn + regs->start[0]);
}
}
if (offset == -2)
@@ -599,34 +621,6 @@ shipout_int (struct obstack *obs, int val)
obstack_grow (obs, s, strlen (s));
}
-/*------------------------------------------------------------------.
-| Print arguments from the table ARGV to obstack OBS, starting with |
-| START, separated by SEP, and quoted by the current quotes if |
-| QUOTED is true. |
-`------------------------------------------------------------------*/
-
-static void
-dump_args (struct obstack *obs, int start, macro_arguments *argv,
- const char *sep, bool quoted)
-{
- unsigned int i;
- bool dump_sep = false;
- size_t len = strlen (sep);
- unsigned int argc = arg_argc (argv);
-
- for (i = start; i < argc; i++)
- {
- if (dump_sep)
- obstack_grow (obs, sep, len);
- else
- dump_sep = true;
- if (quoted)
- obstack_grow (obs, curr_quote.str1, curr_quote.len1);
- obstack_grow (obs, ARG (i), ARG_LEN (i));
- if (quoted)
- obstack_grow (obs, curr_quote.str2, curr_quote.len2);
- }
-}
/* The rest of this file is code for builtins and expansion of user
defined macros. All the functions for builtins have a prototype as:
@@ -1518,7 +1512,7 @@ m4_errprint (struct obstack *obs, int argc, macro_arguments *argv)
if (bad_argc (ARG (0), argc, 1, -1))
return;
- dump_args (obs, 1, argv, " ", false);
+ arg_print (obs, argv, 1, NULL, true, " ", NULL, false);
debug_flush_files ();
len = obstack_object_size (obs);
/* The close_stdin module makes it safe to skip checking the return
@@ -1599,12 +1593,13 @@ m4_m4wrap (struct obstack *obs, int argc, macro_arguments *argv)
{
if (bad_argc (ARG (0), argc, 1, -1))
return;
+ obs = push_wrapup_init ();
if (no_gnu_extensions)
obstack_grow (obs, ARG (1), ARG_LEN (1));
else
- dump_args (obs, 1, argv, " ", false);
- obstack_1grow (obs, '\0');
- push_wrapup ((char *) obstack_finish (obs));
+ /* TODO - allow builtins, rather than always flattening. */
+ arg_print (obs, argv, 1, NULL, true, " ", NULL, false);
+ push_wrapup_finish ();
}
/* Enable tracing of all specified macros, or all, if none is specified.
diff --git a/src/debug.c b/src/debug.c
index d6b2ddc..737ee52 100644
--- a/src/debug.c
+++ b/src/debug.c
@@ -359,44 +359,16 @@ trace_prepre (const char *name, int id)
void
trace_pre (const char *name, int id, macro_arguments *argv)
{
- int i;
- const builtin *bp;
- int argc = arg_argc (argv);
-
trace_header (id);
trace_format ("%s", name);
- if (argc > 1 && (debug_level & DEBUG_TRACE_ARGS))
+ if (arg_argc (argv) > 1 && (debug_level & DEBUG_TRACE_ARGS))
{
+ int len = max_debug_argument_length;
trace_format ("(");
-
- for (i = 1; i < argc; i++)
- {
- if (i != 1)
- trace_format (", ");
-
- switch (arg_type (argv, i))
- {
- case TOKEN_TEXT:
- trace_format ("%l%S%r", ARG (i));
- break;
-
- case TOKEN_FUNC:
- bp = find_builtin_by_addr (arg_func (argv, i));
- if (bp == NULL)
- {
- assert (!"trace_pre");
- abort ();
- }
- trace_format ("<%s>", bp->name);
- break;
-
- default:
- assert (!"trace_pre");
- abort ();
- }
-
- }
+ arg_print (&trace, argv, 1,
+ (debug_level & DEBUG_TRACE_QUOTE) ? &curr_quote : NULL,
+ false, ", ", &len, true);
trace_format (")");
}
diff --git a/src/input.c b/src/input.c
index 5c3b345..bbd50f4 100644
--- a/src/input.c
+++ b/src/input.c
@@ -42,14 +42,14 @@
loops (e.g. "define(`f',`m4wrap(`f')')f"), without memory leaks.
Pushing new input on the input stack is done by push_file (),
- push_string (), push_wrapup () (for wrapup text), and push_macro ()
- (for macro definitions). Because macro expansion needs direct
- access to the current input obstack (for optimization), push_string
- () is split in two functions, push_string_init (), which returns a
- pointer to the current input stack, and push_string_finish (),
- which returns a pointer to the final text. The input_block *next
- is used to manage the coordination between the different push
- routines.
+ push_string (), push_wrapup_init/push_wrapup_finish () (for wrapup
+ text), and push_macro () (for macro definitions). Because macro
+ expansion needs direct access to the current input obstack (for
+ optimization), push_string () is split in two functions,
+ push_string_init (), which returns a pointer to the current input
+ stack, and push_string_finish (), which returns a pointer to the
+ final text. The input_block *next is used to manage the
+ coordination between the different push routines.
The current file and line number are stored in two global
variables, for use by the error handling functions in m4.c. Macro
@@ -185,6 +185,9 @@ static struct re_registers regs;
context. */
static unsigned int current_quote_age;
+/* Cache a quote pair. See quote_cache. */
+static string_pair *cached_quote;
+
static bool pop_input (bool);
static void set_quote_age (void);
@@ -500,17 +503,14 @@ push_string_finish (void)
return ret;
}
-/*------------------------------------------------------------------.
-| The function push_wrapup () pushes a string on the wrapup stack. |
-| When the normal input stack gets empty, the wrapup stack will |
-| become the input stack, and push_string () and push_file () will |
-| operate on wrapup_stack. Push_wrapup should be done as |
-| push_string (), but this will suffice, as long as arguments to |
-| m4_m4wrap () are moderate in size. |
-`------------------------------------------------------------------*/
+/*--------------------------------------------------------------.
+| The function push_wrapup_init () returns an obstack ready for |
+| direct expansion of wrapup text, and should be followed by |
+| push_wrapup_finish (). |
+`--------------------------------------------------------------*/
-void
-push_wrapup (const char *s)
+struct obstack *
+push_wrapup_init (void)
{
input_block *i;
i = (input_block *) obstack_alloc (wrapup_stack, sizeof *i);
@@ -518,9 +518,28 @@ push_wrapup (const char *s)
i->type = INPUT_STRING;
i->file = current_file;
i->line = current_line;
- i->u.u_s.len = strlen (s);
- i->u.u_s.str = (char *) obstack_copy (wrapup_stack, s, i->u.u_s.len);
wsp = i;
+ return wrapup_stack;
+}
+
+/*---------------------------------------------------------------.
+| After pushing wrapup text, push_wrapup_finish () completes the |
+| bookkeeping. |
+`---------------------------------------------------------------*/
+void
+push_wrapup_finish (void)
+{
+ input_block *i = wsp;
+ if (obstack_object_size (wrapup_stack) == 0)
+ {
+ wsp = i->prev;
+ obstack_free (wrapup_stack, i);
+ }
+ else
+ {
+ i->u.u_s.len = obstack_object_size (wrapup_stack);
+ i->u.u_s.str = (char *) obstack_finish (wrapup_stack);
+ }
}
@@ -607,6 +626,7 @@ pop_input (bool cleanup)
abort ();
}
obstack_free (current_input, isp);
+ cached_quote = NULL;
next = NULL; /* might be set in push_string_init () */
isp = tmp;
@@ -674,13 +694,7 @@ input_print (struct obstack *obs, const input_block *input)
obstack_1grow (obs, '>');
break;
case INPUT_MACRO:
- {
- const builtin *bp = find_builtin_by_addr (input->u.func);
- assert (bp);
- obstack_1grow (obs, '<');
- obstack_grow (obs, bp->name, strlen (bp->name));
- obstack_1grow (obs, '>');
- }
+ func_print (obs, find_builtin_by_addr (input->u.func), false, NULL);
break;
case INPUT_CHAIN:
chain = input->u.u_c.chain;
@@ -696,7 +710,9 @@ input_print (struct obstack *obs, const input_block *input)
case CHAIN_ARGV:
assert (!chain->u.u_a.comma);
if (arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, &maxlen))
+ quote_cache (NULL, chain->quote_age,
+ chain->u.u_a.quotes),
+ chain->u.u_a.flatten, NULL, &maxlen, false))
return;
break;
default:
@@ -783,7 +799,9 @@ peek_input (bool allow_argv)
argument from argv. */
push_string_init ();
push_arg_quote (current_input, chain->u.u_a.argv,
- chain->u.u_a.index, chain->u.u_a.quotes);
+ chain->u.u_a.index,
+ quote_cache (NULL, chain->quote_age,
+ chain->u.u_a.quotes));
chain->u.u_a.index++;
chain->u.u_a.comma = true;
push_string_finish ();
@@ -911,7 +929,9 @@ next_char_1 (bool allow_quote)
argument from argv. */
push_string_init ();
push_arg_quote (current_input, chain->u.u_a.argv,
- chain->u.u_a.index, chain->u.u_a.quotes);
+ chain->u.u_a.index,
+ quote_cache (NULL, chain->quote_age,
+ chain->u.u_a.quotes));
chain->u.u_a.index++;
chain->u.u_a.comma = true;
push_string_finish ();
@@ -1007,7 +1027,9 @@ append_quote_token (struct obstack *obs, token_data *td)
if (src_chain->type == CHAIN_ARGV)
{
arg_print (obs, src_chain->u.u_a.argv, src_chain->u.u_a.index,
- src_chain->u.u_a.quotes, NULL);
+ quote_cache (NULL, src_chain->quote_age,
+ src_chain->u.u_a.quotes),
+ src_chain->u.u_a.flatten, NULL, NULL, false);
arg_adjust_refcount (src_chain->u.u_a.argv, false);
return;
}
@@ -1366,6 +1388,7 @@ set_quote_age (void)
| (*curr_quote.str2 & 0xff));
else
current_quote_age = 0;
+ cached_quote = NULL;
}
/* Return the current quote age. Each non-trivial changequote alters
@@ -1391,6 +1414,53 @@ safe_quotes (void)
{
return current_quote_age != 0;
}
+
+/* Interface for caching frequently used quote pairs, using AGE for
+ optimization. If QUOTES is NULL, don't use quoting. If OBS is
+ non-NULL, AGE should be the current quote age, and QUOTES should be
+ &curr_quote; the return value will be a cached quote pair, where
+ the pointer is valid at least as long as OBS is not reset, but
+ whose contents are only guaranteed until the next changequote or
+ quote_cache. Otherwise, OBS is NULL, AGE should be the same as
+ before, and QUOTES should be a previously returned cache value;
+ used to refresh the contents of the result. */
+const string_pair *
+quote_cache (struct obstack *obs, unsigned int age, const string_pair *quotes)
+{
+ static char lquote[2];
+ static char rquote[2];
+ static string_pair simple = {lquote, 1, rquote, 1};
+
+ /* Implementation - if AGE is non-zero, then the implementation of
+ set_quote_age guarantees that we can recreate the return value on
+ the fly; so we use static storage, and the contents must be used
+ immediately. If AGE is zero, then we must copy QUOTES onto OBS
+ (since changequote will invalidate the original), but we might as
+ well cache that copy (in case the current expansion contains more
+ than one instance of $@). */
+ if (!quotes)
+ return NULL;
+ if (age)
+ {
+ *lquote = (age >> 8) & 0xff;
+ *rquote = age & 0xff;
+ return &simple;
+ }
+ if (!obs)
+ return quotes;
+ assert (next && quotes == &curr_quote);
+ if (!cached_quote)
+ {
+ assert (obs == current_input && obstack_object_size (obs) == 0);
+ cached_quote = (string_pair *) obstack_copy (obs, quotes,
+ sizeof *quotes);
+ cached_quote->str1 = (char *) obstack_copy0 (obs, quotes->str1,
+ quotes->len1);
+ cached_quote->str2 = (char *) obstack_copy0 (obs, quotes->str2,
+ quotes->len2);
+ }
+ return cached_quote;
+}
/*--------------------------------------------------------------------.
diff --git a/src/m4.h b/src/m4.h
index e1da7a7..0c2a8c8 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -386,7 +386,8 @@ void push_macro (builtin_func *);
struct obstack *push_string_init (void);
bool push_token (token_data *, int, bool);
const input_block *push_string_finish (void);
-void push_wrapup (const char *);
+struct obstack *push_wrapup_init (void);
+void push_wrapup_finish (void);
bool pop_wrapup (void);
void input_print (struct obstack *, const input_block *);
@@ -410,6 +411,8 @@ void set_word_regexp (const char *, const char *);
#endif
unsigned int quote_age (void);
bool safe_quotes (void);
+const string_pair *quote_cache (struct obstack *, unsigned int,
+ const string_pair *);
/* File: output.c --- output functions. */
extern int current_diversion;
@@ -494,7 +497,7 @@ size_t arg_len (macro_arguments *, unsigned int);
builtin_func *arg_func (macro_arguments *, unsigned int);
struct obstack *arg_scratch (void);
bool arg_print (struct obstack *, macro_arguments *, unsigned int,
- const string_pair *, int *);
+ const string_pair *, bool, const char *, int *, bool);
macro_arguments *make_argv_ref (macro_arguments *, const char *, size_t,
bool, bool);
void push_arg (struct obstack *, macro_arguments *, unsigned int);
@@ -553,6 +556,7 @@ const char *ntoa (int32_t, int);
const builtin *find_builtin_by_addr (builtin_func *);
const builtin *find_builtin_by_name (const char *);
+void func_print (struct obstack *, const builtin *, bool, const string_pair *);
/* File: path.c --- path search for include files. */
diff --git a/src/macro.c b/src/macro.c
index 8b85cf6..8b7e303 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -911,7 +911,9 @@ arg_text (macro_arguments *argv, unsigned int index)
break;
case CHAIN_ARGV:
arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, NULL);
+ quote_cache (NULL, chain->quote_age,
+ chain->u.u_a.quotes),
+ chain->u.u_a.flatten, NULL, NULL, false);
break;
default:
assert (!"arg_text");
@@ -1097,50 +1099,70 @@ arg_scratch (void)
/* Dump a representation of ARGV to the obstack OBS, starting with
argument INDEX. If QUOTES is non-NULL, each argument is displayed
- with those quotes. If MAX_LEN is non-NULL, truncate the output
- after *MAX_LEN bytes are output and return true; otherwise, return
- false, and reduce *MAX_LEN by the number of bytes output. */
+ with those quotes. If FLATTEN, builtins are ignored. Separate
+ arguments with SEP, which defaults to a comma. If MAX_LEN is
+ non-NULL, truncate the output after *MAX_LEN bytes are output and
+ return true; otherwise, return false, and reduce *MAX_LEN by the
+ number of bytes output. If QUOTE_EACH, the truncation length is
+ reset for each argument, quotes do not count against length, and
+ all arguments are printed; otherwise, quotes count against the
+ length and trailing arguments may be discarded. */
bool
arg_print (struct obstack *obs, macro_arguments *argv, unsigned int index,
- const string_pair *quotes, int *max_len)
+ const string_pair *quotes, bool flatten, const char *sep,
+ int *max_len, bool quote_each)
{
int len = max_len ? *max_len : INT_MAX;
unsigned int i;
token_data *token;
token_chain *chain;
- bool comma = false;
-
+ bool use_sep = false;
+ bool done;
+ size_t sep_len;
+ int *plen = quote_each ? NULL : &len;
+
+ if (!sep)
+ sep = ",";
+ sep_len = strlen (sep);
for (i = index; i < argv->argc; i++)
{
- if (comma && obstack_print (obs, ",", 1, &len))
+ if (quote_each && max_len)
+ len = *max_len;
+ if (use_sep && obstack_print (obs, sep, sep_len, plen))
return true;
- else
- comma = true;
+ use_sep = true;
token = arg_token (argv, i, NULL);
- if (quotes && obstack_print (obs, quotes->str1, quotes->len1, &len))
- return true;
switch (TOKEN_DATA_TYPE (token))
{
case TOKEN_TEXT:
+ if (quotes && obstack_print (obs, quotes->str1, quotes->len1, plen))
+ return true;
if (obstack_print (obs, TOKEN_DATA_TEXT (token),
- TOKEN_DATA_LEN (token), &len))
+ TOKEN_DATA_LEN (token), &len) && !quote_each)
+ return true;
+ if (quotes && obstack_print (obs, quotes->str2, quotes->len2, plen))
return true;
break;
case TOKEN_COMP:
+ if (quotes && obstack_print (obs, quotes->str1, quotes->len1, plen))
+ return true;
chain = token->u.u_c.chain;
- while (chain)
+ done = false;
+ while (chain && !done)
{
switch (chain->type)
{
case CHAIN_STR:
if (obstack_print (obs, chain->u.u_s.str, chain->u.u_s.len,
&len))
- return true;
+ done = true;
break;
case CHAIN_ARGV:
if (arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
- chain->u.u_a.quotes, &len))
- return true;
+ quote_cache (NULL, chain->quote_age,
+ chain->u.u_a.quotes),
+ flatten, NULL, &len, false))
+ done = true;
break;
default:
assert (!"arg_print");
@@ -1148,16 +1170,19 @@ arg_print (struct obstack *obs, macro_arguments *argv, unsigned int index,
}
chain = chain->next;
}
+ if (done && !quote_each)
+ return true;
+ if (quotes && obstack_print (obs, quotes->str2, quotes->len2, plen))
+ return true;
break;
case TOKEN_FUNC:
- /* TODO - support func. */
+ func_print (obs, find_builtin_by_addr (TOKEN_DATA_FUNC (token)),
+ flatten, quotes);
+ break;
default:
assert (!"arg_print");
abort ();
}
- if (quotes && obstack_print (obs, quotes->str2, quotes->len2,
- &len))
- return true;
}
if (max_len)
*max_len = len;
@@ -1201,20 +1226,7 @@ make_argv_ref_token (token_data *token, struct obstack *obs, int level,
chain->u.u_a.flatten = flatten;
chain->u.u_a.comma = false;
chain->u.u_a.skip_last = false;
- if (quotes)
- {
- /* Clone the quotes into the obstack, since a subsequent
- changequote may take effect before the $@ ref is
- rescanned. */
- /* TODO - optimize when quote_age is nonzero. */
- string_pair *tmp = (string_pair *) obstack_copy (obs, quotes,
- sizeof *quotes);
- tmp->str1 = (char *) obstack_copy0 (obs, quotes->str1, quotes->len1);
- tmp->str2 = (char *) obstack_copy0 (obs, quotes->str2, quotes->len2);
- chain->u.u_a.quotes = tmp;
- }
- else
- chain->u.u_a.quotes = NULL;
+ chain->u.u_a.quotes = quote_cache (obs, chain->quote_age, quotes);
return token;
}
--
1.5.4
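
For readers following the QUOTE_EACH case of arg_print: the length limit
is applied to each argument on its own, and quotes and separators do not
count against it. The sketch below reproduces only that behavior, on a
fixed-size buffer. append and print_args are hypothetical helpers, not
the patch's obstack_print or arg_print, and the sketch assumes plain
text arguments, a nonzero buffer capacity, and "..." as the truncation
marker, as in the trace output documented in m4.texinfo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append up to N bytes of S to the NUL-terminated string in OUT, which
   has room for CAP bytes in total; stop quietly when OUT is full.  */
static void
append (char *out, size_t cap, const char *s, size_t n)
{
  size_t used = strlen (out);
  if (used + 1 >= cap)
    return;
  if (n > cap - used - 1)
    n = cap - used - 1;
  memcpy (out + used, s, n);
  out[used + n] = '\0';
}

/* Print each argument quoted with LQ and RQ and separated by SEP.
   Each argument's text is limited to MAX_LEN bytes, with "..." marking
   a truncation; quotes and separators do not count against the limit.
   CAP must be nonzero.  */
static void
print_args (char *out, size_t cap, const char *const *args, size_t argc,
            const char *lq, const char *rq, const char *sep,
            size_t max_len)
{
  size_t i;
  out[0] = '\0';
  for (i = 0; i < argc; i++)
    {
      size_t len = strlen (args[i]);
      if (i)
        append (out, cap, sep, strlen (sep));
      append (out, cap, lq, strlen (lq));
      if (len > max_len)
        {
          /* The limit restarts for every argument.  */
          append (out, cap, args[i], max_len);
          append (out, cap, "...", 3);
        }
      else
        append (out, cap, args[i], len);
      append (out, cap, rq, strlen (rq));
    }
}
```

Called with the arguments "1" and "long string", the default quotes, a
separator of ", " and a limit of 6, this produces `1', `long s...',
matching the argument list in the -l 6 trace example added to the
documentation.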