[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: fix frozen diversions on master branch
From: |
Eric Blake |
Subject: |
Re: fix frozen diversions on master branch |
Date: |
Mon, 19 May 2008 22:09:52 -0600 |
User-agent: |
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666 |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 5/13/2008 11:39 AM:
| I'm also thinking of changing the frozen file format 2 to require newlines
| between strings. For example, we now have:
|
| F3,3,2
| dnldnlm4
|
| as the designator for a macro named 'dnl', pointing to the builtin 'dnl' in
| module 'm4'. But if we instead used:
|
| F3,3,2
| dnl
| dnl
| m4
|
| I also realized that because this patch uses the quotearg module, and
quotearg
| does not pad out a trailing NUL as '\000', then without the newline between
| strings, this would be ambiguous (macro '\0' with content '\1', or macro
'\1'
| with content '\n'?) once argv_ref patch 24 allows NUL in macro names.
Committing this patch, in preparation for merging stage 24; it adds a
newline in the frozen format between all consecutive strings (Q, C, F, T
directives). NOTE: this invalidates all existing V2 frozen files; but I
am okay with that (we've already done this in the past, and can continue
to do so up until the 2.0 release). Either re-freeze the original file,
or hand-edit your frozen file to add in appropriate newlines.
| then I could use getndelim2 (rapidly searching for the next \ escape or the
| next newline), instead of fgetc (one byte at a time), resulting in a
speedup by
| processing in blocks instead of bytes (I recently posted a 40% speedup to
| coreutils' cut doing just that).
That patch to follow later, hopefully with some timing numbers to back up
my (as-yet-untested) claim that it should speed up reload.
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkgyTxAACgkQ84KuGfSFAYAskwCcD+w5aR9R0f70kzraH0r/W3dR
di8AoM16YIakPyox4E6pIpWI1wgIfUuJ
=DJDU
-----END PGP SIGNATURE-----
>From a8e726e4bf47514e770f5885e416f92f150419a9 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Mon, 19 May 2008 21:58:16 -0600
Subject: [PATCH] In frozen file, split consecutive strings with newline.
* src/freeze.c (dump_symbol_CB): Add newline to 'T', 'F'.
(produce_frozen_state): Likewise for 'Q', 'C'.
(reload_frozen_state): Parse the new layout.
[GET_DIRECTIVE]: Fix format 1 regression from 2008-05-13.
* tests/freeze.at (loading format 2): Rewrite to new format.
(reloading unknown builtin): Likewise.
(loading format 1): Make sure backslash-newline is not
interpreted.
* doc/m4.texinfo (Frozen file format 2): Document the format.
* NEWS: Document this change.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 12 +++++++
NEWS | 12 ++++---
doc/m4.texinfo | 32 ++++++++----------
src/freeze.c | 98 +++++++++++++++++++++++++++++++++++++++++--------------
tests/freeze.at | 44 +++++++++++++++++-------
5 files changed, 137 insertions(+), 61 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 625a99f..bad66ae 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,17 @@
2008-05-19 Eric Blake <address@hidden>
+ In frozen file, split consecutive strings with newline.
+ * src/freeze.c (dump_symbol_CB): Add newline to 'T', 'F'.
+ (produce_frozen_state): Likewise for 'Q', 'C'.
+ (reload_frozen_state): Parse the new layout.
+ [GET_DIRECTIVE]: Fix format 1 regression from 2008-05-13.
+ * tests/freeze.at (loading format 2): Rewrite to new format.
+ (reloading unknown builtin): Likewise.
+ (loading format 1): Make sure backslash-newline is not
+ interpreted.
+ * doc/m4.texinfo (Frozen file format 2): Document the format.
+ * NEWS: Document this change.
+
Fix xgettext options.
* po/Makevars (XGETTEXT_OPTIONS): The " must be passed to
xgettext.
diff --git a/NEWS b/NEWS
index 6d2624d..1391e36 100644
--- a/NEWS
+++ b/NEWS
@@ -196,11 +196,13 @@ promoted to 2.0.
*** The syntax of frozen files format V2 has been improved to save
additional state. This includes the `R' directive for default regular
expression syntax, the `t' directive for traced macros, and the `d'
- directive for debug mode. Also, a V2 file can now be represented
- completely in ASCII, thanks to escape sequences. Unfortunately, files
- frozen by M4 1.4q cannot be read by 1.9b, but since 1.4q was not widely
- distributed, this is not expected to be much of an issue, and comes
- with the territory of using a beta release.
+ directive for debug mode. Existing directives with consecutive strings
+ now require an intermediate newline, for faster parsing. Also, a V2
+ file can now be represented completely in ASCII, thanks to escape
+ sequences. Unfortunately, files frozen by M4 1.4q cannot be read by
+ 1.9b, but since 1.4q was not widely distributed, this is not expected
+ to be much of an issue, and comes with the territory of using a beta
+ release.
- FIXME: format 2 still needs to catch more missing state; once 2.0 is
released, any further changes would introduce format 3.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 6978714..2048483 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -8384,12 +8384,9 @@ always behave the same as the original file.
These shortcomings have been addressed in version 2 of the frozen file
syntax. New directives have been added, and existing directives have
-additional, and sometimes optional, parameters.
address@hidden FIXME - change implementation to match this:
address@hidden All @var{str} instances
address@hidden in the grammar are now followed by @key{NL}, which makes the
split
address@hidden between consecutive strings easier to recognize.
-Strings may now
+additional, and sometimes optional, parameters. All @var{str} instances
+in the grammar are now followed by @key{NL}, which makes the split
+between consecutive strings easier to recognize. Strings may now
contain escape sequences modeled after C, such as @samp{\n} for newline
or @samp{\0} for @sc{nul}, so that the frozen file can be pure
@sc{ascii} (although when hand-editing a frozen file, it is still
@@ -8404,8 +8401,7 @@ Confirms the format of the file. @code{m4}
@value{VERSION} only creates
frozen files where @var{number} is 2. This directive must be the first
non-comment in the file, and may not appear more than once.
address@hidden C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL}
@var{str2} @key{NL}
address@hidden C @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
address@hidden C @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL}
@var{str2} @key{NL}
Uses @var{str1} and @var{str2} as the begin-comment and
end-comment strings. If omitted, then @samp{#} and @key{NL} are the
comment delimiters.
@@ -8429,9 +8425,9 @@ If omitted, then diversion 0 is current.
@comment FIXME - the first usage, with only one string, is not supported
@comment in the current code
address@hidden F @var{len1} @key{NL} @var{str1} @key{NL}
address@hidden F @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
address@hidden F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1}
@var{str2} @var{str3} @key{NL}
address@hidden @item F @var{len1} @key{NL} @var{str1} @key{NL}
address@hidden F @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL}
@var{str2} @key{NL}
address@hidden F @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1}
@key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
Defines, through @code{pushdef}, a definition for @var{str1} expanding
to the function whose builtin name is given by @var{str2} (defaulting to
@var{str1} if not present). With two arguments, the builtin name is
@@ -8446,7 +8442,7 @@ builtin entries to the symbol table. Modules must be
loaded prior to
specifying module-specific builtins via the three-argument @code{F} or
@code{T}.
address@hidden Q @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
address@hidden Q @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL}
@var{str2} @key{NL}
Uses @var{str1} and @var{str2} as the begin-quote and end-quote strings.
If omitted, then @samp{`} and @samp{'} are the quote delimiters.
@@ -8465,13 +8461,13 @@ Enables tracing for any macro named @var{str}, similar
to using the
@code{traceon} builtin. This option may occur more than once for
multiple macros; if omitted, no macro starts out as traced.
address@hidden FIXME - what about a module's text macros, like __gnu__?
address@hidden T @var{len1} , @var{len2} @key{NL} @var{str1} @var{str2} @key{NL}
address@hidden T @var{len1} , @var{len2} @key{NL} @var{str1} @key{NL}
@var{str2} @key{NL}
address@hidden T @var{len1} , @var{len2} , @var{len3} @key{NL} @var{str1}
@key{NL} @var{str2} @key{NL} @var{str3} @key{NL}
Defines, though @code{pushdef}, a definition for @var{str1} expanding to
-the text given by @var{str2}. With two arguments, the builtin name is
-searched for among the intrinsic builtin functions only; with three
-arguments, the builtin name is searched for amongst the builtin
-functions defined by the module named by @var{str3}.
+the text given by @var{str2}. This directive may appear more than once
+for the same name, and its order, along with @samp{F}, is important. If
+present, the optional third argument associates the macro with a module
+named by @var{str3}.
@end table
@node Compatibility
diff --git a/src/freeze.c b/src/freeze.c
index db5a1cf..d32fc9e 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -216,10 +216,14 @@ dump_symbol_CB (m4_symbol_table *symtab, const char
*symbol_name,
fputc ('\n', file);
produce_mem_dump (file, symbol_name, symbol_len);
+ fputc ('\n', file);
produce_mem_dump (file, text, text_len);
- if (module)
- produce_mem_dump (file, module_name, module_len);
fputc ('\n', file);
+ if (module)
+ {
+ produce_mem_dump (file, module_name, module_len);
+ fputc ('\n', file);
+ }
}
else if (m4_is_symbol_value_func (value))
{
@@ -235,10 +239,14 @@ dump_symbol_CB (m4_symbol_table *symtab, const char
*symbol_name,
fputc ('\n', file);
produce_mem_dump (file, symbol_name, symbol_len);
+ fputc ('\n', file);
produce_mem_dump (file, bp->name, bp_len);
- if (module)
- produce_mem_dump (file, module_name, module_len);
fputc ('\n', file);
+ if (module)
+ {
+ produce_mem_dump (file, module_name, module_len);
+ fputc ('\n', file);
+ }
}
else if (m4_is_symbol_value_placeholder (value))
; /* Nothing to do for a builtin we couldn't reload earlier. */
@@ -278,6 +286,7 @@ produce_frozen_state (m4 *context, const char *name)
{
xfprintf (file, "Q%zu,%zu\n", pair->len1, pair->len2);
produce_mem_dump (file, pair->str1, pair->len1);
+ fputc ('\n', file);
produce_mem_dump (file, pair->str2, pair->len2);
fputc ('\n', file);
}
@@ -288,6 +297,7 @@ produce_frozen_state (m4 *context, const char *name)
{
xfprintf (file, "C%zu,%zu\n", pair->len1, pair->len2);
produce_mem_dump (file, pair->str1, pair->len1);
+ fputc ('\n', file);
produce_mem_dump (file, pair->str2, pair->len2);
fputc ('\n', file);
}
@@ -471,12 +481,21 @@ reload_frozen_state (m4 *context, const char *name)
} \
while (0)
-#define GET_STRING(File, Buf, BufSize, StrLen) \
+#define GET_STRING(File, Buf, BufSize, StrLen, UseChar) \
do \
{ \
size_t len = (StrLen); \
char *p; \
int ch; \
+ if (UseChar) \
+ { \
+ ungetc (character, File); \
+ if (advance_line) \
+ { \
+ assert (character == '\n'); \
+ advance_line = false; \
+ } \
+ } \
CHECK_ALLOCATION ((Buf), (BufSize), len); \
p = (Buf); \
while (len-- > 0) \
@@ -490,6 +509,13 @@ reload_frozen_state (m4 *context, const char *name)
*p++ = ch; \
} \
*p = '\0'; \
+ GET_CHARACTER; \
+ while (version > 1 && character == '\\') \
+ { \
+ GET_CHARACTER; \
+ VALIDATE ('\n'); \
+ GET_CHARACTER; \
+ } \
} \
while (0)
@@ -526,12 +552,6 @@ reload_frozen_state (m4 *context, const char *name)
GET_CHARACTER; \
VALIDATE ('\n'); \
} \
- else if (character == '\\') \
- { \
- GET_CHARACTER; \
- VALIDATE ('\n'); \
- continue; \
- } \
} \
while (character == '\n')
@@ -600,7 +620,7 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'd');
GET_CHARACTER;
GET_NUMBER (number[0], false);
VALIDATE ('\n');
- GET_STRING (file, string[0], allocated[0], number[0]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
VALIDATE ('\n');
m4_set_debug_level_opt (context, m4_debug_decode (context, 0,
@@ -643,9 +663,19 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'F');
/* Get string contents. */
- GET_STRING (file, string[0], allocated[0], number[0]);
- GET_STRING (file, string[1], allocated[1], number[1]);
- GET_STRING (file, string[2], allocated[2], number[2]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
+ if (version > 1)
+ {
+ VALIDATE ('\n');
+ GET_CHARACTER;
+ }
+ GET_STRING (file, string[1], allocated[1], number[1], true);
+ if (version > 1 && number[2])
+ {
+ VALIDATE ('\n');
+ GET_CHARACTER;
+ }
+ GET_STRING (file, string[2], allocated[2], number[2], true);
VALIDATE ('\n');
/* Enter a macro having a builtin function as a definition. */
@@ -686,7 +716,7 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'M');
GET_CHARACTER;
GET_NUMBER (number[0], false);
VALIDATE ('\n');
- GET_STRING (file, string[0], allocated[0], number[0]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
VALIDATE ('\n');
m4__module_open (context, string[0], NULL);
@@ -705,7 +735,7 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'R');
GET_CHARACTER;
GET_NUMBER (number[0], false);
VALIDATE ('\n');
- GET_STRING (file, string[0], allocated[0], number[0]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
VALIDATE ('\n');
m4_set_regexp_syntax_opt (context,
@@ -732,7 +762,7 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'S');
GET_CHARACTER;
GET_NUMBER (number[0], false);
VALIDATE ('\n');
- GET_STRING (file, string[0], allocated[0], number[0]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
/* Syntax under M4_SYNTAX_MASKS is handled specially; all
other characters are additive. */
@@ -758,7 +788,7 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 't');
GET_CHARACTER;
GET_NUMBER (number[0], false);
VALIDATE ('\n');
- GET_STRING (file, string[0], allocated[0], number[0]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
VALIDATE ('\n');
m4_set_symbol_name_traced (M4SYMTAB, string[0], true);
@@ -789,9 +819,17 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 't');
/* Get string contents. */
if (operation != 'D')
- GET_STRING (file, string[0], allocated[0], number[0]);
- GET_STRING (file, string[1], allocated[1], number[1]);
- GET_CHARACTER;
+ {
+ GET_STRING (file, string[0], allocated[0], number[0], false);
+ if (version > 1)
+ {
+ VALIDATE ('\n');
+ GET_CHARACTER;
+ }
+ }
+ else
+ GET_CHARACTER;
+ GET_STRING (file, string[1], allocated[1], number[1], true);
VALIDATE ('\n');
/* Act according to operation letter. */
@@ -862,9 +900,19 @@ ill-formed frozen file, version 2 directive `%c'
encountered"), 'T');
VALIDATE ('\n');
/* Get string contents. */
- GET_STRING (file, string[0], allocated[0], number[0]);
- GET_STRING (file, string[1], allocated[1], number[1]);
- GET_STRING (file, string[2], allocated[2], number[2]);
+ GET_STRING (file, string[0], allocated[0], number[0], false);
+ if (version > 1)
+ {
+ VALIDATE ('\n');
+ GET_CHARACTER;
+ }
+ GET_STRING (file, string[1], allocated[1], number[1], true);
+ if (version > 1 && number[2])
+ {
+ VALIDATE ('\n');
+ GET_CHARACTER;
+ }
+ GET_STRING (file, string[2], allocated[2], number[2], true);
VALIDATE ('\n');
/* Enter a macro having an expansion text as a definition. */
diff --git a/tests/freeze.at b/tests/freeze.at
index 209c611..2aa2aea 100644
--- a/tests/freeze.at
+++ b/tests/freeze.at
@@ -84,7 +84,8 @@ $ m4 --version | head -n1
GNU M4 1.4.5
$ cat frozen.m4
divert(`-1')
-define(`foo', `\FOO')
+define(`foo', `\n\
+FOO')
pushdef(`foo', `bar${1}')
define(`my_define', defn(`define'))
define(`my_changeword', defn(`changeword'))
@@ -116,8 +117,9 @@ T9,6
my_definedefine
F9,6
my_definedefine
-T3,4
-foo\FOO
+T3,7
+foo\n\
+FOO
T3,7
foobar${1}
F3,3
@@ -139,7 +141,8 @@ my_define([bar], [4])[]popdef([my_define]) bar
]])
AT_CHECK_M4([-R frozen.m4f input.m4], [0],
-[[bar${1} /* foo */ \FOO
+[[bar${1} /* foo */ \n\
+FOO
1
define 1
3
@@ -174,15 +177,19 @@ V2
# missing close quote should be supplied
Q1,0
>
+
# missing close comment should be supplied
C1,0
<
+
M2
m4
M3
gnu
F7,7,3
-builtinbuiltingnu
+builtin
+builtin
+gnu
# Text to negative diversion must not crash. Catches a regression
# introduced 2007-05-28 and fixed 2007-05-31.
D-1,5
@@ -196,11 +203,13 @@ D,
# Testing escape sequences
T4,5
-blah\t\477\040\X5C
+blah
+\t\477\040\X5C
# Long macro definition. Catches a regression introduced on 2007-01-20
# and patched 2007-02-25.
T4,122
-long01234567890123456789012345678901234567890123456789
+long
+01234567890123456789012345678901234567890123456789
01234567890123456789012345678901234567890123456789
01234567890123456789
]])
@@ -247,15 +256,23 @@ m4
M3
gnu
T1,5
-a\n\n\n\n\n
+a
+\n\n\n\n\n
F6,6,2
divnum\
-divnumm4\
+
+divnum
+\
+m4\
F6,6,2
-divertdivertm4
+divert
+divert
+m4
F6,6,2
-definedefinem4
+define
+define
+m4
D]$1[,3
hi
@@ -263,7 +280,7 @@ hi
AT_CHECK_M4([-R frozen.m4f in.m4], m4_ifval([$2], [1], [0]),
m4_ifval([$2], [], [m4_bpatsubst([$1], [^0*])
m4_if(m4_substr([$1], [0], [1]), [-], [], [[hi
-]])]), m4_ifval([$2], [[m4:frozen.m4f:16: integer overflow in frozen file
+]])]), m4_ifval([$2], [[m4:frozen.m4f:24: integer overflow in frozen file
]]))
])
@@ -363,7 +380,8 @@ AT_CHECK_M4([-F frozen.m4f -t undefined empty.m4])
# Add an unknown builtin.
echo 'F1,1' >> frozen.m4f
-echo 'ab' >> frozen.m4f
+echo 'a' >> frozen.m4f
+echo 'b' >> frozen.m4f
AT_DATA([[input.m4]],
[[dnl The macro is defined; checking this is safe
--
1.5.5.1