Re: [PATCH] Line synchronisation output in comments
From: Eric Blake
Subject: Re: [PATCH] Line synchronisation output in comments
Date: Fri, 25 May 2007 17:26:58 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)
Sergey Poznyakoff <gray <at> Mirddin.farlep.net> writes:
>
> No, it does not. After thinking its over I came to the another patch,
> attached below (this time it is made against the CVS version). The two
> testcases (my initial one, and the one based on your last posting) are
> provided in the attachements 2 and 3.
Thanks for the continued progress. I don't think your patch quite handles
three-line comments correctly, though (there were some minor inconsistencies
about when line numbers were updated, so two-line comments didn't trigger all
the code paths), so I think my approach below is more reliable. Here's what I'm
checking in to the branch; please give it a whirl so I can feel confident
releasing 1.4.10.
Meanwhile, I have to port it to HEAD. I would also like to fix the bug
that I uncovered in the interaction between -s and divert (but probably only
on HEAD, as I think the fix will be too invasive for a stable branch). For an
example of the bug, note that "#line 3" below is no longer a preprocessor
directive, because it is emitted in the middle of an output line.
$ m4 -s
divert(2)2divert(1)1
dnl
undivert
^D
1
#line 1 "stdin"
2#line 3 "stdin"
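As a rough aid for spotting this class of bug in test output, here is a small C sketch (not part of the patch; the function name is made up for illustration) that reports the first "#line " that does not start at the beginning of a line. It only does a plain substring scan, so it assumes the output contains no legitimate text with "#line " in mid-line.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of m4: return the byte offset of the
   first "#line " in OUT that is not at the start of a line, or -1 if
   every such directive sits at column 0.  */
long
find_midline_syncline (const char *out)
{
  const char *p = out;

  assert (out != NULL);
  while ((p = strstr (p, "#line ")) != NULL)
    {
      /* A directive is well placed only at the very start of the
         output or immediately after a newline.  */
      if (p != out && p[-1] != '\n')
        return (long) (p - out);
      p++;
    }
  return -1;
}
```

Fed the output shown above, it points at the "#line 3" that follows the "2".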
It is particularly important to fix the bug on HEAD, since it has the new
syncoutput builtin to worry about. I'm thinking that diversions should
remember the file and line that are in effect when they are created, as well as
the location of the first \n in diverted text; and use this to generate the
first #line directive at the appropriate location in case the diversion is
dumped midline, rather than the current policy of blindly dumping the #line
directive into the diversion. Subsequent #line directives are not a problem;
it is only the first directive per diversion.
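To make the idea concrete, here is a minimal C sketch of how such a diversion record might be dumped. It is only an illustration under assumptions, not code from m4: the struct, the function names, and the use of strchr to find the first newline (instead of a stored offset) are all hypothetical, and it assumes the diverted text begins on the recorded line so that the text after the first newline belongs to line + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: where the diversion was created, plus its
   text stored without any leading #line directive.  */
struct diversion_sketch
{
  const char *file;
  int line;
  const char *text;
};

/* Append N bytes of TEXT to the NUL-terminated buffer OUT.  */
void
append_sketch (char *out, size_t outsize, const char *text, size_t n)
{
  size_t used = strlen (out);

  assert (used + n < outsize);
  memcpy (out + used, text, n);
  out[used + n] = '\0';
}

/* Append a complete `#line NUM "FILE"' directive to OUT.  */
void
append_syncline (char *out, size_t outsize, const char *file, int line)
{
  char buf[128];
  int n = snprintf (buf, sizeof buf, "#line %d \"%s\"\n", line, file);

  assert (n > 0 && (size_t) n < sizeof buf);
  append_sketch (out, outsize, buf, (size_t) n);
}

/* Dump diversion D into OUT.  AT_LINE_START tells whether OUT
   currently ends at the start of a line.  Return whether it does
   after the dump.  */
bool
undivert_sketch (char *out, size_t outsize,
                 const struct diversion_sketch *d, bool at_line_start)
{
  size_t len = strlen (d->text);
  const char *nl = strchr (d->text, '\n');

  if (len == 0)
    return at_line_start;
  if (at_line_start)
    {
      /* Safe to emit the directive on a line by itself.  */
      append_syncline (out, outsize, d->file, d->line);
      append_sketch (out, outsize, d->text, len);
    }
  else if (nl)
    {
      /* Dumped midline: finish the partial line first, then emit
         the delayed directive for the following line.  */
      size_t head = (size_t) (nl - d->text) + 1;

      append_sketch (out, outsize, d->text, head);
      if (head < len)
        append_syncline (out, outsize, d->file, d->line + 1);
      append_sketch (out, outsize, d->text + head, len - head);
    }
  else
    /* No newline at all: nowhere to place a directive yet.  */
    append_sketch (out, outsize, d->text, len);
  return d->text[len - 1] == '\n';
}
```

The sketch leaves open what happens when a midline diversion contains no newline at all; the directive would then have to be carried over to whatever output comes next.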
2007-05-25 Eric Blake <address@hidden>
Fix sync line interaction with multiline comments.
* doc/m4.texinfo (Other Incompatibilities): Add example, and
document bug in --syncline/divert interaction.
(Preprocessor features): Augment test.
* src/m4.h (output_text): Export.
(shipout_text, next_token): Add parameter.
* src/freeze.c (reload_frozen_state): Don't interfere with
synclines when reloading state.
* src/output.c (output_text): Export.
(shipout_text): Take new parameter for start line of token.
Output at most one syncline per token.
* src/input.c (next_token): Report line where multiline tokens
start.
* src/macro.c (expand_input, expand_token, expand_argument):
Adjust callers so that line is passed from input to output.
* NEWS: Document this fix.
Reported by Sergey Poznyakoff.
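For readers who want the gist of "at most one syncline per token" without reading the whole diff, here is a simplified standalone model in C. It is a sketch under assumptions, not the m4 code: the struct and function names are invented, output goes to a fixed buffer, and the "FILE" part of the directive is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for m4's output state.  */
struct sync_state
{
  bool start_of_output_line;	/* next character begins a line */
  int output_current_line;	/* line the output is believed to be at */
  char out[512];		/* collected output */
  size_t len;
};

void
sync_init (struct sync_state *s)
{
  s->start_of_output_line = true;
  s->output_current_line = 0;
  s->out[0] = '\0';
  s->len = 0;
}

void
sync_emit (struct sync_state *s, const char *text, size_t n)
{
  assert (s->len + n < sizeof s->out);
  memcpy (s->out + s->len, text, n);
  s->len += n;
  s->out[s->len] = '\0';
}

/* Ship one token whose first character was read on input line LINE.  */
void
ship_token (struct sync_state *s, const char *text, int line)
{
  char linebuf[32];

  /* Consider a syncline only once, before the token's first byte.  */
  if (s->start_of_output_line)
    {
      s->start_of_output_line = false;
      s->output_current_line++;
      if (s->output_current_line != line)
        {
          int n = snprintf (linebuf, sizeof linebuf, "#line %d\n", line);

          sync_emit (s, linebuf, (size_t) n);
          s->output_current_line = line;
        }
    }

  /* Copy the token; embedded newlines advance the line count but
     never trigger a directive inside the token.  */
  for (; *text; text++)
    {
      if (s->start_of_output_line)
        {
          s->start_of_output_line = false;
          s->output_current_line++;
        }
      sync_emit (s, text, 1);
      if (*text == '\n')
        s->start_of_output_line = true;
    }
}
```

With a two-line comment token that starts on line 4 after two skipped lines, the model emits a single "#line 4" before the comment and nothing between its two lines.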
Index: NEWS
===================================================================
RCS file: /sources/m4/m4/NEWS,v
retrieving revision 1.1.1.1.2.100
diff -u -p -r1.1.1.1.2.100 NEWS
--- NEWS 24 May 2007 17:23:42 -0000 1.1.1.1.2.100
+++ NEWS 25 May 2007 17:12:57 -0000
@@ -6,6 +6,8 @@ Version 1.4.10 - ?? ??? 2007, by ???? (
* Fix regression introduced in 1.4.9 in the `eval' builtin when performing
division.
+* The synclines option `-s' no longer generates sync lines in the middle of
+ multiline comments or quoted strings.
* Work around a number of corner-case POSIX compliance bugs in various
broken stdio libraries. In particular, the `syscmd' builtin behaves
more predictably when stdin is seekable.
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.1.1.1.2.124
diff -u -p -r1.1.1.1.2.124 m4.texinfo
--- doc/m4.texinfo 25 May 2007 12:58:50 -0000 1.1.1.1.2.124
+++ doc/m4.texinfo 25 May 2007 17:12:59 -0000
@@ -664,7 +664,8 @@ the file name did not change from the pr
Synchronization directives are always given on complete lines by
themselves. When a synchronization discrepancy occurs in the middle of
an output line, the associated synchronization directive is delayed
-until the beginning of the next generated line.
+until the next newline that does not occur in the middle of a quoted
+string or comment.
@comment options: -s
@example
@@ -672,15 +673,31 @@ define(`twoline', `1
2')
@result{}#line 2 "stdin"
@result{}
+changecom(`/*', `*/')
+@result{}
+define(`comment', `/*1
+2*/')
+@result{}#line 5
+@result{}
dnl no line
hello
-@result{}#line 4
+@result{}#line 7
@result{}hello
twoline
@result{}1
-@result{}#line 5
+@result{}#line 8
@result{}2
+comment
+@result{}/*1
+@result{}2*/
+one comment `two
+three'
+@result{}#line 10
+@result{}one /*1
+@result{}2*/ two
+@result{}three
goodbye
+@result{}#line 12
@result{}goodbye
@end example
@@ -6151,7 +6168,29 @@ diverted text as being generated at the
The sync line option is used mostly when using @code{m4} as
a front end to a compiler. If a diverted line causes a compiler error,
the error messages should most probably refer to the place where the
-diversion were made, and not where it was inserted again.
+diversion was made, and not where it was inserted again.
+
+@comment options: -s
+@example
+divert(2)2
+divert(1)1
+divert`'0
+@result{}#line 3 "stdin"
+@result{}0
+^D
+@result{}#line 2 "stdin"
+@result{}1
+@result{}#line 1 "stdin"
+@result{}2
+@end example
+
+The current @code{m4} implementation has a limitation that the syncline
+output at the start of each diversion occurs no matter what, even if the
+previous diversion did not end with a newline. This goes contrary to
+the claim that synclines appear on a line by themselves, so this
+limitation may be corrected in a future version of @code{m4}. In the
+meantime, when using @option{-s}, it is wisest to make sure all
+diversions end with newline.
@item
@acronym{GNU} @code{m4} makes no attempt at prohibiting self-referential
Index: src/freeze.c
===================================================================
RCS file: /sources/m4/m4/src/freeze.c,v
retrieving revision 1.1.1.1.2.14
diff -u -p -r1.1.1.1.2.14 freeze.c
--- src/freeze.c 1 Nov 2006 22:29:08 -0000 1.1.1.1.2.14
+++ src/freeze.c 25 May 2007 17:12:59 -0000
@@ -1,6 +1,6 @@
/* GNU m4 -- A simple macro processor
- Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2006
+ Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2006, 2007
Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
@@ -329,7 +329,7 @@ reload_frozen_state (const char *name)
make_diversion (number[0]);
if (number[1] > 0)
- shipout_text (NULL, string[1], number[1]);
+ output_text (string[1], number[1]);
break;
case 'F':
Index: src/input.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/input.c,v
retrieving revision 1.1.1.1.2.34
diff -u -p -r1.1.1.1.2.34 input.c
--- src/input.c 5 Feb 2007 13:43:36 -0000 1.1.1.1.2.34
+++ src/input.c 25 May 2007 17:13:00 -0000
@@ -808,22 +808,23 @@ set_word_regexp (const char *regexp)
#endif /* ENABLE_CHANGEWORD */
-/*-------------------------------------------------------------------------.
-| Parse and return a single token from the input stream. A token can |
-| either be TOKEN_EOF, if the input_stack is empty; it can be TOKEN_STRING |
-| for a quoted string; TOKEN_WORD for something that is a potential macro |
-| name; and TOKEN_SIMPLE for any single character that is not a part of |
-| any of the previous types. |
-| |
-| Next_token () return the token type, and passes back a pointer to the |
-| token data through TD. The token text is collected on the obstack |
-| token_stack, which never contains more than one token text at a time. |
-| The storage pointed to by the fields in TD is therefore subject to |
-| change the next time next_token () is called. |
-`-------------------------------------------------------------------------*/
+/*--------------------------------------------------------------------.
+| Parse and return a single token from the input stream. A token |
+| can either be TOKEN_EOF, if the input_stack is empty; it can be |
+| TOKEN_STRING for a quoted string; TOKEN_WORD for something that is |
+| a potential macro name; and TOKEN_SIMPLE for any single character |
+| that is not a part of any of the previous types. If LINE is not |
+| NULL, set *LINE to the line where the token starts. |
+| |
+| Next_token () returns the token type, and passes back a pointer to |
+| the token data through TD. The token text is collected on the |
+| obstack token_stack, which never contains more than one token text |
+| at a time. The storage pointed to by the fields in TD is |
+| therefore subject to change the next time next_token () is called. |
+`--------------------------------------------------------------------*/
token_type
-next_token (token_data *td)
+next_token (token_data *td, int *line)
{
int ch;
int quote_level;
@@ -833,9 +834,11 @@ next_token (token_data *td)
char *orig_text = NULL;
#endif
const char *file;
- int line;
+ int dummy;
obstack_free (&token_stack, token_bottom);
+ if (!line)
+ line = &dummy;
/* Can't consume character until after CHAR_MACRO is handled. */
ch = peek_input ();
@@ -860,7 +863,7 @@ next_token (token_data *td)
next_char (); /* Consume character we already peeked at. */
file = current_file;
- line = current_line;
+ *line = current_line;
if (MATCH (ch, bcomm.string, true))
{
obstack_grow (&token_stack, bcomm.string, bcomm.length);
@@ -872,7 +875,7 @@ next_token (token_data *td)
else
/* current_file changed to "" if we see CHAR_EOF, use the
previous value we stored earlier. */
- M4ERROR_AT_LINE ((EXIT_FAILURE, 0, file, line,
+ M4ERROR_AT_LINE ((EXIT_FAILURE, 0, file, *line,
"ERROR: end of file in comment"));
type = TOKEN_STRING;
@@ -955,7 +958,7 @@ next_token (token_data *td)
if (ch == CHAR_EOF)
/* current_file changed to "" if we see CHAR_EOF, use
the previous value we stored earlier. */
- M4ERROR_AT_LINE ((EXIT_FAILURE, 0, file, line,
+ M4ERROR_AT_LINE ((EXIT_FAILURE, 0, file, *line,
"ERROR: end of file in string"));
if (MATCH (ch, rquote.string, true))
Index: src/m4.h
===================================================================
RCS file: /sources/m4/m4/src/m4.h,v
retrieving revision 1.1.1.1.2.42
diff -u -p -r1.1.1.1.2.42 m4.h
--- src/m4.h 24 May 2007 17:23:43 -0000 1.1.1.1.2.42
+++ src/m4.h 25 May 2007 17:13:00 -0000
@@ -285,7 +285,7 @@ typedef enum token_data_type token_data_
void input_init (void);
token_type peek_token (void);
-token_type next_token (token_data *);
+token_type next_token (token_data *, int *);
void skip_line (void);
/* push back input */
@@ -321,7 +321,8 @@ extern int output_current_line;
void output_init (void);
void output_exit (void);
-void shipout_text (struct obstack *, const char *, int);
+void output_text (const char *, int);
+void shipout_text (struct obstack *, const char *, int, int);
void make_diversion (int);
void insert_diversion (int);
void insert_file (FILE *);
Index: src/macro.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/macro.c,v
retrieving revision 1.1.1.1.2.16
diff -u -p -r1.1.1.1.2.16 macro.c
--- src/macro.c 1 Nov 2006 22:29:08 -0000 1.1.1.1.2.16
+++ src/macro.c 25 May 2007 17:13:00 -0000
@@ -1,7 +1,7 @@
/* GNU m4 -- A simple macro processor
- Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2006 Free Software
- Foundation, Inc.
+ Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2006, 2007 Free
+ Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -25,7 +25,7 @@
#include "m4.h"
static void expand_macro (symbol *);
-static void expand_token (struct obstack *, token_type, token_data *);
+static void expand_token (struct obstack *, token_type, token_data *, int);
/* Current recursion level in expand_macro (). */
int expansion_level = 0;
@@ -59,12 +59,13 @@ expand_input (void)
{
token_type t;
token_data td;
+ int line;
obstack_init (&argc_stack);
obstack_init (&argv_stack);
- while ((t = next_token (&td)) != TOKEN_EOF)
- expand_token ((struct obstack *) NULL, t, &td);
+ while ((t = next_token (&td, &line)) != TOKEN_EOF)
+ expand_token ((struct obstack *) NULL, t, &td, line);
obstack_free (&argc_stack, NULL);
obstack_free (&argv_stack, NULL);
@@ -79,7 +80,7 @@ expand_input (void)
`------------------------------------------------------------------------*/
static void
-expand_token (struct obstack *obs, token_type t, token_data *td)
+expand_token (struct obstack *obs, token_type t, token_data *td, int line)
{
symbol *sym;
@@ -94,7 +95,8 @@ expand_token (struct obstack *obs, token
case TOKEN_CLOSE:
case TOKEN_SIMPLE:
case TOKEN_STRING:
- shipout_text (obs, TOKEN_DATA_TEXT (td), strlen (TOKEN_DATA_TEXT (td)));
+ shipout_text (obs, TOKEN_DATA_TEXT (td), strlen (TOKEN_DATA_TEXT (td)),
+ line);
break;
case TOKEN_WORD:
@@ -106,10 +108,10 @@ expand_token (struct obstack *obs, token
{
#ifdef ENABLE_CHANGEWORD
shipout_text (obs, TOKEN_DATA_ORIG_TEXT (td),
- strlen (TOKEN_DATA_ORIG_TEXT (td)));
+ strlen (TOKEN_DATA_ORIG_TEXT (td)), line);
#else
shipout_text (obs, TOKEN_DATA_TEXT (td),
- strlen (TOKEN_DATA_TEXT (td)));
+ strlen (TOKEN_DATA_TEXT (td)), line);
#endif
}
else
@@ -149,7 +151,7 @@ expand_argument (struct obstack *obs, to
/* Skip leading white space. */
do
{
- t = next_token (&td);
+ t = next_token (&td, NULL);
}
while (t == TOKEN_SIMPLE && isspace (to_uchar (*TOKEN_DATA_TEXT (&td))));
@@ -184,7 +186,7 @@ expand_argument (struct obstack *obs, to
paren_level++;
else if (*text == ')')
paren_level--;
- expand_token (obs, t, &td);
+ expand_token (obs, t, &td, line);
break;
case TOKEN_EOF:
@@ -196,7 +198,7 @@ expand_argument (struct obstack *obs, to
case TOKEN_WORD:
case TOKEN_STRING:
- expand_token (obs, t, &td);
+ expand_token (obs, t, &td, line);
break;
case TOKEN_MACDEF:
@@ -213,7 +215,7 @@ expand_argument (struct obstack *obs, to
abort ();
}
- t = next_token (&td);
+ t = next_token (&td, NULL);
}
}
@@ -239,7 +241,7 @@ collect_arguments (symbol *sym, struct o
if (peek_token () == TOKEN_OPEN)
{
- next_token (&td); /* gobble parenthesis */
+ next_token (&td, NULL); /* gobble parenthesis */
do
{
more_args = expand_argument (arguments, &td);
Index: src/output.c
===================================================================
RCS file: /sources/m4/m4/src/Attic/output.c,v
retrieving revision 1.1.1.1.2.19
diff -u -p -r1.1.1.1.2.19 output.c
--- src/output.c 16 Mar 2007 12:30:50 -0000 1.1.1.1.2.19
+++ src/output.c 25 May 2007 17:13:00 -0000
@@ -422,7 +422,7 @@ output_character_helper (int character)
| to a diversion file or an in-memory diversion buffer. |
`------------------------------------------------------------------------*/
-static void
+void
output_text (const char *text, int length)
{
int count;
@@ -444,23 +444,26 @@ output_text (const char *text, int lengt
}
}
-/*-------------------------------------------------------------------------.
-| Add some text into an obstack OBS, taken from TEXT, having LENGTH |
-| characters. If OBS is NULL, rather output the text to an external file |
-| or an in-memory diversion buffer. If OBS is NULL, and there is no |
-| output file, the text is discarded. |
-| |
-| If we are generating sync lines, the output have to be examined, because |
-| we need to know how much output each input line generates. In general, |
-| sync lines are output whenever a single input lines generates several |
-| output lines, or when several input lines does not generate any output. |
-`-------------------------------------------------------------------------*/
+/*--------------------------------------------------------------------.
+| Add some text into an obstack OBS, taken from TEXT, having LENGTH |
+| characters. If OBS is NULL, output the text to an external file |
+| or an in-memory diversion buffer instead. If OBS is NULL, and |
+| there is no output file, the text is discarded. LINE is the line |
+| where the token starts (not necessarily current_line, in the case |
+| of multiline tokens). |
+| |
+| If we are generating sync lines, the output has to be examined, |
+| because we need to know how much output each input line generates. |
+| In general, sync lines are output whenever a single input line |
+| generates several output lines, or when several input lines do not |
+| generate any output. |
+`--------------------------------------------------------------------*/
void
-shipout_text (struct obstack *obs, const char *text, int length)
+shipout_text (struct obstack *obs, const char *text, int length, int line)
{
static bool start_of_output_line = true;
- char line[20];
+ char linebuf[20];
const char *cursor;
/* If output goes to an obstack, merely add TEXT to it. */
@@ -501,43 +504,59 @@ shipout_text (struct obstack *obs, const
output_text (text, length);
}
else
- for (; length-- > 0; text++)
- {
- if (start_of_output_line)
- {
- start_of_output_line = false;
- output_current_line++;
-
+ {
+ /* Check for syncline only at the start of a token. Multiline
+ tokens, and tokens that are out of sync but in the middle of
+ the line, must wait until the next raw newline triggers a
+ syncline. */
+ if (start_of_output_line)
+ {
+ start_of_output_line = false;
+ output_current_line++;
#ifdef DEBUG_OUTPUT
- printf ("DEBUG: cur %d, cur out %d\n",
- current_line, output_current_line);
+ fprintf (stderr, "DEBUG: line %d, cur %d, cur out %d\n",
+ line, current_line, output_current_line);
#endif
- /* Output a `#line NUM' synchronization directive if needed.
- If output_current_line was previously given a negative
- value (invalidated), rather output `#line NUM "FILE"'. */
-
- if (output_current_line != current_line)
- {
- sprintf (line, "#line %d", current_line);
- for (cursor = line; *cursor; cursor++)
- OUTPUT_CHARACTER (*cursor);
- if (output_current_line < 1 && current_file[0] != '\0')
- {
- OUTPUT_CHARACTER (' ');
- OUTPUT_CHARACTER ('"');
- for (cursor = current_file; *cursor; cursor++)
- OUTPUT_CHARACTER (*cursor);
- OUTPUT_CHARACTER ('"');
- }
- OUTPUT_CHARACTER ('\n');
- output_current_line = current_line;
- }
- }
- OUTPUT_CHARACTER (*text);
- if (*text == '\n')
- start_of_output_line = true;
- }
+ /* Output a `#line NUM' synchronization directive if needed.
+ If output_current_line was previously given a negative
+ value (invalidated), output `#line NUM "FILE"' instead. */
+
+ if (output_current_line != line)
+ {
+ sprintf (linebuf, "#line %d", line);
+ for (cursor = linebuf; *cursor; cursor++)
+ OUTPUT_CHARACTER (*cursor);
+ if (output_current_line < 1 && current_file[0] != '\0')
+ {
+ OUTPUT_CHARACTER (' ');
+ OUTPUT_CHARACTER ('"');
+ for (cursor = current_file; *cursor; cursor++)
+ OUTPUT_CHARACTER (*cursor);
+ OUTPUT_CHARACTER ('"');
+ }
+ OUTPUT_CHARACTER ('\n');
+ output_current_line = line;
+ }
+ }
+
+ /* Output the token, and track embedded newlines. */
+ for (; length-- > 0; text++)
+ {
+ if (start_of_output_line)
+ {
+ start_of_output_line = false;
+ output_current_line++;
+#ifdef DEBUG_OUTPUT
+ fprintf (stderr, "DEBUG: line %d, cur %d, cur out %d\n",
+ line, current_line, output_current_line);
+#endif
+ }
+ OUTPUT_CHARACTER (*text);
+ if (*text == '\n')
+ start_of_output_line = true;
+ }
+ }
}
/* Functions for use by diversions. */
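
The next_token change above relies on an out-parameter that callers may pass as NULL. Here is a small self-contained C sketch of that pattern. It is an assumption-laden toy, not the m4 scanner: the struct, the function name, and the tokenizing rule (a complete /* ... */ comment is one token, anything else is a single character) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state; current_line plays the role of m4's
   global of the same name.  */
struct scanner_sketch
{
  const char *pos;
  int current_line;
};

/* Consume the next token and return its length, or 0 at end of
   input.  If LINE is not NULL, set *LINE to the line where the token
   starts (left untouched at end of input).  */
size_t
next_token_sketch (struct scanner_sketch *s, int *line)
{
  int dummy;
  const char *start = s->pos;
  const char *end;

  /* Callers that do not care pass NULL; redirect the store to a
     local so the body never needs to test LINE again.  */
  if (!line)
    line = &dummy;
  if (*start == '\0')
    return 0;
  *line = s->current_line;

  /* A terminated comment is one token, possibly spanning lines;
     an unterminated one falls back to single characters here.  */
  if (start[0] == '/' && start[1] == '*'
      && (end = strstr (start + 2, "*/")) != NULL)
    end += 2;
  else
    end = start + 1;

  for (const char *p = start; p < end; p++)
    if (*p == '\n')
      s->current_line++;
  s->pos = end;
  return (size_t) (end - start);
}
```

The point of the dummy is that the start line is captured before any embedded newline bumps current_line, which is what lets the output side compare against the line where a multiline token began.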