Re: Building Texinfo 7.1.91 pretest with MinGW [non-ASCII file names]

bug-texinfo

From:	Gavin Smith
Subject:	Re: Building Texinfo 7.1.91 pretest with MinGW [non-ASCII file names]
Date:	Thu, 14 Nov 2024 20:03:41 +0000

On Thu, Nov 14, 2024 at 11:44:36AM +0200, Eli Zaretskii wrote:
> > Date: Thu, 14 Nov 2024 10:07:11 +0100
> > From: Patrice Dumas <pertusus@free.fr>
> > 
> > On Thu, Nov 14, 2024 at 07:42:02AM +0000, Gavin Smith wrote:
> > > On Thu, Nov 14, 2024 at 08:47:25AM +0200, Eli Zaretskii wrote:
> > > > I'm not sure I follow: do you intend to use "int\xc3\xa9rnal.txt" as
> > > > an actual file name on disk?  In that case, please note that a
> > > > backslash cannot be part of a file name on Windows: it's a directory
> > > > separator.  If you want an escape character, it should be something
> > > > else, like # for example.
> > > 
> > > I had meant that - but I had forgotten that a backslash shouldn't be
> > > used in a file name on Windows.
> > 
> > # is a comment in shell, maybe a % would be better?
> 
> Yes, % is another possibility.

I've written a Perl program to rename a list of files provided on
standard input, using maintain/copy_change_file_name_encoding.pl as a
starting point (this was not as simple as I thought it might be,
as both directories and ordinary files may need to be renamed).
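
To make the escaping scheme concrete, here is a rough shell sketch of what
the renaming does to a single name.  It is only an illustration, not part
of the patch: escape_name is a hypothetical helper, and it assumes the
names reach the program as raw UTF-8 bytes (no decoding layer on standard
input), so that each non-ASCII byte gets its own %xx escape.

```shell
# Hypothetical helper, not part of the patch: print NAME with every byte
# >= 0x80 replaced by %xx (lowercase hex), leaving ASCII bytes unchanged.
escape_name ()
{
    # Dump the name as one hex byte per line, then rebuild it byte by byte.
    printf '%s' "$1" | LC_ALL=C od -An -v -tx1 | tr -s ' ' '\n' |
    while IFS= read -r hex; do
        test -n "$hex" || continue
        if test "$((0x$hex))" -lt 128; then
            # ASCII byte: emit it unchanged via its octal escape.
            printf "\\$(printf '%03o' "$((0x$hex))")"
        else
            # Non-ASCII byte: emit a percent escape.
            printf '%%%s' "$hex"
        fi
    done
    echo
}

escape_name "$(printf 'int\303\251rnal.txt')"   # prints int%c3%a9rnal.txt
```

Under that assumption, the two UTF-8 bytes of the accented letter become
two separate escapes rather than a single code point escape.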

In tests/run_parser_all.sh, the output files are listed with the
"find" command, and the list is piped to the Perl program.
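
As a way to preview which names in such a listing would be picked up for
renaming, something like the following filter could be put in place of the
Perl program.  This is a sketch only: non_ascii_names is a hypothetical
name, and it assumes one file name per line, as in "find" output for names
without embedded newlines.

```shell
# Hypothetical filter, not part of the patch: copy to stdout only those
# input lines that contain at least one byte >= 0x80.
non_ascii_names ()
{
    while IFS= read -r name; do
        # Delete all non-ASCII bytes; if anything was removed, the name
        # is one that would be renamed.
        ascii_only=$(printf '%s' "$name" | LC_ALL=C tr -d '\200-\377')
        test "$ascii_only" = "$name" || printf '%s\n' "$name"
    done
}
```

For example, find "${outdir}${dir}" | non_ascii_names would list the
affected files and directories without changing anything on disk.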

With this change, the tests can be updated with "make -k check" followed
by "make copy-tests", followed by committing the changes to the test
results ("for f in */res_parser; do git add $f ; done").  This
leads to 27 files being renamed, so the non-ASCII names would no longer
be in the tar distribution or tracked in git.  (This is not counting the
"tex-html" tests, some of which would also be affected.)

It does not deal with the issue of skipping such tests, though.

diff --git a/tp/tests/escape_file_names.pl b/tp/tests/escape_file_names.pl
new file mode 100755
index 0000000000..5d66871e2e
--- /dev/null
+++ b/tp/tests/escape_file_names.pl
@@ -0,0 +1,80 @@
+#! /usr/bin/env perl
+
+# escape_file_names.pl: read list of file names from stdin and rename
+# any with non-ASCII characters
+#
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License,
+# or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+use strict;
+use utf8;
+
+use File::Copy;
+use File::Basename;
+use File::Spec;
+use File::Path;
+
+my @files;
+
+# Read all of input first
+while (<>) {
+  chomp;
+  push @files, $_;
+}
+
+# Sort files in forward order.  This should mean we create directories
+# before any files they contain.
+@files = sort @files;
+
+my @moved_files;
+
+for my $file (@files) {
+  if ($file =~ /[^[:ascii:]]/) {
+    unshift @moved_files, $file;
+
+    my $ascii_name = '';
+    for my $char (split('', $file)) {
+        if (ord($char) < 0x80) {
+          $ascii_name .= $char;
+        } else {
+          $ascii_name .= sprintf("%%%x", ord($char));
+        }
+    }
+
+    my $dest_path = $ascii_name;
+
+    if (-d $file) {
+        mkdir $dest_path;
+    } else {
+        my $copy_succeeded = copy($file, $dest_path);
+        if (not $copy_succeeded) {
+          warn "could not copy $file: $!\n";
+          exit(1);
+        }
+    }
+  }
+}
+
+# After copying the files, remove the files from the original locations
+# in reverse order.
+for my $delete (@moved_files) {
+    if (-d $delete) {
+      File::Path::rmtree($delete);
+    } else {
+      unlink $delete;
+    }
+}
+
+exit(0);
diff --git a/tp/tests/run_parser_all.sh b/tp/tests/run_parser_all.sh
index a25562002b..f5c4f0be7a 100755
--- a/tp/tests/run_parser_all.sh
+++ b/tp/tests/run_parser_all.sh
@@ -178,6 +178,12 @@ post_process_output ()
   fi
 }
 
+# ensure only ASCII filenames are used in output
+escape_file_names ()
+{
+    find "${outdir}${dir}" | ${srcdir}/escape_file_names.pl
+}
+
 LC_ALL=C; export LC_ALL
 LANGUAGE=en; export LANGUAGE
 
@@ -443,6 +449,7 @@ while read line; do
       rm -rf "${raw_outdir}$dir"
 
       post_process_output
+      escape_file_names
 
       if test "z$res_dir_used" != 'z' ; then
         diff $DIFF_OPTIONS -r "$res_dir_used" "${outdir}$dir" 2>>$logfile > "$testdir/$diffs_dir/$diff_base.diff"
