m4-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[0/18] new argv_ref branch for m4 speedup


From: Eric Blake
Subject: [0/18] new argv_ref branch for m4 speedup
Date: Mon, 19 Nov 2007 21:12:27 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071031 Thunderbird/2.0.0.9 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

For some background - up till now, m4 tail recursion (such as used in
m4sugar's m4_foreach) was inherently quadratic in operation when executed
on GNU m4:
http://lists.gnu.org/archive/html/autoconf-patches/2007-10/msg00145.html

(For the record, I tried to determine how other m4 implementations behave,
but Solaris m4 has a stupid limit of a maximum of 100 arguments to any
macro invocation, and I don't have access to a compiled BSD version).

I've been working on this for over a month, now, and am finally happy
enough with my series of patches to post it as a new branch named
'argv_ref'.  I still need to write the ChangeLogs, perform additional
regression testing (such as the complete autoconf testsuite, and
bootstrapping libtool), as well as port the technical content of the patch
series to the master branch, before releasing m4 1.4.11 (I might rewind
the argv_ref branch at times to fold in some of those improvements, but I
will not rewind the branch-1_4 branch).  Anyone that would like to help
out by building
http://git.sv.gnu.org/gitweb/?p=m4.git;a=shortlog;h=argv_ref and ensuring
that it has no difference from m4 1.4.10 (other than the speedups) is more
than welcome to give it a try.  I've added a lot of new assertions into
the code, and can't quite guarantee that awkward choices of quoting or
comment characters will lead to unintended semantics changes or assertion
failures.

For some numbers, based on testing on cygwin:

Best times, ~peak memory usage for:
[1] 'time M4=m4 autoconf -f' on coreutils (ensures that the output is
identical, and that large-scale real-life examples aren't adversely affected)
[2] 'time m4 -Dlimit=1500 loop.m4' in m4/examples (tests boxed recursion,
where each iteration takes three arguments and trims off an element of a
quoted list)
[3] 'time m4 -Dlimit=1500 -Dalt loop.m4' in m4/examples (tests unboxed
recursion, where each iteration uses one fewer argument)

          [1]              [2]             [3]
compiled at -O2
m4-1.4.10:29.812,  4.4      9.225, 40.7     8.683, 18.8

compiled at -g
master:   34.054,  4.3     10.698,  1.8     9.767,  1.8
stage 1:  34.959,  4.3     11.084,  1.8     9.879,  1.8
stage 2:  34.572,  4.3     10.657,  1.8     9.906,  1.8
stage 3:  32.821,  4.3     10.587,  1.8     9.991,  1.8
stage 4:  30.942,  4.3     10.738,  1.8    10.060,  1.8
stage 5:  32.389,  4.3     10.877,  1.8    10.095,  1.8
stage 6:  33.388,  4.3     10.737,  1.8     9.784,  1.8
stage 7:  32.729,  4.4     10.576,  2.0     9.924,  5.3
stage 8:  32.377,  4.5     10.608,  2.1     9.954,  2.2
stage 9:  34.862,  7.6     10.626, 33.3     9.481,  2.1
stage10:  33.730,  7.6     10.061, 35.3     8.888,  2.1
stage11:  31.903,  6.1      8.717,  2.1     8.959,  2.1
stage12:  32.480,  6.1      8.746,  2.2     8.977,  2.1
stage13:  32.875,  4.6      8.981, 10.5     8.999,  2.1
stage14:  34.190,  5.3      9.919, 10.6    10.260, 55.0
stage15:  32.670,  5.5      8.035, 10.7     6.953, 10.9
stage16:  31.769,  5.5      8.123, 10.6     7.002, 10.9
stage17:  29.833,  5.1      5.567,  2.1     6.582,  2.4
stage18:  30.201,  5.1      5.558,  1.8     5.571,  2.1

Other tests include checking for scaling behavior:

[1] m4 -Dlimit=n loop.m4
[2] m4 -Dlimit=n -Dalt loop.m4

n =     1500          3000          6000
[1]     0.192, 1.8    0.297, 1.9    0.548, 2.0
[2]     0.177, 2.1    0.320, 2.6    0.522, 3.4

Impressive! m4 1.4.10 was quadratic in both memory and time before the
patch series, and reduces to linear (plus fixed overhead) in both memory
and time afterwards.  And the performance of autoconf using an unoptimized
image at stage18 practically overtook the speed of the -O2 baseline of
1.4.10, all due to the algorithmic improvements.

I'll follow up with each of the 18 patches to just the m4-patches list.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHQl6q84KuGfSFAYARAsGRAKCTTU0pw6xSfaybUMyhiFBFne//WgCfeu4p
OK0o+qMTMWlinsEBdCd37OU=
=LJhc
-----END PGP SIGNATURE-----




reply via email to

[Prev in Thread] Current Thread [Next in Thread]