Re: Performance improvement for large keysets
From: Bruno Haible
Subject: Re: Performance improvement for large keysets
Date: Wed, 29 Jan 2020 17:12:15 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-171-generic; KDE/5.18.0; x86_64; ; )
Hi Pavel,
> When gperf was way too slow for me to wait, I took a quick look with a
> profiler and made a simple optimization in Search::compute_partition to
> avoid scanning all possible partitions and to consider only those with
> a matching size.
>
> Please consider taking the change:
> https://github.com/pps83/gperf/commit/5da97d6ac156f1fdb7967a9f45654c15ab9b4e83
> ...
> This change results in roughly 35% run-time speedup with 5000 keys (key
> lengths are 1 to 10 chars).
It can probably be optimized even more by using a hash table here
(mapping each undetermined_chars array to the list of keywords that
share that same undetermined_chars array)...
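
For illustration, here is a minimal sketch of that idea, not gperf's
actual code: it assumes a hypothetical Keyword type whose
undetermined_chars member is a byte array, and it uses the raw bytes of
that array as the hash key, so keywords with identical arrays land in
one bucket via an O(1) lookup instead of a scan over all partitions.

#include <string>
#include <unordered_map>
#include <vector>

struct Keyword                 // hypothetical stand-in for gperf's keyword type
{
  std::string word;                            // hypothetical
  std::vector<unsigned char> undetermined_chars; // hypothetical
};

// Group keywords so that all keywords sharing the same
// undetermined_chars array land in the same bucket.
std::unordered_map<std::string, std::vector<const Keyword *> >
group_by_undetermined_chars (const std::vector<Keyword> &keywords)
{
  std::unordered_map<std::string, std::vector<const Keyword *> > groups;
  for (const Keyword &kw : keywords)
    {
      // Use the raw bytes of the array as the hash key.
      std::string key (kw.undetermined_chars.begin (),
                       kw.undetermined_chars.end ());
      groups[key].push_back (&kw);
    }
  return groups;
}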
> This change uses std::vector. If you prefer not to introduce standard
> C++ containers, then you may take the follow-up change that uses basic
> C++ code instead:
> https://github.com/pps83/gperf/commit/f967d3c1f63902e022dac586d646824d91ff21b7
Yes, so far gperf has seen very few portability hassles due to the evolution
of the C++ standard, because it hardly relies on libstdc++ functionality.
The alternative is to buy into libstdc++ and then work around portability
problems through autoconf tests and #ifdefs, which is a lot of effort for
a codebase as small as gperf's.
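
For comparison, a minimal sketch in that container-free style (again
hypothetical, not the actual follow-up commit): the partitions are kept
as a hand-rolled singly linked list instead of a std::vector, and
comparing array lengths before memcmp keeps the "matching size"
shortcut cheap.

#include <cstring>

struct KeywordExt              // hypothetical stand-in for gperf's keyword type
{
  const unsigned char *undetermined_chars;  // hypothetical
  int undetermined_chars_length;            // hypothetical
  KeywordExt *next_in_partition;
};

struct Partition
{
  KeywordExt *keywords;        // keywords sharing one undetermined_chars array
  Partition *next;
};

// Insert KW into the partition whose representative has an identical
// undetermined_chars array, creating a new partition if none matches.
// Returns the (possibly new) head of the partition list.
static Partition *
insert_keyword (Partition *partitions, KeywordExt *kw)
{
  for (Partition *p = partitions; p != NULL; p = p->next)
    {
      KeywordExt *rep = p->keywords;
      // Compare lengths first: this skips most mismatches cheaply.
      if (rep->undetermined_chars_length == kw->undetermined_chars_length
          && memcmp (rep->undetermined_chars, kw->undetermined_chars,
                     kw->undetermined_chars_length) == 0)
        {
          kw->next_in_partition = p->keywords;
          p->keywords = kw;
          return partitions;
        }
    }
  Partition *p = new Partition;
  kw->next_in_partition = NULL;
  p->keywords = kw;
  p->next = partitions;
  return p;
}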
Bruno