My experience is very similar to the one described in the "Multi-threaded search for large key sets" thread:
When gperf was too slow for me to wait for, I took a quick look with a profiler and made a simple optimization in Search::compute_partition so that it no longer scans all possible partitions and considers only those with a matching size.
Please consider taking the change:
Search::compute_partition iterates over the entire list of available partitions. This change updates compute_partition to keep track of partitions by their size, so that it avoids iterating over partitions whose sizes do not match the required undetermined_chars_length.
This change yields a roughly 35% run-time speedup with 5000 keys (key lengths of 1 to 10 characters).
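To illustrate the idea of tracking partitions by size, here is a rough standalone sketch. It is not the actual patch and not gperf's real code: Partition, PartitionIndex, find_or_add, partition_count and bucket_size are hypothetical names made up for this example, and I am assuming a partition is identified by its undetermined characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one partition; not gperf's actual type.
struct Partition {
    std::string undetermined_chars;  // assumed key that identifies the partition
    std::vector<int> keyword_ids;    // keywords that belong to this partition
};

// Keeps one bucket of partition indices per undetermined_chars length,
// so a lookup compares only against partitions of the matching size.
class PartitionIndex {
public:
    // Returns the index of the partition with the given undetermined
    // characters, creating that partition if it does not exist yet.
    std::size_t find_or_add(const std::string& undetermined_chars) {
        const std::size_t len = undetermined_chars.size();
        if (by_length_.size() <= len)
            by_length_.resize(len + 1);

        // Scan only the bucket for this length, not every partition.
        for (std::size_t idx : by_length_[len])
            if (partitions_[idx].undetermined_chars == undetermined_chars)
                return idx;

        Partition p;
        p.undetermined_chars = undetermined_chars;
        partitions_.push_back(p);
        by_length_[len].push_back(partitions_.size() - 1);
        return partitions_.size() - 1;
    }

    // Total number of partitions created so far.
    std::size_t partition_count() const { return partitions_.size(); }

    // Number of partitions whose undetermined_chars have the given length.
    std::size_t bucket_size(std::size_t len) const {
        return len < by_length_.size() ? by_length_[len].size() : 0;
    }

private:
    std::vector<Partition> partitions_;
    std::vector<std::vector<std::size_t>> by_length_;  // length -> partition indices
};
```

With this layout, a lookup for an undetermined_chars of length N touches only the partitions already stored for length N. How much that helps depends on how evenly the lengths are spread across the key set.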
This change uses std::vector. If you prefer not to introduce standard C++ containers, you may take the follow-up change, which uses basic C++ code instead:
Thanks,
Pavel