[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [bug-gperf] shrinking the asso_values array
From: |
Mark Stankus |
Subject: |
Re: [bug-gperf] shrinking the asso_values array |
Date: |
Thu, 5 Sep 2019 09:58:20 -0700 |
Ok, thanks.
Mark Stankus
On Wed, Sep 4, 2019 at 5:13 PM Bruno Haible <address@hidden> wrote:
>
> Hi Mark,
>
> > FOURTH SUGGESTION ( MEDIUM DIFFICULTY )
> >
> > I was wondering if it would be worthwhile to use an "if" statement
> > and a shorter static array asso_values in the "hash" function in some
> > cases. The reason would be that the array is 256 bytes and loading
> > different parts of the array could use up to 4 cache lines. Loading
> > caches lines is very expensive from the point of view of the CPU
> > (sometimes 100 cycles) as opposed to an if statement (which is a lot less,
> > but is blocking.
> >
> > I looked at permut2.gperf in the examples directory of the gperf
> > distribution. I will talk about this and next suggestion via examples
> > related to that file.
> >
> > In our case, the largest hash value is 5 and all but four entries in
> > the array are 6.
> > Recall that in the perumt2.gperf example, only the letters x, y and z are
> > used.
> >
> > Instead of
> >
> > return len + asso_values[(unsigned char) str[1] + 1] +
> > assoc_values[(unsigned char)[str[0]];
> >
> > in the hash function, there could be an optimized(?) version of the
> > following
> >
> > unsigned int returned_value = 6;
> > if ( len <= 2 && 'x' <= str[1] && str[1] <= 'z' && 'x' <= str[0] &&
> > str[0] <= 'z' )
> > {
> > unsigned char first = (unsigned char) str[1] - 'x' + 1;
> > unsigned char second = (unsigned char) str[0] - 'x';
> > returned_value len + modified_asso_values[first] +
> > modified_asso_values[second];
> > }
> > return returned_value;
>
> On today's CPUs, with their long pipelines of prepared instructions, the
> condition you are writing there would cost 5 conditional jumps. Or, when
> we leave out the redundant 'len <= 2' test (already done by the caller)
> and assume an optimizing compiler that optimizes
> 'x' <= str[1] && str[1] <= 'z'
> to
> (unsigned int)(str[1] - 'x') <= 'z' - 'x'
> we still have 2 conditional jumps.
>
> > If either letter is out of range, the array is not accessed at all.
>
> Optimizing cache behaviour at the expense of conditional jumps is not
> good. You need a reasonable balance between both.
>
> Bruno
>
- [bug-gperf] One usability and 4 fine tuning performance suggestions (cache lines and array size), Mark Stankus, 2019/09/04
- Re: [bug-gperf] gperf as a library, Bruno Haible, 2019/09/04
- Re: [bug-gperf] unsigned long vs. unsigned int, Bruno Haible, 2019/09/04
- Re: [bug-gperf] large lengths, Bruno Haible, 2019/09/04
- Re: [bug-gperf] shrinking the asso_values array, Bruno Haible, 2019/09/04
- Re: [bug-gperf] shrinking the asso_values array,
Mark Stankus <=
- Re: [bug-gperf] bit packing tricks, Bruno Haible, 2019/09/04