phash: bloat the hashes somewhat, reducing the likelihood of false positives
Set the hash size scaling constant to 1.6, signifying 3.2 times the hash load. This both reduces the convergence time and makes it less likely (< 25%) that a non-entry will require a secondary comparison, and after all, in most of our use cases non-entries are by far the more common.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
parent 32322a9a93
commit 0d17f8a7e6
2 changed files with 5 additions and 3 deletions
@@ -224,7 +224,7 @@ if ($what eq 'c') {
 # Put a large value in unused slots. This makes it extremely unlikely
 # that any combination that involves unused slot will pass the range test.
 # This speeds up rejection of unrecognized tokens, i.e. identifiers.
-print OUT "#define UNUSED_HASH_ENTRY (65535/3)\n";
+print OUT "\n#define UNUSED_HASH_ENTRY (65535/3)\n";
 
 print OUT "\n\n/* Primary preprocessor token hash */\n\n";
 
@@ -145,8 +145,10 @@ sub gen_perfect_hash($) {
 
 # Minimal power of 2 value for N with enough wiggle room.
 # The scaling constant must be larger than 0.5 in order for the
-# algorithm to ever terminate.
-my $room = int(scalar(@keys)*0.8);
+# algorithm to ever terminate. The higher the scaling constant,
+# the more space does the hash take up, but the less likely is it
+# that an invalid token will require a string comparison.
+my $room = int(scalar(@keys)*1.6);
 $n = 1;
 while ($n < $room) {
     $n <<= 1;
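
To make the effect of the new scaling constant concrete, the sketch below repeats the sizing loop from the second hunk for a few key counts and compares the old constant (0.8) with the new one (1.6). The helper name `table_size` and the sample key counts are assumptions made for illustration; they do not appear in the patched file, and the script only mirrors the rounding-up-to-a-power-of-two step, not the rest of the perfect hash generator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): returns the smallest power
# of two that is >= int($nkeys * $scale), using the same loop as the hunk.
sub table_size {
    my ($nkeys, $scale) = @_;
    my $room = int($nkeys * $scale);
    my $n = 1;
    while ($n < $room) {
        $n <<= 1;
    }
    return $n;
}

# Illustrative key counts only; real token tables will differ.
foreach my $nkeys (10, 100, 1000) {
    printf "%5d keys: n = %5d at 0.8, n = %5d at 1.6\n",
        $nkeys, table_size($nkeys, 0.8), table_size($nkeys, 1.6);
}
```

For these sample counts the table size doubles when the constant goes from 0.8 to 1.6, which is the space cost the updated comment refers to.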