What is the best way to generate a unique 64 bit numeric database key from a str

Question

What is the best way to generate a unique 64 bit numeric database key from a string?

For example: if the string is a URL, you could generate a 64-bit key using SipHash or by truncating a SHA-256 hash to 64 bits, but collisions become likely at around 2^32 (~4 billion) entries.
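The roughly 4 billion figure is the birthday bound for a 64-bit key. A minimal Python sketch of the arithmetic, assuming the keys behave like uniformly random 64-bit values (the function names are illustrative, not from any library):

```python
import math

def expected_colliding_pairs(n_entries: int, key_bits: int = 64) -> float:
    """Expected number of pairs of entries that share a key."""
    return n_entries * (n_entries - 1) / 2 / 2.0 ** key_bits

def collision_probability(n_entries: int, key_bits: int = 64) -> float:
    """Approximate chance of at least one collision (birthday approximation)."""
    return -math.expm1(-expected_colliding_pairs(n_entries, key_bits))

for n in (10**6, 10**9, 2**32):
    print(f"{n:>13,d} entries: p(collision) ~ {collision_probability(n):.3g}")
```

Under that assumption the chance of at least one collision is about 0.39 at 2^32 entries and grows quickly beyond that, while it is negligible for a few million entries.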

However, if instead you truncated the hash to (64 bits - X bits), and concatenated the resulting hash with the first X bits of the input string, you would reduce the chance of collisions since two similar strings are unlikely to produce the same hash code:

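import hashlib

key_bits = 64
x_bits = 8  # must be a multiple of 8 here
data = url_string.encode("utf-8")  # url_string: the input URL
hash_code = int.from_bytes(hashlib.sha256(data).digest(), "big")
# top (key_bits - x_bits) bits of the hash, then the first x_bits of the string
key = ((hash_code >> (256 - key_bits + x_bits)) << x_bits) | int.from_bytes(data[:x_bits // 8], "big")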

Is this correct? Is there a better way?

Explanation / Answer

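Unfortunately, without some additional assumptions about the strings (e.g. that the first 8 bytes are unique), there is no way to reduce the chance of collision below that of a uniformly random 64-bit key. Replacing X bits of the hash with the first X bits of the string actually makes things worse for URLs: most of them begin with the same characters (such as "http"), so those bits carry almost no information and the key effectively has only 64 - X random bits. If you need unique 64-bit ids, the options are: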

1. Do a database lookup when generating keys and pick another key if it is already in use (you state that you want to avoid this).

2. Ignore duplicate keys, e.g. set things up so that only one of the two colliding entries is kept. As long as you don't go much above 2^32 entries, the expected number of collisions will be very low (about n^2 / 2^65 colliding pairs for n entries, so roughly 0.5 at n = 2^32), and depending on your task, it may be acceptable to lose a small fraction of the entries.
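Both options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, a hypothetical table named urls, and hypothetical helper names (key64, insert_with_lookup, insert_ignoring_duplicates); the key is the first 8 bytes of a SHA-256 digest, read as a signed integer because SQLite integers are signed 64-bit values.

```python
import hashlib
import sqlite3

def key64(text: str, attempt: int = 0) -> int:
    """Signed 64-bit key from SHA-256; `attempt` salts the hash on retries."""
    digest = hashlib.sha256(f"{attempt}:{text}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def insert_with_lookup(conn: sqlite3.Connection, url: str) -> int:
    """Option 1: look the key up and pick another one if it is taken."""
    attempt = 0
    while True:
        key = key64(url, attempt)
        row = conn.execute("SELECT url FROM urls WHERE id = ?", (key,)).fetchone()
        if row is None:
            conn.execute("INSERT INTO urls (id, url) VALUES (?, ?)", (key, url))
            return key
        if row[0] == url:
            return key  # this URL is already stored under this key
        attempt += 1  # key is used by a different URL: try the next candidate

def insert_ignoring_duplicates(conn: sqlite3.Connection, url: str) -> bool:
    """Option 2: keep the first entry for a key and drop later ones."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO urls (id, url) VALUES (?, ?)", (key64(url), url)
    )
    return cur.rowcount == 1  # False means the key was already present

# Hypothetical schema used by both helpers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
```

One design consequence of option 1 in this sketch: after a retry the key is no longer a pure function of the string, so finding a URL later means repeating the same probe sequence. Option 2 keeps the key a pure function of the string but silently loses the second of two colliding entries.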
