From the answers to (When) is hash table lookup O(1)?, I gather that hash tables

ID: 648011 • Letter: F

Question

From the answers to (When) is hash table lookup O(1)?, I gather that hash tables have O(1) worst-case behavior, at least amortized, when the data satisfies certain statistical conditions, and there are techniques to help make these conditions broad.

However, from a programmer's perspective, I don't know in advance what my data will be: it often comes from some external source. And I rarely have all the data at once: often insertions and deletions happen at a rate that's not far below the rate of lookups, so preprocessing the data to fine-tune the hash function is out.

So, taking a step out: given some knowledge about data source, how can I determine whether a hash table has a chance of having O(1) operations, and possibly which techniques to use on my hash function?

Explanation / Answer

I am not a data structures expert, but the usual theoretical approach to hashing is that one defines a family of functions (e.g., ha,b(x)=ax+bmodp) and then considers the behavior on a worst case input on a randomly chosen member of the family, where the adversary doesn't know the random choice in advance. This is similar to how randomized algorithms are analyzed as well: the expectation is taken over the algorithm's choices, not the input distribution.

In the past, according to a Usenix paper by Crosby and Wallach, common programming languages didn't do anything like this, leaving lots of web apps (and other servers) open to a DoS attack based on manufacturing collisions. (The paper is from 2003, but it suggests that Dan Bernstein had discovered the same idea quite a bit earlier.)

A quick google search provides claims that the state of the art in terms of implementations has both improved and not improved.

Another aside is that in a high-bandwidth world, timing attacks make it not-so-hard to find collisions online (as opposed to offline as the Crosby-Wallach link suggests). I seem to remember that Daniel Golovin had results some years ago on data structures that aren't vulnerable to timing attacks, but I don't know if they are widely used.

Navigate

From the answers to (When) is hash table lookup O(1)?, I gather that hash tables

From the apical surface to the basal surface of an epithelial cell, what is the

Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.

From the answers to (When) is hash table lookup O(1)?, I gather that hash tables

Question

Explanation / Answer

Related Questions

Navigate