6.006 7. Hashing III: Open Addressing
MIT OpenCourseWare http://ocw.mit.edu6.006Introduction to AlgorithmsSpring 2008For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. Lecture 7 Hashing III: Open Addressing 6.006 Spring 2008Lecture 7: Hashing III: Open Addressing Lecture Overview • Open Addressing, Probing Strategies • Uniform Hashing, Analysis • Advanced Hashing Readings CLRS Chapter 11.4 (and 11.3.3 and 11.5 if interested) Open Addressing Another approach to collisions no linked lists • • all items stored in table (see Fig. 1) item2item1item3Figure 1: Open Addressing Table • one item per slot = ⇒ m ≥ n • hash function specifies order of slots to probe (try) for a key, not just one slot: (see Fig. 2) Insert(k,v) for i in xrange(m): if T [h(k, i)] is None: � empty slot T [h(k, i)] = (k, v) � store item return raise ‘full’ 1 Lecture 7 Hashing III: Open Addressing 6.006 Spring 2008h(k,3)h(k,1)h(k,4)h(k,2)k h: U x {φ,1, . . . , m-1} {φ,1, . . . , m-1} permutationall possible keyswhich probeslot to probeFigure 2: Order of Probes Example: Insert k = 496 collisionφ1234567m-1collisioninsert586 , . . .133 , . . .204 , . . .496 , . . .481 , . . .probe h(496, φ) = 4probe h(496, 1) = 1probe h(496, 2) = 5Figure 3: Insert Example Search(k) for i in xrange(m): if T [h(k, i)] is None: � empty slot? return None � end of “chain” elif T [h(k, i)][φ] == k: � matching key return T [h(k, i)] � return item return None ˙ � exhausted table 2Lecture 7 Hashing III: Open Addressing 6.006 Spring 2008Delete(k) • can’t just set T [h(k, i)] = Noneexample: delete(586) = search(496) fails• ⇒ • replace item with DeleteMe, which Insert treats as None but Search doesn’t Probing Strategies Linear Probing h(k, i)=(h�(k)+i) mod m where h�(k) is ordinary hash function • like street parking • problem: clustering as consecutive group of filled slots grows, gets more likely to grow (see Fig. 4) h(k,m-1)h(k,0)h(k,2)h(k,1);;;..;Figure 4: Primary Clustering • for 0.01 <α< 0.99 say, clusters of Θ(lg n). These clusters are knownfor α = 1, clusters of Θ(√n) These clusters are known• Double Hashing h(k, i) =(h1(k)+i. h2(k)) mod m where h1(k) and h2(k) are two ordinary hash functions. • actually hit all slots (permutation) if h2(k) is relatively prime to m • e.g. m =2r, make h2(k) always odd Uniform Hashing Assumption Each key is equally likely to have any one of the m! permutations as its probe sequence • not really true • but double hashing can come close 3 Lecture 7 Hashing III: Open Addressing 6.006 Spring 2008Analysis 1Open addressing for n items in table of size m has expected cost of ≤ 1 − α per operation, where α = n/m(< 1) assuming uniform hashing Example: α = 90% = 10 expected probes ⇒ Proof: Always make a first probe.With probability n/m, first slot occupied.In worst case (e.g. key not in table), go to next.With probability n − 1 , second slot occupied. m − 1n − 2Then, with probability , third slot full. m − 2Etc. (n possibilities) nSo expected cost = 1 + (1 + n − 1 (1 + n − 2() mm − 1m − 2··· nNow n − 1= α for i = φ, ,n(≤ m) m − 1 ≤ m ··· So expected cost ≤ 1+ α(1 + α(1 + α(··· ))) = 1+ α + α2 + α3 + ··· 1 = 1 − α Open Addressing vs. Chaining Open Addressing: better cache performance and rarely allocates memory Chaining: less sensitive to hash functions and α 4• � Lecture 7 Hashing III: Open Addressing 6.006 Spring 2008Advanced Hashing Universal Hashing Instead of defining one hash function, define a whole family and select one at random • e.g. multiplication method with random a can prove Pr (over random h) {h(x)= h(y)} = 1 for every (i.e. not random) x = ym = O(1) expected time per operation without assuming simple uniform hashing! •⇒CLRS 11.3.3 Perfect Hashing Guarantee O(1) worst-case search idea: if m = n2 then E[� collisions] ≈ 1 • 2= get φ after O(1) tries ...but O(n2) space⇒ • use this structure for storing chains k items => m = k 2NO COLLISIONS2 levels[CLRS 11.5]Figure 5: Two-level Hash Table• can prove O(n) expected total space! • if ever fails, rebuild from scratch 5
Description
In this lecture notes we are going to continue with hashing. Different strategies used in this Hasing III are Open Addressing, Probing Strategies, Uniform Hashing, Analysis and Advanced Hashing. Open Addressing is an
approach to collisions , in this strategy no linked lists, all items stored in table ,one item per slot, hash function specifies order of slots to probe (try) for a key.This lesson very nicely explains different probing strategies such as 1. Linear probing 2.Double hashing 3.Uniform hashing and 4. Advanced hashing with the help of neat diagrams.
Instructors: Prof. Erik Demaine, Prof. Ronald Rivest, Prof. Srinivas Devadas, MIT Course Number: 6.006 Level: Undergraduate, 6.006 7. 6.005 7. Hashing III: Open Addressing, Introduction to Algorithms, Electrical Engineering and Computer Science, Engineering, Massachusetts Institute of Technology: MIT Open Course Ware, http://ocw.mit.edu (02-11-2011). License: Creative Commons BY-NC-SA: http://ocw.mit.edu/terms/#cc.
Presentation Transcript
Your Facebook Friends on WizIQ