Next: String Search, Previous: Space-Filling Curves, Up: Sorting and Searching [Contents][Index]
Computes the soundex hash of name. Returns a string of an initial letter and up to three digits between 0 and 6. Soundex supposedly has the property that names that sound similar in normal English pronunciation tend to map to the same key.
Soundex was a classic algorithm used for manual filing of personal records before the advent of computers. It performs adequately for English names but has trouble with other languages.
See Knuth, Vol. 3 Sorting and searching, pp 391–2
To manage unusual inputs, soundex
omits all non-alphabetic
characters. Consequently, in this implementation:
(soundex <string of blanks>) ⇒ "" (soundex "") ⇒ ""
Examples from Knuth:
(map soundex '("Euler" "Gauss" "Hilbert" "Knuth" "Lloyd" "Lukasiewicz")) ⇒ ("E460" "G200" "H416" "K530" "L300" "L222") (map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant" "Ladd" "Lissajous")) ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")
Some cases in which the algorithm fails (Knuth):
(map soundex '("Rogers" "Rodgers")) ⇒ ("R262" "R326") (map soundex '("Sinclair" "St. Clair")) ⇒ ("S524" "S324") (map soundex '("Tchebysheff" "Chebyshev")) ⇒ ("T212" "C121")