Computes the Soundex hash of name. Returns a string consisting of an initial letter followed by up to three digits between 0 and 6. Soundex is intended to have the property that names that sound similar in normal English pronunciation tend to map to the same key.
Soundex is a classic algorithm that was used for manual filing of personal records before the advent of computers. It performs adequately for English names but has trouble with names from other languages.
See Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, pp. 391–392.
To manage unusual inputs, soundex omits all non-alphabetic characters. Consequently, in this implementation:
(soundex <string of blanks>) ⇒ ""
(soundex "") ⇒ ""
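The behavior described above can be illustrated with a minimal sketch of a Knuth-style Soundex in portable Scheme. This is not the library's own source: the names soundex-sketch, letter-code, keep-letters and pad-key are hypothetical, and the sketch assumes the usual statement of the algorithm (keep the first letter, code the remaining consonants 1 to 6, collapse runs of equal codes, treat H and W as transparent inside a run, pad with zeros to three digits). It may therefore disagree with the real soundex on some inputs.

```scheme
;; Sketch only: hypothetical helper names, not the library's definitions.

(define (letter-code c)
  ;; Digit character for an upper-case consonant, or #f for letters
  ;; that carry no code (A E H I O U W Y and anything unlisted).
  (cond ((memv c '(#\B #\F #\P #\V)) #\1)
        ((memv c '(#\C #\G #\J #\K #\Q #\S #\X #\Z)) #\2)
        ((memv c '(#\D #\T)) #\3)
        ((char=? c #\L) #\4)
        ((memv c '(#\M #\N)) #\5)
        ((char=? c #\R) #\6)
        (else #f)))

(define (keep-letters chars)
  ;; Upper-case the alphabetic characters and drop everything else.
  (cond ((null? chars) '())
        ((char-alphabetic? (car chars))
         (cons (char-upcase (car chars)) (keep-letters (cdr chars))))
        (else (keep-letters (cdr chars)))))

(define (pad-key key-chars)
  ;; key-chars holds the key so far in reverse order; pad it with
  ;; zeros to four characters and return it as a string.
  (if (< (length key-chars) 4)
      (pad-key (cons #\0 key-chars))
      (list->string (reverse key-chars))))

(define (soundex-sketch name)
  (let ((letters (keep-letters (string->list name))))
    (if (null? letters)
        ""                              ; nothing alphabetic to encode
        (let loop ((rest (cdr letters))
                   (prev (letter-code (car letters)))
                   (key (list (car letters))))
          (if (or (null? rest) (= (length key) 4))
              (pad-key key)
              (let* ((c (car rest))
                     (code (letter-code c)))
                (cond
                 ;; H and W are skipped without breaking a run of equal codes.
                 ((memv c '(#\H #\W)) (loop (cdr rest) prev key))
                 ;; Vowels and Y emit nothing but do break a run.
                 ((not code) (loop (cdr rest) #f key))
                 ;; Same code as the previous coded letter: omit it.
                 ((eqv? code prev) (loop (cdr rest) prev key))
                 (else (loop (cdr rest) code (cons code key))))))))))
```

Under these assumptions the sketch gives, for instance:

(soundex-sketch "Lloyd") ⇒ "L300"
(soundex-sketch "St. Clair") ⇒ "S324"
(soundex-sketch "   ") ⇒ ""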
Examples from Knuth:
(map soundex '("Euler" "Gauss" "Hilbert" "Knuth"
               "Lloyd" "Lukasiewicz"))
     ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")

(map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant"
               "Ladd" "Lissajous"))
     ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")
Some cases in which the algorithm fails (Knuth):
(map soundex '("Rogers" "Rodgers")) ⇒ ("R262" "R326")
(map soundex '("Sinclair" "St. Clair")) ⇒ ("S524" "S324")
(map soundex '("Tchebysheff" "Chebyshev")) ⇒ ("T212" "C121")