7.2.8 Soundex

(require 'soundex)

Function: soundex name

Computes the soundex hash of name. Returns a string consisting of an initial letter followed by up to three digits, each between 0 and 6. Soundex is intended to have the property that names that sound similar in ordinary English pronunciation tend to map to the same key.

Soundex is a classic algorithm that was used for manual filing of personal records before the advent of computers. It performs adequately for English names but handles names from other languages poorly.

See Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, pp. 391–392.

To handle unusual inputs, soundex ignores all non-alphabetic characters. Consequently, in this implementation, a name that contains no letters yields the empty string:

(soundex <string of blanks>)    ⇒ ""
(soundex "")                    ⇒ ""
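The coding rules behind these keys can be sketched roughly as follows. This is an illustrative Python sketch, not SLIB's Scheme source: the names CODES, soundex_sketch, and group_by_key are hypothetical. The digit table follows Knuth's description, and the sketch assumes that only directly adjacent letters with the same code are merged (h and w break a run just as vowels do), which some published variants of Soundex handle differently.

```python
from collections import defaultdict

# Letter-to-digit table as given in Knuth's description; letters not listed
# here (a, e, h, i, o, u, w, y) carry no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(name):
    """Hypothetical helper, not SLIB's code: return a Soundex-style key."""
    # Keep ASCII letters only, mirroring "ignores non-alphabetic characters".
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""  # no letters at all: empty key
    digits = []
    previous = CODES.get(letters[0])  # code of the initial letter, if any
    for ch in letters[1:]:
        code = CODES.get(ch)
        # Record a digit only when it differs from the immediately preceding
        # letter's code, so runs such as "ss" or "ll" count once.
        if code is not None and code != previous:
            digits.append(code)
        previous = code
    # Pad with zeros or truncate to exactly three digits.
    return letters[0].upper() + "".join(digits)[:3].ljust(3, "0")


def group_by_key(names):
    """Bucket names that share a key (hypothetical helper for illustration)."""
    buckets = defaultdict(list)
    for name in names:
        buckets[soundex_sketch(name)].append(name)
    return dict(buckets)
```

Under these assumptions, group_by_key applied to a list such as "Euler", "Ellery", "Gauss", "Ghosh" files the first two names under one key and the last two under another, which is the filing behaviour the keys were designed for.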

Examples from Knuth:

(map soundex '("Euler" "Gauss" "Hilbert" "Knuth"
               "Lloyd" "Lukasiewicz"))
        ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")

(map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant"
               "Ladd" "Lissajous"))
        ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")

Some cases in which the algorithm fails to give similar-sounding names the same key (Knuth):

(map soundex '("Rogers" "Rodgers"))     ⇒ ("R262" "R326")

(map soundex '("Sinclair" "St. Clair")) ⇒ ("S524" "S324")

(map soundex '("Tchebysheff" "Chebyshev")) ⇒ ("T212" "C121")