

7.2.8 Soundex

(require 'soundex)

— Function: soundex name

Computes the soundex hash of name. Returns a string consisting of an initial letter followed by digits between 0 and 6; as the examples below show, keys with fewer than three significant digits are padded with zeros. Soundex is intended to have the property that names that sound similar in normal English pronunciation tend to map to the same key.

Soundex is a classic algorithm that was used for manual filing of personal records before the advent of computers. It performs adequately for English names but handles names from other languages poorly.

See Knuth, Vol. 3, Sorting and Searching, pp. 391–392.

To handle unusual inputs, soundex ignores all non-alphabetic characters. Consequently, in this implementation:

          (soundex <string of blanks>)    ⇒ ""
          (soundex "")                    ⇒ ""
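
The coding procedure can be sketched in portable Scheme as follows. This is a minimal sketch of the commonly described Soundex rules, not the library's own source: the names soundex-code and soundex-sketch are hypothetical, and treating h and w as transparent between letters with the same code is an assumption that the library may or may not share. It drops non-alphabetic characters first, so blank and empty strings yield "".

```scheme
;; Hypothetical helper: digit character for a letter, or #f for
;; letters that carry no code (a e h i o u w y and anything else).
(define (soundex-code ch)
  (case (char-downcase ch)
    ((#\b #\f #\p #\v) #\1)
    ((#\c #\g #\j #\k #\q #\s #\x #\z) #\2)
    ((#\d #\t) #\3)
    ((#\l) #\4)
    ((#\m #\n) #\5)
    ((#\r) #\6)
    (else #f)))

;; Hypothetical sketch of the classic procedure; not SLIB's code.
(define (soundex-sketch name)
  ;; Step 1: keep only the alphabetic characters.
  (let ((letters
         (let keep ((cs (string->list name)) (acc '()))
           (cond ((null? cs) (reverse acc))
                 ((char-alphabetic? (car cs))
                  (keep (cdr cs) (cons (car cs) acc)))
                 (else (keep (cdr cs) acc))))))
    (if (null? letters)
        ""
        ;; Step 2: scan the remaining letters.  PREV is the code of
        ;; the previous coded letter (or #f after a vowel); DIGITS
        ;; holds the emitted digits, most recent first.
        (let scan ((rest (cdr letters))
                   (prev (soundex-code (car letters)))
                   (digits '()))
          (if (or (null? rest) (= (length digits) 3))
              ;; Step 3: pad with zeros to three digits and prepend
              ;; the upper-cased first letter.
              (let pad ((ds digits))
                (if (< (length ds) 3)
                    (pad (cons #\0 ds))
                    (list->string
                     (cons (char-upcase (car letters)) (reverse ds)))))
              (let* ((ch (car rest))
                     (code (soundex-code ch)))
                (cond
                 ;; Assumption: h and w do not separate equal codes.
                 ((memv (char-downcase ch) '(#\h #\w))
                  (scan (cdr rest) prev digits))
                 ;; Vowels (and y) emit nothing but reset PREV.
                 ((not code) (scan (cdr rest) #f digits))
                 ;; A repeat of the previous code is dropped.
                 ((eqv? code prev) (scan (cdr rest) prev digits))
                 (else
                  (scan (cdr rest) code (cons code digits))))))))))
```

Under those assumptions the sketch reproduces the keys listed in this section, for instance:

          (map soundex-sketch '("Rogers" "Rodgers"))  ⇒ ("R262" "R326")
          (soundex-sketch "")                         ⇒ ""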

Examples from Knuth:

          (map soundex '("Euler" "Gauss" "Hilbert" "Knuth"
                                 "Lloyd" "Lukasiewicz"))
                  ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")
          
          (map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant"
                                  "Ladd" "Lissajous"))
                  ⇒ ("E460" "G200" "H416" "K530" "L300" "L222")
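
Because similar-sounding names share a key, the key can be used to file names into buckets. The following is a minimal sketch of that idea; add-to-groups and group-by-key are hypothetical helpers, not part of the library, and the key procedure is passed in as an argument so that the code does not depend on any particular hash.

```scheme
;; Hypothetical helper: add NAME to the group labelled KEY in GROUPS,
;; an association list of the form ((key name ...) ...).  A new group
;; is appended if KEY is not present yet.
(define (add-to-groups key name groups)
  (cond ((null? groups) (list (list key name)))
        ((string=? key (car (car groups)))
         (cons (append (car groups) (list name)) (cdr groups)))
        (else
         (cons (car groups)
               (add-to-groups key name (cdr groups))))))

;; Hypothetical helper: bucket NAMES by the string that KEY-PROC
;; returns for each name, keeping first-seen order.
(define (group-by-key key-proc names)
  (let loop ((names names) (groups '()))
    (if (null? names)
        groups
        (loop (cdr names)
              (add-to-groups (key-proc (car names))
                             (car names)
                             groups)))))
```

With soundex as the key procedure, and the keys shown above, one would expect:

          (group-by-key soundex '("Euler" "Gauss" "Ellery" "Ghosh"))
                  ⇒ (("E460" "Euler" "Ellery") ("G200" "Gauss" "Ghosh"))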

Some cases, also from Knuth, in which the algorithm fails to give similar-sounding names the same key:

          (map soundex '("Rogers" "Rodgers"))     ⇒ ("R262" "R326")
          
          (map soundex '("Sinclair" "St. Clair")) ⇒ ("S524" "S324")
          
          (map soundex '("Tchebysheff" "Chebyshev")) ⇒ ("T212" "C121")