I am looking for a way of doing fuzzy string matching using the Haskell programming language. I want to parse the user's text input (which is a normal natural english sentence). The user may will have errors in his text, which should get recognized and replaced automatically by my program. I think the easiest way would be calculating the 'Levenshtein distance' to all known words and use the word with the smallest distance. Although, this is not really fuzzy logic (fuzzy logic depends, in my view, more on the context of the sentence which is written), I think it is a good starting point in order to get an initial version working. According to the Wikibook the Haskell implementation would be as easy as this:
levenshtein :: String- > String -> Int
levenshtein s t =
d!!(length s)!!(length t)
where d = [[distance m n|n<-[0..length t]]|m<-[0..length s]]
distance i 0 = i
distance 0 j = j
distance i j = minimum [d!!(i-1)!!j+1,
d!!i!!(j-1)+1, d!!(i-1)!!(j-1)
+ (if s!!(i-1)==t!!(j-1)
then 0 else 1)]
This method may be too slow if I've to compare each word with about serveral thousands words. Hrrm.
Or using another implementation, because:
The implementations of the Levenshtein algorithm on this page are illustrative only. Applications will, in most cases, use implementations which use heap allocations sparingly, in particular when large lists of words are compared to each other.
Yet another way would be implementing a Fuzzy Logic Library (as learned in the Neuronal Fuzzy Networks lecture @ Aachen University of Applied Sciences) from scratch.