Fuzzy Regular Expressions
Variants of regular expressions can be used to process natural-language text when possible typos and spelling variants must be taken into account. For example, the pattern "Julius Caesar" might be a fuzzy match for:
- Gaius Julius Caesar
- Yulius Cesar
- G. Juliy Caezar
In such cases the matching engine implements a fuzzy (approximate) string matching algorithm, and possibly an algorithm for measuring the similarity between a text fragment and the pattern.
This task is closely related to both full-text search and named entity recognition.
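As a rough illustration of the kind of algorithm such an engine might build on, the sketch below searches a text for the substring closest to a literal pattern, counting insertions, deletions and substitutions (a dynamic-programming approach often attributed to Sellers). It is not taken from any of the libraries discussed in this section; the function name `fuzzy_find` and the parameter `max_errors` are hypothetical, and a real fuzzy regular expression engine would have to handle full regex syntax rather than a plain string.

```python
from typing import Optional, Tuple


def fuzzy_find(pattern: str, text: str, max_errors: int) -> Optional[Tuple[int, int]]:
    """Find the best approximate occurrence of `pattern` inside `text`.

    Returns (end_index, errors) for the first end position with the fewest
    edits, or None if every candidate needs more than `max_errors` edits.
    Hypothetical helper for illustration only.
    """
    m = len(pattern)
    # prev[i] = fewest edits needed to turn pattern[:i] into some substring
    # of the text that ends at the current text position.
    prev = list(range(m + 1))
    best = (0, prev[m]) if prev[m] <= max_errors else None

    for end, ch in enumerate(text, start=1):
        # Row entry 0 is always 0: a match may start anywhere in the text.
        cur = [0]
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (0 if pattern[i - 1] == ch else 1)
            skip_text_char = prev[i] + 1      # extra character in the text
            skip_pattern_char = cur[i - 1] + 1  # pattern character missing
            cur.append(min(substitution, skip_text_char, skip_pattern_char))
        if cur[m] <= max_errors and (best is None or cur[m] < best[1]):
            best = (end, cur[m])
        prev = cur

    return best


# Try the pattern against the spelling variants listed above.
for candidate in ("Gaius Julius Caesar", "Yulius Cesar", "G. Juliy Caezar"):
    print(candidate, "->", fuzzy_find("Julius Caesar", candidate, max_errors=3))
```

The matching is case-sensitive and treats all three edit types as equally costly. A simple similarity score could be derived from the result, for example as one minus the error count divided by the pattern length, though other definitions are equally plausible.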
Several software libraries and tools support fuzzy regular expressions:
- TRE - a well-developed, portable, free library written in C, with syntax similar to POSIX.
- FREJ - an open-source Java project with a non-standard syntax (a Lisp-like prefix notation), designed to make it easy to substitute inner matched fragments into outer blocks; it lacks many features of standard regular expressions.
- agrep - a command-line utility (proprietary, but free for non-commercial use).