An algorithm for fuzzy text search in online social networks

Yulia Davydova


In monitoring online social networks, keyword search is complicated by misspellings, typos, and slang in users' posts. To reduce the sensitivity of search to misspellings and to improve the completeness of search results, we propose fuzzy search with filtration. This article presents an algorithm consisting of two stages: scanning and verification. At the scanning stage, the text is filtered to exclude from consideration those posts that definitely do not contain the keywords. The remaining posts are checked at the verification stage. Integrating linguistic rules and misspelling statistics into the text search preserves its accuracy. The article presents an evaluation of the effectiveness of the fuzzy search algorithm as a whole and, in particular, of the classifier used in it. Testing was done on a sample of posts from the General Internet-Corpus of Russian.
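The two-stage structure described above can be illustrated with a minimal sketch. It is not the algorithm evaluated in this article: the scanning stage here is a generic q-gram count filter and the verification stage is plain edit distance against any substring of the post (Sellers' dynamic programme), with no linguistic rules, misspelling statistics, or classifier. The function names `scan`, `verify`, and `fuzzy_search`, and the parameters `k` (maximum number of edits) and `q` (q-gram length), are assumptions made for illustration only.

```python
def qgrams(s, q):
    """Return all overlapping substrings of length q, in order."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def scan(post, keyword, k, q=2):
    """Scanning stage (assumed filter): a cheap necessary condition.

    One edit destroys at most q of the keyword's q-grams, so a post that
    contains the keyword within k edits must share at least
    len(qgrams) - k*q of them. Posts below that threshold are discarded.
    """
    kw_grams = qgrams(keyword, q)
    threshold = len(kw_grams) - k * q
    if threshold <= 0:
        return True  # the filter cannot exclude anything for this keyword
    post_grams = set(qgrams(post, q))
    shared = sum(1 for g in kw_grams if g in post_grams)
    return shared >= threshold


def verify(post, keyword, k):
    """Verification stage (assumed check): smallest edit distance between
    the keyword and any substring of the post must not exceed k."""
    m = len(keyword)
    prev = list(range(m + 1))  # column for the empty text prefix
    best = prev[m]
    for ch in post:
        cur = [0]  # a match may start at any position in the post
        for i in range(1, m + 1):
            cost = 0 if keyword[i - 1] == ch else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitution
                           prev[i] + 1,         # extra character in the post
                           cur[i - 1] + 1))     # keyword character missing
        best = min(best, cur[m])
        prev = cur
    return best <= k


def fuzzy_search(posts, keyword, k=1, q=2):
    """Return the posts that pass both stages (case-insensitive)."""
    kw = keyword.lower()
    return [p for p in posts
            if scan(p.lower(), kw, k, q) and verify(p.lower(), kw, k)]


posts = [
    "Send me your adress please",  # misspelling, one edit from the keyword
    "Nothing relevant here",       # rejected by the scanning stage
    "Dresses add up",              # passes scanning, rejected by verification
    "My address is public",        # exact match
]
matches = fuzzy_search(posts, "address", k=1)
```

In this sketch the filter never discards a post that the verification stage would accept, so it only saves work. A purely character-based check of this kind also accepts unrelated strings that happen to be close to a keyword, which is the sort of error that linguistic rules and misspelling statistics are intended to reduce.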

