String Similarity Based on Phonetic in Gujarati Language Using GUJSIM Algorithm


Document pages: 11 pages

Abstract: Searching the top 10 search engines for “ગાંધીજી” or “ગાન્ધીજી” returns results that differ widely from one to another, even though both strings are correct in the Gujarati language. A string similarity algorithm is therefore useful in text mining applications when generating an index, saving both space and time. Basic string similarity compares the two strings character by character, but this may not give accurate results for the highly rich Gujarati language, whose writing styles vary with the matras, reph, vatu, and diacritics applied to simple and compound letters. The GUJSIM (GUJarati SIMilarity) algorithm is a hybrid approach to string similarity for the Gujarati language. The author compares 70 string pairs and reports that the GUJSIM algorithm gives the optimum percentage result. The algorithm also helps reduce the size of the index, as a percentage, based on unique strings.
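As a rough illustration of why plain character-by-character comparison falls short, the Python sketch below compares the two spellings of the example word before and after a simple phonetic normalization. It is not the GUJSIM algorithm, whose steps are not given in the abstract. The helper names `normalize_nasal` and `similarity` are hypothetical. The rule that a half ન (NA + virama) before a dental consonant can be folded into an anusvara is an assumption made only for this example.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Two spellings of the same word, written as Unicode code points.
# Spelling A uses an anusvara (U+0A82) for the nasal sound.
SPELLING_A = "\u0A97\u0ABE\u0A82\u0AA7\u0AC0\u0A9C\u0AC0"
# Spelling B uses NA (U+0AA8) + virama (U+0ACD), i.e. a half "na".
SPELLING_B = "\u0A97\u0ABE\u0AA8\u0ACD\u0AA7\u0AC0\u0A9C\u0AC0"

# Assumption for illustration: NA + virama directly before a dental
# consonant (U+0AA4..U+0AA7) is treated as equivalent to an anusvara.
HALF_NA_BEFORE_DENTAL = re.compile("\u0AA8\u0ACD(?=[\u0AA4-\u0AA7])")
ANUSVARA = "\u0A82"


def normalize_nasal(text: str) -> str:
    """Fold a half NA before a dental consonant into an anusvara."""
    text = unicodedata.normalize("NFC", text)
    return HALF_NA_BEFORE_DENTAL.sub(ANUSVARA, text)


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] from difflib."""
    return SequenceMatcher(None, a, b).ratio()


raw_score = similarity(SPELLING_A, SPELLING_B)
normalized_score = similarity(
    normalize_nasal(SPELLING_A), normalize_nasal(SPELLING_B)
)

print(f"raw similarity:        {raw_score:.2f}")
print(f"normalized similarity: {normalized_score:.2f}")
```

The raw score stays below 1 because the two spellings use different code point sequences for the same sound, while the normalized forms compare as identical. A real system for Gujarati would need many more rules (matras, reph, vatu, other diacritics) than this single substitution.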
