Tokenization Using the Indic NLP Library

Hello! I should really greet you in an Indian language, since today’s topic is Indian languages.

Natural Language Processing looks fascinating, but like Machine Learning, it needs data cleaning and data pre-processing first.

Sounds boring, right? But it’s not our fault: machines never tried to learn human languages. It was us who generously learnt numbers to communicate with them. Jokes apart, tokenization is an integral part of data pre-processing. Basically, we split the text into smaller units called tokens, which can be words or characters.
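To make the idea concrete, here is a minimal sketch of word-level and character-level tokenization in plain Python. It does not use the Indic NLP Library itself. The function names and the small punctuation set (which includes the Devanagari danda) are assumptions made for illustration only.

```python
# Punctuation to split off as separate tokens (illustrative, not exhaustive).
# "\u0964" is the Devanagari danda, used as a sentence terminator.
PUNCTUATION = set(".,!?;:\u0964")


def word_tokens(text):
    """Split text into word tokens, treating each punctuation mark as its own token."""
    # Pad punctuation with spaces so whitespace splitting separates it from words.
    spaced = "".join(f" {ch} " if ch in PUNCTUATION else ch for ch in text)
    return spaced.split()


def char_tokens(text):
    """Split text into character tokens (Unicode code points), skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


if __name__ == "__main__":
    sentence = "Machines learn numbers."
    print(word_tokens(sentence))
    print(char_tokens("a b"))
```

One caveat for Indian scripts: splitting by code point, as char_tokens does, separates vowel signs and other combining marks from their base consonants, so a real tokenizer for these languages usually needs more care than this sketch shows.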


Tags: Library NLP