Abstract
In this information and communication technology era, designing interactive computer systems that are effective, efficient, easy, and enjoyable to use is becoming increasingly important.
Of the numerous ways explored by researchers to enhance Human-Computer Interaction, Text-to-Speech (TTS), or speech synthesis, proves to be one such modality for developing better interfaces.
Text normalization is performed on unrestricted Tamil text to convert non-standard words into standard words, which reduces ambiguous utterances during the intermediate processing of the words.
Loan words in Tamil text are identified in order to improve the pronunciation model of the Tamil speech synthesizer system.
In this paper, we describe a semiotic classifier based on a decision-list approach, with which we are able to tackle many varieties of non-standard words.
We also describe a loan/native word classifier based on multiple linear regression, which works efficiently even on words as short as three syllables.
In today's predominantly digital era of information-communication technology and human-computer interaction, such capable text processors are imperative.
Figures: Semiotic Class; Post Processing; Example of test data classification.
Text Processing for Developing Unrestricted Tamil Text to Speech Synthesis System
Vaibhavi Rajendran and G. Bharadwaja Kumar
School of Computing Science and Engineering, VIT University, Chennai Campus, Tamil Nadu, India; vvaibavi@gmail.com
Keywords: Natural Language Processing, Tamil, Text Processing, Text-to-Speech (TTS), Unrestricted Text

1. Introduction
Extensive research has been carried out in developing Text-to-Speech (TTS) systems for languages such as English, Chinese, Japanese, and German, and also for Indian languages such as Hindi, Urdu, Gujarati, Telugu, and Tamil.
The Natural Language Processing (NLP) module in an unrestricted TTS system involves processing real text, which is an intriguing task.
Proper attention is required to perform text normalization in such a way that it enhances the readability of the TTS output by decreasing the production of words with incorrect or unnatural pronunciation.
In real text, many non-standard representations of words appear; these can be described as the informal language used for communication on social networking sites, blogs, and other online platforms.
Processing informal text has become an increasingly popular research topic in recent years.
Regardless of the size of the text corpus, there will always be tokens that do not appear in it and therefore have unknown pronunciations.
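To make the idea of classifying non-standard tokens concrete, the following is a minimal sketch of a decision-list style classifier: an ordered list of tests in which the first matching rule assigns the class. The class names (DATE, TIME, CURRENCY, PERCENT, NUMBER, STANDARD), the regular expressions, and the function name `classify_token` are all hypothetical and chosen for illustration; they are not the rules or semiotic classes used in this paper. A learned decision list would normally order its rules by a score estimated from training data, whereas here the order is fixed by hand.

```python
import re

# Hypothetical decision list: (class label, test) pairs, ordered from most
# specific to least specific. The first rule that matches decides the class.
DECISION_LIST = [
    ("DATE", re.compile(r"\d{1,2}[/-]\d{1,2}[/-]\d{2,4}")),
    ("TIME", re.compile(r"\d{1,2}:\d{2}")),
    ("CURRENCY", re.compile(r"(Rs\.?|₹)\s?\d+(\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(\.\d+)?%")),
    ("NUMBER", re.compile(r"\d+(\.\d+)?")),
]


def classify_token(token):
    """Return the label of the first rule whose pattern matches the whole token."""
    for label, pattern in DECISION_LIST:
        if pattern.fullmatch(token):
            return label
    # No rule fired: treat the token as an ordinary (standard) word.
    return "STANDARD"


if __name__ == "__main__":
    for tok in ["22/08/2016", "10:30", "Rs.500", "45%", "1162", "தமிழ்"]:
        print(tok, classify_token(tok))
```

Rule order matters in this sketch: because NUMBER is tested last, a token such as `45%` is labelled PERCENT rather than being partially treated as a number. Each class would then be expanded into words by its own verbalization step, which is not shown here.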
India has twenty-nine states, seven union territories, 22 national languages, and 1162 other languages and dialects, and almost all the religions of the world have adherents in the country [2].
Because of this, loan (borrowed) words from different languages are common in many languages of India.