
Posts

Showing posts from January, 2019

Design of a Phonetically Balanced Code-Mixed Hindi-English Read Speech Corpus for Automatic Speech Recognition

By Ayushi Pandey, B M L Srivastava, Rohit Kumar, B T Nellore, K S Teja, S V Gangashetty. Paper Link. "Hungry kya?" "What your bahana is?" A few advertisement slogans: Pepsi: "Yeh Dil Maange More"; Coke: "Life ho to aisi". Have you come across conversations like these, spoken in a half-baked version of the "pure" native language? :) A new pattern has emerged, known as Hinglish. The mix of Hindi and English is the language of the street and the college campus, and its sound sets many parents' teeth on edge. It's a bridge between two cultures that has become an island of its own, a distinct hybrid culture for people who aspire to make it rich abroad without sacrificing the sassiness of the mother tongue. And it may soon claim more native speakers worldwide than English. Full article on Hinglish. Bilingual and multilingual speech communities recognize code-switching and code-mixing as predominant phen

Plagiarism detection in programming assignments

Summary of Research Papers from IIIT Hyd. Unsupervised Learning Based Approach for Plagiarism Detection in Programming Assignments, by Jitendra Yasaswi Bharadwaj Katta, Srikailash G, Anil Chilupuri, Suresh Purini, C V Jawahar. Research Paper link. Once there lived a group of ants. Because of the changing weather across summer, winter, and the rainy season, the ants decided to roll particles of soil out of the earth and build a safer place to live. Their group work took the form of an anthill. But before the ants could enjoy the fruits of the home they had designed, a snake came from somewhere, occupied the place, and started living there. The ants' hard work ended in frustration. This kind of stealing is called obfuscation. Today's paper deals with plagiarism: the automatic detection of plagiarism in programming assignments. Martins define plagiarism as "the usage of work without crediting its authors". Easy access to enormous web content has turned plagiarism in

Science journalism meets artificial intelligence "Robotic Journalism"

By Raghuram Vadapalli, Bakhtiyar Syed, Nishant Prabhu, Balaji Vasan Srinivasan, Vasudeva Varma. Summary of Research Papers from IIIT Hyd. Research paper link. For the past couple of years, an exciting topic has been attracting attention in the machine learning and artificial intelligence world: the "Robot Reporter". Today's paper is inspired by that concept. Applying it to science journalism is non-trivial, as that would entail understanding scientific content and translating it into simpler language without distorting the underlying semantics. The paper takes first steps towards answering a few of these challenges. The authors built a tool which, given the title and abstract of a research paper, generates a blog title by mimicking a human science journalist. The tool uses a model trained on 87,328 pairs of research papers and their related blogs. The contributions can be summed up as follows: 1. A new parallel corpus of 87,328 pairs of research paper titles and abstracts and their c

Towards better Sentence Classification for Morphologically Rich Languages

By Madhuri Tummalapalli, Manoj Chinnakotla, Radhika Mamidi. Summary of Research Papers from IIIT Hyd. Research Paper Link. In this and upcoming blog posts, I would like to present summaries of some of the Natural Language Processing research done at the International Institute of Information Technology (IIIT) Hyderabad. English sentence classification methods depend heavily on language resources that perform parsing, or rely on labeled and unlabeled data, which makes them difficult to adapt to other languages. The paper evaluates deep learning techniques for sentence classification on morphologically rich Indian languages, particularly Hindi and Telugu. The authors obtained annotated Hindi data by translating the TREC-UIUC dataset. They shed light on a multiInput-CNN variant that is able to perform better. What do "morphologically rich" and "agglutinating" mean? A Morphologically Rich Language (MRL) is one in which grammatical relations like subject, predicate, object, etc., are indicated by chan