
Towards better Sentence Classification for Morphologically Rich Languages




- Madhuri Tummalapalli, Manoj Chinnakotla, Radhika Mamidi

Summary of Research papers from IIIT Hyd.


I would like to present summaries of a couple of Natural Language Processing research papers from the International Institute of Information Technology (IIIT) Hyderabad in this and upcoming blog posts.

Sentence classification methods for English depend heavily on language resources such as parsers, or rely on labeled and unlabeled data, which makes them difficult to adapt to other languages. The paper evaluates deep learning techniques for sentence classification on morphologically rich Indian languages, particularly Hindi and Telugu.

The authors obtained annotated Hindi data by translating the TREC-UIUC dataset. They highlight a multiInput-CNN variant and report improved performance with it.


What do Morphologically Rich and Agglutinative mean?


A Morphologically Rich Language (MRL) is one in which grammatical relations such as subject, predicate, and object are indicated by changes to the words themselves rather than by relative position or the addition of particles.




An agglutinative language is a type of synthetic language whose morphology primarily uses agglutination. Words may contain several morphemes that determine their meaning, but all of these morphemes (including stems and affixes) remain unchanged in every respect after they are joined.


The paper performs two tasks on the translated Hindi and Telugu datasets.


Question Classification


The features used include bag of words, the WH-word in the question, word shape, and question length. WH-words are extracted using rules on parse trees.

Hypernyms and named entities are also used.
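To make a few of these surface features concrete, here is a minimal sketch of my own, not the authors' code. It assumes whitespace tokenization and a hypothetical English WH-word list, and it finds the WH-word by simple lookup instead of the parse-tree rules mentioned above. The names `WH_WORDS`, `word_shape` and `question_features` are made up for illustration.

```python
# Hypothetical English WH-word list, used only for illustration.
WH_WORDS = ("what", "which", "who", "whom", "whose", "when", "where", "why", "how")


def word_shape(token):
    """Map a token to a coarse shape: 'X' for upper case, 'x' for lower case, 'd' for digits."""
    shape = []
    for ch in token:
        if ch.isupper():
            shape.append("X")
        elif ch.islower():
            shape.append("x")
        elif ch.isdigit():
            shape.append("d")
        else:
            # Characters without case (punctuation, many non-Latin scripts) are kept as-is.
            shape.append(ch)
    return "".join(shape)


def question_features(question):
    """Return WH-word, question length and word shapes for one question."""
    # Assumption: plain whitespace tokenization after dropping question marks.
    tokens = question.replace("?", " ").split()
    lowered = [t.lower() for t in tokens]
    # Simplification: first token found in the lookup list, no parse tree involved.
    wh_word = next((t for t in lowered if t in WH_WORDS), None)
    return {
        "wh_word": wh_word,
        "length": len(tokens),
        "shapes": [word_shape(t) for t in tokens],
    }


features = question_features("How many people live in Tokyo?")
print(features["wh_word"], features["length"], features["shapes"])
```

Word shape is mostly informative for scripts that have letter case, so for Hindi or Telugu text this particular feature would likely need a different definition.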

Sentiment Analysis  


Sentiment analysis uses bag-of-words features (unigrams and bigrams), along with part-of-speech (POS) tags, adjectives, etc.
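As a rough illustration of the unigram and bigram bag-of-words features, here is a small sketch using only the Python standard library. It assumes lowercasing and whitespace tokenization, which is my simplification and not necessarily what the paper does. The helper names `word_ngrams` and `bag_of_ngrams` are hypothetical.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of size n as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(sentence, max_n=2):
    """Count unigrams up to max_n-grams in a sentence (max_n=2 gives unigrams + bigrams)."""
    # Assumption: lowercase and split on whitespace.
    tokens = sentence.lower().split()
    bag = Counter()
    for n in range(1, max_n + 1):
        bag.update(word_ngrams(tokens, n))
    return bag


bag = bag_of_ngrams("the movie was not good")
print(bag["good"], bag["not good"])
```

Bigrams such as "not good" capture local negation that unigram counts alone would miss, which is one reason they are commonly used for sentiment.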






Models

The highlight of the paper that attracted me is the use of Context Vectors.


  • Dynamic CNN - Dynamic k-max pooling was introduced. In a dynamic k-max pooling operation, k is a function of the length of the sentence and the depth of the network.



  • LSTM - An LSTM is used as a dependency-sensitive model, and its output is then fed to a CNN to get the final result. These are called Context Vectors (CoV)


  • Context Vector - A context vector is associated with every unique word in the training corpus. A self-organization-based learning approach is used to derive these context vectors so that vectors for words used in similar contexts point in similar directions.


  • Attention mechanism in multiInput-CNN
What is attention?
An attention mechanism can be viewed as a way of making an RNN work better by letting the network know where to look as it performs its task.
    Ref for more on attention
           The basic model is the one proposed by Kim. For more details on the attention model, see the ref

           The authors customised an attention-based multiInput-CNN to improve performance.


  • SVM

The SVM is trained with a linear kernel on bag-of-words, bag-of-word n-gram and bag-of-character n-gram input. The authors fed a combination of word and character n-grams to the SVM for the experiments.
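Going back to the Dynamic CNN item in the list above, the following sketch shows how k can depend on sentence length and network depth. This summary does not spell out the formula, so it is an assumption here: I use the form commonly given for the original Dynamic CNN, k = max(k_top, ceil((L - l) / L * s)), where L is the total number of convolutional layers, l the current layer, s the sentence length and k_top a fixed value for the topmost layer. The function names are hypothetical.

```python
def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Pooling size k for a given layer, sentence length and network depth."""
    numerator = (total_layers - layer) * sentence_length
    # Integer ceiling division avoids floating-point rounding issues.
    scaled = -(-numerator // total_layers)
    return max(k_top, scaled)


def k_max_pooling(values, k):
    """Keep the k largest values of a sequence, preserving their original order."""
    if k >= len(values):
        return list(values)
    # Indices of the k largest values (ties keep the earlier position).
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


# A sentence of 18 words in a 3-layer network with k_top = 3.
ks = [dynamic_k(layer, 3, 18, 3) for layer in (1, 2, 3)]
print(ks)
print(k_max_pooling([1, 5, 2, 9, 3], 3))
```

Unlike plain max pooling, k-max pooling keeps several activations and their relative order, so some positional information survives into the next layer.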


Experiments and results


The experiments are divided into three parts. Table 3 in the paper presents a detailed evaluation and discussion of different baseline models and inputs on all datasets. Table 4 compares the multiInput-CNN model's performance with the baselines and other state-of-the-art models. In Table 5, the authors try to understand the inherent dataset biases across languages.





The authors also propose directions for future research on the topic; refer to the paper for details.
