Word Rotator's Distance

Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui


Abstract

A key principle in assessing textual similarity is measuring the degree of semantic overlap between two texts by considering the word alignment. Such alignment-based approaches are intuitive and interpretable; however, they are empirically inferior to the simple cosine similarity between general-purpose sentence vectors. To address this issue, we focus on and demonstrate the fact that the norm of word vectors is a good proxy for word importance, and their angle is a good proxy for word similarity. Alignment-based approaches do not distinguish them, whereas sentence-vector approaches automatically use the norm as the word importance. Accordingly, we propose a method that first decouples word vectors into their norm and direction, and then computes alignment-based similarity using earth mover’s distance (i.e., optimal transport cost), which we refer to as word rotator’s distance. Besides, we find how to “grow” the norm and direction of word vectors (vector converter), which is a new systematic approach derived from sentence-vector estimation methods. On several textual similarity datasets, the combination of these simple proposed methods outperformed not only alignment-based approaches but also strong baselines. 


Semantic textual similarity (STS) is the task of measuring the degree of semantic equivalence between two sentences. 

For example, the sentences

    “Two boys on a couch are playing video games.” and

    “Two boys are playing a video game.”

are mostly equivalent (a similarity score of 4 out of 5),

while the sentences

    “The woman is playing the violin.” and

    “The young lady enjoys listening to the guitar.”

are not equivalent but are on the same topic (a score of 1).


System predictions are customarily evaluated by Pearson correlation with the gold scores. Hence, systems are only required to predict relative similarity rather than absolute scores.
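As a small illustration of why only relative similarity matters, the sketch below computes Pearson correlation in plain Python and shows that rescaling the predictions leaves the score unchanged. The gold scores and predictions are made-up numbers, not results from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical gold STS scores (0-5 scale) and system outputs on another scale.
gold = [4.0, 1.0, 3.2, 0.5, 5.0]
pred = [0.81, 0.35, 0.70, 0.20, 0.93]

# A positive affine rescaling of the predictions does not change the correlation.
rescaled = [5.0 * p + 2.0 for p in pred]
r_original = pearson(gold, pred)
r_rescaled = pearson(gold, rescaled)
print(r_original, r_rescaled)
```

Because the two printed values agree, a similarity measure only needs to rank and space sentence pairs sensibly; it does not need to reproduce the 0 to 5 scale itself.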


There are two major approaches to tackling STS. One is to measure the degree of semantic overlap between texts by considering the word alignment, which we refer to as alignment-based approaches. The other approach involves generating general-purpose sentence vectors from two texts (typically comprising word vectors), and then calculating their similarity, which we refer to as sentence-vector approaches. Alignment-based approaches are consistent with human intuition about textual similarity, and their predictions are interpretable. However, the performance of such approaches is lower than that of sentence-vector approaches.

The paper proposes an STS method that first decouples word vectors into their norms and direction vectors and then aligns the direction vectors using earth mover’s distance (EMD). The key idea is to map the norm and angle of the word vectors to the two EMD parameters, probability mass and transportation cost, respectively. The proposed method is natural from both the optimal transport and the word embedding perspectives, preserves the features of alignment-based methods, and can directly incorporate sentence-vector estimation methods, which results in fairly high performance.

The contributions are as follows.

• The paper demonstrates that the norm of a word vector implicitly encodes the importance weight of the word, and that the angle between word vectors is a good proxy for the dissimilarity of words.

• It proposes a new textual similarity measure, word rotator’s distance (WRD), that uses the norm and direction of word vectors separately.

• To enhance WRD, it introduces a new word-vector conversion mechanism, which is formally derived from recent sentence-vector estimation methods.

• It demonstrates that the proposed methods achieve high performance compared with strong baseline methods on several STS tasks.
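The first two points can be made concrete with a short sketch. It splits each word vector into a norm and a unit direction vector, and checks that plain additive composition (summing word vectors into a sentence vector) gives the same result as summing direction vectors weighted by their norms. This is how an additive sentence vector ends up using the norm as an importance weight. The 2-d vectors are toy values chosen for illustration, and `decouple` is a hypothetical helper name.

```python
import math

def decouple(word_vector):
    """Split a non-zero vector into (norm, unit direction vector)."""
    norm = math.sqrt(sum(x * x for x in word_vector))
    direction = [x / norm for x in word_vector]
    return norm, direction

# Toy 2-d "word vectors" (hypothetical values, not real embeddings).
sentence = [[3.0, 4.0], [0.0, 0.5]]
parts = [decouple(w) for w in sentence]

# Additive sentence vector: plain sum of the word vectors.
summed = [sum(column) for column in zip(*sentence)]

# The same vector rebuilt as a norm-weighted sum of directions.
dim = len(sentence[0])
weighted = [sum(norm * direction[k] for norm, direction in parts) for k in range(dim)]

print(parts)             # the first word has norm 5.0, the second 0.5
print(summed, weighted)  # both describe the same sentence vector
```

With norms of 5.0 and 0.5, the first word contributes ten times more to the sum than the second, regardless of which direction each word points in.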


Word Mover’s Distance and its Issues

Earth Mover’s Distance 

    Intuitively, earth mover’s distance is the minimum cost required to turn one pile of dirt into another pile of dirt (Figure 1). 
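To make the dirt-moving picture concrete, here is a minimal sketch for the one-dimensional special case, where both piles sit on the same evenly spaced positions and moving one unit of dirt by one position costs 1. In that setting the minimum cost equals the sum of absolute differences between the running totals of the two piles. The function name `emd_1d` and the example piles are illustrative; the general case, with an arbitrary cost between locations, needs an optimal transport solver.

```python
from itertools import accumulate

def emd_1d(pile_a, pile_b):
    """EMD between two 1-d piles on positions 0, 1, 2, ... with equal total mass."""
    if len(pile_a) != len(pile_b):
        raise ValueError("piles must cover the same positions")
    if abs(sum(pile_a) - sum(pile_b)) > 1e-9:
        raise ValueError("piles must contain the same total amount of dirt")
    # The dirt that has to cross the boundary between position k and k + 1
    # is the difference of the cumulative sums up to position k.
    return sum(abs(a - b) for a, b in zip(accumulate(pile_a), accumulate(pile_b)))

print(emd_1d([3, 0, 0], [0, 0, 3]))  # 3 units moved 2 positions each -> 6
print(emd_1d([1, 2, 0], [0, 1, 2]))  # every unit shifts by 1 position -> 3
```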



Word Mover’s Distance

Word mover’s distance (WMD) is a dissimilarity measure between texts and is a pioneering work that introduced EMD to the natural language processing (NLP) field. This study is strongly inspired by this work. We introduce WMD prior to presenting the proposed method. WMD is the cost of transporting a set of word vectors in an embedding space (Euclidean space) (Figure 2). 
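The two ingredients WMD hands to EMD can be sketched as follows, assuming the common setup in which every word token carries the same probability mass and the transportation cost is the Euclidean distance between word vectors. The helper name `wmd_inputs` and the toy vectors are hypothetical. The example at the end shows that Euclidean distance reacts to both norm and direction: two vectors pointing the same way but differing in length still incur a cost.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two vectors of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def wmd_inputs(sent_a, sent_b):
    """Return (mass_a, mass_b, cost) for WMD: uniform mass, Euclidean cost."""
    mass_a = [1.0 / len(sent_a)] * len(sent_a)
    mass_b = [1.0 / len(sent_b)] * len(sent_b)
    cost = [[euclidean(u, v) for v in sent_b] for u in sent_a]
    return mass_a, mass_b, cost

# Toy 2-d word vectors (hypothetical): same direction, different norms.
short_vec = [1.0, 0.0]
long_vec = [3.0, 0.0]
mass_a, mass_b, cost = wmd_inputs([short_vec], [long_vec])
print(mass_a, mass_b, cost)  # cost is 2.0 although the angle between the vectors is 0
```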

Word Rotator’s Distance

The paper proposes a simple yet powerful sentence similarity measure using EMD. The proposed method considers each sentence as a discrete distribution on the unit hypersphere and calculates EMD on this hypersphere (Figure 5). Here, the alignment of the direction vectors corresponds to a rotation on the unit hypersphere; thus, the proposed method is referred to as word rotator’s distance (WRD). Formally, each sentence s is treated as a discrete distribution ν_s comprising direction vectors weighted by their norms (a bag-of-direction-vectors distribution).
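A minimal pure-Python sketch of this construction is given below. It assumes the transportation cost is the cosine distance 1 - cos between direction vectors, normalizes the norms so that each sentence's mass sums to 1, and solves the small transport problem with a textbook successive-shortest-path min-cost flow. The function names, the solver choice, and the toy vectors are illustrative assumptions rather than the authors' implementation; in practice an optimal transport library would replace `emd`.

```python
import math

def emd(mass_a, mass_b, cost, eps=1e-12):
    """Minimum transport cost between two small discrete distributions,
    solved as min-cost flow with successive shortest paths (Bellman-Ford)."""
    n, m = len(mass_a), len(mass_b)
    src, snk = n + m, n + m + 1
    graph = [[] for _ in range(n + m + 2)]  # edge = [to, capacity, cost, reverse index]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0.0, -c, len(graph[u]) - 1])

    for i in range(n):
        add_edge(src, i, mass_a[i], 0.0)
        for j in range(m):
            add_edge(i, n + j, math.inf, cost[i][j])
    for j in range(m):
        add_edge(n + j, snk, mass_b[j], 0.0)

    total = 0.0
    while True:
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[src] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, c, _rev) in enumerate(graph[u]):
                    if cap > eps and dist[u] + c < dist[v] - 1e-15:
                        dist[v], prev[v], changed = dist[u] + c, (u, k), True
            if not changed:
                break
        if prev[snk] is None:  # no remaining mass can reach the sink
            break
        # Collect the path, find its bottleneck, then push that much mass.
        path, v = [], snk
        while v != src:
            path.append(prev[v])
            v = prev[v][0]
        flow = min(graph[u][k][1] for u, k in path)
        for u, k in path:
            edge = graph[u][k]
            edge[1] -= flow
            graph[edge[0]][edge[3]][1] += flow
        total += flow * dist[snk]
    return total

def decouple(w):
    """Split a non-zero vector into (norm, unit direction vector)."""
    norm = math.sqrt(sum(x * x for x in w))
    return norm, [x / norm for x in w]

def word_rotators_distance(sent_a, sent_b):
    """WRD sketch: mass = normalized norm, cost = 1 - cosine of directions."""
    parts_a = [decouple(w) for w in sent_a]
    parts_b = [decouple(w) for w in sent_b]
    total_a = sum(norm for norm, _ in parts_a)
    total_b = sum(norm for norm, _ in parts_b)
    mass_a = [norm / total_a for norm, _ in parts_a]
    mass_b = [norm / total_b for norm, _ in parts_b]
    cost = [[max(0.0, 1.0 - sum(x * y for x, y in zip(u, v))) for _, v in parts_b]
            for _, u in parts_a]
    return emd(mass_a, mass_b, cost)

# Toy 2-d word vectors (hypothetical values, not real embeddings).
sent_a = [[2.0, 0.0], [0.0, 1.0]]
sent_b = [[3.0, 0.0], [0.0, 3.0]]
print(word_rotators_distance(sent_a, sent_b))  # about 0.1667 (1/6)
print(word_rotators_distance(sent_a, sent_a))  # 0.0 for identical sentences
```

In the first example the directions match exactly, so the only cost comes from the mismatch in importance: the first word of `sent_a` holds 2/3 of the mass while its counterpart holds 1/2, and the surplus 1/6 has to be rotated onto the orthogonal direction at cost 1.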


Experiment and Predictions


Conclusion 
In this paper, we first showed
(i) that the norm and angle of word vectors are good proxies for the importance of a word and the dissimilarity between words, respectively, and
(ii) that some previous alignment-based STS methods inappropriately “mix them up”. With these findings, we have proposed word rotator’s distance (WRD), which is a new unsupervised, EMD-based STS metric.

WRD was designed so that the norm and angle of word vectors correspond to the probability mass and transportation cost in EMD, respectively. In addition, we found that the latest powerful sentence vector estimation methods implicitly improve the norm and angle of word vectors and we can exploit this effect as a word vector converter (VC). In experiments on multiple STS tasks, the proposed methods outperformed not only alignment-based methods such as word mover’s distance, but also powerful addition-based sentence vectors.

