
client2vec: Generic Clients repository for Banking Applications

By Leonardo Baldassini, José Antonio Rodríguez Serrano
BBVA Data & Analytics


Abstract

client2vec is an internal library to rapidly build baselines for banking applications. It uses marginalized stacked denoising autoencoders on current-account transaction data to create vector embeddings which represent the behaviors of clients. These representations can then be used in, and optimized against, a variety of tasks such as client segmentation, profiling and targeting.



Most data analytics and commercial campaigns in retail banking revolve around the concept of behavioral similarity, for instance: studies and campaigns on client retention; product recommendations; web applications where clients can compare their expenses with those of similar people in order to better manage their own finances; data integrity tools. The analytic work behind each of these products normally requires the construction of a set of customer attributes and a model, both typically tailored to the problem of interest. The aim is to systematize this process in order to encourage model and code reuse, reduce project feasibility assessment times and promote homogeneous practices.  

Client2vec is a library to speed up the construction of informative baselines for behavior-centric banking applications. In particular, client2vec focuses on behaviors which can be extracted from account transaction data by encoding that information into vector form (a client embedding). These embeddings make it possible to quantify how similar two customers are and, when input into clustering or regression algorithms, outperform the sociodemographic customer attributes traditionally used for customer segmentation or marketing campaigns. The proposed solution has minimal computational and preprocessing requirements and can run even on simple infrastructures. Client2vec also offers data scientists the possibility to optimize the embeddings against the business problem at hand; for instance, the embedding may be tuned to maximize average precision for the task of retrieving suitable targets for a campaign.
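To make this retrieval framing concrete, here is a minimal, hypothetical sketch (not the library's actual API): clients are ranked by cosine similarity to the centroid of known campaign responders, and the ranking is scored with average precision. All data and variable names below are invented.

```python
import numpy as np
from sklearn.metrics import average_precision_score
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.RandomState(0)
E = rng.rand(200, 8)                    # toy client embeddings
responded = rng.rand(200) < 0.1         # hypothetical past campaign responses

# Rank every client by similarity to the centroid of known responders,
# then measure how well that ranking retrieves actual responders.
seed = E[responded].mean(axis=0, keepdims=True)
scores = cosine_similarity(E, seed).ravel()
ap = average_precision_score(responded, scores)
```

Tuning the embedding then simply means maximizing `ap` (or any other business metric) over the embedding's hyperparameters.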


Approach

client2vec follows an analogy with unsupervised word embeddings: account transactions can be seen as words, clients as documents (bags or sequences of words), and the behavior of a client as the summary of a document. Just like word or document embeddings, client embeddings should exhibit the fundamental property that neighboring points in the embedding space correspond to clients with similar behaviors.


First approach: extract vector representations of individual transactions and compose them into client embeddings, as is done with word embeddings to build phrase or document embeddings via averaging or more sophisticated techniques.

Second approach: embed clients directly.

We explored the former option by applying the famed word2vec algorithm to our data and then pooling the embeddings of individual transactions into client representations with a variety of methods. For the latter approach, which is the one currently employed by client2vec, we built client embeddings via a marginalized stacked denoising autoencoder (mSDA). For comparison and benchmarking purposes, we also tested the embedding comprising the raw transactional data of a client and the one produced by sociodemographic variables. Embeddings are then turned into actionable baselines by casting business problems as nearest neighbor regressions. This builds on successful works in computer vision which adopt the principle of the unreasonable effectiveness of data.
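For reference, the closed-form layer of a marginalized denoising autoencoder (mDA, the building block that mSDA stacks) can be sketched in a few lines of NumPy. This is an illustrative implementation of the published mSDA recipe, not client2vec's actual code; the function names and toy dimensions are our own.

```python
import numpy as np

def mda_layer(X, p=0.5):
    """One marginalized denoising autoencoder layer.

    X: (d, n) data matrix, one column per client.
    p: probability of zeroing each feature, marginalized out in closed form.
    Returns tanh(W @ X_with_bias), where W = P Q^{-1} reconstructs the
    clean input from its (infinitely many) corrupted versions.
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])       # append a constant bias feature
    S = Xb @ Xb.T                              # scatter matrix
    q = np.full(d + 1, 1.0 - p)
    q[-1] = 1.0                                # the bias is never corrupted
    Q = S * np.outer(q, q)                     # E[x_tilde x_tilde^T], off-diagonal
    np.fill_diagonal(Q, q * np.diag(S))        # diagonal survives with prob q_i
    P = S[:d, :] * q                           # E[x x_tilde^T]
    W = np.linalg.solve(Q + 1e-5 * np.eye(d + 1), P.T).T   # W = P Q^{-1}
    return np.tanh(W @ Xb)

def msda_embed(X, n_layers=2, p=0.5):
    """Stack mDA layers and concatenate their hidden outputs as the embedding."""
    hidden = [X]
    for _ in range(n_layers):
        hidden.append(mda_layer(hidden[-1], p))
    return np.vstack(hidden[1:])
```

Because the corruption is marginalized analytically, training reduces to solving one linear system per layer, which is what keeps the computational requirements low.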



Sociodemographic variables

The obvious fundamental benchmark against which we compared all methods is the set of sociodemographic variables: age, gender, income range, postcode, city and province. Such variables are typically used by banks, retailers and other organizations for purposes like segmentation or campaigns. All of these variables are categorical, even income, which has been binned into several ranges. As such, we one-hot encode them and then reduce the dimensionality of the resulting vector in order to measure the Euclidean distance between two sociodemographic representations.
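A hedged sketch of that pipeline with scikit-learn; the records, category values and reduced dimensionality below are invented for illustration:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import euclidean_distances

# Toy records: [age range, gender, income range, province] (invented values)
clients = np.array([
    ["30-40", "F", "20-30k", "Madrid"],
    ["30-40", "M", "20-30k", "Madrid"],
    ["60-70", "F", "50-60k", "Sevilla"],
])

onehot = OneHotEncoder().fit_transform(clients)   # sparse one-hot matrix
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(onehot)
D = euclidean_distances(Z)   # pairwise distances between reduced representations
```

As expected, client 0 (who shares three of four attributes with client 1) ends up closer to client 1 than to client 2.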

Raw transactions
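The post leaves this section empty. Judging from the Model selection paragraph, the raw-transaction benchmark represents each client directly as a vector of transaction counts per category, preprocessed in one of three ways. A minimal sketch of those options (the function name and toy data are invented):

```python
import numpy as np

def preprocess_counts(X, mode="l2"):
    """Preprocess a (n_clients, n_categories) matrix of transaction counts.

    The three modes mirror the only preprocessing choices mentioned for
    the raw-transaction embedding: L2-normalize, log-normalize, binarize.
    """
    X = np.asarray(X, dtype=float)
    if mode == "l2":
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        return X / np.maximum(norms, 1e-12)    # unit-length rows
    if mode == "log":
        return np.log1p(X)                     # squash heavy-tailed counts
    if mode == "binary":
        return (X > 0).astype(float)           # did the client transact at all?
    raise ValueError(f"unknown mode: {mode}")
```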


Embedding via word2vec

Word2vec is a family of embeddings of words in documents, which express each word token with a dense vector. These vectors result from the intermediate encoding of a 2-layer network trained to reconstruct the linguistic context of each token, and they exhibit strong semantic properties: e.g., two nearby vectors refer to words that may share the same topic or even be synonyms.

Model selection

We treat the preprocessing options for mSDAs as hyperparameters to optimize at train time. Likewise, the hyperparameters for the word2vec benchmark are the word-embedding dimension and the context window size [28], while for the raw transaction embeddings we only choose whether to L2-normalize, log-normalize or binarize. The optimization is carried out separately for each use case we consider.
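The selection loop for the raw-transaction options can be sketched as a small grid search; here a candidate embedding is scored by how often a client's nearest neighbors share its label. The data and label are synthetic, and in practice the score would be the business metric for the use case at hand, such as campaign average precision:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
X = rng.poisson(2.0, size=(100, 12)).astype(float)   # toy transaction counts
y = (X[:, 0] > 2).astype(int)                        # hypothetical binary label

def embed(X, mode):
    """The three raw-transaction preprocessing candidates."""
    if mode == "l2":
        return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    if mode == "log":
        return np.log1p(X)
    return (X > 0).astype(float)                     # "binary"

def knn_label_agreement(E, y, k=5):
    """Average fraction of a point's k nearest neighbors sharing its label."""
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(E).kneighbors(E)
    return float(np.mean(y[idx[:, 1:]] == y[:, None]))   # drop self at column 0

scores = {m: knn_label_agreement(embed(X, m), y) for m in ("l2", "log", "binary")}
best = max(scores, key=scores.get)                   # keep the winning option
```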


Results


Conclusions

client2vec is an attempt to develop an internal tool that could catalyze data-driven decision making at BBVA. The authors describe how they worked towards a solution that is simple to use, fast to deploy and integrate into colleagues' processes, and that requires minimal preprocessing. Along the way, they learned that composing transactional embeddings extracted with word2vec into customer embeddings does not always offer acceptable performance, while mSDAs capture a good deal of behavioral information. Furthermore, they highlight that this information can be extracted even from simple, coarse transactional data. They plan to keep expanding the client2vec library by adding new representations as new use cases arise, as well as by proactively exploring algorithms that fit its philosophy of simplicity, such as the nonlinear extension of mSDA or metric learning to further boost the performance of mSDA embeddings in client targeting.

