By Leonardo Baldassini, José Antonio Rodríguez Serrano
BBVA Data & Analytics
Abstract
We present client2vec, an internal library to rapidly build baselines for banking applications. Client2vec uses marginalized stacked denoising
autoencoders on current account transaction data to create vector embeddings which represent the behaviors of our
clients. These representations can then be used in, and optimized against, a variety of tasks such as client segmentation,
profiling and targeting.
Most data analytics and commercial campaigns in retail
banking revolve around the concept of behavioral similarity, for instance: studies and campaigns on client retention; product recommendations; web applications where clients can compare their expenses with those of similar
people in order to better manage their own finances; and data integrity tools.
The analytic work behind each of these products normally
requires the construction of a set of customer attributes and
a model, both typically tailored to the problem of interest. Our aim is to systematize this process in order to encourage
model and code reuse, reduce project feasibility assessment
times and promote homogeneous practices.
To this end we built client2vec, a library to speed
up the construction of informative baselines for behavior-centric banking applications. In particular, client2vec focuses on behaviors which can be extracted from account
transaction data by encoding that information into vector form (client embedding). These embeddings make it possible to quantify how similar two customers are and, when input into clustering or regression algorithms, outperform the
sociodemographic customer attributes traditionally used for
customer segmentation or marketing campaigns. The proposed solution has minimal computational and preprocessing requirements and could run even on simple infrastructures. Client2vec offers our
data scientists the possibility to optimize the embeddings
against the business problem at hand. For instance, the embedding may be tuned to optimize the average precision for
the task of retrieving suitable targets for a campaign.
Approach
We designed client2vec following an analogy with unsupervised
word embeddings, whereby account transactions
can be seen as words, clients as documents (bags or sequences
of words) and the behavior of a client as the summary of a
document. Just like word or document embeddings, client
embeddings should exhibit the fundamental property that
neighboring points in the space of embeddings correspond
to clients with similar behaviors.
First approach: extract
vector representations of transactions and compose them
into client embeddings, as done with word embeddings to
extract phrase or document embeddings via averaging or
more sophisticated techniques.
Second approach: embed
clients straight away.
We explored the former option by applying the famed word2vec algorithm to our data and
then pooling the embeddings of individual transactions into
client representations with a variety of methods. For the
latter approach, which is the one currently employed by
client2vec, we built client embeddings via a marginalized stacked denoising autoencoder (mSDA). For comparison
and benchmarking purposes, we also tested the embedding
comprising the raw transactional data of a client and the
one produced by sociodemographic variables.
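To make the mSDA approach concrete, a single marginalized denoising autoencoder layer, the building block of mSDA, admits a closed-form solution. The sketch below follows that formulation under illustrative assumptions: the data matrix, noise level and regularizer are synthetic and are not the settings used by client2vec.

```python
import numpy as np

def mda_layer(X, p=0.5, reg=1e-5):
    """One marginalized denoising autoencoder layer (closed form).

    X   : (d, n) data matrix, one column per client.
    p   : probability of corrupting (zeroing) each feature.
    reg : ridge term keeping the inverse well conditioned.
    Returns the mapping W and the hidden representation tanh(W X).
    """
    d = X.shape[0]
    S = X @ X.T                           # scatter matrix
    q = np.full(d, 1.0 - p)               # per-feature survival probability
    EQ = S * np.outer(q, q)               # expected corrupted scatter (off-diagonal)
    np.fill_diagonal(EQ, q * np.diag(S))  # a feature co-occurs with itself w.p. q_i
    EP = S * q[np.newaxis, :]             # expected clean-vs-corrupted cross-scatter
    W = EP @ np.linalg.inv(EQ + reg * np.eye(d))
    return W, np.tanh(W @ X)

def msda(X, n_layers=3, p=0.5):
    """Stack layers, feeding each output into the next, and
    concatenate the layer outputs into the final embedding."""
    hs, h = [], X
    for _ in range(n_layers):
        _, h = mda_layer(h, p=p)
        hs.append(h)
    return np.vstack(hs)

rng = np.random.default_rng(0)
X = rng.random((20, 100))                 # 20 features, 100 synthetic clients
emb = msda(X, n_layers=2)
print(emb.shape)                          # 2 layers x 20 features per client
```

Because the corruption is marginalized out analytically, training reduces to a few matrix products, which is what keeps the computational requirements minimal.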
Embeddings are then turned into actionable baselines by
casting business problems as nearest neighbor regressions.
This builds on successful works in computer vision which adopt the principle of the unreasonable effectiveness
of data.
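Casting a business problem as a nearest-neighbor regression over embeddings can be sketched as follows; the target variable (a synthetic propensity score standing in for, e.g., campaign response) and the scikit-learn setup are illustrative assumptions:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
emb = rng.random((500, 32))      # 500 clients, 32-dim embeddings
y = rng.random(500)              # synthetic propensity scores

# Predict a client's score as the average over its nearest neighbors:
# behaviorally similar clients are assumed to behave similarly.
knn = KNeighborsRegressor(n_neighbors=10).fit(emb, y)
pred = knn.predict(emb[:5])
print(pred.shape)
```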
Sociodemographic variables
The fundamental benchmark against which we compared
all methods is the set of sociodemographic variables: age, gender, income range, postcode, city and province. Such variables are
typically considered by banks, retailers and other organizations for purposes like segmentation or campaigns. All of
these variables are categorical, even income, which has been
binned into several ranges. As such, we one-hot encode them
and then reduce the dimensionality of the resulting vector in order to measure the Euclidean distance between
two sociodemographic representations.
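This pipeline can be sketched with scikit-learn; the records below are synthetic and the columns are illustrative, not the actual variables used:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import TruncatedSVD

# Synthetic sociodemographic records: age range, gender, income range, postcode.
records = np.array([
    ["30-40", "F", "20k-30k", "28001"],
    ["30-40", "M", "20k-30k", "28002"],
    ["50-60", "M", "40k-50k", "08001"],
])

# One-hot encode the categorical variables, then reduce dimensionality
# so that Euclidean distances between clients are meaningful.
onehot = OneHotEncoder().fit_transform(records).toarray()
lowdim = TruncatedSVD(n_components=2).fit_transform(onehot)

dist = np.linalg.norm(lowdim[0] - lowdim[1])  # clients 0 and 1
print(lowdim.shape)
```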
Raw transactions
Embedding via word2vec
Word2vec is a family of embeddings of words in documents, which express each word token with a dense vector.
These vectors result from the intermediate encoding of a 2-
layer network trained to reconstruct the linguistic context of
each token and exhibit strong semantic properties, e.g. two
nearby vectors refer to words that may share the same topic
or even be synonyms.
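Under this analogy, a word2vec model trained on clients' transaction sequences yields one vector per transaction type, which must then be pooled into a client embedding. The sketch below assumes such per-transaction vectors have already been learned (the category codes and 4-dimensional vectors are synthetic placeholders) and shows the simplest pooling, averaging:

```python
import numpy as np

# Assume word2vec has already produced a vector per transaction code
# (synthetic 4-dim vectors here; real ones would be learned from data).
txn_vec = {
    "grocery":    np.array([0.9, 0.1, 0.0, 0.2]),
    "fuel":       np.array([0.1, 0.8, 0.1, 0.0]),
    "restaurant": np.array([0.2, 0.0, 0.9, 0.3]),
}

def client_embedding(transactions):
    """Mean-pool transaction vectors into a client representation."""
    return np.mean([txn_vec[t] for t in transactions], axis=0)

emb = client_embedding(["grocery", "grocery", "fuel"])
print(emb)
```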
Model selection
We treat the preprocessing options for mSDAs listed above
as hyperparameters to optimize at train time. Likewise,
the hyperparameters for the word2vec benchmark are the word-embedding dimension and the context window size [28],
while for the raw transaction embeddings we only choose
whether to L2-normalize, log-normalize or binarize. The
optimization is carried out separately for each use case we
consider.
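Selecting among the raw-transaction preprocessing options can be sketched as a small validation loop; the candidate transforms match those named above, but the data and the scoring function are illustrative placeholders (in practice the score would be a task-specific metric such as average precision on a retrieval task):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.poisson(3.0, size=(100, 10)).astype(float)  # synthetic transaction counts

transforms = {
    "l2":     lambda X: X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12),
    "log":    lambda X: np.log1p(X),
    "binary": lambda X: (X > 0).astype(float),
}

def validation_score(emb):
    # Placeholder for the use-case metric optimized in practice.
    return float(emb.mean())

best = max(transforms, key=lambda name: validation_score(transforms[name](X)))
print(best)
```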
Results
Conclusions
We presented client2vec, an attempt to
develop an internal tool that could catalyze data-driven decision making at BBVA. We described how we worked towards a solution that is simple to use, fast to deploy and
integrate into colleagues' processes, and that requires minimal preprocessing. Along the way, we learned that composing transactional embeddings extracted with word2vec into
customer embeddings does not always offer an acceptable performance, while mSDAs help us capture a good deal of behavioral information. Furthermore, we highlighted how this
information can be extracted even from simple, coarse transactional data. We plan to keep expanding the client2vec library by adding new representations as new use cases arise,
as well as by proactively exploring algorithms that fit its
philosophy of simplicity, such as the nonlinear extension of
mSDA or metric learning to further boost
the performance of mSDA embeddings in client targeting.