
Explaining Explanations: An Overview of Interpretability of ML

-By Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter and Lalana Kagal

Massachusetts Institute of Technology, Cambridge, MA 02139







Explainable AI (XAI), interpretable AI, or transparent AI refers to techniques in artificial intelligence (AI) whose decisions can be trusted and easily understood by humans. It contrasts with the notion of the "black box" in machine learning, where even a model's designers cannot explain why it arrived at a specific decision. XAI can be used to implement a social right to explanation. Some claim that transparency rarely comes for free and that there are often tradeoffs between how "smart" an AI is and how transparent it is; these tradeoffs are expected to grow larger as AI systems increase in internal complexity. The technical challenge of explaining AI decisions is sometimes known as the interpretability problem.


Some existing deployed systems and regulations make the need for explanatory systems urgent and timely. With impending regulations like the European Union's "Right to Explanation", calls for diversity and inclusion in AI systems, findings that some automated systems may reinforce inequality and bias, and requirements for safe and secure AI in safety-critical tasks, there has been a recent explosion of interest in interpreting the representations and decisions of black-box models. These models are everywhere, and the development of interpretable and explainable models is scattered throughout various disciplines. Examples of general "explainable systems" include interpretable AI, explainable ML, causality, safe AI, computational social science, and automatic scientific discovery.

Explanation


Some argue that a good explanation depends on the question being asked, in particular on "why" questions: an explanation is most useful when what you want to know from an algorithm can be phrased as a why question. There are two why-questions of interest: why and why-should. Similarly to the explainable-planning literature, philosophers also consider the why-shouldn't and why-should questions, which can yield the kinds of explainability requirements we want.

Interpretability vs. Completeness

An explanation can be evaluated in two ways: according to its interpretability, and according to its completeness. The goal of interpretability is to describe the internals of a system in a way that is understandable to humans. The success of this goal is tied to the cognition, knowledge, and biases of the user.

The goal of completeness is to describe the operation of a system in an accurate way. An explanation is more complete when it allows the behavior of the system to be anticipated in more situations.

Explainability of Deep Networks

An explanation of processing answers "Why does this particular input lead to that particular output?" and is analogous to explaining the execution trace of a program. An explanation of representation answers "What information does the network contain?" and can be compared to explaining the internal data structures of a program.


Explanations of Deep Network Processing


Commonly used deep networks derive their decisions using a very large number of elementary operations, far too many to follow individually; the approaches below summarize that processing in simpler, more interpretable terms.

1) Linear Proxy Models

The proxy model approach is exemplified well by the LIME method. With LIME, a black-box system is explained by probing behavior on perturbations of an input, and then that data is used to construct a local linear model that serves as a simplified proxy for the full model in the neighborhood of the input.
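As a rough illustration of the idea (a minimal sketch, not the lime library itself), the snippet below perturbs a tabular input around a point of interest, weights the perturbations by proximity, and fits a weighted ridge regression as the local proxy. The `predict_fn` black box, the Gaussian perturbation scheme, and all parameter values are assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_linear_proxy(predict_fn, x, n_samples=1000, scale=0.3, kernel_width=0.75):
    """Fit a local linear proxy around input x for a black-box predict_fn.

    predict_fn: maps an (n, d) array to an (n,) array of scores/probabilities.
    Returns per-feature weights of the local surrogate model.
    """
    rng = np.random.default_rng(0)
    # Perturb the input with Gaussian noise (LIME uses domain-specific perturbations).
    Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    y = predict_fn(Z)
    # Weight samples by proximity to x (exponential kernel on Euclidean distance).
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / (kernel_width ** 2))
    # Weighted ridge regression gives an interpretable local linear model.
    proxy = Ridge(alpha=1.0)
    proxy.fit(Z, y, sample_weight=w)
    return proxy.coef_

# Example with a hypothetical toy black box:
# black_box = lambda Z: 1 / (1 + np.exp(-(2 * Z[:, 0] - Z[:, 1])))
# print(local_linear_proxy(black_box, np.array([0.5, -1.0])))
```

The returned coefficients indicate which features most influence the black box near this particular input, which is exactly the kind of local, interpretable summary the proxy-model approach aims for.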

2) Decision Trees

Decision-tree extraction originally focused on shallow networks and has since been generalized to deep neural networks. A representative example is DeepRED, which extends the CRED algorithm (designed for shallow networks) to arbitrarily many hidden layers. DeepRED uses several strategies to simplify its decision trees: it uses RxREN to prune unnecessary inputs, and it applies the C4.5 algorithm, a statistical method for creating a parsimonious decision tree.
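DeepRED itself works layer by layer; as a much simpler sketch of the same surrogate idea, the snippet below distills a black-box classifier's predictions into a shallow scikit-learn decision tree (CART rather than C4.5) and reports how faithfully the tree imitates the network. The `predict_fn` and the depth limit are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def distill_to_tree(predict_fn, X, max_depth=4, feature_names=None):
    """Fit a shallow decision tree that mimics a black-box classifier on data X.

    This is a global surrogate in the spirit of tree/rule extraction; DeepRED's
    layer-wise decomposition is not attempted here.
    """
    y_black_box = predict_fn(X)          # labels assigned by the network
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(X, y_black_box)             # the tree learns to imitate the network
    fidelity = (tree.predict(X) == y_black_box).mean()
    print(f"fidelity to black box: {fidelity:.2%}")
    print(export_text(tree, feature_names=feature_names))
    return tree
```

The printed rules give a human-readable approximation of the decision boundary; the fidelity score indicates how much of the network's behavior the simplified tree actually captures.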

3) Automatic-Rule Extraction 

Automatic rule extraction is another well-studied approach for summarizing decisions. Andrews outlines existing rule extraction techniques and provides a useful taxonomy of five dimensions of rule-extraction methods, including their expressive power, translucency, and the quality of the rules.

4) Salience Mapping

The salience-map approach is exemplified by the occlusion procedure of Zeiler, in which a network is repeatedly tested with portions of the input occluded to create a map showing which parts of the data actually influence the network's output.
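A minimal sketch of that occlusion procedure is below, assuming a `score_fn` that returns the class score of interest for a single image; the patch size, stride, and fill value are illustrative choices, not the original paper's settings.

```python
import numpy as np

def occlusion_map(score_fn, image, patch=8, stride=8, fill=0.0):
    """Occlusion-style salience map: slide a blanked-out patch over the image
    and record how much the class score drops at each position.

    score_fn: maps an (H, W, C) image to a scalar score for the class of interest.
    """
    H, W = image.shape[:2]
    base = score_fn(image)
    heat = np.zeros(((H - patch) // stride + 1, (W - patch) // stride + 1))
    for i, y in enumerate(range(0, H - patch + 1, stride)):
        for j, x in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill   # blank out one patch
            heat[i, j] = base - score_fn(occluded)      # large drop => influential region
    return heat
```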

Representations


1) Role of Layers

Layers can be understood by testing their ability to help solve problems different from the ones the network was originally trained on (transfer tasks).
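One common way to run such a test is a linear probe: extract a layer's activations for a labeled transfer task and see how well a simple classifier does on them. The sketch below assumes the activations have already been extracted into a feature matrix; the cross-validation setup is an illustrative choice.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_layer(features, labels):
    """Linear-probe sketch: how well do one layer's features solve a transfer task?

    features: (n, d) activations extracted from a single layer for n examples.
    labels:   (n,) labels for a task the network was NOT trained on.
    Returns mean cross-validated accuracy of a linear classifier on the features.
    """
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, features, labels, cv=5).mean()

# Comparing probe accuracy layer by layer indicates which layers carry
# information relevant to the transfer task.
```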

2) Role of Individual Units

The information within a layer can be further subdivided into individual neurons or individual convolutional filters. The role of such individual units can be understood qualitatively, by creating visualizations of the input patterns that maximize the response of a single unit, or quantitatively, by testing the ability of a unit to solve a transfer problem.
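For the qualitative side, a simple starting point is to find the inputs in a probe dataset that most strongly activate a given unit; inspecting those inputs (or their receptive fields) suggests what the unit responds to. The sketch below assumes unit responses have already been pooled into an activation matrix.

```python
import numpy as np

def top_activating_inputs(activations, unit, k=9):
    """Return indices of the k inputs that most strongly activate one unit.

    activations: (n_inputs, n_units) array, e.g. spatially max-pooled responses
    of a convolutional layer over a probe dataset.
    """
    scores = activations[:, unit]
    return np.argsort(scores)[::-1][:k]

# Quantitative variants instead measure how well the unit's responses
# solve a labeled transfer problem, as described above.
```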

3) Role of Representation Vectors

Closely related to the approach of characterizing individual units is characterizing other directions in the representation vector space formed by linear combinations of individual units. Concept Activation Vectors (CAVs) are a framework for interpreting a neural net's representations by identifying and probing directions that align with human-interpretable concepts.
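A minimal sketch of the CAV idea (not the full TCAV method or library) is shown below. It assumes you have already extracted one layer's activations for examples that show a concept and for random counterexamples; the normal vector of a linear classifier separating the two sets serves as the concept direction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(acts_concept, acts_random):
    """Compute a concept direction in a layer's activation space.

    acts_concept: (n1, d) activations for examples exhibiting the concept.
    acts_random:  (n2, d) activations for random counterexamples.
    """
    X = np.vstack([acts_concept, acts_random])
    y = np.r_[np.ones(len(acts_concept)), np.zeros(len(acts_random))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0]
    return cav / np.linalg.norm(cav)   # unit-length concept direction

# TCAV then measures how sensitive a class score is to movement along this
# direction (a directional derivative); one could approximate that by scoring
# activations nudged by a small step along the returned vector.
```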


Explanation-Producing Systems


Networks can be trained to use explicit attention as part of their architecture; they can be trained to learn disentangled representations; or they can be directly trained to create generative explanations.

a) Attention Networks

Attention-based networks learn functions that provide a weighting over inputs or internal features to steer the information visible to other parts of the network. Attention-based approaches have shown remarkable success in solving problems such as allowing natural language translation models to process words in an appropriate non-sequential order, and they have also been applied in domains such as fine-grained image classification and visual question answering.
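One widely used form of this weighting is scaled dot-product attention; the sketch below (a generic NumPy illustration, not any specific model from the papers above) shows that the softmax weights are the part a user can inspect as an explanation of which inputs the model attended to.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention.

    Q: (n_q, d) queries, K: (n_k, d) keys, V: (n_k, d_v) values.
    Returns the attended output and the attention weights, which can be
    visualized to see which inputs each query attended to.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V, weights
```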

b) Disentangled Representations 

Disentangled representations have individual dimensions that describe meaningful and independent factors of variation. The problem of separating latent factors is an old problem that has previously been attacked using a variety of techniques such as Principal Component Analysis, Independent Component Analysis, and Nonnegative Matrix Factorization. Deep networks can be trained to explicitly learn disentangled representations. One approach that shows promise is Variational Autoencoding, which trains a network to optimize a model to match the input probability distribution according to information-theoretic measures. 
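For the classical baselines named above, the decompositions are available directly in scikit-learn; the short example below runs PCA, ICA, and NMF on a hypothetical non-negative data matrix purely to show how each recovers a small set of candidate factors.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA, NMF

# Hypothetical (n_samples, n_features) non-negative data matrix for illustration.
rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(200, 10)))

pca_factors = PCA(n_components=3).fit_transform(X)                     # orthogonal directions of variance
ica_factors = FastICA(n_components=3, random_state=0).fit_transform(X) # statistically independent factors
nmf_factors = NMF(n_components=3, random_state=0, max_iter=500).fit_transform(X)  # additive, parts-based factors

# Deep approaches such as variational autoencoders learn nonlinear versions of
# this idea, trading reconstruction quality against independence of the latents.
```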

c) Generated Explanations

Finally, deep networks can also be designed to generate their own human-understandable explanations as part of the explicit training of the system. Explanation generation has been demonstrated as part of systems for visual question answering as well as in fine-grained image classification. In addition to solving their primary task, these systems synthesize a "because" sentence that explains the decision in natural language.

Conclusion


We find, though, that the various approaches taken to address different facets of explainability are siloed. Work in the explainability space tends to advance a particular category of technique, with comparatively little attention given to approaches that merge different categories of techniques to achieve more effective explanation. Given the purpose and type of explanation, it is not obvious what the best explanation metric is or should be. We encourage the use of diverse metrics that align with the purpose and completeness of the targeted explanation. Our view is that, as the community learns to advance its work collaboratively by combining ideas from different fields, the overall state of system explanation will improve dramatically.

