Abstract:
Convolutional Neural Networks (CNNs) have achieved state-of-the-art image classification
results. The research sub-field of Explainable AI (XAI) aims to unravel the working
mechanisms of these accurate yet opaque black boxes in order to enhance users’ trust and detect
spurious correlations, thereby enabling the pervasive adoption of AI systems. Studies
show that humans process images in terms of sub-regions called concepts; for instance, a
peacock is identified by its green feathers, blue neck, and so on. Explanations in terms of such
concepts have therefore proven helpful for humans to better understand the workings of CNNs.
Existing approaches leverage an external repository of concept examples to extract the
concept representations learned by the CNNs. However, owing to distributional differences
that may exist between the external repository and the data on which the CNN is trained,
the faithfulness of these explanations, i.e., whether the extracted representations truly reflect
the learned representations, is not guaranteed. To circumvent this challenge, the thesis
proposes three novel frameworks that automatically extract the concepts from the data.
The first framework, PACE, automatically extracts class-specific concepts relevant to the
black-box prediction. It tightly integrates the faithfulness of the explanatory framework
into the black-box model. It generates explanations for two different CNN architectures
trained to classify the AWA2 and Imagenet-Birds datasets. Extensive human subject
experiments are conducted to validate the human interpretability and consistency of the
extracted explanations.
While class-specific concepts unravel the blueprint of a class from the CNN’s perspective,
concepts are often shared across classes; for instance, gorillas and chimpanzees naturally
share many characteristics as they belong to the same family. The second framework,
SCE, unravels the sharing of concepts across related classes from the CNN’s perspective. After
training the explainer, the relevance of each extracted concept to the prediction, along with
the primitive image aspects it encodes, such as color, texture, and shape, is estimated. This
sheds light on the concepts by which different black-box architectures trained on the
Imagenet dataset group and distinguish related classes.
The secondary focus of the thesis is to extend the fruits of explainability to allied
learning paradigms that contribute to state-of-the-art image classification successes. Domain
adaptation techniques, which leverage knowledge from an auxiliary source domain for
learning in a target domain with scarce labeled data, increase accuracy. However, the
adaptation process remains opaque, particularly regarding the knowledge leveraged from the source domain.
The third framework, XSDA-Net, uses a case-based reasoning mechanism to explain the
prediction for a test instance in terms of similar-looking regions in the source and target
training images. The utility of the proposed framework is theoretically and empirically
demonstrated by curating domain adaptation settings on datasets popularly known
to exhibit part-based explainability. Ablation analyses show the importance of each
component of the learning objective.
This thesis also provides a comprehensive overview of the XAI field, summarizing the
state-of-the-art contributions to the different types of explanations. The underlying
principles, limitations, and improvements made to these seminal contributions are also
highlighted. Furthermore, the thesis presents future research directions and
unexplored avenues in XAI research.