Abstract:
Human learning revolves around experiences. The gradual acquisition of knowledge for a new task involves drawing on similar past experiences, and the capability to transfer prior knowledge to generalize to previously unseen situations drives this continuous learning process. Motivated by human learning, the machine learning paradigm known as "Transfer Learning" leverages knowledge from other domains to learn new tasks. Specifically, transfer learning deals with exploiting information about the task, and data, from one or more auxiliary domains. Hailed as machine learning's (ML) next frontier by several eminent researchers in the ML community, it has found use in a wide range of applications with varying degrees of success.
This thesis proposes three novel transfer learning frameworks that utilize data from a single auxiliary domain to assist the target classification task. The core idea of the proposed frameworks is to leverage label relationships to make practical and effective use of data from a heterogeneous domain, where the heterogeneity may lie in the underlying data distributions, the feature spaces, or the label spaces. The first framework, Multi-Partition Feature Alignment Network, tackles the scenario in which the domains' underlying data distributions differ. This deep-learning-based framework combines unsupervised adversarial adaptation with clustering for distribution alignment. It achieves fine-grained class-wise feature alignment by bringing a refined pseudo-labeled target partition closer to the nearest source partition with the same label. Experimental results on visual domain adaptation tasks over standard benchmark datasets, namely digit classification and object recognition, validate the effectiveness of the proposed method.
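To make the class-wise alignment intuition concrete, the following minimal NumPy sketch (with hypothetical function and variable names) pseudo-labels target features by their nearest source-class centroid and scores how far each pseudo-labeled target partition lies from the matching source partition. It only illustrates the centroid-matching idea; the actual framework embeds this in a deep network with adversarial adaptation and pseudo-label refinement.

```python
import numpy as np

def classwise_alignment_loss(src_feats, src_labels, tgt_feats, num_classes):
    """Toy class-wise alignment score (assumes every class appears in the source set)."""
    # Source class centroids, one per label.
    src_centroids = np.stack([src_feats[src_labels == c].mean(axis=0)
                              for c in range(num_classes)])
    # Pseudo-label each target sample with its nearest source centroid.
    dists = np.linalg.norm(tgt_feats[:, None, :] - src_centroids[None, :, :], axis=2)
    pseudo_labels = dists.argmin(axis=1)
    # Alignment score: distance between matched class centroids across domains.
    loss, matched = 0.0, 0
    for c in range(num_classes):
        mask = pseudo_labels == c
        if mask.any():
            tgt_centroid = tgt_feats[mask].mean(axis=0)
            loss += np.linalg.norm(tgt_centroid - src_centroids[c])
            matched += 1
    return loss / max(matched, 1)
```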
The second framework, Supervised Heterogeneous Domain Adaptation via Random Forests, exploits a shared label space to transfer labeled knowledge between heterogeneous feature spaces. This Random-Forest-based framework uses the common label space to extract one-to-one correspondences across the domains, and the generated correspondences are then used to learn a cross-domain feature mapping that links the heterogeneous feature spaces. The performance of the proposed framework is benchmarked against several baselines and state-of-the-art approaches on synthetic and real-world heterogeneous transfer tasks, namely cross-view image and text classification, cross-lingual sentiment and text classification, and cross-domain activity recognition.
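The sketch below illustrates the general recipe of label-driven correspondences followed by a cross-domain mapping, using scikit-learn. The pairing rule shown here (ranking same-class samples by each domain's forest confidence) and the Ridge mapping are simplifying assumptions for illustration, not the thesis' exact Random-Forest-based procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

def learn_cross_domain_mapping(Xs, ys, Xt, yt, n_pairs_per_class=20, seed=0):
    """Illustrative correspondence-based mapping between heterogeneous feature spaces."""
    rf_s = RandomForestClassifier(n_estimators=100, random_state=seed).fit(Xs, ys)
    rf_t = RandomForestClassifier(n_estimators=100, random_state=seed).fit(Xt, yt)
    pairs_s, pairs_t = [], []
    for c in np.unique(ys):
        # Pick the most confidently classified samples of class c in each domain.
        idx_s = np.where(ys == c)[0]
        idx_t = np.where(yt == c)[0]
        conf_s = rf_s.predict_proba(Xs[idx_s])[:, list(rf_s.classes_).index(c)]
        conf_t = rf_t.predict_proba(Xt[idx_t])[:, list(rf_t.classes_).index(c)]
        k = min(n_pairs_per_class, len(idx_s), len(idx_t))
        pairs_s.append(Xs[idx_s[np.argsort(-conf_s)[:k]]])
        pairs_t.append(Xt[idx_t[np.argsort(-conf_t)[:k]]])
    Ps, Pt = np.vstack(pairs_s), np.vstack(pairs_t)
    # Linear cross-domain feature mapping: target feature space -> source feature space.
    return Ridge(alpha=1.0).fit(Pt, Ps)
```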
The third framework, Web-Induced Heterogeneous Transfer Learning with Sample Selection, bridges the domains in the most general case, when the label spaces are also heterogeneous. The framework is formulated as a feature-transfer optimization problem that learns a sparse linear transformation mapping data from an auxiliary domain into the target feature space. Assuming some semantic relationships within and across the label spaces, it leverages web distance to introduce semantic co-alignment in the target space. Experimental results on cross-lingual text transfer, cross-domain activity recognition, and deep representation transfer tasks indicate the superiority of the proposed framework over state-of-the-art transfer approaches.
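As a rough illustration of a sparse linear feature transfer with semantic co-alignment, the sketch below maps auxiliary-class prototypes toward semantically related target-class prototypes with an L1-regularized linear model. The label-similarity matrix is a hypothetical stand-in for a web-distance-derived similarity, and the whole formulation is a simplified assumption rather than the optimization problem proposed in the thesis.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_feature_transform(aux_protos, tgt_protos, label_sim, alpha=0.01):
    """Toy sparse feature transfer across heterogeneous label spaces.

    aux_protos : (n_aux_classes, d_aux)  auxiliary-class prototypes
    tgt_protos : (n_tgt_classes, d_tgt)  target-class prototypes
    label_sim  : (n_aux_classes, n_tgt_classes) cross-label semantic similarities
    """
    # Soft-assign each auxiliary class to target classes by semantic similarity.
    weights = label_sim / label_sim.sum(axis=1, keepdims=True)
    targets = weights @ tgt_protos          # semantically co-aligned targets
    # Sparse (L1-regularized) linear transformation: auxiliary space -> target space.
    model = Lasso(alpha=alpha, max_iter=10000).fit(aux_protos, targets)
    return model.coef_                      # shape (d_tgt, d_aux)
```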
In addition, this thesis provides a comprehensive account of the transfer learning paradigm and summarizes state-of-the-art contributions across the different transfer learning scenarios. The underlying principles and limitations of these seminal contributions, along with the improvements made to them, are also highlighted. Finally, the thesis presents future research directions and unexplored avenues within the transfer learning paradigm.