Title | : | Domain Adaptation for Object Categorization |
Speaker | : | Suranjana Samanta (IITM) |
Details | : | Tue, 12 Nov, 2013 3:00 PM @ BSB 361 |
Abstract | : | A basic assumption in many machine learning algorithms is that the training and test samples are drawn from identical distributions. However, in many cases, particularly with real-world datasets (images, speech, biometrics, text, etc.), this assumption is violated. In certain applications, only a limited number of labeled training samples are available in the target domain for training a classifier that will be applied to target-domain test samples. At the same time, a large number of labeled samples with a different distribution are available from an auxiliary domain, termed the source domain. Domain adaptation (DA) is a type of transfer learning in which training samples from the source domain are used to aid a statistical learning task on test samples from the target domain, where the data distributions of the two domains differ. The literature generally distinguishes two types of domain adaptation techniques, depending on the training samples available from the target domain: (a) supervised, where a sparse set of labeled samples is available, and (b) unsupervised, where plenty of unlabeled samples are available. This presentation describes a few methods designed for supervised and unsupervised DA. For supervised DA, we form groups (clusters) of data, each following a Gaussian distribution. These clusters are individually transformed such that the divergence between the distributions of the transformed source domain and the target domain is minimized. The transformations are formulated using the eigenvalues and eigenvectors in one case, and the covariance matrices of both domains in the other. For unsupervised DA, we minimize the disparity between the distributions of the target and transformed source domains while preserving the spatial arrangement of the data in the source domain. Two optimization methods (iterative and convex) have been proposed to estimate the transformed source-domain data efficiently. 
A comparative study of different methods for cross-domain clustering of data, where clusters formed in one domain influence the clusters formed in the other domain and vice versa, will be briefly presented. Results on various real-world benchmark datasets show the efficiency of our proposed methods. |
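
As a rough illustration of what a covariance-based transformation of source data toward a target domain can look like, the sketch below uses a generic whitening and re-coloring alignment. This is not the formulation presented in the talk, which the abstract does not specify; the function names (`sym_matrix_power`, `align_source_to_target`), the regularization constant, and the synthetic data are all assumptions made for illustration. It treats the whole source set as a single Gaussian cluster with at least two feature dimensions.

```python
import numpy as np


def sym_matrix_power(mat, power, eps=1e-8):
    """Raise a symmetric positive semi-definite matrix to a real power
    through its eigen-decomposition (hypothetical helper)."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, eps, None)  # guard against tiny or negative eigenvalues
    return (vecs * vals ** power) @ vecs.T


def align_source_to_target(source, target, reg=1e-6):
    """Map source samples so that their mean and covariance approximately
    match those of the target samples (generic sketch, not the talk's method).

    source: array of shape (n_source, d), with d >= 2
    target: array of shape (n_target, d)
    """
    d = source.shape[1]
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    # Regularized sample covariances of the two domains.
    cov_s = np.cov(source, rowvar=False) + reg * np.eye(d)
    cov_t = np.cov(target, rowvar=False) + reg * np.eye(d)
    whiten = sym_matrix_power(cov_s, -0.5)   # removes source correlations
    recolor = sym_matrix_power(cov_t, 0.5)   # imposes target correlations
    return (source - mu_s) @ whiten @ recolor + mu_t


# Synthetic data, assumed purely for demonstration.
rng = np.random.default_rng(0)
mix_s = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
mix_t = np.array([[1.0, 0.0, 0.0], [0.4, 1.5, 0.0], [0.2, 0.1, 0.8]])
source = rng.normal(size=(500, 3)) @ mix_s
target = rng.normal(size=(400, 3)) @ mix_t + np.array([3.0, -1.0, 2.0])

aligned = align_source_to_target(source, target)
```

After the call, `aligned` keeps one row per source sample, while its sample mean and covariance approximately match those of `target`. A per-cluster variant would apply the same function to each source cluster and its matching target cluster separately.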