Ensemble Learning for Domain Adaptation by Importance Weighted Least Squares


We study ensemble learning for unsupervised domain adaptation, i.e., learning with labeled data from a source domain and unlabeled data from a target domain whose input distribution differs. An open problem is to find an optimal aggregation of given models without making strong assumptions on the model classes. While several heuristics exist, methods backed by a rigorous theory that bounds the target error are still missing. To this end, we propose a method that extends the theory of weighted least squares to linear aggregations and vector-valued functions. Our method is asymptotically error-rate-optimal, in the sense that the error of the computed aggregation is asymptotically at most twice the error of the unknown optimal aggregation. In experiments, we compare our method to (1) classical ensemble learning on source data only, (2) majority voting on target predictions, (3) ensemble learning based on pseudo-labels, (4) importance weighted validation, and (5) deep embedded validation, on several datasets covering language, images, and time series. Our method sets a new state of the art for ensemble learning in unsupervised domain adaptation while providing theoretical error guarantees.
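
To make the idea of an importance weighted least squares aggregation concrete, the following is a minimal sketch and not the paper's reference implementation. It assumes that the importance weights (estimates of the target-to-source density ratio at each labeled source point) are already available from some density-ratio estimator, and that the aggregation is a linear combination of the given models' vector-valued predictions. The function name `iwls_aggregate`, the array layout, and the small `ridge` stabilizer are illustrative assumptions.

```python
import numpy as np


def iwls_aggregate(preds, labels, weights, ridge=1e-10):
    """Linear aggregation coefficients via importance weighted least squares.

    preds:   array of shape (m, n, d), predictions of m models on n labeled
             source samples, each prediction a d-dimensional vector.
    labels:  array of shape (n, d), source labels (e.g. one-hot vectors).
    weights: array of shape (n,), assumed estimates of p_target(x) / p_source(x).
    ridge:   small diagonal term added for numerical stability (an assumption
             of this sketch, not part of the stated method).

    Returns c of shape (m,) minimizing
        (1/n) * sum_i weights[i] * || labels[i] - sum_j c[j] * preds[j, i] ||^2.
    """
    F = np.asarray(preds, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.asarray(weights, dtype=float)
    m, n, _ = F.shape

    # Weighted Gram matrix: G[j, k] = mean_i w_i * <f_j(x_i), f_k(x_i)>
    G = np.einsum("i,jid,kid->jk", w, F, F) / n
    # Weighted correlation with labels: g[j] = mean_i w_i * <y_i, f_j(x_i)>
    g = np.einsum("i,jid,id->j", w, F, y) / n

    # Solve the normal equations G c = g.
    return np.linalg.solve(G + ridge * np.eye(m), g)


def aggregate_predict(coeffs, target_preds):
    """Apply the coefficients to model predictions of shape (m, n_target, d)."""
    return np.einsum("j,jid->id", coeffs, np.asarray(target_preds, dtype=float))
```

In this sketch only labeled source samples enter the least squares problem; the unlabeled target data would be used upstream to estimate `weights`, a step that is left out here.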