ISTA

Machine Learning and
Computer Vision Group



Research

Our research lies at the interface between computer vision and machine learning. We solve computer vision problems using learning methods, and we develop learning techniques that are inspired by, but not limited to, problems that occur in computer vision. We aim for principled solutions over heuristic ones, always trying to understand the conceptual potential and limitations of the methods we develop, in addition to evaluating their practical usefulness.
Specific topics that we target are transfer learning as well as learning with non-i.i.d. data. We apply the techniques we develop to problems such as object recognition and localization in natural images and semantic image representations.

Ongoing projects

Trustworthy machine learning

Machine learning methods (aka "Artificial Intelligence") have become remarkably good at solving complex prediction tasks. Nevertheless, they are met with scepticism in society at large, because end users do not fully trust them. In this project, we work on techniques and results that can establish trust in learning systems. This includes easy-to-explain guarantees on performance and robustness, but also aspects of transparency and fairness.

Theoretic foundations
of transfer learning

We are interested in developing a theoretical understanding of machine learning in situations where information is transferred between different learning tasks, e.g. multi-task learning, domain adaptation and, in particular, lifelong learning. Fundamental questions in this area are "Which information from previously learned tasks can be used for solving new tasks and how?", "What are good performance measures for judging the success of transfer learning algorithms?", and "Can we establish (upper or lower) bounds on how useful the transfer of information between tasks will be?". We try to answer these questions using techniques from statistical machine learning, in particular PAC-Bayesian theory, and from online learning.

Lifelong machine learning

Our goal in this project is to develop and analyse algorithms for continuous, open-ended machine learning as well as their applications to visual data (images and videos). The underlying hypothesis is that we can only significantly improve the state of the art in computer vision algorithms by giving them access to background and contextual knowledge about the visual world, and that the most feasible way to obtain such knowledge is by extracting it (semi-)automatically from incoming visual stimuli.

Learning with non-standard
forms of supervision

Supervised machine learning relies crucially on supervision in the form of human annotation. Very often, acquiring the annotation is tedious and the main bottleneck in building high-quality predictive models. In this project, we study methods for training supervised learning models using non-standard forms of annotation that, in contrast to ordinary annotations, are either weaker and easier to obtain (e.g. only per-image labels instead of hand-drawn object segmentation masks), or stronger and more powerful (e.g. including a reason why a particular annotation is correct).

Sequential Learning
and Decision Making

Sequential decision tasks are ubiquitous in real-world applications of machine learning, e.g. when a robot interacts with its environment, a smartphone app interacts with its owner, or a surveillance camera analyses video footage. These settings have in common that subsequent situations and decisions are not statistically independent of each other. In this project, we study the implications of this phenomenon for machine learning and decision making. Example results are algorithms that adapt their decisions to a non-stationary data distribution, or that learn optimal predictors with respect to the conditional distribution of an underlying stochastic process instead of the marginal one.

Previous Research Topics

Extraction of Semantic Information from Image and Video (CLASS Project at MPI Tübingen)
Camera-Based Document Capture (IPeT project at DFKI)
Video Compression (XviD project)
Efficient Filtering for Image Processing (with O. Wirjadi)