Title: Lifelong Learning of Visual Scene Understanding
Project acronym/reference: L3ViSU / 308036
Duration: 2013-01-01 to 2017-03-31 and 2017-10-01 to 2018-12-31
EU contribution: EUR 1,464,711
Project programme: FP7 Ideas
Contract type: ERC Starting Grant (consolidator phase)
Our goal in the project is to develop and analyse algorithms that use continuous, open-ended machine learning from visual input data (images and videos) in order to interpret visual scenes on a level comparable to that of humans. L3ViSU is based on the hypothesis that we can only significantly improve the state of the art in computer vision algorithms by giving them access to background and contextual knowledge about the visual world, and that the most feasible way to obtain such knowledge is by extracting it (semi-)automatically from incoming visual stimuli. Consequently, at the core of L3ViSU lies the idea of lifelong visual learning. Sufficient data for such an effort is readily available, e.g. through digital TV channels and media-sharing Internet platforms, but the question of how to use these resources for building better computer vision systems is wide open. In L3ViSU we will rely on modern machine learning concepts, representing task-independent prior knowledge as prior distributions and function regularizers. This functional form lets the prior knowledge help solve specific tasks by guiding the learner toward "reasonable" solutions and by suppressing mistakes that violate "common sense". The result will be not only improved prediction quality, but also a reduction in the amount of manual supervision necessary, and the possibility of introducing more semantics into computer vision, which has recently been identified as one of the major tasks for the next decade. L3ViSU is a project at the interface between computer vision and machine learning. Carrying it out requires expertise in both areas, which is represented in my research group at IST Austria. The lifelong learning concepts developed within L3ViSU, however, will have impact outside of both areas, whether as the basis of lifelong learning systems with a different focus, such as in bioinformatics, or as a foundation for projects of commercial value, such as more intelligent driver assistance or video surveillance systems.