Thomas Pock, TU Graz (home)
A tightrope walk between convexity and non-convexity in computer vision
Energy minimization methods are among the most successful approaches to solving problems in computer vision, image processing and machine learning. Unfortunately, many interesting problems lead to non-smooth and, in particular, non-convex optimization problems. In this talk I will discuss different strategies for tackling non-convex problems, leading to very efficient and in some cases globally optimal algorithms.
Thomas Pock received an MSc and a PhD degree in Telematik from Graz University of Technology in 2004 and 2008, respectively. He is currently employed as an Assistant Professor at the Institute for Computer Graphics and Vision at Graz University of Technology, where he leads the "variational methods" working group. In 2013 he received the START Prize of the Austrian Science Fund (FWF) and the German Pattern Recognition Award of the German Association for Pattern Recognition (DAGM). His research interests include convex optimization and, in particular, variational methods with applications to segmentation, optical flow, stereo and registration, as well as their efficient implementation on graphics processing units.
Thomas Mensink, UVA Amsterdam (home)
Large Scale Image Classification and Generalizing to New Classes
In this talk I'll present recent research on large scale image classification and how to learn classifiers for new classes at negligible cost.
First, I'll give a brief overview of the Fisher Vector (FV) image representation. The FV framework can be seen as a generalisation of the popular Bag-of-Visual-Words approach that takes into account more statistics about the distribution of the local descriptors in the image. This representation has many advantages: it is efficient to compute, it leads to excellent results even with efficient linear classifiers, and it can be compressed with a minimal loss of accuracy using product quantization.
Second, I'll discuss distance-based classifiers, such as k-Nearest Neighbours (k-NN) and Nearest Class Mean (NCM), since these methods can incorporate new classes and training images continuously over time at negligible cost. This is not possible with the popular one-vs-rest SVM approach, but is essential when dealing with real-life open-ended datasets. For the NCM classifier, which assigns an image to the class with the closest mean, we introduce a new metric learning approach based on multi-class logistic discrimination. During training we enforce that an image from a class is closer to its own class mean than to any other class mean in the projected space. Experiments on the ImageNet 2010 challenge dataset, which contains over 1 million training images from a thousand classes, show that, surprisingly, the NCM classifier compares favorably to the non-linear k-NN classifier. Moreover, the NCM performance is comparable to that of linear SVMs, which obtain current state-of-the-art performance. We also study experimentally how well the learned metrics generalize to classes that were not used to learn them, and obtain surprisingly good results.
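As a rough illustration of why the NCM classifier described above can absorb new classes cheaply, here is a minimal sketch in Python. It assumes plain Euclidean distance in the original feature space; the learned metric (projection) from the talk is not reproduced here. The class name `NearestClassMean` and its methods are hypothetical names chosen for this example.

```python
from math import dist


class NearestClassMean:
    """Nearest Class Mean classifier using plain Euclidean distance.

    Assumption: no learned projection is applied; the talk's metric
    learning step is deliberately left out of this sketch.
    """

    def __init__(self):
        self._sums = {}    # label -> per-dimension running sum of features
        self._counts = {}  # label -> number of samples seen for that label

    def add(self, features, label):
        """Fold one training sample into its class mean."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
            self._counts[label] = 0
        sums = self._sums[label]
        for i, value in enumerate(features):
            sums[i] += value
        self._counts[label] += 1

    def mean(self, label):
        """Return the current mean feature vector of a class."""
        n = self._counts[label]
        return [s / n for s in self._sums[label]]

    def predict(self, features):
        """Return the label whose class mean is closest to `features`."""
        return min(self._sums, key=lambda lb: dist(features, self.mean(lb)))


clf = NearestClassMean()
for x, y in [([0.0, 0.0], "cat"), ([0.2, 0.0], "cat"),
             ([1.0, 1.0], "dog"), ([1.2, 1.0], "dog")]:
    clf.add(x, y)

before = clf.predict([4.8, 5.1])   # only "cat" and "dog" are known so far
clf.add([5.0, 5.0], "bird")        # new class: just one more mean to store
after = clf.predict([4.8, 5.1])
print(before, after)
```

Adding a class or a training image only updates a running sum and a count, so none of the existing classes need to be retrained. A one-vs-rest SVM, by contrast, would need to revisit its classifiers when a class is added.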
Thomas Mensink received the MSc degree in artificial intelligence from the University of Amsterdam, and the PhD degree in computer science from the University of Grenoble. His PhD dissertation, entitled "Learning Image Classification and Retrieval Models", was awarded the AFRIF Prix de These 2012 for the best PhD thesis in France in Computer Vision and Machine Learning. During his PhD research, he worked both at Xerox Research Centre Europe and with the LEAR team of INRIA Grenoble. Currently, he is a postdoctoral researcher at the University of Amsterdam. His research interests include machine learning and computer vision.
Robert Forchheimer, ISY Linköping (home)
A Time-to-Impact Sensor and Applications in Robotics
Time-to-Impact (TTI) sensing can be performed with an ordinary image sensor and suitable processing. Typically an optical flow field is computed and the TTI is derived from it. However, optical flow requires fairly substantial processing and memory, which lowers the rate at which TTI values can be obtained, particularly if low power is required. The talk will present a different view on how to compute the optical flow in a way relevant for TTI estimation. Furthermore, the implementation of the algorithm on a focal plane processor (NSIP) will be presented. Finally, some robotic applications will be addressed.
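For orientation, the conventional route mentioned in the abstract (derive TTI from an already computed flow field) can be sketched as below. This is not the method presented in the talk. It assumes the camera moves straight along its optical axis towards a fronto-parallel surface, so that the flow at each pixel points away from a known focus of expansion and equals the pixel's offset from that point divided by the TTI. The function name `time_to_impact` and its parameters are hypothetical.

```python
def time_to_impact(points, flows, foe=(0.0, 0.0)):
    """Least-squares TTI estimate from a sparse optical flow field.

    Assumed model: flow = (point - foe) / tti at every point.
    points: iterable of (x, y) image positions in pixels
    flows:  iterable of (u, v) flow vectors in pixels per frame
    foe:    focus of expansion in pixels (assumed known)
    Returns the TTI in frames.
    """
    num = 0.0  # sum of squared offsets from the focus of expansion
    den = 0.0  # sum of offset-flow dot products
    for (x, y), (u, v) in zip(points, flows):
        rx, ry = x - foe[0], y - foe[1]
        num += rx * rx + ry * ry
        den += rx * u + ry * v
    if den <= 0.0:
        raise ValueError("flow field is not expanding; no impact predicted")
    return num / den


# Synthetic check: generate flow for a true TTI of 4 frames.
points = [(10.0, 0.0), (0.0, 20.0), (-5.0, 5.0)]
flows = [(x / 4.0, y / 4.0) for x, y in points]
tti = time_to_impact(points, flows)
print(tti)
```

Even this simple estimate needs a flow vector per pixel used, which hints at the processing and memory cost the abstract refers to.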
Robert Forchheimer has expertise in compression of media signals. He received the M.S. degree in electrical engineering from the Royal Institute of Technology, Stockholm (KTH) in 1972 and the Ph.D. degree from Linköping University in 1979. During the academic year 1979 to 1980 he was a visiting research scientist at the University of Southern California, where he worked in the areas of image coding, computer architectures for image processing and optical computing. Dr. Forchheimer's research areas have included data security, packet radio communication, smart vision sensors and image coding. He has authored and coauthored papers in all of these areas and also holds several patents. He is the cofounder of several companies within the university science park. Dr. Forchheimer is currently in charge of the Information Coding Group at Linköping University. His main work concerns algorithms and systems for image and video communication.