"Visual Scene Understanding - It's Time to Address it Again" (August 28)
Prof. Bernt Schiele, Director, Max Planck Institute for Informatics, Germany
Abstract: Inspired by the ability of humans to interpret and understand 3D scenes nearly effortlessly, the problem of 3D scene understanding has long been advocated as the "holy grail" of computer vision. In the early days this problem was addressed in a bottom-up fashion, without yielding satisfactory or reliable results for scenes of realistic complexity. In recent years there has been considerable progress on many sub-problems of the overall 3D scene understanding problem. As performance on these sub-tasks reaches remarkable levels, we argue that the problem of automatically inferring and understanding 3D scenes should be addressed again. In this talk we will, on the one hand, highlight progress on some essential components of scene understanding such as object class recognition and articulated pose estimation and tracking. On the other hand, we will also report on our current attempt towards 3D scene understanding in the particular case of traffic scene analysis.
Biography: Bernt Schiele has been Max Planck Director at MPI Informatics and Professor at Saarland University since 2010. He studied computer science at the University of Karlsruhe, Germany and ENSIMAG, Grenoble, France. In 1994 he was a visiting researcher at Carnegie Mellon University, Pittsburgh, PA, USA. In 1997 he obtained his PhD from INP Grenoble, France in the field of computer vision. The title of his thesis was "Object Recognition using Multidimensional Receptive Field Histograms". Between 1997 and 2000 he was a postdoctoral associate and Visiting Assistant Professor with the group of Prof. Alex Pentland at Massachusetts Institute of Technology, Cambridge, MA, USA. From 1999 until 2004 he was Assistant Professor at the Swiss Federal Institute of Technology in Zurich (ETH Zurich). Between 2004 and 2010 he was Full Professor at the computer science department of TU Darmstadt.
"Video Synopsis" (August 29)
Prof. Shmuel Peleg, Hebrew University of Jerusalem, Israel
Abstract: Surveillance video is practically never used. There are claims that 0.5% of the video is watched, but the true number is probably much smaller. The reason is clear: there are too many hours of surveillance video for people to watch. Most attempts to deal with the overflow of surveillance video involve the development of automatic video understanding: object recognition and activity understanding. Video Synopsis is complementary to video understanding. After objects are detected by background subtraction, video synopsis changes the display time of each object so that more objects are "packed" into a shorter time. The resulting video is a shorter summary of the original video, in which the objects are shown more densely than in the original. While video synopsis can reduce, on average, an hour of video to a minute, the synopsis loses causality: objects that appear together in the original video may appear at different times in the synopsis, and vice versa. The combination of video synopsis and video understanding is expected to give the maximum benefit. As video understanding is still not foolproof, people
Biography: Shmuel Peleg received his Ph.D. in Computer Science from the University of Maryland in 1979. In 1981 he became a faculty member at the Hebrew University of Jerusalem, where he is a Professor of Computer Science. Shmuel served as the first chairman of the Institute of Computer Science at The Hebrew University from 1990 to 1993. He has published over 140 technical papers in computer vision and image processing, and holds 18 US patents. His technologies have provided the technical foundations for several startup companies, among them BriefCam Ltd., which creates short summaries of long surveillance videos and can summarize hours of video in minutes. He has served as an editor and committee member of numerous international journals and conferences, and most recently was a co-general-chair of CVPR 2011 and a program co-chair of ICCP 2013.
"On Gait and Soft Biometrics for Surveillance" (August 30)
Prof. Mark Nixon, University of Southampton, United Kingdom
Abstract: The prime advantage of gait as a biometric is that it can be used for recognition at a distance whereas other biometrics cannot. There is a rich selection of approaches and many advances have been made, as will be reviewed in this talk. Soft biometrics is an emerging area of interest in biometrics where we augment computer-vision-derived measures with human descriptions. Applied to gait biometrics, this again can be used where other biometric data is obscured or of too low a resolution. The human descriptions are semantic: they are a set of labels which are converted into numbers. Naturally, there are considerations of language and psychology when the labels are collected. After describing current progress in gait biometrics, this talk will describe how the soft biometrics labels are collected, and how they can be used to enhance recognising people by the way they walk. As well as reinforcing biometrics, this approach might lead to a new procedure for collecting witness statements, and to the ability to retrieve subjects from video using witness statements.
Biography: Mark Nixon is the Professor in Computer Vision at the University of Southampton, UK. His research interests are in image processing and computer vision. His team develops new techniques for static and moving shape extraction which have found application in automatic face and automatic gait recognition and in medical image analysis. His team were early workers in face recognition, later came to pioneer gait recognition and more recently joined the pioneers of ear biometrics. Amongst research contracts, he was Principal Investigator with John Carter on the DARPA-supported project Automatic Gait Recognition for Human ID at a Distance; he was previously with the FP7 Scovis project and is currently with the EU-funded Tabula Rasa project. Mark has published over 400 papers in peer-reviewed journals, conference proceedings and technical books. His vision textbook with Alberto Aguado, Feature Extraction and Image Processing (Academic Press), reached its 3rd edition in 2012 and has become a standard text in computer vision. His 2005 book with T. Tan and R. Chellappa, Human ID based on Gait, is part of the Springer Series on Biometrics. He has chaired/program chaired BMVC 98, AVBPA 03, IEEE Face and Gesture FG06, ICPR 04, ICB 09 and IEEE BTAS 2010, and given many invited talks.