When: Thurs, Oct 24th 2019, 2:00pm
Where: Rose St Building Seminar Area
Title: SLAM for Learning and Learning for SLAM
Abstract: Decades of research in multi-view geometry have given us real-time dense monocular tracking and mapping systems capable of reconstructing high-fidelity maps of the world around us. However, these geometric approaches to monocular reconstruction differ drastically from how humans perceive and interact with their environment. We learn from the experience of having seen large numbers of highly correlated scenes from multiple viewpoints, and we use this prior knowledge for effective perception. I am going to discuss ways to leverage the remarkable efficiency of artificial neural networks to capture such knowledge and use it in existing monocular reconstruction and tracking pipelines for robust SLAM. Deploying supervised techniques for learning geometry is cumbersome and usually requires large amounts of annotated data, involving careful capture with calibrated sensors such as LIDAR and IMUs. I will discuss how the basic principles of multi-view geometry and SLAM can be reused to train deep neural networks to predict scene depth, normals, ego-motion, and deformations from handheld or mounted commodity cameras alone.
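The core idea behind reusing multi-view geometry as a training signal is photometric consistency: if a network's predicted depth and relative camera pose are correct, pixels from one view can be warped into another and should match in appearance, so the photometric error itself serves as a self-supervised loss. Below is a minimal NumPy sketch of this warping loss; it is an illustrative reconstruction of the general idea, not the speaker's actual implementation, and all function names are hypothetical. It uses nearest-neighbour sampling for simplicity (a trainable version would use differentiable bilinear sampling).

```python
import numpy as np

def warp_source_to_target(src, depth, K, R, t):
    """Inverse-warp a source image into the target view.

    For each target pixel p: back-project using predicted depth d(p),
    transform the 3D point by the relative pose (R, t), and project it
    into the source image, sampling intensity there (nearest neighbour).
    Illustrative sketch only -- not a differentiable sampler.
    """
    H, W = depth.shape
    K_inv = np.linalg.inv(K)

    # Homogeneous pixel grid of the target view, shape (3, H*W).
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(float)

    # Back-project to 3D points in the target camera frame.
    pts = (K_inv @ pix) * depth.reshape(1, -1)

    # Move points into the source camera frame and project with K.
    proj = K @ (R @ pts + t.reshape(3, 1))
    us = np.round(proj[0] / proj[2]).astype(int)
    vs = np.round(proj[1] / proj[2]).astype(int)

    # Keep only projections that land inside the source image.
    valid = (us >= 0) & (us < W) & (vs >= 0) & (vs < H)
    warped = np.zeros_like(src, dtype=float)
    warped.reshape(-1)[valid] = src[vs[valid], us[valid]]
    return warped, valid.reshape(H, W)

def photometric_loss(target, src, depth, K, R, t):
    """Mean absolute photometric error over validly warped pixels."""
    warped, valid = warp_source_to_target(src, depth, K, R, t)
    return np.abs(warped - target)[valid].mean()
```

With a differentiable sampler in place of the rounding above, this loss can be back-propagated through both the depth network and the pose network, so geometry supervises the learning with no labels at all.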
Bio: Ravi Garg is a Senior Research Associate with the Australian Centre for Visual Technologies at The University of Adelaide, and an Associate Research Fellow with the Australian Centre for Robotic Vision. He is working with Prof Ian Reid on his Laureate Fellowship project “Lifelong Computer Vision Systems”. Prior to joining the University of Adelaide, he received his PhD from Queen Mary University of London under the supervision of Prof Lourdes Agapito, where he worked on Dense Motion Capture of Deformable Surfaces from Monocular Video.
His current research interest lies in building learnable systems, with little or no supervision, that can reason about scene geometry as well as semantics. He is exploring how far visual geometry concepts can help current deep neural network frameworks in scene understanding. In particular, his research focuses on unsupervised learning for single-view 3D reconstruction, visual tracking in monocular video, and weakly or semi-supervised semantic reasoning in images and videos. He is also interested in building real-time, semantically rich, robust monocular SLAM systems that can leverage deep learning.