Date: Tuesday, May 23
Start Time: 11:25 am
End Time: 11:55 am
In what format can an AI system best present what it “sees” in a visual scene to help robots accomplish tasks? This question has been a long-standing challenge for computer scientists and robotics engineers. In this presentation, we will provide insights into cutting-edge techniques being used to help robots better understand their surroundings, learn new skills with minimal guidance, and become more capable of performing complex tasks. We will discuss recent advances in unsupervised representation learning and explain how these approaches can be used to build visual representations suited to a controller that decides how the robot should act. In particular, we will present insights from our research group’s recent work on representing the constituent objects and entities in a visual scene, and on combining vision and language so that language-based task descriptions can be effectively translated into images depicting the robot’s goals.