Date: Thursday, May 23
Start Time: 5:25 pm
End Time: 5:55 pm
In this talk, we present recent work on AI Guide Dog, a multiyear research project at Carnegie Mellon University aimed at providing navigation assistance to the blind community. The project leverages AI to predict the reactions of sighted humans in real time and convey that information audibly to blind users, overcoming the limitations of existing GPS apps and mobility tools. We will discuss the vision-only and multimodal models we have evaluated, along with the imitation learning approaches we are currently exploring. We will also examine the trade-offs imposed by our models' strict requirements: explainable predictions, high accuracy, and real-time processing on mobile devices. Finally, we will share insights gained across three iterations of the project, covering our data collection procedures, training pipelines, and vision and multimodal modeling methodologies, and conclude with results from our models.