Date: Thursday, May 23
Start Time: 1:30 pm
End Time: 2:00 pm
Humans rely on multiple senses to quickly and accurately obtain the most important information they need. Similarly, developers have begun using multiple types of sensors to improve machine perception. To date, this has mostly been done with “late fusion” approaches, in which separate ML models are trained for each type of sensor data and the outputs of these models are combined in an ad hoc manner. However, such systems have proven difficult to implement and disappointing in their perception performance. We are now witnessing a transition away from this siloed sensor approach. Recent research shows that superior perception performance can be achieved by training a single ML model on multiple types of sensor data. In this talk, we will explain why this new approach to multimodal perception will soon dominate and outline the key business challenges and opportunities emerging as a result, including challenges and opportunities in frameworks, tools, databases, and models.
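The sketch below is not from the talk; it is a minimal, illustrative contrast between the two architectures described above, written in PyTorch. All module names, feature dimensions, sensor types, and the averaging rule used for late fusion are assumptions chosen for brevity.

```python
# Illustrative sketch only: late fusion vs. a single jointly trained multimodal model.
# Dimensions, sensors, and fusion rule are hypothetical.
import torch
import torch.nn as nn

# --- Late fusion: one model per sensor, trained separately, outputs merged afterwards ---
camera_model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 10))
lidar_model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

def late_fusion(camera_feat, lidar_feat):
    # Predictions from the per-sensor models are combined ad hoc (here, a simple average).
    return (camera_model(camera_feat) + lidar_model(lidar_feat)) / 2

# --- Single multimodal model: both sensor streams fused inside one network, trained end to end ---
class MultimodalNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.camera_enc = nn.Linear(512, 128)
        self.lidar_enc = nn.Linear(256, 128)
        # Fusion happens inside the model, so cross-sensor features are learned jointly.
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(256, 10))

    def forward(self, camera_feat, lidar_feat):
        fused = torch.cat([self.camera_enc(camera_feat), self.lidar_enc(lidar_feat)], dim=-1)
        return self.head(fused)

camera_feat = torch.randn(4, 512)   # batch of camera features
lidar_feat = torch.randn(4, 256)    # batch of lidar features
print(late_fusion(camera_feat, lidar_feat).shape)      # torch.Size([4, 10])
print(MultimodalNet()(camera_feat, lidar_feat).shape)  # torch.Size([4, 10])
```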