Comcast’s Xfinity Home connects millions of smart home cameras and IoT devices to improve our customers’ safety and security. Our teams use computer vision and deep learning to analyze video and sensor data from these devices and identify relevant events, so that we can improve the user experience. Specifically, we have explored the spatio-temporal relationships among objects, places, and actions. We have also developed a semi-supervised learning approach for video classification (VideoSSL) that detects certain activities using limited labeled training data. Using these techniques, we have achieved promising results on activity recognition across multiple datasets.