Always-on perception is becoming a defining capability of next-generation edge devices, from AR glasses and hearables to battery-operated sensors. Yet continuous audio, video, and motion understanding runs into two hard limits: energy per inference and memory bandwidth. In this talk, we present FotoNation's heterogeneous, near-memory edge AI architecture for always-on processing, which combines a low-power on-device engine (ODE), an ISP-Lite front end, and region-of-interest (ROI) adaptive compute. The pipeline stays active while dynamically scaling precision and workload to the scene, delivering responsive perception within milliwatt-class power envelopes. The ISP-Lite provides lightweight, task-aware preconditioning that reduces downstream compute without sacrificing semantic quality, while the ODE performs energy-proportional feature extraction for motion vectors, micro-events, and audiovisual triggers. These cues drive a multistage ROI scheduler that invokes the neural ISP and specialized models only where needed (e.g., on faces or text). Near-memory compute couples local memory with AI accelerators to minimize off-chip transfers, enabling continuous sensing with deterministic latency.
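To illustrate the multistage, energy-proportional idea, the following is a minimal sketch (in plain Python, with illustrative names and thresholds that are not FotoNation's actual API): a cheap always-on motion check gates a tile-level ROI stage, and a heavy model is invoked only on the tiles that survive both stages.

```python
# Hypothetical sketch of a multistage ROI trigger cascade. All names and
# thresholds are illustrative assumptions, not the architecture's real API.

MOTION_THRESHOLD = 8.0   # mean absolute frame difference that wakes stage 2
ROI_THRESHOLD = 20.0     # per-tile activity needed to invoke the heavy model
TILE = 8                 # tile size in pixels

def frame_diff(prev, frame):
    """Per-pixel absolute difference between two equally sized 2-D frames."""
    return [[abs(a - b) for a, b in zip(rp, rf)] for rp, rf in zip(prev, frame)]

def stage1_motion(diff):
    """Cheap always-on check: mean absolute difference over the whole frame."""
    total = sum(sum(row) for row in diff)
    return total / (len(diff) * len(diff[0])) > MOTION_THRESHOLD

def stage2_rois(diff):
    """Keep only TILE x TILE tiles whose mean activity exceeds ROI_THRESHOLD."""
    h, w = len(diff), len(diff[0])
    rois = []
    for y in range(0, h - TILE + 1, TILE):
        for x in range(0, w - TILE + 1, TILE):
            tile_sum = sum(diff[y + dy][x + dx]
                           for dy in range(TILE) for dx in range(TILE))
            if tile_sum / (TILE * TILE) > ROI_THRESHOLD:
                rois.append((y, x))
    return rois

def run_pipeline(prev, frame, heavy_model):
    """Energy-proportional dispatch: heavy_model runs only on surviving tiles."""
    diff = frame_diff(prev, frame)
    if not stage1_motion(diff):
        return []  # static scene: all downstream compute is skipped
    return [(y, x, heavy_model(y, x)) for (y, x) in stage2_rois(diff)]
```

On a static scene the pipeline returns immediately after the stage-1 check, so downstream cost scales with scene activity rather than with frame rate; this is the software analogue of the workload scaling the abstract describes.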

