{"version":"1.0","provider_name":"2025 Summit","provider_url":"https:\/\/embeddedvisionsummit.com\/2025","title":"Vision-Language Models on the Edge - 2025 Summit","type":"rich","width":600,"height":338,"html":"<blockquote class=\"wp-embedded-content\" data-secret=\"V3kKu9zkgG\"><a href=\"https:\/\/embeddedvisionsummit.com\/2025\/session\/vision-language-models-on-the-edge\/\">Vision-Language Models on the Edge<\/a><\/blockquote><iframe sandbox=\"allow-scripts\" security=\"restricted\" src=\"https:\/\/embeddedvisionsummit.com\/2025\/session\/vision-language-models-on-the-edge\/embed\/#?secret=V3kKu9zkgG\" width=\"600\" height=\"338\" title=\"&#8220;Vision-Language Models on the Edge&#8221; &#8212; 2025 Summit\" data-secret=\"V3kKu9zkgG\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" class=\"wp-embedded-content\"><\/iframe><script type=\"text\/javascript\">\n\/* <![CDATA[ *\/\n\/*! This file is auto-generated *\/\n!function(d,l){\"use strict\";l.querySelector&&d.addEventListener&&\"undefined\"!=typeof URL&&(d.wp=d.wp||{},d.wp.receiveEmbedMessage||(d.wp.receiveEmbedMessage=function(e){var t=e.data;if((t||t.secret||t.message||t.value)&&!\/[^a-zA-Z0-9]\/.test(t.secret)){for(var s,r,n,a=l.querySelectorAll('iframe[data-secret=\"'+t.secret+'\"]'),o=l.querySelectorAll('blockquote[data-secret=\"'+t.secret+'\"]'),c=new RegExp(\"^https?:$\",\"i\"),i=0;i<o.length;i++)o[i].style.display=\"none\";for(i=0;i<a.length;i++)s=a[i],e.source===s.contentWindow&&(s.removeAttribute(\"style\"),\"height\"===t.message?(1e3<(r=parseInt(t.value,10))?r=1e3:~~r<200&&(r=200),s.height=r):\"link\"===t.message&&(r=new URL(s.getAttribute(\"src\")),n=new URL(t.value),c.test(n.protocol))&&n.host===r.host&&l.activeElement===s&&(d.top.location.href=t.value))}},d.addEventListener(\"message\",d.wp.receiveEmbedMessage,!1),l.addEventListener(\"DOMContentLoaded\",function(){for(var e,t,s=l.querySelectorAll(\"iframe.wp-embedded-content\"),r=0;r<s.length;r++)(t=(e=s[r]).getAttribute(\"data-secret\"))||(t=Math.random().toString(36).substring(2,12),e.src+=\"#?secret=\"+t,e.setAttribute(\"data-secret\",t)),e.contentWindow.postMessage({message:\"ready\",secret:t},\"*\")},!1)))}(window,document);\n\/\/# sourceURL=https:\/\/embeddedvisionsummit.com\/2025\/wp-includes\/js\/wp-embed.min.js\n\/* ]]> *\/\n<\/script>\n","thumbnail_url":"https:\/\/embeddedvisionsummit.com\/2025\/wp-content\/uploads\/sites\/15\/2025\/04\/ZakkaC_SpeakerCard.jpg","thumbnail_width":1200,"thumbnail_height":630,"description":"In this presentation, we provide an overview of vision-language models (VLMs) and their deployment on edge devices using Hugging Face\u2019s recently released SmolVLM as an example. We will examine the training process of VLMs, including data preparation, alignment techniques and [&hellip;]"}