For a long time, running large language models meant cloud infrastructure, significant compute budgets, and a dependency on external platforms.
That is changing.
RP Tech, an NVIDIA partner, has teamed up with YourStory to host a live webinar, 'Running Sovereign LLMs on NVIDIA DGX Spark', to bring that shift to life. This hands-on session shows developers, researchers, and engineers what it means to run powerful sovereign AI models locally and privately, without relying on the cloud.
Leading the session is Megh Makwana, Manager of Applied GenAI Solution Engineering at NVIDIA, who works at the intersection of foundational model building, large-scale GPU workload optimization, and helping cloud service providers build AI platforms using NVIDIA AI Enterprise. Join him on Friday, April 17, from 3:00 to 4:30 PM for a live virtual webinar that gets straight into the work.
What the webinar covers
The session is built around running sovereign LLMs directly on NVIDIA DGX Spark, a personal AI supercomputer that delivers enterprise-grade AI performance in a compact, desktop form factor. Attendees will see live demonstrations of Sarvam 30B and Param-2-17B, two powerful sovereign language models, running locally on NVIDIA DGX Spark and powering a real chat application. No cloud. No external dependencies. Just the model, the hardware, and a workflow you can actually follow.
The session opens with a focused introduction to NVIDIA DGX Spark, covering what the hardware is, what the NVIDIA AI software stack enables, and why local AI development is becoming a credible option for production workflows. From there, it moves directly into the hands-on demonstration.
What you will walk away with
By the end of the session, attendees will know how to optimize inference for sovereign LLMs using low-precision formats such as FP8 and NVFP4, deploy models on NVIDIA DGX Spark with open-source frameworks including SGLang, vLLM, and TensorRT-LLM, and build a personal AI assistant powered by these models directly on the machine. Every takeaway is practical and immediately applicable.
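To get a feel for why low-precision formats matter, here is a toy round-trip sketch of the scale-round-clip idea behind quantized inference. It deliberately uses a simplified symmetric integer-style scheme, not the actual FP8 or NVFP4 encodings the session covers, and `fake_quantize` is a name invented here purely for illustration.

```python
import numpy as np

def fake_quantize(x, bits=8):
    """Simulate a symmetric low-precision quantization round-trip.

    Real FP8/NVFP4 inference uses hardware floating-point formats with
    per-tensor or per-block scales; this integer sketch only illustrates
    the scale-round-clip idea behind low-precision weight storage.
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8 bits
    scale = np.abs(x).max() / qmax                  # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax, qmax)   # quantized values
    return q * scale                                # dequantized approximation

w = np.array([0.02, -1.5, 0.7, 3.1])
w8 = fake_quantize(w, bits=8)   # close to the original weights
w4 = fake_quantize(w, bits=4)   # coarser grid, larger round-trip error
```

The trade-off is the whole story: fewer bits mean smaller weights and faster memory-bound inference, at the cost of a coarser value grid, which is why formats like NVFP4 pair very low bit widths with careful scaling.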
Who should attend
This session is for developers, researchers, engineers, and architects who are either already working with large models or actively exploring how to bring AI workloads closer to the edge. A working familiarity with Python, Docker, and frameworks like SGLang, vLLM, or TensorRT-LLM will help you get the most out of the hands-on segments.
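If you want to warm up with one of those frameworks beforehand, vLLM exposes an OpenAI-compatible server through a single CLI command. This is a hedged sketch, not the webinar's exact workflow: the model id below is a placeholder rather than one of the session's sovereign models, and the flags assume a recent vLLM release.

```shell
# Sketch: serve a model locally with vLLM's OpenAI-compatible server.
# The model id is a placeholder; swap in any model you have access to.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --quantization fp8 \
    --port 8000
```

Once the server is up, any OpenAI-compatible client can talk to it at http://localhost:8000, which is the same pattern a local chat application on DGX Spark would use.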
NVIDIA DGX Spark is redefining what is possible on the desktop. This webinar is your opportunity to see it firsthand, guided by one of NVIDIA's own solution engineering leaders.
Register now to secure your spot.