NVIDIA free webinar - special edition

You can learn about Vision Language Models from the best in the industry

Jan 21, 2025

image generated with Stable Diffusion 3.5

In the past few months NVIDIA has become very active on the web, and not only to promote its latest GPU technology!

A few days ago I wrote about the rare opportunity to finally see the end of the hype and the start of practical AI applications to real-world problems. NVIDIA has partnered with clinical research services company IQVIA, genomics specialist Illumina, and Mayo Clinic to accelerate drug discovery, enhance genomic research, and pioneer advanced healthcare services with agentic and generative AI.

Awakened Intelligence 2025#002, by Fabio, Jan 19

And now, a free webinar is offered to all of us!

On Jan 22nd, NVIDIA will run a webinar “Enhance Visual Understanding With Generative AI”. The content is tailored for smart spaces, healthcare and media & entertainment practitioners.

Don’t Miss Out: build your own Visual AI expertise

Are you ready to unlock the groundbreaking potential of generative AI for visual understanding?

If your answer is YES, then you should join NVIDIA’s exclusive webinar on January 22nd and discover how large language models (LLMs), vision language models (VLMs), and multimodal large language models (MLLMs) are revolutionizing industries like smart cities, healthcare, and media & entertainment.

Needless to say… Seats are limited, so register now and secure your spot!

You can register for free here.

Here’s what awaits you:

Real-World Applications: Witness firsthand how leading companies are leveraging visual AI agents to transform data into actionable insights.
Technical Deep Dive: Explore the intricacies of implementing these cutting-edge technologies and learn how to fast-track your own visual AI agent development with NVIDIA AI Blueprint.
Future of Vision AI: Gain exclusive insights from the latest research in vision and multimodal AI, shaping the future of this exciting field.
Responsible AI Practices: Learn about the ongoing efforts to ensure the responsible training and deployment of AI research systems.

Now the real question… Why should I care?

NVIDIA and the physical world

Little is known about this: NVIDIA is probably one of the few (if not the only) Big Tech companies that is training people (not only models) and showing off real-world applications of Generative AI.

The lines between the digital and physical worlds are blurring, and NVIDIA is at the forefront of this exciting convergence. They have an ambitious vision: to imbue AI with a genuine understanding of our three-dimensional reality.

And here we are not talking only about robots performing pre-programmed tasks in controlled environments. NVIDIA wants to create AI that can perceive, reason about, and interact with the world around us in a truly meaningful way.

Meet NVIDIA Cosmos, a platform that’s pushing the boundaries of what’s possible in the realm of “Physical AI.” At its core are World Foundation Models (WFMs) — sophisticated AI systems trained on vast quantities of visual and sensor data.

These models can learn the complex alchemy of physical phenomena, and grasp concepts like gravity, friction, and the interplay of forces that shape our world. To me it looks like an AI that doesn’t just “see” an object but understands its weight, its texture, and how it might behave in different scenarios.

This focus on 3D understanding is key. Instead of perceiving the world as a collection of flat images, these WFMs construct a rich, three-dimensional representation, much like our own. This opens up incredible possibilities for applications like robotics, where machines can navigate complex environments with greater autonomy and dexterity. Imagine robots that can truly collaborate with humans, adapting to dynamic situations and anticipating our needs.

NVIDIA Cosmos is still in its early stages, but it represents a bold step toward a future where AI can integrate with our physical world.

And in my opinion the implications extend far beyond robotics. Think about the entire Digital Twin movement: every company in the world wants to cut costs in its research and development process and to be certain that a prototype is going to work. With a solid foundation model, companies can create virtual worlds that mirror the nuances of our physical reality, leading to breakthroughs in fields like design, engineering, and scientific research.

To SEE is to KNOW

The latest advancements in the Generative AI community are mainly focused on Vision Language Models and multimodality. This is because to be able to see is basically to be able to know.

VLMs and MLLMs are both types of AI models that combine different types of data to understand the world more comprehensively, but they differ in their scope and capabilities:

VLMs (Vision-Language Models) combine vision and language. VLMs are designed to process both visual information (images, videos) and textual data (natural language). They can learn relationships between images and text and are trained on large datasets of images paired with descriptions. In this way they become proficient at understanding the connections between visual elements and their corresponding textual representations.

MLLMs (Multimodal Large Language Models) integrate multiple data modalities. MLLMs go beyond vision and language to incorporate other modalities like audio, video, sensor data, and even numerical data. They have a more comprehensive understanding of the world, capturing complex relationships between different data types.
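To make the idea of "learning relationships between images and text" a bit more concrete, here is a minimal toy sketch in Python. It assumes one common approach (the CLIP-style one), where images and captions are mapped into the same embedding space and matched by cosine similarity. This is only an illustration, not how any specific NVIDIA model works: the three-number embeddings and the helper names (cosine_similarity, best_caption) are made up by hand for the example, while a real VLM would produce vectors with hundreds of dimensions from trained encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding is closest to the image embedding."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )


# Hypothetical, hand-made embeddings (assumption: a trained model would
# place an image and its matching description close together).
image_of_a_dog = [0.9, 0.1, 0.2]
captions = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
    "a bowl of soup": [0.2, 0.1, 0.9],
}

print(best_caption(image_of_a_dog, captions))
```

With these toy numbers the dog caption wins, because its vector points in almost the same direction as the image vector. Training on image-description pairs is what pushes matching pairs together like this.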

If you want to bring a new vision to your work or your studies, this field is worth the investment. Both VLMs and MLLMs are driving significant advancements in AI, enabling machines to perceive and interact with the world in increasingly sophisticated ways.

Don’t miss this opportunity to:

Gain a competitive edge in the rapidly evolving world of visual AI.
Network with industry leaders and experts.
Transform your vision into reality with the power of generative AI.

Webinar Details:

Topic: Enhance Visual Understanding With Generative AI
Date: January 22, 2025
Time: 2:00 p.m. — 4:00 p.m. CET | 6:30 p.m. — 8:30 p.m. IST
Duration: 2 hours
Focus: Generative AI, LLMs, VLMs, MLLMs, real-world applications, NVIDIA AI Blueprint, responsible AI practices.

Go and catch it!

There is no need for a free gift from me this time… NVIDIA is already giving one, right?

This is only the start!

I hope you find all of this useful. I use Substack only for the newsletter, where every week I share free links to my paid articles on Medium. Follow me and read my latest articles at https://medium.com/@fabio.matricardi

Check out my Substack page if you missed some posts. And, since it is free, feel free to share it!