Awakened Intelligence 2025#001
Gradio updates, Qwen-Chat for free, and an In-Context Learning (ICL) breakthrough - A PoorGPUguy weekly dose of cutting-edge open-source AI news & insights
Welcome back to ThePoorGPUguy weekly newsletter! The Generative AI world changes faster than ever, and it cannot only be about new models and benchmark breakthroughs.
So this week we will look at some of the interesting use cases and tools that can help you create AI-powered applications. There are always two levels: coding and advancements.
Coding
The latest Gradio update brings your Generative AI application to life with 2 lines of code
Qwen-Chat is now a web app, 100% free, and in no way inferior to ChatGPT or Claude.
Advancements
In-Context Learning (ICL) is underrated, and it may be the key to increasing the accuracy of Small Language Model generation.
Gradio 5.0 is out, and it couldn't be easier
The Gradio team has been hard at work over the past few months, and can now announce the stable release of Gradio 5! With more than 2 million users every month (and >470,000 apps on Hugging Face Spaces), Gradio has become the default way to build, share, and use machine learning applications.
Gradio is the fastest way to demo your machine learning or LLM-powered apps with a friendly web interface, so that anyone can use them, anywhere! If you are a Python programmer, building with Gradio Blocks is easy and intuitive. You can create a graphical user interface with a few simple instructions.
And it works!
The most useful new feature is that you can create a fully functional chatbot with 2 lines of code! This will be the gift of the week: my free tutorial on how to start with Gradio and Granite-3.1 - Check out the end of the newsletter.
The next amazing feature is the Gradio Playground: an online space where you can build UI layouts and see the changes in real time. Gradio 5 ships with an experimental AI Playground where you can use AI to generate or modify Gradio apps and preview the result right in your browser: https://gradio.app/playground
Not to mention that you get a small AI assistant to help you fix the code.
Qwen CHAT is free
@Alibaba_Qwen launched its AI web app with support for HTML rendering in artifacts, file uploads, and more. It offers all the main features of the famous GPT-4o and Claude.
When you are GPU poor, you often need a bigger LLM to test potential prompts, or simply to generate synthetic data for In-Context Learning examples (see the next section).
The new Qwen web-based interface is designed to make interacting with Qwen models more accessible and user-friendly. This announcement comes after Alibaba received a lot of valuable feedback from the community, with many users suggesting that a web UI would significantly enhance Qwen's usability and reach.
You can log in for free and test all you want: Try it here
Key Features of Qwen Chat
The web app boasts a huge set of features designed to cover user needs:
Multiple Model Support: Choose from flagship models like Qwen2.5-Plus, vision-language model Qwen2-VL-Max, reasoning models QwQ and QVQ, and the coding expert Qwen2.5-Coder-32B-Instruct.
Document Upload: Upload documents and get AI-generated answers based on their content.
HTML Preview Mode: Enjoy enhanced readability with HTML-supported previews.
Image Upload: Leverage visual understanding capabilities by uploading images for analysis.
What’s Coming Next?
The team behind Qwen Chat has teased several exciting features currently in the pipeline, including:
Web search integration
Image generation capabilities
Voice mode for hands-free interaction
In-Context Learning is underrated
In the past weeks, as you may remember, I tried to overcome a big problem with Small Language Models. They are good, but they are too sensitive to prompt wording and even word order. Below 1 billion parameters, you risk getting gibberish or inconsistent responses.
My Anne Frank test is one example, and I categorize it as Truthful RAG: asking the LLM to reply based only on the context, I prompt a chunk of text about technology and then ask the question "Who is Anne Frank?". I expect the Generative AI model to reply "Unanswerable", because the provided context does not contain any information about Anne Frank.
Very Small Language Models always fail this test, and this is a pity, because Question Answering is probably the most useful task of the LLMs.
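To make the test concrete, here is a hedged sketch of how a Truthful RAG prompt with in-context examples could be assembled. The example pairs and the context snippet are invented for illustration, not taken from my actual test set:

```python
# Few-shot in-context examples: one answerable pair and one unanswerable
# pair, teaching the small model to refuse when the context lacks the answer.
ICL_EXAMPLES = [
    ("The Colosseum is an ancient amphitheatre in Rome.",
     "Where is the Colosseum?", "Rome."),
    ("The Colosseum is an ancient amphitheatre in Rome.",
     "Who invented the telephone?", "Unanswerable"),
]

def build_prompt(context: str, question: str) -> str:
    """Assemble instruction + ICL examples + the real query."""
    parts = ["Answer ONLY from the given context. "
             "If the context does not contain the answer, reply 'Unanswerable'.\n"]
    for ctx, q, a in ICL_EXAMPLES:
        parts.append(f"Context: {ctx}\nQuestion: {q}\nAnswer: {a}\n")
    parts.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n".join(parts)

prompt = build_prompt(
    "Transformers process tokens in parallel using self-attention.",
    "Who is Anne Frank?",
)
```

The point of the ICL examples is that the model sees a worked demonstration of refusing before it ever reaches the real question, which is exactly the behavior the Anne Frank test checks for.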
I stumbled again on the amazing feats of RWKV (see the video above). Jellyfish042 created an astonishing RWKV-7 model (9M & 26M params) able to solve Othello using only Chain of Thought. I reported on a similarly amazing project able to solve SUDOKU in my previous newsletter:
Basically only with the context prompt (and previous training).
You can check the GitHub Repo and see what he managed to do with such a small number of parameters, and an RNN architecture (not a Transformer!).
This was an inspiration for me.
I tried to get a 100% precision score on the Truthful RAG test with the little Hugging Face TB model SmolLM2-360M-Instruct-GGUF, with surprising results: if I succeed in implementing the same technique with the smallest SmolLM2-135M-Instruct, I will share the code with you and show you how to do it yourself!
📣 So stay tuned, and be the first one to know about it! 😉
🎁 So here is my gift of the week: a free tutorial on how to start with Gradio and Granite-3.1 - The new Gradio 5 feature that lets you create a fully functional chatbot with 2 lines of code!
This is only the start!
I hope you find all of this useful. I am using Substack only for the newsletter. Every week I give away free links to my paid articles on Medium. Follow me and read my latest articles: https://medium.com/@fabio.matricardi
Check out my Substack page, if you missed some posts. And, since it is free, feel free to share it!