Generative AI reimagined: from SciFi dreams to practical solutions
Stop settling for science fiction: focus on real-world use cases for sustainable success
In this week's newsletter I want to share one of the best pieces of advice about Generative AI I have heard in the past 18 months.
All of it is thanks to Darren Oberst, the genius behind the LLMWare team: we talked about them in one of the past newsletters too.
These ideas and guidelines are so refreshing that I decided to write them down for you all. It is like acting on a contrarian principle: being rebels in a world that follows mainstream trends and practices.
And I like this a lot.
Embrace pragmatism over hype in your generative AI strategy.
The world is overflowing with articles proclaiming the next breakthrough, from mind-bending "AGI" to personalized robots. These visions often distract us from the real potential of Generative AI – its power to improve processes and solve everyday challenges.
This newsletter cuts through the noise, offering a contrarian approach that prioritizes practical implementation over flashy promises: Focus on realistic use cases for sustainable success.
So without any further ado, here are the 7 key lessons for effective Generative AI implementation.
Rethink your approach
The conventional wisdom of "bigger is always better" falls apart when considering generative AI. I have always talked to you about small models, highlighting their advantages in terms of accessibility, cost-effectiveness, and ease of iteration.
Start Small, Scale Smart: Don't jump into a massive model with complex requirements just because it seems cool. Begin with smaller models that fit your specific needs – explore the power of open source options and see how they can drive real impact in your business processes.
Did you know that on Hugging Face there are already crazy rebels using Small Language Models and tuning them for specific tasks?
Remember when I talked to you about NuExtract? Well, the mini model of the NuExtract family is a fine-tuned version of Qwen2-0.5B! Imagine how good a Small Language Model can be!
Embrace Language Models for what they are
Stop trying to make language models act like encyclopedias or experts, even if you think they could be! Focus on their strengths: understanding language patterns and generating human-quality text.
Closed Context is Key: Treat these models as assistants who excel at analyzing information within a specific context (like your documents) rather than assuming knowledge of the world beyond that context. This approach significantly reduces hallucination risks and improves accuracy.
In real-world use cases no one needs a chatbot to talk about the weather or whatever else: what is required is truthful and clear analysis, Q&A and data extraction from documents and data sources.
In this regard it doesn't matter if a model does not know everything… who cares! What we must care about is that the Small Language Model is able to understand the data and reply truthfully to our requests!
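As a minimal sketch of the closed-context idea (the function name and prompt wording below are my own, not taken from any specific library), a prompt can be built so the model is only allowed to answer from the text you give it:

```python
def build_closed_context_prompt(context: str, question: str) -> str:
    """Build a prompt that restricts the model to the supplied context.

    The model is told to answer only from `context`, and to reply with a
    fixed refusal string otherwise, which reduces hallucination risk.
    """
    return (
        "Read the text and answer the question using only that text.\n"
        "If the answer is not in the text, reply exactly: "
        "Not found in the provided text.\n\n"
        f"Text:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Example: the invoice excerpt is the whole world the model may use
prompt = build_closed_context_prompt(
    context="Invoice 2024-117, issued on 3 June 2024, total 1,250 EUR.",
    question="What is the invoice total?",
)
print(prompt)
```

The same wrapper works unchanged with any small model, which is exactly the point: the knowledge lives in your documents, not in the model's weights.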
Shorten your instructions, embrace determinism
Forget about complicated prompts: long instructions are inefficient and prone to errors. There are plenty of verbose, super-complicated prompt collections around the community. Well, to begin with, Small Language Models don't like them!
Simple is Better: Keep your instructions clear, concise, and repeatable for optimal results across various models. Don't let the allure of complex prompts obscure a simpler approach that delivers consistent outcomes.
You can check out more in the previous newsletter, Micro-tuning, Macro-performance: elevate Mini Language Models, where we talked about this topic.
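To make the point concrete (the wording below is illustrative, not a prescribed template), compare a verbose instruction with a short, repeatable one; pairing the short form with greedy decoding settings keeps the output deterministic across runs:

```python
# A verbose prompt of the kind often shared in prompt collections:
verbose = (
    "You are a world-class, highly experienced analyst with deep expertise "
    "in document understanding. Carefully and thoroughly read the document "
    "below, think step by step, and then produce an exhaustive summary "
    "covering every possible angle and nuance of the content..."
)

# The short, repeatable alternative a Small Language Model handles better:
concise = "Summarize the text below in 3 bullet points.\n\nText: {text}"

# Greedy-style sampling settings (supported by most local inference
# libraries): the same prompt then tends to yield the same output.
deterministic_kwargs = {"temperature": 0.0, "top_p": 1.0}

assert len(concise) < len(verbose)  # fewer tokens, fewer ways to go wrong
```

The shorter the instruction, the easier it is to reuse it verbatim across different models and compare their behavior.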
Self-Host your AI powerhouse
You know me! The main reasons I always talk about llama-cpp-python and small language models are that I don't have a GPU and must do everything with my 16 GB of RAM. And I don't want anyone to have my data!
Control Over Data Security & Governance: The real value of generative AI lies in its ability to work within your existing systems. Don’t rely on APIs for every use case – self-hosting empowers you with full control over your data, ensuring compliance and security while promoting scalability across diverse business processes.
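Here is a minimal self-hosting sketch with llama-cpp-python (assuming `pip install llama-cpp-python` and a small GGUF model file already downloaded; the file name below is a placeholder, point it at whatever model you use):

```python
def run_local(prompt: str,
              model_path: str = "qwen2-0_5b-instruct-q4_k_m.gguf") -> str:
    """Run a prompt on a local GGUF model, fully on CPU.

    The import is done lazily so the file can be loaded even where
    llama-cpp-python is not installed. No data ever leaves your machine.
    """
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(
        model_path=model_path,  # placeholder: your local GGUF file
        n_ctx=4096,             # context window; a 0.5B model fits in 16 GB RAM
        n_threads=4,            # plain CPU threads, no GPU required
        verbose=False,
    )
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
        max_tokens=256,
    )
    return out["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(run_local("List the dates in: 'Meeting moved to 5 July.'"))
```

Everything — model, prompt, and output — stays on your own hardware, which is what makes governance and compliance straightforward.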
Open Source: transparency leads to Trust
There are amazing people around. The teams at H2O.ai, Qwen, Ai2, Prem, LLMWare and Hugging Face TB research are investing a great amount of time and effort to feed the Open Source community with high-quality language and multi-modal models. The reason is clear:
Transparency is Key: Open source models bring transparency to the AI process, allowing for deeper understanding of their inner workings and more robust vetting through community scrutiny. This fosters trust and facilitates informed decision-making when deploying AI solutions within your organization. Embrace the power of open collaboration!
Just last week we talked about the Allen Institute's first fully Open Source Mixture of Experts. This is what Open Source really means.
CPU Power: the underestimated engine
The big shots are starting to move too! After the amazing and still unrivaled results of LlamaCPP, the true innovator in generative AI for the past 16 months, other companies have started working to make Language Models easy to run.
Optimum Intel, OpenVINO and Vulkan are all projects that aim to bring the best performance to Intel CPUs and chips. In fact, more than 90% of PC users have Intel hardware, often with an integrated GPU, and until now it was impossible to use it during inference.
The Future is Local: Embrace ubiquitous computing with CPUs as a driver for local inference, reducing reliance on expensive GPUs. This allows you to fully make use of generative AI's potential in everyday devices and workstations without straining your budget or infrastructure.
Beyond Hype: real-world use cases are the Path forward
Stop squeezing all Generative AI use cases into chatbots, and start thinking of Gen AI as just another tool in your software toolkit.
What you need to do is build applications and workflows. Most workflow and automation is not chat- or interaction-focused; rather, it happens behind the scenes and integrates multiple steps and decision points.
Moving beyond the constraints of chat opens up a lot of exciting and practical use cases.
Workflow, not chat: Don't get bogged down by grandiose visions of future AI dominance. Begin with tangible solutions that address immediate needs within your organization. This pragmatic approach will build your expertise and demonstrate a path to meaningful impact while avoiding unrealistic expectations. And it is certainly not done by chit-chatting, but by creating meaningful workflows that refine your data and provide insights.
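The "workflow, not chat" idea can be sketched as a plain pipeline in which the language model is just one step among many. In this illustrative sketch (all names are my own), `classify` is a stub standing in for any local model call, while the extraction step is deliberately deterministic code:

```python
import re


def classify(text: str) -> str:
    """Stub for a local SLM call: route a document by its content."""
    return "invoice" if "invoice" in text.lower() else "other"


def extract_amounts(text: str) -> list:
    """Deterministic, non-LLM step: pull numeric amounts from the text."""
    return re.findall(r"\d+(?:[.,]\d+)?", text)


def process(document: str) -> dict:
    """A behind-the-scenes workflow: classify, branch, extract, report."""
    category = classify(document)
    amounts = extract_amounts(document) if category == "invoice" else []
    return {
        "category": category,
        "amounts": amounts,
        "needs_review": category == "invoice" and not amounts,
    }


result = process("Invoice 2024-117: total 1250.00 EUR")
print(result)
```

No chat window anywhere: the model quietly makes one decision inside a larger automation, which is where most of the real value sits.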
Conclusions
Stop settling for Science Fiction and focus on practical solutions. Generative AI is a powerful tool, but it's not magic.
And it is really time to be rebels now. It is OK if everyone else is not doing it: that will give us the advantage of using Generative AI for its true purpose.
Let’s move beyond science fiction and focus on building solutions with generative AI. The future of this technology lies in our hands – let's make it happen!
NOTE: you can read Darren’s article on Medium, and it is free!
My gift for this week is doubled! This is to celebrate the insights shared with us. The first article is about Vulkan: it is a driver that will allow anyone to use their GPU regardless of the manufacturer. Even your integrated Intel GPU will work with your LLM!
The second one is an introduction to OpenVINO. This framework displayed amazing capabilities on Intel hardware, to the point that it can even be faster than Mac Metal!
This is only the start!
Hope you will find all of this useful. Feel free to contact me on Medium.
I am using Substack only for the newsletter, where every week I give free links to my paid articles on Medium.
Follow me and Read my latest articles https://medium.com/@fabio.matricardi