
For months, I have wrestled with a problem that has consumed my thoughts and challenged everything I know about software development.
This week I wrote about building the future with AI agents. One of the key areas for me is moving beyond prompt engineering to something more reliable.
I have spent decades learning how to craft reliable software. Now I want to bring that reliability to AI development.
Today I am ready to share what I have been building in the background.
It started with a game. It ended with something that could change how we build AI applications forever.
The Breaking Point
About a year ago, I began working on a new role-playing game for fun. My background is in game development, where I worked on two titles, Ealdorlight and Sol Trader. This time, I wanted to use AI to generate dynamic storylines and characters.
The initial progress was exhilarating. The AI created rich, compelling background stories for characters that I could never have written alone.
Then reality set in.
I found myself spending hours not building game features but tweaking prompts, desperately trying to maintain consistency in character responses.
Each small change to the game required hours of prompt engineering. The creative joy of game development gave way to an endless cycle of trial and error. I was not making a game anymore. I was becoming some kind of prompt babysitter.
A Familiar Pattern
All my software development experience screamed at me that something was wrong. We solved these problems in traditional software development decades ago. We have unit tests. Integration tests. Continuous integration. Why were we wrestling with prompts instead of writing tests?
The existing solutions felt incomplete. It feels like we are in the punchcard era of AI development. There are some early solutions: DSPy offers fascinating academic insights but requires significant expertise to implement. LangSmith and other services still leave the hardest parts of prompt engineering to developers. We have made AI accessible, but not truly usable for developers.
What if we could automate the entire prompt optimisation process? What if developers could write simple functions and let the AI handle the complexity of prompt engineering?
With that idea, Kaijo was born.
What Is Kaijo?
Kaijo makes AI development feel like normal software development. Write a function (just an API call). Add some tests (bring some examples, or generate them with AI). Let Kaijo handle the rest.
Behind the scenes, Kaijo continuously evaluates and optimises your AI functions. It works out from your examples what good looks like, and which prompt best produces that result. It can switch seamlessly to cheaper or different models, and it can test your prompts in parallel to find the model that gives the best balance of cost, speed and quality for what you need.
The result? AI functions that just work.
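To make the idea concrete, here is a minimal sketch of example-driven prompt selection. It is not Kaijo's API: the names `Example`, `Candidate`, `pass_rate`, `pick_best` and `fake_model` are all hypothetical, and the model call is a local stand-in so the code runs offline. It only shows the general shape of scoring prompt and model combinations against examples and preferring the cheaper one on a tie.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


# Hypothetical types for illustration only; none of this is Kaijo's real API.
@dataclass(frozen=True)
class Example:
    input: str
    expected: str


@dataclass(frozen=True)
class Candidate:
    prompt: str
    model: str
    cost_per_call: float  # assumed relative cost, lower is cheaper


# (model, prompt, input) -> output
ModelFn = Callable[[str, str, str], str]


def pass_rate(candidate: Candidate, examples: Sequence[Example], call_model: ModelFn) -> float:
    """Fraction of examples for which the candidate returns the expected output."""
    passed = sum(
        call_model(candidate.model, candidate.prompt, ex.input).strip() == ex.expected
        for ex in examples
    )
    return passed / len(examples)


def pick_best(candidates: Sequence[Candidate], examples: Sequence[Example], call_model: ModelFn) -> Candidate:
    """Highest pass rate wins; ties go to the cheaper candidate."""
    return max(
        candidates,
        key=lambda c: (pass_rate(c, examples, call_model), -c.cost_per_call),
    )


def fake_model(model: str, prompt: str, text: str) -> str:
    """Stand-in for a real model call: uppercases only if the prompt asks for it."""
    return text.upper() if "uppercase" in prompt else text


examples = [Example("hi", "HI"), Example("ok", "OK")]
candidates = [
    Candidate("Echo the text", "big-model", 1.0),
    Candidate("Return the text in uppercase", "big-model", 1.0),
    Candidate("Return the text in uppercase", "small-model", 0.1),
]
best = pick_best(candidates, examples, fake_model)
```

In this toy run both uppercase candidates pass every example, so the selection falls to the cheaper `small-model`. A real system would also need fuzzier scoring than exact string equality.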
Kaijo Enables The “12 Factor Agents” Approach
The industry is beginning to recognise that successful AI applications share common principles. Gone are the days when you use a big prompt, a loop and a bag of tools and hope for the best.
One set of guidelines is the “12 Factor Agents” way of building agents. Kaijo plays very nicely with this approach:
- Natural Language to Tool Calls: Instead of wrestling with raw prompts, Kaijo transforms natural language into structured tool calls, making AI interactions predictable and testable.
- Structured Outputs: Every AI interaction in Kaijo produces structured, predictable outputs that integrate seamlessly with your existing codebase.
- State Management: You call Kaijo at any point within your business logic, making AI functions behave like any other part of your application. You manage state, workflow and RAG as before.
These principles ensure that AI development with Kaijo feels familiar to any software developer, while handling the unique challenges of AI applications.
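As a rough illustration of why structured outputs matter, the sketch below parses a model's JSON reply into a typed value and rejects anything malformed. The `CharacterReply` type, the `ALLOWED_MOODS` set and the `parse_reply` helper are assumptions made up for this example, loosely based on the game characters mentioned earlier. They are not part of Kaijo or of the 12 Factor Agents guidelines.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for a game character's reply; an assumption for illustration.
ALLOWED_MOODS = {"friendly", "neutral", "hostile"}


@dataclass(frozen=True)
class CharacterReply:
    speaker: str
    mood: str
    line: str


def parse_reply(raw: str) -> CharacterReply:
    """Turn raw model output into a typed value, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = {"speaker", "mood", "line"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    mood = data["mood"]
    if not isinstance(mood, str) or mood not in ALLOWED_MOODS:
        raise ValueError(f"unexpected mood: {mood!r}")

    return CharacterReply(speaker=str(data["speaker"]), mood=mood, line=str(data["line"]))


reply = parse_reply('{"speaker": "Aldric", "mood": "friendly", "line": "Well met."}')
```

Because the rest of the application only ever sees a `CharacterReply` or a `ValueError`, the AI call can be unit tested like any other function.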
See It In Action
On Friday 2nd May, from 2pm to 6pm UK time, I will attempt to build an entire AI application live on stream. We are going to build a cheatsheet generator that creates personalised study guides from any text. Try as I might, I haven’t found anything on the internet that does this yet. Using Kaijo, it should be much easier to build.
The stream will demonstrate how Kaijo transforms AI development. You will see how the hardest part of AI development becomes the easiest. No prompt engineering required.
AI Development For The Rest Of Us
Kaijo represents more than just a tool. It represents a future where developers can focus on building applications, not wrestling with prompts. Where AI is just another reliable component in our software stack.
By embracing software engineering principles that have stood the test of time and adapting them for the AI era, we are creating a foundation for the next generation of AI applications.
Early access to Kaijo opens very soon. You can sign up for the waitlist at kaijo.ai, and join my newsletter below for more notes about the journey.
The future of AI development should not belong to AI experts. It should belong to regular developers who want to build amazing things. Let us make that future together.