Introducing Kaijo: AI functions that just work

April 2025

kaijo

For months, I have wrestled with a problem that has consumed my thoughts and challenged everything I know about software development.

This week I wrote about building the future with AI agents. One of the key areas for me is moving beyond prompt engineering to something more reliable.

I have spent decades learning how to craft reliable software. Now I want to bring that reliability to AI development.

Today I am ready to share what I have been building in the background.

It started with a game. It ended with something that could change how we build AI applications forever.

The Breaking Point

About a year ago, I began working on a new role playing game for fun (my background is in game development working on two different titles, Ealdorlight and Sol Trader). This time, I wanted to use AI to generate dynamic storylines and characters.

The initial progress was exhilarating. The AI created rich, compelling background stories for characters that I could never have written alone.

Then reality set in.

I found myself spending hours not building game features, but tweaking prompts. Desperately trying to maintain consistency in character responses.

Each small change to the game required hours of prompt engineering. The creative joy of game development gave way to an endless cycle of trial and error. I was not making a game anymore. I was becoming some kind of prompt babysitter.

A Familiar Pattern

All my software development experience screamed at me that something was wrong. We solved these problems in traditional software development decades ago. We have unit tests. Integration tests. Continuous integration. Why were we wrestling with prompts instead of writing tests?

The existing solutions felt incomplete. It feels like we are in the punchcard era of AI development. There are some early solutions: DSPy offers fascinating academic insights but requires significant expertise to implement. LangSmith and other services still leave the hardest parts of prompt engineering to developers. We have made AI accessible, but not truly usable for developers.

What if we could automate the entire prompt optimisation process? What if developers could write simple functions and let the AI handle the complexity of prompt engineering?

With that idea, Kaijo was born.

What Is Kaijo?

Kaijo makes AI development feel like normal software development. Write a function (just an api call). Add some tests (bring some examples, or generate them with AI). Let Kaijo handle the rest.

Behind the scenes, Kaijo continuously evaluates and optimises your AI functions. It figures out from your examples what good looks like, and what the best prompt is to get that result. It can do this using cheaper models or different models seamlessly, and can test your prompts in parallel to find the combination of the cheapest fastest and best model to get what you need.

The result? AI functions that just work.

Kaijo Enables The “12 Factor Agents” Approach

The industry is beginning to recognise that successful AI applications share common principles. Gone are the days when you use a big prompt, a loop and a bag of tools and hope for the best.

One set of guidelines is the “12 Factor Agents” way of building agents. Kaijo plays very nicely with this approach:

  1. Natural Language to Tool Calls: Instead of wrestling with raw prompts, Kaijo transforms natural language into structured tool calls, making AI interactions predictable and testable.

  2. Structured Outputs: Every AI interaction in Kaijo produces structured, predictable outputs that integrate seamlessly with your existing codebase.

  3. State Management: You call Kaijo at any point within your business logic, making AI functions behave like any other part of your application. You manage state, workflow and RAG as before.

These principles ensure that AI development with Kaijo feels familiar to any software developer, while handling the unique challenges of AI applications.

See It In Action

On Friday 2nd May 2pm to 6pm UK time, I will attempt to build an entire AI application live on stream. We are going to be building a cheatsheet generator that creates personalised study guides from any text. Try as I might I haven’t found anything on the internet that does this yet, and using Kaijo, this will be much easier to build.

The stream will demonstrate how Kaijo transforms AI development. You will see how the hardest part of AI development becomes the easiest. No prompt engineering required.

AI Development For The Rest Of Us

Kaijo represents more than just a tool. It represents a future where developers can focus on building applications, not wrestling with prompts. Where AI is just another reliable component in our software stack.

By embracing software engineering principles that have stood the test of time and adapting them for the AI era, we are creating a foundation for the next generation of AI applications.


Early access to Kaijo opens very soon. You can sign up for the waitlist at kaijo.ai, and join my newsletter below for more notes about the journey.

The future of AI development should not belong to AI experts. It should belong to regular developers who want to build amazing things. Let us make that future together.


More articles

Building the Future

A diagram of the future of AI agents

Something has been on my mind for months. The rapid evolution of AI agents has opened up possibilities I cannot ignore.

We are witnessing the emergence of semi autonomous agents that will fundamentally reshape how we work and communicate. The opportunities in this space are extraordinary. I am diving deeper into this world of AI agent development and product creation.

My newsletter is evolving. Instead of dispensing tips from a position of authority, I invite you on a journey of discovery. I will document my experiences building with AI, how to apply my tech experience in a new world, and navigating the inevitable struggles and setbacks.

Read on for several key areas I am exploring.

Read more

The Reality of AI Power Usage

AI Power Usage

AI power usage generates significant controversy. Headlines paint it as an environmental catastrophe waiting to happen. The reality proves more nuanced and potentially more optimistic than these dire warnings suggest.

A ChatGPT query uses 10 times more energy than a Google search. This sounds alarming until one realises it equates to running your hairdryer for six seconds. The entire data centre industry, including all AI operations, accounts for just 1.5% of global electricity consumption.

Here is a rundown of the more pressing issues with AI power usage.

Read more

AI Therapists: Self Reflection With AI

“I am not ready for this conversation.” We all know that feeling. The endless mental rehearsal. The anxiety building with each imagined scenario.

But what if I told you that AI could be your practice partner for both difficult conversations and deeper self-reflection?

AI as a therapeutic practice partner

Read more

Coding with AI: How To Do It Well And What This Means

I am shipping AI-first production code every day. Not experimental features. Not throwaway prototypes. Real, deployed, mission-critical code powering Cherrypick’s tens of thousands of users.

Social media overflows with “vibe coding” demonstrations. These flashy but superficial examples show AI apparently conjuring perfect code in seconds. The reality of professional AI-assisted development runs much deeper. Real production work with AI is messier, more nuanced, and demands rigorous thinking, but very effective.

This is not about magical code generation. It is about a new way of thinking about development. It requires substantial real-world development experience to do well: the onus is upon those of us with this experience to teach the next generation how to harness these tools effectively.

This is how I am doing it, what it all might mean, and how we can help others find the way.

Read more

Prompting Sucks (And What We Can Do About It)

Prompting sucks. If you have spent any time working with LLMs, you already know this.

It is not just that prompting is difficult - it is fundamentally broken as an approach to working with AI. It is brittle, model-specific, and endlessly repetitive.

Here is why we need to move beyond prompting, and what we can do about it.

Read more