What's Ralph and why do you care?
Ralph is all about automating the code generation process. You can use it to build small applications while you eat your lunch and build bigger applications while you sleep. Apart from the initial setup, the skills required are mostly those of a product manager; specifically, the ability to write a detailed requirements document.
Why do we need another Ralph blog post?
I found it hard to get going with Ralph because the existing content was either too theoretical or not practical enough. I figured it out in the end, but I thought I could write something to help other people get going faster, so that's what you're reading.
The what and why of Ralph
LLMs have a limited context window, which means they can only do a limited amount of reasoning. In turn, this means LLMs have problems generating code for large or complex projects. In my experience, once the prompt gets beyond a page or two, the quality falls off and code gen starts to miss things. The net result is, you need to have a human in the loop to code or to prompt; the human spots places where code gen has failed and prompts the LLM to fix the issues.
Ralph solves the problem by slicing the whole project into "bite-size requirements", each with its own acceptance criteria. If code gen for a requirement doesn't meet its acceptance criteria, Ralph tries again. In this way, it constructs the project step-by-step until it's built all the requirements and so delivers the complete project. The entire process is automated and there's no human involvement.
The Ralph loop gets its name from The Simpsons character Ralph Wiggum. If you've never watched The Simpsons, here's what you need to know: Ralph is well-meaning, but intellectually slow. Imagine you're instructing Ralph on how to build something. You'd break down the project into chunks and have Ralph run tests to make sure each chunk was correct before moving on to the next chunk. Ralph would build the project piece-by-piece until the whole thing was finished. This way might be slow, but you'd get it done right.
AIs and CLI
To get Ralph to work, you'll need to install a code gen CLI on your local machine. The most common tutorials I've seen on the web use the Claude CLI, so install this if you don't have an existing code gen solution. I got Ralph working with Cursor via the Cursor CLI, so I know that works too. Whatever AI you choose, you'll need an active subscription; you're not going to do this for free.
Skills
Next up, you'll need to install a skills file for your LLM. If this were a normal blog post, I'd tell you exactly where to go to get the skills file, but I'm not going to do that. The Ralph world is changing so quickly that any links I give you will be out of date by the time you read this. You'll need to search to get the latest version of the Ralph skills file you need.
(Skills enable code generation tools to do specific tasks. If you don't know what a skills file is in the context of a code-generating LLM, take some time to find out before moving ahead.)
Git for the LLM to use
As I'll explain later, Ralph uses git, so you'll need a git account and you'll need to create a repo for this project. I used my GitHub account, so I know GitHub works fine for this.
The Product Requirements Document (PRD)
This is where the fun starts. You need to write a Product Requirements Document using Markdown. The PRD lists all the requirements, each requirement being a "bite-sized chunk". Here's an excerpt from a PRD.md file on my system.
MBTA-002: How the app appears to users
Description
- The app will consist of three pages: "trains & alerts", "map & facilities", and "about".
- It will be possible for the user to easily navigate between pages (e.g. using a tab control or buttons).
Acceptance criteria
- There are three pages on the app: "trains & alerts", "map & facilities", and "about".
- On each page, the user can navigate to the other pages using a control, e.g. a tab control or buttons.
Here's what's going on
- The PRD consists of multiple sections like this one. Each section is a "bite-sized chunk" of functionality the LLM can generate code for. Think of the sections as individual requirements.
- The section (or requirement) title includes an ID (MBTA-002) and a descriptive title.
- The Description sub-section contains bullet points that describe the functionality you want. Remember, the point of the Ralph loop is to keep things simple, so keep the sub-section short.
- The Acceptance criteria sub-section states the criteria the generated code must pass. If the code passes, the LLM moves on to the next requirement. If it doesn't pass, the LLM repeats the code generation process (there's more to this that I'll discuss later).
Anyone with good Product Management skills should be able to quickly build a PRD like this.
(In practice, the Acceptance criteria sub-section looks a lot like the Description sub-section. What I do is write up the Description sub-section, then ask my LLM to add acceptance criteria based on my Description. I then add in any new acceptance criteria I can think of.)
PRD.md to JSON
The Ralph loop processes a JSON file, so the next step is the production of a JSON file from the PRD.md file. This is done using the skill you installed earlier. It's a simple call to a bash script; on my Cursor installation, the script is called convert.sh.
The output is a long JSON file consisting of multiple records. Each record is a requirement taken from the PRD.md file. Here's the JSON record for the requirement in the previous section.
{
  "id": "2.1",
  "category": "ui",
  "story": "Build base template with BosWay branding and navigation between three pages.",
  "steps": [
    "Header shows BosWay and page context e.g. BosWay - about (MBTA-001).",
    "Add tabs or buttons to switch trains & alerts, map & facilities, about (MBTA-002)."
  ],
  "acceptance": "Three routes work; every page can reach the other two; titles consistent with PRD.",
  "priority": 3,
  "passes": false,
  "notes": ""
},
This JSON record is so important, I'm going to ask you to take a closer look at it. You can see the Description (in story and steps) and the Acceptance criteria (in acceptance) here, albeit worded differently. The other three fields to look at are priority, passes, and notes.
- priority tells the Ralph loop what to work on next (start with the highest priority and work down).
- passes. This starts as false. If the LLM successfully implements the requirement, it sets this value to true.
- notes. This contains notes for the LLM on the next pass through the loop. If the generated code fails the Acceptance criteria, the notes field will contain details of the failure. On the next pass of the loop, the LLM uses these notes to try to do better. What generates these notes? The LLM.
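To make the record structure concrete, here's a minimal Python sketch that loads the requirements and reports how many have passed. The field names come from the record above; the function names are mine, and I'm assuming the file's top level is a JSON array of records, which may not match what your skill produces.

```python
import json

# Field names taken from the record above; everything else is illustrative.
REQUIRED_FIELDS = {"id", "story", "steps", "acceptance", "priority", "passes", "notes"}


def load_requirements(json_text):
    """Parse the requirements JSON and check each record has the expected fields.

    Assumption: the top level of the file is a JSON array of records.
    """
    records = json.loads(json_text)
    for record in records:
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"record {record.get('id', '?')} is missing {sorted(missing)}")
    return records


def progress(records):
    """Return (number of requirements passed, total number of requirements)."""
    done = sum(1 for record in records if record["passes"])
    return done, len(records)


if __name__ == "__main__":
    sample = """[
      {"id": "2.1", "category": "ui", "story": "Build base template.",
       "steps": ["Add tabs or buttons."], "acceptance": "Three routes work.",
       "priority": 3, "passes": false, "notes": ""}
    ]"""
    records = load_requirements(sample)
    print(progress(records))
```

A check like this is handy after the conversion step: if the LLM dropped a field from a record, you find out before the loop starts rather than halfway through it.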
Once you've generated the JSON file, you're ready to run the Ralph loop.
The Ralph loop
The Ralph loop takes the JSON file as input and processes the requirements one-by-one, starting with the most important. The bash file to do it is called start.sh on my system and it's a little complex. I'll talk through how it works at a high level, leaving out some advanced bits.
Before starting the loop, the code performs various checks, e.g. the JSON file exists, the git settings are correct and so on.
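As a rough illustration of these checks, here's a Python sketch (the real start.sh is bash and does more). The function name and the prd.json file name are placeholders I've made up; the git command is the standard way to ask whether a directory is inside a repo.

```python
import subprocess
from pathlib import Path


def preflight(project_dir, json_name="prd.json"):
    """Return a list of problems; an empty list means the loop can start.

    json_name is a hypothetical default; use whatever your skill generates.
    """
    project = Path(project_dir)
    if not project.is_dir():
        return [f"{project} is not a directory"]

    problems = []
    if not (project / json_name).is_file():
        problems.append(f"{json_name} not found in {project}")

    try:
        # `git rev-parse --is-inside-work-tree` succeeds only inside a git repo.
        result = subprocess.run(
            ["git", "rev-parse", "--is-inside-work-tree"],
            cwd=project,
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            problems.append(f"{project} is not inside a git repository")
    except FileNotFoundError:
        problems.append("git is not installed or not on PATH")

    return problems
```

Calling preflight(".") from the project folder returns an empty list when everything is in place, and a readable list of problems otherwise.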
The script then moves on to the Ralph loop. Because the Ralph loop does a lot, I'm going to break it down piece-by-piece.
- On each trip round the loop, the code starts with some checks. It checks if the process is rate-limited on the AI API or if there are other reasons why it can't continue.
- From the JSON file, it reads the requirement with the highest priority where passes is false.
- It passes this requirement to the AI API along with the current git code version.
- The AI generates code, or changes the existing code, to meet the requirement.
- The AI generates tests based on the acceptance criteria.
- If the tests pass, the AI updates the JSON passes field to true.
- If the tests fail, the AI may update the notes field to provide a hint on how to do better next time round. (Remember, the passes field is false by default, so the value doesn't change if the tests fail.)
- The loop saves the generated code to a local git branch.
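The steps above can be sketched in a few lines of Python. This is a simplification, not the real script: implement, run_tests, and commit are hypothetical stand-ins for the LLM CLI call, the test run, and the git commit, and here the driver updates passes and notes, whereas in the real loop the LLM does that itself. I'm also assuming a smaller priority number means "do this first"; flip the comparison if your skill uses the opposite convention.

```python
def next_requirement(records):
    """Pick the unfinished requirement to work on next, or None if all pass.

    Assumption: a smaller `priority` number means "do this first".
    """
    pending = [r for r in records if not r["passes"]]
    return min(pending, key=lambda r: r["priority"]) if pending else None


def ralph_loop(records, implement, run_tests, commit, max_iterations=20):
    """Work through the requirements until all pass or the iteration cap is hit.

    implement(record), run_tests(record) and commit(record) are hypothetical
    callables; run_tests returns a (passed, failure_notes) tuple.
    Returns the number of iterations used.
    """
    for iteration in range(1, max_iterations + 1):
        record = next_requirement(records)
        if record is None:
            return iteration - 1  # every requirement passes
        implement(record)  # the LLM can read record["notes"] from the last attempt
        passed, failure_notes = run_tests(record)
        if passed:
            record["passes"] = True
        else:
            record["notes"] = failure_notes  # hint for the next attempt
        commit(record)
    return max_iterations


if __name__ == "__main__":
    demo = [
        {"id": "2.1", "priority": 3, "passes": False, "notes": ""},
        {"id": "1.1", "priority": 1, "passes": False, "notes": ""},
    ]
    used = ralph_loop(
        demo,
        implement=lambda r: print("implementing", r["id"]),
        run_tests=lambda r: (True, ""),
        commit=lambda r: None,
    )
    print("iterations used:", used)
```

The point to notice is that a failed requirement stays at the front of the queue: because passes is still false, the next trip round the loop picks the same record again, this time with notes filled in.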
In the loop, there are some more advanced bits and pieces I'm going to briefly mention here that might be important to you:
- There are API call timeouts.
- You can set a maximum number of iterations to prevent the loop getting stuck and burning through your tokens.
- There's a circuit breaker that can stop the loop if zero files are changed or if the same error is detected on multiple loops.
- You can set a rate limit to prevent the LLM provider from banning you.
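The circuit breaker is the least obvious of these, so here's a minimal Python sketch of the idea. The class name and thresholds are my own placeholders, not values from any Ralph script: it trips after a run of loops that change no files, or after the same error shows up several loops in a row.

```python
class CircuitBreaker:
    """Trip when the loop stops making progress.

    The thresholds are hypothetical defaults, not values from any Ralph script.
    """

    def __init__(self, max_idle_loops=3, max_repeated_errors=3):
        self.max_idle_loops = max_idle_loops
        self.max_repeated_errors = max_repeated_errors
        self.idle_loops = 0
        self.last_error = None
        self.repeated_errors = 0

    def record(self, files_changed, error=None):
        """Record one loop iteration; return True if the loop should stop."""
        # Count consecutive iterations that changed no files.
        self.idle_loops = self.idle_loops + 1 if files_changed == 0 else 0

        # Count consecutive iterations that ended with the same error.
        if error is not None and error == self.last_error:
            self.repeated_errors += 1
        else:
            self.repeated_errors = 1 if error is not None else 0
        self.last_error = error

        return (
            self.idle_loops >= self.max_idle_loops
            or self.repeated_errors >= self.max_repeated_errors
        )
```

In use, you'd call record() once per trip round the loop with the number of files the LLM touched and any error text, and break out as soon as it returns True. Together with the maximum iteration count, this is what stops a stuck loop from quietly eating your allocation.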
When the Ralph loop finishes, you should have the code for your project. In practice, you'll need to tweak what you get back, but in my experience, you'll be very close.
How long the loop takes depends on the thoroughness of your PRD and the size of your project. As a general rule of thumb, a smallish project (e.g. building an interactive web app based on a simple data source) might take an hour.
Cost!
Ralph burns through API calls. Most LLM providers will give you a limited number of API calls per month which is separate from your token allocation. Even one Ralph project can burn through your entire API allocation. The bottom line is, Ralph can be an expensive thing to play with (low hundreds of USD to properly experiment). I suggest you think carefully about your projects and test Ralph in a considered way.
The reality
I've made it sound like the Ralph process is quite smooth. Right now, it isn't; there are bumps along the way. For example, the setup process is a little complicated, the Ralph loop reporting needs a bit of user-friendly tweaking, and the online descriptions aren't as helpful as they should be.
BUT.
It works and it works well.
My experience
It took me some effort to get Ralph up and running, but once I figured it out, it blew me away. It built an entire project without human intervention and it got it nearly right. Importantly, I realized the bits it missed were gaps in the PRD. In other words, I needed a better spec.
The Ralph loop changes the balance of skills in favor of a more detailed up-front spec that anyone with product management skills can write.
That's quite a profound change.
My recommendations
I do recommend you try a Ralph loop for yourself and I have some suggestions for making your experimentation easier.
- Allocate enough setup time to install skills etc. This can be frustrating, so be prepared.
- Choose a project you've done before. This means you know what the end result should be.
- Write a very detailed PRD as described above. Use an LLM to add acceptance criteria and add some of your own. Thoroughness here is key.
- Run the Ralph loop.
- Compare the Ralph results to your prior results.
Good luck!