naultic 3 hours ago [-]
I'm working on something a little similar, but mine's more a dev tool than process automation. I love where yours is headed. The biggest issue I've run into is handling retries with agents. My current solution is to have them set checkpoints so they can revert easily; when they can't make an edit or can't get a test passing, they just restart from the earlier state. Problem is this burns a lot of tokens on retries. How did you handle this issue in your app?
jawiggins 3 hours ago [-]
Generally I've found agents are capable of self-correcting as long as they can bash up against a guardrail and see the errors. So in optio the agent is resumed and told to fix any CI failures or address review feedback.
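The resume-on-failure loop described here can be sketched roughly as below. `run_checks` and `ask_agent_to_fix` are hypothetical callables (not a real optio API): one runs CI and returns its output, the other stands in for resuming the agent with that output as its prompt.

```python
def fix_until_green(run_checks, ask_agent_to_fix, max_rounds=5):
    """Resume the agent with concrete failure output until checks pass.

    run_checks() -> (ok, output); ask_agent_to_fix(prompt) resumes the
    coding agent. Both are stand-ins for illustration only.
    """
    for _ in range(max_rounds):
        ok, output = run_checks()
        if ok:
            return True
        # the guardrail: hand the agent the literal errors, not a summary
        ask_agent_to_fix(f"CI failed, fix these errors:\n{output}")
    return False
```

The key difference from checkpoint-and-restart is that the agent keeps its context and only sees the delta (the error output), rather than replaying the whole task.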
denysvitali 4 hours ago [-]
FWIW, a "cheaper" version of this is triggering Claude via GitHub Actions and `@claude`ing your agents like that. If you run your CI on Kubernetes (ARC), it's pretty much the same thing.
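A minimal sketch of that trigger, assuming the `anthropics/claude-code-action` GitHub Action; the runner label and secret name are placeholders, and on ARC the job would land on a self-hosted runner pod:

```yaml
name: claude
on:
  issue_comment:
    types: [created]
jobs:
  claude:
    # only wake the agent when someone writes "@claude" in a comment
    if: contains(github.event.comment.body, '@claude')
    runs-on: self-hosted   # e.g. an ARC (Actions Runner Controller) runner
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```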
MrDarcy 5 hours ago [-]
Looks cool, congrats on the launch. Is there any sandbox isolation from the k8s platform layer? Wondering if this is suitable for multiple tenants or customers.
jawiggins 5 hours ago [-]
Oh good question, I haven't thought deeply about this.
Right now nothing special happens, so claude/codex can access their normal tools and make web calls. I suppose that also means they could figure out they're running in a k8s pod and do service discovery and start calling things.
What kind of features would you be interested in seeing around this? Maybe a toggle to disable internet connections or other connections outside of the container?
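One shape that toggle could take, sketched as a hypothetical Kubernetes NetworkPolicy (the pod label and allowed ports are placeholders, and this assumes a CNI that enforces NetworkPolicy): default-deny egress for agent pods, with only DNS allowed.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-sandbox-deny-egress
spec:
  podSelector:
    matchLabels:
      role: agent-sandbox      # placeholder label for agent pods
  policyTypes: ["Egress"]
  egress:
    - to:
        - namespaceSelector: {}   # allow DNS lookups inside the cluster
      ports:
        - protocol: UDP
          port: 53
```

This would also block the service-discovery scenario above, since the agent couldn't reach other in-cluster services unless you add explicit allow rules.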
antihero 5 hours ago [-]
And what stops it making total garbage that wrecks your codebase?
jawiggins 5 hours ago [-]
There are a few things:
a) you can create CI/build checks that run in GitHub, and the agents will make sure they pass before merging anything
b) you can configure a review agent with any prompt you'd like to make sure any specific rules you have are followed
c) you can disable all the auto-merge settings and review all the agent code yourself if you'd like.
kristjansson 4 hours ago [-]
> to make sure
you've really got to be careful with absolute language like this in reference to LLMs. A review agent provides no guarantees whatsoever, just shifts the distribution of acceptable responses, hopefully in a direction the user prefers.
jawiggins 4 hours ago [-]
Fair, it's something like a semantic enforcement rather than a hard one. I think current AI agents are good enough that if you tell it, "Review this PR and request changes anytime a user uses a variable name that is a color", it will do a pretty good job. But for complex things I can still see them falling short.
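To contrast the two enforcement styles: a rule that simple could also be a deterministic lint rather than an LLM review. A toy sketch (the color list and the "+"-prefixed diff format are assumptions for illustration):

```python
import re

COLORS = {"red", "green", "blue", "purple", "orange", "yellow"}
# match an assignment on an added ("+") diff line
ASSIGN = re.compile(r"^\+\s*([A-Za-z_]\w*)\s*=")

def color_named_vars(diff: str):
    """Return variable names in added lines that are color words."""
    hits = []
    for line in diff.splitlines():
        m = ASSIGN.match(line)
        if m and m.group(1).lower() in COLORS:
            hits.append(m.group(1))
    return hits
```

The LLM review earns its keep on the rules you can't regex, at the cost of the guarantees discussed above.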
SR2Z 1 hour ago [-]
I mean, having unit tests and not allowing PRs in unless they all pass is pretty easy (or requiring human review to remove a test!).
A software engineer takes a spec which "shifts the distribution of acceptable responses" for their output. If they're 100% accurate (snort), how good does an LLM have to be for you to accept its review as reasonable?
upupupandaway 5 hours ago [-]
Ticket -> PR -> Deployment -> Incident
abybaddi009 2 hours ago [-]
Does this support skills and MCP?
conception 4 hours ago [-]
What’s the most complicated, finished project you’ve done with this?
jawiggins 4 hours ago [-]
Recently I used it to finish up my re-implementation of curl/libcurl in Rust (https://news.ycombinator.com/item?id=47490735). At first I tried having a single Claude Code session run in an iterative loop, but eventually I found it was way too slow.
I started tasking subagents with each remaining chunk of work, and found I was really just recreating a normal sprint tasking cycle, except subagents completed the tasks with the unit tests as exit criteria. That's where optio came in: I asked an agent to run the test suite, see what was failing, and make tickets for each group of remaining failures. Then I used optio to manage instances of agents working on and closing out each ticket.
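The ticketing step described above can be sketched in a few lines. This assumes pytest-style `path::test_name` failure ids and groups them one ticket per module, with the failing tests as the exit criteria; the ticket shape is purely illustrative, not a real optio schema.

```python
from collections import defaultdict

def failures_to_tickets(failing_tests):
    """Group failing test ids by module into one ticket per module."""
    groups = defaultdict(list)
    for test_id in failing_tests:
        module = test_id.split("::", 1)[0]
        groups[module].append(test_id)
    return [
        {"title": f"Fix failing tests in {module}",
         "exit_criteria": tests}   # the unit tests are the exit criteria
        for module, tests in sorted(groups.items())
    ]
```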
hmokiguess 4 hours ago [-]
the misaligned columns in the claude made ASCII diagrams on the README really throw me off, why not fix them?
jawiggins 4 hours ago [-]
Should be fixed now :)
rafaelbcs 5 hours ago [-]
[dead]
QubridAI 5 hours ago [-]
[flagged]
knollimar 5 hours ago [-]
I don't want to accuse you of being an LLM but geez this sounds like satire