im so sleepy.
agents driven by language models (LMs) call functions to do stuff. Functions like these:
- read_file(path, from, to)
- write(path, content)
- list_dir(path, show_hidden = false)
- edit_file(path, old_string, new_string)
this is not a simplification btw.
so far, LMs were told to generate any of these formats to call a function:
- json, which sucks cuz of c-style escaping (every quote and newline inside a string has to be escaped)
- xml which sucks cuz of closing tags
- yaml which sucks cuz of multiline strings being indented (sucks for LMs)
- just bash which sucks cuz of security
wow these all have problems with something, hm?
worst of all: we wanna save tokens wherever possible. so if an LM has to generate a full </parameter> for each argument in a function, that adds up quick
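to make the overhead point a bit more concrete, here is a rough Python sketch that serializes the same write call as JSON and as XML and counts how many characters go to syntax instead of the actual argument values. Character count is only a crude stand-in for token count (a real comparison would need the model's tokenizer), and the call shapes here (`as_json`, `as_xml`, the `<call>` and `<parameter>` tag names, the sample payload) are made up for illustration, not any particular vendor's format.

```python
import json

# Hypothetical payload: a tiny file with quotes, a newline, and a backslash.
PATH = "file.txt"
CONTENT = 'print("hello")\nprint("a\\b")'


def as_json(path: str, content: str) -> str:
    """JSON-style call; json.dumps escapes quotes, newlines, and backslashes."""
    return json.dumps(
        {"name": "write", "arguments": {"path": path, "content": content}}
    )


def as_xml(path: str, content: str) -> str:
    """XML-style call with one closing tag per argument.

    Tag names are invented for this sketch, and entity escaping
    (&, <, >) is left out to keep it short.
    """
    return (
        '<call name="write">\n'
        f'<parameter name="path">{path}</parameter>\n'
        f'<parameter name="content">{content}</parameter>\n'
        "</call>"
    )


def overhead(serialized: str, path: str, content: str) -> int:
    """Characters spent on syntax rather than on the argument values."""
    return len(serialized) - len(path) - len(content)


if __name__ == "__main__":
    for label, fn in (("json", as_json), ("xml", as_xml)):
        out = fn(PATH, CONTENT)
        print(label, overhead(out, PATH, CONTENT), "chars of overhead")
```

the JSON version pays for every quote, newline, and backslash in the content, so its overhead grows with the file, while the XML version pays a fixed cost per argument for the closing tags.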
Introducing: my new format
wowie let's have a look at this format!
[write
path="file.txt"
content=
content of file here
horray newlines
no c escape! cool, i can regex all I want [\s\S]*
]
now isnt that simple?
- no string escape problems
- no xml-closing tags
- no json-brace-foolery
- no... | symbols for multiline strings
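for anyone wondering how you would read this back, here is a minimal Python parser sketch. The post does not spell out the grammar, so everything here is an assumption: the first line is `[name`, short arguments are `key="value"` on one line, a bare `key=` starts a raw block that runs to the end of the call, and the call ends with a line containing only `]`. Unquoted values like `show_hidden = false` are not handled. `parse_call` is a hypothetical helper name.

```python
import re

# Assumed grammar, not confirmed by the post:
HEADER = re.compile(r"^\[(\w+)$")        # "[write"
INLINE = re.compile(r'^(\w+)="(.*)"$')   # path="file.txt"
BLOCK = re.compile(r"^(\w+)=$")          # "content=" starts a raw block


def parse_call(text: str) -> tuple[str, dict[str, str]]:
    """Parse one call into (function name, arguments)."""
    lines = text.strip("\n").split("\n")
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("expected '[name' on the first line")
    if lines[-1] != "]":
        raise ValueError("expected ']' on the last line")

    args: dict[str, str] = {}
    body = lines[1:-1]
    for i, line in enumerate(body):
        if (m := BLOCK.match(line)) is not None:
            # Everything after "key=" is taken verbatim, no unescaping.
            args[m.group(1)] = "\n".join(body[i + 1:])
            break
        if (m := INLINE.match(line)) is not None:
            args[m.group(1)] = m.group(2)
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return header.group(1), args


EXAMPLE = r"""
[write
path="file.txt"
content=
content of file here
horray newlines
no c escape! cool, i can regex all I want [\s\S]*
]
"""

if __name__ == "__main__":
    name, args = parse_call(EXAMPLE)
    print(name, args["path"])
    print(args["content"])
```

because only the final line is treated as the terminator, a `]` inside the content is fine for a single call. Splitting several calls out of one stream would need an extra rule, which this sketch does not attempt.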
now, of course, this is a new format. so language models suck at generating it, right?
WRONG
even a local 2-bit quant of qwen 3.6 35B-A3B aligned to it super easily.
and! even a dense Qwen3 4B model at Q4 quant worked with it flawlessly. I'm tired and need to sleep.
now congratulate me! say "horray wow ur such a genius ohmygod we are gonna save so many tokens and thus möney".
go, go ahead. im not gonna ask an LM to do it, that much is clear.
or, even better: tell me what SUCKS about this, im always open for critical feedback.
id rather be wrong than believe im right all the time.
that is neat, I'm gonna stop playing modded minecraft and work on this instead. it sounds like a lot of fun!