this post was submitted on 17 May 2026

im so sleepy.

agents driven by language models (LMs) call functions to do stuff. Functions like these:

  • read_file(path, from, to)
  • write(path, content)
  • list_dir(path, show_hidden = false)
  • edit_file(path, old_string, new_string)

this is not a simplification btw.

so far, LMs were told to generate any of these formats to call a function:

  • json, which sucks cuz of c-escape
  • xml which sucks cuz of closing tags
  • yaml which sucks cuz multiline strings have to be indented (sucks for LMs)
  • just bash which sucks cuz of security

wow these all have problems with something, hm?
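to make the json point concrete, here is a tiny python sketch. it is not from the original post: the sample string and the helper name escape_overhead are made up for illustration. it counts how many extra characters json string escaping adds to a file body the LM wants to write.

```python
import json


def escape_overhead(s):
    """Count the extra characters JSON string escaping adds to s."""
    # json.dumps wraps the string in two quotes; those are not escapes
    return len(json.dumps(s)) - len(s) - 2


# hypothetical file content: two newlines, two quotes, a tab, a backslash
content = 'line one\nsay "hi"\tand a backslash \\ too\n'

# this is what the LM would have to generate inside a json tool call
encoded = json.dumps({"path": "file.txt", "content": content})
print(encoded)

# each of the six special characters costs one extra character
print(escape_overhead(content))  # 6
```

character counts are only a rough stand-in for tokens, since the real cost depends on the tokenizer.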

worst of all: we wanna save tokens wherever possible. so if an LM has to generate a full </parameter> for each argument in a function, that adds up quick

Introducing: my new format

wowie let's have a look at this format!

[write path="file.txt" content=

content of file here
horray newlines
no c escape! cool, i can regex all I want [\s\S]*

]

now isnt that simple?

  • no string escape problems
  • no xml-closing tags
  • no json-brace-foolery
  • no... | symbols for multiline strings
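for the harness side, here is a minimal python sketch of how such a call could be parsed. the post does not spell out a grammar, so everything here is an assumption: quoted key="value" pairs on the first line, an optional last key ending in = followed by a raw body, and a ] alone on its own line closing the call. the names CALL_RE, ATTR_RE and parse_calls are made up for this example.

```python
import re

# assumed grammar, not confirmed by the post
CALL_RE = re.compile(
    r'^\[(?P<name>\w+)'                      # [write
    r'(?P<attrs>(?:[ \t]+\w+="[^"\n]*")*)'   # path="file.txt"
    r'(?:[ \t]+(?P<key>\w+)=[ \t]*\n'        # content= then a newline
    r'(?P<body>[\s\S]*?)\n)?'                # raw body, nothing escaped
    r'\][ \t]*$',                            # closing ] at end of line
    re.MULTILINE,
)
ATTR_RE = re.compile(r'(\w+)="([^"\n]*)"')


def parse_calls(text):
    """Return a list of (name, args) tuples for every call found in text."""
    calls = []
    for m in CALL_RE.finditer(text):
        args = dict(ATTR_RE.findall(m.group("attrs")))
        if m.group("key") is not None:
            # assumption: blank lines padding the body are not part of the value
            args[m.group("key")] = m.group("body").strip("\n")
        calls.append((m.group("name"), args))
    return calls


sample = r'''some chatter before the call

[write path="file.txt" content=

content of file here
horray newlines
no c escape! cool, i can regex all I want [\s\S]*

]

[list_dir path="."]
'''

for name, args in parse_calls(sample):
    print(name, args)
```

under these assumptions a ] in the middle of a body line is harmless, but a body line that consists of nothing except ] would end the call early.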

now, of course, this is a new format. so language models suck at generating it, right?

WRONG

even a local 2-bit quant of qwen 3.6 35B-A3B aligned to it super easily.

and! even a dense Qwen3 4B model at Q4 quant worked with it flawlessly. I'm tired and need to sleep.

now congratulate me! say "hooray wow ur such a genius ohmygod we are gonna save so many tokens and thus möney".

go, go ahead. im not gonna ask an LM to do it, that much is clear.

or, even better: tell me what SUCKS about this, im always open for critical feedback.

id rather be wrong than believe im right all the time.

[–] Smorty@lemmy.blahaj.zone 2 points 3 days ago (1 children)

thanks for sharing, but the goal here was not to make yet another key-value format, but to have natural-feeling multiline strings with minimal escaping.

thats why i settled for the code blocks: they tokenize well, are universally understood as "this is some text", are easy to write and rarely ever have to deal with escape sequences.

in this case, the [ and ] also serve as tool-call delimiters, which would usually be some heavy XML ones like <functions> </functions>... thats 6 - 10 tokens down the drain for delimiters! >o< aaaaa

thank u for engaging with the post btw. i really appreciate it <3

[–] gandalf_der_12te@lemmy.blahaj.zone 1 points 1 day ago (1 children)

if you are looking to embed code into free-flowing text output, you can go with a format such as this:


this is some free-flowing text output from the LLM to write to a file.

> write("filename", "content")
> someotherfunction()

this is more free-flowing text

you write the commands with indentations (>) and basically ignore every other line. some languages do it that way. i think PHP uses opening tags to denote where code starts/ends, the rest is just ignored by the PHP interpreter.
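a minimal python sketch of that idea, assuming a command is any line whose first character is > and that everything else is free text to ignore. the function name extract_commands is made up for this example, and the commands are returned as plain strings without being parsed or executed.

```python
def extract_commands(output):
    """Return the lines that start with '>' with the marker stripped."""
    commands = []
    for line in output.splitlines():
        if line.startswith(">"):
            # drop the marker and the whitespace around the command
            commands.append(line[1:].strip())
    return commands


llm_output = """this is some free-flowing text output from the LLM to write to a file.

> write("filename", "content")
> someotherfunction()

this is more free-flowing text
"""

print(extract_commands(llm_output))
```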

[–] Smorty@lemmy.blahaj.zone 1 points 1 day ago

I... think u didnt read my comment.

for me, the point was to have natural feeling multiline strings, but yesyes, if those are not a concern, ur format very much rules ~