this post was submitted on 15 May 2026
184 points (99.5% liked)
Aneurysm Posting
3842 readers
232 users here now
For shitposting by people who can smell burnt toast.
Rules:
- Nothing promoting crypto, blockchain or NFTs.
- Nothing right wing.
- Nothing anti science.
- No tankie support.
- No TERFS.
- No porn.
- Must tag AI posts as such.
founded 2 years ago
you are viewing a single comment's thread
view the rest of the comments
Semantic Vectors don't work that way.
Yeah, if words were actually encoded as one-hot vectors this would be pretty trivial, but then the rest of LLM training would be somewhere between infeasible and impossible. The dense embedding vectors models actually use obscure spelling even more.
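To make the contrast concrete, here's a toy sketch (not a real tokenizer or model, just an assumed illustration): with one-hot character vectors, counting letters is a single dot product, while a dense embedding has no coordinate you can read spelling off of.

```python
import numpy as np

alphabet = "abcdefghijklmnopqrstuvwxyz"

def one_hot(ch):
    # One coordinate per letter: the vector *is* the spelling.
    v = np.zeros(len(alphabet))
    v[alphabet.index(ch)] = 1.0
    return v

word = "strawberry"
# Summing one-hot character vectors gives letter counts directly,
# so "how many r's" is just a dot product:
r_count = sum(one_hot(c) for c in word) @ one_hot("r")
print(int(r_count))  # 3

# A real LLM instead sees a few opaque dense token vectors, e.g.:
rng = np.random.default_rng(0)
dense = rng.normal(size=(2, 768))  # stand-ins for "straw" + "berry"
# No individual coordinate corresponds to a letter, so spelling has
# to be learned (imperfectly) rather than read off the representation.
```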
Side note: last time I checked, current embedding vectors were approximately 40-dimensional... has that gone up significantly in the last couple of years?
A fair bit. EmbeddingGemma is open-weights and supports anywhere from 128 to 768 dimensions.
It's not as simple as more dimensions = better, though, because of size, efficiency, and context-rot trade-offs.
Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings - Google Developers Blog - https://developers.googleblog.com/en/introducing-embeddinggemma/
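The way a single model covers that 128-768 range (per the blog post, via Matryoshka Representation Learning) is roughly: keep the first k coordinates of the full vector and re-normalize. A minimal sketch, with a random stand-in vector instead of real model output:

```python
import numpy as np

rng = np.random.default_rng(0)
full = rng.normal(size=768)        # stand-in for a full 768-dim embedding
full /= np.linalg.norm(full)       # unit-normalized, as embeddings usually are

def truncate(v, k):
    # Matryoshka-style truncation: the leading dims are trained to be
    # usable on their own; re-normalize so cosine similarity still works.
    t = v[:k]
    return t / np.linalg.norm(t)

small = truncate(full, 128)
print(small.shape)  # (128,)
```

Smaller truncations trade some retrieval quality for a 6x smaller index and faster similarity search, which is the size/efficiency trade-off mentioned above.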