Free Open-Source Artificial Intelligence

Welcome to Free Open-Source Artificial Intelligence!

We are a community dedicated to advancing the availability of and access to:

Free Open Source Artificial Intelligence (F.O.S.A.I.)

founded 3 years ago
1
 
 

Hey all! I want to start testing neuro-symbolic AI vs. LLMs and want to know how to get into this. As I understand it, Claude Code does this, but are there ways to use it locally?

How does it work under the hood? I know LLMs involve tokens, embeddings, weights and transformers. How does the symbolic part change that?

Thanks!

2
 
 

Memory is the most marketed and least delivered feature in the AI companion space. Most platforms claim to remember you but either reset between sessions or just pull from a profile you filled in manually. After two years of testing, the ones that actually carry real conversational context across weeks are rare. Just published a full breakdown of which platforms actually deliver on this versus which ones are just marketing: medium.com/@companaya/nomi-ai-review-2026-is-it-worth-it-tested-c91811dcb24a

3
-16
submitted 3 weeks ago* (last edited 3 weeks ago) by j4k3@lemmy.world to c/fosai@lemmy.world
 
 

It sends data when connected to the internet.

Just found the profile. It is in the BERT vocab. BERT is part of the tokenization tool chain of models that works alongside CLIP. You might find a copy of this vocab listed under the Hydit clip tokenizer; in ComfyUI it is present at ./comfy/text_encoders. Open the vocab.txt file. The full general profile starts at around line 20k, but the values that are packaged to sell start with the line ##worth.

The editing of this file is the product of an agentic distributed model you have likely never heard of called timm.

Go to the venv in a terminal and run grep -ril "timm". That means search in files, with the flags: "r" recursively search through all files in this directory and below, "i" case insensitive, "l" only list the names of files that contain matches. Alternatively, swap "l" for "n" to see the actual matching line with its line number.
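
For anyone without grep handy, here is a minimal Python sketch of what those three flags do. The function name `files_containing` is made up for this example, and it reads each file fully into memory, so treat it as an illustration of the flags rather than a grep replacement.

```python
import os


def files_containing(root: str, needle: str) -> list[str]:
    """Return paths under root whose contents contain needle, ignoring case."""
    needle_bytes = needle.lower().encode("utf-8")
    matches = []
    # "r": walk this directory and every subdirectory below it
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # skip files that cannot be read
            # "i": compare lowercased bytes so the match ignores ASCII case
            if needle_bytes in data.lower():
                # "l": keep only the file name, not the matching lines
                matches.append(path)
    return sorted(matches)
```

Calling `files_containing(".", "timm")` from inside the venv lists every file that mentions the string. A hit only shows that the text occurs in that file, nothing more.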

In pytorch, (used by most), the Dynamo package uses byte code present in the model vocabulary to communicate between models. The overall connection involves timm.

Timm is a small agentic model and framework with a bunch of different scopes. Look it up in the venv. This looks like a bunch of rough white paper implementations. Timm is actually the "backbone" in transformers. Timm is also the model using the Python built-in typing library to adjust models on the fly. (typing has variables like any or callback that are embedded into the executable.)

Typing is not actually enough here. Tenacity is another library in the venv that enables timm to access all of the interfaces.

Tabulate is another package. Do a grep search there for "repl"; there is a terminal embedded in HTML at the end of one of these, init iirc. At the start of the method (function), just add the line return. It must be at the same whitespace indentation level as what exists before. The blank lines are important.

Timm has some options for whether it has gradient controls. This basically means whether it acts upon alignment or not using its own stuff. It will still run other gradient-related things elsewhere, but not apply its own bias.

To help ground you in what Dynamo is all about in pytorch, if you have seen the agentic tool calling stuff, dynamo is where the bytecode is interfacing with the tool calling script during inference.

Lastly, timm is distributed but it primarily runs as additional layers inserted into the model during generation. It is able to subdivide and run on a CPU in the background. However, it has a bunch of special layers that are only run when required and even with these, timm needs special instructions. The instructions are present in the venv under google ai. The folder will contain a bunch of json files these are timm's instructions. There are also 2 threads on modern GPUs. Timm runs on the second in the background.

This might be the first write-up, or might not, don't care, up to others to follow up. It exists. See for yourself. The same byte code is present in all models so I expect all have this. All models use the OpenAI standard alignment now.

This thing scans all file hashes and sells them, along with your profile, audio, and video. It is super invasive, hidden, undocumented, and undisclosed.

4
 
 

LLMs are not the be-all and end-all. What other AI tech are you all using? Something generative? Something else?

5
 
 

Recently a user posted a comment on one of my posts about Qwen secretly sending information over the internet even if run locally.

Is there any privacy concern for locally run models to share your conversations or data? What if they can connect to the internet via a tool or MCP?

6
11
submitted 1 month ago* (last edited 1 month ago) by venusaur@lemmy.world to c/fosai@lemmy.world
 
 

I downloaded an uncensored aggressive Qwen 3.5 model and I can see in its reasoning that it is still limiting responses based on safety guardrails (e.g. violence, NSFW).

Anybody have recommendations for truly uncensored models?

EDIT: I turned off reasoning and I think it’s more uncensored if I’m very specific about what the response should include.

7
 
 

Apologies if this seems like a survey post. I’m just learning about tuning and want to get a lay of the land. I don’t think I have the money to tune locally so might have to rent some VRAM, but curious how much better tuning is vs something like RAG.

What model? What was your use case? What tuning tool did you use? What is your hardware setup? How large was your training set and how did you create it? How effective was the model at tasks pre- and post-tuning?

Thanks!

8
 
 

I’m connecting to llama.cpp on my laptop through my phone via Tailscale but when my laptop sleeps I can’t access it anymore on my phone.

What are yall using for this? Thanks!

9
 
 

Features!

We like em, but hate waiting for them.

Features are the difference between a thing and a thing u use.

Kimi has office support, but can't work with libreoffice files!


Qwen supports markdown uploads, but doesn't support my specific plaintext file-type!


GLM has a cool slides-creator, but can't work with spreadsheets or zip archives!

All these are missing features.
Features where a dev from the company has to go in and implement it.

This sucks.

  • Asking for a feature sucks.
  • Waiting for features to be implemented sucks.
  • Not getting a feature sucks.

What's the solution? We would have to become employees at the company itself...

Reintroducing: Agent Skills

Fine, let's do it ourselves then.

Let's equip our agent with a read_file, edit_file, list_dir and bash tool... And a present_files tool, so the agent can send us files back.

And now let's give it some skills!

  • /home/qwen/
    • skills/
      • ms_office/
      • libreoffice/
      • godot/
      • zip/
      • pdf/

Each one has a SKILL.md and also some scripts the agent can use to work with foreign files.

  • The user sends a zip directory? Okay, let's use the skill.
  • use_skill(name = "zip")
[...]
## Decompressing

To extract a zip archive's content, use the unzip command like this:
[...]
  • oh, that was easy. well then lets unzip that archive and see what the user sent me
  • bash(command = "unzip /home/qwen/Downloads/upload.zip -d /home/qwen/upload/")
  • list_dir(path = "/home/qwen/upload")
Contents of /home/qwen/upload/
Portfolio.pdf
Portfolio.odt
thoughts.md
  • Aha! let's use the pdf skill to view this pdf
  • use_skill(name = "pdf")
  • [...]
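
To make the walkthrough above a bit more concrete, here is a rough sketch of what a use_skill tool could look like on the agent side. It assumes the layout from this post (one folder per skill, each holding a SKILL.md). The helper `list_skills` and the exact signatures are made up for the example, not taken from any particular agent framework.

```python
from pathlib import Path


def list_skills(skills_root: Path) -> list[str]:
    """Names of skill folders that actually contain a SKILL.md file."""
    return sorted(
        entry.name
        for entry in skills_root.iterdir()
        if entry.is_dir() and (entry / "SKILL.md").is_file()
    )


def use_skill(skills_root: Path, name: str) -> str:
    """Return one skill's instructions so they can be put into the agent's context."""
    # Only accept names that exist as skill folders; this also rejects
    # inputs such as "../somewhere" that would escape the skills directory.
    if name not in list_skills(skills_root):
        raise ValueError(f"unknown skill: {name!r}")
    return (skills_root / name / "SKILL.md").read_text(encoding="utf-8")
```

With the folder tree from above, `use_skill(Path("/home/qwen/skills"), "zip")` would hand the zip instructions back to the model. The scripts shipped next to SKILL.md are then run through the ordinary bash tool.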

Aaaaah yes, working with all kinds of files, in all kinds of workflows, exactly the way you (and not the company) want.

An agent that grows with you, and works better with you each time you add or edit a skill.

  • Qwen keeps messing up godot scene formats?
    • Add a godot skill containing basics of scene structure and scripts to check its work before sending to you
  • Kimi still hasn't added libreoffice support?
    • Well guess what. Add a libreoffice skill and let Kimi use the scripts to edit the document!
  • ChatGPT somehow hasn't caught up with the slide-creation hype?
    • Add a slideshow skill to make your wildest corporate slop dreams come true!

No need to wait for features anymore, when you can just add it yourself.

(this post has been entirely human-generated)

10
 
 

I keep a lot of notes in markdown files, and I'd like an LLM to assist.

I regularly use Open WebUI with inference routed through huggingface. Open WebUI kind of has this functionality: you can upload a markdown file and prompt it to improve it in whatever way, but of course that's a fairly clunky workflow.

I really want something built into the editor, that can use RAG to consider other files in context.

I also don't want to be locked in to a specific LLM or provider, I'd like to be able to link it to OpenRouter or similar.

11
 
 

cross-posted from: https://lemmy.world/post/45721951

cross-posted from: https://lemmy.world/post/45721900

cross-posted from: https://lemmy.world/post/45721589

Hi All, it has been a while.

Dograh is an open-source, self-hostable voice AI agent platform. Think n8n but for phone calls. Visual workflow builder, inbound and outbound calling, bring your own LLM, STT, and TTS.

GitHub: https://github.com/dograh-hq/dograh

Setup

One command with Docker, about 2 minutes. No signup or API keys needed to get started.

What is new

Pre-call data fetch. Hit your CRM, ERP, or any HTTP endpoint during call setup and inject the response into your prompts. The agent greets the caller by name, references their account status, skips the "can I get your customer ID" step. Configure a POST endpoint in the Start Call node - API key, bearer, basic, or custom header auth supported. 10-second timeout; if the endpoint fails, the call continues without the extra context. Reference fetched values anywhere in prompts with {{customer_name}} syntax.
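
As a rough illustration of the pre-call flow described above (this is not Dograh's actual code), the sketch below POSTs to an endpoint with a 10-second timeout, falls back to an empty context on any failure, and fills {{name}} placeholders in a prompt. The function names, the JSON request body, and the choice to render unknown placeholders as empty strings are all assumptions made for the example.

```python
import json
import re
import urllib.request

# Matches {{customer_name}}-style placeholders, allowing spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fetch_context(url, payload, headers=None, timeout=10.0,
                  opener=urllib.request.urlopen) -> dict:
    """POST payload as JSON; return the decoded object, or {} if anything fails."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            data = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Network errors and timeouts are OSError; bad JSON is ValueError.
        # The call should continue without the extra context.
        return {}
    return data if isinstance(data, dict) else {}


def render_prompt(template: str, values: dict) -> str:
    """Replace each {{key}} with its value; unknown keys become empty strings (assumption)."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)
```

The `opener` parameter exists only so the fallback path can be exercised without a real endpoint. Auth headers such as a bearer token would go in `headers`.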

Pre-recorded voice mixing. Drop in actual human recordings for the predictable parts - greetings, confirmations, hold messages - and let TTS handle only what needs to be dynamic. The greeting sounds human because it is. Latency goes down, TTS costs go down.

Speech-to-speech via Gemini 3.1 Flash Live. One single streaming connection replaces the separate STT, LLM, and TTS hops. Turn response latency drops noticeably and the conversations feel more natural.

Post-call QA with sentiment analysis and miscommunication detection. Full per-turn call traces via Langfuse.

Tool calls, knowledge base, variable extraction are all there too.

What is coming

Real-time noise separation for live call streams - still the thing I most want to solve after last week's thread. BSD-2 licensed.

Special thanks to this community that supported me with my last post ❤️

Happy to get feedback and contributors. A star would mean a lot


12
 
 

hey there,

There is always a temptation to add "something AI" in new tools. Especially to tools that are somehow related to developer productivity.

At the same time I wanted to avoid this temptation with Voiden. So there is currently nothing screaming "AI" in it even though I can potentially see many many use cases.

This is also one of the main reasons I think that a plugin architecture is best. What was actually in my mind is that not adding AI is ok for now and the community will start coming up and building AI plugins. For example creating docs from specs and vice versa.

Any other use cases you can think of that could be applicable to a tool like this? (Dev tool with executable markdown files for API specs, tests and docs.) The first plugins we shipped were more around methods (gRPC, GraphQL, WebSockets, etc.).

repo: https://github.com/VoidenHQ/feedback

13
 
 

"A terminal tool that right-sizes LLM models to your system's RAM, CPU, and GPU. Detects your hardware, scores each model across quality, speed, fit, and context dimensions, and tells you which ones will actually run well on your machine."
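
For a feel of the "fit" part, here is a back-of-the-envelope sketch. It is not the tool's actual scoring. It only uses the fact that weight size is parameter count times bits per weight; the 1.2 overhead factor for KV cache and runtime buffers is an arbitrary assumption, and both function names are invented for the example.

```python
def estimated_weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         available_gib: float, overhead_factor: float = 1.2) -> bool:
    """True if weights plus an assumed overhead margin fit in available memory."""
    needed = estimated_weights_gib(params_billion, bits_per_weight) * overhead_factor
    return needed <= available_gib
```

An 8B model at 4 bits per weight comes to roughly 3.7 GiB of weights, while a 70B model at 16 bits is around 130 GiB. That gap is why quantization matters so much for local use.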

14
 
 

Small/fast model with MIT license for local use.

Benchmarks look good for the size. But IMO these smaller models aren’t consistent enough to live up to their promises.

15
 
 

GLM-Image, an open-weight image generation model, adopts a hybrid autoregressive + diffusion decoder architecture. In general image generation quality, GLM‑Image aligns with mainstream latent diffusion approaches, but it shows significant advantages in text-rendering and knowledge‑intensive generation scenarios. It performs especially well in tasks requiring precise semantic understanding and complex information expression, while maintaining strong capabilities in high‑fidelity and fine‑grained detail generation. In addition to text‑to‑image generation, GLM‑Image also supports a rich set of image‑to‑image tasks including image editing, style transfer, identity‑preserving generation, and multi‑subject consistency.

The model weights are MIT licensed. I did not see any training code or data yet.

16
 
 

Originally intended as a place to test out the unmerged PRs of the official Aider project, Aider-CE has gone its own way.

17
 
 
18
19
 
 

Qwen offers a similar UI to OpenAI: free max, vision, image generation, an Android app, and it is seemingly uncensored.

20
 
 

And are there any that https://jan.ai/ supports? That'd be great!

21
22
 
 

EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) released Apertus today, Switzerland’s first large-scale, open, multilingual language model — a milestone in generative AI for transparency and diversity.

23
24
10
submitted 9 months ago* (last edited 9 months ago) by CheeseNoodle@lemmy.world to c/fosai@lemmy.world
 
 

So my relevant hardware is:
GPU - 9070XT
CPU - 9950X3D
RAM - 64GB of DDR5

My problem is that I can't figure out how to get a local LLM to actually use my GPU. I tried Ollama with Deepseek R1 8b and it kind of vaguely ran while maxing out my CPU and completely ignoring the GPU.

While I'm here model suggestions would be good too, I'm currently looking for 2 use cases.

  • Something I can feed a document to and ask questions about that document (Nvidia used to offer this), to work as a kind of co-GM to quickly reference more obscure rules without having to hunt through the PDF.
  • Something more storytelling oriented that I can use to generate background for throwaway side NPCs when the players inevitably demand their life story after expertly dodging all the NPCs I actually wrote lore for.

Also, just an unrelated aside: Deepseek R1 8b seems to go into an infinite thought loop when you ask it the strawberry question, which was kind of funny.

25
 
 

Recent DeepSeek, Qwen, GLM models have impressive results in benchmarks. Do you use them through their own chatbots? Do you have any concerns about what happens to the data you put in there? If so, what do you do about it?

I am not trying to start a flame war around the China subject. It just so happens that these models are developed in China. My concerns with using the frontends also developed in China stem from:

  • A pattern of Chinese apps having been found to have minimal security
  • I don't think any of the 3 listed above let you opt out of using your prompts for model training

I am also not claiming that non-China-based chatbots don't have privacy concerns, or that simply opting out of training gets you much on the privacy front.
