Take a look at GPT4All; it's very user-friendly.
LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, and get hyped about the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.
Rules:
Rule 1 - No harassment or personal character attacks on community members, i.e. no name-calling, no generalizing about entire groups of people that make up our community, no baseless personal insults.
Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency, i.e. no comparing the usefulness of models to that of NFTs, no claiming the resource usage required to train a model is anything close to that of maintaining a blockchain or mining crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.
Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms, i.e. statements such as "LLMs are basically just simple text prediction like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>."
Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.
I like KoboldCpp. It is easy to set up and runs well on modest resources.
With something like that, you should be able to fit a much larger and better model into your RAM if you use the quantized versions. Look for models in GGUF format on Huggingface. Q4_K_M is a good compromise between size and quality.
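As a rough back-of-the-envelope check of whether a quantized model fits in your RAM (the ~4.85 bits-per-weight figure for Q4_K_M is an approximation, not an official spec, and exact file sizes vary by model):

```python
def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough GGUF file size estimate: parameter count times average
    bits per weight, converted to gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# Q4_K_M averages roughly 4.85 bits per weight (approximate figure)
print(f"13B at Q4_K_M: ~{estimate_gguf_size_gb(13e9, 4.85):.1f} GB")
print(f" 7B at Q4_K_M: ~{estimate_gguf_size_gb(7e9, 4.85):.1f} GB")
```

So a 13B model at Q4_K_M lands around 8 GB; budget a bit extra on top of that for the context/KV cache, which grows with your context size.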
Which model is best depends on your exact use case. I like Mythomax-L2-13b or Llama2-13B-Tiefighter for roleplay, and Mistral 7B (Dolphin 2.1 Mistral 7B) or Toppy-M for more factual things. All of those are uncensored.
Hope you had some success. Don't hesitate to ask if you have further questions.
As an alternative, you could look at distributed/shared inferencing. There's https://horde.koboldai.net/ (which you probably know), and petals.dev. I haven't tested those myself, though.