18

So you don't have to click the link, here's the full text including links:

Some of my favourite @huggingface models I've quantized in the last week (as always, original models are linked in my repo so you can check out any recent changes or documentation!):

@shishirpatil_ gave us gorilla's openfunctions-v2, a great followup to their initial models: https://huggingface.co/bartowski/gorilla-openfunctions-v2-exl2

@fanqiwan released FuseLLM-VaRM, a fusion of 3 architectures and scales: https://huggingface.co/bartowski/FuseChat-7B-VaRM-exl2

@IBM used a new method called LAB (Large-scale Alignment for chatBots) for our first interesting 13B tune in a while: https://huggingface.co/bartowski/labradorite-13b-exl2

@NeuralNovel released several, but I'm a sucker for DPO models, and this one uses their Neural-DPO dataset: https://huggingface.co/bartowski/Senzu-7B-v0.1-DPO-exl2

Locutusque, who has been making the Hercules dataset, released a preview of "Hyperion": https://huggingface.co/bartowski/hyperion-medium-preview-exl2

@AjinkyaBawase gave an update to his coding models with code-290k based on deepseek 6.7: https://huggingface.co/bartowski/Code-290k-6.7B-Instruct-exl2

@Weyaxi followed up on the success of Einstein v3 with, you guessed it, v4: https://huggingface.co/bartowski/Einstein-v4-7B-exl2

@WenhuChen with TIGER lab released StructLM in 3 sizes for structured knowledge grounding tasks: https://huggingface.co/bartowski/StructLM-7B-exl2

and that's just the highlights from this past week! If you'd like to see your model quantized and I haven't noticed it somehow, feel free to reach out :)

[-] noneabove1182@sh.itjust.works 12 points 3 months ago* (last edited 3 months ago)

Colour me intrigued. I want more manufacturers that go against the norm. If they put out a generic slab with normal specs at an expected price, I won't be very interested, but if they do something cool I'm all for it

Except I just noticed the part where it's developed by Meizu, so never mind, it will probably be a generic Chinese phone

[-] noneabove1182@sh.itjust.works 15 points 3 months ago

Stop making me want to buy more graphics cards...

Seriously though this is an impressive result, "beating" gpt3.5 is a huge milestone and I love that we're continuing the trend. Will need to try out a quant of this to see how it does in real world usage. Hope it gets added to the lmsys arena!

18
submitted 3 months ago* (last edited 3 months ago) by noneabove1182@sh.itjust.works to c/localllama@sh.itjust.works

PolyMind is a multimodal, function calling powered LLM webui. It's designed to be used with Mixtral 8x7B + TabbyAPI and offers a wide range of features including:

Internet searching with DuckDuckGo and web scraping capabilities.

Image generation using comfyui.

Image input with sharegpt4v (Over llama.cpp's server)/moondream on CPU, OCR, and Yolo.

Port scanning with nmap.

Wolfram Alpha integration.

A Python interpreter.

RAG with semantic search for PDF and miscellaneous text files.

Plugin system to easily add extra functions that are able to be called by the model.

90% of the web parts (HTML, JS, CSS, and Flask) are written entirely by Mixtral.

15

Open source

Open data

Open training code

Fully reproducible and auditable

Pretty interesting stuff for embeddings. I'm going to try it for my RAG pipeline when I get a chance; I've not had as much success as I was hoping, so maybe this English-focused one will help

[-] noneabove1182@sh.itjust.works 9 points 3 months ago

Yeah q2 logic is definitely a sore point, I'd highly recommend going with Mistral dolphin 2.6 DPO instead, the answers have been very high quality for a 7b model

But good info for anyone wanting to keep up to date on very low bit rate quants!

6

Thanks to Charles for the conversion scripts, I've converted several of the new internLM2 models into Llama format. I've also made them into ExLlamaV2 while I was at it.

You can find them here:

https://huggingface.co/bartowski?search_models=internlm2

Note: the chat models seem to do something odd, outputting [UNUSED_TOKEN_145] in a way that seems equivalent to <|im_end|>. Not sure why, but it works fine despite outputting that at the end.
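If the stray token bothers you, one option is to cut it off in post-processing. This is just a minimal sketch, not something from the conversion scripts: `trim_stop_markers` and `STOP_MARKERS` are names I made up for illustration, and it assumes the marker shows up as literal text in the decoded output.

```python
# Hypothetical helper: treat both markers as end-of-turn and drop
# everything from the first one onward. Assumes the marker appears
# as plain text in the decoded string.
STOP_MARKERS = ("[UNUSED_TOKEN_145]", "<|im_end|>")


def trim_stop_markers(text: str, markers=STOP_MARKERS) -> str:
    """Return text cut at the earliest stop marker, if any is present."""
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


if __name__ == "__main__":
    print(trim_stop_markers("Sure, here you go![UNUSED_TOKEN_145]"))
```

If your backend lets you pass custom stop strings, setting both markers there is probably cleaner than trimming after the fact.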

25

Based off of deepseek coder, the current SOTA 33B model, allegedly has gpt 3.5 levels of performance, will be excited to test once I've made exllamav2 quants and will try to update with my findings as a copilot model

16

Paper abstract:

Recent work demonstrates that, after being fine-tuned on a high-quality instruction dataset, the resulting model can obtain impressive capabilities to address a wide range of tasks. However, existing methods for instruction data generation often produce duplicate data and are not controllable enough on data quality. In this paper, we extend the generalization of instruction tuning by classifying the instruction data to 4 code-related tasks and propose a LLM-based Generator-Discriminator data process framework to generate diverse, high-quality instruction data from open source code. Hence, we introduce CodeOcean, a dataset comprising 20,000 instruction instances across 4 universal code-related tasks, which is aimed at augmenting the effectiveness of instruction tuning and improving the generalization ability of fine-tuned model. Subsequently, we present WaveCoder, a fine-tuned Code LLM with Widespread And Versatile Enhanced instruction tuning. This model is specifically designed for enhancing instruction tuning of Code Language Models (LLMs). Our experiments demonstrate that Wavecoder models outperform other open-source models in terms of generalization ability across different code-related tasks at the same level of fine-tuning scale. Moreover, Wavecoder exhibits high efficiency in previous code generation tasks. This paper thus offers a significant contribution to the field of instruction data generation and fine-tuning models, providing new insights and tools for enhancing performance in code-related tasks.

23

Available in instruct only currently:

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2

14

Early speculation is that it's an MoE (mixture of experts) of 8 7b models, so maybe not earth shattering like their last release but highly intriguing, will update with more info as it comes out

20
submitted 6 months ago* (last edited 6 months ago) by noneabove1182@sh.itjust.works to c/localllama@sh.itjust.works
18

LMSYS examines how improper data decontamination can lead to artificially inflated scores
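For anyone who hasn't looked at decontamination before, here's a rough sketch of the kind of word n-gram overlap check that often gets used for it. This is my own illustration, not LMSYS's method: `ngrams` and `overlap_ratio` are made-up names, and the whitespace tokenization and `n=3` are arbitrary assumptions.

```python
# Naive contamination check: what fraction of a test sample's word
# n-grams also appear in a training sample? Illustration only.
def ngrams(text: str, n: int) -> set:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(train_sample: str, test_sample: str, n: int = 3) -> float:
    test_grams = ngrams(test_sample, n)
    if not test_grams:
        return 0.0  # test sample shorter than n words
    return len(test_grams & ngrams(train_sample, n)) / len(test_grams)


if __name__ == "__main__":
    test_q = "write a function that reverses a linked list"
    exact = "write a function that reverses a linked list"
    reworded = "implement linked list reversal as a function"
    print(overlap_ratio(exact, test_q))     # verbatim copy
    print(overlap_ratio(reworded, test_q))  # same task, different wording
```

A verbatim copy scores as a full match, while a reworded version of the same question can share no n-grams at all, so a check like this can let leaked test data through.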

[-] noneabove1182@sh.itjust.works 14 points 8 months ago

My biggest problem with vaping is that there's basically no distinction made between ecigarettes that this article addresses and vaping dry herbs.. would love to read up on it and any possible health concerns but rarely see it discussed

[-] noneabove1182@sh.itjust.works 9 points 10 months ago

For me it's gotta be immich, it replicates Google photos SO well and it's all local and self hosted, absolutely floored by how great it is

For browsing my photos on my device I use Aves which is also a great app, especially since it's the only app I've ever found that handles Sony burst format properly

[-] noneabove1182@sh.itjust.works 14 points 10 months ago

The article doesn't address it, maybe someone here can.. what does "consumed" mean? Where does the water go after it's used to cool? Surely it's reusable, right?

[-] noneabove1182@sh.itjust.works 9 points 10 months ago

Another question now, how do the hinges in the new foldables feel? We've had some good competition in that space so I'm hoping we see some refinement from Samsung this year. Which of the two would you most like to daily drive?

[-] noneabove1182@sh.itjust.works 24 points 10 months ago

Thanks for joining us here in our new home, so happy to have big names help to validate it!

I'd love to hear your thoughts on the watches, it feels like the 5 was kind of an incremental upgrade, and it's looking like 6 might be similar, anything that's not captured on the spec sheet that makes it a worthwhile upgrade?

[-] noneabove1182@sh.itjust.works 8 points 10 months ago* (last edited 10 months ago)

lollms-webui is the jankiest of the images, but that one's newish to the scene and I'm working with the dev a bit to get it nicer (main current problem is the requirement for CLI prompts which he'll be removing) Koboldcpp and text-gen are in a good place though, happy with how those are running

[-] noneabove1182@sh.itjust.works 14 points 10 months ago

for me it's painfully obvious when a phone is 60hz vs 120hz, i run mine at 120 and my wife doesn't care and runs at 60.. so yeah obviously some people just do not care or can't see it, others like me need it to be high refresh haha

[-] noneabove1182@sh.itjust.works 13 points 10 months ago* (last edited 10 months ago)

Yeah there's definitely been some egregious recall issues, but the problem is the stats include minor things that only required a quick OTA, so it skews the numbers awkwardly and means we can't properly judge the real problems they had

If they separated the numbers, we might see that either Tesla has very few real recalls, Tesla actually does have a lot of real recalls but also happens to have software ones, or it's about normal

And without separating all we can do is guess

[-] noneabove1182@sh.itjust.works 39 points 10 months ago* (last edited 10 months ago)

100%, this number is skewed by the fact that Tesla will basically "recall" for any minor issue because it's a simple software update. I imagine a lot of companies try to avoid recalls as aggressively and for as long as possible because it's a significantly bigger burden on them

I say this as someone who drives a Tesla but is still extremely judgemental of Tesla


noneabove1182

joined 11 months ago