1
61
submitted 6 months ago by ylai@lemmy.ml to c/fosai@lemmy.world
2
27
submitted 6 months ago by ylai@lemmy.ml to c/fosai@lemmy.world
3
10
4
9
mistral-8x7b-chat (huggingface.co)

A very capable chat model built on top of the new Mistral MoE model, trained on the SlimOrca dataset for 1 epoch, using QLoRA.

5
4
6
12
submitted 7 months ago by anonymoose@lemmy.ca to c/fosai@lemmy.world

cross-posted from: https://lemmy.ca/post/10994517

Sorry if this isn't relevant to the community, but couldn't think of anywhere better to post. I saw something curious in my RSS comics feed last night for the Abstruse Goose comic. The author is fairly prolific and used to post comics based on math, technology, etc. His site and archive of comics has now been replaced with a single cryptic message:

"AGI will not be designed by humans. It will be evolved through relentless evolutionary computational processes designed by humans."

Very curious! Anybody have any theories on what is going on? I can't imagine what his motivation might be :)

7
4

Large language models (LLMs) exhibit amazing performance on a wide variety of tasks such as text modeling and code generation. However, they are also very large. For example Llama 2 70B has 70 billion parameters that require 140GB of memory to store in half precision. This presents many challenges, such as needing multiple GPUs just to serve a single LLM. To address these issues, researchers have developed compression methods that reduce the size of models without destroying performance.

One class of methods, post-training quantization, compresses trained model weights into lower precision formats to reduce memory requirements. For example, quantizing a model from 16 bit to 2 bit precision would reduce the size of the model by 8x, meaning that even Llama 2 70B would fit on a single 24GB GPU. In this work, we introduce QuIP#, which combines lattice codebooks with incoherence processing to create state-of-the-art 2 bit quantized models. These two methods allow QuIP# to significantly close the gap between 2 bit quantized LLMs and unquantized 16 bit models.

Project Page: https://cornell-relaxml.github.io/quip-sharp/

Code: https://github.com/Cornell-RelaxML/quip-sharp

8
48
submitted 7 months ago* (last edited 7 months ago) by ylai@lemmy.ml to c/fosai@lemmy.world

Nitter “original” with magnet link: https://nitter.net/MistralAI/status/1733150512395038967

9
10
submitted 7 months ago* (last edited 7 months ago) by ylai@lemmy.ml to c/fosai@lemmy.world
10
13
11
2
12
8
Fine Tuning Mistral 7B on Magic the Gathering Draft (generallyintelligent.substack.com)

cross-posted from: https://derp.foo/post/467324

There is a discussion on Hacker News, but feel free to comment here as well.

13
12
14
18
submitted 7 months ago by tinwhiskers@lemmy.world to c/fosai@lemmy.world
15
12
submitted 7 months ago* (last edited 7 months ago) by Even_Adder@lemmy.dbzer0.com to c/fosai@lemmy.world

SeamlessM4T


SeamlessM4T is our foundational all-in-one Massively Multilingual and Multimodal Machine Translation model delivering high-quality translation for speech and text in nearly 100 languages.

SeamlessM4T models support the tasks of:

  • Speech-to-speech translation (S2ST)
  • Speech-to-text translation (S2TT)
  • Text-to-speech translation (T2ST)
  • Text-to-text translation (T2TT)
  • Automatic speech recognition (ASR)

🌟 We are releasing SemalessM4T v2, an updated version with our novel UnitY2 architecture. This new model improves over SeamlessM4T v1 in quality as well as inference latency in speech generation tasks.

To learn more about the collection of SeamlessM4T models, the approach used in each, their language coverage and their performance, visit the SeamlessM4T README or 🤗 Model Card

Code: https://github.com/facebookresearch/seamless_communication

16
13
submitted 7 months ago* (last edited 7 months ago) by Even_Adder@lemmy.dbzer0.com to c/fosai@lemmy.world

SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation. A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo

Key Takeaways:

  • SDXL Turbo achieves state-of-the-art performance with a new distillation technology, enabling single-step image generation with unprecedented quality, reducing the required step count from 50 to just one.

  • See our research paper for specific technical details regarding the model’s new distillation technique that leverages a combination of adversarial training and score distillation.

  • Download the model weights and code on Hugging Face, currently being released under a non-commercial research license that permits personal, non-commercial use.

  • Test SDXL Turbo on Stability AI’s image editing platform Clipdrop, with a beta demonstration of the real-time text-to-image generation capabilities

Model weights and code: https://huggingface.co/stabilityai/sdxl-turbo

Demo: https://clipdrop.co/stable-diffusion-turbo

Paper: https://stability.ai/research/stability-ai-adversarial-diffusion-distillation

Blogpost: https://stability.ai/news/stability-ai-sdxl-turbo

17
10
submitted 7 months ago by j4k3@lemmy.world to c/fosai@lemmy.world

I'm curious what it is doing from a top down perspective.

I've been playing with a 70B chat model that has several datasets on top of Llama2. There are some unusual features somewhere in this LLM and I am not sure what was trained versus (unusual layers?). The model has built in roleplaying stories I've never seen other models perform. These stories are not in the Oobabooga Textgen WebUI. The model can do stuff like a Roman Gladiator, and some NSFW stuff. These are not very realistic stories and play out with the depth of a child's videogame. They are structured rigidly like they are coming from a hidden system context.

Like with the gladiators story it plays out like Tekken on the original PlayStation. No amount of dialogue context about how real gladiators will change the story flow. Like I tried modifying by adding how gladiators were mostly nonlethal fighters and showmen more closely aligned with the wrestler-actors that were popular in the 80's and 90's, but no amount of input into the dialogue or system contexts changed the story from a constant series of lethal encounters. These stories could override pretty much anything I added to system context in Textgen.

There was one story that turned an escape room into objectification of women, and another where name-1 is basically like a Loki-like character that makes the user question what is really happening by taking on elements in system context but changing them slightly. Like I had 5 characters in system context and it shifted between them circumstantially in a story telling fashion that was highly intentional with each shift. (I know exactly what a bad system context can do, and what errors look like in practice, especially with this model. I am 100% certain these are either (over) trained or programic in nature. Asking the model to generate a list of built in roleplaying stories creates a similar list of stories the couple of times I cared to ask. I try to stay away from these "built-in" roleplays as they all seem rather poorly written. I think this model does far better when I write the entire story in system context. One of the main things the built in stories do that surprise me is maintaining a consistent set of character identities and features throughout the story. Like the user can pick a trident or gladius, drop into a dialogue that is far longer than the batch size and then return with the same weapon in the next fight. Normally, I expect that kind of persistence would only happen if the detail was added to the system context.

Is this behavior part of some deeper layer of llama.cpp that I do not see in the Python version or Textgen source, like is there an additional persistent context stored in the cache?

18
6
submitted 7 months ago* (last edited 7 months ago) by Even_Adder@lemmy.dbzer0.com to c/fosai@lemmy.world

Abstract

The Large Vision-Language Model (LVLM) has enhanced the performance of various downstream tasks in visual-language understanding. Most existing approaches encode images and videos into separate feature spaces, which are then fed as inputs to large language models. However, due to the lack of unified tokenization for images and videos, namely misalignment before projection, it becomes challenging for a Large Language Model (LLM) to learn multi-modal interactions from several poor projection layers. In this work, we unify visual representation into the language feature space to advance the foundational LLM towards a unified LVLM. As a result, we establish a simple but robust LVLM baseline, Video-LLaVA, which learns from a mixed dataset of images and videos, mutually enhancing each other. Video-LLaVA achieves superior performances on a broad range of 9 image benchmarks across 5 image question-answering datasets and 4 image benchmark toolkits. Additionally, our Video-LLaVA also outperforms Video-ChatGPT by 5.8%, 9.9%, 18.6%, and 10.1% on MSRVTT, MSVD, TGIF, and ActivityNet, respectively. Notably, extensive experiments demonstrate that Video-LLaVA mutually benefits images and videos within a unified visual representation, outperforming models designed specifically for images or videos.

Paper: https://arxiv.org/abs/2311.10122

Code: https://github.com/PKU-YuanGroup/Video-LLaVA

Demo: https://huggingface.co/spaces/LanguageBind/Video-LLaVA

19
6
StyleTTS 2 (styletts2.github.io)
submitted 7 months ago* (last edited 7 months ago) by Even_Adder@lemmy.dbzer0.com to c/fosai@lemmy.world

Abstract

In this paper, we present StyleTTS 2, a text-to-speech (TTS) model that leverages style diffusion and adversarial training with large speech language models (SLMs) to achieve human-level TTS synthesis. StyleTTS 2 differs from its predecessor by modeling styles as a latent random variable through diffusion models to generate the most suitable style for the text without requiring reference speech, achieving efficient latent diffusion while benefiting from the diverse speech synthesis offered by diffusion models. Furthermore, we employ large pre-trained SLMs, such as WavLM, as discriminators with our novel differentiable duration modeling for end-to-end training, resulting in improved speech naturalness. StyleTTS 2 surpasses human recordings on the single-speaker LJSpeech dataset and matches it on the multispeaker VCTK dataset as judged by native English speakers. Moreover, when trained on the LibriTTS dataset, our model outperforms previous publicly available models for zero-shot speaker adaptation. This work achieves the first human-level TTS synthesis on both single and multispeaker datasets, showcasing the potential of style diffusion and adversarial training with large SLMs.

Paper: https://arxiv.org/abs/2306.07691

Code: https://github.com/yl4579/StyleTTS2

Colab: https://colab.research.google.com/github/yl4579/StyleTTS2/blob/main/

20
-1

Abstract

LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models. It maintains a skill repository of pre-trained vision and vision-language models and can activate relevant tools based on users' inputs to fulfill real-world tasks. LLaVA-Plus is trained on multimodal instruction-following data to acquire the ability to use tools, covering visual understanding, generation, external knowledge retrieval, and compositions. Empirical results show that LLaVA-Plus outperforms LLaVA in existing capabilities and exhibits new ones. It is distinct in that the image query is directly grounded and actively engaged throughout the entire human-AI interaction sessions, significantly improving tool use performance and enabling new scenarios.

Paper: https://arxiv.org/abs/2311.05437

Code: https://github.com/LLaVA-VL/LLaVA-Plus-Codebase

Demo: https://llavaplus.ngrok.io/

Dataset: https://huggingface.co/datasets/LLaVA-VL/llava-plus-data

Model: https://llava-vl.github.io/llava-plus/

21
-1
submitted 7 months ago* (last edited 7 months ago) by Even_Adder@lemmy.dbzer0.com to c/fosai@lemmy.world
22
1
submitted 8 months ago* (last edited 8 months ago) by Blaed@lemmy.world to c/fosai@lemmy.world

Llama 2 & WizardLM Megathread

Starting another model megathread to aggregate resources for any newcomers.

It's been awhile since I've had a chance to chat with some of these models so let me know some your favorites in the comments below.

There are many to choose from - sharing your experience could help someone else decide which to download for their use-case.

Thread Models:


Quantized Base Llama-2 Chat Models

Llama-2-7b-Chat

GPTQ

GGUF

AWQ


Llama-2-13B-chat

GPTQ

GGUF

AWQ


Llama-2-70B-chat

GPTQ

GGUF

AWQ


Quantized WizardLM Models

WizardLM-7B-V1.0+

GPTQ

GGUF

AWQ


WizardLM-13B-V1.0+

GPTQ

GGUF

AWQ


WizardLM-30B-V1.0+

GPTQ

GGUF

AWQ


Llama 2 Resources

LLaMA 2 is a large language model developed by Meta and is the successor to LLaMA 1. LLaMA 2 is available for free for research and commercial use through providers like AWS, Hugging Face, and others. LLaMA 2 pretrained models are trained on 2 trillion tokens, and have double the context length than LLaMA 1. Its fine-tuned models have been trained on over 1 million human annotations.

Llama 2 Benchmarks

Llama 2 shows strong improvements over prior LLMs across diverse NLP benchmarks, especially as model size increases: On well-rounded language tests like MMLU and AGIEval, Llama-2-70B scores 68.9% and 54.2% - far above MTP-7B, Falcon-7B, and even the 65B Llama 1 model.

Llama 2 Tutorials

Tutorials by James Briggs (also link above) are quick, hands-on ways for you to experiment with Llama 2 workflows. See also a poor man's guide to fine-tuning Llama 2. Check out Replicate if you want to host Llama 2 with an easy-to-use API.


Did I miss any models? What are some of your favorites? Which family/foundation/fine-tuning should we cover next?

23
1
submitted 9 months ago* (last edited 8 months ago) by Blaed@lemmy.world to c/fosai@lemmy.world

Hey everyone!

I think it's time we had a fosai model on HuggingFace. I'd like to start collecting ideas, strategies, and approaches for fine-tuning our first community model.

I'm open to hearing what you think we should do. We will release more in time. This is just the beginning.

For now, I say let's pick a current open-source foundation model and fine-tune on datasets we all curate together, built around a loose concept of using a fine-tuned LLM to teach ourselves more bleeding-edge technologies (and how to build them using technical tools and concepts).

FOSAI is a non-profit movement. You own everything fosai as much as I do. It is synonymous with the concept of FOSS. It is for everyone to champion as they see fit. Anyone is welcome to join me in training or tuning using the workflows I share along the way.

You are encouraged to leverage fosai tools to create and express ideas of your own. All fosai models will be licensed under Apache 2.0. I am open to hearing thoughts if other licenses should be considered.


We're Building FOSAI Models! 🤖

Our goal is to fine-tune a foundation model and open-source it. We're going to start with one foundation family with smaller parameters (7B/13B) then work our way up to 40B (or other sizes), moving to the next as we vote on what foundation we should fine-tune as a community.


Fine-Tuned Use Case ☑️

Technical

  • FOSAI Model Idea #1 - Research & Development Assistant
  • FOSAI Model Idea #2 - Technical Project Manager
  • FOSAI Model Idea #3 - Personal Software Developer
  • FOSAI Model Idea #4 - Life Coach / Teacher / Mentor
  • FOSAI Model Idea #5 - FOSAI OS / System Assistant

Non-Technical

  • FOSAI Model Idea #6 - Dungeon Master / Lore Master
  • FOSAI Model Idea #7 - Sentient Robot Character
  • FOSAI Model Idea #8 - Friendly Companion Character
  • FOSAI Model Idea #9 - General RPG or Sci-Fi Character
  • FOSAI Model Idea #10 - Philosophical Character

OR

FOSAI Foundation Model ☑️


Foundation Model ☑️

(Pick one)

  • Mistral
  • Llama 2
  • Falcon
  • ..(Your Submission Here)

Model Name & Convention

  • snake_case_example
  • CamelCaseExample
  • kebab-case-example

0.) FOSAI ☑️

  • fosai-7B
  • fosai-13B

1.) FOSAI Assistant ☑️

  • fosai-assitant-7B
  • fosai-assistant-13B

2.) FOSAI Atlas ☑️

  • fosai-atlas-7B
  • fosai-atlas-13B

3.) FOSAI Navigator ☑️

  • fosai-navigator-7B
  • fosai-navigator-13B

4.) ?


Datasets ☑️

  • TBD!
  • What datasets do you think we should fine-tune on?

Alignment ☑️

To embody open-source mentalities, I think it's worth releasing both censored and uncensored versions of our models. This is something I will consider as we train and fine-tune over time. Like any tool, you are responsible for your usage and how you choose to incorporate into your business and/or personal life.


License ☑️

All fosai models will be licensed under Apache 2.0. I am open to hearing thoughts if other licenses should be considered.

This will be a fine-tuned model, so it may inherit some of the permissions and license agreements as its foundation model and have other implications depending on your country or local law.

Generally speaking, you can expect that all fosai models will be commercially viable through the selection process of its foundation family and the post-processing steps that are fine-tuning the model.


Costs

I will be personally covering all training and deployment costs. This may change if I choose to put together some sort of patronage, but for now - don't worry about this. I will be using something like RunPod or some other custom deployed solution for training.


Cast Your Votes! ☑️

Share Your Ideas & Vote in the Comments Below! ✅

What do you want to see out of this first community model? What are some of the fine-tuning ideas you've wanted to try, but never had the time or chance to test? Let me know in the comments and we'll brainstorm together.

I am in no rush to get this out, so I will leave this up for everyone to see and interact with until I feel we have a solid direction we can all agree upon. There will be plenty of more opportunities to create, curate, and customize more fosai models I plan to release in the future.

Update [10/25/23]: I may have found a fine-tuning workflow for both Llama (2) and Mistral, but I haven't had any time to validate the first test run. Once I have a chance to do this and test some inference I'll be updating this post with the workflow, the models, and some sample output with example datasets. Unfortunately, I have ran out of personal funds to allocate to training, so it is unsure when I will have a chance to make another attempt at this if this first attempt doesn't pan out. Will keep everyone posted as we approach the end of 2023.

24
1
submitted 9 months ago by Blaed@lemmy.world to c/fosai@lemmy.world

Hey everyone!

I don't think I've shared this one before, so allow me to introduce you to 'LM Studio' - a new application that is tailored to LLM developers and enthusiasts.

Check it out!


With LM Studio, you can ...

🤖 - Run LLMs on your laptop, entirely offline

👾 - Use models through the in-app Chat UI or an OpenAI compatible local server

📂 - Download any compatible model files from HuggingFace 🤗 repositories

🔭 - Discover new & noteworthy LLMs in the app's home page LM Studio supports any ggml Llama, MPT, and StarCoder model on Hugging Face (Llama 2, Orca, Vicuna, Nous Hermes, WizardCoder, MPT, etc.)

Minimum requirements: M1/M2 Mac, or a Windows PC with a processor that supports AVX2. Linux is under development.

Made possible thanks to the llama.cpp project.

We are expanding our team. See our careers page.


Love seeing these new tools come out! Especially with the new gguf format being widely adopted.

The regularly updated and curated list of new LLM releases they provide through this platform is enough for me to keep it installed.

I'll be tinkering plenty when I have the time this week. I'll be sure to let everyone know how it goes! In the meantime, if you do end up giving LM Studio a try - let us know your thoughts and experience with it in the comments below.

25
1
submitted 10 months ago* (last edited 10 months ago) by Blaed@lemmy.world to c/fosai@lemmy.world

Hello everyone!

I am working on figuring out better workflows bringing back more consistent post schedules. In the meantime, I'd like to leave you with a new update from LocalAI & Continue.

Check these projects out! More info from the Continue & LocalAI teams below:

Continue

The open-source autopilot for software development A VS Code extension that brings the power of ChatGPT to your IDE

LocalAI

LocalAI is a drop-in replacement REST API that’s compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format. Does not require GPU.

Combining the Power of Continue + LocalAI!


Note

From this release the llama backend supports only gguf files (see 943 ). LocalAI however still supports ggml files. We ship a version of llama.cpp before that change in a separate backend, named llama-stable to allow still loading ggml files. If you were specifying the llama backend manually to load ggml files from this release you should use llama-stable instead, or do not specify a backend at all (LocalAI will automatically handle this).

Continue

logo

This document presents an example of integration with continuedev/continue.

Screenshot

For a live demonstration, please click on the link below:

Integration Setup Walkthrough

  1. As outlined in continue's documentation, install the Visual Studio Code extension from the marketplace and open it.

  2. In this example, LocalAI will download the gpt4all model and set it up as "gpt-3.5-turbo". Refer to the docker-compose.yaml file for details.

    # Clone LocalAI
    git clone https://github.com/go-skynet/LocalAI
    
    cd LocalAI/examples/continue
    
    # Start with docker-compose
    docker-compose up --build -d
    
  3. Type /config within Continue's VSCode extension, or edit the file located at ~/.continue/config.py on your system with the following configuration:

    from continuedev.src.continuedev.libs.llm.openai import OpenAI, OpenAIServerInfo
    
    config = ContinueConfig(
       ...
       models=Models(
            default=OpenAI(
               api_key="my-api-key",
               model="gpt-3.5-turbo",
               openai_server_info=OpenAIServerInfo(
                  api_base="http://localhost:8080",
                  model="gpt-3.5-turbo"
               )
            )
       ),
    )
    

This setup enables you to make queries directly to your model running in the Docker container. Note that the api_key does not need to be properly set up; it is included here as a placeholder.

If editing the configuration seems confusing, you may copy and paste the provided default config.py file over the existing one in ~/.continue/config.py after initializing the extension in the VSCode IDE.

Additional Resources

view more: next ›

Free Open-Source Artificial Intelligence

0 readers
2 users here now

Welcome to Free Open-Source Artificial Intelligence!

We are a community dedicated to forwarding the availability and access to:

Free Open Source Artificial Intelligence (F.O.S.A.I.)

Have no idea where to begin with AI/LLMs? Try visiting our Lemmy Crash Course for Free Open-Source AI. When you're done with that, head over to FOSAI ▲ XYZ or check out the FOSAI LLM Guide for more info.

Monthly Roadmap

October 2023

More AI Communities

AI Resources

Learn

Build

Serve

Fediverse / FOSAI

LLM Leaderboards

LLM Search Tools

LLM Evaluations

GitHub Projects

Documentation Theory

founded 1 year ago
MODERATORS