this post was submitted on 25 Jul 2024
647 points (100.0% liked)
196
17523 readers
914 users here now
Be sure to follow the rule before you head out.
Rule: You must post before you leave.
Other rules
Behavior rules:
- No bigotry (transphobia, racism, etc…)
- No genocide denial
- No support for authoritarian behaviour (incl. Tankies)
- No namecalling
- Accounts from lemmygrad.ml, threads.net, or hexbear.net are held to higher standards
- Other things seen as cleary bad
Posting rules:
- No AI generated content (DALL-E etc…)
- No advertisements
- No gore / violence
- Mutual aid posts are not allowed
NSFW: NSFW content is permitted but it must be tagged and have content warnings. Anything that doesn't adhere to this will be removed. Content warnings should be added like: [penis], [explicit description of sex]. Non-sexualized breasts of any gender are not considered inappropriate and therefore do not need to be blurred/tagged.
If you have any questions, feel free to contact us on our matrix channel or email.
Other 196's:
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Some apps allow you to offload to GPU, and CPU while loading the active part of the model. I have a an old SSD that give me 500gb of "usable" ram set up as swap.
It is horrendously slow and pointless but you can do it. I got about 2 tokens in 10 minutes before I gave up on a 70b model on a 1080 ti.
Even if they used more powerful hardware than you, the model they ran is still almost 6 times bigger - so if you got two tokens in 10 minutes, one token in 30 minutes for them sounds plausible.
I would have to use an entire 1tb drive for swap but I'm sure I could manage 1 token before the heat death of the universe.
I'd worry less about the heat death of the universe and more about your hardware's heat from all that load.