submitted 1 year ago* (last edited 1 year ago) by soulnull@burggit.moe to c/artificial_intelligence@burggit.moe

I know it's not directly related to AI discussion, but since it's being censored ("medical disinformation"), it deserves to be put somewhere.

Edit: new version using revision 3 of my dataset/model:

https://deadfrog.org/alexjones-endofdays-v3.mp3

The even more fucked up part? The original song (Vinnie Paz - End Of Days) is on YouTube with 2.2 million views. I pointed this out and my appeal was rejected anyway, pretty much instantly. Apparently an AI Alex Jones rapping this song crossed a line.

Yet another reason not to trust the centralized platforms.

Edit: an assortment of AI songs that might be appreciated here. I have a ton. Most don't make it online, but I enjoy making them. It's an addiction -_-

https://deadfrog.org/hatkid-iwantyouback.mp3
https://deadfrog.org/noa-merica.mp3 (Noa Himesaka)
https://deadfrog.org/herbert-givenup.mp3
https://deadfrog.org/benshapirolovesong.mp3
https://deadfrog.org/butters-hurt.mp3
https://deadfrog.org/queen-igothiv.mp3

Alex Jones, Hat Kid, and Ben Shapiro are my own custom-trained voice models; the others aren't mine, I just did inference and production on them. All rights reserved by the original voice model authors, other CYA shit, etc.

submitted 1 year ago* (last edited 1 year ago) by soulnull@burggit.moe to c/artificial_intelligence@burggit.moe

With Llama kicking things off, development in the self-hosted text model space has been ridiculously fast. Hardware requirements are improving, but they're still fairly steep: you can either settle for painfully slow CPU generation, or, if you have 24GB+ of VRAM, the GPU options really open up.

7B models can sorta run in 12GB, but they're not great. You really want at least 13B, which needs 24GB of VRAM... or run it on the CPU. Some of them are getting close to ChatGPT quality, definitely not a niche to sleep on, and I feel the fediverse would appreciate the idea of self-hosting its own chat bots. Some of these models also have huge context windows, so they actually remember what you're talking about remarkably well.
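For a rough feel of why those VRAM numbers land where they do, here's a back-of-the-envelope estimate. This little function and its ~20% overhead factor are my own assumptions for illustration, not numbers from any particular loader:

```python
# Rough VRAM estimate for running an LLM: parameter count times bytes per
# parameter, plus a loose ~20% overhead guess for activations/context.
def estimate_vram_gb(params_billions: float, bits_per_param: int,
                     overhead: float = 1.2) -> float:
    bytes_total = params_billions * 1e9 * (bits_per_param / 8)
    return bytes_total * overhead / 1e9

# fp16 13B comes out around ~31 GB, which is why 24GB cards
# lean on quantization; 4-bit 13B is closer to ~8 GB.
print(round(estimate_vram_gb(13, 16), 1))
print(round(estimate_vram_gb(13, 4), 1))
```

The same arithmetic explains why 7B at 4-bit squeezes into a 12GB card with room to spare, while anything fp16 above 7B pushes you to CPU offloading.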

A good starting point is this rentry: https://rentry.org/local_LLM_guide

I'm admittedly not great with these yet (and my GPU is only 12GB), but I'm fascinated, and I hope there can be some good discussions around them, because the tech really is interesting.


Voice replacement is getting faster to train, but it actually seems to be getting worse at identifying pitch/keys.

There's still an issue with reverb/echo and doubled vocals. The only way I was able to make this passable was to find pre-separated vocals, and even then it struggled with pitch drift, so I had to re-record parts of it.

Still, I trained each of these in so-vits-svc for about 2 hours on a 3080 Ti. I spent more time producing the track than the AI needed to completely replace someone's voice with someone else's.

Combining these with deepfakes/wav2lip can give some damn good results. If anyone wants some guidance on the process for voice replacement, I can certainly share anything I've picked up along the way.
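On the production side, once the voice model has converted the isolated vocal, a big chunk of the remaining work is just mixing it back over the instrumental. A minimal sketch of that step, assuming you already have both stems as NumPy arrays; the function name, gain handling, and synthetic test signals are all made up for illustration:

```python
import numpy as np

# Recombine a converted vocal with the instrumental: apply a gain in dB
# to the vocal, sum the stems, and normalize only if the mix would clip.
# A real mix would also want EQ, reverb, and timing correction on top.
def remix(converted_vocal: np.ndarray, instrumental: np.ndarray,
          vocal_gain_db: float = 0.0) -> np.ndarray:
    n = min(len(converted_vocal), len(instrumental))  # align stem lengths
    vocal = converted_vocal[:n] * (10 ** (vocal_gain_db / 20))
    mix = vocal + instrumental[:n]
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix  # avoid clipping

# Stand-in stems: two sine waves at half amplitude, one "second" at 48 kHz.
vocal = np.sin(np.linspace(0, 100, 48000)) * 0.5
inst = np.sin(np.linspace(0, 50, 48000)) * 0.5
out = remix(vocal, inst)
print(out.shape)
```

The normalize-only-on-clip choice keeps quiet mixes untouched instead of blindly rescaling everything; real separation tools (Demucs, UVR, etc.) hand you the stems this sketch takes as input.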

submitted 1 year ago* (last edited 1 year ago) by soulnull@burggit.moe to c/artificial_intelligence@burggit.moe

Whatever your opinion, you've certainly got one. This is the place to discuss and share things relating to AI. Do you make cool stuff with stable diffusion? Want to discuss the latest local text models? Got funny or interesting voice models, or just want to discuss the impacts of the technology? Rumors on new Bard (lol) improvements? Anything AI.

NSFW is allowed, just please mark it accordingly.

If you post your work, please keep it to a single thread/post to not clog up the plumbing too much.

If you have suggestions as to things we can do to improve this place, this thread is unlocked and ready for your input.
