this post was submitted on 13 May 2025
370 points (95.3% liked)

Technology

[–] ABetterTomorrow@lemm.ee 7 points 1 week ago

Left Amazon a handful of years ago. Glad I didn’t entirely contribute to this. Saw that coming….

[–] drmoose@lemmy.world 6 points 1 week ago* (last edited 1 week ago) (15 children)

This is clearly the future despite the outrage here.

There are at least 389 living languages with over 1M speakers. That alone means it's impossible to reach some people and they get left out. Most of these languages don't even have enough professional voice actors to cover the demand.

There are thousands of books released every year. That's impossible to cover even in English alone.

It's an objective net good to have more accessible audiobooks, and the privileged people who do care about this stuff can very much afford to vote with their wallets for non-AI voices.

In fact, since the AI moat is so minimal, this will very quickly be adopted by open-source solutions, providing audiobook access to millions if not billions of people for whom this was not an option. It's amazing.

[–] MiyamotoKnows@lemmy.world 5 points 1 week ago

This consumer says you don't get a red cent then!

It's already a plague on YouTube, where half of the docu-style vids are AI narrated. I quit them in disgust. It's so frustrating. It has eroded my perception of YouTube in a short time.

[–] Jimmycakes@lemmy.world 4 points 1 week ago (1 children)

I only get the ones with a famous narrator or the author.

[–] tal@lemmy.today 4 points 1 week ago* (last edited 1 week ago) (1 children)

AI voice synth is pretty solidly useful in comparison to, say, video generation from scratch. I think that there are good uses for voice synth (e.g. filling in for an aging actor/actress who can't do a voice any more, video game mods, procedurally-generated speech, etc.), but audiobooks don't really play to those strengths. I'm a little skeptical that in 2025, it's at the point where it's a good drop-in replacement for a human audiobook narrator. What I've heard still doesn't have emphasis on par with a human.

I don't know what it costs to have a human read an audiobook, but I can't imagine that it's that expensive; I doubt that there's all that much editing involved.

kagis

https://www.reddit.com/r/litrpg/comments/1426xav/whats_the_average_narrator_cost/

So I produced my own audiobooks for my Nova Roma series so I know the exact numbers for you:

$250 per finished hour for the narrator. Books ranged from about 200k words-270k words, which came out to 22 hours, 20 hours, and 25 hours.

So books 1-3 cost me $5,500, $5,000, and $6,250. I'm contracted for two more books with my narrator, so I expect to spend another 5k-6k for each of those.

So for a five book series, each one 200k+ words, the total cost out of pocket for me will be about $27,000 give or take to make the series into audiobooks.

That's actually lower than I expected. Like, if a book sells at any kind of volume, it can't be that hard to make that back.
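To check the arithmetic in that quote, here's a quick sketch. The rate and the hour counts come straight from the quoted post; `break_even_units` and its `net_per_sale` parameter are hypothetical helpers of mine, since the post doesn't say what the author nets per copy.

```python
import math

RATE_PER_FINISHED_HOUR = 250   # USD, from the quoted post
FINISHED_HOURS = [22, 20, 25]  # books 1-3, from the quoted post


def narration_cost(hours, rate=RATE_PER_FINISHED_HOUR):
    """Narrator cost for one book at a flat per-finished-hour rate."""
    return hours * rate


def break_even_units(total_cost, net_per_sale):
    """Copies needed to recoup total_cost.

    net_per_sale is an assumption the caller must supply; the quoted
    post gives no royalty or margin figure.
    """
    return math.ceil(total_cost / net_per_sale)


costs = [narration_cost(h) for h in FINISHED_HOURS]
spent_so_far = sum(costs)

# The post expects "another 5k-6k" for each of two more books.
low_total = spent_so_far + 2 * 5000
high_total = spent_so_far + 2 * 6000

print(costs)                  # [5500, 5000, 6250]
print(spent_so_far)           # 16750
print(low_total, high_total)  # 26750 28750
```

So the "about $27,000 give or take" figure lines up with the low end of the range.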

EDIT: I can believe that it's possible to build a speech synth system that does do better, mind (I certainly don't think that there are any fundamental limitations on this). I'd guess that there's also room for human-assisted stuff, where you have some system that annotates the text with emphasis markers, and the annotated text gets fed into a speech synth engine trained to convert annotated text to voice. There, someone listens to the output and just tweaks the annotated text where the annotation system doesn't get it quite right. But I don't think that we're really there today yet.
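A rough sketch of the human-assisted workflow described above, under loud assumptions: the "annotation system" here is a toy rule (flag ALL-CAPS words), the correction format is made up for illustration, and the output is SSML `<emphasis>` markup only because that's a standard way to hand emphasis to a synth engine. No actual speech engine is called, and none of the names below come from a real product.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Token:
    text: str
    emphasis: bool = False


def auto_annotate(sentence: str) -> list[Token]:
    """Toy stand-in for the annotation system: flag ALL-CAPS words."""
    tokens = []
    for word in sentence.split():
        core = word.strip(".,!?;:\"'")
        # Require length > 1 so the pronoun "I" is not flagged.
        tokens.append(Token(word, emphasis=len(core) > 1 and core.isupper()))
    return tokens


def apply_corrections(tokens: list[Token], corrections: dict[int, bool]) -> list[Token]:
    """Human pass: override the emphasis flag at the given token indices."""
    return [Token(t.text, corrections.get(i, t.emphasis)) for i, t in enumerate(tokens)]


def to_ssml(tokens: list[Token]) -> str:
    """Render annotated tokens as SSML for a downstream synth engine."""
    parts = [
        f'<emphasis level="strong">{escape(t.text)}</emphasis>' if t.emphasis
        else escape(t.text)
        for t in tokens
    ]
    return "<speak>" + " ".join(parts) + "</speak>"


draft = auto_annotate("I never said she stole my money.")
# A listener decides the stress belongs on "she" (token index 3).
fixed = apply_corrections(draft, {3: True})
print(to_ssml(fixed))
```

The point of splitting it this way is that the person doing the listening only edits the small corrections map, not the audio or the whole annotated text.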

[–] Maeve@kbin.earth 3 points 1 week ago

More jobless, desperate people.
