This is an automated archive made by the Lemmit Bot.
The original was posted on /r/homeassistant by /u/sobolanul11 on 2026-03-04 07:14:01+00:00.
We have many very high-quality local TTS models for English and for a handful of non-English languages. However, there is none for Romanian.
So I decided to look into how to make one. It turns out it was easier than I expected. Took a week of trial and error and extensive Google usage, but I got it there.
I documented the steps for anyone who might want to reproduce it for other languages. The "secret" is to train in two phases: one phase for adding the new, non-English tokens to the vocab, and then the actual training.
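Before the first phase you need to know which tokens are actually missing. Here is a minimal sketch of that check, assuming the vocab is just a list of token strings; `missing_characters`, `toy_vocab` and `toy_corpus` are hypothetical names and toy data for illustration, not the real XTTS v2 vocabulary or anything taken from the repository.

```python
from collections import Counter


def missing_characters(corpus_lines, vocab_tokens):
    """Count corpus characters that do not appear in any existing vocab token."""
    covered = set("".join(vocab_tokens))
    counts = Counter(
        ch
        for line in corpus_lines
        for ch in line.lower()
        if not ch.isspace()
    )
    return {ch: n for ch, n in counts.items() if ch not in covered}


# Toy stand-ins (assumption): a tiny English-like vocab and three Romanian words.
toy_vocab = ["a", "i", "s", "t", "e", "n", "r", "u", "m", "o",
             "l", "b", "c", "d", "z", "ul", "re"]
toy_corpus = ["Bună ziua", "Mulțumesc", "școală"]

# Characters that would have to be added to the vocab in phase one.
new_tokens = sorted(missing_characters(toy_corpus, toy_vocab))
print(new_tokens)
```

With a real tokenizer the check would run against its actual token list, and the candidates could be whole subwords rather than single characters.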
I used XTTS v2 for this (currently in the process of finding the right way to add extra languages to other popular TTS models).
I published everything into a Codeberg repository: https://codeberg.org/eduardm/romanian-tts-xtts-v2
All the code I used is there. To train a new language, you need to identify, for each new token, the existing token that is the closest phonetic match (do not initialise the new tokens with random values) and update the scripts accordingly.
I hope this helps.
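To illustrate the "do not initialise with random" part, here is a minimal sketch in plain Python, assuming the embedding table is a list of rows and the vocab is a token-to-index dict. The function `extend_embeddings` and the donor mapping are hypothetical; the mapping in particular is only an example, not the one used in the repository.

```python
def extend_embeddings(embeddings, token_to_id, donor_for_new_token):
    """Append one row per new token, copied from its phonetically closest donor.

    embeddings: list of rows (lists of floats), one per existing token.
    token_to_id: dict mapping token string to its row index.
    donor_for_new_token: dict mapping each new token to an existing token.
    """
    for new_token, donor in donor_for_new_token.items():
        if new_token in token_to_id:
            continue  # already in the vocab, nothing to add
        donor_row = embeddings[token_to_id[donor]]
        token_to_id[new_token] = len(embeddings)
        embeddings.append(list(donor_row))  # copy, so later updates stay independent
    return embeddings, token_to_id


# Toy 2-dimensional embedding table (assumption, for illustration only).
token_to_id = {"a": 0, "s": 1, "t": 2}
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Hypothetical phonetic matches for three Romanian letters.
donors = {"ă": "a", "ș": "s", "ț": "t"}

embeddings, token_to_id = extend_embeddings(embeddings, token_to_id, donors)
```

Each new token starts out sounding like its donor, so training only has to learn the difference rather than start from noise. In a real model the same idea applies to the rows of the text embedding matrix (and any tied output layer).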