Babbitt News: On June 16, Meta announced the release of “Voicebox”, a speech generation AI model. Voicebox generates speech from text and can match an audio style from a sample only two seconds long. It also works across languages: given a sample of a person’s speech, it reads translated text in the speaker’s original voice. It currently supports six languages: English, French, German, Spanish, Polish, and Portuguese.
Meta said that Voicebox can also give virtual assistants and non-player characters in the metaverse natural-sounding voices, let visually impaired people hear written messages from friends read aloud by AI in those friends’ voices, and provide creators with new tools to easily create and edit audio tracks for videos.