Audio Generation

Audio generation workflows let you create speech, music, and cloned voices from text. This guide walks through a text-to-speech (TTS) workflow in detail, then points you to workflows for music generation and voice cloning.

Overview

Text-to-speech (TTS) workflows convert written text into natural-sounding speech using AI voices, useful for narrations, voiceovers, character dialogue, presentations, and accessibility content. This guide demonstrates a TTS workflow in Floyo and points to related audio capabilities below.

Example Workflow

This workflow uses Fish Audio TTS to convert text into speech and automatically save the generated audio file.

Using This Workflow

Step 1: Enter Your Text

Locate the Prompt Text node and enter the text you want the model to speak. The generated audio will be based on the content provided in this field.

Step 2: Select a Voice

In the Fish Audio TTS node, choose the voice you would like to use. Different voices may provide different tones, accents, and speaking styles depending on the workflow configuration.

Step 3: Review the Settings

This workflow comes with default settings and can be used without making any modifications. Advanced users can adjust settings such as:

- Speed
- Temperature
- Volume
- Chunk length
- Latency
- Output format

These settings can help fine-tune the generated speech if needed.

Step 4: Run the Workflow

Click the Play button in the bottom toolbar to generate the audio. The workflow will process the text and create a speech file using the selected voice.

Step 5: Download the Audio

Once the workflow completes, navigate to the outputs folder in File Browser to preview and download the generated audio.

Common Use Cases

Text to Music

Generate original music from a text prompt by describing the mood, genre, instruments, tempo, or style you want. This workflow is ideal for creating background music, concept tracks, social media content, game audio, and creative projects without requiring traditional music production tools.

Recommended workflow: Floyo ACE-Step 1.5 XL – Text to Music
https://www.floyo.ai/workflows/ace-step-1-5-xl-text-to-music-i2oiomy8ihd8

Text to Speech

Convert written text into natural-sounding speech using AI-generated voices. Text-to-speech workflows are commonly used for voiceovers, presentations, tutorials, audiobooks, podcasts, character dialogue, and accessibility-focused content.

Recommended workflow: Floyo LongCat AudioDiT for TTS
https://www.floyo.ai/workflows/longcat-audiodit-for-tts-hqxuqf68lxe7

Voice Cloning

Generate speech using a reference voice sample while preserving the tone and characteristics of the original speaker. Voice cloning can be used for personalized voiceovers, character voices, content localization, narration, and creative audio projects.

Recommended workflow: Floyo LongCat AudioDiT for Voice Clone
https://www.floyo.ai/workflows/longcat-audiodit-for-voice-clone-kb677c52ldiw

Tips

- Use clear and descriptive prompts when generating music.
- Experiment with different voices to find the best match for your content.
- High-quality reference audio generally produces better voice cloning results.
- Review generated audio before publishing or sharing.
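
The advice to review generated audio before publishing can be supplemented with a quick automated check once a file has been downloaded. The sketch below is not part of Floyo or Fish Audio; it uses only Python's standard `wave` module and assumes the output format was set to WAV. The path `outputs/speech.wav`, the helper names `describe_wav` and `looks_reasonable`, and the 0.5 second threshold are all hypothetical choices for illustration.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic properties of a WAV file as a dict."""
    with wave.open(str(path), "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": sample_rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": frame_count / sample_rate,
        }


def looks_reasonable(info, min_seconds=0.5):
    """Flag clips that are empty or suspiciously short (threshold is an assumption)."""
    return info["duration_seconds"] >= min_seconds


if __name__ == "__main__":
    # Hypothetical path: replace with the file you downloaded from the outputs folder.
    audio_path = Path("outputs/speech.wav")
    if audio_path.exists():
        info = describe_wav(audio_path)
        print(info)
        if looks_reasonable(info):
            print("Basic checks passed; still listen before publishing.")
        else:
            print("Clip is very short; the run may have failed or the text was empty.")
    else:
        print(f"No file found at {audio_path}")
```

A check like this only catches obvious problems such as an empty or truncated clip. It does not replace listening to the audio for pronunciation, pacing, and tone.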