Nov. 26, 2024, 9:17 a.m.

Nvidia's Fugatto: Revolutionizing Generative AI for Sound

Nvidia's new "Fugatto" model extends generative AI to music, voices, and sounds, and can even create sounds that have not been heard before. The model is not yet publicly available, but examples on Nvidia's website show it modifying audio traits: a saxophone that sounds like barking, speech that sounds as if it were underwater, or a choir of ambulance sirens. This broad capability has led Nvidia to describe Fugatto as a "Swiss Army knife for sound."

The challenge lies in crafting a training dataset that captures meaningful relationships between audio and language. Nvidia's researchers used an LLM-generated Python script to create numerous template-based and free-form instructions describing audio "personas." These were applied to a wide array of open-source audio datasets, annotating them with natural language descriptions of traits such as emotion, gender, and speech quality. The researchers held certain factors constant while varying others, teaching the model distinctions such as happier speech or different instrument sounds. After processing 20 million samples (50,000 hours of audio), they used Nvidia tensor cores to train a model with 2.5 billion parameters that achieved reliable audio quality scores.

Beyond training, Fugatto's "ComposableART" system allows customizable audio output. It combines traits from the dataset to create new, unheard sounds, using "conditional guidance" to handle combinations not seen during training.
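To make the template-based instruction step more concrete, here is a minimal Python sketch of how such instructions might be generated and then grouped so that one trait varies while the others stay fixed. It is not Nvidia's script: the attribute values, the templates, and the function names make_instructions and contrast_groups are all hypothetical, chosen only to illustrate the idea.

```python
import itertools

# Hypothetical attribute vocabulary. The article names emotion, gender, and
# speech quality as annotated traits; the specific values here are assumptions.
ATTRIBUTES = {
    "emotion": ["happy", "sad", "angry"],
    "gender": ["female", "male"],
    "quality": ["clean", "noisy"],
}

# Hypothetical instruction templates (not taken from Nvidia's dataset).
TEMPLATES = [
    "Generate {quality} speech from a {gender} speaker who sounds {emotion}.",
    "Synthesize a {emotion} {gender} voice with {quality} audio.",
]


def make_instructions(attributes, templates):
    """Fill every template with every combination of attribute values."""
    keys = list(attributes)
    rows = []
    for values in itertools.product(*(attributes[k] for k in keys)):
        persona = dict(zip(keys, values))
        for template_id, template in enumerate(templates):
            rows.append({
                "persona": persona,
                "template_id": template_id,
                "text": template.format(**persona),
            })
    return rows


def contrast_groups(rows, vary):
    """Group instructions that differ only in the `vary` attribute."""
    groups = {}
    for row in rows:
        # Everything except the varied attribute is held constant in a group.
        fixed = tuple(sorted(
            (k, v) for k, v in row["persona"].items() if k != vary
        ))
        groups.setdefault((row["template_id"], fixed), []).append(row)
    return list(groups.values())


rows = make_instructions(ATTRIBUTES, TEMPLATES)
groups = contrast_groups(rows, vary="emotion")
for row in groups[0]:
    print(row["text"])
```

Each group holds instructions that are identical except for the emotion, which mirrors the "hold some factors constant, vary others" strategy the article describes.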

While not all outputs are pitch-perfect, the variety of sounds, like a violin sounding like a laughing baby, showcases Fugatto's transformative ability. Crucially, Fugatto treats audio traits as tunable continuums, not binaries. It combines sounds, like an acoustic guitar with running water, by altering the balance, and adjusts accents or emotions in speech. It performs tasks such as altering spoken text emotion, isolating vocal tracks, and replacing notes in MIDI music with varied vocal performances. Nvidia sees Fugatto as a step toward unsupervised multitask learning and envisions applications in song prototyping and dynamic video game scores. Such models are intended as tools for audio artists rather than replacements. As producer/songwriter Ido Zmishlany states, technology continuously reshapes music, with AI marking a new chapter in musical innovation.
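The idea of blending traits "by altering the balance" can be illustrated with a generic weighted-guidance sketch. The article does not describe ComposableART's internals, so the following is an assumption: it uses the common compositional form of classifier-free guidance, where each condition's offset from an unconditional prediction is scaled by a continuous weight. The function name compose_guidance and the toy two-element vectors are hypothetical stand-ins for real model outputs.

```python
def compose_guidance(unconditional, conditionals, weights):
    """Blend several conditional predictions around an unconditional one.

    Computes: unconditional + sum_i weights[i] * (conditionals[i] - unconditional)
    This is a generic formulation, not Nvidia's published implementation.
    """
    if len(conditionals) != len(weights):
        raise ValueError("need exactly one weight per conditional prediction")
    combined = list(unconditional)
    for cond, weight in zip(conditionals, weights):
        if len(cond) != len(unconditional):
            raise ValueError("all predictions must have the same length")
        for i, (c, u) in enumerate(zip(cond, unconditional)):
            combined[i] += weight * (c - u)
    return combined


# Toy stand-ins for model predictions under each prompt (assumed values).
base = [0.0, 0.0]     # prediction with no prompt
guitar = [1.0, 0.0]   # prediction for "acoustic guitar"
water = [0.0, 1.0]    # prediction for "running water"

# Mostly guitar with a little water; shifting the weights shifts the balance.
blend = compose_guidance(base, [guitar, water], [0.75, 0.25])
print(blend)
```

Because the weights are continuous rather than on/off, the same mechanism covers the article's point that traits are treated as tunable continuums instead of binaries.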



Brief news summary

Nvidia's Fugatto is a cutting-edge audio synthesis technology that transforms text prompts into sounds, though it remains unavailable to the public. A demo showcases its impressive ability to add effects like underwater speech and choir-like sirens. One major challenge in developing Fugatto was constructing a dataset that captures intricate audio-language interactions. Nvidia tackled this by employing a language model to create scripts for diverse audio personas, resulting in a 50,000-hour dataset essential for training the model, which boasts 2.5 billion parameters. A key feature of Fugatto is "ComposableART," enabling users to blend characteristics from the training data for meticulous control over audio aspects such as accents and emotions. This capability allows adjustments in speech emotions and the separation of vocal tracks in music, offering creative possibilities beyond basic synthesis. Nvidia foresees Fugatto as a tool to enhance audio creativity in areas like music prototyping and dynamic game scoring, aiming to complement traditional methods rather than replace them. The company believes AI tools like Fugatto could profoundly impact the future landscape of musical creativity.