The ability to imitate sounds with our voice, such as a faulty car engine or a cat's meow, can be an effective way to convey concepts when words fall short. This vocal imitation is much like drawing a quick sketch to communicate an idea. Inspired by cognitive science, researchers from MIT's CSAIL have developed an AI system that can create human-like vocal imitations without prior training or exposure to human vocal impressions. The researchers constructed a model of the human vocal tract that simulates how the throat, tongue, and lips shape sounds from the voice box. A cognitively inspired AI algorithm controls this model to produce imitations, taking into account how humans choose to communicate sounds. The model can imitate a variety of sounds, such as rustling leaves, a snake's hiss, and an ambulance siren. It can also run in reverse, guessing real-world sounds from human vocal imitations, much as some systems retrieve images from sketches. For example, it can distinguish between a human imitating a cat's "meow" and its "hiss." The researchers suggest potential uses for the model, such as imitation-based interfaces for sound designers, more human-like AI characters in virtual reality, and tools for language learners.
The co-lead authors from MIT CSAIL note that, as in visual expression, realism isn’t always the ultimate goal in sound imitation, and that their work offers insights into auditory abstraction. The team developed three increasingly refined versions of the model. The first, a baseline, aimed for imitations as acoustically realistic as possible but did not match human behavior well. The second, a "communicative" model, focused on the features that make a sound distinctive to a listener, which improved results. The third added a notion of the effort humans are willing to invest in an imitation, which produced the most human-like results. In a behavioral experiment, human judges sometimes preferred the AI-generated vocal imitations over human ones for specific sounds. The researchers aim to apply their model to the study of language development, infant speech learning, and imitation behaviors in birds. The model still faces challenges, such as accurately imitating some consonants and accounting for how sounds are imitated differently across languages, but it is a promising step toward a deeper understanding of the role of vocal imitation in communication and language evolution. The work highlights the interplay between physiological, social, and communicative factors, with implications for future technologies in music, art, and beyond.
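The difference between the three model versions can be illustrated with a toy scoring sketch. This is not the researchers' implementation: the two-number "feature vectors", the softmax listener, the effort values, and every name below are assumptions chosen only to show how each added consideration can change which imitation is selected.

```python
import math

def similarity(a, b):
    """Negative squared Euclidean distance: higher means more alike."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def baseline_score(cand, target):
    """Baseline: prefer the imitation closest to the target sound."""
    return similarity(cand["features"], target)

def communicative_score(cand, target, distractors):
    """Communicative: log-probability that a hypothetical listener, choosing
    among the target and the distractors via a softmax over similarity,
    picks the target after hearing this imitation."""
    sims = [similarity(cand["features"], s) for s in [target, *distractors]]
    log_norm = math.log(sum(math.exp(s) for s in sims))
    return sims[0] - log_norm

def effort_aware_score(cand, target, distractors, effort_weight=1.0):
    """Effort-aware: communicative score minus a penalty for vocal effort."""
    return (communicative_score(cand, target, distractors)
            - effort_weight * cand["effort"])

# Made-up example data (assumption): the second feature is the one that
# separates the target sound from the distractor sound.
TARGET = (1.0, 0.0)
DISTRACTORS = [(1.0, 1.0)]
CANDIDATES = [
    {"name": "faithful",    "features": (1.0, 0.0),  "effort": 0.1},
    {"name": "exaggerated", "features": (1.0, -1.0), "effort": 2.0},
    {"name": "moderate",    "features": (1.0, -0.5), "effort": 0.2},
]

def pick(score):
    """Return the name of the candidate with the highest score."""
    return max(CANDIDATES, key=score)["name"]

if __name__ == "__main__":
    print("baseline:     ", pick(lambda c: baseline_score(c, TARGET)))
    print("communicative:", pick(lambda c: communicative_score(c, TARGET, DISTRACTORS)))
    print("effort-aware: ", pick(lambda c: effort_aware_score(c, TARGET, DISTRACTORS)))
```

With these invented numbers, the baseline selects the faithful copy, the communicative score selects the exaggerated imitation because it is hardest to confuse with the distractor, and the effort-aware score settles on the moderate one, which is nearly as distinctive at far lower cost.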
MIT Develops AI for Human-like Vocal Imitation