When composer and vocalist Jen Wang took the stage at the
Monk Space in Los Angeles to perform Alvin Lucier’s “The Duke of York” (1971)
earlier this year, she sang with a digital rendition of her voice, synthesized
by artificial intelligence.
It was the first time she had done that. “I thought it was
going to be really disorienting,” Wang said in an interview, “but it felt like
I was collaborating with this instrument that was me and was not me.”
Isaac Io Schankler, a composer and music professor at Cal
Poly Pomona, conceived the performance and joined Wang onstage to monitor and
manipulate Realtime Audio Variational autoEncoder, or RAVE, the neural audio
synthesis algorithm that modeled Wang’s voice.
RAVE is an example of machine learning, a specific category
of artificial intelligence technology that musicians have experimented with
since the 1990s but that is now defined by rapid development, the arrival of
publicly available, AI-powered music tools and the dominating influence of
high-profile initiatives by large tech companies.
Schankler ultimately used RAVE in that performance of “The
Duke of York,” though, because its ability to augment an individual performer’s
sound, they said, “seemed thematically resonant with the piece.” For it to
work, the duo needed to train it on a personalized corpus of recordings. “I
sang and spoke for three hours straight,” Wang recalled. “I sang every song I
could think of.”
Antoine Caillon developed RAVE in 2021, during his graduate
studies at IRCAM, an institute founded by composer Pierre Boulez in Paris.
“RAVE’s goal is to reconstruct its input,” he said. “The model compresses the
audio signal it receives and tries to extract the sound’s salient features in
order to resynthesize it properly.”
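Caillon’s description maps onto the basic autoencoder pattern: an encoder squeezes each slice of audio into a small set of latent features, and a decoder tries to rebuild the original slice from those features. The sketch below is a deliberately minimal, hypothetical illustration of that compress-and-reconstruct loop in PyTorch; it is not RAVE’s code, and it leaves out the variational and adversarial stages the real model uses. The frame size, latent size, and training data are placeholders.

```python
# Conceptual sketch only (not RAVE): a tiny autoencoder that compresses a
# frame of audio into a small latent vector and tries to reconstruct it.
import torch
import torch.nn as nn

FRAME = 1024   # samples per audio frame (hypothetical)
LATENT = 16    # size of the compressed "salient features" (hypothetical)

class TinyAudioAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(FRAME, 256), nn.ReLU(),
                                     nn.Linear(256, LATENT))
        self.decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                                     nn.Linear(256, FRAME), nn.Tanh())

    def forward(self, frame):
        z = self.encoder(frame)    # compress to latent features
        return self.decoder(z)     # resynthesize the audio frame

model = TinyAudioAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# In practice the frames would be cut from a performer's recordings;
# random values stand in for real audio here.
frames = torch.rand(64, FRAME) * 2 - 1
for step in range(100):
    recon = model(frames)
    loss = nn.functional.mse_loss(recon, frames)  # reconstruction error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Training on a single performer’s corpus, as Wang and Schankler did, simply means every frame the model ever sees comes from that one voice, so whatever it learns to reconstruct is shaped entirely by that material.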
Machine learning then, and now
Tina Tallon, a composer and professor of AI and the arts at
the University of Florida, said that musicians have used various AI-related
technologies since the mid-20th century.
“There are rule-based systems, which is what artificial
intelligence used to be in the ’60s, ’70s, and ’80s,” she said, “and then there
is machine learning, which became more popular and more practical in the ’90s
and involves ingesting large amounts of data to infer how a system functions.”
Today, developments in AI that were once contained to
specialized applications impinge on virtually every corner of life and already
impact the way people make music. Caillon, in addition to developing RAVE, has
contributed to the Google-led projects SingSong, which generates accompaniments
for recorded vocal melodies, and MusicLM, a text-to-music generator.
Innovations in other areas are driving new music
technologies, too: WavTool, a recently released, AI-powered music production
platform, fully integrates OpenAI’s GPT-4 to enable users to create music via
text prompts.
For Tallon, the difference in scale between individual
composers’ customized use of AI and these new, broad-reaching technologies
represents a cause for concern.
“We are looking at different types of data sets that are
compiled for different reasons,” she said. “Tools like MusicLM are trained on
data sets that are compiled by pulling from thousands of hours of labeled audio
from YouTube and other places on the internet.
“When I design a tool for my own personal use, I’m looking
at data related to my sonic priorities. But public-facing technologies use data
sets that focus on, for instance, aesthetic ideals that align more closely with
Western classical systems of organizing pitches and rhythms.”
Bias in music-related AI
Concerns over bias in music-related AI tools do not stop at
aesthetics. Enongo Lumumba-Kasongo, a music professor at Brown University, also
worries about how these technologies can reproduce social hierarchies.
“There is a very specific racial discourse that I’m very
concerned about,” she said. “I don’t think it’s a coincidence that hip-hop
artistry is forming the testing ground for understanding how AI affects artists
and their artistry given the centuries-long story of co-optation and theft of
Black expressive forms by those in power.”
The popularity of recent AI-generated songs that mimicked
artists like Drake, the Weeknd, Travis Scott and others has animated
Lumumba-Kasongo’s fears. “What I’m most concerned about with AI Drake and AI
Travis Scott is that their music is highly listenable,” she said, “and calls
into question any need for an artist once they’ve articulated a distinct
‘voice.’”
For Schankler, there are key differences between using RAVE
to synthesize new versions of a collaborator’s voice and using AI to
anonymously imitate a living musician. “I don’t find it super interesting to
copy someone’s voice exactly, because that person already exists,” they said.
“I’m more interested in the new sonic possibilities of this technology. And
what I like about RAVE is that I can work with a small data set that is created
by one person who gives their permission and participates in the process.”
Composer Robert Laidlow also uses AI in his work to
contemplate the technology’s fraught implications. “Silicon,” which premiered
in October with the BBC Philharmonic under Vimbayi Kaziboni, employs AI tools to
explore themes drawn from the technology’s transformative and disruptive
potential.
Laidlow described “Silicon” as “about technology as much as
it uses technology,” adding: “The overriding aesthetic of each movement of this
piece are the questions, ‘What does it mean for an orchestra to use this
technology?’ and ‘What would be the point of an orchestra if we had a
technology that can emulate it in every way?’”
The work’s entirely acoustic first movement features a
mixture of Laidlow’s original music and ideas he adapted from the output, he
said, of a “symbolic, generative AI that was trained on notated material from
composers all throughout history.” The second movement features an AI-powered
digital instrument, performed by the orchestra’s pianist, that “sometimes mimics
the orchestra and sometimes makes uncanny, weird sounds.”
In the last movement, the orchestra is accompanied by sounds
generated by a neural synthesis program called PRiSM-SampleRNN, which is akin
to RAVE and was trained on a large archive of BBC Philharmonic radio
broadcasts. Laidlow described the resulting audio as “featuring synthesized
orchestral music, voices of phantom presenters and the sounds the artificial
intelligence has learned from audiences.”
The size of “Silicon” contrasts with the intimacy of
Schankler and Wang’s performance of “The Duke of York.” But both instances
illustrate AI’s potential to expand musical practices and human expression.
And, importantly, by employing small, curated data sets tailored to individual
collaborators, these projects attempt to sidestep the ethical concerns many have
identified in larger-scale technologies.