This ‘Jukebox’ AI Generates Complete Songs

OpenAI, an independent research organization aimed at developing “friendly AI,” has been cranking out a lot of impressive work over the past few months. The organization, for example, recently released the source code for GPT-2, the language model behind the text-generation tool Talk to Transformer. Now OpenAI is adding Jukebox to its repertoire of AI tools: an AI that generates raw audio of genre-specific songs.

OpenAI recently announced the release of Jukebox, describing it as an AI that can generate music in the raw audio domain. Raw audio here means the uncompressed waveform itself, stored as a long sequence of samples, rather than a symbolic representation such as MIDI or sheet music. Researchers unassociated with OpenAI have previously said that generating music with the idiosyncrasies and nuances of a real musical performance is only possible with raw audio. Generating it, however, is difficult when using training data from digital music that’s already been “cleaned up.”
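To give a rough sense of what working in the raw audio domain involves, here is a minimal Python sketch that builds one second of a plain tone as a list of samples and packs it as headerless 16-bit PCM. It is not taken from Jukebox. The sample rate, bit depth, and the helper names `sine_samples` and `to_raw_bytes` are assumptions chosen for illustration.

```python
import math
import struct

# Assumed for illustration: CD-quality sample rate, 16-bit mono samples.
SAMPLE_RATE = 44_100  # samples per second


def sine_samples(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Return a pure tone as a list of signed 16-bit integer samples."""
    count = int(seconds * sample_rate)
    return [
        int(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]


def to_raw_bytes(samples):
    """Pack samples as little-endian signed 16-bit PCM with no file header."""
    return struct.pack(f"<{len(samples)}h", *samples)


tone = sine_samples(440.0, 1.0)   # one second of a 440 Hz tone
raw = to_raw_bytes(tone)          # two bytes per sample

# Number of samples a model would have to produce for a four-minute song.
four_minute_song = 4 * 60 * SAMPLE_RATE

print(len(tone), "samples per second")
print(len(raw), "bytes per second of mono audio")
print(four_minute_song, "samples in a four-minute song")
```

Under these assumptions, a four-minute song comes to more than ten million individual samples, which helps explain why generating raw audio is so much harder than generating a symbolic score.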

OpenAI trained convolutional neural networks on a curated dataset of 1.2 million songs to generate the raw-sounding music. In the organization’s paper describing Jukebox, the researchers say the songs were paired with their corresponding lyrics and metadata, collected from LyricWiki. The metadata for each song included genre, artist, album, and any associated playlist keywords.

Jukebox can categorize musicians based on the way they sound. Image: OpenAI

Essentially, OpenAI’s software engineers trained the convolutional neural networks, which are machine learning algorithms especially good at identifying patterns in images and language, on the 1.2 million songs and all of their related metadata. Using that training data, the neural networks then produced their own songs: raw musical samples that follow the same patterns found in the music they were fed.

The songs created by Jukebox are stunningly realistic. The track immediately above, for example, was generated by Jukebox after it received only lyrics co-written by a language modeling tool and OpenAI researchers. That means Jukebox was able to take the provided lyrics and generate an appropriate singing voice, instrumentals, and genre. Everything in that song except the lyrics was created by Jukebox from scratch.

Looking forward, OpenAI will be moving toward musical collaborations between humans and machines. “We expect human and model collaborations to be an increasingly exciting creative space,” OpenAI says in its press release. The organization adds, however, that “While Jukebox is an interesting research result, [the musicians who’ve tested it so far] did not find it immediately applicable to their creative process given some of its current limitations.”

What do you think about OpenAI’s Jukebox? Do you see endless possibilities here for collaboration between human and AI musicians? Let us know your thoughts in the comments!

Feature image: OpenAI