BassNet can be a strong ally for artists: Start from a draft and build a part you would not imagine without it.
[A]fter few tries there’s a specific groove, . . . a kind of reggae pattern has been revealed to my ears... So cool, because I would never have thought of it on my own! That was THE good fragment for me: short notes, very airy, contrasting with the long smooth notes / ambiance you can hear in the middle of the arrangement, so I decided to edit this part and make it the main pattern for the bass.
— Luc Leroy, musician
BassNet definitely takes me outside of my usual patterning concepts. It really does make me end up with something unexpected.

Making AI work for artists

Sony Computer Science Laboratories is a fundamental research lab whose Music and AI team has a distinctively artist-centric vision: a new generation of AI-driven music production tools that augment creativity and benefit the music creation process. We believe that rather than producing novel pieces of music at the press of a button, AI music tools should trigger the imagination, offer opportunities for interaction, and blend into the artistic workflow.


What is BassNet?

BassNet is a new AI music tool prototype by Sony CSL. With BassNet you can interactively explore bass lines for your music project. BassNet is designed to react to any musical material you may already have in your project, rather than producing bass lines from scratch. A distinctive feature of BassNet is that you can control the way the bass reacts to the input audio while listening, simply by moving a point within a square (the latent space). Some regions of the square will produce simple, functional bass lines, while others give you more adventurous, even unruly, results.

BassNet features at a glance

  • You can input any audio
  • No restrictions on tempo or timing: BassNet follows your material
  • Explore and tweak bass lines interactively while the music is playing
  • Control note density, articulation, timbre and more
  • Export bass lines in audio and MIDI format to use in your DAW project
  • MIDI includes tuning, dynamics and pitch bend information
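The exported MIDI carries tuning information via pitch bend. As an illustrative sketch (not BassNet's actual export code), the function below shows how a tuning offset in cents can be packed into a standard 14-bit MIDI pitch-bend value, assuming the common ±2-semitone bend range:

```python
def cents_to_pitch_bend(cents: float, bend_range_cents: float = 200.0) -> tuple[int, int]:
    """Convert a tuning offset in cents to a 14-bit MIDI pitch-bend value,
    returned as the (LSB, MSB) data-byte pair of a pitch-bend message.

    Assumes the receiving synth's bend range is +/- bend_range_cents
    (commonly 200, i.e. two semitones). 8192 is the no-bend center value.
    """
    # Scale the offset into the 14-bit range and clamp to valid bounds.
    value = 8192 + round(cents / bend_range_cents * 8192)
    value = max(0, min(16383, value))
    # A pitch-bend message carries the value as two 7-bit bytes, LSB first.
    return value & 0x7F, (value >> 7) & 0x7F

# No offset -> centered bend (LSB 0, MSB 64).
print(cents_to_pitch_bend(0.0))   # (0, 64)
# A quarter-tone up (+50 cents) with a 2-semitone bend range.
print(cents_to_pitch_bend(50.0))  # (0, 80)
```

Reimporting such MIDI into a DAW reproduces the microtonal inflections of the generated bass, provided the instrument's bend range matches the assumed one.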

Schematic overview of BassNet


BassNet Demonstration Video

(Best heard using headphones or loudspeakers)

Examples

Examples 1–3 are arrangements by musician Luc Leroy, showcasing BassNet as a music production tool.

Examples 4 and 5 are works by musician Donn Healy for Sony CSL (2020), documenting his creative workflow with BassNet.

Examples 6–8 demonstrate some of the different bass lines you can create with BassNet just by selecting different positions in the latent space.

The sound examples are best listened to with headphones or good loudspeakers, so that the bass can be heard clearly.

Example 1

Luc Leroy Production: Velvet

Example 2

Luc Leroy Production: Delilah

Example 3

Luc Leroy Production: Euphony

Example 4

Donn Healy Production

Example 5

Donn Healy Production

Example 6

Input: Rubato Piano

Example 7

Input: Polyphonic Bass Guitars

Example 8

Input: Drone Metal


Example 1

Title: Velvet
Author: Luc Leroy

This fragment consists of an intro of vocals and acoustic guitar, and a main part adding percussion, flute, and atmospherics. Luc is looking for a bass accompaniment in the main part. He uses BassNet to explore different possibilities. In about 5 minutes he finds something he likes. In his own words:

[A]fter few tries there’s a specific groove you can spot at 00:43 (instrumental part with drums), a kind of reggae pattern has been revealed to my ears... So cool, because I would never have thought of it on my own! That was THE good fragment for me: short notes, very airy, contrasting with the long smooth notes / ambiance you can hear in the middle of the arrangement, so I decided to edit this part and make it the main pattern for the bass.
— Luc Leroy, musician

Velvet with raw BassNet output

DAW Screenshot of raw BassNet track


Velvet with edited BassNet output

DAW Screenshot of edited BassNet track

Example 2

Title: Delilah
Author: Luc Leroy

The input to BassNet is a mix of sampled voice, guitar, and organ-like synths. The second half of the fragment features arpeggiated and atmospheric synths.

The first audio fragment includes the raw BassNet output without any editing. The second audio fragment features the output edited by Luc. As in the previous example, Luc selected motifs from the raw BassNet output and used them to construct the bass track for the arrangement.

Delilah with raw BassNet output

DAW Screenshot of raw BassNet track


Delilah with edited BassNet output

DAW Screenshot of edited BassNet track

Example 3

Title: Euphony
Author: Luc Leroy

The input to BassNet is a polyrhythmic multi-instrumental fragment, consisting of piano, mallet percussion, pizzicato strings, and synths among other instruments.

Euphony with edited BassNet output

Example 4

Author: Donn Healy

What musicians often value in their production tools (e.g. an effects rack, or a synthesizer) is personality — a persistent character that shapes the output in specific ways. Extended use of a tool tends to give musicians an intimate knowledge of its affordances, enabling them to use the tool effectively and efficiently.

In this sense BassNet resembles traditional music production tools, but it offers novel ways of interaction. Specifically, BassNet proposes creative musical content in reaction to:

  1. the music content provided by the musician to BassNet;
  2. the settings the musician chooses from the interface.

One interesting way to control BassNet is therefore not through the interface settings (2), but through the musical input (1). It is possible to design inputs specifically to make BassNet react in a particular fashion.

This is what musician Donn Healy does in this example. After spending time with BassNet to get a feel for its behavior, Donn starts the production of a music sample by designing drum beats specifically intended to elicit interesting results from BassNet. He integrates the results into the production, in turn modifying the drum beats in response to BassNet's propositions.

Donn describes this production method as a “conversation” with BassNet. His workflow for this example consists of the following steps:

  1. Make 3 rhythmical motifs
  2. Print them as one file and drop into BassNet
  3. Experiment with BassNet and download multiple results
  4. Select the best result and build a new beat inspired by the suggestion
  5. Select another result and transpose, use as ‘dark pad’
  6. Select another, transpose and use to make the distorted lead
  7. Arrange
  8. Mix & Master

Audio result


Video analysis

Example 5

Author: Donn Healy

The character of BassNet's behavior depends on factors such as its window size over the audio and the data on which it is trained. In this example, Donn Healy takes advantage of BassNet’s different personalities to build drum & bass music from a conversation with the different models. He uses the following models:

  • short-pr — Model with a 0.8s audio window, trained on a pop/rock dataset
  • long-pr — Model with a 6.4s audio window, trained on the same pop/rock dataset
  • long-pr-perc — Like long-pr, but trained only on the percussive parts of the songs
  • long-pr-harm — Like long-pr, but trained only on the harmonic parts of the songs
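The window size determines how much musical context the model sees at once: a short window reacts to local events, a long window can track phrase-level structure. As a rough sketch (the sample rate and hop size below are assumptions, not BassNet's actual configuration), here is how the two window lengths cover the same audio clip:

```python
def count_windows(n_samples: int, win: int, hop: int) -> int:
    """Number of analysis windows of length `win` (in samples) that fit
    in a clip of `n_samples` samples, sliding by `hop` samples each step."""
    if n_samples < win:
        return 0
    return (n_samples - win) // hop + 1

SR = 44100                 # assumed sample rate
clip = 16 * SR             # a 16-second loop (8 bars at 120 BPM)
short_win = int(0.8 * SR)  # short-pr: 0.8 s window
long_win = int(6.4 * SR)   # long-pr: 6.4 s window
hop = int(0.4 * SR)        # hypothetical 0.4 s hop between windows

print(count_windows(clip, short_win, hop))  # 39 short-window positions
print(count_windows(clip, long_win, hop))   # 25 long-window positions
```

Each short window spans well under a bar at this tempo, while each long window spans over three bars, which is why the two model families respond to such different aspects of the input.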

Procedure:

  1. Build a ‘drum & bass’ beat – drum parts only
  2. Feed the beat to BassNet
  3. Explore long-pr and download audio/MIDI
  4. Explore long-pr-perc and download audio/MIDI
  5. Add some chords to loop in order to test long-pr-harm
  6. long-pr-harm did not succeed in this case, but the harmonic content stimulated the most cohesive story from BassNet to date in short-pr mode; download audio/MIDI
  7. In the spirit of seeking faster workflows with AI assistance, opt to use the audio output rather than the MIDI output
  8. Edit the bass lines as desired
  9. Adjust the sound of bass lines to taste, use CSL's Profile EQ directly on bass parts
  10. Arrange
  11. Mix & master

Audio result


Structure / model use


Video analysis

Example 6

The source is a piano excerpt with strong rubato but clear onsets. In both examples BassNet follows the tempo changes and provides realistic harmonies. A notable difference is that variation 1 tends to rhythmically follow the 16th notes in the higher piano register (but using mostly low-pitched notes), whereas variation 2 follows the slower, lower part of the piano (but with higher notes), leading overall to a sparser and quieter result.

The piano rolls show excerpts of the respective bass lines to visually demonstrate the differences.

Original music: Emmanuel Deruty, 2012. BassNet MIDI output sonified with Native-Instruments Kontakt, Scarbee MM-Bass and Cerberus Bass Amp.

Input audio

Bass line 1

Bass line 2

Example 7

The source is a polyphonic fragment played on electric bass with non-trivial harmony. For this example the predicted bass tracks have been transposed one octave upward and quantized to 16th notes. The first variation is created using a latent position that is representative of the training data. The predicted bass line sometimes doubles parts in the source, and sometimes deviates from them. The output harmony is reasonable and consistent with the input. The rhythm consists mostly of 8th notes, like the source, with occasional 16th notes that bring groove and variation.

Variation 2 corresponds to a more remote position in the latent space. Accordingly, the result is more exotic, reminiscent of improvisation, freestyling, or fiddling with the bass. At one point it remains silent for three seconds, then resumes. It is not unlike a human bass player trying something different, succeeding at particular points and failing at others.

Original music: Emmanuel Deruty, 2008. BassNet MIDI output sonified with Native-Instruments Kontakt, Scarbee MM-Bass and Cerberus Bass Amp.

Input audio

Bass line 1

Bass line 2

Example 8

The source is a combination of guitar and drums. The drum programming is uncommon in that it does not feature an alternation of kick and snare. The rhythm of the guitar part is slow and sparse, and the heavy distortion obfuscates the harmony. We believe providing a bass accompaniment to this example would not be trivial for a human player. As in the previous examples, the predicted bass is transposed upward one octave and quantized to 16th notes.

Both variations are quite similar in terms of pitch, and succeed in following the harmony of the guitar. In terms of rhythm BassNet produces uncommon grooves, likely in response to the atypical drum track. Varying the latent position in this case leads to different grooves.

Original music: Emmanuel Deruty, Yan Guérin, 2019. Drum programming made with CSL prototype DrumNet by Stefan Lattner and Kontakt Studio Drummer. BassNet MIDI output sonified with Native-Instruments Kontakt, Scarbee MM-Bass and Cerberus Bass Amp.

Input audio

Bass line 1

Bass line 2

How does BassNet work?

BassNet is driven by a neural network that was trained on a dataset of pop/rock songs to learn the relationship between the bass guitar and the rest of the music. It organizes the different ways in which the bass guitar can relate to the rest of the music in a 2-dimensional plane (the latent space), such that for a new piece of music different bass lines can be inferred by varying the position in the latent space.
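Conceptually, each position in the latent square maps to a different bass-generation behavior, and nearby positions yield similar bass lines. The toy sketch below is not BassNet's actual model (which is a variational gated autoencoder, see the paper below); it only illustrates the idea that a property of the output, such as a hypothetical "note density", can vary smoothly across the square via bilinear interpolation between corner values:

```python
def bilinear(corners, x, y):
    """Interpolate a value over the unit square from its four corner values.

    corners = (bottom-left, bottom-right, top-left, top-right).
    Moving the point (x, y) smoothly blends between the corner behaviors,
    mimicking how nearby latent positions yield similar bass lines.
    """
    bl, br, tl, tr = corners
    bottom = bl * (1 - x) + br * x
    top = tl * (1 - x) + tr * x
    return bottom * (1 - y) + top * y

# Hypothetical "note density" (notes per bar) at the corners of the square.
density_corners = (2.0, 8.0, 4.0, 16.0)

print(bilinear(density_corners, 0.0, 0.0))  # 2.0  -> sparse, functional
print(bilinear(density_corners, 1.0, 1.0))  # 16.0 -> busy, adventurous
print(bilinear(density_corners, 0.5, 0.5))  # 7.5  -> in between
```

In the real system, of course, the latent position conditions an entire neural network rather than a single scalar, but the interaction principle is the same: dragging the point continuously morphs the generated bass line.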

The system is described in detail in the following open-access publication:

M. Grachten, S. Lattner, E. Deruty (2020). BassNet: A Variational Gated Autoencoder for Conditional Generation of Bass Guitar Tracks with Learned Interactive Control. Applied Sciences, Special Issue "Deep Learning for Applications in Acoustics: Modeling, Synthesis, and Listening", 10(18):6627.

PDF

The BassNet neural network architecture in detail