transforms module
- class transforms.AddWhiteNoise[source]
Bases: Module
Transformation that adds white noise to the audio signal
Example:
>>> x = torch.zeros(16000)
>>> transform = AddWhiteNoise()
>>> x_with_noise = transform(x)
- add_white_noise(audio_tensor, min_snr_db=20, max_snr_db=90, STD_n=0.5)[source]
Adds random Gaussian white noise to the audio_tensor input.
- Parameters:
audio_tensor (torch.Tensor) – one-dimensional PyTorch tensor
min_snr_db (int, optional) – minimum signal-to-noise ratio in dB. Defaults to 20.
max_snr_db (int, optional) – maximum signal-to-noise ratio in dB. Defaults to 90.
STD_n (float, optional) – standard deviation of the Gaussian distribution used to generate the noise. Defaults to 0.5.
- Returns:
tensor with noise
- Return type:
torch.Tensor
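The parameters above suggest noise drawn from a Gaussian and mixed in at a signal-to-noise ratio picked between min_snr_db and max_snr_db. The following is a minimal sketch of that idea, not the module's actual source: the function name add_white_noise_sketch, the uniform draw of the SNR, and the power-based rescaling are all assumptions made for illustration.

```python
import math

import torch


def add_white_noise_sketch(audio, min_snr_db=20.0, max_snr_db=90.0, std_n=0.5):
    """Hypothetical re-implementation; the real method may differ."""
    # Assumption: the target SNR is drawn uniformly from [min_snr_db, max_snr_db].
    snr_db = torch.empty(1).uniform_(min_snr_db, max_snr_db).item()
    # Gaussian noise with standard deviation std_n, same shape as the input.
    noise = torch.randn_like(audio) * std_n
    signal_power = audio.pow(2).mean()
    noise_power = noise.pow(2).mean()
    # Rescale the noise so that 10 * log10(signal_power / noise_power) == snr_db.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = torch.sqrt(target_noise_power / noise_power.clamp_min(1e-12))
    return audio + scale * noise, snr_db


torch.manual_seed(0)
t = torch.arange(16000) / 16000
clean = torch.sin(2 * math.pi * 440 * t)  # one second of a 440 Hz tone
noisy, snr_db = add_white_noise_sketch(clean, min_snr_db=10.0, max_snr_db=30.0)

# Measure the SNR that was actually obtained.
residual = noisy - clean
measured_snr_db = (10 * torch.log10(clean.pow(2).mean() / residual.pow(2).mean())).item()
```

In this sketch the rescaling cancels the effect of std_n on the final noise level, and an all-zero input (as in the class example) has zero signal power, so the noise is scaled to zero as well. Whether the real method behaves the same way is not stated in this reference.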
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
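The difference described in the note can be observed with a forward hook. This is a small illustration using a hypothetical Double module (not part of the transforms module): calling the instance triggers the hook, while calling forward directly does not.

```python
import torch
from torch import nn


class Double(nn.Module):
    """Hypothetical toy module used only for this illustration."""

    def forward(self, x):
        return 2 * x


calls = []
module = Double()
# The hook receives (module, inputs, output) after each call to the instance.
module.register_forward_hook(lambda mod, inputs, output: calls.append(output.shape))

x = torch.zeros(4)
module(x)                      # runs forward() and the registered hook
n_after_call = len(calls)
module.forward(x)              # runs forward() only; the hook is skipped
n_after_forward = len(calls)
```

The same applies to every transform in this module: prefer transform(x) over transform.forward(x).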
- class transforms.MfccTransform(sample_rate)[source]
Bases: Module
Transformation that returns the Mel-frequency cepstral coefficients of an audio tensor
Example:
>>> x = torch.zeros(16000)
>>> transform = MfccTransform(sample_rate=16000)
>>> specgram = transform(x)
We can visualize the generated cepstrum with matplotlib using the following:
>>> import librosa
>>> import matplotlib.pyplot as plt
>>> fig, axs = plt.subplots(1, 1)
>>> axs.set_title("Mel-frequency cepstrum")
>>> axs.set_ylabel("coefficient")
>>> axs.set_xlabel("frame")
>>> im = axs.imshow(librosa.power_to_db(specgram), origin="lower", aspect="auto")
>>> fig.colorbar(im, ax=axs)
>>> plt.show(block=False)
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class transforms.Scattering[source]
Bases: Module
Wrapper for kymatio’s scattering transform. Returns the scattering coefficients of the input.
For more information about the transform, see https://www.kymat.io/
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class transforms.SpecAugment[source]
Bases: Module
Transformation that returns double time-masked and frequency-masked Mel-frequency cepstral coefficients of an audio tensor
Example:
>>> x = torch.zeros(16000)
>>> transform = SpecAugment()
>>> specgram = transform(x)
We can visualize the modified cepstrum with matplotlib using the following:
>>> import librosa
>>> import matplotlib.pyplot as plt
>>> fig, axs = plt.subplots(1, 1)
>>> axs.set_title("Mel-frequency cepstrum")
>>> axs.set_ylabel("coefficient")
>>> axs.set_xlabel("frame")
>>> im = axs.imshow(librosa.power_to_db(specgram), origin="lower", aspect="auto")
>>> fig.colorbar(im, ax=axs)
>>> plt.show(block=False)
- forward(x)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
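The time and frequency masking that SpecAugment applies to the coefficients can be pictured with plain tensor slicing. The sketch below is illustrative only: the helper apply_masks, its parameters, the (coefficients, frames) layout, and the reading of "double" as two masks along each axis are assumptions, not the class's documented behaviour.

```python
import torch


def apply_masks(spec, n_masks=2, max_time_width=10, max_freq_width=4, fill=0.0):
    """Hypothetical helper: blank random time and frequency bands of a 2-D tensor."""
    n_freq, n_frames = spec.shape
    out = spec.clone()  # leave the input untouched
    for _ in range(n_masks):
        # Time mask: overwrite a random run of consecutive frames (columns).
        width = int(torch.randint(0, max_time_width + 1, (1,)))
        start = int(torch.randint(0, n_frames - width + 1, (1,)))
        out[:, start:start + width] = fill
        # Frequency mask: overwrite a random run of consecutive coefficients (rows).
        height = int(torch.randint(0, max_freq_width + 1, (1,)))
        low = int(torch.randint(0, n_freq - height + 1, (1,)))
        out[low:low + height, :] = fill
    return out


torch.manual_seed(0)
spec = torch.ones(13, 100)   # stand-in for 13 coefficients over 100 frames
masked = apply_masks(spec)
```

Because the mask widths are drawn at random and may be zero, a given call can leave the input nearly unchanged; masked entries appear as horizontal and vertical bands when plotted as above.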