Utilities

A collection of utility functions.

Plotting


plot_sample

 plot_sample (a:numpy.ndarray, b:numpy.ndarray, gt_audio:numpy.ndarray,
              sample_rate:int=16000, plt_ioff:bool=True)

Plot a sample from the model.


db20

 db20 (x:numpy.ndarray)
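No description is given for db20. Going by the name alone, it may convert amplitudes to decibels with the usual 20 * log10 rule. The sketch below illustrates that conversion under this assumption; the helper name `db20_sketch` and the `eps` guard are hypothetical and not part of the library.

```python
import numpy as np

def db20_sketch(x, eps=1e-12):
    """Amplitude to decibels: 20 * log10(|x|). Hypothetical stand-in for db20."""
    # eps is an assumed guard against log10(0); the library may handle zeros differently.
    return 20.0 * np.log10(np.abs(x) + eps)

# A factor of 10 in amplitude corresponds to 20 dB.
levels = db20_sketch(np.array([1.0, 0.1, 10.0]))  # approximately [0, -20, 20]
```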

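Example: plot a dummy sample.

import numpy as np

# Biquad coefficients, each shaped (1, 1, 3): numerator b and denominator a
b = np.array([[[1.0, -1.60,  0.80]]])
a = np.array([[[1.0, -0.90,  0.27]]])
print(a.shape, b.shape)

# 0.3 s (plus one sample) of white noise at 16 kHz as placeholder ground-truth audio
gt_audio = np.random.randn(int(0.3 * 16000) + 1)

# plot_sample returns a tuple: the figure and an array (see the output below)
out = plot_sample(a, b, gt_audio)
out

Output:

(1, 1, 3) (1, 1, 3)
(<Figure size 600x300 with 1 Axes>,
 array([ 1.e+000, -7.e-001, -1.e-001, ..., -5.e-324, -5.e-324, -5.e-324]))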
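The array in the output above starts with 1, -0.7, -0.1, which matches the first samples of the impulse response of the filter defined by b and a. The sketch below reproduces those values with a plain difference equation. It is only a cross-check under that reading, not the code plot_sample uses; the helper name `impulse_response` is hypothetical.

```python
import numpy as np

def impulse_response(b, a, n):
    """First n samples of the impulse response of the filter b/a."""
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    b, a = b / a[0], a / a[0]  # normalise so that a[0] == 1
    h = np.zeros(n)
    for i in range(n):
        # Feed-forward part: the impulse only excites the first len(b) samples.
        acc = b[i] if i < len(b) else 0.0
        # Feedback part: subtract the weighted past outputs.
        for j in range(1, len(a)):
            if i - j >= 0:
                acc -= a[j] * h[i - j]
        h[i] = acc
    return h

h = impulse_response([1.0, -1.60, 0.80], [1.0, -0.90, 0.27], 8)
# h[:3] is approximately [1.0, -0.7, -0.1]
```

Both poles of this filter lie inside the unit circle (squared magnitude 0.27), so the response decays towards zero, which is consistent with the denormal values at the end of the printed array.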

save_and_display_audio

 save_and_display_audio (audio:numpy.ndarray, name:str,
                         sample_rate:int=44100)

Save audio file and display in a Jupyter notebook.

save_and_display_audio(gt_audio, "gt_audio.wav", sample_rate=16000)
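How the function writes the file is not shown here. As a rough illustration of the saving step only, the sketch below writes mono float audio as 16-bit PCM using the standard-library wave module. The helper name `save_wav_sketch` and the peak normalisation are assumptions, not the library's behaviour.

```python
import os
import tempfile
import wave

import numpy as np

def save_wav_sketch(audio, name, sample_rate=44100):
    """Write mono float audio to a 16-bit PCM WAV file (illustrative only)."""
    audio = np.asarray(audio, dtype=np.float64)
    peak = np.max(np.abs(audio))
    if peak > 1.0:
        audio = audio / peak  # assumed: scale down to avoid clipping
    pcm = (audio * 32767.0).astype("<i2")  # little-endian int16 samples
    with wave.open(name, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
    return name

noise = np.random.randn(4800)  # 0.3 s at 16 kHz
path = save_wav_sketch(noise, os.path.join(tempfile.gettempdir(), "sketch_noise.wav"),
                       sample_rate=16000)
```

In a notebook, the saved file could then be played with IPython.display.Audio(filename=path). The sample rate passed when saving should match the rate the audio was generated at, otherwise playback speed and duration change.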

Losses


fft_loss

 fft_loss (pred_fft:torch.Tensor, target_fft:torch.Tensor,
           lin_l1:float=1.0, lin_l2:float=0.0, log_l1:float=0.0,
           log_l2:float=0.0)

Compute the loss between the predicted and target FFT magnitudes as a weighted combination of linear and log-domain L1 and L2 terms.

            Type    Default  Details
pred_fft    Tensor           magnitude of the predicted fft
target_fft  Tensor           magnitude of the target fft
lin_l1      float   1.0      weight of the linear l1 loss
lin_l2      float   0.0      weight of the linear l2 loss
log_l1      float   0.0      weight of the log l1 loss
log_l2      float   0.0      weight of the log l2 loss
Returns     Tensor           scalar loss
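The four weights suggest a sum of L1 and L2 terms computed on both the linear and the log magnitudes. A minimal sketch of such a loss follows. The mean reduction, the natural log, and the `eps` offset are assumptions, and `fft_loss_sketch` is a hypothetical name rather than the library function.

```python
import torch

def fft_loss_sketch(pred_fft, target_fft, lin_l1=1.0, lin_l2=0.0,
                    log_l1=0.0, log_l2=0.0, eps=1e-8):
    """Weighted sum of linear and log-magnitude L1/L2 terms (illustrative only)."""
    lin_diff = pred_fft - target_fft
    # eps keeps the log finite for zero magnitudes (assumed, not from the source).
    log_diff = torch.log(pred_fft + eps) - torch.log(target_fft + eps)
    return (lin_l1 * lin_diff.abs().mean()
            + lin_l2 * lin_diff.pow(2).mean()
            + log_l1 * log_diff.abs().mean()
            + log_l2 * log_diff.pow(2).mean())

pred = torch.tensor([1.0, 2.0, 4.0])
target = torch.tensor([1.0, 1.0, 2.0])
lin = fft_loss_sketch(pred, target)  # default weights: mean of |0|, |1|, |2| = 1.0
```

With the default weights only the linear L1 term contributes. Log-domain terms weight relative errors, so they are less dominated by the loudest frequency bins.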

FFTLoss

 FFTLoss (lin_l1:float=1.0, lin_l2:float=0.0, log_l1:float=0.0,
          log_l2:float=0.0)

Module form of fft_loss: compute the loss between the predicted and target FFT magnitudes, with the weights fixed at construction.

        Type   Default  Details
lin_l1  float  1.0      weight of the linear l1 loss
lin_l2  float  0.0      weight of the linear l2 loss
log_l1  float  0.0      weight of the log l1 loss
log_l2  float  0.0      weight of the log l2 loss

MelScaleLoss

 MelScaleLoss (n_mels:int=128, sample_rate:int=16000, n_fft:int=0,
               lin_l1:float=1.0, lin_l2:float=0.0, log_l1:float=0.0,
               log_l2:float=0.0, f_min:float=0.0,
               f_max:Optional[float]=8000.0)

Compute the loss between the predicted and target mel-scale spectra.

             Type                    Default  Details
n_mels       int                     128
sample_rate  int                     16000
n_fft        int                     0
lin_l1       float                   1.0      weight of the linear l1 loss
lin_l2       float                   0.0      weight of the linear l2 loss
log_l1       float                   0.0      weight of the log l1 loss
log_l2       float                   0.0      weight of the log l2 loss
f_min        float                   0.0
f_max        typing.Optional[float]  8000.0
Returns      None
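A mel-scale loss typically projects both magnitude spectra onto a mel filterbank and then compares the results. The NumPy sketch below illustrates that idea with triangular filters and the HTK mel formula. Both choices, and all helper names, are assumptions for illustration; the library may build its filterbank differently.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)  # HTK mel formula (assumed)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_freqs, n_mels, sample_rate, f_min=0.0, f_max=8000.0):
    """Triangular mel filters, shape (n_freqs, n_mels)."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_freqs)
    # n_mels + 2 edge points, equally spaced on the mel axis
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_mels + 2))
    fb = np.zeros((n_freqs, n_mels))
    for m in range(n_mels):
        lo, mid, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[:, m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

n_fft = 512
n_freqs = n_fft // 2 + 1
fb = mel_filterbank(n_freqs, n_mels=32, sample_rate=16000)

pred_mag = np.abs(np.random.randn(n_freqs))
target_mag = np.abs(np.random.randn(n_freqs))
# Linear L1 term on the mel-projected magnitudes
mel_l1 = np.mean(np.abs(pred_mag @ fb - target_mag @ fb))
```

The remaining weights (lin_l2, log_l1, log_l2) would apply to the mel-projected spectra in the same way as in fft_loss.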

Other


to_zpk

 to_zpk (x:torch.Tensor, n_parallel:int, n_biquads:int,
         gain_factor:float=1.0)

Convert the parameters of the filter to zeros, poles and gain

Input:
  x: (B, n_parallel, n_biquads, 5)

Output:
  z: (B, n_parallel, n_biquads)
  p: (B, n_parallel, n_biquads)
  k: (B, n_parallel, n_biquads)
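The output holds one zero, one pole and one gain per biquad. One plausible reading is that each complex zero and pole stands for a conjugate pair. How the five input parameters are encoded is not stated, so the sketch below only illustrates the opposite direction under that assumption: expanding one (z, p, k) triple into biquad coefficients. The helper name `zpk_to_biquad` is hypothetical.

```python
import numpy as np

def zpk_to_biquad(z, p, k):
    """Expand a conjugate zero pair and pole pair into biquad coefficients (b, a)."""
    # (1 - z q)(1 - conj(z) q) = 1 - 2 Re(z) q + |z|^2 q^2, with q = z^-1 the unit delay
    b = k * np.array([1.0, -2.0 * z.real, abs(z) ** 2])
    a = np.array([1.0, -2.0 * p.real, abs(p) ** 2])
    return b, a

# Zero and pole chosen to reproduce the coefficients of the plotting example
z = complex(0.8, 0.4)
p = complex(0.45, np.sqrt(0.27 - 0.45 ** 2))
b, a = zpk_to_biquad(z, p, k=1.0)
# b is approximately [1.0, -1.6, 0.8] and a is approximately [1.0, -0.9, 0.27]
```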