Python library for audio augmentation

Pydiogment aims to simplify audio augmentation. It generates multiple audio files from a single mono input file, for example versions that are sped up, slowed down, or shifted in tone.



Pydiogment requires:

  • Python (>= 3.5)

  • NumPy (>= 1.17.2)
    pip install numpy

  • SciPy (>= 1.3.1)
    pip install scipy

  • FFmpeg
    sudo apt install ffmpeg
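
The requirements above can be checked before installing. The following is a minimal sketch that uses only the Python standard library. `check_requirements` is a hypothetical helper, not part of Pydiogment, and it only checks the interpreter version and whether an `ffmpeg` executable is on the PATH.

```python
import shutil
import sys


def check_requirements(min_python=(3, 5)):
    """Return a list of problems found; an empty list means both checks passed.

    Hypothetical helper for illustration, not part of Pydiogment.
    """
    problems = []
    # Compare the running interpreter against the minimum version.
    if sys.version_info < min_python:
        problems.append(
            "Python {}.{} or newer is required".format(*min_python)
        )
    # shutil.which returns None when no matching executable is on the PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("missing requirement:", problem)
```

If nothing is printed, both checks passed. NumPy and SciPy versions are not checked here.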


If you already have a working installation of NumPy and SciPy, you can install Pydiogment with pip:

pip install pydiogment

To update an existing version of Pydiogment, use:

pip install -U pydiogment

How to use

  • Amplitude related augmentation

    • Apply a fade in and fade out effect

      from pydiogment.auga import fade_in_and_out
      test_file = "path/test.wav"
      fade_in_and_out(test_file)
    • Apply gain to file

      from pydiogment.auga import apply_gain
      test_file = "path/test.wav"
      apply_gain(test_file, -100)
      apply_gain(test_file, -50)
    • Add random Gaussian noise to the file at a given SNR

      from pydiogment.auga import add_noise
      test_file = "path/test.wav"
      add_noise(test_file, 10)
  • Frequency related augmentation

    • Change file tone

      from pydiogment.augf import change_tone
      test_file = "path/test.wav"
      change_tone(test_file, 0.9)
      change_tone(test_file, 1.1)
  • Time related augmentation

    • Slow down / speed up file

      from pydiogment.augt import slowdown, speed
      test_file = "path/test.wav"
      slowdown(test_file, 0.8)
      speed(test_file, 1.2)
    • Apply random cropping to the file

      from pydiogment.augt import random_cropping
      test_file = "path/test.wav"
      random_cropping(test_file, 1)
    • Shift data on the time axis in a given direction

      from pydiogment.augt import shift_time
      test_file = "path/test.wav"
      shift_time(test_file, 1, "right")
      shift_time(test_file, 1, "left")
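
To illustrate what adding Gaussian noise at a given SNR means, here is a minimal standard-library sketch of the general technique. It is not Pydiogment's implementation: `add_gaussian_noise`, its `seed` parameter, and the sine-wave test signal are assumptions made for this example, and it works on a plain list of samples rather than on a file.

```python
import math
import random


def add_gaussian_noise(samples, snr_db, seed=None):
    """Return a copy of `samples` with white Gaussian noise at roughly `snr_db` dB SNR.

    Hypothetical helper for illustration, not part of Pydiogment.
    """
    rng = random.Random(seed)
    # Average power of the clean signal (mean of the squared samples).
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR_dB = 10 * log10(P_signal / P_noise), so
    # P_noise = P_signal / 10 ** (SNR_dB / 10).
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


# Assumed input: one second of a 440 Hz sine wave sampled at 8 kHz.
sample_rate = 8000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noisy = add_gaussian_noise(clean, snr_db=10, seed=0)

# Measure the SNR that was actually achieved on this signal.
noise = [n - c for n, c in zip(noisy, clean)]
measured_snr_db = 10 * math.log10(
    sum(c * c for c in clean) / sum(e * e for e in noise)
)
print("measured SNR: {:.2f} dB".format(measured_snr_db))
```

The measured value should land close to the requested 10 dB. It will not match exactly, because the noise power is estimated from a finite number of random samples.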