gentlesnow

23 May 2019

[ASR] Processing audio files with librosa

Librosa is a Python toolkit for analyzing and processing audio and music.

librosa

librosa.beat

Functions for estimating tempo and detecting beat events.

librosa.core

Core functionality includes functions to load audio from disk, compute various spectrogram representations, and a variety of commonly used tools for music analysis. For convenience, all functionality in this submodule is directly accessible from the top-level librosa.* namespace.

  • librosa.core.load(path, sr=22050, mono=True, offset=0.0, duration=None, dtype=<class 'numpy.float32'>, res_type='kaiser_best')

Load an audio file as a floating point time series.

Audio will be automatically resampled to the given rate (default sr=22050).

To preserve the native sampling rate of the file, use sr=None.

librosa.decompose

Functions for harmonic-percussive source separation (HPSS) and generic spectrogram decomposition using matrix decomposition methods implemented in scikit-learn.

librosa.display

Visualization and display routines using matplotlib.

librosa.effects

Time-domain audio processing, such as pitch shifting and time stretching. This submodule also provides time-domain wrappers for the decompose submodule.

librosa.feature

Feature extraction and manipulation. This includes low-level feature extraction, such as chromagrams, pseudo-constant-Q (log-frequency) transforms, Mel spectrogram, MFCC, and tuning estimation. Also provided are feature manipulation methods, such as delta features, memory embedding, and event-synchronous feature alignment.

librosa.filters

Filter-bank generation (chroma, pseudo-CQT, CQT, etc.). These are primarily internal functions used by other parts of librosa.

librosa.onset

Onset detection and onset strength computation.

librosa.output

Text- and wav-file output.

librosa.segment

Functions useful for structural segmentation, such as recurrence matrix construction, time-lag representation, and sequentially constrained clustering.

librosa.sequence

Functions for sequential modeling. Various forms of Viterbi decoding, and helper functions for constructing transition matrices.

librosa.util

Helper utilities (normalization, padding, centering, etc.)

  • librosa.util.normalize(S, norm=inf, axis=0, threshold=None, fill=None)

Normalize an array along a chosen axis.

Given a norm (set by the norm parameter; the default, inf, takes the maximum absolute value) and a target axis, the input array is scaled so that norm(S, axis=axis) == 1.

For example, axis=0 normalizes each column of a 2-d array by aggregating over the rows (0-axis). Similarly, axis=1 normalizes each row of a 2-d array.

This function also supports thresholding small-norm slices: any slice (i.e., row or column) with norm below a specified threshold can be left un-normalized, set to all-zeros, or filled with uniform non-zero values that normalize to 1.

Note: the semantics of this function differ from scipy.linalg.norm in two ways: multi-dimensional arrays are supported, but matrix-norms are not.
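To make the axis and thresholding semantics concrete, here is a small numpy-only sketch. It is not librosa's implementation: normalize_sketch is a hypothetical helper that handles only the default norm=inf case, and its default threshold (the smallest positive float) is an assumption of this sketch. With librosa installed, the corresponding call would be librosa.util.normalize(S, axis=0, threshold=0.01, fill=True).

```python
import numpy as np


def normalize_sketch(S, axis=0, threshold=None, fill=None):
    """Infinity-norm normalization along `axis` (illustration only)."""
    S = np.asarray(S, dtype=float)
    # Infinity norm of each slice: the largest absolute value along `axis`.
    length = np.max(np.abs(S), axis=axis, keepdims=True)
    if threshold is None:
        threshold = np.finfo(float).tiny  # assumed default for this sketch
    small = length < threshold
    # Slices below the threshold are divided by 1, i.e. left un-normalized.
    out = S / np.where(small, 1.0, length)
    if fill is True:
        # Uniform non-zero values whose infinity norm is 1.
        out = np.where(small, 1.0, out)
    elif fill is False:
        # Small-norm slices are set to all zeros.
        out = np.where(small, 0.0, out)
    return out


S = np.array([[1.0, 2.0, 0.001],
              [4.0, -8.0, -0.002]])

by_column = normalize_sketch(S, axis=0)   # each column has max |value| 1
by_row = normalize_sketch(S, axis=1)      # each row has max |value| 1

# The third column has norm 0.002, which is below threshold=0.01.
kept = normalize_sketch(S, axis=0, threshold=0.01)
filled = normalize_sketch(S, axis=0, threshold=0.01, fill=True)
zeroed = normalize_sketch(S, axis=0, threshold=0.01, fill=False)

print(by_column)
print(by_row)
print(kept[:, 2], filled[:, 2], zeroed[:, 2])
```

The three fill settings correspond to the three options named above: fill=None leaves a small-norm slice as it is, fill=True replaces it with uniform values, and fill=False zeroes it.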

Til next time,
gentlesnow at 15:23
