TP Yaafe Extension feature list

Telecom ParisTech Yaafe Extension module.

Available features

BeatHistogramSummary

class tpyaafeextension.BeatHistogramSummary

Compute the beat histogram according to [GT2002], but using OnsetDetectionFunction as onset detection function.

[GT2002] George Tzanetakis, Musical Genre Classification of Audio Signals, IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, July 2002.
Parameters:
  • ACPNbPeaks (default=3): Number of autocorrelation peaks to keep
  • BHSBeatFrameSize (default=128): Number of frames over which autocorrelation peaks are computed
  • BHSBeatFrameStep (default=64): Number of frames to skip between two consecutive autocorrelation peak computations
  • BHSHistogramFrameSize (default=40): Number of beat frames over which the histogram is computed
  • BHSHistogramFrameStep (default=40): Number of beat frames to skip between two consecutive histogram computations
  • FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
  • FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
  • HInf (default=40): Minimum BPM to take into consideration
  • HNbBins (default=80): Number of histogram bins
  • HSup (default=200): Maximum BPM to take into consideration
  • NMANbFrames (default=5000): Number of frames to normalize together, -1 means all frames
  • blockSize (default=1024): output frames size
  • stepSize (default=512): step between consecutive frames
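
The HInf, HSup and HNbBins parameters together define the BPM axis of the histogram. As a rough illustration, the sketch below assumes equal-width bins between HInf and HSup; the actual bin spacing used by the extension is not stated here, and the helper names `bpm_bin_edges` and `bpm_to_bin` are hypothetical.

```python
def bpm_bin_edges(h_inf=40.0, h_sup=200.0, nb_bins=80):
    """Edges of nb_bins equal-width BPM bins covering [h_inf, h_sup].

    Equal-width spacing is an assumption made for this sketch.
    """
    width = (h_sup - h_inf) / nb_bins
    return [h_inf + i * width for i in range(nb_bins + 1)]


def bpm_to_bin(bpm, h_inf=40.0, h_sup=200.0, nb_bins=80):
    """Index of the bin holding bpm, or None when bpm is outside [h_inf, h_sup)."""
    if not h_inf <= bpm < h_sup:
        return None
    return int((bpm - h_inf) * nb_bins / (h_sup - h_inf))


edges = bpm_bin_edges()
print(edges[1] - edges[0])  # 2.0 BPM per bin with the default values
print(bpm_to_bin(120))      # 40
```

Under this assumption the default settings give a resolution of 2 BPM per bin.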

Declaration example:

BeatHistogramSummary ACPNbPeaks=3  BHSBeatFrameSize=128  BHSBeatFrameStep=64  BHSHistogramFrameSize=40  BHSHistogramFrameStep=40  FFTLength=0  FFTWindow=Hanning  HInf=40  HNbBins=80  HSup=200  NMANbFrames=5000  blockSize=1024  stepSize=512
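
Every declaration example on this page has the same shape: a feature name followed by whitespace-separated `key=value` pairs. The sketch below splits such a string into its parts, which can be handy for checking a declaration before using it. `parse_declaration` is a hypothetical helper written for this page, not part of Yaafe.

```python
def parse_declaration(declaration):
    """Split 'Feature key=value ...' into (feature_name, {key: value})."""
    tokens = declaration.split()  # collapses the double spaces between pairs
    if not tokens:
        raise ValueError("empty declaration")
    name, params = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed parameter: {token!r}")
        params[key] = value  # values stay strings, e.g. 'Hanning' or '1.5s'
    return name, params


name, params = parse_declaration(
    "BeatHistogramSummary HInf=40  HNbBins=80  HSup=200  blockSize=1024  stepSize=512"
)
print(name)               # BeatHistogramSummary
print(params["HNbBins"])  # 80
```

Values are kept as strings because some parameters carry units or names rather than plain numbers.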

CQT

class tpyaafeextension.CQT

Compute the Constant-Q transform according to [CS2010] with improvements from [JPCQT].

[CS2010] C. Schörkhuber and A. Klapuri, Constant-Q Transform Toolbox for Music Processing, 7th Sound and Music Computing Conference (SMC 2010), 2010, Barcelona.
[JPCQT] J. Prado, Calcul rapide de la transformée à Q constant, http://perso.telecom-paristech.fr/~prado/cqt/cqt_modif.pdf
Parameters:
  • CQTAlign (default=c): Alignment of cqt kernels on analysis frame. ‘l’ to the left, ‘c’ to the center, ‘r’ to the right
  • CQTBinsPerOctave (default=36): Number of bins per octave to consider
  • CQTMinFreq (default=73.42): Minimal frequency. If <0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • CQTNbOctaves (default=3): Number of octaves to consider for analysis
  • stepSize (default=512): step between consecutive frames

Declaration example:

CQT CQTAlign=c  CQTBinsPerOctave=36  CQTMinFreq=73.42  CQTNbOctaves=3  stepSize=512
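
To get a feel for what CQTMinFreq, CQTBinsPerOctave and CQTNbOctaves imply, the sketch below lists the bin centre frequencies of a generic constant-Q analysis, where bin k sits at f_min * 2^(k/B). It assumes that the first bin is centred on CQTMinFreq and that the total number of bins is CQTBinsPerOctave times CQTNbOctaves; neither detail is confirmed for this implementation, and `cqt_bin_frequencies` is a hypothetical name.

```python
def cqt_bin_frequencies(min_freq_hz, bins_per_octave, nb_octaves):
    """Centre frequencies of geometrically spaced bins: f_k = f_min * 2**(k / B)."""
    nb_bins = bins_per_octave * nb_octaves  # assumed total bin count
    return [min_freq_hz * 2.0 ** (k / bins_per_octave) for k in range(nb_bins)]


freqs = cqt_bin_frequencies(73.42, 36, 3)
print(len(freqs))  # 108
print(freqs[36])   # 146.84, one octave above the minimum frequency
```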

See also

Frames

CQT2

class tpyaafeextension.CQT2

Compute the Constant-Q transform according to Blankertz’s implementation [BB], with improvements from [JP2010].

[BB] B. Blankertz, The Constant Q Transform, http://wwwmath.uni-muenster.de/logik/Personen/blankertz/constQ/constQ.html
[JP2010] J. Prado, Transformée à Q constant, technical report 2010D004, Institut TELECOM, TELECOM ParisTech, CNRS LTCI, 2010.
Parameters:
  • CQTAlign (default=c): Alignment of cqt kernels on analysis frame. ‘l’ to the left, ‘c’ to the center, ‘r’ to the right
  • CQTBinsPerOctave (default=3): Number of bins per octave to consider
  • CQTMaxFreq (default=0.5): Maximum frequency. If <=0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • CQTMinFreq (default=97.999): Minimal frequency. If <0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • stepSize (default=512): step between consecutive frames

Declaration example:

CQT2 CQTAlign=c  CQTBinsPerOctave=3  CQTMaxFreq=0.5  CQTMinFreq=97.999  stepSize=512
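
The frequency parameters accept either a fraction of the sample rate or a value in Hertz. The sketch below shows one way to resolve both forms. It treats the CQTMaxFreq default of exactly 0.5 as a fraction (the Nyquist frequency), which is an assumption about the boundary case; `resolve_frequency` and the 44100 Hz sample rate are illustrative choices, not part of the extension.

```python
def resolve_frequency(value, sample_rate, inclusive=False):
    """Return a frequency in Hz.

    Values below 0.5 (or equal to 0.5 when inclusive=True) are read as a
    fraction of the sample rate; larger values are taken to be in Hertz.
    """
    is_fraction = value <= 0.5 if inclusive else value < 0.5
    return value * sample_rate if is_fraction else value


sample_rate = 44100  # assumed sample rate for the example
min_hz = resolve_frequency(97.999, sample_rate)              # already in Hertz
max_hz = resolve_frequency(0.5, sample_rate, inclusive=True)  # half the sample rate
print(min_hz, max_hz)  # 97.999 22050.0
```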

See also

Frames

Chords

class tpyaafeextension.Chords

Chords recognizes chords from chromagrams, according to L. Oudre’s algorithm [LO2009].

[LO2009] Oudre, L., Grenier, Y. and Fevotte, C., Chord recognition by fitting rescaled chroma vectors to chord templates, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, pp. 2222-2233, Sep. 2011.
Parameters:
  • ChordsSmoothing (default=1.5s): Chords smoothing duration
  • ChordsUse7 (default=0): If 1 then use 7th chords to enrich the chord dictionary, else use only major and minor chords
  • stepSize (default=512): step between consecutive frames

Declaration example:

Chords ChordsSmoothing=1.5s  ChordsUse7=0  stepSize=512
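
ChordsSmoothing is given as a duration, while the analysis advances in steps of stepSize samples. The sketch below converts one into the other to show roughly how many frames a smoothing window spans. The rounding rule, the 44100 Hz sample rate and the helper name `duration_to_frames` are assumptions; how the extension itself performs this conversion is not documented here.

```python
def duration_to_frames(duration, sample_rate, step_size):
    """Convert a duration such as '1.5s' into a whole number of analysis frames."""
    if not duration.endswith("s"):
        raise ValueError(f"expected seconds such as '1.5s', got {duration!r}")
    seconds = float(duration[:-1])
    # Round to the nearest frame and never return fewer than one frame.
    return max(1, round(seconds * sample_rate / step_size))


print(duration_to_frames("1.5s", 44100, 512))  # 129
```

The same conversion applies to any other parameter expressed with an `s` suffix.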

Chroma

class tpyaafeextension.Chroma

Chroma computes a short-term chromagram according to [BP2005].

[BP2005] Bello, J.P. and Pickens, J., A Robust Mid-level Representation for Harmonic Content in Music Signals, in Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR-05), London, UK, September 2005.
Parameters:
  • CQTAlign (default=c): Alignment of cqt kernels on analysis frame. ‘l’ to the left, ‘c’ to the center, ‘r’ to the right
  • CQTBinsPerOctave (default=36): Number of bins per octave to consider
  • CQTMinFreq (default=73.42): Minimal frequency. If <0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • CQTNbOctaves (default=3): Number of octaves to consider for analysis
  • CTInitDuration (default=15): Duration over which to perform chroma bias initialisation, in seconds.
  • ChromaSmoothing (default=0.75s): Chroma smoothing duration
  • stepSize (default=512): step between consecutive frames

Declaration example:

Chroma CQTAlign=c  CQTBinsPerOctave=36  CQTMinFreq=73.42  CQTNbOctaves=3  CTInitDuration=15  ChromaSmoothing=0.75s  stepSize=512
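
A chromagram reduces constant-Q bins to 12 pitch classes. The sketch below shows a deliberately simplified folding step: each CQT bin is assigned to its nearest semitone and the octaves are summed. It is not the [BP2005] procedure and leaves out the bias initialisation and smoothing controlled by CTInitDuration and ChromaSmoothing; `fold_to_chroma` is a hypothetical helper.

```python
def fold_to_chroma(cqt_magnitudes, bins_per_octave=36):
    """Sum CQT bins into 12 pitch classes, wrapping around every octave.

    Pitch class 0 is the pitch class of the lowest CQT bin.
    """
    if bins_per_octave % 12:
        raise ValueError("bins_per_octave must be a multiple of 12")
    bins_per_semitone = bins_per_octave // 12
    chroma = [0.0] * 12
    for k, magnitude in enumerate(cqt_magnitudes):
        # Nearest semitone: bins just below a semitone centre belong to it too.
        semitone = (k + bins_per_semitone // 2) // bins_per_semitone
        chroma[semitone % 12] += magnitude
    return chroma


frame = [0.0] * 108             # 3 octaves of 36 bins, as in the defaults
frame[0] = frame[36] = frame[72] = 1.0  # same pitch class in each octave
print(fold_to_chroma(frame)[0])  # 3.0
```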

See also

CQT

Chroma2

class tpyaafeextension.Chroma2

Chroma2 computes a short-term pitch profile according to [ZK2006].

[ZK2006] Y. Zhu and M.S. Kankanhalli, Precise pitch profile feature extraction from musical audio for key detection, IEEE Transactions on Multimedia, 2006.
Parameters:
  • CQTAlign (default=c): Alignment of cqt kernels on analysis frame. ‘l’ to the left, ‘c’ to the center, ‘r’ to the right
  • CQTBinsPerOctave (default=48): Number of bins per octave to consider
  • CQTMinFreq (default=27.5): Minimal frequency. If <0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • CQTNbOctaves (default=7): Number of octaves to consider for analysis
  • CZBinsPerSemitone (default=1): number of bins per semitone for the PCP
  • CZNbCQTBinsAggregatedToPCPBin (default=-1): number of CQT bins which are aggregated for each PCP bin. if -1 then use CQTBinsPerOctave / 24
  • CZTuning (default=440): frequency of the A4, in Hz.
  • stepSize (default=512): step between consecutive frames

Declaration example:

Chroma2 CQTAlign=c  CQTBinsPerOctave=48  CQTMinFreq=27.5  CQTNbOctaves=7  CZBinsPerSemitone=1  CZNbCQTBinsAggregatedToPCPBin=-1  CZTuning=440  stepSize=512
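
Two of the defaults above are related by simple arithmetic. The sketch below resolves the -1 fallback of CZNbCQTBinsAggregatedToPCPBin and shows that, with CZTuning=440, the default CQTMinFreq of 27.5 Hz lies exactly four octaves below A4. The PCP size of 12 times CZBinsPerSemitone, the integer division, and the use of MIDI note numbers (A4 = 69) are assumptions made for this sketch; `pcp_layout` and `note_frequency` are hypothetical names.

```python
def pcp_layout(cqt_bins_per_octave=48, bins_per_semitone=1, aggregated=-1):
    """Return (assumed PCP size, CQT bins aggregated into each PCP bin)."""
    if aggregated == -1:
        aggregated = cqt_bins_per_octave // 24  # documented fallback
    return 12 * bins_per_semitone, aggregated


def note_frequency(midi_note, tuning_hz=440.0):
    """Equal-tempered frequency of a MIDI note, with A4 (note 69) at tuning_hz."""
    return tuning_hz * 2.0 ** ((midi_note - 69) / 12)


print(pcp_layout())        # (12, 2)
print(note_frequency(21))  # 27.5, four octaves below A4
```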

See also

CQT

OnsetDetectionFunction

class tpyaafeextension.OnsetDetectionFunction

Compute the onset detection function (spectral energy flux) according to the method of [MA2005].

[MA2005] M. Alonso, G. Richard, B. David, Extracting Note Onsets from Musical Recordings, International Conference on Multimedia and Expo (IEEE-ICME'05), Amsterdam, The Netherlands, 2005.
Parameters:
  • FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
  • FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
  • NMANbFrames (default=5000): Number of frames to normalize together, -1 means all frames
  • blockSize (default=1024): output frames size
  • stepSize (default=512): step between consecutive frames

Declaration example:

OnsetDetectionFunction FFTLength=0  FFTWindow=Hanning  NMANbFrames=5000  blockSize=1024  stepSize=512
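
The FFTLength rule (pad with zeros, truncate, or keep the frame as-is when the value is 0) can be sketched in a few lines. The code below assumes that zeros are appended at the end of the frame and that truncation drops trailing samples; the extension may place them differently. `fit_frame` is a hypothetical helper.

```python
def fit_frame(frame, fft_length=0):
    """Zero-pad or truncate a frame to fft_length samples; 0 keeps it unchanged."""
    if fft_length < 0:
        raise ValueError("fft_length must be 0 or positive")
    if fft_length == 0:
        return list(frame)
    if len(frame) >= fft_length:
        return list(frame[:fft_length])  # drop trailing samples
    return list(frame) + [0.0] * (fft_length - len(frame))  # pad at the end


print(fit_frame([1.0, 2.0, 3.0], 5))  # [1.0, 2.0, 3.0, 0.0, 0.0]
print(fit_frame([1.0, 2.0, 3.0], 2))  # [1.0, 2.0]
```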

See also

MagnitudeSpectrum

SpectralIrregularity

class tpyaafeextension.SpectralIrregularity

Compute difference between consecutive CQT bins, see [Brown2000].

[Brown2000] J.C. Brown, O. Houix, S. McAdams, Feature dependence in the automatic identification of musical woodwind instruments, Journal of the Acoustical Society of America, 109: 1064-1072, 2000.
Parameters:
  • CQTAlign (default=c): Alignment of cqt kernels on analysis frame. ‘l’ to the left, ‘c’ to the center, ‘r’ to the right
  • CQTBinsPerOctave (default=36): Number of bins per octave to consider
  • CQTMinFreq (default=73.42): Minimal frequency. If <0.5 then assume it’s a factor of sampleRate else assume it’s expressed in Hertz.
  • CQTNbOctaves (default=3): Number of octaves to consider for analysis
  • stepSize (default=512): step between consecutive frames

Declaration example:

SpectralIrregularity CQTAlign=c  CQTBinsPerOctave=36  CQTMinFreq=73.42  CQTNbOctaves=3  stepSize=512
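
The description "difference between consecutive CQT bins" can be read as a first-order difference along the frequency axis. The sketch below computes exactly that for one frame. Whether the extension works on linear or logarithmic magnitudes, or applies any normalization, is not stated here, so this is only the plainest reading; `consecutive_bin_differences` is a hypothetical name.

```python
def consecutive_bin_differences(cqt_magnitudes):
    """Difference between each CQT bin and the bin just below it."""
    return [upper - lower
            for lower, upper in zip(cqt_magnitudes, cqt_magnitudes[1:])]


print(consecutive_bin_differences([1.0, 3.0, 2.0]))  # [2.0, -1.0]
```

A frame of N bins yields N - 1 differences under this reading.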

See also

CQT
