

Australian audio analysis tools



A module is a content analysis function that works either on information provided by tier 1 or on the results of another module. Tier 2 provides the data structures used by modules and an interface to dynamically load and call a module from a library. The goal of a module is to extract higher-level content analysis features from low-level features.

A set of fundamental modules is distributed with Maaate as a set of default modules in a separate library. The features are described in the CSIRO Technical Report No.196, December 2001.

If you would like to add new modules or libraries to Maaate, read the module guide.

Maaate Fundamental Modules List

The Maaate Fundamental Modules Library contains the following modules:


Symbols Legend

In the formulae, all indices start at 0. The window index starts at 0 and covers the whole file.

  • n: time index, 0 <= n <= N-1
  • N: duration of the file
  • i: subband index, 0 <= i <= I-1
  • I: number of subbands
  • Si(n): value of subband i at time n. In the following formulae, Si(n) is used as a placeholder, and any of the pre-processed values may be used in its place.
  • m: time position within a window, 0 <= m <= M-1
  • M: number of time divisions (frequency samples) in one window, i.e. the window size
  • o: overlap of successive analysis windows
  • t: analysis window number, 0 <= t <= (N-1)/(M-o)
  • h(m): window function at position m; may be one of Rectangular, Hamming, Welch or Bartlett, depending on the required narrowness or peakedness of the spectral leakage
  • Ts: threshold
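
As an illustration of the window symbols, here is a minimal Python sketch. It assumes that successive windows advance by M - o samples, which is what the bound on t suggests, and it uses the textbook Hamming formula; the function names are illustrative and are not part of Maaate.

```python
import math

def hamming(M):
    """Textbook Hamming window h(m) for 0 <= m <= M-1."""
    if M == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * m / (M - 1)) for m in range(M)]

def window_starts(N, M, o):
    """Start index n of each analysis window t = 0 .. (N-1) // (M-o)."""
    hop = M - o  # assumption: each window advances by window size minus overlap
    return [t * hop for t in range((N - 1) // hop + 1)]

starts = window_starts(N=10, M=4, o=2)  # [0, 2, 4, 6, 8]
weights = hamming(4)                    # one weight per position m in a window
```

With these assumptions the last windows may extend past the end of the file, so an implementation would have to truncate or pad them.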

Normalised Subband Energies


    Gives an energy spectrum that shows how the energy spreads over the subbands. The normalisation removes the dependency on the sound level of the audio source.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Mean.
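
    A minimal Python sketch of one plausible normalisation, assuming that each subband mean is divided by the sum over the selected subbands. This choice and the function name are assumptions, not the formula used by Maaate.

```python
def normalised_subband_energies(subband_means):
    """Scale one window's subband means so that they sum to 1."""
    total = sum(subband_means)
    if total == 0:
        # A silent window has no energy to distribute.
        return [0.0] * len(subband_means)
    return [v / total for v in subband_means]

quiet = normalised_subband_energies([1, 1, 2])
loud = normalised_subband_energies([10, 10, 20])  # same shape, ten times louder
```

    Both calls give the same result, which is the level independence the description refers to.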



Normalised Subband Values Energies


    Gives an energy spectrum that shows how the energy spreads over the subbands. The normalisation removes the dependency on the sound level of the audio source.

    Resolution: Frequency sample.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Values Mean.



Subband Scalefactors (Channel 0)


    Returns a spectrum giving, for each selected subband, the maximum of the frequency samples over each window.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    For MPEG layer 1 and 2 files, it directly uses the scalefactor values as an approximation of the maximum.



Subband Mean


    Gives a spectrum of the subband values averaged over each window. This provides a way to go from frequency sample resolution to window resolution.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Values Mean.



Subband RMS


    Returns a spectrum. The root mean square (RMS) subband vector is a better measure of signal energy than the subband values themselves.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Values Mean.
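
    A minimal Python sketch of the standard RMS over one window, assuming that a window is given as M rows of I subband values and that no window function is applied. The data layout and function name are assumptions.

```python
import math

def subband_rms(window):
    """Root mean square of each subband over the M time positions of one window."""
    M = len(window)
    I = len(window[0])
    return [math.sqrt(sum(row[i] ** 2 for row in window) / M) for i in range(I)]

# Two time positions, two subbands. Subband 0 alternates in sign, so its
# plain mean is 0 although it clearly carries energy; its RMS is 3.0.
rms = subband_rms([[3, 0], [-3, 4]])
```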



Subband Values


    Gives a spectrum of the raw subband values extracted directly from the analysed file for channel 0.

    Resolution: Frequency sample.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.

Subband Values Mean


    Gives a spectrum of the mean, over channels, of the subband values extracted directly from the analysed file.

    Resolution: Frequency sample.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Values.

Subband Values RMS


    Gives a spectrum of the RMS, over channels, of the subband values extracted directly from the analysed file.

    Resolution: Frequency sample.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband Values.

Band Energy


    An approximation of the signal energy over the selected subbands.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband, end subband, window number.
    Based on the Subband Values Mean.



Signal Energy


    An approximation of the signal energy.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, window number.
    Based on the Subband Values Mean.



Signal Magnitude


    An approximation of the perceptual loudness that is less sensitive to noise than Signal Energy.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, window number.
    Based on the Subband Values Mean.
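
    A minimal Python sketch contrasting the two measures, assuming that energy is a sum of squared values and magnitude a sum of absolute values over one window. Both definitions and both function names are assumptions.

```python
def signal_energy(values):
    """Sum of squared values: large values dominate the result."""
    return sum(v * v for v in values)

def signal_magnitude(values):
    """Sum of absolute values: grows only linearly with each value."""
    return sum(abs(v) for v in values)

values = [1, -2, 10]                  # one outlier among small values
energy = signal_energy(values)        # 105, of which the outlier contributes 100
magnitude = signal_magnitude(values)  # 13, of which the outlier contributes 10
```

    Under these assumptions a single noisy spike accounts for a much larger share of the energy than of the magnitude.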



Sum of Scalefactors


    Provides a faster approximation of the Signal Magnitude.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Scalefactors.


    It simply sums the scalefactors over a whole window.



Bandwidth


    Returns the bandwidth, i.e. the minimum and maximum frequency, using a threshold.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, Ts dynamic threshold.
    Based on the Subband Mean.
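
    A minimal Python sketch of a threshold-based bandwidth search over one window, assuming a fixed threshold Ts and reporting subband indices rather than frequencies. How the dynamic threshold is derived is not covered, and the function name is an assumption.

```python
def bandwidth(subband_means, ts):
    """Lowest and highest subband index whose value exceeds the threshold ts."""
    significant = [i for i, v in enumerate(subband_means) if v > ts]
    if not significant:
        return None  # no subband is above the threshold
    return significant[0], significant[-1]

limits = bandwidth([0.0, 0.2, 0.9, 0.5, 0.05, 0.3, 0.01], ts=0.1)  # (1, 5)
```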



Signal Bandwidth


    Returns the bandwidth from a SegmentData, i.e. the minimum and maximum frequency, using a threshold.

    Resolution: Window.
    Parameters: SegmentData, start time, end time, Ts threshold.



Significant Subbands


    Counts the number of subbands with a significant level (higher than the threshold).

    Resolution: Window.
    Parameters: SegmentData, start time, end time, Ts threshold.
    Based on the Subband Mean.



Band Energy Ratio


    Relates the energy of the low frequencies to the energy of the high frequencies. This is an indicator of the voicedness of a sound.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, J subband boundary, window function number.
    Based on the Subband Mean.
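
    A minimal Python sketch, assuming that the ratio divides the energy of the subbands below the boundary J by the energy of the subbands from J upwards, with energy taken as a sum of squares. Both choices and the function name are assumptions.

```python
def band_energy_ratio(subband_means, J):
    """Energy of subbands 0..J-1 divided by the energy of subbands J..I-1."""
    low = sum(v * v for v in subband_means[:J])
    high = sum(v * v for v in subband_means[J:])
    if high == 0:
        return float("inf")  # all energy lies below the boundary
    return low / high

ratio = band_energy_ratio([2, 2, 1, 1], J=2)  # 8 / 2 = 4.0
```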



Central Moment


    Calculates statistics within subbands over several windows. It captures how far a subband's energy is dispersed from its mean.

    Resolution: Several windows.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number, duration of the analysis window, k parameter.
    Based on the Subband Mean.
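
    A minimal Python sketch of the textbook k-th central moment, applied to the values of one subband over several windows. It assumes that the k parameter is the order of the moment; the function name is illustrative.

```python
def central_moment(values, k):
    """k-th central moment: mean of (value - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)

one_subband = [1, 2, 3, 4]               # one value per window
spread = central_moment(one_subband, 2)  # 1.25, the population variance
```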



Spectral Centroid


    The spectral centroid is the balancing point of the subband energy distribution. It determines the frequency area around which most of the signal energy concentrates and is thus closely related to the time-domain Zero Crossing Rate feature. It is also frequently used as an approximation for a perceptual brightness measure.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time, start subband number, end subband number.
    Based on the Subband RMS.
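
    A minimal Python sketch of the usual centroid computation, assuming that subband indices are weighted by their RMS values and that the result is expressed as a fractional subband index. The function name is an assumption.

```python
def spectral_centroid(subband_rms):
    """Weighted average of the subband indices, using the values as weights."""
    total = sum(subband_rms)
    if total == 0:
        return 0.0  # silent window: no meaningful balancing point
    return sum(i * v for i, v in enumerate(subband_rms)) / total

flat = spectral_centroid([1, 1, 1])    # 1.0, the middle subband
bright = spectral_centroid([0, 0, 4])  # 2.0, all energy in the top subband
```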



Spectral Flux


    Determines the change in spectral energy distribution between two successive windows.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time.
    Based on the Subband RMS.
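
    A minimal Python sketch, assuming that the flux is the sum of squared differences between the subband RMS vectors of two successive windows, without any prior normalisation. This definition and the function name are assumptions.

```python
def spectral_flux(previous, current):
    """Sum of squared per-subband differences between two successive windows."""
    return sum((c - p) ** 2 for p, c in zip(previous, current))

steady = spectral_flux([1, 2, 3], [1, 2, 3])  # 0: the spectrum did not change
moved = spectral_flux([1, 0], [0, 1])         # 2: the energy moved to another subband
```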



Spectral Rolloff


    The spectral rolloff point R is the point below which 85% of the window's energy is concentrated. It is used to distinguish voiced from unvoiced speech and music.

    Resolution: Window.
    Parameters: SOUNDfile, start time, end time.
    Based on the Subband RMS.
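
    A minimal Python sketch, assuming that R is the lowest subband index at which the cumulative sum of the subband values reaches 85% of their total. Using the values directly as the energy measure is an assumption, as is the function name.

```python
def spectral_rolloff(subband_rms, fraction=0.85):
    """Lowest subband index R where the cumulative sum reaches fraction * total."""
    total = sum(subband_rms)
    if total == 0:
        return 0
    cumulative = 0.0
    for i, v in enumerate(subband_rms):
        cumulative += v
        if cumulative >= fraction * total:
            return i
    return len(subband_rms) - 1  # guard against rounding in the comparison

low_pitched = spectral_rolloff([10, 0, 0, 0])  # 0
flat = spectral_rolloff([1, 1, 1, 1])          # 3
```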



Low Energy


    The percentage of windows that have less than the average energy. It is used to separate speech from music.

    Resolution: Several windows.
    Parameters: SegmentData, start time, end time, duration.
    Shall use the Signal Energy.
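
    A minimal Python sketch, assuming that the average is taken over the same group of windows that is being counted. The function name is an assumption.

```python
def low_energy(energies):
    """Percentage of windows whose energy is below the average energy."""
    mean = sum(energies) / len(energies)
    below = sum(1 for e in energies if e < mean)
    return 100.0 * below / len(energies)

bursty = low_energy([1, 1, 1, 9])  # 75.0: mean is 3, three windows lie below it
steady = low_energy([2, 2])        # 0.0: no window lies below the mean
```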



Pause Rate


    The pause rate counts the number of silence segments in the time interval corresponding to a window. It is proposed as an indicator to separate speech from non-speech signals.

    Resolution: Several windows.
    Parameters: SegmentData, start time, end time, Ts dynamic threshold, duration.
    Shall use the Signal Energy.
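
    A minimal Python sketch, assuming that a silence segment is a run of consecutive values below a fixed threshold Ts. How the dynamic threshold is chosen is not covered, and the function name is an assumption.

```python
def pause_rate(energies, ts):
    """Count the runs of consecutive energy values that stay below ts."""
    count = 0
    in_silence = False
    for e in energies:
        silent = e < ts
        if silent and not in_silence:
            count += 1  # a new silence segment starts here
        in_silence = silent
    return count

pauses = pause_rate([5, 0, 0, 5, 0, 5, 5, 0], ts=1)  # 3
```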





Silence Segmentation


    Calculates relatively silent intervals from a loudness contour.

    Parameters: SegmentData, start time, end time, threshold, minimum duration, maximum interrupt, onset time, release time.
    Based on Segmentation.
    Shall use Sum of Scalefactors as loudness contour.

Background Noise Level


    Calculates the level of background noise by using Silence Segmentation to find the threshold at which the first silence segmentation result is obtained.

    Parameters: SegmentData, start time, end time, minimum duration, maximum interrupt, onset time, release time.
    Based on Segmentation.
    Shall use Sum of Scalefactors as input.



Noise Segmentation


    Calculates relatively noisy intervals from a loudness contour.

    Parameters: SegmentData, start time, end time, threshold, minimum duration, maximum interrupt, onset time, release time.
    Based on Segmentation.
    Shall use Sum of Scalefactors as loudness contour.



Segmentation


    Using adaptive thresholding, windows are classified as silence if their sum of scalefactors stays under the threshold. A sequence of silence windows is clustered into a pause segment if it covers at least the minimum duration and is not interrupted by non-silence windows for longer than the tolerated interruption.

    Parameters: SegmentData, start time, end time, below, threshold, minimum duration, maximum interrupt, onset time, release time.
    Based on the Sum of Scalefactors.
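
    The clustering rule above could be sketched in Python as follows. The sketch assumes a fixed rather than adaptive threshold, measures durations in windows rather than seconds, and ignores the onset and release times; the function name is an assumption.

```python
def segment(contour, threshold, min_duration, max_interrupt):
    """Return (first, last) window indices of the segments below the threshold."""
    # Step 1: collect runs of consecutive windows below the threshold.
    runs = []
    start = None
    for t, value in enumerate(contour):
        if value < threshold:
            if start is None:
                start = t
        elif start is not None:
            runs.append([start, t - 1])
            start = None
    if start is not None:
        runs.append([start, len(contour) - 1])

    # Step 2: merge runs separated by at most max_interrupt louder windows.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 <= max_interrupt:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # Step 3: keep only segments that cover at least min_duration windows.
    return [(s, e) for s, e in merged if e - s + 1 >= min_duration]

contour = [9, 1, 1, 9, 1, 1, 1, 9, 9, 9, 1, 9]
pauses = segment(contour, threshold=5, min_duration=3, max_interrupt=1)  # [(1, 6)]
```

    With max_interrupt=0 the single loud window at index 3 splits the first pause, and only the segment (4, 6) is long enough to be kept.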

Histogram 1D


    Shows the distribution of the values of a curve.

    Parameters: SegmentData, start time, end time, bins, start histo, end histo.
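
    A minimal Python sketch of a one-dimensional histogram, assuming equal-width bins between "start histo" and "end histo" and assuming that values outside that range are ignored. Both assumptions and the function name are illustrative.

```python
def histogram(values, bins, start, end):
    """Count how many values fall into each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (end - start) / bins
    for v in values:
        if start <= v < end:
            # min() guards against rounding that would select a bin past the last one.
            counts[min(int((v - start) / width), bins - 1)] += 1
        elif v == end:
            counts[-1] += 1  # the upper bound belongs to the last bin
    return counts

counts = histogram([0, 0.5, 1, 1.5, 2, 3.9, 4], bins=4, start=0, end=4)  # [2, 2, 1, 2]
```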



Variance


    Computes the variance of the input over each interval of `duration` seconds.

    Parameters: SegmentData, start time, end time, duration.
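
    A minimal Python sketch, assuming that the input is a list of per-window values, that the interval length is given as a number of windows rather than in seconds, and that the population variance is meant. The function name is an assumption.

```python
def interval_variance(values, windows_per_interval):
    """Population variance of each consecutive group of values."""
    result = []
    for k in range(0, len(values), windows_per_interval):
        chunk = values[k:k + windows_per_interval]
        mean = sum(chunk) / len(chunk)
        result.append(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return result

variances = interval_variance([1, 3, 2, 2, 5], windows_per_interval=2)  # [1.0, 0.0, 0.0]
```

    The last interval may be shorter than the others when the input length is not a multiple of the interval length.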

last updated 23/08/2008