Audio Matcher Configuration

The AudioMatcher reads an audio recording and computes a collection of audio fingerprints that is characteristic of the recording. The fingerprints of the recording under assessment are compared with the fingerprints calculated from reference recordings. Every comparison yields an integer that measures the similarity of the two recordings; a higher number indicates a stronger similarity to the reference recording.
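The comparison step can be illustrated with a minimal Python sketch. It assumes that fingerprints are hashable values and that the similarity score is the number of fingerprints shared by both recordings; the function names `similarity_score` and `best_match` are hypothetical and are not part of intaQt.

```python
from collections import Counter


def similarity_score(candidate, reference):
    """Count the fingerprints that occur in both recordings.

    Assumption: the score is the size of the multiset intersection.
    """
    shared = Counter(candidate) & Counter(reference)
    return sum(shared.values())


def best_match(candidate, references, minimum_confidence=0):
    """Return (name, score) of the best-scoring reference.

    Returns None if there are no references or if the best score
    is below minimum_confidence.
    """
    scores = {
        name: similarity_score(candidate, fingerprints)
        for name, fingerprints in references.items()
    }
    if not scores:
        return None
    name = max(scores, key=scores.get)
    if scores[name] < minimum_confidence:
        return None
    return name, scores[name]
```

In this sketch, a threshold of 0 accepts every reference, so the best-scoring reference is always reported as a match.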

  • The Fingerprint Algorithm takes a vector of real numbers that represents the input audio signal. This input signal is subdivided into small, overlapping windows. Every window is Fourier transformed to yield the frequency content of the audio signal at the time position of that window. In total, the transform yields a two-dimensional spectrogram, where each value is indexed by frequency and time.

  • The next step is to locate local maxima (peaks) in this spectrogram. After the peaks are sorted, pairs of peaks are formed. Every pair of peaks corresponds to a single audio fingerprint for the signal.

  • Finally, the signal is represented by a list of audio fingerprints that is characteristic of the input audio.
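The steps above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not intaQt's implementation: it assumes a Hann window, a decibel-scaled power spectrum, an 8-neighbour local-maximum test, peaks sorted by time, and a fingerprint made of the two peak frequencies plus their time offset. All function names are hypothetical.

```python
import numpy as np


def spectrogram(signal, window_size=1024, overlap=992):
    """Log-scaled power spectrogram, shape (time frames, frequency bins)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < window_size:
        raise ValueError("signal must be at least one window long")
    hop = window_size - overlap
    window = np.hanning(window_size)
    n_frames = 1 + (len(signal) - window_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + window_size] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)


def find_peaks(spec, amplitude_cap=15.0):
    """(time, frequency) indices of strict local maxima above amplitude_cap."""
    rows, cols = spec.shape
    padded = np.pad(spec, 1, constant_values=-np.inf)
    is_peak = spec > amplitude_cap
    for dt in (-1, 0, 1):
        for df in (-1, 0, 1):
            if dt == 0 and df == 0:
                continue
            neighbour = padded[1 + dt : 1 + dt + rows, 1 + df : 1 + df + cols]
            is_peak &= spec > neighbour
    # argwhere returns indices in row-major order, i.e. sorted by time
    return [(int(t), int(f)) for t, f in np.argwhere(is_peak)]


def fingerprints(peaks, pairing_range=10):
    """Pair each peak with up to pairing_range of the peaks that follow it."""
    result = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + pairing_range]:
            result.append((f1, f2, t2 - t1))
    return result
```

Calling `fingerprints(find_peaks(spectrogram(samples)))` on a one-dimensional array of samples yields the fingerprint list for that recording under these assumptions.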

Configuration

Multiple AudioMatcher profiles can be configured. intaQt will always provide the default profile, as well as default values for minimumConfidence and referenceDirectory. Each profile will create its own audio reference file, named <profileName>.audio.

Syntax

AudioMatcher = {
    profiles = {
        <profileName String> = {
            stftWindowSize: <Number>,
            stftOverlap: <Number>,
            stftWindowType: <Number>,
            afpAmplitudeCap: <Number>,
            afpPairingRange: <Number>
        },
        ...
    },
    minimumConfidence: <Number>
    referenceDirectory: <String>
}

Parameters

  • profileName - The name assigned to the audio profile

  • stftWindowSize - The window size of the Short-Time Fourier Transform (STFT)

    • The size of the input signal for audio matching must be larger than or equal to the window size
    • Default value is set to 1024
  • stftOverlap - The overlap between consecutive windows in the STFT

    • The overlap must be smaller than the window size
    • Default value is set to 992
  • stftWindowType - Defines the averaging window function to be used for the STFT

    • Allowed values are:
      • 0 (Default) - Hann
      • 1 - Hamming
      • 2 - Blackman
      • 3 - No averaging
  • afpAmplitudeCap - The limit value a local maximum needs to exceed in order to be considered a peak that is eligible for fingerprinting

    • The values in the spectrogram are derived from a log-scaled power spectrum
    • Default value is set to 15.0
  • afpPairingRange - The number of pairs that are considered for a specific peak to form fingerprints

    • The larger this number, the more fingerprints are created
    • Default value is set to 10
  • minimumConfidence - The minimum number of matches (that is, identical fingerprints) that an audio recording must yield in order to be considered "equal" (sufficiently similar) to a reference recording

    • Raising this number means ignoring all matches whose matching score (that is, the number of matching fingerprints) is too low
    • Default value is set to 0
  • referenceDirectory - The directory where the reference recordings are stored

    • The default is "../audio-ref" relative to intaQt's bin directory
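The constraints on the STFT parameters can be checked with a short Python sketch. It assumes that window size and overlap are counted in samples, that "No averaging" corresponds to a rectangular window, and that the hop between windows is the window size minus the overlap. The function `check_profile` and the table `WINDOW_FUNCTIONS` are hypothetical and not part of intaQt.

```python
import numpy as np

# Assumed mapping of stftWindowType values to NumPy window functions.
WINDOW_FUNCTIONS = {
    0: np.hanning,   # Hann (default)
    1: np.hamming,   # Hamming
    2: np.blackman,  # Blackman
    3: np.ones,      # no averaging, treated here as a rectangular window
}


def check_profile(signal_length, stft_window_size=1024, stft_overlap=992,
                  stft_window_type=0):
    """Validate STFT settings and return (hop, frame count, window)."""
    if signal_length < stft_window_size:
        raise ValueError("input signal is shorter than stftWindowSize")
    if not 0 <= stft_overlap < stft_window_size:
        raise ValueError("stftOverlap must be smaller than stftWindowSize")
    if stft_window_type not in WINDOW_FUNCTIONS:
        raise ValueError("stftWindowType must be 0, 1, 2 or 3")
    hop = stft_window_size - stft_overlap
    frames = 1 + (signal_length - stft_window_size) // hop
    window = WINDOW_FUNCTIONS[stft_window_type](stft_window_size)
    return hop, frames, window
```

With the default values, each window starts 32 samples after the previous one (1024 - 992), so a large overlap produces many closely spaced time frames.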

Important! This is an experimental feature. It is highly recommended to keep the default values for the AudioMatcher's mathematical matching parameters.

Example

AudioMatcher = {
    profiles = {
        "default" = {
            stftWindowSize: 1024,
            stftOverlap: 992,
            stftWindowType: 0,
            afpAmplitudeCap: 30.0,
            afpPairingRange: 10
        },

        "profile2" = {
            stftWindowSize: 1024,
            stftOverlap: 992,
            stftWindowType: 0,
            afpAmplitudeCap: 30.0,
            afpPairingRange: 20
        },
        ...
    },
    minimumConfidence: 0
    referenceDirectory: "../audio-ref"
}