Audio Monitor Configuration

The AudioMonitor configuration specifies parameters for Speech Channel Monitoring, which is used to compare audio emitted through speech channels (contributions) against reference files.

The configuration addresses two criteria used to evaluate speech channel quality: the audio frequency (frequencyTolerance) and the pause duration between audio signals (durationTolerance). When a pause duration deviates from the reference by more than the allowed tolerance, a mismatch is recorded and a warning is written to the log. The configuration also specifies the ratio of duration mismatches that is tolerated before the test case is considered failed due to poor speech channel quality (maxWarnings).
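The pass/fail logic described above can be sketched in Python as follows. This is an illustration only, not intaQt's implementation: the function names are hypothetical, and the sketch assumes that the maxWarnings ratio is the number of duration mismatches divided by the number of compared pauses, which the text does not state explicitly.

```python
from typing import Sequence, Tuple


def is_within_tolerance(measured: float, reference: float, tolerance: float) -> bool:
    """Return True if measured deviates from reference by at most
    tolerance, where tolerance is a fraction between 0 and 1."""
    return abs(measured - reference) <= tolerance * abs(reference)


def monitoring_passes(
    measured_pauses: Sequence[float],
    reference_pauses: Sequence[float],
    duration_tolerance: float = 0.3,  # documented default of durationTolerance
    max_warnings: float = 0.05,       # documented default of maxWarnings
) -> Tuple[bool, int]:
    """Compare pause durations pairwise and apply the warning ratio.

    Assumption: one warning per mismatching pause, and the ratio is
    warnings / number of compared pauses.
    """
    pairs = list(zip(measured_pauses, reference_pauses))
    warnings = sum(
        not is_within_tolerance(measured, reference, duration_tolerance)
        for measured, reference in pairs
    )
    ratio = warnings / len(pairs) if pairs else 0.0
    return ratio <= max_warnings, warnings


# Pause durations in milliseconds; the third pause is 50% too long.
passed, warning_count = monitoring_passes([110, 190, 450, 400], [100, 200, 300, 400])
print(passed, warning_count)

# The same helper applies to a frequency check with frequencyTolerance = 0.05.
print(is_within_tolerance(452.0, 440.0, 0.05))
```

With the default values, a single mismatch among four pauses gives a ratio of 0.25, which exceeds 0.05, so the sketch reports a failure with one warning.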

Important! Two experimental parameters, frequencyPeakLimit and capLimit, filter frequencies by amplitude, expressed as an absolute value and as a percentage respectively. However, it is strongly recommended to keep the default settings, because adjusting these parameters can negatively impact the accuracy and performance of the speech channel monitoring algorithm.

Syntax

AudioMonitor = {
        durationTolerance: <Number>
        frequencyTolerance: <Number>
        frequencyPeakLimit: <Number>
        capLimit: <Number>
        maxWarnings: <Number>
        ignorePausesBelowMillis: <Number>
        numDominantFreq: <Number>
}

Parameters

durationTolerance (optional) - The relative deviation allowed between the pause duration detected by the speech channel monitor and the pause duration in the reference, expressed as a number between 0 and 1
- Default value is set to 0.3
- If the duration is no more than 30% above or below the reference duration, it is considered valid
- Each time a duration mismatch occurs (a deviation that exceeds the set durationTolerance), a warning is created
frequencyTolerance (optional) - The relative deviation allowed between the frequencies detected by the speech channel monitor and the frequencies in the reference, expressed as a number between 0 and 1
- Default value is set to 0.05
- If the frequency is no more than 5% above or below the reference frequency, it is considered valid
frequencyPeakLimit (experimental) - The amplitude (intensity) that must be met or exceeded by the contribution file in order to be taken into account
- Default value is set to 20.0
capLimit (experimental) - The percentage of the amplitude (intensity) that must be met or exceeded by the contribution file in order to be taken into account, expressed as a number between 0 and 1
- Default value is set to 0.75
maxWarnings (optional) - The maximum tolerated ratio of duration mismatches before the monitoring is considered failed
- Default value is set to 0.05
- It must be a decimal number between 0.0 and 1.0; for example, 0.05 corresponds to 5%
ignorePausesBelowMillis (experimental) - Only pauses longer than ignorePausesBelowMillis (in milliseconds) are taken into account for speech channel monitoring
- Default value is set to 0
- Shorter pauses are ignored, which reduces the sensitivity of the monitoring algorithm to short audio drops
numDominantFreq - The number of dominant frequencies to detect
- Default value is set to 3
- If the number of detected frequencies is lower than numDominantFreq, a warning is issued in the intaQt log file
- Lowering the value makes the speech channel monitoring more strict
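
The documented defaults and value ranges can be collected in a small Python sketch, for example to sanity-check values before writing them into the configuration. The class and method names are hypothetical and not part of intaQt. The 0 to 1 ranges come from the parameter descriptions above, while the checks on ignorePausesBelowMillis and numDominantFreq are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioMonitorSettings:
    # Field names mirror the configuration keys; defaults are the documented ones.
    durationTolerance: float = 0.3
    frequencyTolerance: float = 0.05
    frequencyPeakLimit: float = 20.0
    capLimit: float = 0.75
    maxWarnings: float = 0.05
    ignorePausesBelowMillis: int = 0
    numDominantFreq: int = 3

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        # These four parameters are documented as numbers between 0 and 1.
        for name in ("durationTolerance", "frequencyTolerance", "capLimit", "maxWarnings"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                problems.append(f"{name} must be between 0 and 1, got {value}")
        # Assumption: a pause threshold cannot be negative.
        if self.ignorePausesBelowMillis < 0:
            problems.append("ignorePausesBelowMillis must not be negative")
        # Assumption: at least one dominant frequency must be requested.
        if self.numDominantFreq < 1:
            problems.append("numDominantFreq must be at least 1")
        return problems


print(AudioMonitorSettings().validate())                 # defaults are valid
print(AudioMonitorSettings(maxWarnings=5).validate())    # 5 was probably meant as 0.05
```

A common mistake this catches is entering a percentage such as 5 where the configuration expects the fraction 0.05.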

Example

AudioMonitor = {
        durationTolerance: 0.3
        frequencyTolerance: 0.05
        frequencyPeakLimit: 20.0
        capLimit: 0.75
        maxWarnings: 0.05
        ignorePausesBelowMillis: 0
        numDominantFreq: 2
}