SINES Tools: Frequency Bands Analysis Tool (Batch Audio Signal Analysis with Essentia.js)
1. Click on "Start" to reset all values to their defaults.
2. Upload one or multiple mp3 or wav audio file(s) of your choice.
3. Optional: Select the noise gate setting to avoid artifacts in silent parts of the sound: noise gate off (quiet parts are analysed) or
noise gate on (quiet parts are skipped).
4. Select the frequency/spectral bands calculation:
24 Mel Bands
40 ERB Bands
13 MFCCs
11 Octave Bands
31 Third-Octave Bands
13 GFCCs (Gammatone)
6 Spectral Contrast Bands (Coarse)
12 Spectral Contrast Bands (Fine)
20 Spectral Peaks
5. Optional: If you have uploaded multiple audio files to analyse as a batch:
Additionally export a table for each file, containing all individual values over time, as a .zip file
6. Click on "Analyse" to start the analysis.
7. Optional: View the data in the tables below.
8. Click on "Show JS Arrays" or "Export CSV" to export the data.
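The noise gate from step 3 can be pictured as a per-frame loudness check. The Python sketch below is only an illustration and not the tool's actual implementation: the RMS measure, the frame size of 1024 samples, the -60 dBFS threshold, and the function names are all assumptions made for this example.

```python
import math

def frame_rms_db(frame):
    """Return the RMS level of one frame in dB relative to full scale (1.0)."""
    if not frame:
        return float("-inf")
    mean_square = sum(x * x for x in frame) / len(frame)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)

def gate_frames(samples, frame_size=1024, threshold_db=-60.0, gate_on=True):
    """Split samples into non-overlapping frames and drop quiet ones when gated.

    frame_size and threshold_db are hypothetical values chosen for illustration.
    """
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    if not gate_on:
        return frames  # gate off: quiet frames are analysed too
    return [f for f in frames if frame_rms_db(f) >= threshold_db]

# Usage: one loud sine frame followed by one silent frame.
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(1024)]
silent = [0.0] * 1024
kept = gate_frames(loud + silent)
print(len(kept), "frame(s) kept with the gate on")
```

With the gate on, only frames above the threshold reach the band analysis, so silent passages cannot contribute spurious values to the per-file averages.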
Audio Feature
Value Range
Meaning
27 Bark Bands
≥ 0
Critical bands: frequency ranges that the human ear processes
together. (There are nominally 24 Bark bands; the lowest frequency
groups, 0-100 Hz and 100-200 Hz, are split here to
improve resolution in the low frequencies, resulting in 27 Bark
bands.)
24 Mel Bands
≥ 0
Decomposition of the spectrum into 24 overlapping Mel bands
that represent the sound in a way that is consistent with hearing
40 ERB Bands
≥ 0
Equivalent Rectangular Bandwidth (ERB), a hearing-appropriate
resolution of the spectrum into 40 overlapping auditory bandpass
filters
13 MFCCs
Variable
Mel-frequency cepstral coefficients (MFCCs) use 13 coefficients
to describe the acoustic fingerprint of a spectrum,
distinguishing between source and filter (pitch-independent spectral
features). The lower coefficients describe the general shape of
the sound, while the higher coefficients describe the finer details
13 GFCCs
Variable
Gammatone Frequency Cepstral Coefficients (GFCCs): the 13 GFCCs
are, loosely speaking, the biologically more faithful counterpart of
the MFCCs, using gammatone filters instead of the Mel scale as their
basis (finer resolution, more robust against background noise)
11 Octave Bands
≥ 0
Divides the spectrum into 11 octave bands, providing a coarse
overview of the sound balance
31 Third-Octave Bands
≥ 0
Divides the spectrum into 31 third-octave bands, providing a
more detailed overview of the sound balance
6 Spectral Contrast Bands
Variable
Contrast within a frequency band. High: strong tonal components;
low: strong percussive/noisy components. Valley magnitudes, high:
loud, noisy background; low: high signal dynamic range,
clear signal.
For 6 bands:
20-57 Hz: Sub-bass
57-164 Hz: Bass range
164-469 Hz: Lower mids (warmth & body)
469-1343 Hz: Midrange (presence)
1343-3843 Hz: Upper midrange (clarity & consonants)
3843-11000 Hz: Treble (brilliance & air)
12 Spectral Contrast Bands
Variable
Contrast within a frequency band. High: strong tonal components;
low: strong percussive/noisy components. Valley magnitudes, high:
loud, noisy background; low: high signal dynamic range,
clear signal.
For 12 bands:
20-34 Hz: Sub-bass, the very lowest sub-thump
34-57 Hz: Sub-bass, body of the kick drum and
the lowest bass strings
57-97 Hz: Upper bass, punch
97-164 Hz: Transition range (determines whether the bass
booms)
164-277 Hz: Warmth and fullness, fundamentals of snare
drums and guitars
277-469 Hz: Muddy range
469-794 Hz: Lower midrange of the voice
794-1343 Hz: Upper midrange of the voice, important for speech intelligibility
1343-2271 Hz: Hard impact sounds
2271-3843 Hz: Edge range (too much contrast sounds
shrill, too little sounds dull)
3843-6501 Hz: Clarity, brilliance of string instruments
6501-11000 Hz: Air, the range for sonic spaciousness
and a high-quality sound
20 Spectral Peaks
≥ 0
The 20 strongest peaks in the spectrum (frequencies with
their amplitudes).
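Reading the Spectral Contrast band limits as ranges (for example 20-57 Hz), they appear to be spaced logarithmically between 20 Hz and 11000 Hz. The Python sketch below reproduces the listed limits to within about 1 Hz under that assumption. The function name is hypothetical, and the tool's own computation and rounding may differ slightly.

```python
def log_band_edges(num_bands, f_low=20.0, f_high=11000.0):
    """Return num_bands + 1 edges spaced by a constant frequency ratio."""
    ratio = (f_high / f_low) ** (1.0 / num_bands)
    return [f_low * ratio ** k for k in range(num_bands + 1)]

# Band limits in Hz as listed for the 6-band and 12-band settings.
LISTED_6 = [20, 57, 164, 469, 1343, 3843, 11000]
LISTED_12 = [20, 34, 57, 97, 164, 277, 469, 794, 1343, 2271, 3843, 6501, 11000]

for listed in (LISTED_6, LISTED_12):
    edges = log_band_edges(len(listed) - 1)
    # Largest deviation between the computed and the listed limits.
    worst = max(abs(e - l) for e, l in zip(edges, listed))
    print(len(listed) - 1, "bands, largest deviation in Hz:", round(worst, 2))
```

Each band therefore covers the same musical interval, and the 12-band setting simply splits every band of the 6-band setting in two.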
Please note: Essentia.js and the underlying Essentia library are licensed under the Affero GPLv3 (AGPLv3).
The library is available under the AGPLv3 for non-commercial use. If you wish to use Essentia in a commercial product, you have to contact the Music Technology Group (MTG)
at Pompeu Fabra University (UPF) directly to negotiate commercial licensing terms.
Translation of the Chinese version of the website: Li Hao