Lykartsis, Athanasios; Lerch, Alexander; Weinzierl, Stefan. Analysis of Speech Rhythm for Language Identification Based on Beat Histograms. Proceedings Article. In: Proceedings of the DAGA (Jahrestagung für Akustik), Nuremberg, 2015.
Lykartsis, Athanasios; Lerch, Alexander. Beat Histogram Features for Rhythm-based Musical Genre Classification Using Multiple Novelty Functions. Proceedings Article. In: Proceedings of the International Conference on Digital Audio Effects (DAFX), Trondheim, Norway, 2015.
Lykartsis, Athanasios; Wu, Chih-Wei; Lerch, Alexander. Beat Histogram Features from NMF-Based Novelty Functions for Music Classification. Proceedings Article. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), ISMIR, Malaga, 2015.
O'Brien, Cian; Lerch, Alexander. Genre-Specific Key Profiles. Proceedings Article. In: Proceedings of the International Computer Music Conference (ICMC), ICMA, Denton, 2015.
Wu, Chih-Wei; Lerch, Alexander. Drum Transcription using Partially Fixed Non-Negative Matrix Factorization. Proceedings Article. In: Proceedings of the European Signal Processing Conference (EUSIPCO), EURASIP, Nice, 2015.
Wu, Chih-Wei; Lerch, Alexander. Drum Transcription using Partially Fixed Non-Negative Matrix Factorization With Template Adaptation. Proceedings Article. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), ISMIR, Malaga, 2015.
Zhou, Xinquan; Lerch, Alexander. Chord Detection Using Deep Learning. Proceedings Article. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), ISMIR, Malaga, 2015.
von Coler, Henrik; Lerch, Alexander. CMMSD: A Data Set for Note-Level Segmentation of Monophonic Music. Proceedings Article. In: Proceedings of the AES 53rd International Conference on Semantic Audio, Audio Engineering Society (AES), London, UK, 2014.
Lerch, Alexander. Music Information Retrieval. Book Section. In: Weinzierl, Stefan (Ed.): Akustische Grundlagen der Musik, no. 5, pp. 79–102, Laaber, 2014, ISBN: 978-3-89007-699-7.
Kraft, Sebastian; Lerch, Alexander; Zölzer, Udo. The Tonalness Spectrum: Feature-Based Estimation of Tonal Components. Proceedings Article. In: Proceedings of the 16th International Conference on Digital Audio Effects, Maynooth, 2013.
Lerch, Alexander. An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics. Book. Wiley-IEEE Press, Hoboken, 2012, ISBN: 978-1-118-26682-3. Tags: analysis, audio, audio signal processing, information, listening, machine, machine listening, music, music analysis, music information retrieval, processing, retrieval, signal
Kirchhoff, Holger; Lerch, Alexander. Evaluation of Features for Audio-to-Audio Alignment. Journal Article. In: Journal of New Music Research, vol. 40, no. 1, pp. 27–41, 2011.
Lerch, Alexander. Software-gestützte Merkmalsextraktion für die musikalische Aufführungsanalyse. Book Section. In: von Loesch, Heinz; Weinzierl, Stefan (Ed.): Gemessene Interpretation - Computergestützte Aufführungsanalyse im Kreuzverhör der Disziplinen, pp. 205–212, Schott, Mainz, 2011, ISBN: 978-3-7957-0771-2.
Ness, Steven R; Lerch, Alexander; Tzanetakis, George. Strategies for Orca Call Retrieval to Support Collaborative Annotation of a Large Archive. Proceedings Article. In: Proceedings of the International Workshop on Multimedia Signal Processing (MMSP), IEEE, Hangzhou, 2011, ISBN: 978-1-4577-1434-4.
Wiesener, Constantin; Flohrer, Tim; Lerch, Alexander; Weinzierl, Stefan. Adaptive Noise Reduction for Real-time Applications. Proceedings Article. In: Proceedings of the 128th Audio Engineering Society Convention (Preprint #8048), Audio Engineering Society, London, 2010.
Lerch, Alexander. Software-Based Extraction of Objective Parameters from Music Performances. Book. GRIN Verlag, München, 2009, ISBN: 978-3-640-29496-1. Tags: analysis, audio, content, information, music, performance, retrieval
Lerch, Alexander; Weinzierl, Stefan. Digitale Audiotechnik: Grundlagen. Book Section. In: Weinzierl, Stefan (Ed.): Handbuch der Audiotechnik, pp. 785–811, Springer, Berlin, 2008, ISBN: 978-3-540-34300-4.
Lerch, Alexander. Bitratenreduktion. Book Section. In: Weinzierl, Stefan (Ed.): Handbuch der Audiotechnik, pp. 849–884, Springer, Berlin, 2008, ISBN: 978-3-540-34300-4.
Yogev, Noam; Lerch, Alexander. A System for Automatic Audio Harmonization (Ein System für automatische Audio-Harmonisierung). Proceedings Article. In: Proceedings of the VdT International Convention (25. Tonmeistertagung), Leipzig, 2008. Tags: audio, harmonization
Lerch, Alexander. On the Requirement of Automatic Tuning Frequency Estimation. Proceedings Article. In: Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), ISMIR, Victoria, 2006. Tags: frequency, tuning
Lerch, Alexander; Eisenberg, Gunnar; Tanghe, Koen. FEAPI: A Low Level Feature Extraction Plugin API. Proceedings Article. In: Proceedings of 8th International Conference on Digital Audio Effects (DAFX), Madrid, 2005.
Lerch, Alexander; Klich, Ingmar-Leander. On the Evaluation of Automatic Onset Tracking Systems. Technical Report. zplane.development, Berlin, 2005.
Burred, Juan José; Lerch, Alexander. Hierarchical Automatic Audio Signal Classification. Journal Article. In: Journal of the Audio Engineering Society (JAES), vol. 52, no. 7/8, pp. 724–739, 2004.
Lerch, Alexander. Ein Ansatz zur automatischen Erkennung der Tonart in Musikdateien. Proceedings Article. In: Proceedings of the VDT International Audio Convention (23. Tonmeistertagung), Leipzig, 2004.
Burred, Juan José; Lerch, Alexander. A Hierarchical Approach to Automatic Musical Genre Classification. Proceedings Article. In: Proceedings of the 6th International Conference on Digital Audio Effects (DAFX), London, 2003.
Baumgarte, Frank; Lerch, Alexander. Delayed Contribution Document 6Q/18-E: Implementation of Recommendation ITU-R BS.1387 (PEAQ). Miscellaneous. 2001.
2015
@inproceedings{lykartsis_analysis_2015,
title = {Analysis of Speech Rhythm for Language Identification Based on Beat Histograms},
author = {Athanasios Lykartsis and Alexander Lerch and Stefan Weinzierl},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/06/Lykartsis%20et%20al_2015_Analysis%20of%20Speech%20Rhythm%20for%20Language%20Identification%20Based%20on%20Beat%20Histograms.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the DAGA (Jahrestagung f\"{u}r Akustik)},
address = {Nuremberg},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{lykartsis_beat_2015,
title = {Beat Histogram Features for Rhythm-based Musical Genre Classification Using Multiple Novelty Functions},
author = {Athanasios Lykartsis and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/12/DAFx-15_submission_42-1.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the International Conference on Digital Audio Effects (DAFX)},
address = {Trondheim, Norway},
abstract = {In this paper we present beat histogram features for multiple level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude, but also tonal and general spectral changes in the signal, aiming to represent as much rhythmic information as possible. The most and least informative features are identified through feature selection methods and are then tested using Support Vector Machines on five genre datasets concerning classification accuracy against a baseline feature set. Results show that the presented features provide comparable classification accuracy with respect to other genre classification approaches using periodicity histograms and display a performance close to that of much more elaborate up-to-date approaches for rhythm description. The use of bar boundary annotations for the texture frames has provided an improvement for the dance-oriented Ballroom dataset. The comparably small number of descriptors and the possibility of evaluating the influence of specific signal components to the general rhythmic content encourage the further use of the method in rhythm description tasks.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{lykartsis_beat_2015-1,
title = {Beat Histogram Features from NMF-Based Novelty Functions for Music Classification},
author = {Athanasios Lykartsis and Chih-Wei Wu and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/10/Lykartsis-et-al_2015_Beat-Histogram-Features-from-NMF-Based-Novelty-Functions-for-Music.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
publisher = {ISMIR},
address = {Malaga},
abstract = {In this paper we present novel rhythm features derived from drum tracks extracted from polyphonic music and evaluate them in a genre classification task. Musical excerpts are analyzed using an optimized, partially fixed Non-Negative Matrix Factorization (NMF) method and beat histogram features are calculated on basis of the resulting activation functions for each one out of three drum tracks extracted (Hi-Hat, Snare Drum and Bass Drum). The features are evaluated on two widely used genre datasets (GTZAN and Ballroom) using standard classification methods, concerning the achieved overall classification accuracy. Furthermore, their suitability in distinguishing between rhythmically similar genres and the performance of the features resulting from individual activation functions is discussed. Results show that the presented NMF-based beat histogram features can provide comparable performance to other classification systems, while considering strictly drum patterns.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{obrien_genre-specific_2015,
title = {Genre-Specific Key Profiles},
author = {Cian O'Brien and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/09/O'Brien_Lerch_2015_Genre-Specific%20Key%20Profiles.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the International Computer Music Conference (ICMC)},
publisher = {ICMA},
address = {Denton},
abstract = {The most common approaches to the automatic recognition of musical key are template-based, i.e., an extracted pitch chroma vector is compared to a template key profile in order to identify the most similar key. General as well as domain-specific templates have been used in the past, but to the authors' best knowledge there has been no study that evaluated genre-specific key profiles extracted from the audio signal. We investigate the pitch chroma distributions for 9 different genres, their distances, and the degree to which these genres can be identified using these distributions when utilizing different strategies for achieving key-invariance.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{wu_drum_2015,
title = {Drum Transcription using Partially Fixed Non-Negative Matrix Factorization},
author = {Chih-Wei Wu and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/09/Wu_Lerch_2015_Drum%20Transcription%20using%20Partially%20Fixed%20Non-Negative%20Matrix%20Factorization.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the European Signal Processing Conference (EUSIPCO)},
publisher = {EURASIP},
address = {Nice},
abstract = {In this paper, a drum transcription algorithm using partially fixed non-negative matrix factorization is presented. The proposed method allows users to identify percussive events in complex mixtures with a minimal training set. The algorithm decomposes the music signal into two parts: percussive part with pre-defined drum templates and harmonic part with undefined entries. The harmonic part is able to adapt to the music content, allowing the algorithm to work in polyphonic mixtures. Drum event times can be simply picked from the percussive activation matrix with onset detection. The system is efficient and robust even with a minimal training set. The recognition rates for the ENST dataset vary from 56.7 to 78.9% for three percussive instruments extracted from polyphonic music.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{wu_drum_2015-1,
title = {Drum Transcription using Partially Fixed Non-Negative Matrix Factorization With Template Adaptation},
author = {Chih-Wei Wu and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/10/Wu_Lerch_2015_Drum-Transcription-using-Partially-Fixed-Non-Negative-Matrix-Factorization-With.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
publisher = {ISMIR},
address = {Malaga},
abstract = {In this paper, a template adaptive drum transcription algorithm using partially fixed Non-negative Matrix Factorization (NMF) is presented. The proposed method detects percussive events in complex mixtures of music with a minimal training set. The algorithm decomposes the music signal into two dictionaries: a percussive dictionary initialized with pre-defined drum templates and a harmonic dictionary initialized with undefined entries. The harmonic dictionary is adapted to the non-percussive music content in a standard NMF procedure. The percussive dictionary is adapted to each individual signal in an iterative scheme: it is fixed during the decomposition process, and is updated based on the result of the previous convergence. Two template adaptation methods are proposed to provide more flexibility and robustness in the case of unknown data. The performance of the proposed system has been evaluated and compared to state of the art systems. The results show that template adaptation improves the transcription performance, and the detection accuracy is in the same range as more complex systems.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@inproceedings{zhou_chord_2015,
title = {Chord Detection Using Deep Learning},
author = {Xinquan Zhou and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/10/Zhou_Lerch_2015_Chord-Detection-Using-Deep-Learning.pdf},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
publisher = {ISMIR},
address = {Malaga},
abstract = {In this paper, we utilize deep learning to learn high-level
features for audio chord detection. The learned features,
obtained by a deep network in bottleneck architecture, give
promising results and outperform state-of-the-art systems.
We present and evaluate the results for various methods and
configurations, including input pre-processing, a bottleneck
architecture, and SVMs vs. HMMs for chord classification.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2014
@inproceedings{coler_cmmsd:_2014,
title = {CMMSD: A Data Set for Note-Level Segmentation of Monophonic Music},
author = {Henrik von Coler and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/04/Coler_Lerch_2014_CMMSD.pdf},
year = {2014},
date = {2014-01-01},
booktitle = {Proceedings of the AES 53rd International Conference on Semantic Audio},
publisher = {Audio Engineering Society (AES)},
address = {London, UK},
abstract = {A musical data set for note-level segmentation of monophonic music is presented. It contains 36 excerpts from
commercial recordings of monophonic classical western music and features the instrument groups strings,
woodwind and brass. The excerpts are self-contained phrases with a mean length of 17.97 seconds and an
average of 20 notes. All phrases are played in moderate tempo, mostly with significant amounts of expressive
articulation. A manually annotated ground truth splits each item into a sequence of the three states note,
transition and rest. The set is designed as an open source project, aiming at the development and evaluation
of algorithms for segmentation, music performance analysis and feature selection. This paper presents the
process of ground truth labeling and a detailed description of the data set and its properties.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@incollection{lerch_music_2014,
title = {Music Information Retrieval},
author = {Alexander Lerch},
editor = {Stefan Weinzierl},
isbn = {978-3-89007-699-7},
year = {2014},
date = {2014-01-01},
booktitle = {Akustische Grundlagen der Musik},
number = {5},
pages = {79--102},
publisher = {Laaber},
series = {Handbuch der Systematischen Musikwissenschaft},
keywords = {},
pubstate = {published},
tppubtype = {incollection}
}
2013
@inproceedings{kraft_tonalness_2013,
title = {The Tonalness Spectrum: Feature-Based Estimation of Tonal Components},
author = {Sebastian Kraft and Alexander Lerch and Udo Z\"{o}lzer},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/04/Kraft%20et%20al_2013_The%20Tonalness%20Spectrum.pdf},
year = {2013},
date = {2013-01-01},
urldate = {2014-01-16},
booktitle = {Proceedings of the 16th International Conference on Digital Audio Effects},
address = {Maynooth},
abstract = {The tonalness spectrum shows the likelihood of a spectral bin being part of a tonal or non-tonal component. It is a non-binary measure based on a set of established spectral features. An easily extensible framework for the computation, selection, and combination of features is introduced. The results are evaluated and compared in two ways. First with a data set of synthetically generated signals but also with real music signals in the context of a typical MIR application.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2012
@book{lerch_introduction_2012,
title = {An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics},
author = {Alexander Lerch},
url = {http://ieeexplore.ieee.org/xpl/bkabstractplus.jsp?bkn=6266785},
isbn = {978-1-118-26682-3},
year = {2012},
date = {2012-01-01},
publisher = {Wiley-IEEE Press},
address = {Hoboken},
abstract = {With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included. Please visit the companion website: www.AudioContentAnalysis.org},
keywords = {analysis, audio, audio signal processing, information, listening, machine, machine listening, music, music analysis, music information retrieval, processing, retrieval, signal},
pubstate = {published},
tppubtype = {book}
}
2011
@article{kirchhoff_evaluation_2011,
title = {Evaluation of Features for Audio-to-Audio Alignment},
author = {Holger Kirchhoff and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/04/Kirchhoff_Lerch_2011_Evaluation%20of%20Features%20for%20Audio-to-Audio%20Alignment.pdf},
doi = {10.1080/09298215.2010.529917},
year = {2011},
date = {2011-01-01},
journal = {Journal of New Music Research},
volume = {40},
number = {1},
pages = {27--41},
abstract = {Audio-to-audio alignment is the task of synchronizing two audio sequences with similar musical content in time. We investigated a large set of audio features for this task. The features were chosen to represent four different content-dependent similarity categories: the envelope, the timbre, note-onsets and the pitch. The features were subjected to two processing stages. First, a feature subset was selected by evaluating the alignment performance of each individual feature. Second, the selected features were combined and subjected to an automatic weighting algorithm.
A new method for the objective evaluation of audio-to-audio alignment systems is proposed that enables the use of arbitrary kinds of music as ground truth data. We evaluated our algorithm by this method as well as on a data set of real recordings of solo piano music. The results showed that the feature weighting algorithm could improve the alignment accuracies compared to the results of the individual features.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@incollection{lerch_software-gestutzte_2011,
title = {Software-gest\"{u}tzte Merkmalsextraktion f\"{u}r die musikalische Auff\"{u}hrungsanalyse},
author = {Alexander Lerch},
editor = {Heinz von Loesch and Stefan Weinzierl},
isbn = {978-3-7957-0771-2},
year = {2011},
date = {2011-01-01},
booktitle = {Gemessene Interpretation - Computergest\"{u}tzte Auff\"{u}hrungsanalyse im Kreuzverh\"{o}r der Disziplinen},
pages = {205--212},
publisher = {Schott},
address = {Mainz},
series = {Klang und Begriff},
keywords = {},
pubstate = {published},
tppubtype = {incollection}
}
@inproceedings{ness_strategies_2011,
title = {Strategies for Orca Call Retrieval to Support Collaborative Annotation of a Large Archive},
author = {Steven R Ness and Alexander Lerch and George Tzanetakis},
url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6093798},
doi = {10.1109/MMSP.2011.6093798},
isbn = {978-1-4577-1434-4},
year = {2011},
date = {2011-01-01},
booktitle = {Proceedings of the International Workshop on Multimedia Signal Processing (MMSP)},
publisher = {IEEE},
address = {Hangzhou},
abstract = {The Orchive is a large audio archive of hydrophone recordings of Killer whale (Orcinus orca) vocalizations. Researchers and users from around the world can interact with the archive using a collaborative web-based annotation, visualization and retrieval interface. In addition a mobile client has been written in order to crowdsource Orca call annotation. In this paper we describe and compare different strategies for the retrieval of discrete Orca calls. In addition, the results of the automatic analysis are integrated in the user interface facilitating annotation as well as leveraging the existing annotations for supervised learning. The best strategy achieves a mean average precision of 0.77 with the first retrieved item being relevant 95% of the time in a dataset of 185 calls belonging to 4 types.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2010
@inproceedings{wiesener_adaptive_2010,
title = {Adaptive Noise Reduction for Real-time Applications},
author = {Constantin Wiesener and Tim Flohrer and Alexander Lerch and Stefan Weinzierl},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2015/04/Wiesener%20et%20al_2010_Adaptive%20Noise%20Reduction%20for%20Real-time%20Applications.pdf},
year = {2010},
date = {2010-01-01},
booktitle = {Proceedings of the 128th Audio Engineering Society Convention (Preprint #8048)},
publisher = {Audio Engineering Society},
address = {London},
abstract = {We present a new algorithm for real-time noise reduction of audio signals. In order to derive the noise reduction function, the proposed method adaptively estimates the instantaneous noise spectrum from an autoregressive signal model as opposed to the widely-used approach of using a constant noise spectrum fingerprint. In conjunction with the Ephraim and Malah suppression rule a significant reduction of both stationary and non-stationary noise can be obtained. The adaptive algorithm is able to work without user interaction and is capable of real-time processing. Furthermore, quality improvements are easily possible by integration of additional processing blocks such as transient preservation.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2009
@book{lerch_software-based_2009,
title = {Software-Based Extraction of Objective Parameters from Music Performances},
author = {Alexander Lerch},
url = {http://dx.doi.org/10.14279/depositonce-2025},
isbn = {978-3-640-29496-1},
year = {2009},
date = {2009-01-01},
publisher = {GRIN Verlag},
address = {M\"{u}nchen},
abstract = {Different music performances of the same score may significantly differ from each other. It is obvious that not only the composer’s work, the score, defines the listener’s music experience, but that the music performance itself is an integral part of this experience. Music performers use the information contained in the score, but interpret, transform or add to this information. Four parameter classes can be used to describe a performance objectively: tempo and timing, loudness, timbre and pitch. Each class contains a multitude of individual parameters that are at the performers’ disposal to generate a unique physical rendition of musical ideas. The extraction of such objective parameters is one of the difficulties in music performance research. This work presents an approach to the software-based extraction of tempo and timing, loudness and timbre parameters from audio files to provide a tool for the automatic parameter extraction from music performances. The system is applied to extract data from 21 string quartet performances and a detailed analysis of the extracted data is presented. The main contributions of this thesis are the adaptation and development of signal processing approaches to performance parameter extraction and the presentation and discussion of string quartet performances of a movement of Beethoven’s late String Quartet op. 130.},
keywords = {analysis, audio, content, information, music, performance, retrieval},
pubstate = {published},
tppubtype = {book}
}
2008
@incollection{lerch_digitale_2008,
title = {Digitale Audiotechnik: Grundlagen},
author = {Alexander Lerch and Stefan Weinzierl},
editor = {Stefan Weinzierl},
isbn = {978-3-540-34300-4},
year = {2008},
date = {2008-01-01},
booktitle = {Handbuch der Audiotechnik},
pages = {785--811},
publisher = {Springer},
address = {Berlin},
keywords = {},
pubstate = {published},
tppubtype = {incollection}
}
@incollection{lerch_bitratenreduktion_2008,
title = {Bitratenreduktion},
author = {Alexander Lerch},
editor = {Stefan Weinzierl},
isbn = {978-3-540-34300-4},
year = {2008},
date = {2008-01-01},
booktitle = {Handbuch der Audiotechnik},
pages = {849--884},
publisher = {Springer},
address = {Berlin},
keywords = {},
pubstate = {published},
tppubtype = {incollection}
}
@inproceedings{yogev_system_2008,
title = {A System for Automatic Audio Harmonization (Ein System f\"{u}r automatische Audio-Harmonisierung)},
author = {Noam Yogev and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Yogev-and-Lerch-2008-A-System-for-Automatic-Audio-Harmonization-Ein-S-1.pdf},
doi = {10.1.1.148.8391},
year = {2008},
date = {2008-01-01},
booktitle = {Proceedings of the VdT International Convention (25. Tonmeistertagung)},
address = {Leipzig},
abstract = {A rule-based system for automatic melody harmonization is presented. It models the cognitive process a human arranger undergoes when confronted with the same task, namely: segmenting the melody into phrases, tagging melody notes with harmonic functions, establishing a palette of possible chords for each note, and finding the most agreeable voicing through these chords. The system is designed to be embedded in an audio framework, which synthesizes a four-voiced audio output using pitch-shifting techniques. Principles of classical counterpoint as well as common voice-leading conventions are utilized by the system. We shall outline the various phases of computation, describe the rules applied in each phase, and present perspectives regarding the stylistic flexibility suggested by the system's design.},
keywords = {audio, harmonization},
pubstate = {published},
tppubtype = {inproceedings}
}
2006
@inproceedings{lerch_requirement_2006,
title = {On the Requirement of Automatic Tuning Frequency Estimation},
author = {Alexander Lerch},
url = {http://dx.doi.org/10.14279/depositonce-2037},
year = {2006},
date = {2006-01-01},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
publisher = {ISMIR},
address = {Victoria},
abstract = {The deviation of the tuning frequency from the standard tuning frequency 440 Hz is evaluated for a database of classical music. It is discussed if and under what circumstances such a deviation may affect the robustness of pitch-based systems for musical content analysis.},
keywords = {frequency, tuning},
pubstate = {published},
tppubtype = {inproceedings}
}
2005
@inproceedings{lerch_feapi:_2005,
title = {FEAPI: A Low Level Feature Extraction Plugin API},
author = {Alexander Lerch and Gunnar Eisenberg and Koen Tanghe},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Lerch-et-al.-2005-FEAPI-A-Low-Level-Feature-Extraction-Plugin-API.pdf},
year = {2005},
date = {2005-09-01},
booktitle = {Proceedings of 8th International Conference on Digital Audio Effects (DAFX)},
address = {Madrid},
abstract = {This paper presents FEAPI, an easy-to-use platform-independent plugin application programming interface (API) for the extraction of low level features from audio in PCM format in the context of music information retrieval software. The need for and advantages of using an open and well-defined plugin interface are outlined in this paper and an overview of the API itself and its usage is given.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
@techreport{lerch_evaluation_2005,
title = {On the Evaluation of Automatic Onset Tracking Systems},
author = {Alexander Lerch and Ingmar-Leander Klich},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Lerch-and-Klich-2005-On-the-Evaluation-of-Automatic-Onset-Tracking-Syst.pdf},
year = {2005},
date = {2005-01-01},
address = {Berlin},
institution = {zplane.development},
abstract = {This paper summarizes the problems, definitions and requirements that are important for the evaluation of onset tracking systems for audio signals in PCM format. Different procedures and metrics for evaluation and parametrization are presented and commented on. Overall, a complete methodology for the evaluation of automatic onset detection systems is proposed.},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
2004
@article{burred_hierarchical_2004,
title = {Hierarchical Automatic Audio Signal Classification},
author = {Juan Jos\'{e} Burred and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Burred-and-Lerch-2004-Hierarchical-Automatic-Audio-Signal-Classification.pdf},
year = {2004},
date = {2004-01-01},
journal = {Journal of the Audio Engineering Society (JAES)},
volume = {52},
number = {7/8},
pages = {724--739},
abstract = {The design, implementation, and evaluation of a system for automatic audio signal classification is presented. The signals are classified according to audio type, differentiating between three speech classes, 13 musical genres, and background noise. A large number of audio features are evaluated for their suitability in such a classification task, including MPEG-7 descriptors and several new features. The selection of the features is carried out systematically with regard to their robustness to noise and bandwidth changes, as well as to their ability to distinguish a given set of audio types. Direct and hierarchical approaches for the feature selection and for the classification are evaluated and compared.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@inproceedings{lerch_ansatz_2004,
title = {Ein Ansatz zur automatischen Erkennung der Tonart in Musikdateien},
author = {Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Lerch-2004-Ein-Ansatz-zur-automatischen-Erkennung-der-Tonart-.pdf},
year = {2004},
date = {2004-01-01},
booktitle = {Proceedings of the VDT International Audio Convention (23. Tonmeistertagung)},
address = {Leipzig},
abstract = {A method for the automatic detection of the key of music files is presented. Using a filter bank, the method analyzes the pitch content of the input signal, which is summarized in a pitch vector. Both polyphonic and monophonic input signals are permissible. A nearest-neighbour classifier then determines the most likely result for the extracted pitch vector. In parallel to the analysis of the pitch content, the tuning frequency of the concert pitch is detected in order to ensure a consistent recognition rate for signals with different tuning frequencies.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2003
@inproceedings{burred_hierarchical_2003,
title = {A Hierarchical Approach to Automatic Musical Genre Classification},
author = {Juan Jos\'{e} Burred and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Burred-and-Lerch-2003-A-Hierarchical-Approach-to-Automatic-Musical-Genre.pdf},
doi = {10.1.1.2.6582},
year = {2003},
date = {2003-01-01},
booktitle = {Proceedings of the 6th International Conference on Digital Audio Effects (DAFX)},
address = {London},
abstract = {A system for the automatic classification of audio signals according to audio category is presented. The signals are recognized as speech, background noise and one of 13 musical genres. A large number of audio features are evaluated for their suitability in such a classification task, including well-known physical and perceptual features, audio descriptors defined in the MPEG-7 standard, as well as new features proposed in this work. These are selected with regard to their ability to distinguish between a given set of audio types and to their robustness to noise and bandwidth changes. In contrast to previous systems, the feature selection and the classification process itself are carried out in a hierarchical way. This is motivated by the numerous advantages of such a tree-like structure, which include easy expansion capabilities, flexibility in the design of genre-dependent features and the ability to reduce the probability of costly errors. The resulting application is evaluated with respect to classification accuracy and computational costs.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
2001
@misc{baumgarte_delayed_2001,
title = {Delayed Contribution Document 6Q/18-E: Implementation of Recommendation ITU-R BS.1387 (PEAQ)},
author = {Frank Baumgarte and Alexander Lerch},
url = {http://www.musicinformatics.gatech.edu/wp-content_nondefault/uploads/2016/10/Baumgarte-and-Lerch-2001-Delayed-Contribution-Document-6Q18-E-Implementat.pdf},
year = {2001},
date = {2001-01-01},
publisher = {ITU},
keywords = {},
pubstate = {published},
tppubtype = {misc}
}