Audio fingerprinting using wavelet transform

Kanalıcı, Evren


dc.contributor.author	Kanalıcı, Evren
dc.date.accessioned	2022-08-09T11:30:15Z
dc.date.available	2022-08-09T11:30:15Z
dc.date.issued	2019
dc.identifier.uri	http://dspace.yildiz.edu.tr/xmlui/handle/1/12955
dc.description	Thesis (Master's) - Yıldız Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2019	en_US
dc.description.abstract	Audio fingerprinting systems have many real-world use cases, such as digital rights management and copyright detection, duplicate audio detection, labelling of untagged audio, and query-by-example recognition. Popular online platforms now offer query-by-example music recognition services, in which users submit a short snippet of recorded audio and retrieve the metadata of the matching song. A compact, robust fingerprint that supports fast retrieval is the cornerstone of these systems. Although the short-time Fourier transform and Mel-spectral representations are the usual choices for feature extraction, they suffer from being unstable and from somewhat limited resolution. The scattering wavelet transform (SWT) offers an alternative that addresses these limitations by recovering lost information while ensuring translation invariance and stability. In this study, a two-stage audio fingerprint feature extraction framework is introduced that combines SWT with a Siamese neural network hashing model for musical audio identification. The similarity-preserving hashes produced by the Siamese neural network model serve as the audio fingerprints and can be compared using a similarity distance metric in the embedded hashing space. The Siamese neural network hashing model was trained on two-layer scattering wavelet transform coefficients, using relatively aligned segments of the same music files and segments of different music files. The proposed system performs well under environmental noise, which models the conditions under which music and audio must be identified in everyday life. Using very compact storage, it is shown to achieve high ROC-AUC scores both in one-to-one comparison and when locality-sensitive hashing (LSH) is used for content storage.	en_US
dc.language.iso	en	en_US
dc.subject	Audio fingerprinting	en_US
dc.subject	Music information retrieval	en_US
dc.subject	Wavelet transform	en_US
dc.subject	Scattering wavelet transform	en_US
dc.subject	Siamese neural networks	en_US
dc.title	Audio fingerprinting using wavelet transform	en_US
dc.type	Thesis	en_US
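
The abstract describes comparing fingerprints by a distance in the hashing space, either one-to-one or through an LSH store. The record does not state the hash format, the distance metric, or the LSH family used in the thesis, so the following is only a minimal sketch under assumptions: fingerprints are taken to be fixed-length binary codes stored as integers, the distance is taken to be Hamming distance, and the index uses bit-sampling LSH. The names `FingerprintIndex`, `hamming_distance`, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two binary fingerprints."""
    return bin(a ^ b).count("1")


class FingerprintIndex:
    """Bit-sampling LSH over binary codes (an assumed scheme, not the thesis's)."""

    def __init__(self, n_bits=32, n_tables=4, bits_per_key=8, seed=0):
        rng = random.Random(seed)
        # Each table reads its own fixed random subset of bit positions.
        self.positions = [
            rng.sample(range(n_bits), bits_per_key) for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.codes = {}

    def _keys(self, code):
        # One bucket key per table, built from the sampled bits of the code.
        return [
            tuple((code >> p) & 1 for p in table_positions)
            for table_positions in self.positions
        ]

    def add(self, track_id, code):
        self.codes[track_id] = code
        for table, key in zip(self.tables, self._keys(code)):
            table[key].append(track_id)

    def query(self, code, max_distance=4):
        # Collect candidates that share a bucket with the query in any table,
        # then verify each one with an exact one-to-one Hamming comparison.
        candidates = set()
        for table, key in zip(self.tables, self._keys(code)):
            candidates.update(table.get(key, ()))
        hits = [(hamming_distance(code, self.codes[t]), t) for t in candidates]
        return sorted(h for h in hits if h[0] <= max_distance)


index = FingerprintIndex()
index.add("track-a", 0xDEADBEEF)
index.add("track-b", 0x12345678)
matches = index.query(0xDEADBEEF)
```

In this sketch the LSH tables only narrow the candidate set; the final decision still rests on the exact distance, which mirrors the two evaluation modes the abstract mentions (one-to-one comparison and LSH-backed storage).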