YTÜ DSpace Institutional Archive

Audio fingerprinting using wavelet transform

dc.contributor.author Kanalıcı, Evren
dc.date.accessioned 2022-08-09T11:30:15Z
dc.date.available 2022-08-09T11:30:15Z
dc.date.issued 2019
dc.identifier.uri http://dspace.yildiz.edu.tr/xmlui/handle/1/12955
dc.description Thesis (Master's) - Yıldız Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2019 en_US
dc.description.abstract Audio fingerprinting systems have many real-world use cases, such as digital rights management and copyright detection, duplicate audio detection, labelling of untagged audio, and query-by-example recognition. Popular online platforms now offer query-by-example music recognition services, in which users submit a short snippet of recorded audio and retrieve the metadata of the matching song. A compact, robust fingerprint design that supports fast retrieval is the cornerstone of these systems. Although the short-time Fourier transform and Mel-spectral representations are common feature extraction tools, they suffer from instability and somewhat limited resolution. The scattering wavelet transform (SWT) offers an alternative that addresses these limitations by recovering lost information while ensuring translation invariance and stability. In this study, a two-stage audio fingerprint feature extraction framework for musical audio identification is introduced, combining the SWT with a Siamese neural network hashing model. The similarity-preserving hashes produced by the Siamese neural network serve as audio fingerprints and can be compared using a distance metric in the embedded hashing space. The Siamese hashing model was trained on two-layer scattering wavelet transform coefficients, using relatively aligned segments of the same music files and segments of different music files. The proposed system performs well under environmental noise, which models the challenges of identifying music and audio encountered in everyday life. Using very compact storage, it is shown to achieve high ROC-AUC scores both in one-to-one comparison and when locality-sensitive hashing (LSH) is used for content storage. en_US
dc.language.iso en en_US
dc.subject Audio fingerprinting en_US
dc.subject Music information retrieval en_US
dc.subject Wavelet transform en_US
dc.subject Scattering wavelet transform en_US
dc.subject Siamese neural networks en_US
dc.title Audio fingerprinting using wavelet transform en_US
dc.type Thesis en_US
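
The abstract describes comparing fingerprints by a distance in the hashing space and storing them with locality-sensitive hashing. The sketch below illustrates that idea only, and it is not the thesis's implementation: random 64-bit integers stand in for the hashes a trained Siamese network would produce, Hamming distance is assumed as the metric, and bit sampling is assumed as the LSH family (the record does not state which one was used). All names (`hamming`, `flip_bits`, `BitSamplingLSH`, the `track_*` labels) are hypothetical.

```python
import random


def hamming(a: int, b: int) -> int:
    """Count the differing bits between two fingerprints stored as ints."""
    return bin(a ^ b).count("1")


def flip_bits(code: int, n_bits: int, n_flips: int, rng: random.Random) -> int:
    """Simulate noise by flipping n_flips distinct bit positions."""
    for pos in rng.sample(range(n_bits), n_flips):
        code ^= 1 << pos
    return code


class BitSamplingLSH:
    """Bit-sampling LSH for Hamming space (an assumed choice of LSH family).

    Each table keys a fingerprint on a fixed random subset of its bits, so
    fingerprints that differ in only a few bits tend to share a bucket in
    at least one table.
    """

    def __init__(self, n_bits: int, n_tables: int, bits_per_key: int, seed: int = 0):
        rng = random.Random(seed)
        self.positions = [rng.sample(range(n_bits), bits_per_key)
                          for _ in range(n_tables)]
        self.tables = [dict() for _ in range(n_tables)]

    @staticmethod
    def _key(code: int, positions) -> tuple:
        return tuple((code >> p) & 1 for p in positions)

    def add(self, label: str, code: int) -> None:
        for positions, table in zip(self.positions, self.tables):
            table.setdefault(self._key(code, positions), []).append((label, code))

    def query(self, code: int):
        """Return the label of the closest candidate, or None if no bucket matches."""
        candidates = {}
        for positions, table in zip(self.positions, self.tables):
            for label, stored in table.get(self._key(code, positions), []):
                candidates[label] = stored
        if not candidates:
            return None
        return min(candidates, key=lambda lbl: hamming(candidates[lbl], code))


N_BITS = 64
rng = random.Random(42)

# Stand-in database: random codes in place of learned fingerprints.
database = {f"track_{i}": rng.getrandbits(N_BITS) for i in range(100)}

index = BitSamplingLSH(N_BITS, n_tables=16, bits_per_key=8)
for label, code in database.items():
    index.add(label, code)

# A "noisy recording" of track_7: the same code with 4 bits flipped.
noisy = flip_bits(database["track_7"], N_BITS, 4, rng)
print(hamming(noisy, database["track_7"]))  # one-to-one comparison
print(index.query(noisy))                   # LSH-based lookup
```

The one-to-one comparison checks a query against a single stored fingerprint, while the LSH index narrows the search to a few buckets before ranking candidates by distance. More tables raise the chance that a noisy query still lands in a bucket with its true match, at the cost of extra storage.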

