A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing With VGG16

dc.contributor.author Parlak, Cevahir
dc.contributor.author Altun, Yusuf
dc.contributor.other Computer Engineering Department
dc.date.accessioned 2025-01-11T13:03:35Z
dc.date.available 2025-01-11T13:03:35Z
dc.date.issued 2024
dc.description PARLAK (PhD), CEVAHIR/0000-0002-5500-7379; ALTUN, Prof. Dr. Yusuf/0000-0002-2099-0959 en_US
dc.description.abstract In this paper, we discuss the filter banks used for speech analysis and propose a novel filter bank for speech processing applications. Filter banks are building blocks of speech processing applications. Multiple filter strategies have been proposed, including Mel, PLP, Seneff, Lyon, and Gammatone filters. MFCC is a transformed version of Mel filters and is still a state-of-the-art method for speech recognition applications. However, 40 years after their debut, it is high time to introduce new structures as novel speech features. The proposed acoustic filter banks (AFB) are intended as alternatives to Mel filters, PLP filters, and MFCC features. The AFB filters are founded on the formant regions of vowels and consonants. In this study, we introduce an acoustic filter bank comprising 11 frequency regions and conduct experiments using the VGG16 model on the TIMIT and Speech Commands V2 datasets. The results indicate that MFCC, Mel, and PLP filters can effectively be replaced with the novel AFB filter bank features. en_US
dc.identifier.citation 0
dc.identifier.doi 10.1007/s00034-024-02794-z
dc.identifier.issn 0278-081X
dc.identifier.issn 1531-5878
dc.identifier.scopus 2-s2.0-85200054858
dc.identifier.uri https://doi.org/10.1007/s00034-024-02794-z
dc.identifier.uri https://hdl.handle.net/20.500.14627/288
dc.language.iso en en_US
dc.publisher Springer Birkhauser en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Speech Processing en_US
dc.subject MFCC en_US
dc.subject Mel Filters en_US
dc.subject PLP en_US
dc.subject Filter Banks en_US
dc.subject Convolutional Neural Networks en_US
dc.title A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing With VGG16 en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id PARLAK (PhD), CEVAHIR/0000-0002-5500-7379
gdc.author.id ALTUN, Prof. Dr. Yusuf/0000-0002-2099-0959
gdc.author.institutional Parlak, Cevahir
gdc.author.scopusid 55807221400
gdc.author.scopusid 25031391400
gdc.author.wosid PARLAK (PhD), Cevahir/ABA-4914-2021
gdc.author.wosid ALTUN, Prof. Dr. Yusuf/AAA-9929-2020
gdc.description.department Fenerbahçe University en_US
gdc.description.departmenttemp [Parlak, Cevahir] Fenerbahce Univ, Comp Engn Dept, Istanbul, Turkiye; [Altun, Yusuf] Duzce Univ, Comp Engn Dept, Duzce, Turkiye en_US
gdc.description.endpage 7338 en_US
gdc.description.issue 11 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 7309 en_US
gdc.description.volume 43 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q3
gdc.identifier.openalex W4401134257
gdc.identifier.wos WOS:001281629000005
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.13
gdc.plumx.mendeley 5
gdc.plumx.scopuscites 1
gdc.scopus.citedcount 1
gdc.wos.citedcount 0
relation.isAuthorOfPublication d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
relation.isAuthorOfPublication.latestForDiscovery d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
relation.isOrgUnitOfPublication 85e04a04-fb9d-4894-961f-e92f27bb6cb6
relation.isOrgUnitOfPublication.latestForDiscovery 85e04a04-fb9d-4894-961f-e92f27bb6cb6
