A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing With VGG16

dc.authorid: PARLAK (PhD), CEVAHIR/0000-0002-5500-7379
dc.authorid: ALTUN, Prof. Dr. Yusuf/0000-0002-2099-0959
dc.authorscopusid: 55807221400
dc.authorscopusid: 25031391400
dc.authorwosid: PARLAK (PhD), Cevahir/ABA-4914-2021
dc.authorwosid: ALTUN, Prof. Dr. Yusuf/AAA-9929-2020
dc.contributor.author: Parlak, Cevahir
dc.contributor.author: Altun, Yusuf
dc.contributor.other: Bilgisayar Mühendisliği Bölümü
dc.date.accessioned: 2025-01-11T13:03:35Z
dc.date.available: 2025-01-11T13:03:35Z
dc.date.issued: 2024
dc.department: Fenerbahçe University [en_US]
dc.department-temp: [Parlak, Cevahir] Fenerbahce Univ, Comp Engn Dept, Istanbul, Turkiye; [Altun, Yusuf] Duzce Univ, Comp Engn Dept, Duzce, Turkiye [en_US]
dc.description: PARLAK (PhD), CEVAHIR/0000-0002-5500-7379; ALTUN, Prof. Dr. Yusuf/0000-0002-2099-0959 [en_US]
dc.description.abstract: In this paper, we discuss the filter banks used for speech analysis and propose a novel filter bank for speech processing applications. Filter banks are building blocks of speech processing applications. Multiple filter strategies have been proposed, including Mel, PLP, Seneff, Lyon, and Gammatone filters. MFCC is a transformed version of the Mel filter outputs and is still a state-of-the-art feature for speech recognition applications. However, 40 years after their debut, it is time to introduce new structures as novel speech features. The proposed acoustic filter banks (AFB) are intended as alternatives to Mel filters, PLP filters, and MFCC features. AFB filters are founded on the formant regions of vowels and consonants. In this study, we introduce an acoustic filter bank comprising 11 frequency regions and conduct experiments using the VGG16 model on the TIMIT and Speech Commands V2 datasets. The results indicate that MFCC, Mel, and PLP filters can effectively be replaced with the novel AFB filter bank features. [en_US]
dc.description.woscitationindex: Science Citation Index Expanded
dc.identifier.citation: 0
dc.identifier.doi: 10.1007/s00034-024-02794-z
dc.identifier.endpage: 7338 [en_US]
dc.identifier.issn: 0278-081X
dc.identifier.issn: 1531-5878
dc.identifier.issue: 11 [en_US]
dc.identifier.scopus: 2-s2.0-85200054858
dc.identifier.scopusquality: Q2
dc.identifier.startpage: 7309 [en_US]
dc.identifier.uri: https://doi.org/10.1007/s00034-024-02794-z
dc.identifier.uri: https://hdl.handle.net/20.500.14627/288
dc.identifier.volume: 43 [en_US]
dc.identifier.wos: WOS:001281629000005
dc.identifier.wosquality: Q3
dc.language.iso: en [en_US]
dc.publisher: Springer Birkhauser [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Speech Processing [en_US]
dc.subject: MFCC [en_US]
dc.subject: Mel Filters [en_US]
dc.subject: PLP [en_US]
dc.subject: Filter Banks [en_US]
dc.subject: Convolutional Neural Networks [en_US]
dc.title: A Quest for Formant-Based Compact Nonuniform Trapezoidal Filter Banks for Speech Processing With VGG16 [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
relation.isAuthorOfPublication: d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
relation.isAuthorOfPublication.latestForDiscovery: d57697e4-e3d0-4c57-bb5c-4b92ceb92d1a
relation.isOrgUnitOfPublication: 85e04a04-fb9d-4894-961f-e92f27bb6cb6
relation.isOrgUnitOfPublication.latestForDiscovery: 85e04a04-fb9d-4894-961f-e92f27bb6cb6
