2026 |
Using Cross-Attention for Conversational ASR over the Telephone |
Conference paper published in proceedings volume |
2025 |
S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction |
Journal article |
2025 |
Bilingual Dual-Head Deep Model for Parkinson’s Disease Detection from Speech |
Conference paper published in proceedings volume |
2025 |
Lightweight Audio-Visual Wake Word Spotting with Diverse Acoustic Knowledge Distillation |
Journal article |
2025 |
Controllable Conformer for Speech Enhancement and Recognition |
Journal article |
2025 |
Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer |
Journal article |
2025 |
MSEMG: Surface Electromyography Denoising with a Mamba-based Efficient Network |
Conference paper published in proceedings volume |
2025 |
Cross-attention among spectrum, waveform and SSL representations with bidirectional knowledge distillation for speech enhancement |
Journal article |
2025 |
HPCNet: Hybrid Pixel and Contour Network for Audio-Visual Speech Enhancement with Low-Quality Video |
Journal article |
2025 |
An Explicit Consistency-Preserving Loss Function for Phase Reconstruction and Speech Enhancement |
Conference paper published in proceedings volume |
2024 |
How word semantics and phonology affect handwriting of Alzheimer’s patients: A machine learning based analysis |
Journal article |
2024 |
Speech Analysis of Language Varieties in Italy |
Conference paper published in proceedings volume |
2024 |
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-Based Speech Enhancement |
Conference paper published in proceedings volume |
2024 |
Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions |
Conference paper published in proceedings volume |
2024 |
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition |
Conference paper published in proceedings volume |
2024 |
FlanEC: Exploring Flan-T5 for Post-ASR Error Correction |
Conference paper published in proceedings volume |
2024 |
An Investigation of Incorporating Mamba For Speech Enhancement |
Conference paper published in proceedings volume |
2024 |
Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition |
Conference paper published in proceedings volume |
2024 |
Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes Constraints |
Conference paper published in proceedings volume |
2024 |
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction |
Conference paper published in proceedings volume |
2024 |
IT'S NEVER TOO LATE: FUSING ACOUSTIC INFORMATION INTO LARGE LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION |
Conference paper published in proceedings volume |
2024 |
Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge |
Conference paper published in proceedings volume |
2024 |
Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge |
Conference paper published in proceedings volume |
2024 |
Benchmarking Representations for Speech, Music, and Acoustic Events |
Conference paper published in proceedings volume |
2024 |
Federated learning for privacy-preserving speech recognition |
Book chapter or essay |
2023 |
Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge |
Conference paper published in proceedings volume |
2023 |
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models |
Conference paper published in proceedings volume |
2023 |
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization And Recognition |
Conference paper published in proceedings volume |
2023 |
Cumulative Sum Analysis of Learning Curve Process for Vaginal Natural Orifice Transluminal Endoscopic Surgery Hysterectomy |
Journal article |
2023 |
A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models |
Conference paper published in proceedings volume |
2023 |
Description and analysis of the KPT system for NIST Language Recognition Evaluation 2022 |
Conference paper published in proceedings volume |
2023 |
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition |
Conference paper published in proceedings volume |
2023 |
Inference and Denoise: Causal Inference-Based Neural Speech Enhancement |
Conference paper published in proceedings volume |
2023 |
Differentially Private Adapters for Parameter Efficient Acoustic Modeling |
Conference paper published in proceedings volume |
2023 |
A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity |
Journal article |
2022 |
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer |
Conference paper published in proceedings volume |
2022 |
Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models |
Journal article |
2022 |
AN EXPERIMENTAL STUDY ON PRIVATE AGGREGATION OF TEACHER ENSEMBLE LEARNING FOR END-TO-END SPEECH RECOGNITION |
Conference paper published in proceedings volume |
2022 |
The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines And Results |
Conference paper published in proceedings volume |
2022 |
A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification |
Conference paper published in proceedings volume |
2022 |
Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation |
Conference paper published in proceedings volume |
2022 |
Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis |
Conference paper published in proceedings volume |
2022 |
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis |
Conference paper published in proceedings volume |
2022 |
An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition |
Conference paper published in proceedings volume |
2021 |
Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation |
Conference paper published in proceedings volume |
2021 |
A Two-Stage Approach to Device-Robust Acoustic Scene Classification |
Conference paper published in proceedings volume |
2021 |
A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion |
Conference paper published in proceedings volume |
2021 |
A Two-Stage Deep Modeling Approach to Articulatory Inversion |
Conference paper published in proceedings volume |
2021 |
Automatic Speech Recognition by Machines |
Book chapter or essay |
2021 |
PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification |
Conference paper published in proceedings volume |
2021 |
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition |
Conference paper published in proceedings volume |
2021 |
Vector-to-Vector Regression via Distributional Loss for Speech Enhancement |
Journal article |
2021 |
Bone-Conducted Speech Enhancement Using Hierarchical Extreme Learning Machine |
Book chapter or essay |
2020 |
Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression |
Conference paper published in proceedings volume |
2020 |
Sequence-to-Sequence Articulatory Inversion Through Time Convolution of Sub-Band Frequency Signals |
Conference paper published in proceedings volume |
2020 |
Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement |
Conference paper published in proceedings volume |
2020 |
An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances |
Conference paper published in proceedings volume |
2020 |
Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network |
Conference paper published in proceedings volume |
2020 |
Transfer Learning of Articulatory Information Through Phone Information |
Conference paper published in proceedings volume |
2020 |
A multimodal retina-iris biometric system using the Levenshtein distance for spatial feature comparison |
Journal article |
2020 |
On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression |
Journal article |
2020 |
A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers |
Conference paper published in proceedings volume |
2020 |
Maximal Figure-of-Merit Framework to Detect Multi-label Phonetic Features for Spoken Language Recognition |
Journal article |
2020 |
Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification |
Conference paper published in proceedings volume |
2020 |
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression |
Journal article |
2020 |
Ensemble Hierarchical Extreme Learning Machine for Speech Dereverberation |
Journal article |
2019 |
Improving Audio-visual Speech Recognition Performance with Cross-modal Student-teacher Training |
Conference paper published in proceedings volume |
2019 |
Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine |
Conference paper published in proceedings volume |
2019 |
A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion |
Conference paper published in proceedings volume |
2019 |
Compressed multimodal hierarchical extreme learning machine for speech enhancement |
Conference paper published in proceedings volume |
2019 |
A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement |
Journal article |
2019 |
Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching |
Conference paper published in proceedings volume |
2019 |
Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models |
Journal article |
2018 |
Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models |
Conference paper published in proceedings volume |
2018 |
Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks |
Journal article |
2017 |
A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement |
Conference paper published in proceedings volume |
2017 |
Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations |
Conference paper published in proceedings volume |
2017 |
DEEP LEARNING WITH MAXIMAL FIGURE-OF-MERIT COST TO ADVANCE MULTI-LABEL SPEECH ATTRIBUTE DETECTION |
Conference paper published in proceedings volume |
2017 |
Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition |
Journal article |
2017 |
Experimental Study on Extreme Learning Machine Applications for Speech Enhancement |
Journal article |
2017 |
A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge |
Conference paper published in proceedings volume |
2017 |
A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation |
Journal article |
2017 |
Towards a direct Bayesian adaptation framework for deep models |
Conference paper published in proceedings volume |
2017 |
Hierarchical Bayesian Combination of Plug-in Maximum A Posteriori Decoders in Deep Neural Networks-based Speech Recognition and Speaker Adaptation |
Journal article |
2017 |
An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition |
Journal article |
2017 |
Joint training of multi-channel-condition dereverberation and acoustic modeling of microphone array speech for robust distant speech recognition |
Conference paper published in proceedings volume |
2017 |
Improving mispronunciation detection for non-native learners with multisource information and LSTM-based deep models |
Conference paper published in proceedings volume |
2017 |
Adaptation to New Microphones Using Artificial Neural Networks With Trainable Activation Functions |
Journal article |
2016 |
Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling |
Conference paper published in proceedings volume |
2016 |
i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition |
Journal article |
2016 |
A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition |
Journal article |
2016 |
Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees |
Conference paper published in proceedings volume |
2015 |
Rapid adaptation for deep neural networks through multi-task learning |
Conference paper published in proceedings volume |
2015 |
Maximum a posteriori adaptation of network parameters in deep models |
Conference paper published in proceedings volume |
2015 |
Boosting universal speech attributes classification with deep neural network for foreign accent characterization |
Conference paper published in proceedings volume |
2014 |
Introducing attribute features to foreign accent recognition |
Conference paper published in proceedings volume |
2014 |
An artificial neural network approach to automatic speech processing |
Journal article |
2014 |
Attribute based lattice rescoring in spontaneous speech recognition |
Conference paper published in proceedings volume |
2014 |
Dialect levelling in Finnish: A universal speech attribute approach |
Conference paper published in proceedings volume |
2014 |
Architecture for parking management in smart cities |
Journal article |
2014 |
Feature space maximum a posteriori linear regression for adaptation of deep neural networks |
Conference paper published in proceedings volume |
2013 |
An introductory study on deep neural networks for high resolution aerial images |
Conference paper published in proceedings volume |
2013 |
An experimental study on structural-MAP approaches to implementing very large vocabulary speech recognition systems for real-world tasks |
Conference paper published in proceedings volume |
2013 |
Exploiting Deep Neural Networks for Detection-Based Speech Recognition |
Journal article |
2013 |
Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems |
Journal article |
2013 |
Knowledge Integration for Improving Performance in LVCSR |
Conference paper published in proceedings volume |
2013 |
Speech recognition using long-span temporal patterns in a deep network model |
Journal article |
2013 |
Model-based margin estimation for hidden Markov model learning and generalisation |
Journal article |
2013 |
An Information-Extraction Approach to Speech Processing: Analysis, Detection, Verification, and Recognition |
Journal article |
2013 |
A bottom-up modular search approach to large vocabulary continuous speech recognition |
Journal article |
2013 |
Universal attribute characterization of spoken languages for automatic spoken language recognition |
Journal article |
2012 |
A study on cross-language knowledge integration in Mandarin LVCSR |
Conference paper published in proceedings volume |
2012 |
Consumer-level multimedia event detection through unsupervised audio signal modeling |
Conference paper published in proceedings volume |
2012 |
Hermitian-Based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models |
Conference paper published in proceedings volume |
2012 |
Experiments on cross-language attribute detection and phone recognition with minimal target-specific training data |
Journal article |
2012 |
A new confidence measure combining Hidden Markov Models and Artificial Neural Networks of phonemes for effective keyword spotting |
Conference paper published in proceedings volume |
2012 |
A NOVEL ARCHITECTURE FOR PARKING MANAGEMENT IN SMART CITIES |
Journal article |
2012 |
Combining Speech Attribute Detection and Penalized Logistic Regression for Phoneme Recognition |
Journal article |
2012 |
Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition |
Conference paper published in proceedings volume |
2011 |
Bootstrapping a spoken language identification system using unsupervised integrated sensing and processing decision trees |
Conference paper published in proceedings volume |
2011 |
A bottom-up stepwise knowledge-integration approach to large vocabulary continuous speech recognition using weighted finite state machines |
Conference paper published in proceedings volume |
2010 |
Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition |
Conference paper published in proceedings volume |
2010 |
Experimental studies on continuous speech recognition using neural architectures with "adaptive" hidden activation functions |
Conference paper published in proceedings volume |
2010 |
A survey on recent progress in the ASAT/SIRKUS paradigm |
Conference paper published in proceedings volume |
2010 |
Penalized logistic regression with HMM log-likelihood regressors for speech recognition |
Journal article |
2009 |
Minimum classification error training to improve isolated chord recognition |
Conference paper published in proceedings volume |
2009 |
A Multi-Objective Programming-Based Approach to Language Model Adaptation |
Conference paper published in proceedings volume |
2009 |
A phonetic feature based lattice rescoring approach to LVCSR |
Conference paper published in proceedings volume |
2009 |
Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition |
Conference paper published in proceedings volume |
2009 |
A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition |
Journal article |
2008 |
Continuous phone recognition without target language training data |
Conference paper published in proceedings volume |
2008 |
A penalized logistic regression approach to detection based phone classification |
Conference paper published in proceedings volume |
2008 |
Joint optimization of event detectors and evidence merger for continuous phone recognition |
Conference paper published in proceedings volume |
2008 |
Toward a detector-based universal phone recognizer |
Conference paper published in proceedings volume |
2008 |
An experimental study on continuous phone recognition with little or no language-specific training data |
Conference paper published in proceedings volume |
2007 |
High-accuracy phone recognition by combining high-performance lattice generation and knowledge based rescoring |
Conference paper published in proceedings volume |
2007 |
Detection-Based ASR in the Automatic Speech Attribute Transcription Project |
Conference paper published in proceedings volume |
2007 |
Approximate test risk minimization through soft margin estimation |
Conference paper published in proceedings volume |
2007 |
Towards Bottom-up Continuous Phone Recognition |
Conference paper published in proceedings volume |
2006 |
Noise Robust Aurora-2 speech recognition employing a codebook-constrained Kalman filter preprocessor |
Conference paper published in proceedings volume |
2006 |
A study on lattice rescoring with knowledge scores for automatic speech recognition |
Conference paper published in proceedings volume |
2006 |
A Study of Perceptron Mapping Capability to Design Speech Event Detectors |
Proceedings |
2006 |
Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks |
Proceedings |
2006 |
Application of EalphaNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition |
Book chapter or essay |
2005 |
Neural Classification of HEP Experimental Data |
Proceedings |
2005 |
Efficient FPGA Implementation of a Knowledge-based Automatic Speech Classifier |
Book chapter or essay |
2005 |
Application of Enets to Feature Recognition of Articulation Manner in Knowledge-based Automatic Speech Recognition |
Book chapter or essay |
2004 |
Efficient Rapid Prototyping of Image and Video Processing Algorithms |
Proceedings |