SABATO MARCO SINISCALCHI

Curriculum and Research

Subjects

Academic Year Subject identification code Subject name ECTS Course of study
2025/2026 24183 ANALISI INTELLIGENTE DEI SEGNALI 6 INGEGNERIA INFORMATICA
2025/2026 01527 BASI DI DATI E SISTEMI INFORMATIVI 6 INGEGNERIA INFORMATICA
2025/2026 04203 LABORATORIO DI INFORMATICA 6 SCIENZE DELL'EDUCAZIONE

Publications

Date Title Type
2026 Using Cross-Attention for Conversational ASR over the Telephone Conference paper in published proceedings
2025 S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction Journal article
2025 Bilingual Dual-Head Deep Model for Parkinson’s Disease Detection from Speech Conference paper in published proceedings
2025 Lightweight Audio-Visual Wake Word Spotting with Diverse Acoustic Knowledge Distillation Journal article
2025 Controllable Conformer for Speech Enhancement and Recognition Journal article
2025 Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer Journal article
2025 MSEMG: Surface Electromyography Denoising with a Mamba-based Efficient Network Conference paper in published proceedings
2025 Cross-attention among spectrum, waveform and SSL representations with bidirectional knowledge distillation for speech enhancement Journal article
2025 HPCNet: Hybrid Pixel and Contour Network for Audio-Visual Speech Enhancement with Low-Quality Video Journal article
2025 An Explicit Consistency-Preserving Loss Function for Phase Reconstruction and Speech Enhancement Conference paper in published proceedings
2024 How word semantics and phonology affect handwriting of Alzheimer’s patients: A machine learning based analysis Journal article
2024 Speech Analysis of Language Varieties in Italy Conference paper in published proceedings
2024 Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-Based Speech Enhancement Conference paper in published proceedings
2024 Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions Conference paper in published proceedings
2024 Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition Conference paper in published proceedings
2024 FlanEC: Exploring Flan-T5 for Post-ASR Error Correction Conference paper in published proceedings
2024 An Investigation of Incorporating Mamba For Speech Enhancement Conference paper in published proceedings
2024 Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition Conference paper in published proceedings
2024 Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes Constraints Conference paper in published proceedings
2024 The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction Conference paper in published proceedings
2024 IT'S NEVER TOO LATE: FUSING ACOUSTIC INFORMATION INTO LARGE LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION Conference paper in published proceedings
2024 Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge Conference paper in published proceedings
2024 Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge Conference paper in published proceedings
2024 Benchmarking Representations for Speech, Music, and Acoustic Events Conference paper in published proceedings
2024 Federated learning for privacy-preserving speech recognition Book chapter or essay
2023 Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge Conference paper in published proceedings
2023 HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models Conference paper in published proceedings
2023 The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition Conference paper in published proceedings
2023 Cumulative Sum Analysis of Learning Curve Process for Vaginal Natural Orifice Transluminal Endoscopic Surgery Hysterectomy Journal article
2023 A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models Conference paper in published proceedings
2023 Description and analysis of the KPT system for NIST Language Recognition Evaluation 2022 Conference paper in published proceedings
2023 Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition Conference paper in published proceedings
2023 Inference and Denoise: Causal Inference-Based Neural Speech Enhancement Conference paper in published proceedings
2023 Differentially Private Adapters for Parameter Efficient Acoustic Modeling Conference paper in published proceedings
2023 A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity Journal article
2022 A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer Conference paper in published proceedings
2022 Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models Journal article
2022 AN EXPERIMENTAL STUDY ON PRIVATE AGGREGATION OF TEACHER ENSEMBLE LEARNING FOR END-TO-END SPEECH RECOGNITION Conference paper in published proceedings
2022 The First Multimodal Information Based Speech Processing (Misp) Challenge: Data, Tasks, Baselines And Results Conference paper in published proceedings
2022 A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification Conference paper in published proceedings
2022 Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation Conference paper in published proceedings
2022 Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis Conference paper in published proceedings
2022 Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis Conference paper in published proceedings
2022 An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition Conference paper in published proceedings
2021 Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation Conference paper in published proceedings
2021 A Two-Stage Approach to Device-Robust Acoustic Scene Classification Conference paper in published proceedings
2021 A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion Conference paper in published proceedings
2021 A Two-Stage Deep Modeling Approach to Articulatory Inversion Conference paper in published proceedings
2021 Automatic Speech Recognition by Machines Book chapter or essay
2021 PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification Conference paper in published proceedings
2021 Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition Conference paper in published proceedings
2021 Vector-to-Vector Regression via Distributional Loss for Speech Enhancement Journal article
2021 Bone-Conducted Speech Enhancement Using Hierarchical Extreme Learning Machine Book chapter or essay
2020 Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression Conference paper in published proceedings
2020 Sequence-to-Sequence Articulatory Inversion Through Time Convolution of Sub-Band Frequency Signals Conference paper in published proceedings
2020 Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement Conference paper in published proceedings
2020 An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances Conference paper in published proceedings
2020 Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network Conference paper in published proceedings
2020 Transfer Learning of Articulatory Information Through Phone Information Conference paper in published proceedings
2020 A multimodal retina-iris biometric system using the Levenshtein distance for spatial feature comparison Journal article
2020 On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression Journal article
2020 A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers Conference paper in published proceedings
2020 Maximal Figure-of-Merit Framework to Detect Multi-label Phonetic Features for Spoken Language Recognition Journal article
2020 Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification Conference paper in published proceedings
2020 Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression Journal article
2020 Ensemble Hierarchical Extreme Learning Machine for Speech Dereverberation Journal article
2019 Improving Audio-visual Speech Recognition Performance with Cross-modal Student-teacher Training Conference paper in published proceedings
2019 Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine Conference paper in published proceedings
2019 A Phonetic-Level Analysis of Different Input Features for Articulatory Inversion Conference paper in published proceedings
2019 Compressed multimodal hierarchical extreme learning machine for speech enhancement Conference paper in published proceedings
2019 A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement Journal article
2019 Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching Conference paper in published proceedings
2019 Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models Journal article
2018 Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models Conference paper in published proceedings
2018 Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks Journal article
2017 A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement Conference paper in published proceedings
2017 Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations Conference paper in published proceedings
2017 DEEP LEARNING WITH MAXIMAL FIGURE-OF-MERIT COST TO ADVANCE MULTI-LABEL SPEECH ATTRIBUTE DETECTION Conference paper in published proceedings
2017 Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition Journal article
2017 Experimental Study on Extreme Learning Machine Applications for Speech Enhancement Journal article
2017 A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge Conference paper in published proceedings
2017 A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation Journal article
2017 Towards a direct Bayesian adaptation framework for deep models Conference paper in published proceedings
2017 Hierarchical Bayesian Combination of Plug-in Maximum A Posteriori Decoders in Deep Neural Networks-based Speech Recognition and Speaker Adaptation Journal article
2017 An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition Journal article
2017 Joint training of multi-channel-condition dereverberation and acoustic modeling of microphone array speech for robust distant speech recognition Conference paper in published proceedings
2017 Improving mispronunciation detection for non-native learners with multisource information and LSTM-based deep models Conference paper in published proceedings
2017 Adaptation to New Microphones Using Artificial Neural Networks With Trainable Activation Functions Journal article
2016 Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling Conference paper in published proceedings
2016 i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition Journal article
2016 A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition Journal article
2016 Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees Conference paper in published proceedings
2015 Rapid adaptation for deep neural networks through multi-task learning Conference paper in published proceedings
2015 Maximum a posteriori adaptation of network parameters in deep models Conference paper in published proceedings
2015 Boosting universal speech attributes classification with deep neural network for foreign accent characterization Conference paper in published proceedings
2014 Introducing attribute features to foreign accent recognition Conference paper in published proceedings
2014 An artificial neural network approach to automatic speech processing Journal article
2014 Attribute based lattice rescoring in spontaneous speech recognition Conference paper in published proceedings
2014 Dialect levelling in Finnish: A universal speech attribute approach Conference paper in published proceedings
2014 Architecture for parking management in smart cities Journal article
2014 Feature space maximum a posteriori linear regression for adaptation of deep neural networks Conference paper in published proceedings
2013 An introductory study on deep neural networks for high resolution aerial images Conference paper in published proceedings
2013 An experimental study on structural-MAP approaches to implementing very large vocabulary speech recognition systems for real-world tasks Conference paper in published proceedings
2013 Exploiting Deep Neural Networks for Detection-Based Speech Recognition Journal article
2013 Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems Journal article
2013 Knowledge Integration for Improving Performance in LVCSR Conference paper in published proceedings
2013 Speech recognition using long-span temporal patterns in a deep network model Journal article
2013 Model-based margin estimation for hidden Markov model learning and generalisation Journal article
2013 An Information-Extraction Approach to Speech Processing: Analysis, Detection, Verification, and Recognition Journal article
2013 A bottom-up modular search approach to large vocabulary continuous speech recognition Journal article
2013 Universal attribute characterization of spoken languages for automatic spoken language recognition Journal article
2012 A study on cross-language knowledge integration in Mandarin LVCSR Conference paper in published proceedings
2012 Consumer-level multimedia event detection through unsupervised audio signal modeling Conference paper in published proceedings
2012 Hermitian-Based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models Conference paper in published proceedings
2012 Experiments on cross-language attribute detection and phone recognition with minimal target-specific training data Journal article
2012 A new confidence measure combining Hidden Markov Models and Artificial Neural Networks of phonemes for effective keyword spotting Conference paper in published proceedings
2012 A NOVEL ARCHITECTURE FOR PARKING MANAGEMENT IN SMART CITIES Journal article
2012 Combining Speech Attribute Detection and Penalized Logistic Regression for Phoneme Recognition Journal article
2012 Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition Conference paper in published proceedings
2011 Bootstrapping a spoken language identification system using unsupervised integrated sensing and processing decision trees Conference paper in published proceedings
2011 A bottom-up stepwise knowledge-integration approach to large vocabulary continuous speech recognition using weighted finite state machines Conference paper in published proceedings
2010 Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition Conference paper in published proceedings
2010 Experimental studies on continuous speech recognition using neural architectures with "adaptive" hidden activation functions Conference paper in published proceedings
2010 A survey on recent progress in the ASAT/SIRKUS paradigm Conference paper in published proceedings
2010 Penalized logistic regression with HMM log-likelihood regressors for speech recognition Journal article
2009 Minimum classification error training to improve isolated chord recognition Conference paper in published proceedings
2009 A Multi-Objective Programming-Based Approach to Language Model Adaptation Conference paper in published proceedings
2009 A phonetic feature based lattice rescoring approach to LVCSR Conference paper in published proceedings
2009 Exploring Universal Attribute Characterization of Spoken Languages for Spoken Language Recognition Conference paper in published proceedings
2009 A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition Journal article
2008 Continuous phone recognition without target language training data Conference paper in published proceedings
2008 A penalized logistic regression approach to detection based phone classification Conference paper in published proceedings
2008 Joint optimization of event detectors and evidence merger for continuous phone recognition Conference paper in published proceedings
2008 Toward a detector-based universal phone recognizer Conference paper in published proceedings
2008 An experimental study on continuous phone recognition with little or no language specific-training data Conference paper in published proceedings
2007 High-accuracy phone recognition by combining high-performance lattice generation and knowledge based rescoring Conference paper in published proceedings
2007 Detection-Based ASR in the Automatic Speech Attribute Transcription Project Conference paper in published proceedings
2007 Approximate test risk minimization through soft margin estimation Conference paper in published proceedings
2007 Towards Bottom-up Continuous Phone Recognition Conference paper in published proceedings
2006 Noise Robust Aurora-2 speech recognition employing a codebook-constrained Kalman filter preprocessor Conference paper in published proceedings
2006 A study on lattice rescoring with knowledge scores for automatic speech recognition Conference paper in published proceedings
2006 A Study of Perceptron Mapping Capability to Design Speech Event Detectors Proceedings
2006 Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks Proceedings
2006 Application of EalphaNets to Feature Recognition of Articulation Manner in Knowledge-Based Automatic Speech Recognition Book chapter or essay
2005 Neural Classification of HEP Experimental Data Proceedings
2005 Efficient FPGA Implementation of a Knowledge-based Automatic Speech Classifier Book chapter or essay
2005 Application of Enets to Feature Recognition of Articulation Manner in Knowledge-based Automatic Speech Recognition Book chapter or essay
2004 Efficient Rapid Prototyping of Image and Video Processing Algorithms Proceedings