CIVS: A Collective-Intelligence Ensemble for Automated Software Vulnerability Scoring
- Authors: Mirtaheri, S.L.; Shahbazian, R.; Pascucci, V.; Movahedkor, N.; Pugliese, A.
- Publication year: 2025
- Type: Journal article
- OA Link: http://hdl.handle.net/10447/692277
Abstract
Automated and accurate Common Vulnerability Scoring System (CVSS) labeling is required for quick patch processing. Large Language Models (LLMs) have shown impressive capabilities in understanding and generating human language; however, their performance varies with factors such as training data and architecture. LLMs may generate biased or irrelevant responses and mis-rank critical flaws. This paper presents CIVS, a collective-intelligence framework that fuses GPT-4 with a fine-tuned GPT-3.5-Turbo via weighted aggregation and ensemble learning. CIVS can match or surpass the accuracy and cost-efficiency of a single large, expensive model while reducing risk and improving reliability. Evaluated on recent records from the National Vulnerability Database (NVD), CIVS reduces mean-squared error by 10% and improves macro-F1 to 0.76 compared with the strongest individual model. CIVS remains robust even when challenged with GPT-generated “what-if” variations of vulnerability descriptions. Because it reuses existing models and adds no new trainable parameters, the framework remains cost-efficient while still generalizing to previously unseen vulnerabilities.
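
The abstract names weighted aggregation of two models' outputs but gives no formula or weights. The following is a minimal sketch of what such a fusion could look like, not the authors' implementation: the function names, the example scores, and the weights 0.4 and 0.6 are all hypothetical. Only the severity bands are taken from a known source, the CVSS v3.x qualitative rating scale.

```python
from typing import Sequence


def weighted_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Fuse per-model CVSS base scores with a normalized weighted average.

    Hypothetical helper: the weights are illustrative, not taken from the paper.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(s * w for s, w in zip(scores, weights)) / total
    # CVSS base scores are defined on the range 0.0 to 10.0.
    return min(10.0, max(0.0, fused))


def severity(score: float) -> str:
    """Map a base score to its CVSS v3.x qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def mse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean-squared error between predicted and reference scores."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and equal in length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Two hypothetical model outputs for one vulnerability description.
fused = weighted_score([7.5, 9.1], [0.4, 0.6])
label = severity(fused)
```

In this sketch the fused score can be compared against NVD reference scores with `mse`, and the severity labels could feed a macro-F1 computation. How CIVS actually chooses its weights or combines aggregation with ensemble learning is not specified in the abstract.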
