
MARINELLA SCIORTINO

Bit Catastrophes for the Burrows-Wheeler Transform

  • Authors: Giuliani, Sara; Inenaga, Shunsuke; Lipták, Zsuzsanna; Romana, Giuseppe; Sciortino, Marinella; Urbina, Cristian
  • Year of publication: 2025
  • Type: Journal article
  • OA Link: http://hdl.handle.net/10447/680030

Abstract

A bit catastrophe, loosely defined, occurs when a change in just one character of a string causes a significant change in the size of the compressed string. We study this phenomenon for the Burrows-Wheeler Transform (BWT), a string transform at the heart of several of the most popular compressors and aligners today. The parameter determining the size of the compressed data is the number of equal-letter runs of the BWT, commonly denoted $r$. We exhibit infinite families of strings in which the insertion, deletion, or substitution of one character increases $r$ from constant to $\Theta(\log n)$, where $n$ is the length of the string. These strings can be read as examples of an increase by either a multiplicative or an additive $\Theta(\log n)$ factor. As regards the multiplicative factor, they attain the upper bound of $O(\log n \log r)$ given by Akagi, Funakoshi, and Inenaga [Inf. & Comput. 2023], since here $r = O(1)$. We then give examples of strings in which the insertion, deletion, or substitution of a character increases $r$ by an additive $\Theta(\sqrt n)$ term. These strings significantly improve the best known lower bound of $\Omega(\log n)$ for an additive increase [Giuliani et al., SOFSEM 2021].
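To make the quantity $r$ concrete, the following is a minimal sketch, not the authors' construction. It assumes the BWT is defined as the last column of the sorted rotations of the string, with no end-of-string marker, and uses a naive sort. The helper names `bwt`, `runs`, `single_edits`, and `max_r_after_one_edit` are hypothetical and chosen here for illustration only.

```python
def bwt(s: str) -> str:
    """BWT of s as the last column of its sorted rotations.

    Assumption: no end-of-string marker is appended.
    Naive O(n^2 log n) construction, meant for small inputs only.
    """
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def runs(s: str) -> int:
    """Number of maximal equal-letter runs in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def single_edits(s: str, alphabet: str):
    """Yield every string obtained from s by one insertion, deletion, or substitution."""
    for i in range(len(s) + 1):
        for c in alphabet:
            yield s[:i] + c + s[i:]              # insertion before position i
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]                  # deletion of position i
        for c in alphabet:
            if c != s[i]:
                yield s[:i] + c + s[i + 1:]      # substitution at position i


def max_r_after_one_edit(s: str, alphabet: str) -> int:
    """Largest r over all strings at edit distance one from s (brute force)."""
    return max((runs(bwt(t)) for t in single_edits(s, alphabet)), default=0)


if __name__ == "__main__":
    word = "banana"
    print(bwt(word), runs(bwt(word)))            # nnbaaa 3
    print(max_r_after_one_edit(word, "abn"))
```

Comparing `runs(bwt(s))` with `max_r_after_one_edit(s, alphabet)` on small inputs gives a brute-force way to observe how sensitive $r$ is to a single edit. Exhibiting the asymptotic gaps described in the abstract requires the specific string families constructed in the paper, which this sketch does not reproduce.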