ARIANNA MARIA PAVONE

Approximate String Matching with Non-Overlapping Adjacent Unbalanced Translocations

Abstract

In this paper, we investigate the approximate string matching problem when the allowed edit operations are non-overlapping unbalanced translocations of adjacent factors. Such an edit operation takes place when two adjacent substrings of the text swap positions, resulting in a modified string; the two substrings involved may have different lengths. Large-scale modifications of this kind have applications in fields such as computational biology and genomics, where structural rearrangements play a key role. Despite their relevance to text processing, however, little attention has been devoted to the problem of matching strings under this kind of edit operation. We present three algorithms for solving the problem, all of them with (Formula presented.) worst-case time complexity and (Formula presented.) space complexity, where m and n are the lengths of the pattern and of the text, respectively. Our first algorithm is based on dynamic programming. Our second solution improves on the first by making use of the Directed Acyclic Word Graph of the pattern. Our third algorithm is based on an alignment procedure. We also show that, under the assumptions of equiprobability and independence of characters, our second algorithm has (Formula presented.) average time complexity for an alphabet of size (Formula presented.).
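
To make the edit operation concrete, the following Python sketch illustrates an unbalanced translocation of adjacent factors and a naive matcher built on it. It is only a brute-force illustration and not one of the three algorithms described in the abstract. The function names (translocate, matches_window, find_matches) are hypothetical, and the sketch assumes that a pattern of length m matches a text window of the same length when the window can be obtained from the pattern through swaps of adjacent factors that do not overlap one another.

```python
def translocate(s, i, j, k):
    """Swap the adjacent factors s[i:j] and s[j:k]; their lengths may differ."""
    assert 0 <= i < j < k <= len(s)
    return s[:i] + s[j:k] + s[i:j] + s[k:]


def matches_window(pattern, window):
    """Return True if `window` can be obtained from `pattern` by
    non-overlapping translocations of adjacent factors (assumed match
    definition; naive check, not the paper's method)."""
    m = len(pattern)
    if len(window) != m:
        return False
    # ok[p] is True when pattern[:p] can be turned into window[:p].
    ok = [False] * (m + 1)
    ok[0] = True
    for p in range(1, m + 1):
        # Case 1: the last character is left untouched.
        if ok[p - 1] and pattern[p - 1] == window[p - 1]:
            ok[p] = True
            continue
        # Case 2: a translocation covers pattern[i:p], split at position j,
        # so that both swapped factors are non-empty.
        for i in range(p - 1):
            if not ok[i]:
                continue
            if any(window[i:p] == pattern[j:p] + pattern[i:j]
                   for j in range(i + 1, p)):
                ok[p] = True
                break
    return ok[m]


def find_matches(pattern, text):
    """Return the start positions of all text windows matching the pattern."""
    m = len(pattern)
    return [s for s in range(len(text) - m + 1)
            if matches_window(pattern, text[s:s + m])]


if __name__ == "__main__":
    print(translocate("abcde", 0, 2, 5))         # swaps "ab" with "cde"
    print(find_matches("abcde", "xxcdeabyy"))    # window starting at index 2
```

In this sketch each window is checked independently, which is far more expensive than the approaches the abstract describes; it is meant only to pin down what "two adjacent substrings of different lengths swap" means in practice.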