Alignment Free Dissimilarities for Nucleosome Classification
- Authors: LO BOSCO, G.
- Year of publication: 2016
- Type: Conference proceedings contribution published in a volume (chapter or essay)
- Keywords: k-mers; L-tuples; Alignment free DNA sequence dissimilarities; Nucleosome classification; Epigenetic; Knn classifier
- OA Link: http://hdl.handle.net/10447/181678
Epigenetic mechanisms such as nucleosome positioning, histone modifications and DNA methylation play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have shown a role of DNA sequences in the recruitment of epigenetic regulators. For this reason, the use of more suitable similarity or dissimilarity measures between DNA sequences could help in epigenetic studies. In particular, alignment-free dissimilarities have already been successfully applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles. In this work, we focus on the problem of nucleosome classification, providing a benchmark study of 6 alignment-free dissimilarity measures between sequences, belonging to the geometric-based, correlation-based, information-based and compression-based categories. They are compared against an alignment-based dissimilarity by measuring the performance of several nearest neighbour classifiers, each incorporating one of the considered dissimilarities. Results computed on three datasets of nucleosome forming and inhibiting sequences show that, among the alignment-free dissimilarities, the geometric-based and correlation-based ones are the most suitable for nucleosome classification, making them a more efficient alternative to the alignment-based similarity measures, which nevertheless remain the preferred choice when dealing with sequence similarity measurements.
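
To make the general idea concrete, the following is a minimal Python sketch of an alignment-free nearest neighbour classifier. It is not the paper's implementation: the abstract does not name the 6 measures, so the Euclidean distance stands in here for a geometric-based dissimilarity and one minus the Pearson correlation for a correlation-based one, both computed on k-mer frequency profiles. The function names, the choice k = 2, and the toy sequences and labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_profile(seq, k=2):
    """Relative frequencies of all 4**k k-mers, in fixed lexicographic order.

    k-mers containing symbols outside ACGT are counted in the total
    but have no entry in the returned vector.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return [counts["".join(p)] / total for p in product(ALPHABET, repeat=k)]


def euclidean(u, v):
    """Geometric-based dissimilarity (assumed example): Euclidean distance."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_dissimilarity(u, v):
    """Correlation-based dissimilarity (assumed example): 1 - Pearson r."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # correlation undefined for a constant profile
    return 1.0 - cov / (norm_u * norm_v)


def knn_predict(query, training, dissimilarity, n_neighbors=1, k=2):
    """Majority vote among the n_neighbors training sequences closest to query.

    training is a list of (sequence, label) pairs.
    """
    q = kmer_profile(query, k)
    ranked = sorted(
        training,
        key=lambda item: dissimilarity(q, kmer_profile(item[0], k)),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy data; real nucleosome datasets are far larger.
training = [
    ("AAAAAATTTTTT", "forming"),
    ("GCGCGCGCGCGC", "inhibiting"),
]
prediction = knn_predict("AAAATTTTAAAA", training, euclidean)
```

Because the profiles are computed without aligning the sequences, each comparison costs time linear in the sequence length plus the 4**k profile size, which is the efficiency argument the abstract makes against alignment-based measures. Swapping `euclidean` for `correlation_dissimilarity` in the call to `knn_predict` changes the dissimilarity without touching the classifier.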