A Dictionary based Informational Genome Analysis

Alberto Castellini, Giuditta Franco, and Vincenzo Manca

 
 

Mycoplasma genitalium

  • Length: 580.076 bp

 

  • Multiplicity-Comultiplicity 6-distribution: "multiplicity is the number of occurrences of single 6-words" while "comultiplicity is the number of different 6-words having a given occurrence"



  • Multiplicity-Comultiplicity 6-distribution (comparison between the real genomic sequence and a random permutation of this sequence): "multiplicity is the number of occurrences of single 6-words" while "comultiplicity is the number of different 6-words having a given occurrence"


  • Zipf's diagram



  • Cardinality trends of genomic k-dictionaries: all k-mers, k-hapaxes, and k-repeats



  • Length-cardinality repeats distribution