A Dictionary based Informational Genome Analysis

Alberto Castellini, Giuditta Franco, and Vincenzo Manca


Salmonella enterica SL 254

  • Length: 4,827,641 bp


  • Multiplicity-Comultiplicity 6-distribution: "multiplicity is the number of occurrences of single 6-words" while "comultiplicity is the number of different 6-words having a given occurrence"

  • Zipf's diagram

  • Cardinality trends of genomic k-dictionaries: all k-mers, k-hapaxes, and k-repeats
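
The quantities listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes overlapping k-words read on a single strand, takes a hapax to be a word occurring exactly once and a repeat to be a word occurring more than once, and uses a toy sequence in place of the genome. All function names are hypothetical.

```python
from collections import Counter


def kmer_multiplicity(seq, k):
    """Multiplicity: map each k-word to its number of overlapping occurrences."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def comultiplicity(mult):
    """Comultiplicity: map each occurrence count to the number of distinct k-words with that count."""
    return Counter(mult.values())


def dictionary_sizes(mult):
    """Cardinalities of the k-dictionary, its hapaxes (count == 1) and its repeats (count > 1)."""
    total = len(mult)
    hapaxes = sum(1 for count in mult.values() if count == 1)
    return {"all": total, "hapaxes": hapaxes, "repeats": total - hapaxes}


def zipf_ranks(mult):
    """Occurrence counts sorted in decreasing order, i.e. the y-values of a rank-frequency (Zipf) plot."""
    return sorted(mult.values(), reverse=True)


# Toy sequence (illustrative only, not genomic data).
toy = "ACGTACGTTACG"
m = kmer_multiplicity(toy, 3)
print(dict(m))
print(dict(comultiplicity(m)))
print(dictionary_sizes(m))
print(zipf_ranks(m))
```

Calling `dictionary_sizes(kmer_multiplicity(genome, k))` for increasing values of k would give the points of a cardinality trend. For a real genome the sequence would first have to be loaded from a file, which this sketch leaves out.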