N-gram analysis of 970 microbial organisms reveals presence of biological language models

Abstract

Background

It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts such as "signature-style" word usage indicative of authors or topics, and that the algorithms origin...

Bibliographic Details
Main Authors: Ganapathiraju Madhavi K, Osmanbeyoglu Hatice
Format: Article
Language: English
Published: BMC 2011-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/12/12