BiGram Frequency

A bigram, also called a bigraph, is simply an ordered pair of adjacent letters appearing in a word. (Avoid "digraph", because that word has two more common meanings.) In a natural language such as English, the frequency of a letter pair cannot be predicted just from the relative frequencies of its individual letters. See http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/digraphs.html, which I found referenced elsewhere; they used a sample of only 40,000 words. /usr/share/dict/words on my computer has 235,000 words, but it is not a good sample either: it includes many proper nouns, and obscure words appear as often as common ones. Large volumes of text should be easy to find online and analyze. (Maybe I will.)
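As a rough sketch of the kind of analysis meant here, the Python below counts bigrams inside words and compares each observed frequency with what you would expect if the two letters were independent (the product of their individual frequencies). The function names are made up for this page, and treating a "word" as a maximal run of the letters a to z is an assumption, not anything taken from the Cornell page.

```python
import re
from collections import Counter


def bigram_counts(text):
    """Count ordered pairs of adjacent letters inside words.

    Assumption: a "word" is a maximal run of the letters a-z after
    lowercasing, so pairs never span spaces or punctuation.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for first, second in zip(word, word[1:]):
            counts[first + second] += 1
    return counts


def letter_counts(text):
    """Count individual letters a-z, ignoring case."""
    return Counter(re.findall(r"[a-z]", text.lower()))


def observed_vs_independent(text):
    """Return (bigram, observed frequency, frequency expected under
    independence) tuples, most common bigram first."""
    letters = letter_counts(text)
    pairs = bigram_counts(text)
    n_letters = sum(letters.values())
    n_pairs = sum(pairs.values())
    rows = []
    for pair, count in pairs.most_common():
        observed = count / n_pairs
        expected = (letters[pair[0]] / n_letters) * (letters[pair[1]] / n_letters)
        rows.append((pair, observed, expected))
    return rows


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog then rests"
    for pair, observed, expected in observed_vs_independent(sample)[:5]:
        print(f"{pair}  observed={observed:.3f}  independent={expected:.3f}")
```

A toy sentence like the one above proves nothing, of course; the comparison only becomes meaningful on a large body of running text, where common words are weighted by how often they are actually used.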

Claude Shannon's famous 1948 paper //A Mathematical Theory of Communication//, which in a single step created the field of information theory, opens by introducing the concept of entropy through Markov modeling of the English language.
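To make the link between bigrams and entropy concrete, here is a minimal sketch that estimates two numbers from a text: the entropy of single letters, and the entropy of a letter given the letter before it (a first-order Markov model built from bigram counts). This is only an illustration of the idea, not a reproduction of Shannon's calculation; the function names and the rule that words are runs of a to z are assumptions made for this page.

```python
import math
import re
from collections import Counter


def letter_entropy(text):
    """Entropy in bits per letter, treating letters as independent."""
    counts = Counter(re.findall(r"[a-z]", text.lower()))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(text):
    """Entropy in bits of a letter given the preceding letter.

    Estimated from within-word bigram counts as
    H = -sum over pairs (a, b) of p(a, b) * log2 p(b | a).
    """
    pairs = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        pairs.update(zip(word, word[1:]))
    total = sum(pairs.values())
    if total == 0:
        return 0.0  # no bigrams at all, nothing to estimate
    # How often each letter occurs as the first letter of a pair.
    firsts = Counter()
    for (a, _b), count in pairs.items():
        firsts[a] += count
    return -sum(
        count / total * math.log2(count / firsts[a])
        for (a, _b), count in pairs.items()
    )


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog then rests"
    print("bits per letter, independent letters:", letter_entropy(sample))
    print("bits per letter, given previous letter:", conditional_entropy(sample))
```

On a real sample the second figure should come out lower than the first, which is another way of saying that knowing the previous letter helps predict the next one.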
