From: CSBVAX::CSBVAX::MRGATE::"SMTP::PYRITE.RUTGERS.EDU::SECURITY" 16-MAR-1989 03:33
To: MRGATE::"ARISIA::EVERHART"
Subj: Password length
Sender: security@pyrite.rutgers.edu
Date: Mon, 6 Mar 89 20:45:16 AST
From: Don Chiasson
Subject: Password length
To: security@pyrite.rutgers.edu, tihor@acf6.nyu.edu
Cc: G.CHIASSON@xx.drea.dnd.ca

The classic reference on letter probabilities in English is Shannon, "Prediction and Entropy of Printed English", Bell System Technical Journal, Vol. 30, No. 1, pp. 50-64, January 1951. An old book I have, Abramson's "Information Theory and Coding" (1961), gives the following information:

 - For a 27-letter equiprobable alphabet (A-Z plus space) with no inter-symbol dependency, the entropy, or per-symbol information content, of the source is 4.75 bits/symbol.
 - If the probabilities of the individual symbols are taken into account (i.e., ETAOIN...), the entropy becomes 4.03 bits/symbol.
 - If each letter depends only on the previous symbol, we get 3.32 bits/symbol; if on the previous two symbols, approximately 3.1 bits/symbol.
 - If the dependence is on all of the preceding text, it is between 0.6 and 1.3 bits per symbol. One bit per symbol is often assumed; in other words, given a long string of text, there is a 50% chance you can correctly guess the next letter.

Thus using a long and correctly spelled English word does not give a great deal of password security. Violate one or more of these conditions and the problem of guessing a password becomes much more difficult.

      Don
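The first two entropy figures quoted from Abramson can be checked with a short sketch. The letter-frequency percentages below are standard published approximations for English text, not values taken from the post, so the second result only lands near (not exactly on) the 4.03 bits/symbol figure, which also folds in the space character:

```python
import math

def entropy(weights):
    """Shannon entropy in bits of a distribution given as relative weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

# 27 equiprobable symbols (A-Z plus space): H = log2(27) ~ 4.75 bits/symbol.
print(round(entropy([1] * 27), 2))

# Approximate English letter frequencies in percent (E, T, A, O, I, N, ...);
# these are common textbook values, assumed here for illustration.
english = [12.7, 9.1, 8.2, 7.5, 7.0, 6.7, 6.3, 6.1, 6.0, 4.3, 4.0,
           2.8, 2.8, 2.4, 2.4, 2.2, 2.0, 2.0, 1.9, 1.5, 1.0, 0.8,
           0.15, 0.15, 0.10, 0.07]
print(round(entropy(english), 2))   # roughly 4.2 for letters alone

# The password implication: an 8-character string of random letters carries
# about 8 * 4.7 ~ 38 bits of guessing entropy, while an 8-letter English
# word at ~1 bit/symbol carries only about 8 bits.
```

At one bit per symbol, an attacker effectively faces only about 2^8 = 256 possibilities for an eight-letter word, versus roughly 2^38 for eight independent random letters, which is the point the post is making.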