Tuesday, October 22, 2013

Herbert Yardley: king of the whistleblowers

Herbert O. Yardley is America's archetypical spook whistleblower. He had successfully modernized America's code-breaking power as an Army Signal Corps lieutenant during and shortly after World War I. But the powers that be decided to force him out of his well-paid post as a high-caliber code-cracker.

Herbert Yardley's NSA biography
http://www.nsa.gov/public_info/_files/cryptologic_spectrum/many_lives.pdf

As an NSA history says, Yardley, "with no civil service status or retirement benefits, found himself unemployed just as the stock market was collapsing and the Great Depression beginning. He left Queens and returned to his hometown of Worthington, Indiana, where he began writing what was to become the most famous book in the history of cryptology. There had never been anything like it. In today's terms, it was as if an NSA employee had publicly revealed the complete communications intelligence operations of the Agency for the past twelve years-all its techniques and major successes, its organizational structure and budget-and had, for good measure, included actual intercepts, decrypts, and translations of the communications not only of our adversaries but of our allies as well.

"The American Black Chamber created a sensation when it appeared on 1 June 1931, preceded by excerpts in the Saturday Evening Post, the leading magazine of its time. The State Department, in the best tradition of 'Mission: Impossible,' promptly disavowed any knowledge of Yardley's activities."

Government officials, though angry, decided to do nothing. According to some accounts, Yardley then went to work for the Japanese. The Canadians hired him briefly at the onset of World War I but British intelligence insisted on his ouster.

Yardley went on to write a successful book on poker strategies.

Wednesday, August 28, 2013

Note on probability and periodicty

Draft 1

Please let me know of errors. My email address is conant78@gmail.com



By PAUL CONANT

We consider a binary string that, we assume, began specifically at the first observation.

If that string appears to follow a periodic pattern, a question often asked is whether that string was nonrandomly generated, that is, whether the probabilities for bit selection are independent.

One approach is the runs test, which matches the number of runs against a normal distribution of runs of length n. This is a very useful test, but it fails for

00110011

which has the mean number of runs for n = 8 but which one might suspect is not as likely to be random as would be an aperiodic string.

So what we want to know are the number of periods, which we calculate as 2(1 + 2C2 + 4C2). That subset's members are all unique permutations, each of which we consider to occur with equal probability, based on the provisional condition of independence of probability for each bit, which is put at 1/2.

So let us consider this test for nonrandomness on a string with length n, where n is a composite.

Let n = 8

and the string is

00110011

On length 8 we have the factors 1, 2, 4, 8. Now a string composed of all 0s or all 1s is certainly periodic. So we use the factor 1, along with factors 2 and 4. But we do not consider the factor 8 because a period of length 8 with no repetitions gives us no information about the probability of periodicity (there is no obvious periodic pattern).

So then the cardinality of the set of periods is:

2[1 + 2C2 + 4C2] = 16

which we divide by 2^8, or

8/2^7 = 1/16.

So our reasoning is that the probability of happening upon a randomly generated periodic 8-bit string is 1/16, or 6.25% in contrast to happening upon an 8-bit string of a specific permutation agreed upon in advance, which is 2^-8 or 1/256, or 0.39%. The probability of happening upon an aperiodic bit string is of course 15/16 or 93.75%. This all seems reasonable, where p(specific permutation) < p(having property of periodicity when the bit-length is composite) < p(having property of periodicity on that same string length).

So we argue that the probability of nonrandom influence is 93.75% 

The general formula for strings with remainder 0 is

[1 + (A_1)C2 + ... (A_m)C2]/2^(n-1)

Let's check n = 9.

[1 + 3C2]/2^8 = 1/64

A caveat: sometimes one permutation corresponds to more than one period. It will be found that that only occurs when the number of bits equals p^m, where p is some prime and m is a positive integer. We check the case of 8 bits. Here we find that

0000,0000    0101,0101    0011,0011

and their mirror images are the only strings that have one period superposed on another. That means we might wish to subtract 3 from our set of periodic strings, giving  [2(1 + 2C2 + 4C2) - 3]/2^8 = 5/2^6 = 0.078. However, as n increases we will be able to neglect this adjustment.

We have been discussing exact periodicity. Often, however, we are confronted by partial periodicity, such as this:

00100100100

So what we want to know is the probability that this is part of string 001001001001

which we obtain by 1/64(1/2) = 1/128 = 0.0078;

similarly for 0010010010

where we calculate 1/64(1/4) = 1/256 =  0.0039. This represents the probability that the string is part of a periodic string of length 12.

This probability is distinct from the probability of happening upon a periodic 12-bit string, which is:

(1 + 2C2 + 3C2 + 4C2)/2^11 = 11/2^11 = 0.00537.

Important points:

1. The periodicity probabilities change in accordance with the primes, which are not distributed smoothly.

2. As bit length n tends to infinity, the numerator 2(set of combinations of aliquot factors + 1) tends to 0 with respect to denominator 2^n. This means that with n sufficiently large, the difference between the probability of periodicity and the probability of a specific permutation are close enough to be viewed as identical.

Point 2 permits us to look at a specific string of bit length n >> 5, see that it is periodic or "near" periodic, and assign it a probability of about 2^-n. This is important because we are able to discern the probability of dependence by use of a number that is traditionally only used to predict a specific bit string.

A nicety here is that the ratio of primes to composites diminishes as bit length goes to infinity. For a prime, there are only aperiodic strings. That is, we have pC2/2^n. In the case of 11 bits, we have 55/2048 = 0.0269. So as n increases, the probability that a randomly selected number could be periodic goes up. This consideration does not affect the basic idea we have given.

(The formula of periodicity -- with no repetitions of the period -- for a prime is simply 1/2^(p-1).)

Of course periodicity isn't the only sort of pattern. One can use various algorithms -- say all 0's except at the (n^2)th bit -- to make patterns.

A simple pattern is mirror imaging, in which the string on either side of a midpoint or mid-space is a mirror of the other; that is, bits are reversed.

To wit:

001001.100100

How many mirror pairs are there? Answer 2^6. So the probability of  happening upon a mirror pair is 2^6/2^12 =  1/64 = 0.015625..