Steganography : Weaknesses in Current Software
----------------------------------------------------------------------------
A WORK IN PROGRESS : Please keep checking back
----------------------------------------------------------------------------

[ Paradigm | S-Tools | Jsteg ]

Paradigm

The goal of steganography is to add a message m to a carrier M in such a
way that it is infeasible for an attacker to tell, from the messages M and
M+m, which is which.

A secondary, oft-quoted, aim is that of deniability. ("Message? What
message? How did I know that was there?") In my opinion, this is most
easily achieved by satisfying the requirement given above.

The scenario to envisage is of an attacker with a statistical model of
typical messages M. Each message M' available to him is then tested
against this model to see whether it fits; if a sufficient number do not,
then steganography is suspected, and he is likely to launch a denial of
service attack, or even resort to rubber-hose cryptanalysis...

In a personal communication, Andy Brown (the author of S-Tools) states his
belief that steganography is only of use in not arousing suspicion in the
first place; once suspicion is aroused, all the steganographer can hope to
achieve is to make the message irretrievable without the correct key.
Personally, I believe that once suspicion is aroused, the steganographer
has lost, for to "make the message irretrievable without the correct key"
is the aim of conventional cryptography, which, for some reason, the
steganographer has avoided relying solely upon. And the theoretical basis
of conventional cryptography is much better understood than the snake-oil
of steganography...

At this stage, I would also like to differentiate between the
steganographic application and the algorithm upon which it is based.
Whilst the bells and whistles of a good application may encourage frequent
use, and even give peace of mind to non-experts, if the underlying
algorithm M+m is poor, then those same non-experts have been lulled into a
false sense of security. For while pre-processing steps such as
compression and encryption may be valuable tools, they cannot rescue a
weak algorithm.

S-Tools

8-Bit Images

I have a short program which investigates the palette P of palette-mapped
images. Amongst other things, it calculates cover(P,N), the number of
NxNxN boxes needed to cover all the palette entries in P. (In fact, the
general problem is NP-hard, so I actually calculate an upper bound.)

Clearly, we would expect cover(P,1)=|P|. If cover(P,1)<|P| then two
palette entries are identical; the only plausible purpose for this is
steganography. Now, although it does theoretically depend upon the image
and the quantisation algorithm, in practice every 8-bit (colour) image
without an injected message also has cover(P,2)=|P|.

Unfortunately, with images generated by S-Tools, cover(P,2) is very low.
For example, if |P|=256 (ie. a 256-colour image), then after injecting a
short message (1 Kb) into a typical image (800x600), cover(P,2)<50. For a
larger message (10 Kb) into the same image, cover(P,2)~=32. In fact,
knowing only cover(P,2) and the image size, it is even possible to
estimate the size of the injected message!

It is clear that the algorithm that S-Tools uses to inject messages into
8-bit images seriously distorts the statistical distribution of the image
from that which is expected, and in a way that can be easily detected.

Jsteg

In no particular order :

 * According to the documentation,

       The lowest-order bits of all non-zero frequency coefficients are
       replaced with successive bits from the steganographic source file,
       and these modified coefficients are sent to the Huffman coder.
   As a result, the message (and hence the added noise) is concentrated in
   the earlier blocks, i.e. the top of the image, rather than being spread
   evenly throughout.

 * Zero-valued coefficients are untouched, as for "typical" images and
   quantisation tables at least three quarters of the coefficients will be
   zero. Halving this proportion would be easily noticeable, particularly
   as later blocks would retain a "standard" distribution of values. As
   the number of non-zero coefficients is not known until compression has
   been performed, the only way to tell in general whether a given message
   will fit into a given image is to "suck it and see".

 * short inject(short inval)
   {
       short inbit;

       if ((inval & 0x1) != inval)      /* skip values 0 and 1       */
           if (use_inject) {
               if ((inbit = bitgetbit()) != -1) {
                   inval &= ~0x1;       /* clear the LSB...          */
                   inval |= inbit;      /* ...and set the message bit */
               } else
                   use_inject = 0;      /* message exhausted         */
           }
       return inval;
   }

   To avoid possible synchronisation problems, the code maps the
   following sets of coefficient values onto themselves :

       ... {-4,-3}, {-2,-1}, {+2,+3}, {+4,+5}, ...

   (One-valued, as well as zero-valued, coefficients are untouched.) Note
   that, unlike the distributions of "almost all" AC coefficients for
   "typical" images, these sets are not symmetric about zero. As a
   result, the coefficient distributions become skewed, resulting in
   files which are easily flagged as containing extra data. However, this
   problem is easily fixed by mapping different sets of coefficient
   values onto themselves :

       ... {-4,-3}, {-2,-1}, {+1,+2}, {+3,+4}, ...

 * The distribution of coefficients is roughly Laplacian. If we define a
   function N[X,Y] to be the number of times coefficient X has value Y,
   then for "typical" images I, and "almost all" interesting X,

       N[X,0] >> N[X,1] ~= N[X,-1] >> N[X,2] ~= N[X,-2] >> N[X,3] ...
   Unfortunately, if we assume for the moment that approximately half the
   message bits are zeroes and half are ones, then (even with the fix
   given above) an image I' with a hidden message has, for "almost all" X,

       N[X,0] = N'[X,0] >> N'[X,1] ~= N'[X,-1] ~= N'[X,2] ~= N'[X,-2] >> N'[X,3] ...

   i.e. there is a very noticeable step function in the earlier blocks,
   resulting in files which are easily flagged as containing extra data.

   Although it is possible to skew the distribution of message bits so
   that, for some coefficient A,

       N[A,y] = N'[A,y]   for y in {-2, -1, 0, +1, +2}

   the distribution would still be distorted for |y| > 2 since, for all X,

       N[X,1] / N[X,2] != N[X,2] / N[X,3] != ...

   Furthermore, as for every coefficient A there exists a coefficient B
   such that

       N[A,1] / N[A,2] != N[B,1] / N[B,2] != ...

   the distributions of the other coefficients would still be distorted
   in the centre.

 * According to the documentation,

       There is still a danger in that the sixth bit of the stream will
       always be one; this is solved by tacking an extra zero onto the
       beginning of the primary length field in half the cases.

   Surely a simpler solution is simply not to transmit this bit at all,
   resulting in headers which are, on average, 1.5 bits shorter.

Aside from minor implementation niggles, it is clear that the algorithm
that Jsteg uses to inject messages seriously distorts the statistical
distribution of the image from that which is expected, and in a way that
can be easily detected.

----------------------------------------------------------------------------
Bob Tinsley
tinsley@dcs.warwick.ac.uk