Colossus computer
Colossus was a set of computers developed by British codebreakers in the years 1943–1945 to help in the cryptanalysis of the Lorenz cipher. Colossus used thermionic valves to perform Boolean and counting operations. Colossus is regarded as the world's first programmable, electronic, digital computer, although it was programmed by switches and plugs and not by a stored program.
Colossus was designed by General Post Office research telephone engineer Tommy Flowers based on plans developed by mathematician Max Newman at the Government Code and Cypher School at Bletchley Park.
Alan Turing's use of probability in cryptanalysis contributed to its design. It has sometimes been erroneously stated that Turing designed Colossus to aid the cryptanalysis of the Enigma.
The prototype, Colossus Mark 1, was shown to be working in December 1943 and was in use at Bletchley Park by early 1944. An improved Colossus Mark 2 that used shift registers to run five times faster first worked on 1 June 1944, just in time for the Normandy landings on D-Day. Ten Colossi were in use by the end of the war and an eleventh was being commissioned. Bletchley Park's use of these machines allowed the Allies to obtain a vast amount of high-level military intelligence from intercepted radiotelegraphy messages between the German High Command and their army commands throughout occupied Europe.
The existence of the Colossus machines was kept secret until the mid-1970s. All but two machines were dismantled into such small parts that their use could not be inferred. The two retained machines were eventually dismantled in the 1960s. In January 2024, GCHQ released new photos that showed a re-engineered Colossus in a very different environment from the Bletchley Park buildings, presumably at GCHQ Cheltenham. A functioning reconstruction of a Mark 2 Colossus was completed in 2008 by Tony Sale and a team of volunteers; it is on display in The National Museum of Computing at Bletchley Park.
Purpose and origins
The Colossus computers were used to help decipher intercepted radio teleprinter messages that had been encrypted using an unknown device. Intelligence information revealed that the Germans called the wireless teleprinter transmission systems "Sägefisch". This led the British to call encrypted German teleprinter traffic "Fish", and the unknown machine and its intercepted messages "Tunny". Before the Germans increased the security of their operating procedures, British cryptanalysts diagnosed how the unseen machine functioned and built an imitation of it called "British Tunny".
It was deduced that the machine had twelve wheels and used a Vernam ciphering technique on message characters in the standard 5-bit ITA2 telegraph code. It did this by combining the plaintext characters with a stream of key characters using the XOR Boolean function to produce the ciphertext.
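The Vernam combination described above can be illustrated with a minimal sketch. The 5-bit values and the name `vernam` are hypothetical and chosen only for illustration; they are not real ITA2 traffic or Tunny key.

```python
def vernam(chars, key):
    """XOR each 5-bit character code with the matching key character."""
    if len(chars) != len(key):
        raise ValueError("text and key must be the same length")
    # Mask to 5 bits, the width of an ITA2 character.
    return [(c ^ k) & 0b11111 for c, k in zip(chars, key)]


# Hypothetical 5-bit codes standing in for plaintext and key characters.
plaintext = [0b00011, 0b11001, 0b01110, 0b10000]
key = [0b10101, 0b00111, 0b11100, 0b01001]

ciphertext = vernam(plaintext, key)
# XOR is its own inverse, so applying the same key again restores the plaintext.
recovered = vernam(ciphertext, key)
```

Because enciphering and deciphering are the same operation, a receiving machine set up identically to the sender recovers the message by generating the same key stream.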
In August 1941, a blunder by German operators led to the transmission of two versions of the same message with identical machine settings. These were intercepted and worked on at Bletchley Park. First, John Tiltman, a very talented GC&CS cryptanalyst, derived a keystream of almost 4000 characters. Then Bill Tutte, a newly arrived member of the Research Section, used this keystream to work out the logical structure of the Lorenz machine. He deduced that the twelve wheels consisted of two groups of five, which he named the χ (chi) and ψ (psi) wheels, and a remaining two, which he called μ or "motor" wheels. The chi wheels stepped regularly with each letter that was encrypted, while the psi wheels stepped irregularly, under the control of the motor wheels.
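Why two messages sent with identical settings are so damaging can be shown with a small sketch. It assumes the simplified Vernam model above; all values are hypothetical, and the manual work of separating the two plaintexts is not modelled.

```python
def xor_streams(a, b):
    """XOR two equal-length streams of 5-bit character codes."""
    return [x ^ y for x, y in zip(a, b)]


key = [0b10101, 0b00111, 0b11100, 0b01001]  # hypothetical keystream, used twice
p1 = [0b00011, 0b11001, 0b01110, 0b10000]   # first version of the message
p2 = [0b00011, 0b11001, 0b10000, 0b01110]   # second version, diverging at the third character

c1 = xor_streams(p1, key)
c2 = xor_streams(p2, key)

# XOR-ing the two ciphertexts cancels the shared key, leaving p1 XOR p2.
depth = xor_streams(c1, c2)
assert depth == xor_streams(p1, p2)

# Once one plaintext has been worked out, the keystream follows directly.
recovered_key = xor_streams(c1, p1)
```

The combined stream `depth` contains no key at all, so the problem reduces to finding two plausible plaintexts whose XOR matches it.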
With a sufficiently random keystream, a Vernam cipher hides the uneven character frequencies of natural-language plaintext and produces a uniform distribution in the ciphertext. The Tunny machine did this well. However, the cryptanalysts worked out that the frequency distribution of the character-to-character changes in the ciphertext, rather than of the characters themselves, departed from uniformity, and this provided a way into the system. The changes were obtained by "differencing", in which each bit or character was XOR-ed with its successor. After Germany surrendered, Allied forces captured a Tunny machine and discovered that it was the electromechanical Lorenz SZ in-line cipher machine.
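Differencing can be sketched in a few lines. The function name `delta` and the sample values are assumptions made for illustration. The sketch also checks a property that the method relies on: differencing a ciphertext gives the same result as XOR-ing the differenced plaintext with the differenced key.

```python
def delta(stream):
    """XOR each element with its successor; the result is one element shorter."""
    return [a ^ b for a, b in zip(stream, stream[1:])]


# Hypothetical 5-bit character codes.
plain = [0b00011, 0b00011, 0b01110, 0b10000, 0b10000]
key = [0b10101, 0b00111, 0b11100, 0b01001, 0b11010]
cipher = [p ^ k for p, k in zip(plain, key)]

# Differencing distributes over XOR: delta(P xor K) == delta(P) xor delta(K).
combined = [dp ^ dk for dp, dk in zip(delta(plain), delta(key))]
assert delta(cipher) == combined

# A repeated plaintext character differences to zero.
assert delta(plain)[0] == 0
```

Repeated characters in the plaintext therefore show up as zeros in the differenced plaintext, which is one source of the non-uniformity the cryptanalysts exploited.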
In order to decrypt the transmitted messages, two tasks had to be performed. The first was "wheel breaking", which was the discovery of the cam patterns for all the wheels. These patterns were set up on the Lorenz machine and then used for a fixed period of time for a succession of different messages. Each transmission, which often contained more than one message, was enciphered with a different start position of the wheels. Alan Turing invented a method of wheel-breaking that became known as Turingery. Turing's technique was further developed into "Rectangling", for which Colossus could produce tables for manual analysis. Colossi 2, 4, 6, 7 and 9 had a "gadget" to aid this process.
The second task was "wheel setting", which worked out the start positions of the wheels for a particular message and could only be attempted once the cam patterns were known. It was this task for which Colossus was initially designed. To discover the start position of the chi wheels for a message, Colossus compared two character streams, counting statistics from the evaluation of programmable Boolean functions. The two streams were the ciphertext, which was read at high speed from a paper tape, and the keystream, which was generated internally, in a simulation of the unknown German machine. After a succession of different Colossus runs to discover the likely chi-wheel settings, they were checked by examining the frequency distribution of the characters in the processed ciphertext. Colossus produced these frequency counts.
Decryption processes
By using differencing and knowing that the psi wheels did not advance with each character, Tutte worked out that trying just two differenced bits of the chi-stream against the differenced ciphertext would produce a statistic that was non-random. This became known as Tutte's "1+2 break in". It involved calculating the following Boolean function:
$$\Delta Z_1 \oplus \Delta Z_2 \oplus \Delta\chi_1 \oplus \Delta\chi_2$$
where Z is the ciphertext, Δ denotes differencing and the subscripts identify the impulse (bit), and counting the number of times it yielded "false". If this number exceeded a pre-defined threshold value known as the "set total", it was printed out. The cryptanalyst would examine the printout to determine which of the putative start positions was most likely to be the correct one for the chi-1 and chi-2 wheels.
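The counting procedure can be sketched on synthetic data. This is only an illustration under stated assumptions, not a model of Colossus itself. It takes the function to be the XOR of the differenced first two ciphertext impulses with the differenced chi-1 and chi-2 bits, as the description implies. The wheel lengths of 41 and 31 cams are known facts about the Lorenz machine but are not given in the text above. The cam patterns are random, the bias is much stronger than in real traffic, and all names are hypothetical.

```python
import random

CHI1_LEN, CHI2_LEN = 41, 31  # cam counts of chi-1 and chi-2 (not stated in the text above)


def delta_wheel(pattern):
    """Differenced cam pattern: each cam XOR-ed with the next, wrapping round the wheel."""
    n = len(pattern)
    return [pattern[i] ^ pattern[(i + 1) % n] for i in range(n)]


def count_false(dz12, d1, d2, s1, s2):
    """Count positions where dZ1 ^ dZ2 ^ dChi1 ^ dChi2 is 0 for trial starts s1, s2."""
    return sum(
        1
        for i, z in enumerate(dz12)
        if (z ^ d1[(s1 + i) % len(d1)] ^ d2[(s2 + i) % len(d2)]) == 0
    )


def one_plus_two(dz12, d1, d2, set_total):
    """Try every pair of start positions; keep those whose count exceeds the set total."""
    hits = {}
    for s1 in range(len(d1)):
        for s2 in range(len(d2)):
            count = count_false(dz12, d1, d2, s1, s2)
            if count > set_total:
                hits[(s1, s2)] = count
    return hits


# Synthetic test data: random cam patterns and an exaggerated bias (assumptions).
rng = random.Random(1944)
d1 = delta_wheel([rng.randint(0, 1) for _ in range(CHI1_LEN)])
d2 = delta_wheel([rng.randint(0, 1) for _ in range(CHI2_LEN)])
TRUE_START = (17, 5)
LENGTH = 1500
# dZ1 ^ dZ2 equals the differenced chi bits, flipped 25% of the time by the
# remaining (psi and plaintext) contribution.
dz12 = [
    d1[(TRUE_START[0] + i) % CHI1_LEN]
    ^ d2[(TRUE_START[1] + i) % CHI2_LEN]
    ^ (1 if rng.random() < 0.25 else 0)
    for i in range(LENGTH)
]

hits = one_plus_two(dz12, d1, d2, set_total=1000)
best = max(hits, key=hits.get)
```

The search covers 41 × 31 = 1271 pairs of start positions. In this sketch the dictionary `hits` plays the role of the printout, and `best` is the setting a cryptanalyst would most likely pick from it.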
This technique would then be applied to other pairs of, or single, impulses to determine the likely start position of all five chi wheels. From this, the de-chi of a ciphertext could be obtained, from which the psi component could be removed by manual methods. If the frequency distribution of characters in the de-chi version of the ciphertext was within certain bounds, "wheel setting" of the chi wheels was considered to have been achieved, and the message settings and de-chi were passed to the "Testery". This was the section at Bletchley Park led by Major Ralph Tester where the bulk of the decrypting work was done by manual and linguistic methods.
Colossus could also derive the start position of the psi and motor wheels. Using this additional capability regularly became feasible in the last few months of the war, when there were plenty of Colossi available and the number of Tunny messages had declined.
Design and construction
Colossus was developed for the "Newmanry", the section headed by the mathematician Max Newman that was responsible for machine methods against the twelve-rotor Lorenz SZ40/42 on-line teleprinter cipher machine. The Colossus design arose out of a parallel project that produced a less-ambitious counting machine dubbed "Heath Robinson". Although the Heath Robinson machine proved the concept of machine analysis for this part of the process, it had serious limitations. The electro-mechanical parts were relatively slow and it was difficult to synchronise two looped paper tapes, one containing the enciphered message, and the other representing part of the keystream of the Lorenz machine. Also the tapes tended to stretch and break when being read at up to 2000 characters per second.
Figure: Stepping switch said to be from an original Colossus, presented by the Director of GCHQ to the Director of the NSA to mark the 40th anniversary of the UKUSA Agreement in 1986.
Tommy Flowers MBE was a senior electrical engineer and Head of the Switching Group at the Post Office Research Station at Dollis Hill. Prior to his work on Colossus, he had been involved with GC&CS at Bletchley Park from February 1941 in an attempt to improve the Bombes that were used in the cryptanalysis of the German Enigma cipher machine. He was recommended to Max Newman by Alan Turing, who had been impressed by his work on the Bombes.
The main components of the Heath Robinson machine were as follows.
- A tape transport and reading mechanism that ran the looped key and message tapes at between 1000 and 2000 characters per second.
- A combining unit that implemented the logic of Tutte's method.
- A counting unit that had been designed by C. E. Wynn-Williams of the Telecommunications Research Establishment at Malvern, which counted the number of times the logical function returned a specified truth value.
Flowers and his team of some fifty people in the switching group spent eleven months from early February 1943 designing and building a machine that dispensed with the second tape of the Heath Robinson by generating the wheel patterns electronically. Flowers used some of his own money for the project. This prototype, Mark 1 Colossus, contained 1,600 thermionic valves. It performed satisfactorily at Dollis Hill on 8 December 1943 and was dismantled and shipped to Bletchley Park, where it was delivered on 18 January 1944 and re-assembled by Harry Fensom and Don Horwood. It was operational in January and it successfully attacked its first message on 5 February 1944. It was a large structure and was dubbed 'Colossus'. A memo held in the National Archives written by Max Newman on 18 January 1944 records that "Colossus arrives today".
During the development of the prototype, an improved design had been developed – the Mark 2 Colossus. Four of these were ordered in March 1944 and by the end of April the number on order had been increased to twelve. Dollis Hill was put under pressure to have the first of these working by 1 June. Allen Coombs took over leadership of the production Mark 2 Colossi, the first of which – containing 2,400 valves – became operational at 08:00 on 1 June 1944, just in time for the Allied Invasion of Normandy on D-Day. Subsequently, Colossi were delivered at the rate of about one a month. By the time of V-E Day there were ten Colossi working at Bletchley Park and a start had been made on assembling an eleventh. Seven of the Colossi were used for 'wheel setting' and three for 'wheel breaking'.
Figure: Colossus 10 with its extended bedstead in Block H at Bletchley Park, in the space now containing the Tunny gallery of The National Museum of Computing.
The main units of the Mark 2 design were as follows.
- A tape transport with an 8-photocell reading mechanism.
- A six-character FIFO shift register.
- Twelve thyratron ring stores that simulated the Lorenz machine generating a bit-stream for each wheel.
- Panels of switches for specifying the program and the "set total".
- A set of functional units that performed Boolean operations.
- A "span counter" that could suspend counting for part of the tape.
- A master control that handled clocking, start and stop signals, counter readout and printing.
- Five electronic counters.
- An electric typewriter.
Data input to Colossus was by photoelectric reading of a paper tape transcription of the enciphered intercepted message. The tape was arranged in a continuous loop so that it could be read and re-read multiple times, there being no internal storage for the data. The design overcame the problem of synchronising the electronics with the speed of the message tape by generating a clock signal from reading its sprocket holes. The speed of operation was thus limited by the mechanics of reading the tape. During development, the tape reader was tested up to 9700 characters per second before the tape disintegrated, so 5000 characters per second was settled on as the speed for regular use. Flowers designed a 6-character shift register, which was used both for computing the delta function and for testing five different possible starting points of Tunny's wheels in the five processors. This five-way parallelism enabled five simultaneous tests and counts to be performed, giving an effective processing speed of 25,000 characters per second. The computation used algorithms devised by W. T. Tutte and colleagues to decrypt a Tunny message.
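A rough software analogue of the six-character shift register shows where the five-way parallelism comes from: six consecutive characters contain five adjacent pairs, and each pair yields one differenced value. How Colossus actually routed these values to its five processors is not described above, so the mapping in this sketch is an assumption, as are all the names.

```python
from collections import deque

TAPE_SPEED = 5000  # characters per second, from the text
PROCESSORS = 5     # parallel tests, from the text


def five_way_deltas(tape, width=6):
    """Slide a 6-character window along the tape and yield its 5 adjacent-pair deltas."""
    window = deque(maxlen=width)  # the oldest character drops out as each new one arrives
    for ch in tape:
        window.append(ch)
        if len(window) == width:
            held = list(window)
            yield [held[i] ^ held[i + 1] for i in range(width - 1)]


# Hypothetical tape of small integers standing in for 5-bit characters.
steps = list(five_way_deltas([1, 2, 3, 4, 5, 6, 7]))

# Five tests per character read gives the effective rate quoted in the text.
effective_speed = TAPE_SPEED * PROCESSORS
```

Each tape step hands five differenced values to five counters at once, which is why a tape running at 5000 characters per second gives an effective rate of 5 × 5000 = 25,000 characters per second.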