Intel 4004
The Intel 4004, released by the Intel Corporation on November 15, 1971, was the first in a long line of Intel central processing units. The chip marked both a technological and economic milestone in computing.
The 4-bit 4004 CPU was the first significant commercial example of large-scale integration, using the abilities of the MOS silicon gate technology (SGT) to integrate the CPU into a single chip. Compared to the existing technology, SGT enabled twice the transistor density and five times the operating speed, making future single-chip CPUs feasible. The MCS-4 chipset design, of which the 4004 was a part, served as a model for how to use SGT for complex logic and memory circuits, accelerating the adoption of SGT by the world's semiconductor industry.
The project originated in 1969 when Busicom Corp. commissioned Intel to design a family of seven chips for electronic calculators, including a three-chip CPU. Busicom initially envisioned using shift registers for data storage and ROM for instructions. Intel engineer Marcian Hoff proposed a simpler architecture based on data stored on RAM, making a single-chip CPU possible. Design work, led by Federico Faggin with contributions from Masatoshi Shima, began in April 1970. The first fully operational 4004 was delivered in March 1971 for Busicom's 141-PF printing calculator prototype, now housed at the Computer History Museum. General sales began in July 1971.
Faggin had developed SGT, a method of using polysilicon instead of metal for transistor gates, at Fairchild Semiconductor, where he used it to create the Fairchild 3708, the first commercially produced SGT integrated circuit. At Intel, he used SGT to achieve the integration required for the 4004. Additionally, he developed the "bootstrap load," previously considered unfeasible with silicon gate technology, and the "buried contact," which enabled silicon gates to connect directly to the transistor's source and drain without the use of metal. Together, these innovations doubled the circuit density, and thus halved the cost, allowing a single chip to contain 2,300 transistors and run five times faster than designs using the previous MOS technology with aluminum gates.
The 4004's architecture laid the foundation for subsequent Intel processors, including the improved Intel 4040, released in 1974, and the 8-bit Intel 8008 and 8080.
History
Original concept
In April 1969, Busicom approached Intel and asked them to produce a chip set to handle the operations for an electronic calculator. The partnership was facilitated in part by Sharp engineer Tadashi Sasaki, who had been dreaming for decades of miniaturizing computers to the point where they could fit into a pocket. Sasaki monitored the trend of increasing integrated circuit complexity throughout the 1960s, and calculated in 1968 that by 1970 it should be affordable to build a calculator out of just two chips. Sasaki discussed his ideas for low-cost, low-chip-count processor designs with his college classmate Yoshio Kojima, who was president of Busicom, as well as with Robert Noyce of Intel. Busicom was interested in building a lower-cost competitor to the 1965 Olivetti Programma 101, one of the world's first tabletop programmable calculators. The key difference was that the Busicom design would use integrated circuits to replace the printed circuit boards filled with individual components, and solid-state shift registers for memory instead of the costly magnetostriction wire in the 101. Busicom also realized that they could launch a general-purpose processor in a low-end desktop printing calculator, and then use the same design for other equipment like cash registers and automatic teller machines. The company had already produced a calculator using TTL small-scale integration logic ICs and was interested in having Intel reduce the chip count using Intel's medium-scale integration techniques.
Intel assigned the recently hired Marcian Hoff, employee number 12, to act as the liaison between the two companies. In late June, three engineers from Busicom, Masatoshi Shima and his colleagues Masuda and Takayama, traveled to Intel to introduce the design. Although he had only been assigned to liaise with the engineers, Hoff began studying the concept.
Their initial proposal had seven ICs: program control, arithmetic unit, timing, program ROM, shift registers for temporary memory, printer controller and input/output control.
Hoff became concerned that the number of chips and the required interconnections between them would make Busicom's price goals impossible to meet. Combining the chips would reduce the complexity and cost. He was also concerned that the still-small Intel would not have enough design staff to make seven separate chips at the same time. He raised these concerns with upper management, and Bob Noyce, the CEO, told Hoff he would support a different approach if it seemed feasible.
Simplified design
A key concept in the Busicom design was that the program control and ALU were not aimed specifically at the calculator market; it was the program in ROM that turned the design into a calculator. The original idea was that the company could use the same chips with different amounts of shift-register RAM and program ROM to produce a range of calculating machines. Hoff was struck by how closely the Busicom design's instruction set architecture matched that of general-purpose computers. He began to consider whether a truly general-purpose processor could be made cheaply enough to be used in a calculator. When later asked where he got the ideas for the architecture of the first microprocessor, Hoff related that Plessey, "a British tractor company", had donated a minicomputer to Stanford, and he had "played with it some" while he was there.
Another development that made this design practical was Intel's work on the earliest dynamic RAM chips. Shift registers at that time were among the only low-cost read and write memory devices. However, shift register memory is not suited for random access, as each access must wait for the desired bit to flow through the chain. DRAM, on the other hand, can be accessed randomly, and the three-transistor DRAM cell saves silicon area compared to the six-transistor shift-register cell.
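The access-time difference between the two memory types can be illustrated with a toy model. The following Python sketch is an assumption-laden illustration, not a description of any Busicom or Intel circuit: the class names and the "shifts" cost measure are hypothetical, and the shift register is modeled as a simple recirculating loop with one output tap.

```python
class ShiftRegisterMemory:
    """Toy recirculating shift register: only the bit at the tap is readable."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.head = 0  # index of the bit currently sitting at the output tap

    def read(self, index):
        """Return (bit, shifts): the bit and how many clock shifts it cost."""
        n = len(self.bits)
        shifts = (index - self.head) % n  # wait for the bit to circulate round
        self.head = index
        return self.bits[index], shifts


class RandomAccessMemory:
    """Toy RAM: any cell is addressed directly, with no circulation delay."""

    def __init__(self, bits):
        self.bits = list(bits)

    def read(self, index):
        return self.bits[index], 0  # zero shifts: the address selects the cell


if __name__ == "__main__":
    data = [0, 1, 1, 0, 1, 0, 0, 1]
    sr = ShiftRegisterMemory(data)
    ram = RandomAccessMemory(data)
    for index in (5, 2, 7):
        print(index, sr.read(index), ram.read(index))
```

In this model the cost of a shift-register read depends on where the previous read left the loop, whereas every RAM read costs the same, which is the property that made a stored-program design practical.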
Finally, Hoff noticed that much of the complexity of the program control chip was due to every instruction being implemented separately. He suggested that the chip instead support subroutine calls and that instructions be implemented as subroutines where possible. The application naturally suggested a 4-bit design, as this allowed for direct manipulation of binary-coded decimal values used by calculators. Hoff worked on the overall design concept through July and August 1969 but found that the Busicom executives seemed uninterested in his proposal. Intel needed a more efficient design if Busicom was to accept its proposal for the 141-PF calculator. Its engineers began to conceptualize a general-purpose microprocessor that could be given instructions and return their results, and that would merge all of the CPU functions of a computer. Later, in the fall of that year, Intel's engineers proposed a new design of just four chips, including one that could be programmed for use; the programmable chip would end up becoming the 4004 microprocessor.
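The fit between a 4-bit word and binary-coded decimal can be shown with a short Python sketch. It is an illustration of the general BCD technique only, not the 4004's actual instruction sequence; the function names and the least-significant-digit-first storage order are assumptions made for the example.

```python
def bcd_add_digit(a, b, carry_in=0):
    """Add two BCD digits (0-9), each fitting in 4 bits.

    Returns (digit, carry_out).
    """
    total = a + b + carry_in          # plain binary sum, range 0..19
    if total > 9:                     # result is not a valid decimal digit
        total += 6                    # skip the six unused 4-bit codes 10..15
        return total & 0xF, 1         # keep the low 4 bits, carry into next digit
    return total, 0


def bcd_add(x_digits, y_digits):
    """Add two equal-length numbers stored least-significant digit first."""
    result, carry = [], 0
    for a, b in zip(x_digits, y_digits):
        digit, carry = bcd_add_digit(a, b, carry)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


if __name__ == "__main__":
    # 47 + 85 = 132, with digits stored least-significant first
    print(bcd_add([7, 4], [5, 8]))  # [2, 3, 1]
```

Each decimal digit occupies exactly one 4-bit word, so a multi-digit addition is a loop over digits with a carry, which is why a calculator workload maps naturally onto a 4-bit data path.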
Mazor joins
Unknown to Hoff, the Busicom team were extremely interested in his proposal. However, there were a number of specific issues that they were concerned about. One key issue was that certain routines like decimal adjust and keyboard handling would use large amounts of ROM space if implemented as subroutines. Another was that the design did not feature any sort of interrupt, so dealing with real-time events would be difficult. Finally, storing the numbers as 4-bit BCD would require additional memory to store the sign and decimal place.
In September 1969, Stanley Mazor joined Intel from Fairchild. Hoff and Mazor quickly came up with solutions to the Busicom concerns. To address the complexity of the subroutines, originally solved in Busicom's design using one-byte macroinstructions and complex decoder circuitry, Mazor developed a 20-byte-long interpreter that executed the same macroinstructions. Shima suggested adding a new interrupt that would be triggered by a pin so that the keyboard could be interrupt driven. He also modified the Branch Back instruction to clear the accumulator.
To reach the price goals, it was important that the chip be as small as possible and use the fewest leads. As data was 4 bits and the address space was 12 bits, there was no way direct access could be arranged with anything fewer than about 24 pins. This was not small enough, so the design would use a 16-pin dual in-line package layout and multiplex a single set of 4 lines. This meant that specifying which address in ROM to access required three clock cycles, and reading the instruction from memory required another two. Running at 1 MHz it would perform math on the BCD values at about 80 microseconds per digit.
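The multiplexing scheme can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the ROM is modeled as a dictionary, the instruction is taken to be 8 bits wide, and the nibble order (low address nibble first, high instruction nibble first) is chosen for the example rather than taken from the text above.

```python
def address_to_nibbles(address):
    """Split a 12-bit address into three 4-bit nibbles, low nibble first."""
    if not 0 <= address < (1 << 12):
        raise ValueError("address must fit in 12 bits")
    return [(address >> shift) & 0xF for shift in (0, 4, 8)]


def nibbles_to_byte(high, low):
    """Reassemble an 8-bit value from two 4-bit transfers."""
    return ((high & 0xF) << 4) | (low & 0xF)


def fetch(rom, address):
    """Model one fetch over a shared 4-line bus.

    Returns (instruction_byte, bus_cycles), where bus_cycles lists every
    4-bit transfer in order: three address cycles, then two data cycles.
    """
    bus_cycles = [("addr", nibble) for nibble in address_to_nibbles(address)]
    byte = rom[address]
    high, low = byte >> 4, byte & 0xF
    bus_cycles += [("data", high), ("data", low)]
    return nibbles_to_byte(high, low), bus_cycles


if __name__ == "__main__":
    rom = {0xABC: 0xD5}  # hypothetical ROM contents
    instruction, cycles = fetch(rom, 0xABC)
    print(hex(instruction), len(cycles))  # 0xd5 5
```

The trade-off is visible in the cycle list: sharing four lines keeps the pin count low, but every fetch pays five bus transfers before the instruction can be executed.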
The result of the discussions between Intel and Busicom was an architecture that reduced the 7-chip Busicom design to a 4-chip Intel proposal composed of CPU, ROM, RAM and I/O devices. The proposal was presented to a visiting team of Busicom executives in October 1969. They agreed that the new concept was superior and gave Intel the go-ahead to begin development. Hoff was upset to learn that the contract assigned all rights to the design to Busicom, in spite of it being designed entirely within Intel. The team then left for Japan, but Shima remained in California until December, developing many of the subroutines.
Faggin joins
Neither Hoff nor Mazor, who worked in the Applications Research group, had experience designing the actual silicon, and the design group was already overworked with the development of memory devices. In April 1970, Leslie Vadász, who ran the MOS design group, hired Federico Faggin from Fairchild Semiconductor to take over the project. Faggin had already made a name for himself by leading the entire development of the MOS silicon gate technology and the design of the first commercial integrated circuit made with it. The new technology was going to change the entire semiconductor market.
Integrated circuits consist of a number of individual components like transistors and resistors that are produced by mixing the underlying silicon with "dopants". This is normally accomplished by heating the chip in the presence of a chemical gas, which diffuses into the surface. Previously, the individual components were connected together to make a circuit using aluminum wires deposited on the surface. As aluminum melts at about 660 °C, far below silicon's melting point of about 1,414 °C, the traces typically had to be deposited as the last step, which often complicated the production cycle.
In 1967, Bell Labs released a paper about making MOS transistors with self-aligned gates made of silicon rather than metal. These devices, however, were a proof-of-concept and could not be used to make ICs. Faggin and Tom Klein had taken what was a curiosity and developed the entire process technology needed to fabricate reliable ICs. Faggin also designed and produced the Fairchild 3708, the first IC made with SGT, first sold at the end of 1968, and featured on the cover of Electronics in September 1969. The silicon gate technology also reduced the leakage current by more than 100 times, making possible sophisticated dynamic circuits like DRAMs. Also, the highly doped silicon used for the gates could be used for the interconnections, and this greatly improved the circuit density of random-logic ICs like microprocessors.
This technique meant the interconnections could be made at any time in the process. More importantly, the wiring was deposited using the same equipment that made the rest of the components. This meant that the slight differences in layout between different machine types were eliminated. Previously the interconnects had to be much larger than required in order to ensure the aluminum touched the silicon components, which would be offset due to inaccuracies in the machinery. With this issue eliminated, the circuits could be placed much closer together, immediately doubling the density of the components and reducing their cost by the same amount. Additionally, the aluminum wiring acted as parasitic capacitors which limited the signal speed; without these parasitics the chips could run at faster speeds.
At Intel, Faggin began designing the new processor using this self-aligned gate process. Only days after Faggin joined Intel, Shima arrived from Japan. He was disappointed to learn that the project had stalled since he'd left in December, and expressed concern that his original schedule was now impossible. Faggin responded by working well into the night every day, in workweeks that spanned 70 to 80 hours, and Shima stayed on for another six months to help. Additional advances were needed to reach the required circuit density. One of these advances was the use of "buried contacts" to connect the wires directly to the components. Another was working out how to make "bootstrap loads" with silicon gate as part of one of the masking steps, eliminating one step from the processing. Without these two innovations by Faggin, Hoff's architecture could not have been realized in a single chip.