What made the 1960s CDC6600 supercomputer fast?

Anybody who has ever taken an advanced computer architecture class has heard of the CDC6600, which was the world’s fastest computer from 1964 to 1969. It was the machine that put Seymour Cray on the map as a supercomputer architect. The design of the machine is well documented in a book by James Thornton, the lead designer, and is therefore publicly accessible. Among several architectural concepts that later found use in RISC, the CDC6600 is known for introducing the scoreboard, which is, along with Tomasulo’s algorithm, one of the earliest concepts for out-of-order processing.

Besides the architectural progress, the CDC6600 was impressive for its clock speed of 10 MHz. This may not sound like much, but consider that this was a physically very large machine built entirely from discrete resistors and transistors in the early 1960s. Not a single integrated circuit was involved. For comparison, the PDP-8, released in 1965 and also based on discrete logic, had a clock speed of 1.5 MHz. The first IBM PC, released almost 20 years later, was clocked at less than half the speed of the CDC6600 despite being based on integrated circuits. The high clock rate is even more impressive when compared to more recent (hobbyist) attempts to design CPUs with discrete components, such as the MT15, the Megaprocessor or the Monster6502. Although these are comparatively small designs based on modern components, none of them reaches even a tenth of the CDC6600 clock speed.

How did they do it?

The logic circuitry of the CDC6600 is based on resistor-transistor logic (RTL). The Thornton book calls it DCTL (Directly Coupled Transistor Logic), but due to the presence of the base resistor it is obviously the same as RTL. Thornton himself refers to RTL in a later publication.

The image above shows the basic logic element of the CDC6600, an inverter.

The outputs of two or more inverters can be connected to form a wired AND, as shown on the left. Two inversions plus an AND implement a NOR2 function in positive boolean logic.

Thornton uses an unusual notation for logic circuits, as shown above on the right. Each arrow corresponds to an inverter (base resistor plus transistor), and each square or circle corresponds to a collector resistor. The circles and squares indicate whether the current node is to be interpreted in positive (circle) or inverted (square) logic. The circuit is exactly the same for both. This “mental gyration” takes a while to get used to, but helps to cope with the issue that every connection is inverting.
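The wired-AND construction can be checked with a few lines of Python. This is only a boolean sketch of the idea, not a circuit model: the function names are made up for illustration, and the shared collector node is assumed to be high only when none of the connected transistors pulls it low.

```python
from itertools import product


def inverter(x: bool) -> bool:
    """One RTL stage (base resistor plus transistor): output is the inverted input."""
    return not x


def wired_and(*collector_outputs: bool) -> bool:
    """Collectors tied to one shared collector resistor.

    The node is high only if every individual stage would output high,
    because any conducting transistor pulls the shared node low.
    """
    return all(collector_outputs)


def nor2(a: bool, b: bool) -> bool:
    """Two inverters with their outputs wired together."""
    return wired_and(inverter(a), inverter(b))


def truth_table():
    """All input combinations with the resulting output."""
    return [(a, b, nor2(a, b)) for a, b in product([False, True], repeat=2)]


if __name__ == "__main__":
    for a, b, y in truth_table():
        print(int(a), int(b), "->", int(y))
```

The output is high only for the input combination 0, 0, which is the NOR2 truth table (De Morgan: NOT a AND NOT b equals NOT (a OR b)).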

Every module in the CDC6600 is constructed from a multitude of basic inverters with a single transistor each. This also includes latches and registers, which are fully static. There are no stacked transistors, no floating nodes, no diode logic, no pulse latches or other specialties. As an example, one slice of a boolean unit from the book is shown above.

This approach is quite ingenious in its simplicity and should be the base of every clean architecture: optimize one basic thing very well, replicate it, and use it as a hierarchical building block. This is in stark contrast to other contemporary designs like the PDP-8, which tried to use every circuit trick there is to reduce the number of (expensive) transistors.

So, why is it fast? The table above lists the operating point of one inverter. Voltage levels are adjusted by picking the right values for base and collector resistors. The switching level is probably around 0.7 V, so the high and low levels are adjusted for symmetry around this point. Aside from this, there is really nothing special in the circuit. However, the switching time is still 5 ns, compared to 50 ns for the MT15. Clearly, as the book states: [the logic] “…is heavily dependent on the transistor characteristics for its operation”.

The transistor is key

While there is not too much specific information about the CDC6600 transistors in Thornton’s book, there is an interesting article on the Computer History Museum website about the development of a special silicon transistor for the CDC6600 in 1961. The transistor in question is the 2N709 and was developed by Jean Hoerni at Fairchild. Fairchild was the company that put “Silicon” into “Silicon Valley” and was a fairly recent startup in 1961. Jean Hoerni’s claim to fame is the invention of the planar process, which was the key enabler for integrated circuits.

The key to the high-speed transistor he developed was to introduce a tiny amount of gold doping into the base area of the transistor (see patent and also patent notebook). Gold is usually a highly undesirable contaminant in semiconductors because it leads to rapid minority carrier recombination. In this case, it is added on purpose to avoid the buildup of charge in the transistor while it is in saturation. This so-called saturation charge is what makes turning off bipolar transistors slow, because the charge needs to diffuse out of the base, which is a slow process. There are numerous circuit tricks to reduce the buildup of saturation charge, the most famous being the Baker clamp. However, by using a transistor that intrinsically does not build up a high saturation charge, all of these complexities can be omitted while achieving even better performance.

Curiously, there do not seem to be many fast bipolar switching transistors around anymore. The 2N709 is, of course, long out of production. Some old reference tables list the AF66, BSX19, BSY18 and the 2N2369 as replacements. All of these are out of production as well. However, the 2N2369 lives on as an SMD version in the form of the PMBT2369 (Nexperia) or MMBT2369 (Onsemi). Nexperia seems to be the only company that lists switching times for bipolar transistors in its product selector. The selector lists one additional fast switching transistor, the BSV52, but its characteristics are very similar to those of the PMBT2369.

Putting things to practice

Can the CDC6600 logic style be replicated with modern components? It’s best to start a new design with a simulation. You can find the design of an RTL inverter based around the PMBT2369 above. It uses Nexperia’s transistor model. I tested the propagation delay of the inverter by simulating a ring oscillator consisting of five inverters in LTspice. As you can see, a propagation delay of 6.1 ns was observed despite a relatively moderate collector current setting of 2 mA, a fifth of that used in the CDC6600.

I built the same circuit on a PCB to test the ring oscillators under real-world conditions. I tested three different transistor types: the BC847C, the MMBTH10-4 and the PMBT2369.

The BC847C was used as a reference. Using the lower-gain A or B versions would have increased switching speed by reducing the Miller effect, but the improvement would most likely not have changed the outcome. The MMBTH10-4 is a common high-frequency transistor with ft=800 MHz and a low collector capacitance. At first glance it looks like it should beat the PMBT2369 in switching speed, but keep in mind that “high-frequency” only refers to small-signal excitation around an operating point. To optimize for this, it is necessary to reduce capacitances and base transit time, not saturation charge. Finally, the PMBT2369 is a device optimized for high switching speeds which employs techniques to eliminate the buildup of saturation charge. (I don’t know if it also uses gold diffusion or a more modern approach based on ion implantation.) The 2N709 used in the CDC6600 has slightly lower collector capacitance than the PMBT2369 and may be even faster. It is debatable whether the higher parasitics of the ancient 2N709 TO-can package, compared to a modern SMD package, would have cancelled out some of this benefit.

The diagrams above show the variation of oscillator frequency with supply voltage and the gate propagation delay calculated from the frequency.

The high-frequency transistor is the fastest device at the lowest end of operating voltages. This is probably due to its low terminal capacitances. Increasing the supply voltage increases the oscillator frequency for all devices. However, the MMBTH10-4 and BC847C exhibit a maximum at around 2 V. For higher supply voltages, the frequency starts to decrease again. This is most likely due to an increase of base current and hence an increase of saturation charge, which makes turning the transistor off slow. The PMBT2369-based ring oscillator shows a continuous increase in frequency, suggesting that the saturation charge does not dominate the switching speed.

The measured frequency at 5 V is 17.7 MHz, which is surprisingly close to the 16.5 MHz from the SPICE simulation. Kudos to Nexperia for providing proper large-signal SPICE models!
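For reference, the conversion between ring-oscillator frequency and gate delay can be sketched as follows. In a ring of N inverters an edge has to travel around the loop twice per period, so f = 1/(2·N·t_pd). The sketch assumes that all stages have the same delay and that rising and falling delays are lumped into one average value; the function names are illustrative.

```python
def ring_frequency_hz(n_stages: int, t_pd_s: float) -> float:
    """Oscillation frequency of a ring of n_stages identical inverters."""
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2 * n_stages * t_pd_s)


def gate_delay_s(n_stages: int, f_hz: float) -> float:
    """Average propagation delay per inverter, derived from the ring frequency."""
    if n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2 * n_stages * f_hz)


if __name__ == "__main__":
    N = 5  # five inverters, as in the simulated and measured rings
    for label, f_mhz in (("simulated", 16.5), ("measured", 17.7)):
        t_pd_ns = gate_delay_s(N, f_mhz * 1e6) * 1e9
        print(f"{label}: {f_mhz} MHz -> {t_pd_ns:.2f} ns per gate")
```

With five stages, 16.5 MHz corresponds to about 6.1 ns per gate and 17.7 MHz to about 5.6 ns per gate.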

Summary

What do we learn from this? Choosing the right transistor is key to building 1960s-style high-speed discrete RTL circuits. Forget all the “clever” tricks like the Baker clamp, feedthrough capacitor or bleeding resistor. They just add to the component count without addressing the root cause.

Unfortunately, true bipolar high-speed switching transistors seem to be a dying breed, with the MMBT/PMBT2369 being (almost) the last of its kind.

Let me know if you find similar devices. I also started looking into “digital” or “pre-biased” switching transistors, but it seems they do not come close to the PMBT2369. It is also possible that more modern high-frequency transistors switch faster than the MMBTH10-4 due to a thinner base region.

Addendum

It appears that Fairchild did indeed replace the 2N709 with the 2N2369, as you can see in their 1985 discrete databook. So the PMBT2369 is the rightful SMD-packaged heir to the original 2N709 used in the CDC6600. The PMBT2369 seems to be faster than the MMBT2369. Apparently there is only one choice…

Addendum 2

Some more fine-tuning of the PMBT2369 circuit shows that a minimum propagation delay of 3.5 ns is achieved at approximately 10 mA supply current per gate. This coincides exactly with the CDC6600 operating conditions and is a clue to the enormous power requirements of fast RTL technology: 30 mW per gate on average!
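To put these numbers into perspective, here is a rough back-of-the-envelope sketch. The power-delay product uses only the two figures above (30 mW and 3.5 ns). The gate count in the second calculation is an assumption that is not taken from this article: roughly 400,000 transistors is a commonly cited figure for the CDC6600, and the sketch simply treats each of them as one inverter stage.

```python
def switching_energy_pj(p_gate_w: float, t_pd_s: float) -> float:
    """Power-delay product of one gate in picojoules."""
    return p_gate_w * t_pd_s * 1e12


def total_power_w(n_gates: int, p_gate_w: float) -> float:
    """Total logic power if every gate dissipates p_gate_w on average."""
    return n_gates * p_gate_w


if __name__ == "__main__":
    P_GATE_W = 30e-3   # average power per gate, from the measurement above
    T_PD_S = 3.5e-9    # minimum propagation delay, from the measurement above
    N_GATES = 400_000  # ASSUMPTION: commonly cited transistor count, one per gate

    print(f"power-delay product: {switching_energy_pj(P_GATE_W, T_PD_S):.0f} pJ")
    print(f"logic power for {N_GATES} gates: {total_power_w(N_GATES, P_GATE_W) / 1e3:.0f} kW")
```

Under these assumptions the power-delay product is about 105 pJ per gate, and the logic alone would dissipate on the order of 12 kW.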

24 thoughts on “What made the 1960s CDC6600 supercomputer fast?”

  1. Great article. I used the Cyber 205, CDC 6600 and CDC 6700, VPS-32, then went on to the Cray-1, Cray-2, X-MP, and Y-MP. But it was the 6600 that cemented the success of scientific computing. In addition to the circuit innovations, the OS and documentation were second to none. When I left, we were approaching exascale, but it was Cray’s FMA and later vector processing, combined with parallel computing, that carried compute forward.

    1. Thank you for your comment. Seems like you really worked with all the relevant machines from the supercomputer age.
      Indeed, the circuit and architecture innovations are only a small part of what made the CDC6600 stand out.

  2. Bruce Sherry, who led the CDC 6500 restoration at the LCM+L, mentioned that he uses 2n2369 for replacement transistors and mmbt2369 for use in replacement modules. He just orders them from Digi-Key.

    1. Thank you, that’s quite interesting. So it seems that the 2369 is a worthy successor to the 709. On the other hand, one would have hoped that there was some progress in low power bipolar switching transistors in 50 years? Apparently not as much as one would think…

  3. I remember at the U of A they had one of the CDC computers, I don’t remember which one, maybe a CDC 6500.

    One of the bits was getting to a register late. The solution? Cut a foot off the length of the wire going to the register. This meant that, because of speed-of-light considerations, the bit got to the register about a nanosecond earlier and the problem was solved!

  4. Nice in-depth information overlooked by me till now.
    I knew the scoreboarding and 10 functional units in CDC6600. I used to call it as a typical Superscalar, CISC. In addition, it uses 32-way memory interleaving. I think the authors, Bell & Newell call the CDC6600 as a ‘Network Computer’.
    In my classes I used to compare it with IBM System/ 360 Model 67 which is a Multiprocessor, CISC. I have never looked into basic circuit design aspect in my Computer Architecture classes.

  5. We had a Cyber 205 at Goddard when I worked there. The thing used wire lengths to control signal timing, and the inside of the cabinets looked like an enormous loom that had suffered a mischance. The thing also had some pretty stiff environmental requirements; as I recall the machine room was kept at 55F, which meant all we operators kept winter coats on hand that we put ON when we got to work.

  6. (minor correction) While indeed CDC sponsored the 2N709, almost everyone else used them. Having a company sponsor component development is an old practice going back to the tube days, and is the reason why we have so many “similar-but-not” tube and transistor type numbers in the 1950s and 60s. But yes, 2N709 was obviously a big deal.

    1. Yes indeed. Looking through Jean Hoerni’s patent notebook (linked above), you can also see that he had invented this transistor type much earlier. The NRE payment by CDC may have only been a way to prioritize development at Fairchild a bit.

  7. The key difference here between DCTL and RTL is the lack of a base resistor (which is the thing that makes it “directly-coupled”); googling “hackaday ttlers dctl” gives you an excellent writeup of it. It’d be nice if you made a correction or part 2 that explores DCTL rather than RTL.

    1. That’s a good point. I am firmly aware of the distinction and specifically looked into this.

      See paragraph: “The logic circuitry of the CDC6600 is based on resistor-transistor logic (RTL). The Thornton book calls it DCTL (Directly Coupled Transistor Logic), but due to the presence of the base resistor it is obviously the same as RTL. Thornton himself refers to RTL in a later publication.”

      Not sure where a correction would be needed?

      As far as I see it, the DCTL implementation without base resistor, as shown on hackaday.io, is something that looks nice on paper, but will suffer from serious stability issues when used with a large population of unmatched discrete transistors. It may work better in integrated circuits where you can achieve better matching, but as history has taught us, there were other technologies that made the cut.

      1. The fun thing is : Dan’s comment is dated Feb. 26th, 12 days after CPLDCPU posted about this page on our TTLers group 🙂

        This page tickles me even more about this technology, and if the design is well planned, with good fanout rules, it seems that there would be no need of base resistors… I’ll have to try it !!!

  8. *** Looking for switching transistors
    From old TESLA catalog, and from “Technical data of semiconductor devices, selection from Comecon countries”

    | Type   | f_T [MHz] | t_on [ns] | t_off [ns] |
    |--------+-----------+-----------+------------|
    | KSY71 | >500 | <12+ | 450 | 15 | 25 |
    | 2N2369 | >500 | <12+ | 500 | <12+ | 500 | <12+ | 400 | <12+ | 600 | <12+ | 600 | <12+ | <12+ |

    Some of them are in actual TME catalog. Even the 2N2369 in through hole and big metal can.
    – BSX20 :: https://www.tme.eu/Document/668724c023091df823f9d69a26f7a15a/BSX20.pdf

    This was very short research, as I'm not an expert, and going through all the documents and understanding the parameter details would take too much time.

    1. Thank you, nice find. “Continental Device India”? I guess this is a reseller of NOS.

      Indeed, it still seems possible to find many of the older switching transistors in metal can packages. But the only ones that made it to the SMD age seem to be the PMBT2369 and BSV52.

  9. Hello
    If you search for fast transistors, try out LNA bipolars.
    I have a surplus roll of BFR182 (FT=8GHz NPN). They should switch quickly.

    1. These are nice HF transistors indeed, I should also try them. I also ordered some BFP420 recently (SiGe with ft=25GHz).

      It will be interesting to see if they fare better than the PMBT2369, since these are HF transistors and not switching transistors. The low capacitances and SiGe base may be very helpful, though.
