Saturday, April 24, 2021

PDP-11/70: Even More Stuff

January 2021: Debugging Commences

In my last post I left off my tale of PDP-11/70 restoration in late January, having just powered up the rebuilt supplies after reinstalling them in the chassis.  The next step was to reinstall the processor, cache, and memory and see what happens:

(not quite) The First Power-up with Stuff Installed

The answer is: not much.  The processor was almost entirely unresponsive: it powered up with the RUN and MASTER lights on and wasn't responding to most input from the front panel.  Toggling the "Halt" switch and hitting "Start" caused the RUN light to go out, but that's the only response I got from the console.

Enter the KM11-A:

How does one debug a processor as complex as the 11/70's?  These days, advanced diagnostic tools like logic analyzers and digital storage oscilloscopes are commonplace, but in 1974 they weren't really an option.  DEC's solution to this was the KM11-A "Maintenance Set", a pair of boards with an array of lights and four switches.  The lights were used to monitor device state, and the switches controlled the behavior of the device and allowed for single-stepping processors.  The KM11 could be used to debug a variety of DEC hardware -- various PDP-11 processors and a few different peripherals and device controllers.  My KM11-A is a reproduction, which I built over a decade ago to debug my PDP-11/40; since then I've also used it to repair my PDP-11/05.  And now, it's time for the KM11 to work its magic again.

With the KM11 boardset installed (you can see it sticking out from the left-hand side in the above picture) I was able to step the processor through micro-instruction execution.  A toggle switch on the KM11 clocks the processor, and the DATA lights on the front panel show the microcode address in the right 8 bits (with the selector knob turned to "uADDRS FPP/CPU").  In the above picture it's showing address 200 octal.  The PDP-11/70's KB11-C processor is microcoded, using an array of small, high-speed bipolar PROMs to store 256 64-bit microcode words.  These 256 words are interpreted by the hardware to implement the PDP-11 instruction set, address memory and the Unibus, and to interface the processor to the front panel.
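To make the front-panel reading concrete, here is a small Python sketch of pulling the micro-address out of the DATA lights.  It assumes the lights are read as a plain integer with the micro-address in the low 8 bits; the function names and the sample value are made up for illustration and don't come from any DEC documentation.

```python
# Sketch only: treats the DATA lights as an integer whose low 8 bits
# hold the microcode address, as described above. All names here are
# hypothetical.

MICROCODE_WORDS = 256      # microcode words in the KB11-C
MICROCODE_WORD_BITS = 64   # width of each microcode word


def micro_address(data_lights: int) -> int:
    """Return the microcode address shown in the right-hand 8 lights."""
    return data_lights & 0o377  # 0o377 masks off the low 8 bits


def as_octal(address: int) -> str:
    """Format an address in octal, the way the flow diagrams write it."""
    return format(address, "03o")


if __name__ == "__main__":
    lights = 0o200  # the address visible in the picture
    print(as_octal(micro_address(lights)))
    # Total size of the microcode store, in bits.
    print(MICROCODE_WORDS * MICROCODE_WORD_BITS)
```

An 8-bit address is exactly enough to select one of the 256 microcode words, which is why the right 8 lights are all you need to follow along in the flow diagrams.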
 
The KB11-C Engineering Drawings contain 14 pages of "flow diagrams" which detail precisely how the microcode executes.  The "KB11-C Processor Manual" (EK-KB11C-TM-001) provides 376 pages explaining exactly how the hardware works.  A typical flow diagram looks like:
Feel Flows
This is FLOWS 14, which diagrams the Console (front panel) portion of the microcode.  Top-center, you can see a starting bubble labeled "CON.00" which marks the start of the console portion of the microcode.  The box below it represents a single microinstruction, and details the operation of this microinstruction in each of the processor instruction cycle's "T-states."  The arrows coming out of this box indicate branches to other microinstructions, depending on the state of the hardware at the time of the instruction execution.  Branches may also lead to other flows (indicated by diamonds). 

Use of the KM11 indicated that the processor was definitely executing microinstructions, and seemed to be following the flow diagrams in the engineering drawings.  This is excellent -- it indicates that a lot of the hardware is functional.
 
Curiously, left to its own devices the processor didn't seem to be executing microinstructions at all and was stuck at micro-address 200 octal.  This is "ZAP.00" in the flow diagrams and is where the processor starts at power up or after a reset.

In the troubleshooting section of the 11/70 service docs (diagram on p. 5-16) it states:
IF LOAD ADRS DOES NOT WORK AND:
- RUN, MASTER & ALL DATA INDICATORS ARE ON
- uADRS = 200 (ZAP)
THEN MEMORY HAS LOST POWER
That seems to adequately describe the symptoms I was seeing.  There is power-fail hardware in the processor that forces the microcode address to 200 in the event that power is lost, but the AC LO and DC LO signals (which the power supply uses to tell the processor of such a failure) were all fine; I checked them again, just to be sure.  Also, if this were the case I wouldn't expect the KM11 to be able to step the processor at all -- the power-fail hardware should force the processor's microcode address to 200 at all times until the power failure is resolved.

Probing the processor clock signal on the backplane with an oscilloscope revealed no clock signal at all, just a flat line.  The clock signal is provided from one of three sources on the "TIG" (Timing Generator) board:  Normally it comes from a 33.3333MHz clock crystal.  While debugging with the KM11, it can come either from the MAINT STPR switch on the KM11, or from a special diagnostic RC clock network on the TIG board (the latter can be adjusted to a wide range of frequencies for margin testing).  This lack of a clock signal was definitely an important clue.
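For a sense of what a healthy crystal should look like on the scope, the period follows directly from the frequency.  This is just arithmetic on the 33.3333MHz figure above; whether the backplane signal is the raw crystal output or some divided-down version of it is not something I'm asserting here.

```python
# Worked arithmetic: period of the TIG board's crystal oscillator.
# Only the crystal frequency comes from the text; how that frequency
# relates to the signal seen on the backplane is left open.

CRYSTAL_HZ = 33.3333e6  # 33.3333 MHz


def period_ns(freq_hz: float) -> float:
    """Return the period in nanoseconds of a signal at freq_hz."""
    return 1e9 / freq_hz


if __name__ == "__main__":
    print(f"{period_ns(CRYSTAL_HZ):.2f} ns per crystal cycle")
```

So the crystal itself cycles roughly every 30 ns, a long way from a flat line.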

Another oddity was revealed after a closer look at the service docs: In Chapter 4 of the Processor Manual, Section 4.1.3 it states:

"The third source of timing [the other two being the crystal clock and a diagnostic R/C network] is the manually-operated, single-step MAINT STPR switch S4, located on the maintenance card.  This switch is only enabled when maintenance card switches S2 and S1 are both set to 1."

Section 4.2.3 confirms this:

"The maintenance card S2 and S1 switches are both set to 1 to allow single timing pulses to be generated by MAINT STPR switch S4.... Removing the S2 or S1 input conditions the MS EN flip-flop to be cleared."

What was interesting about the above is that on my system, switch S4 (MAINT STPR) stepped the processor with switches S1 and S2 set to any configuration.  This being the case, I wondered if the logic that selects the clock source was faulty, and was always selecting the MAINT STPR input.
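The mismatch is easy to spell out as a truth table.  The Python sketch below encodes only the rule from Section 4.2.3 as quoted above (stepping is enabled when S1 and S2 are both 1) and compares it with what my machine actually did; it is a model of the documented behavior, not of the real TIG circuitry, and the function names are invented.

```python
# Sketch: documented vs. observed behavior of the MAINT STPR enable.
# This models the manual's stated rule only, not the actual TIG logic.
from itertools import product


def documented_step_enabled(s1: int, s2: int) -> bool:
    """Section 4.2.3 as quoted: S4 steps only when S1 and S2 are both 1."""
    return s1 == 1 and s2 == 1


def observed_step_enabled(s1: int, s2: int) -> bool:
    """What this machine did: S4 stepped in every switch configuration."""
    return True


def mismatches() -> list[tuple[int, int]]:
    """Return the (S1, S2) settings where observed and documented differ."""
    return [
        (s1, s2)
        for s1, s2 in product((0, 1), repeat=2)
        if documented_step_enabled(s1, s2) != observed_step_enabled(s1, s2)
    ]


if __name__ == "__main__":
    for s1, s2 in mismatches():
        print(f"S1={s1} S2={s2}: stepping worked but should not have")
```

Three of the four switch settings disagree with the manual, which is what made the clock-source selection logic look suspicious.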
 
Well, only one way to be sure, and this would require getting the TIG board out on an extender for some extensive probing.  In doing so, I found that no clock signal was being generated by the 33.3333MHz crystal at all; in fact while probing it one of the legs to the crystal fell right off.  This is usually a sign of a faulty component.

So I placed an order on Digi-Key for a replacement.

But then I got impatient and remembered that the rusty burned-out hulk of a PDP-11/45 I picked up along with the 11/70 was in the garage, and the 11/45 also has a TIG board, very similar to the one in the 11/70, and also using a 33.3333MHz crystal.
 
A short while later, the 11/70's TIG had a new, stolen, clock crystal:
Where'd you get that shiny new crystal?

And after reinstalling the TIG back in the backplane and powering up:


It's alive!  A bit.  With a working clock, the processor was able to respond to the front panel and I was able to load addresses and examine and deposit into memory.  However, instructions would not execute -- loading an address and hitting "Start" on the front panel had no effect.  More pressing: after the system warmed up for a minute or two, the "Load Address" switch on the front panel would stop working properly, and would always load "0" rather than what was in the front panel switches.
 
Still, good progress for just a few evenings of research and debugging (and conversing with people on cctalk for advice.)  Over the next few days I started in on investigating these issues... which I'll talk about in my next exciting installment.  Until then... go find something else to read.


Saturday, April 17, 2021

PDP-11/70 Repair: Part One

Looks like a few months have passed since my last post, as seems to be typical.

Never you mind, let's just ask the question: How'd that whole "Restore a rusty soot-covered PDP-11/70" thing turn out?  Well, I don't want to spoil anything.  Let's pick up where we left off.

 

November 2020: Make it Look Good

Looking good!

Well, first I installed the replacement front panel assembly.  That'll get you 90% of the way there, as anyone who restores old computers can tell you.

December 2020, January 2021: Cleaning, Capacitor Reforming, and Fan Replacement

Despite a pretty new face, the computer was still extremely dirty.  When I first brought the system home I'd given the rusty parts a rough sanding to get rid of the grit and loose paint, but the inside of the chassis was still amazingly filthy. 

Dirty Little Fingers

The boards themselves cleaned up quite well: they were all covered in a fine grit of soot and who knows what else, but soaking them in warm soapy water for 10-15 minutes then scrubbing with an old toothbrush eliminated most of the detritus.  After drying in front of a box fan, the gold fingers were cleaned up with liberal use of Scotch Brite(tm) cleaning pads.  

None of the 17 boards that comprise the Processor, Cache, and Memory of the system appeared to be seriously damaged.  That's good!



My next major concern was whether the backplane itself was hiding some corrosion -- corroded pins make poor contact with the boards and might never work reliably.  And these pins aren't exactly trivial to replace -- while it might theoretically be possible to undo the wire wrap to a bad pin, desolder it from the backplane assembly PCB and remove it... no, you know what: it's impossible for all intents and purposes.  If you have a dead pin on a backplane like this, the backplane (and it follows, the computer) is toast.

This weighed fairly heavily on my mind, as soot and moisture could easily have destroyed this computer.


Empty backplane, mostly.
With the boards removed for cleaning the chassis was now empty, with the backplane exposed for easy (if extremely slow and somewhat painful) cleaning. There are 44 slots in the KB11-C backplane, each of which is divided into 6 sections, designated A-F.  Each one of these sections needed to be cleaned.  A strategy I've used in the past is to fold a thin piece of cardboard over a credit card; this jerry-rigged assembly can be dipped in 99% isopropyl alcohol and then used to clean the slot by inserting it and removing it a few times.  Accumulated dirt and light corrosion will be pulled off leaving the slot at least slightly cleaner than it started off.  For good measure, before starting on each slot, I gave it a good dose of contact cleaner to help loosen things up.

Repeat this for all 6 sections of all 44 slots.  I did this over the course of about three weeks, a few slots every night to prevent my fingers from falling off.  Most of the slots were already pretty clean, but on a few the cleaning card came out quite dirty and required a few extra passes.  The last few slots toward the rear of the chassis took the most time to clean.

Had I to do this again, I might have tried removing the entire backplane from the chassis and rinsing it out with water or isopropyl first, but I got pretty good results with this approach.

Reforming Capacitors

In the midst of the above cleaning process, I started in on the power supplies.

The PDP-11/70 gets its power from two H7420a power supplies.  Each H7420 is a large bulky unit with an extremely heavy transformer up front; this transformer provides 30VAC to up to five modular power supply units which take the AC and provide regulated DC.  Different modules provide different voltages: the H745, for example, provides -15V at 10A; the H744 provides +5V at 25A; the H754 gives you +20V and -5V.

Depending on what system you have and what options it has fitted, the H7420 might have a variety of these modules installed.  For the PDP-11/70 system I have, it's entirely H744s -- seven of them, providing a whole lot of +5.  The H7420 itself provides + or -15VDC, as well as 8V and the ACLO and DCLO signals used to let the computer know if it's about to lose power.
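Just how much is "a whole lot of +5"?  Here's a back-of-the-envelope Python sketch using only the figures above: seven H744s at their rated +5V, 25A each.  These are rated maximums for the regulators, not a measurement of what the machine actually draws.

```python
# Worked arithmetic: rated +5V capacity of seven H744 regulators.
# Figures come from the text; this is rated output, not measured load.

H744_VOLTS = 5.0   # output voltage of one H744
H744_AMPS = 25.0   # rated output current of one H744
H744_COUNT = 7     # H744s in this PDP-11/70


def total_amps(count: int, amps_each: float) -> float:
    """Combined rated current across all regulators."""
    return count * amps_each


def total_watts(count: int, volts: float, amps_each: float) -> float:
    """Combined rated output power across all regulators."""
    return count * volts * amps_each


if __name__ == "__main__":
    print(total_amps(H744_COUNT, H744_AMPS), "A")
    print(total_watts(H744_COUNT, H744_VOLTS, H744_AMPS), "W")
```

That works out to 175A of rated +5V capacity, or 875W, before counting anything the H7420s supply themselves.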

A really dirty H744, prior to cleaning.

Over the course of a month, each of the H744s was removed, disassembled and cleaned.  The capacitors were removed and reformed.  Normally, I like to replace capacitors rather than reform them, just for the sake of reliability and peace of mind.  However, I thought I'd give reforming a try this time around, mostly due to cost considerations:  Each H744 contains three large capacitors, and with seven of them to restore (plus two extra for spares that I happened to have lying about), it was looking like I'd be investing about $750 to replace them all.

I won't go into details on capacitor reforming here -- it's well documented all over on the 'net (David Gesswein has a nice write-up here) and it's not all that exciting.  The upshot of my reforming experience was that of the 28 capacitors in the supplies, four of them ended up being marginal, and two were completely dead.  Not too shabby.  

Testing of the supplies (and the capacitors) was done using an electronic load that I bought for the occasion.  It's a lot more convenient than using banks of resistors, and it looks cool too:

Burning in an H744 (middle) on the bench.  Electronic load on the left, H7420 on right.

The electronic load gives me the ability to vary the load while testing, starting with a small amount of load for initial smoke testing, then ramping it up to really soak test the thing.  I let them run for a couple of hours each.  While this is going on, the output of the supply is monitored on an oscilloscope, to check that ripple is within tolerances (about 200mV, max). This is also a good time to do an initial adjustment of the voltage level (each H744 has a small potentiometer exposed on the front that is used for this purpose).
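If you were logging the scope readings rather than eyeballing them, the ripple check boils down to a peak-to-peak comparison.  The Python sketch below assumes the "about 200mV" limit is meant as peak-to-peak, which is my reading rather than something quoted from the service docs, and the sample voltages are invented for illustration.

```python
# Sketch: checking ripple against a limit from a list of voltage samples.
# Assumption: the ~200mV tolerance is treated as a peak-to-peak figure.
# The sample readings used below are made up, not real measurements.

RIPPLE_LIMIT_V = 0.200  # about 200mV, max


def peak_to_peak(samples: list[float]) -> float:
    """Return the difference between the highest and lowest sample."""
    return max(samples) - min(samples)


def ripple_ok(samples: list[float], limit: float = RIPPLE_LIMIT_V) -> bool:
    """True if the peak-to-peak ripple is within the limit."""
    return peak_to_peak(samples) <= limit


if __name__ == "__main__":
    quiet_supply = [5.02, 4.98, 5.05, 4.97]  # hypothetical readings
    noisy_supply = [5.15, 4.90]              # hypothetical readings
    print(ripple_ok(quiet_supply))
    print(ripple_ok(noisy_supply))
```

The same check can be repeated at each load step while ramping the electronic load up, since ripple tends to get worse as the supply works harder.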

Of the 9 H744s, all but one tested out fine and required no additional repairs.  The one that failed would occasionally make an interesting short squeaking noise, with an associated drop in voltage.  I put that one back on the shelf for a future investigation (which as of this time has not yet occurred).

Fan Replacement

I have neglected to mention the state of the fans in this system: they were bad.  Very bad.  There are 18 fans in the PDP-11/70: 8 in the two H7420s, and 10 in the processor chassis.  Of these 18 fans, only three actually spun freely and even those sounded pretty bad when doing so.

OBEY THE PAPST FAN

I opted to replace these entirely.  While some of them were designed to be disassembled and cleaned, they were all rusty to the point where I just did not want to bother.  I found a decent supply of Papst fans on eBay for a reasonable price.  These are nice fans, well built with metal blades rather than the more common plastic ones.  Heavy.  Elegant.  Subtle.  Hungarian.  I like them.  

I installed 8 of them in the H7420s and reinstalled six of the H744s (and one H7441 that snuck in there while I wasn't looking.  See if you can find it, it's really exciting).

The supplies: all cleaned up and put back together!

At this point, all the wiring was double-checked, both for continuity and also for shorts and breaks in the insulation.  Everything checked out OK, so I fired the system up with an empty processor chassis, while holding my breath:

Hey, not bad.  No smoke or fire or bad smells, and the front panel lit up (all the lights are on by default since it's disconnected from the logic that normally drives it).  All voltages at the backplane were tested, per the service manual:

This is actually kind of a pain in the neck because you're finding tiny pins in a rat's nest of wire-wrap, and trying not to accidentally brush against another pin while doing so:

The wire-wrap side of the 11/70 backplane: Where's Waldo?

I used a set of small jumper wires that clip over the ends of the wire-wrap pins to help keep things isolated.  Even so it was kind of nerve-wracking.  Long story short: all voltages were present in the right places on the backplane, and the ACLO and DCLO signals were both high, as they should be.

Conclusion:

As January drew to a close, I had gotten the PDP-11/70 to a point where it was clean and safely powering up.  What would the following months bring?  STAY TUNED TO FIND OUT!