Sunday, August 25, 2013

Bringing the 8/L Back to Life


The 8/L's backplane, stuffed with Flip Chips.
I spent some time last week cleaning the 8/L in preparation for the new power supply capacitors.  This was a pretty painless process -- the machine's already very clean externally, and internally it was a matter of gently brushing 44 years' worth of accumulated dust off all of the Flip Chips installed in the backplane.  This was done with an old toothbrush, and while I was at it I cleaned all the edge connectors with a bit of Scotch-Brite.

The PDP-8/L 718 Power Supply

The replacement capacitors for the 8/L arrived on Thursday, so that evening I put on some MST3K ("Escape 2000," in which you are encouraged to Leave the Bronx!) and got to work.

The above picture shows the 8/L's power supply, designated as the 718.  It's a simple linear supply and it was a trivial matter to replace the three large electrolytics (18000uF, 22000uF, and 80000uF) with their new replacements, since they're all screw-terminal types.  All three appear to be originals, and while they look yellow in the photo, back in 1969 they were silver.  Some folks prefer to reform old capacitors, but having had a couple of working capacitors go dead short on me in a previous project, I prefer not to take chances.  I'll be keeping the originals just in case.

With the new capacitors in place and the power supply unit bolted back together, I powered it up and checked the voltages; the 718 provides a regulated +5V supply, and unregulated -30V and -15V supplies.  The 5V measured in at 5.05V under load, and the -30V and -15V were -35V and -18V unloaded -- well within tolerances.  The 8/L also uses a -6V supply and this confused me for a bit, as the -6V line was reading at -35V as well, which would be wildly wrong.  It turns out that the -6V line is regulated by the G826 regulator board installed in the 8/L chassis -- it's not part of the power supply itself.  Together, the -6V and -30V supplies provide the power supply for the core memory system.

Since everything looked good, I reinstalled the 718 in the 8/L chassis and powered it up (while keeping my fingers crossed):
First power-up
And hey, everything lit up and the fans sputtered to life (they'll probably need a bit of lubrication).  An initial checkout indicated that the front panel was working -- I could load addresses into the Memory Address register from the Switch Register, and use of the Exam and Dep switches would increment the Memory Address correctly.  Only four dead lights on the front panel (these are low voltage incandescent bulbs, not LEDs) and fortunately I have a few spares.

Hitting the "Run" switch caused the CPU to take off running whatever it thought it was getting from memory, and hitting "Stop" caused it to stop -- so quite a bit of the CPU logic appeared to be working.

Not a lot of response from the memory, though.  Hitting "Exam" would step through the address space but I mostly got zeros back and I was completely unable to deposit anything into memory.

Well, I figured that was to be expected since the power supply had been rebuilt and the core memory system requires careful calibration in order to work correctly.  The calibration procedure is documented in the maintenance manual (starting on page 5-7, for those following along at home).  Essentially, the difference between the MEMORY SUPPLY + and MEMORY SUPPLY - lines (the -6V and -30V supplies mentioned earlier) should be about 22.5V.  Mine was only reading about -18V, so I goosed it up to (er, down to) -22.5V using the trimpot on the G826 regulator:

Everything's nominal...
This had no effect at all on the behavior of the memory, so I turned my eyes to the memory control timings (also discussed in the maintenance manual).  Things looked pretty much OK there, and so did the waveforms coming back from the core memory read amplifiers.

The logic analyzer showed that all the requisite control signals were being generated correctly (they looked identical to figure 5-3 in the manual).

Core memory is a complicated mechanism and I won't bore you too much with the details (the maintenance manual covers it in exhausting detail if you're curious).  Effectively, a "read" operation destroys the contents of the memory being read (resets everything to a "zero" state), so every read must be followed by a write to put the original contents back.  Given that the waveform for the read amps looked good (and also indicated that everything was zeros) and in combination with the fact that "write" operations from the front panel were not working, I began to suspect that the problem with the core memory was with the "write" side of things.
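To make the read-then-restore cycle concrete, here is a minimal Python sketch of it.  The `CoreMemory` class and its `write_ok` flag are names invented for this illustration; it's a toy model of the behavior described above, not a description of the 8/L's actual circuitry.

```python
class CoreMemory:
    """Toy model of core memory with destructive reads (illustrative only)."""

    def __init__(self, size=4096, write_ok=True):
        self.cells = [0] * size
        self.write_ok = write_ok  # hypothetical flag: False models a dead write path

    def _write(self, addr, value):
        # Bits can only be set if the write-side drivers are working.
        if self.write_ok:
            self.cells[addr] = value & 0o7777  # 12-bit words

    def read(self, addr):
        value = self.cells[addr]   # sensing the cores...
        self.cells[addr] = 0       # ...resets them all to zero,
        self._write(addr, value)   # so the value has to be written back
        return value

    def deposit(self, addr, value):
        self.cells[addr] = 0       # clear the word, then write the new contents
        self._write(addr, value)


healthy = CoreMemory()
healthy.deposit(0o100, 0o1234)
first, second = healthy.read(0o100), healthy.read(0o100)

broken = CoreMemory(write_ok=False)
broken.deposit(0o100, 0o1234)
broken_read = broken.read(0o100)
```

With a working write path, both reads of the healthy memory return 1234 (octal).  With the write path disabled, the deposit never lands and the read returns zero, which matches the symptoms seen from the front panel.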

Since every bit of every address seemed to be coming back as all zeros, it seemed unlikely that it was any of the circuitry related to control or amplification of the individual data bits in the MB buffer (i.e. it was unlikely that every single read/write amp and inhibit driver for the memory data was broken in the same way).  On a hunch, I swapped the pair of G228 Inhibit Driver flip chips for the X/Y Read/Write control with a pair from the Memory Buffer Inhibit Drivers.  And voila -- the memory started responding again and I was able to write to and read from it reliably (with a single bit stuck).

From here it was easy to isolate the faulty flip chip, and from there to narrow down the cause to a bad 7440 IC.

The faulty G228 Inhibit Driver.
After replacing the 7440, the memory system appeared to be working reliably, but it's difficult to track down individual dead or stuck bits just by playing with the front panel.  DEC supplied a plethora of diagnostics to run (the checkerboard test is a particularly brutal one), but at the moment I have no way to load them onto the machine other than toggling them in manually (I'm waiting for my RS232<->Current Loop adapter to arrive).  I'm far too lazy to toggle all that in, so I whipped up a very basic diagnostic myself:

00 7200  CLA
01 1023  TAD 23  ; Load start address
02 3024  DCA 24  ; Copy to curr. addr.
03 3025  DCA 25  ; Copy AC to test value
04 1025  TAD 25  ; reload into AC
05 3424  DCA I 24  ; Copy to memory loc
06 7200  CLA
07 1424  TAD I 24  ; reload memory
10 7041  CIA       ; negate (2s cmpl)
11 1025  TAD 25    ; add current value
12 7440  SZA     ; Should be zero
13 7402  HLT     ; Memory did not match
14 2025  ISZ 25  ; Move to next value
15 5021  JMP 21
16 2024  ISZ 24  ; Move to next address
17 5003  JMP 03  ; Run test with new address
20 5000  JMP 00  ; Address has wrapped to 0, start again.
21 1025  TAD 25
22 5003  JMP 03  ; Run test against current address with next value
23 0030  ; Constant - starting memory address to test
24 0000  ; Variable - Address being tested
25 0000  ; Variable - Value used to test address

This walks memory from 0030 to 7777 and for each address, it writes, reads back, and compares all possible values.  If a mismatch is found, it halts, otherwise it will loop forever.
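For anyone who doesn't read PDP-8 octal, the same idea can be sketched in Python.  This is only an illustration of the algorithm, not something that runs on the 8/L: the `walk_memory` function and the `StuckBitMemory` class are hypothetical names, and unlike the listing above the sketch returns after one pass instead of looping forever.

```python
def walk_memory(mem, start=0o30, end=0o7777, values=range(0o10000)):
    """Write, read back, and compare every value at every address.

    Returns (address, written, read_back) for the first mismatch,
    or None if the whole range passes.
    """
    for addr in range(start, end + 1):
        for value in values:
            mem[addr] = value
            read_back = mem[addr]
            if read_back != value:
                return addr, value, read_back  # where the listing would HLT
    return None


class StuckBitMemory(dict):
    """Hypothetical faulty memory: some bits at one address always read as 0."""

    def __init__(self, bad_addr, stuck_mask):
        super().__init__()
        self.bad_addr, self.stuck_mask = bad_addr, stuck_mask

    def __setitem__(self, addr, value):
        if addr == self.bad_addr:
            value &= ~self.stuck_mask  # the stuck bits never get set
        super().__setitem__(addr, value)


# A short range keeps the demo quick; the full 0030-7777 walk is about
# 16.7 million write/read pairs.
good_result = walk_memory([0] * 4096, end=0o37)
bad_result = walk_memory(StuckBitMemory(0o34, 0o0010), end=0o37)
```

Here `good_result` is `None`, and `bad_result` reports address 34 (octal) failing on the first value that needs the stuck bit.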

I've now had it running the above test for a few hours without issue.  Once I get my current loop adapter, I'll be able to run some real diagnostics on it but this is still a pretty good sign -- there are no stuck or dead bits in the core and quite a bit of the CPU is working properly.

So one dead 7440 is the only fault so far.  Not bad for a 44-year-old computer.

That's all for now!  Until next time, always let your conscience be your guide!

Addendum (8/26/13): Looks like I spoke a bit too soon about the core memory being 100% operational.  Fired up the machine and ran the test program again this evening and after about 30 seconds it hit some stuck bits (3 and 8) at address 0506; these stuck bits continued for a few pages, after which there were a few more good pages, followed by stuck bits, etc.

But, after letting the machine warm up for 20 minutes or so the test started passing again.  This indicates that the memory isn't aligned quite properly, so it will need some fine tuning.

Wednesday, August 21, 2013

Two Steps Forward, One Step Back

Well, it's been a while since my last update and while I was hoping to post some forward progress with the Imlac, there have been certain setbacks so I'll post about those instead...

Since the last post, I spent some time putting the rest of the Imlac's CPU through its paces and a lot of instructions were working fine, while a subset of them were showing interesting behavior.  In particular, any instruction that addressed memory ended up changing the contents of that memory address, regardless of whether the instruction was a read or a write.

After doing a bit more playing around, it was clear that the memory contents were being incremented on each access, and I narrowed it down to one of a pair of faulty components in the logic controlling the indirect increment register behavior -- rather than incrementing the memory contents only when an indirect access to an increment register (addresses 10-17) was made, it was incrementing all memory addresses, all the time.
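The difference between the intended behavior and the fault can be sketched in a few lines of Python.  The function names are made up for this illustration, and the 16-bit word mask is an assumption; the sketch only models which accesses bump the memory word, not how the incremented value is then used.

```python
AUTO_INCREMENT = range(0o10, 0o20)  # addresses 10-17 octal, per the text
WORD_MASK = 0o177777                # assumes 16-bit words


def touch_intended(mem, addr, indirect):
    """Increment mem[addr] only on an indirect access to an increment register."""
    if indirect and addr in AUTO_INCREMENT:
        mem[addr] = (mem[addr] + 1) & WORD_MASK
    return mem[addr]


def touch_faulty(mem, addr, indirect):
    """The observed fault: every memory-referencing access increments the word."""
    mem[addr] = (mem[addr] + 1) & WORD_MASK
    return mem[addr]
```

With the intended logic, a direct access to location 37 leaves it alone and only an indirect access through, say, location 12 increments it.  With the faulty logic, location 37 changes on every access.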

Seemed simple enough, but when I went to track down the fault, the front panel suddenly started acting strangely, and I have not yet isolated the cause.

The Imlac no longer runs programs via the "Start" or "Continue" switch, and about half the time, when doing a "Read" or "Store" operation from the front panel nothing happens at all.  The behavior is erratic and appears to be basically random.

I've spent a good deal of time tracking this issue down.  I started by looking at what should normally happen when an operation is triggered by the front panel.  One of the main things is that a set of "T" clocks (T1 through T10) are fired in sequence -- these T clock impulses drive the different parts of the CPU logic at the right time in order for an instruction to execute.  (For example, during the "T2" cycle of a fetch instruction, the PC is incremented by 1, and during T8, T9 and T10 the accumulator is shifted (one or more times) during a Shift operation.)

For the times the front panel operation succeeds, T1->T10 is generated properly.  When the front panel operation fails, T1->T10 does not change state at all.  So this issue is clearly related to the generation of the system clock.  The system clock is generated by the "Timing Pulse Generator and Clock" board (slot 215), which as it turns out I don't have schematics for (or so I thought, more on that later).  So, I spent a couple of evenings reverse-engineering the board and drawing up my own.

It became clear that the T clocks were generated only if the "RUN" signal was high and after doing a bit of probing it was clear that RUN was not being asserted when the panel was failing.  So... what generates the RUN signal?  Why, the "Run Control" board (in slot 231), of course!
The Run Control schematic (partial)
The "RUN" signal is controlled by the output of a pair of J/K flip-flops (the 7476 at E1 in the schematic above).  When a front panel switch (such as the READ switch at the top left of the schematic) is toggled, this ends up triggering the 74121 at E2 to fire a one-shot pulse about 5 microseconds later -- this in turn triggers the PR input (pin 7) of one of the flip-flops (labeled "RUN SYNC") which in tandem with the "RUN CLK" signal on pin 6, eventually raises the "RUN" signal, originating from pin 15 (the Q output of the other J/K flip-flop -- labeled "RUN") of the 7476.  Whew.

The purpose of this circuit is to synchronize panel operations with the "RUN CLK" signal -- since panel operations are run by humans who do things any damned time they please and the Imlac runs on a very regulated 1.8 microsecond schedule, the 74121 in combination with the 7476 make sure that the operation started by the operator happens at the beginning of one of these 1.8 microsecond periods.

After even more investigation it became clear that the input (pin 7) to the state machine implemented in part by the 7476 was being raised properly, as was the RUN CLK signal, and so I suspected the 7476 to be faulty.  On replacement, no change was noted.

Hmm.  Let's take a closer look at the signals here, shall we?  Using my state-of-the-art Tektronix 1241 Logic Analyzer (one of my favorite tools, c. 1988) we see the below when the front panel is working:

The "correct" behavior

As you can see, the "RUN CLK" signal (labeled as "CLK" on the analyzer) is clocking, and at the same time the panel trigger input (labeled as "P") is going low; when P goes high again, the "Q" output (labeled, oddly enough, "Q") changes state to low -- raising the "RUN" signal properly.

OK, so what does this look like when things don't work?
The "incorrect" behavior
Here, we see the clock signal being generated as before, but the panel signal "P" is going low in between clock pulses!  So in this case, the "Q" output does not change state, and thus "RUN" is not properly raised.

So it seems clear that one of two things is broken here:

1) The "P" signal isn't getting generated properly (i.e. it's too short and so it's not reliably overlapping with the RUN CLK signal)
2) The RUN CLK signal is incorrect (i.e. the duty or "on" cycle is too short, leading to the problem discussed in (1) above).
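Both possibilities come down to the same geometry: does the "P" pulse overlap one of the clock's "on" windows?  Here's a rough Python sketch of that check.  Only the 1.8 microsecond period comes from the machine; the function name, the assumption that the clock goes high at the start of each period, and any widths you feed it are hypothetical, since I don't know the real figures.

```python
PERIOD_NS = 1800  # the 1.8 microsecond RUN CLK period mentioned earlier


def pulse_overlaps_clock(start_ns, width_ns, high_ns, period_ns=PERIOD_NS):
    """True if a pulse [start, start + width) overlaps any clock-high window.

    Clock-high windows are assumed to be [k * period, k * period + high_ns).
    """
    phase = start_ns % period_ns
    if phase < high_ns:
        return True                      # the pulse begins while the clock is high
    return width_ns > period_ns - phase  # the pulse lasts past the next rising edge


# Made-up numbers: a 300 ns pulse against a clock that is high for 200 ns.
hit = pulse_overlaps_clock(start_ns=100, width_ns=300, high_ns=200)
miss = pulse_overlaps_clock(start_ns=500, width_ns=300, high_ns=200)
```

Shrinking either `width_ns` (possibility 1) or `high_ns` (possibility 2) widens the range of start times that miss the clock entirely.  This is only a geometric illustration; it doesn't model how the 7476 actually responds to its inputs.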

After investigating possibility (1) for quite a while it seemed that the 74121 was operating properly, as was the cute R/C network set up to fire the one-shot pulse for the "P" signal.  I'm still not 100% certain as I don't have any frame of reference for what characteristics this pulse is supposed to have.

I also spent a good deal of time investigating issue (2) and there's nothing obviously at fault.  However, looking at the RUN CLK signal with the oscilloscope reveals this waveform:

The RUN CLK signal
Which doesn't look incredibly "clocky" to me -- normally I'd expect this signal to look much more squared off and even -- those spikes (which correspond to the "on" duty cycle in the CLK signal seen in the logic analyzer photos) do not look good to me.  However, once again I have no frame of reference since I have no idea what characteristics this waveform is supposed to have.  Ahh, to have a service manual...

I even brought in the big guns on this one, my friend Ian who knows a lot more about this stuff than I do (it's his day job) and he was stumped too.  We're still working on it, and I know we'll get to a solution eventually.

In the meantime, I've shot off a mail to Tom Uban, the owner of the only working Imlac I know of, to see if he'd be willing to grab a photo of the RUN CLK signal on his machine.  He's graciously offered to help me out, but it'll be a few weeks.

So, eventually I'll get this machine running again.  Just a minor setback...

Meanwhile, I thought I'd look through the PDS-1 schematics on Bitsavers -- since the PDS-1D schematics have no timing details in them, I thought I'd double check to see if the -1 schematics did.  They didn't.  But... about halfway through the PDF I started seeing schematics for things that seemed to belong to the 1D, not the 1.  Including... the Timing Pulse Generator board that I spent 8 hours drawing up my own schematic for.  Sigh.

But on a positive note, I now have schematics for parts of the system I previously thought I was missing -- including the aforementioned TPG, but also the Long Vector Hardware (I can now confirm that the protoboard in my machine is a hand-built implementation of that) and the Disk Controller hardware (which I can now confirm my machine once had installed).  That's nice to have.  So, something good came out of this setback.

While I'm brooding over this issue, I'll be spending some time with the PDP-8/L, and I'll post about it once I've made some progress.  I'm currently waiting for new capacitors for the power supply to arrive.

Until next time, keep fighting the good fight!


Wednesday, August 7, 2013

The Imlac's memory is alive!

This is a quick update; I'll post more details later.

After dealing with corroded pins, bad connections, and counterfeit 7400-series logic (no, really) I finally have one of the two Dataram core memory boardsets running, giving me a whopping 8KW of memory.

And now that I have some memory to play with, I can actually get it to run a program!  Since Blogger apparently refuses to let me embed the video, you can see it here.


I assembled the program manually, here it is in all its glory:

0 CAL     ; 100011
1 ISZ 37  ; 030037
2 JMP 1   ; 010001
3 IAC     ; 100004
4 JMP 1   ; 010001

Pretty fancy.  It increments the contents of memory location 37 until it overflows (becomes zero) and then increments the Accumulator and starts over again.  In this way, it slowly increments the Accumulator.
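For the curious, here's a small Python model of what the loop does.  It's a sketch rather than an emulator: `run_counter` is a made-up name, the 16-bit word size is an assumption, and so is the starting value of location 37 (on the real machine it's whatever happened to be in core).

```python
WORD_MASK = 0o177777  # assumes 16-bit words


def run_counter(isz_passes, counter=0):
    """Model the five-word loop above.

    Returns (accumulator, location_37) after `isz_passes` executions of ISZ 37.
    """
    ac = 0                                   # 0: CAL clears the accumulator
    for _ in range(isz_passes):
        counter = (counter + 1) & WORD_MASK  # 1: ISZ 37
        if counter == 0:                     # wrapped to zero: skip the JMP at 2
            ac = (ac + 1) & WORD_MASK        # 3: IAC, then 4: JMP 1
        # otherwise 2: JMP 1 goes straight back to the ISZ
    return ac, counter


after_one_wrap = run_counter(0o200000)  # 65536 passes starting from zero
```

Starting from zero, location 37 takes 65536 passes to wrap around, so `after_one_wrap` is `(1, 0)`: one tick of the Accumulator per full trip around the counter.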

That this program executes indicates that quite a bit of the main processor is functional, which is encouraging.  Since I don't have any official diagnostics (other than a memory test routine) I'll be spending some time coding up a few diagnostics of my own, to ferret out any remaining issues in the main processor before moving onto the display processor.

That's all for now.  Until next time...