The mystery of the cursed high-speed UART may (I emphasize may) be sorted. And I'm having a flashback to the mid-1990s.
In today's effort, I eliminated several more things, and basically narrowed it down to: if either receive interrupt happens (that'd be character received, or character(s) received followed by a gap), there's some probability of things going wonko.
What it finally came down to was: this UART, unlike its predecessors, is running on a clock generated by a Spartan-6 DCM, and that clock is more or less unrelated to all the other clocks in the FPGA. ("How is this clock unlike all other clocks?" Wait, that's out of season, innit?)
Anyway, synchronizing all the interrupt requests to the CPU clock seems to have solved the problem, or at least made it vastly less frequent. Still gonna run the test for a while before declaring victory and checking in the updated code.
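For anyone who hasn't been bitten by this before, here's a rough behavioral sketch in Python of the usual two-flip-flop synchronizer. This is not the actual FPGA code (that lives in HDL); the class name, the `in_transition` flag, and the coin flip standing in for metastability are all assumptions made up for illustration. The idea is that the first flop may capture garbage if the request changes right at the CPU clock edge, and the second flop gives it a full cycle to settle before anything downstream looks at it.

```python
import random


class TwoFlopSynchronizer:
    """Hypothetical behavioral model of a two-flop synchronizer for one request line."""

    def __init__(self):
        self.ff1 = 0  # first stage: samples the asynchronous input directly
        self.ff2 = 0  # second stage: the only output downstream logic may use

    def clock(self, async_in, in_transition=False, rng=random):
        """Advance one CPU clock edge and return the synchronized output."""
        # The second stage takes the value the first stage held before this edge.
        self.ff2 = self.ff1
        # If the input changes right at the edge, the first stage may settle
        # either way; a coin flip is a crude stand-in for that uncertainty.
        self.ff1 = rng.choice((0, 1)) if in_transition else async_in
        return self.ff2


def demo(seed=0):
    """Raise a request exactly at a clock edge, then hold it for three more edges."""
    rng = random.Random(seed)
    sync = TwoFlopSynchronizer()
    outputs = [sync.clock(1, in_transition=True, rng=rng)]
    outputs += [sync.clock(1, rng=rng) for _ in range(3)]
    return outputs


if __name__ == "__main__":
    print(demo())
```

In this model the output is always 0 on the first edge and is 1 by the third edge at the latest; the coin flip only decides whether the request is seen one cycle sooner or later. The cost is a cycle or two of added interrupt latency, which seems a fair trade for not going wonko.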
And the flashback? Back in the day, when Signetics was still Signetics and not Lowe's, someone there had a brilliant idea: a sort of programmable DMA controller chip, called the I/O Processor or IOP, whose part number escapes me at the moment though I think I still have at least one of the chips around. They made an initial batch, and I got to design a demo board for it. (I was working for IPT back then, and one of IPT's activities was developing demo boards and software for new chips; that was a seriously fun part of the job.)
Well, it was awesome. Right up until it started flaking out every so often. After much diagnosis, involving a logic analyzer hanging on the address bus, I came to the conclusion that the request inputs were not being properly synchronized, so sometimes a race condition would lead to it taking the wrong vector or even loading part of the address from one vector and part from another.
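To make the "part from one vector and part from another" failure concrete, here's a toy Python model. I don't know how the IOP was built internally, so everything here is hypothetical: the two vector addresses, the split into high and low bytes, and the coin flips standing in for logic that samples an unsynchronized request line at a bad moment.

```python
import random

# Hypothetical vector addresses for two request sources (not the real chip's).
VECTORS = {0: 0x1000, 1: 0x2040}


def fetch_vector(request_changing, rng):
    """Build a vector address whose high and low bytes are selected by separate logic."""
    if request_changing:
        # Unsynchronized request: each half of the address logic samples the
        # raw line on its own, and the two halves may resolve differently.
        hi_sel = rng.choice((0, 1))
        lo_sel = rng.choice((0, 1))
    else:
        # Stable request: both halves see the same level, here source 1.
        hi_sel = lo_sel = 1
    return (VECTORS[hi_sel] & 0xFF00) | (VECTORS[lo_sel] & 0x00FF)


def observed_vectors(trials, request_changing, seed=0):
    """Collect the distinct vector addresses seen over a number of fetches."""
    rng = random.Random(seed)
    return {fetch_vector(request_changing, rng) for _ in range(trials)}


if __name__ == "__main__":
    print(sorted(hex(v) for v in observed_vectors(1000, request_changing=True)))
    print(sorted(hex(v) for v in observed_vectors(1000, request_changing=False)))
```

With a stable request the model only ever produces the vector for source 1. With the request changing at the sampling instant it can also produce the wrong vector, or a hybrid such as the high byte of one and the low byte of the other, which is the sort of thing that showed up on the logic analyzer.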
Alas, they didn't respin the chip to resolve the problem; they just killed it off.
My problem, being in FPGA logic, just requires a respin of the code, assuming I've correctly identified and solved it. So it doesn't kill the project.
(Checks back on the test: still blinky-blinky, and no evidence of trouble.)
And so, as the month comes to an end, it seems I can include good tidings in my report.
Update: Code checked in. End-of-month paperwork shuffled. Back lawn mowed. Straw in chicken coop changed. Hatches battened for Rainy Weekend Ian. Test is still running happily, so I'll assume the problem is fixed and proceed (in the coming days) with tidying up and finishing stuff so far left incomplete, plus getting the RP2040-based serial dongle working.
Time to fix dinner now.