The toaster project that's the current headache?
I mentioned the pseudo-thermal duty-cycle limit a couple of days back... and yesterday I started testing that feature. Which ought to be pretty simple, being as how I basically copied the logic from a known-working project I'd done some time ago, using a Xilinx FPGA, and this is a Lattice CPLD, but it's all Verilog, right?
Um. I'm getting fault indications. The fault indications persist when there can't possibly be an actual fault.
So, route some internal signals out to test points (conveniently already connected to the logic analyzer), and fiddle with parameters until I get something that shows up as a recurring event.
O-kay. Time scale is 100ns/division, which lines up nicely with the 10 MHz clock used for this section of the logic. First trace (no, I'm not showing a picture; you'll have to use your imagination) shows the incoming request signal; second shows the lockout signal that goes high for some amount of time beginning one clock tick after an error. Next two traces are the actual error flags, reflecting different criteria. All of these things are synchronous (declared reg, and assigned with <= within always @(posedge lfclk)), so they should change only on clock edges.
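For anyone who wants the shape of that in Verilog, here is a minimal sketch of the style described, not the actual project source. Only lfclk comes from the real design; the module name, fault_cond, err_flag, lockout, count, and LOCKOUT_TICKS are all made up for illustration, and the real error criteria are not shown.

```verilog
// Illustrative sketch only. Every name except lfclk is hypothetical.
module lockout_sketch #(
    parameter LOCKOUT_TICKS = 8       // assumed lockout length, in lfclk ticks
) (
    input  wire lfclk,                // 10 MHz clock, 100 ns period
    input  wire fault_cond,           // stand-in for one of the error criteria
    output reg  err_flag,             // registered error flag
    output reg  lockout               // goes high one tick after err_flag
);
    reg [7:0] count;

    // Initial values for simulation; real hardware would rely on a reset
    // or the device's power-up state instead.
    initial begin
        err_flag = 1'b0;
        lockout  = 1'b0;
        count    = 8'd0;
    end

    // Everything is assigned with <= inside one clocked block, so in
    // principle each output changes only on a rising edge of lfclk.
    always @(posedge lfclk) begin
        err_flag <= fault_cond;
        if (err_flag) begin
            lockout <= 1'b1;          // asserted on the edge after the error
            count   <= LOCKOUT_TICKS;
        end else if (count != 8'd0) begin
            count   <= count - 8'd1;
        end else begin
            lockout <= 1'b0;          // released once the count has run out
        end
    end
endmodule
```

With logic like this, the narrowest pulse an error flag should be able to produce is one full clock period, 100ns at 10 MHz.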
When the error indications happen, they happen 200ns after the request goes high, which makes sense, given that there's some pipelining involved. The lockout happens, right enough, 100ns after that.
But...
Wait a minute.
The error indications are only manifesting themselves for around 25ns, not a full clock cycle or more as they should. They're looking more combinatorial than registered, and they seem to be popping up juuuuust early enough to trip the lockout.
What's more, they're consistently showing up 200ns (give or take slightly fuzzy timing measurements) after the synchronized request... but the request, as shown on the test point, isn't synchronized to 10 MHz. It's synchronized to 40 MHz. No, wait. It's not bloody synchronized at all; it's just a signal, currently supplied by the (independently-clocked) MCU, that's been through a glorified MUX at that point. The limit logic is taking that, synchronizing it to the 10 MHz clock, and working with the synchronized version.
So! The problem only occurs when the input chances to have a very specific phase relationship with the internal clock. Which happens often enough to cause obvious problems.
Hmmm.
Well, having spotted that, what if I toss in one more level of synchronization? Back in a few minutes.
... Bingo. It needed to be double-synchronized.
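For reference, the usual form of a double synchronizer is two flip-flops in series on the destination clock. This is a generic sketch of that technique under assumed names (sync2, req_async, req_meta, req_sync are hypothetical), not a copy of what's in the CPLD:

```verilog
// Generic two-flop synchronizer sketch. All names except lfclk are
// hypothetical.
module sync2 (
    input  wire lfclk,      // 10 MHz destination clock
    input  wire req_async,  // request from the independently clocked MCU
    output reg  req_sync    // version that is safe to use in the lfclk domain
);
    reg req_meta;           // first stage; may go metastable, so it feeds
                            // nothing except the second stage

    always @(posedge lfclk) begin
        req_meta <= req_async;  // stage 1: sample the asynchronous input
        req_sync <= req_meta;   // stage 2: give stage 1 a full clock to settle
    end
endmodule
```

The cost is one extra clock of latency (100ns here). The important part is that everything downstream looks only at req_sync, never at req_async or req_meta.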
Still one bug known to be present, but at least things are somewhat making sense now.