I'm working on a project that involves a Spartan-6 FPGA.
Features that used to work had recently stopped working.
I hadn't changed the Verilog code for anything close to those features. (I checked against the repository.)
In tracking down one of them, I created a pathway to the outside, so I could keep an eye on the contents of an internal register.
It promptly started working.
Looks like yet another case of Xilinx's toolchain optimizing out things that really need to be there.
And we don't have any sort of test automation for this project, so regression testing kinda isn't happening.
Well, the full debugging path was rather cumbersome, but I've replaced it with a spurious use of the register contents. We shall see....
Update: Bouncing back and forth between projects, I find that the networking issue I've been trying to track down (in the mess of third- and fourth-party RTOS+networking code) is also a Heisenbug; dropping just the right sprintf in just the right place makes it work almost every time. I suspect some wonky race condition, though such are supposed to be impossible in this context. Grrrrr.
Update 2: Bounce to FPGA project again. Could it be a timing problem, and not miscompilation? Timing analysis is verbosely uninformative. Look back at source code for the section that's not working at the moment. Um. That section is clocked at 10 MHz. The 100 MHz and 300 MHz sections seem to be OK at the moment; how is logic that has 100 freaking nanoseconds between ticks not working?
(Thinks back to the Olden Days when clocking logic at 10 MHz was fast....)
(But at least you could put a scope probe on it!)
Update 3: FPGA problem sorted. A key signal was crossing from the 300 MHz domain into the 10 MHz domain. There was a synchronizer, but apparently it wasn't working. Or maybe the transition detector just wasn't working in the 300 MHz domain, though it was simple enough that it ought to have worked.
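For the curious, here is a toy Python model of one common way a fast-to-slow crossing can fail. It is only a sketch and not necessarily what bit this design: it assumes the event is a pulse one fast tick wide, that the 10 MHz side samples once every 30 fast ticks, and that events are spaced at least a couple of slow periods apart. The function names and the toggle scheme are illustrative, and metastability is not modeled at all.

```python
RATIO = 30  # fast ticks per slow tick (300 MHz / 10 MHz)


def sample_pulse_directly(event_ticks, total_ticks):
    """Count events seen when the slow domain samples a 1-fast-tick pulse."""
    events = set(event_ticks)
    seen = 0
    for t in range(0, total_ticks, RATIO):  # slow clock edges only
        if t in events:  # pulse is high only during its own fast tick
            seen += 1
    return seen


def sample_via_toggle(event_ticks, total_ticks):
    """Count events when the fast side flips a level and the slow side
    passes it through two flops and looks for a change (edge detect)."""
    events = set(event_ticks)
    toggle = 0                  # level held in the fast domain
    sync1 = sync2 = prev = 0    # flops clocked in the slow domain
    seen = 0
    for t in range(total_ticks):
        if t % RATIO == 0:
            # Slow edge: all three flops update at once from the old values.
            prev, sync2, sync1 = sync2, sync1, toggle
            if sync2 != prev:   # level changed, so an event happened
                seen += 1
        if t in events:
            toggle ^= 1         # fast domain: each event flips the level
    return seen


if __name__ == "__main__":
    ticks = [5, 100, 215, 330]  # arbitrary event times, in fast ticks
    print("direct:", sample_pulse_directly(ticks, 600))
    print("toggle:", sample_via_toggle(ticks, 600))
```

In this model the direct sampler only catches a pulse that happens to land exactly on a slow edge, while the toggle version holds the level until the slow side gets around to looking at it. The catch is the spacing assumption: two events inside one slow period flip the level twice and cancel out.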
Meanwhile, on the other project, it turns out that the problem isn't in fact fixed-enough-for-now, and furthermore there's a problem with configuration memory getting corrupted; the latter issue is possibly architectural in nature, and I haven't really looked at that part of the inherited code yet. Aieeee! And the first project (the one with the FPGA) has just sprouted a fine crop of spec clarifications and possibly revisions. Wowf.