March 23, 2026
Cycle-counting is back, baby
BIO – The Bao I/O Co-Processor
BIO drops: fans cheer, skeptics shout “cycle-counting,” and the dupe police crash the party
TLDR: Baochip’s BIO aims to handle I/O timing more predictably, taking cues from Raspberry Pi’s PIO while using fewer resources and clocking faster on custom chips. Commenters are split: supporters praise the clarity, while skeptics warn of tedious cycle-counting and argue that headline MHz doesn’t beat real-world throughput.
Baochip’s new BIO co-processor just landed, promising to take the fussy timing work off the main CPU and make gadgets respond on time. Think of it as a helper that talks to the outside world so your main brain can relax. The vibe? Mixed, and spicy. One commenter called themselves an “unranked unwashed neophyte” yet still loved the clear write-up and even name-dropped a clever RISC‑V trick called Stream Semantic Registers (basically, moving data without constantly yelling “load” and “store”).
Then the drama hit. A top concern: timing. BIO leans on a “finish your work before the next tick” loop, which triggered the anxious chorus: “Are we back to cycle counting?” Translation: will developers have to count every clock tick by hand again? Meanwhile, the numbers fight exploded. Fans pointed out BIO uses far fewer hardware resources than Raspberry Pi’s PIO on reprogrammable chips (about 14.6k vs 39k cells) and can clock over 4x faster on custom silicon. Critics shot back: per clock, it’s roughly 15x less efficient, so MHz isn’t the same as real speed — cue the 400 Mb/s brag from Pi land.
Add a classic “CISC vs RISC” nerd cage match (complicated instructions vs simple ones), a “dupe?” hall monitor, and a Bunnie blog post cameo, and you’ve got peak comment-section theater.
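For anyone who wants the two arguments as arithmetic, here is a minimal sketch. It assumes the “finish before the next tick” worry boils down to a cycle budget (core clock divided by tick rate) and that the throughput spat is just clock speedup multiplied by work done per clock. The function names and the 100 MHz / 10 MHz figures are made up for illustration and are not BIO specs; the 4x and 15x figures are the thread’s own, taken at face value.

```python
def cycle_budget(core_hz: int, tick_hz: int) -> int:
    """Whole core clock cycles available between two consecutive I/O ticks."""
    if core_hz <= 0 or tick_hz <= 0:
        raise ValueError("frequencies must be positive")
    return core_hz // tick_hz


def fits_in_tick(loop_cycles: int, core_hz: int, tick_hz: int) -> bool:
    """True if a loop body of loop_cycles finishes before the next tick."""
    return loop_cycles <= cycle_budget(core_hz, tick_hz)


def relative_throughput(clock_ratio: float, per_clock_ratio: float) -> float:
    """Throughput relative to a baseline: clock speedup times work per clock."""
    return clock_ratio * per_clock_ratio


# Hypothetical numbers, chosen only for illustration (not BIO specs):
# a 100 MHz core servicing a 10 MHz I/O tick gets 10 cycles per tick.
print(cycle_budget(100_000_000, 10_000_000))        # 10
print(fits_in_tick(8, 100_000_000, 10_000_000))     # True
print(fits_in_tick(12, 100_000_000, 10_000_000))    # False

# The thread's figures at face value: 4x the clock, 1/15 the work per clock.
print(round(relative_throughput(4.0, 1 / 15), 2))   # 0.27
```

Under those assumptions the “cycle counting” fear is simply that someone has to know the `loop_cycles` number, and the skeptics’ point is that a 4x clock does not rescue a 15x per-clock deficit. Whether the 15x figure holds for real workloads is exactly what the thread is arguing about.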
Key Points
- BIO is introduced as the I/O co-processor for the Baochip-1x, with its architecture and usage demonstrated through assembly and C examples.
- Raspberry Pi’s PIO is analyzed as a reference: four processors, nine instructions each, and a 32-instruction memory enabling cycle-accurate GPIO control.
- A PIO clone was built by forking Lawrie Griffiths’s fpga_pio, with regression tests and simulations producing a nearly RP2040-compliant core on GitHub.
- On an XC7A100 FPGA, the PIO block consumed over half the fabric and had a critical path at least twice as long as that of a VexRiscv core, which made timing closure difficult.
- The heavy resource and timing costs are attributed to PIO’s complex, multi-function (CISC-like) instructions that combine flow control, data shifts, FIFO handling, side-setting, and interrupt logic.
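Why a long critical path hurts can be shown with one line of arithmetic: the highest clock a design can close timing at is roughly the reciprocal of its critical path. The sketch below assumes a hypothetical 10 ns CPU critical path (not a measured VexRiscv figure) and treats “at least twice as long” as exactly 2x.

```python
def max_clock_mhz(critical_path_ns: float) -> float:
    """Approximate highest clock (MHz) for a given critical path (ns)."""
    if critical_path_ns <= 0:
        raise ValueError("critical path must be positive")
    return 1000.0 / critical_path_ns


# Hypothetical CPU critical path, for illustration only.
cpu_path_ns = 10.0
# "At least twice as long", taken here as exactly 2x.
pio_path_ns = 2 * cpu_path_ns

print(max_clock_mhz(cpu_path_ns))   # 100.0
print(max_clock_mhz(pio_path_ns))   # 50.0
```

If both blocks share one clock domain, the slower path sets the limit, so under these assumptions the PIO block would cap the design at about half the clock the CPU could otherwise reach.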