CISC (Complex Instruction Set Computer): Unlike RISC-style processors, where each instruction performs one simple operation, the TPU operates at a higher level of abstraction. A single TPU instruction can trigger an operation that runs for thousands of clock cycles.
Because each instruction does so much work, the average CPI (Cycles Per Instruction) is high: typically 10 to 20, and much higher for large matrix operations. Since few instructions are issued per unit of work, instruction fetch and decode are far less of a bottleneck, and most cycles go to arithmetic.
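The CPI arithmetic can be made concrete with a small sketch. The instruction names and cycle counts below are assumptions chosen for illustration, not measured TPU figures. The point is only that one long-running matrix instruction pulls the average CPI far above the typical 10 to 20 range, and that a high CPI means few instruction fetches per cycle.

```python
def average_cpi(trace):
    """Return total cycles divided by instruction count.

    `trace` is a list of (instruction_name, cycles) pairs.
    """
    if not trace:
        raise ValueError("trace must contain at least one instruction")
    total_cycles = sum(cycles for _, cycles in trace)
    return total_cycles / len(trace)


# Hypothetical trace: names and cycle counts are illustrative assumptions,
# not real TPU measurements.
trace = [
    ("read_host_memory", 12),
    ("read_weights", 10),
    ("matrix_multiply", 2048),  # one instruction, thousands of cycles
    ("activate", 16),
    ("write_host_memory", 14),
]

cpi = average_cpi(trace)

# With one instruction retired every `cpi` cycles on average, the number of
# instruction fetches needed per 1000 cycles is 1000 / cpi.
fetches_per_1000_cycles = 1000 / cpi

print(f"average CPI: {cpi:.1f}")
print(f"instruction fetches per 1000 cycles: {fetches_per_1000_cycles:.2f}")
```

By comparison, a processor averaging one cycle per instruction would need about 1000 fetches over the same 1000 cycles, which is why instruction fetch matters so much less here.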
Bidirectional Arrows: Double connections (e.g., between CPU and PCIe) represent bidirectional data flow. Over such a link the TPU receives input data and instructions from the Host and sends inference results back.