A minimal tensor processing unit (TPU), inspired by Google's TPU

https://news.ycombinator.com/rss Hits: 4
Summary

A minimal tensor processing unit (TPU), reinvented from Google's TPU V2 and V1.

Architecture

Processing Element (PE)
  Function: Performs a multiply-accumulate operation every clock cycle.
  Data Flow: Incoming data is multiplied by a stored weight and added to an incoming partial sum to produce an output sum. Incoming data also passes through to the next element for propagation across the array.

Systolic Array
  Architecture: A grid of processing elements, starting from 2x2.
  Data Movement: Input values flow horizontally across the array. Partial sums flow vertically down the array. Weights remain fixed within each processing element during computation.
  Input Preprocessing: Input matrices are rotated 90 degrees (implemented in hardware). Inputs are staggered for correct computation in the systolic array. Weight matrices are transposed and staggered to align with mathematical formulas.

Vector Processing Unit (VPU)
  Performs element-wise operations after the systolic array.
  Control: Module selection depends on the computation stage.
  Modules (pipelined): Bias addition, Leaky ReLU activation function, MSE loss, Leaky ReLU derivative.

Unified Buffer (UB)
  Dual-port memory for storing intermediate values.
  Stored Data: Input matrices, weight matrices, bias vectors, post-activation values for backpropagation, activation leak factors, and the inverse batch size constant for MSE backpropagation.
  Interface: Two read and two write ports per data type. Data is accessed by specifying a start address and count. Reads can occur continuously in the background until the requested count is reached.

Control Unit
  Instruction width: 94 bits. See Instruction Set section below for more information.

Instruction Set

Our ISA is 94 bits wide. The full image is available in the images/ folder.
Our ISA defines all n...
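
The data flow described in the Architecture section can be illustrated with a small cycle-level software model. This is only a sketch under stated assumptions, not the project's hardware design: the names pe_step, systolic_matmul, and vpu_forward are hypothetical, each PE is assumed to register its outputs for one cycle, and inputs are staggered by one cycle per row. It does not reproduce the project's exact rotation and transposition scheme, and it models only the bias and Leaky ReLU stages of the VPU.

```python
def pe_step(a_in, psum_in, weight):
    """One clock cycle of a processing element (hypothetical helper).

    Multiplies the incoming value by the stored weight, adds the incoming
    partial sum, and passes the incoming value through unchanged.
    """
    return a_in, psum_in + a_in * weight


def systolic_matmul(x, w):
    """Cycle-level sketch of x @ w on a weight-stationary array.

    x is a batch-by-K list of lists and w is K-by-N. PE (k, n) holds
    w[k][n]. Inputs move left to right; partial sums move top to bottom.
    """
    batch, k_dim, n_dim = len(x), len(w), len(w[0])
    # Registered outputs of every PE, all starting at zero (assumption).
    a_reg = [[0] * n_dim for _ in range(k_dim)]
    p_reg = [[0] * n_dim for _ in range(k_dim)]
    y = [[0] * n_dim for _ in range(batch)]

    for t in range(batch + k_dim + n_dim - 2):
        new_a = [[0] * n_dim for _ in range(k_dim)]
        new_p = [[0] * n_dim for _ in range(k_dim)]
        for k in range(k_dim):
            for n in range(n_dim):
                if n == 0:
                    # Staggering: row k receives x[b][k] at cycle b + k.
                    b = t - k
                    a_in = x[b][k] if 0 <= b < batch else 0
                else:
                    a_in = a_reg[k][n - 1]  # value passed from the left
                psum_in = p_reg[k - 1][n] if k > 0 else 0  # sum from above
                new_a[k][n], new_p[k][n] = pe_step(a_in, psum_in, w[k][n])
        a_reg, p_reg = new_a, new_p
        # A finished sum for batch row b leaves the bottom of column n
        # at cycle b + (k_dim - 1) + n.
        for n in range(n_dim):
            b = t - (k_dim - 1) - n
            if 0 <= b < batch:
                y[b][n] = p_reg[k_dim - 1][n]
    return y


def vpu_forward(y, bias, leak):
    """Element-wise bias addition followed by Leaky ReLU (forward only)."""
    out = []
    for row in y:
        biased = [v + b for v, b in zip(row, bias)]
        out.append([v if v > 0 else leak * v for v in biased])
    return out


if __name__ == "__main__":
    result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(result)
    print(vpu_forward(result, [0, -30], 0.5))
```

In this model a 2x2 array needs four cycles for a batch of two, because the last input must first reach the far column and its partial sum must then reach the bottom row. The stagger keeps each input aligned with the partial sum it belongs to as both move through the grid.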

First seen: 2025-08-18 21:47

Last seen: 2025-08-19 00:48