A fundamental, generalized data structure optimized for efficient numerical computations across GPUs/TPUs, enabling fast operations on model inputs, weights, and activations.

Tensors support techniques like quantization, parallelism, and caching.