CUDA GPUs have two levels of cache, L1 and L2.
- The L1 cache is private to each SM and shared by all threads running on that SM. On many architectures it is carved out of the same on-chip memory as shared memory.
- The L2 cache is shared across all SMs, so every thread on the device can access it.
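
As a rough illustration of the two levels, the sketch below queries the device-wide L2 size and the per-SM shared memory size through the CUDA runtime, then hints that a kernel would prefer more L1 than shared memory. The kernel `scale` is hypothetical and exists only to have something to configure. The carveout setting is a hint that the driver may ignore, and it only has an effect on architectures where L1 and shared memory share the same on-chip storage.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is a function to configure.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        std::printf("No CUDA device found\n");
        return 0;
    }

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::printf("Could not query device 0\n");
        return 1;
    }

    std::printf("Device: %s\n", prop.name);
    std::printf("SM count: %d\n", prop.multiProcessorCount);
    // L2 is a single cache shared by every SM on the device.
    std::printf("L2 cache size: %d bytes\n", prop.l2CacheSize);
    // Shared memory is per SM; on many architectures it shares storage with L1.
    std::printf("Shared memory per SM: %zu bytes\n",
                prop.sharedMemPerMultiprocessor);

    // Hint: prefer the largest possible L1 (smallest shared memory carveout)
    // for this kernel. The driver is free to pick a different configuration.
    cudaError_t err = cudaFuncSetAttribute(
        scale, cudaFuncAttributePreferredSharedMemoryCarveout,
        cudaSharedmemCarveoutMaxL1);
    if (err != cudaSuccess) {
        std::printf("Carveout hint not applied: %s\n", cudaGetErrorString(err));
    }
    return 0;
}
```

Compile with `nvcc`. The printed sizes depend on the GPU in use.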
Created: May 30, 2023 | Last Modified: Mar 14, 2024