NVIDIA has released cuTile BASIC, which brings the tile-based GPU programming model introduced in CUDA 13.1 to the BASIC programming language. Developers can use it to accelerate legacy BASIC applications on modern GPUs. cuTile BASIC handles parallelism and data partitioning automatically, so existing code needs only minimal syntax changes.
The article walks through vector addition and matrix multiplication examples to show how little code is required. Running cuTile BASIC requires an NVIDIA GPU with compute capability 8.x or higher, along with specific driver and toolkit versions. It also opens the possibility of running AI and scientific computing codebases written in BASIC on NVIDIA GPUs.
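To make the tile idea concrete, here is a minimal CPU-only sketch in plain Python of the two examples mentioned above. It is not the cuTile BASIC or cuTile Python API: the names `TILE`, `tile_vector_add`, and `tile_matmul` are hypothetical and exist only for illustration. The sketch only shows how work can be split into fixed-size tiles that are processed independently. On a GPU, each tile would roughly correspond to one block of work, and the partitioning would be handled by the framework.

```python
# CPU-only illustration of tile-based partitioning.
# Assumption: this is NOT the cuTile API; all names here are hypothetical.

TILE = 4  # illustrative tile size


def tile_vector_add(a, b, tile=TILE):
    """Add two equal-length lists one tile at a time."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    num_tiles = (len(a) + tile - 1) // tile  # ceiling division
    for t in range(num_tiles):  # each iteration stands in for one tile of work
        lo = t * tile
        hi = min(lo + tile, len(a))  # the last tile may be partial
        a_tile = a[lo:hi]            # "load" a tile of each input
        b_tile = b[lo:hi]
        out[lo:hi] = [x + y for x, y in zip(a_tile, b_tile)]  # "store" the result
    return out


def tile_matmul(A, B, tile=TILE):
    """Multiply non-empty matrices A (n x k) and B (k x m) tile by tile."""
    n, k, m = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions must match")
    C = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of C
        for j0 in range(0, m, tile):      # tile column of C
            for k0 in range(0, k, tile):  # walk the shared dimension in tiles
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = 0
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += A[i][kk] * B[kk][j]
                        C[i][j] += acc    # accumulate this tile's partial product
    return C


if __name__ == "__main__":
    print(tile_vector_add(list(range(10)), [1] * 10))
    print(tile_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], tile=1))
```

The tile size here is arbitrary. In a real tile-based kernel it would be chosen to fit the hardware, and the loops over tiles would run in parallel rather than sequentially.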
CUDA 13.2 brings full CUDA Tile support to the Ampere, Ada, and Blackwell architectures, and extends cuTile Python with recursive functions, closures, and custom reductions. Core updates include improved memory transfer APIs, a reduced local memory (LMEM) footprint on Windows, and a shift to the Microsoft Compute Driver Model (MCDM) for better compatibility. The math libraries gain experimental Grouped GEMM with MXFP8 and FP64-emulated cuSOLVERD. Among the developer tools, Nsight Python, Nsight Compute, and Nsight Systems receive updates, and CCCL 3.2 provides a modern C++ runtime. CuPy also gains support for CUDA 13 and stream sharing.
NVIDIA CUDA 13.1 introduces CUDA Tile, a tile-based programming model, and delivers performance gains across developer tools and libraries. It also exposes green contexts through the runtime API and ships with a rewritten CUDA programming guide.