04 / Learning Path
Compilation Pipeline
From CUDA source to PTX/SASS and architecture execution.
Topic 01
CUDA C++ -> PTX -> SASS -> Binary Pipeline
Theory
This is the complete path from source code to GPU execution, with four major stages.
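You write CUDA C++ kernels in .cu files. nvcc, the compiler driver, lowers the device code to PTX, ptxas assembles that PTX into architecture-specific SASS, and the result is packaged into a cubin or fatbin that the runtime loads. If a binary carries only PTX for the GPU it runs on, the driver JIT-compiles that PTX to SASS at load time.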
Four stages
- Stage 1 - CUDA C++: kernel code that uses CUDA qualifiers such as __global__ and __shared__ and built-in variables such as threadIdx and blockIdx.
- Stage 2 - PTX: virtual ISA (readable assembly-like intermediate representation).
- Stage 3 - SASS: real machine instructions for a specific SM target (for example sm_80 or sm_90).
- Stage 4 - cubin/fatbin: packaged binaries. A cubin holds SASS for a single target, while a fatbin bundles code for multiple targets and can also embed PTX.
your_code.cu -> nvcc -> PTX -> ptxas -> SASS -> cubin/fatbin -> GPU executes
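The four stages above can be observed on a small kernel. The following is a minimal sketch, not a reference implementation: the file name pipeline_demo.cu, the kernel name scale, and the sm_80/sm_90 targets are assumptions chosen for illustration. The commands in the leading comment use standard nvcc and cuobjdump options to stop the pipeline at each stage.

```cuda
// pipeline_demo.cu (hypothetical file name, used only for this sketch)
//
// Inspecting each stage, assuming the CUDA toolkit is on PATH:
//   Stage 2, PTX:       nvcc -ptx pipeline_demo.cu -o pipeline_demo.ptx
//   Stage 3, SASS:      nvcc -cubin -arch=sm_80 pipeline_demo.cu -o pipeline_demo.cubin
//                       cuobjdump -sass pipeline_demo.cubin
//   Stage 4, fat build: nvcc -gencode arch=compute_80,code=sm_80 \
//                            -gencode arch=compute_90,code=sm_90 \
//                            pipeline_demo.cu -o pipeline_demo
#include <cstdio>
#include <cuda_runtime.h>

// Stage 1: a kernel marked __global__ that indexes its element with the
// built-in variables blockIdx, blockDim, and threadIdx.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;  // one element per thread
    }
}

int main() {
    const int n = 1024;
    float* d_data = nullptr;

    // Allocate and zero device memory so the kernel reads defined values.
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, 2.0f, n);

    // Launch errors (for example, no SASS or PTX usable on this GPU)
    // surface through cudaGetLastError; execution errors through the sync.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    std::printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```

Comparing the .ptx file with the cuobjdump output for the same kernel shows the difference between the virtual ISA and the target-specific instructions. If the executable is built for targets that do not match the installed GPU and carries no PTX, the status line should report an error rather than success.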