Understanding the properties and capabilities of the GPU devices in use is important for optimizing CUDA applications. A key part of CUDA device management is retrieving the properties of a GPU device, such as its name, memory bus width, total global memory, and memory clock rate. This information helps you decide how best to use the device for a specific application. This tutorial shows how to get CUDA device properties using C++.
The following code retrieves and prints several properties of each CUDA device in the system, including the device name, compute capability, total global memory, and maximum threads per block. You can access other properties from the cudaDeviceProp structure as needed.
Let's dive into the code. We use the cudaGetDeviceCount function to retrieve the number of available CUDA devices and store it in the deviceCount variable. For each device, we declare a cudaDeviceProp structure to hold its properties. The cudaGetDeviceProperties function retrieves the properties of that device and stores them in the structure. Finally, various device properties are printed to the console.
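#include <cstdlib>
#include <iostream>
#include <cuda_runtime.h>

int main()
{
    // Ask the runtime how many CUDA-capable devices are visible.
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::cerr << "cudaGetDeviceCount failed: " << cudaGetErrorString(err) << std::endl;
        return EXIT_FAILURE;
    }

    for (int i = 0; i < deviceCount; ++i) {
        // Query the properties of device i.
        cudaDeviceProp prop{};
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            std::cerr << "cudaGetDeviceProperties failed: " << cudaGetErrorString(err) << std::endl;
            return EXIT_FAILURE;
        }

        std::cout << "Device Number: " << i << std::endl;
        std::cout << " Device Name: " << prop.name << std::endl;
        std::cout << " Compute Capability: " << prop.major << "." << prop.minor << std::endl;
        std::cout << " Total Global Memory (bytes): " << prop.totalGlobalMem << std::endl;
        std::cout << " Max Threads per Block: " << prop.maxThreadsPerBlock << std::endl;
        // Note: on recent CUDA toolkits this field may be deprecated or removed;
        // cudaDeviceGetAttribute with cudaDevAttrMemoryClockRate is the alternative.
        std::cout << " Memory Clock Rate (kHz): " << prop.memoryClockRate << std::endl;
        std::cout << " Memory Bus Width (bits): " << prop.memoryBusWidth << std::endl;
        std::cout << " L2 Cache Size (bytes): " << prop.l2CacheSize << std::endl;
    }
    return 0;
}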
Here's an example of the output you might see in the console:
Device Number: 0
 Device Name: NVIDIA GeForce RTX 3070 Laptop GPU
 Compute Capability: 8.6
 Total Global Memory (bytes): 8361017344
 Max Threads per Block: 1024
 Memory Clock Rate (kHz): 6001000
 Memory Bus Width (bits): 256
 L2 Cache Size (bytes): 4194304
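The raw numbers in the sample output are easier to reason about once converted. As a rough sketch, the two helpers below convert total global memory to GiB and estimate theoretical peak memory bandwidth from the memory clock rate and bus width. The names bytesToGiB and peakBandwidthGBs are hypothetical and not part of the CUDA API. The factor of 2 assumes double-data-rate memory, which is a common rule of thumb rather than a guarantee for every device, and the result is a theoretical upper bound, not a measured figure.

```cpp
#include <cstddef>

// Hypothetical helper: convert a byte count (e.g. prop.totalGlobalMem) to GiB.
double bytesToGiB(std::size_t bytes)
{
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// Hypothetical helper: estimate theoretical peak memory bandwidth in GB/s.
//   memoryClockKHz : memory clock rate in kHz (e.g. prop.memoryClockRate)
//   busWidthBits   : memory bus width in bits (e.g. prop.memoryBusWidth)
// Assumption: double-data-rate memory, i.e. two transfers per clock cycle.
double peakBandwidthGBs(int memoryClockKHz, int busWidthBits)
{
    const double transfersPerSecond = 2.0 * memoryClockKHz * 1000.0;
    const double bytesPerTransfer = busWidthBits / 8.0;
    return transfersPerSecond * bytesPerTransfer / 1.0e9;
}
```

With the values from the sample output, bytesToGiB(8361017344) comes to roughly 7.79 GiB, and peakBandwidthGBs(6001000, 256) comes to about 384 GB/s under the double-data-rate assumption.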