GPU sensor for PRTG Network Monitor

Submitted by zubanst on Mon, 06/07/2021 - 15:00

Little has been written about how to monitor GPU load in PRTG. Here is a complete tutorial for Windows servers with NVIDIA GPUs.

Prerequisites:

Target machine and PRTG host must be in the same domain

On the target machine

Add C:\Program Files\NVIDIA Corporation\NVSMI\ to the system Path variable in the Windows environment variables.
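If you prefer to script this step, here is a minimal sketch, meant for an elevated PowerShell session. Add-PathEntry and Install-NvsmiPath are hypothetical helper names (not part of PRTG or the NVIDIA tools), and the default folder is simply the one assumed in this tutorial.

```powershell
# Hypothetical helper: returns $PathValue with $Entry appended, unless it is already listed.
function Add-PathEntry {
    param([string]$PathValue, [string]$Entry)
    # Split on ';' and drop empty segments
    $parts = @($PathValue -split ';' | Where-Object { $_ -ne '' })
    # Compare without trailing backslashes (-contains is case-insensitive for strings)
    $known = @($parts | ForEach-Object { $_.TrimEnd('\') })
    if ($known -contains $Entry.TrimEnd('\')) { return ($parts -join ';') }
    return (($parts + $Entry) -join ';')
}

# Hypothetical helper: applies the change to the machine-wide Path (needs administrator rights).
function Install-NvsmiPath {
    param([string]$Folder = 'C:\Program Files\NVIDIA Corporation\NVSMI')
    $current = [Environment]::GetEnvironmentVariable('Path', 'Machine')
    $updated = Add-PathEntry -PathValue $current -Entry $Folder
    if ($updated -ne $current) {
        [Environment]::SetEnvironmentVariable('Path', $updated, 'Machine')
    }
}
```

Call Install-NvsmiPath once, then open a new PowerShell window so that the updated Path is picked up.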

In a PowerShell window, run nvidia-smi -L to list the available GPUs. The result should look something like:

PS C:\Users\zubanst> nvidia-smi -L

GPU 0: Quadro RTX 8000 (UUID: GPU-66fb0c09-78d8-a817-8342-6e42ab1579a1)

For each GPU, run the command nvidia-smi --id=GPU-66fb0c09-78d8-a817-8342-6e42ab1579a1 --query-gpu=utilization.gpu --format="csv,nounits,noheader", where --id is the GPU UUID.

This should return the current GPU utilization as a percentage.

Now that we can read GPU usage from PowerShell, we need to allow remote execution of PowerShell commands on the target machine. In a PowerShell window running as administrator, execute

Enable-PSRemoting -Force
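Before moving on to the PRTG host, it can be handy to check all GPUs of the target machine in one call instead of one command per UUID. The following is a sketch under the assumption that nvidia-smi is on the Path; ConvertFrom-GpuCsv is a hypothetical helper name, and the query uses the index, uuid and utilization.gpu fields of --query-gpu.

```powershell
# Hypothetical helper: turns nvidia-smi CSV lines ("index, uuid, utilization") into objects.
function ConvertFrom-GpuCsv {
    param([string[]]$Lines)
    foreach ($line in $Lines) {
        if ([string]::IsNullOrWhiteSpace($line)) { continue }
        $fields = @($line -split ',' | ForEach-Object { $_.Trim() })
        [pscustomobject]@{
            Index       = [int]$fields[0]
            Uuid        = $fields[1]
            # -as yields $null instead of an error if the value is not a number
            Utilization = $fields[2] -as [int]
        }
    }
}

# Only run the query where nvidia-smi is actually on the Path
if (Get-Command nvidia-smi -ErrorAction SilentlyContinue) {
    # Arguments are single-quoted so PowerShell passes the comma-separated lists through unchanged
    $csv = & nvidia-smi '--query-gpu=index,uuid,utilization.gpu' '--format=csv,noheader,nounits'
    ConvertFrom-GpuCsv $csv | Format-Table -AutoSize
}
```

The parsing is kept in its own function so it can be tried on sample lines without a GPU at hand, for example ConvertFrom-GpuCsv '0, GPU-aaa, 56'.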

On the PRTG Host

We need to add every remote GPU-powered host as a trusted host so that we can run the nvidia-smi command on it remotely. First, let's list the current trusted hosts:

PS> Get-Item WSMan:\localhost\Client\TrustedHosts

   WSManConfig: Microsoft.WSMan.Management\WSMan::localhost\Client

Type            Name                           SourceOfValue   Value
----            ----                           -------------   -----
System.String   TrustedHosts                                   WIN-0TBTR2CPOVQ.DOM.local

To add a new trusted host, run the following in a PowerShell window as administrator, then list the trusted hosts again to check the result:

Set-Item WSMan:\localhost\Client\TrustedHosts -Value WIN-6I5T2P2U75B.DOM.local -Concatenate

PS C:\Users\zubanst> Get-Item WSMan:\localhost\Client\TrustedHosts

   WSManConfig: Microsoft.WSMan.Management\WSMan::localhost\Client

Type            Name                           SourceOfValue   Value
----            ----                           -------------   -----
System.String   TrustedHosts                                   WIN-0TBTR2CPOVQ.DOM.local,WIN-6I5T2P2U75B.DOM.local

Create a folder for PowerShell scripts, say C:\PSscripts, and start with a simple script GPU-WMI.ps1 with the following content (for simplicity the GPU ID is omitted, which works as long as the tested host has only one GPU):

[string]$UTIL = $(nvidia-smi --query-gpu=utilization.gpu --format="csv,nounits,noheader")

Write-Output "${UTIL}:OK"

Now it is time to test the script remotely from the PRTG host. In PowerShell:

PS > Invoke-Command -ComputerName WIN-0TBTR2CPOVQ.DOM.local -FilePath C:\PSscripts\GPU-WMI.ps1

56:OK
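On a host with more than one GPU, the query in GPU-WMI.ps1 returns one line per GPU, while the script is expected to print a single value:message line. Here is a minimal sketch of one way to handle that, reporting the busiest GPU as the value. Format-PrtgGpuResult is a hypothetical helper name, and taking the maximum rather than, say, the average is only one possible choice.

```powershell
# Hypothetical variant of GPU-WMI.ps1 for hosts with several GPUs.
function Format-PrtgGpuResult {
    param([string[]]$Lines)
    # Keep only purely numeric lines; "$_" also copes with a $null input
    $values = @($Lines | ForEach-Object { "$_".Trim() } |
        Where-Object { $_ -match '^\d+$' } | ForEach-Object { [int]$_ })
    if ($values.Count -eq 0) { return '0:No GPU utilization reported' }
    $max = [int]($values | Measure-Object -Maximum).Maximum
    $detail = 'GPU loads ' + ($values -join ', ')
    return "${max}:$detail"
}

# Only run the query where nvidia-smi is actually on the Path
if (Get-Command nvidia-smi -ErrorAction SilentlyContinue) {
    $lines = & nvidia-smi '--query-gpu=utilization.gpu' '--format=csv,nounits,noheader'
    Write-Output (Format-PrtgGpuResult $lines)
}
```

With two GPUs at 12 % and 56 %, the function returns 56:GPU loads 12, 56, so the sensor value is the highest load and the message lists every GPU.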

For single-GPU machines we can keep this configuration; for multi-GPU target machines, the GPU-WMI.ps1 script must be adapted.

Now that the remote script works, it is time to add the sensor in PRTG. First, the custom script has to be placed in the PRTG custom sensors folder: in C:\Program Files (x86)\PRTG Network Monitor\Custom Sensors\EXE, add your custom script, here named Powershell Script GPU WMI WIN-0TBTR2CPOVQ.ps1, with the following content:

[string]$UTIL = $(Invoke-Command -ComputerName WIN-0TBTR2CPOVQ.DOM.local -FilePath C:\PSscripts\GPU-WMI.ps1)

# GPU-WMI.ps1 already returns the value with :OK appended, so pass it through unchanged

Write-Output $UTIL

exit 0

You can keep it simple and have one script per GPU-powered target host, keeping the hostname in the filename for easy reference, or write a more complex generic one, which is outside the scope of this tutorial. Once the script is in place, we can add the sensor.
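For readers who want a starting point for the generic variant anyway, here is a rough sketch, not a tested PRTG script. Get-GpuSensorResult and its -Query parameter are hypothetical names introduced here. The sketch assumes the host name reaches the script as a parameter (PRTG can fill the sensor's Parameters field with a placeholder such as %host) and that exit codes 2 and 4 mean system error and content error, as PRTG documents for EXE/Script sensors; verify both against your PRTG version.

```powershell
# Hypothetical generic wrapper: one script for all GPU hosts, with basic error handling.
function Get-GpuSensorResult {
    param(
        [string]$ComputerName,
        # The remote call is injectable so the function can be tried without a real host;
        # by default it runs the GPU-WMI.ps1 script from this tutorial on the target.
        [scriptblock]$Query = {
            param($Name)
            Invoke-Command -ComputerName $Name -FilePath 'C:\PSscripts\GPU-WMI.ps1' -ErrorAction Stop
        }
    )
    try {
        $raw = [string](& $Query $ComputerName)
        # Expect "number:message" and keep only the number
        if ($raw -match '^\s*(\d+):') {
            return [pscustomobject]@{ Text = "$($Matches[1]):OK"; ExitCode = 0 }
        }
        return [pscustomobject]@{ Text = "0:Unexpected output from $ComputerName"; ExitCode = 4 }
    }
    catch {
        return [pscustomobject]@{ Text = "0:Query failed for $ComputerName"; ExitCode = 2 }
    }
}

# In the sensor script itself, after a top-level param([string]$ComputerName):
#   $result = Get-GpuSensorResult -ComputerName $ComputerName
#   Write-Output $result.Text
#   exit $result.ExitCode
```

Returning the text and the exit code together keeps the decision logic in one place, so the sensor shows an error when the target is unreachable instead of a stale or empty value.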

In the PRTG admin page, choose your device and add the sensor.

In the sensor definition, choose the sensor from the drop-down menu.

Set security context to Windows credentials

Later you can rename the sensor, as I did, to GPU Load.

Enjoy. Should you have questions, contact me at zuban@pennyitsupport.eu