
Monitoring with PowerShell: SMART status via CrystalDiskInfo

In a peer group I belong to, we recently had a short discussion about monitoring the SMART status of hard drives. We all agreed that SMART monitoring through RMM systems is often unreliable, because most RMM systems only read the Windows SMART output, which lacks some critical values you should monitor. SMART itself can be a pretty decent early warning system when you use all the values it supplies.

To resolve this, I’ve created a monitoring set that uses CrystalDiskInfo, a tool by Crystal Dew World that presents the SMART values in a nice overview. We’ve used it in the past to manually troubleshoot disks or check them for predictive failures, and figured we should automate the same check. This piece of PowerShell makes SMART monitoring more flexible and reliable, because we alert on more information than just the predicted failure values.

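The script relies on Invoke-WebRequest and Expand-Archive. Expand-Archive was introduced in PowerShell 5.0, so you need at least that version: it is included with Windows 10, and on Windows 8.1 it can be installed through Windows Management Framework 5.x.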
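
If you want to confirm a machine meets that requirement before deploying, a minimal sketch like the one below checks for the two cmdlets the script calls, Invoke-WebRequest and Expand-Archive. The variable names `$RequiredCmdlets` and `$Missing` are my own for illustration and are not part of the monitoring script.

```powershell
# Cmdlets the monitoring script depends on.
$RequiredCmdlets = 'Invoke-WebRequest', 'Expand-Archive'

# Collect every required cmdlet that is not available in this session.
$Missing = $RequiredCmdlets | Where-Object { -not (Get-Command $_ -ErrorAction SilentlyContinue) }

if ($Missing) {
    Write-Output "Missing cmdlets: $($Missing -join ', '). PowerShell version is $($PSVersionTable.PSVersion)."
} else {
    Write-Output "All required cmdlets are available."
}
```

You could run this as a separate pre-check in your RMM and only deploy the monitor to machines that report all cmdlets as available.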

The script

As always, the script is self-explanatory. Please upload the zip file with the latest version of CrystalDiskInfo to your own web server or another location you control, and point the download URL at it. The script also creates a folder in the Program Files directory and unzips CrystalDiskInfo there.

# Replace the download URL with the location where you've uploaded the ZIP file yourself. We only download this file once.
$DownloadURL = "http://rwthaachen.dl.osdn.jp/crystaldiskinfo/71535/CrystalDiskInfo8_3_0.zip"
$DownloadLocation = "$($Env:ProgramFiles)\CrystalDiskInfo"

# Download and unpack CrystalDiskInfo only if the folder does not exist yet.
if (!(Test-Path $DownloadLocation)) {
    New-Item $DownloadLocation -ItemType Directory -Force | Out-Null
    Invoke-WebRequest -Uri $DownloadURL -OutFile "$DownloadLocation\CrystalDiskInfo.zip"
    Expand-Archive "$DownloadLocation\CrystalDiskInfo.zip" -DestinationPath $DownloadLocation -Force
}

# Start CrystalDiskInfo with the /CopyExit parameter. This collects the SMART information in DiskInfo.txt and exits.
Start-Process "$DownloadLocation\DiskInfo64.exe" -ArgumentList "/CopyExit" -Wait

# Grab the S.M.A.R.T. header line plus the 16 lines that follow it.
$DiskInfoRaw = Get-Content "$DownloadLocation\DiskInfo.txt" | Select-String "-- S.M.A.R.T. --------------------------------------------------------------" -Context 0,16
# Skip the header and column-title lines, then parse the remaining rows as space-delimited ID and raw value pairs.
$DiskInfo = $DiskInfoRaw -split "`n" | Select-Object -Skip 2 | Out-String | ConvertFrom-Csv -Delimiter " " -Header "NOTUSED1","NOTUSED2","ID","RawValue" | Select-Object ID,RawValue

# Raw values are hexadecimal strings; prefixing "0x" lets the [int64] cast convert them.
# The IDs below are the NVMe SMART attribute IDs as listed by CrystalDiskInfo.
[int64]$CriticalWarnings = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "01" }).RawValue
# The raw temperature is in Kelvin, so subtract 273.15 to get degrees Celsius.
[int64]$CompositeTemp = [int64]("0x" + ($DiskInfo | Where-Object { $_.ID -eq "02" }).RawValue) - 273.15
[int64]$AvailableSpare = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "03" }).RawValue
[int64]$ControllerBusyTime = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0A" }).RawValue
[int64]$PowerCycles = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0B" }).RawValue
[int64]$PowerOnHours = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0C" }).RawValue
[int64]$UnsafeShutdowns = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0D" }).RawValue
[int64]$IntegrityErrors = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0E" }).RawValue
[int64]$InformationLogEntries = "0x" + ($DiskInfo | Where-Object { $_.ID -eq "0F" }).RawValue
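
To make the conversion at the end of the script easier to follow, here is a small worked example of what happens to a single raw value. The value `000000000135` is a made-up sample, and `$ExampleRawValue`, `$Kelvin`, `$Celsius` and `$Rounded` are names I picked for illustration only.

```powershell
# Hypothetical raw value as it could appear in DiskInfo.txt (hexadecimal, zero padded).
$ExampleRawValue = '000000000135'

# Prefixing "0x" lets PowerShell's type conversion parse the string as hexadecimal.
[int64]$Kelvin = '0x' + $ExampleRawValue   # 0x135 is 309 in decimal

# The composite temperature is reported in Kelvin; subtract 273.15 for degrees Celsius.
$Celsius = $Kelvin - 273.15                # roughly 35.85

# Casting to [int64] rounds to the nearest whole number.
[int64]$Rounded = $Celsius                 # 36

Write-Output "Raw $ExampleRawValue = $Kelvin K = $Rounded degrees Celsius"
```

The other counters go through the same hexadecimal conversion, just without the temperature offset.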

The output variables will always contain data, which you can use to set thresholds in your RMM system. The thresholds I would use are:

  • $CriticalWarnings = 0
  • $CompositeTemp = 55 (This is 55 degrees Celsius)
  • $AvailableSpare = 50 (This value is the percentage of spare capacity remaining, so 50 means half of the spare blocks are still available. This is extremely preventive so you might want to tune it to your personal preference)
  • $ControllerBusyTime = Not monitored; I currently only log this for reporting purposes
  • $PowerCycles = Not monitored; I currently only log this for reporting purposes
  • $PowerOnHours = 40000 (This is roughly 4.5 years of constant runtime.)
  • $UnsafeShutdowns = 365 (I like to know if users are not shutting down their computers normally. This could also point at other software related problems.)
  • $IntegrityErrors = 1 (This is what Windows normally reports on. We want to know as soon as these issues arise)
  • $InformationLogEntries = 1 (How many error log entries the drive has recorded for SMART-related events)
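
How you apply these thresholds depends on your RMM, but as a rough sketch the comparison could look like the block below. The `$Readings` values are made up for illustration; in practice they would come from the variables the script sets. The `$Thresholds` table and the Max/Min convention are my own assumption about how to read the list above: alert when a reading reaches Max, or drops below Min for the available spare.

```powershell
# Sample readings (made up); in practice use the variables set by the monitoring script.
$Readings = @{
    CriticalWarnings      = 0
    CompositeTemp         = 61
    AvailableSpare        = 100
    PowerOnHours          = 12000
    UnsafeShutdowns       = 14
    IntegrityErrors       = 0
    InformationLogEntries = 0
}

# Assumed interpretation: alert when a reading is at or above Max, or below Min.
$Thresholds = @{
    CriticalWarnings      = @{ Max = 1 }
    CompositeTemp         = @{ Max = 55 }
    AvailableSpare        = @{ Min = 50 }
    PowerOnHours          = @{ Max = 40000 }
    UnsafeShutdowns       = @{ Max = 365 }
    IntegrityErrors       = @{ Max = 1 }
    InformationLogEntries = @{ Max = 1 }
}

# Build one alert line per reading that crosses its threshold.
$Alerts = foreach ($Name in $Thresholds.Keys) {
    $Value = $Readings[$Name]
    $Rule  = $Thresholds[$Name]
    if ($Rule.ContainsKey('Max') -and $Value -ge $Rule.Max) { "$Name is $Value (threshold $($Rule.Max))" }
    if ($Rule.ContainsKey('Min') -and $Value -lt $Rule.Min) { "$Name is $Value (minimum $($Rule.Min))" }
}

if ($Alerts) { $Alerts } else { 'Healthy' }
```

With the sample readings only the temperature crosses its threshold, so a single alert line is returned. Keeping the thresholds in one table makes it easy to tune them per client without touching the comparison logic.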

I hope this helps MSPs that are having issues with SMART monitoring in their RMM systems. As always, Happy PowerShelling!

All blogs are posted under AGPL3.0 unless stated otherwise