A peer group I am a member of recently had a small discussion about monitoring the SMART status of hard drives. We all agreed that SMART monitoring through RMM systems is often unreliable. This is because RMM systems typically use only the Windows SMART output, which lacks some critical values you should monitor. SMART itself can be a pretty decent early warning system when you use all the values it supplies.
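To illustrate how little the built-in Windows output can tell you, here is a minimal sketch that reads the predicted-failure flag from WMI. It assumes this is the kind of data an RMM collects, which will differ per product; the function name `Get-WindowsPredictedFailure` is made up for this example, and the query needs Windows and an elevated session.

```powershell
function Get-WindowsPredictedFailure {
    # Hypothetical helper: reads the WMI failure-prediction status per disk.
    # It returns little more than a true/false flag and a reason code,
    # not temperatures, spare capacity, or error counters.
    Get-CimInstance -Namespace 'root\wmi' -ClassName 'MSStorageDriver_FailurePredictStatus' -ErrorAction Stop |
        Select-Object -Property InstanceName, PredictFailure, Reason
}
```

Running `Get-WindowsPredictedFailure` on a workstation should list one row per disk that exposes this class.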
To resolve this, I’ve created a monitoring set that uses CrystalDiskInfo, a tool made by CrystalMark that presents the values in a nice overview. We’ve used it in the past to manually troubleshoot or check disks for predicted failures, and figured we should try to automate the same checks. This piece of PowerShell makes SMART monitoring more agile and reliable, because we alert on more information than just the predicted failure values.
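The script relies on Invoke-Expression and Expand-Archive, so at least Windows 8.1 is required. Note that Expand-Archive was introduced in PowerShell 5.0, so systems that shipped with an older PowerShell version need a newer Windows Management Framework installed first.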
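If you are unsure whether a machine meets these requirements, a quick pre-flight check along these lines can help. This is only a sketch; the variable names are my own and not part of the script.

```powershell
# Check that the cmdlets the script depends on exist in this session.
$RequiredCommands = 'Invoke-Expression', 'Expand-Archive'
$MissingCommands = $RequiredCommands | Where-Object {
    -not (Get-Command -Name $_ -ErrorAction SilentlyContinue)
}

if ($MissingCommands) {
    Write-Output "Missing cmdlets: $($MissingCommands -join ', ') (PowerShell $($PSVersionTable.PSVersion))"
} else {
    Write-Output 'All required cmdlets are available.'
}
```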
The script
As always, the script is self-explanatory. Please upload the zip file to your own web server, or point the script to wherever the latest version of CrystalDiskInfo is hosted. The script also creates a folder in the Program Files directory and unzips CrystalDiskInfo there.
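The download-and-unzip step could look roughly like the sketch below. This is not the full monitoring script, only an illustration of that one step; the function name `Install-CrystalDiskInfo`, its parameters, and the zip file name are assumptions for this example.

```powershell
function Install-CrystalDiskInfo {
    param(
        [Parameter(Mandatory)] [string]$Url,       # where you host the zip (your own server)
        [Parameter(Mandatory)] [string]$Directory  # target folder, e.g. under Program Files
    )
    # Turn non-terminating errors into exceptions so the caller can catch them.
    $ErrorActionPreference = 'Stop'
    $ZipPath = Join-Path $Directory 'CrystalDiskInfo.zip'

    # Create the target folder if it does not exist yet.
    if (-not (Test-Path -Path $Directory)) {
        New-Item -Path $Directory -ItemType Directory -Force | Out-Null
    }
    # Download only when the archive is not already cached locally.
    if (-not (Test-Path -Path $ZipPath)) {
        Invoke-WebRequest -Uri $Url -OutFile $ZipPath -UseBasicParsing
    }
    # Unpack next to the archive, overwriting files from a previous run.
    Expand-Archive -Path $ZipPath -DestinationPath $Directory -Force
}
```

You would call it with something like `Install-CrystalDiskInfo -Url 'https://example.com/CrystalDiskInfo.zip' -Directory (Join-Path $env:ProgramFiles 'CrystalDiskInfo')`, where the URL is a placeholder for your own hosting location.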
The output variables will always contain data, which you can set thresholds against in your RMM system. The thresholds I would use are:
- $CriticalWarnings = 0
- $CompositeTemp = 55 (this is 55 degrees Celsius)
- $AvailableSpare = 50 (This means 50% of the spare capacity used for reallocation is still available. This is extremely preventive, so you might want to tune it to your personal preference)
- $ControllerBusyTime = Not monitored; currently only logged for reporting purposes
- $PowerCycles = Not monitored; currently only logged for reporting purposes
- $PowerOnHours = 40000 (This is around 4.5 years of constant runtime.)
- $UnsafeShutdowns = 365 (I like to know if users are not shutting down their computers normally. This could also point at other software related problems.)
- $IntegrityErrors = 1 (This is what Windows normally reports on. We want to know as soon as these issues arise)
- $InformationLogEntries = 1 (How many events have been generated related to disk SMART events)
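If your RMM cannot threshold on individual values, the list above can also be checked in PowerShell itself. The sketch below is one way to do that; the function name `Get-SmartAlert`, the rule table, and the comparison directions (for example, alerting when the available spare drops below its threshold) are my interpretation of the list, and the sample readings are invented.

```powershell
function Get-SmartAlert {
    param([Parameter(Mandatory)] [hashtable]$Readings)

    # One rule per monitored value; Direction says which side of the limit is bad.
    $Rules = @(
        @{ Name = 'CriticalWarnings';      Limit = 0;     Direction = 'Above' }
        @{ Name = 'CompositeTemp';         Limit = 55;    Direction = 'AtOrAbove' }
        @{ Name = 'AvailableSpare';        Limit = 50;    Direction = 'Below' }
        @{ Name = 'PowerOnHours';          Limit = 40000; Direction = 'AtOrAbove' }
        @{ Name = 'UnsafeShutdowns';       Limit = 365;   Direction = 'AtOrAbove' }
        @{ Name = 'IntegrityErrors';       Limit = 1;     Direction = 'AtOrAbove' }
        @{ Name = 'InformationLogEntries'; Limit = 1;     Direction = 'AtOrAbove' }
    )

    foreach ($Rule in $Rules) {
        # Skip values that were not supplied.
        if (-not $Readings.ContainsKey($Rule.Name)) { continue }
        # Cast to int so values parsed from text compare numerically.
        $Value = [int]$Readings[$Rule.Name]
        $Breached = switch ($Rule.Direction) {
            'Above'     { $Value -gt $Rule.Limit }
            'AtOrAbove' { $Value -ge $Rule.Limit }
            'Below'     { $Value -lt $Rule.Limit }
        }
        if ($Breached) { "$($Rule.Name) is $Value (threshold $($Rule.Limit))" }
    }
}

# Invented sample readings: only the temperature breaches its threshold.
$Example = @{
    CriticalWarnings = 0; CompositeTemp = 61; AvailableSpare = 100
    PowerOnHours = 1200; UnsafeShutdowns = 12; IntegrityErrors = 0; InformationLogEntries = 0
}
Get-SmartAlert -Readings $Example
```

With these sample readings the function returns a single line, `CompositeTemp is 61 (threshold 55)`. An empty result means the disk is healthy according to the thresholds.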
I hope this helps MSPs that are having issues with SMART monitoring in their RMM systems. As always, Happy PowerShelling!