Samsung 970 EVO: HDS says "is PERFECT" but actual S.M.A.R.T.

HaroldFinch
Posts: 5
Joined: 2021.08.11. 17:14

Post by HaroldFinch »

I have a Samsung 970 EVO 2 TB SSD that was the main boot drive of my workstation laptop.

Several days ago the laptop would boot Windows but never display the login password field, so it was impossible to get in. I attempted a Windows restore, which failed, and now the drive cannot boot Windows at all.

I eventually removed it from the laptop, put it on an M.2-to-PCIe adapter card, and examined it with Hard Disk Sentinel and Samsung Magician.

Hard Disk Sentinel gives this glowing report on its Overview tab:

	The status of the solid state disk is PERFECT. Problematic or weak sectors were not found. 
	The health is determined by SSD specific S.M.A.R.T. attribute(s):  Available Spare (Percent), Percentage Used
	The TRIM feature of the SSD is supported and enabled for optimal performance.
	
	No actions needed.
But when I look at its S.M.A.R.T. tab, I see 3 attributes that disturb me:

	Unsafe Shutdowns                            166
	Media and Data Integrity Errors             870
	Number of Error Information Log Entries  42,463
There is no way that my computer was hard powered off 166 times in the < 3 years that I have owned this laptop, so where does Unsafe Shutdowns = 166 come from?

Most disturbing of all is Media and Data Integrity Errors = 870. Is it lots of NAND cells failing? What does it actually mean? How can Hard Disk Sentinel's Overview tab claim "Problematic or weak sectors were not found"?

Aside: hdsentinel, if you are reading this, you should reword that part of Overview to directly say "No problematic or weak sectors were found"; your phrasing puts the negation ("not") almost at the very end.

Finally, it claims Number of Error Information Log Entries = 42,463. That sounds like tons of errors are being logged, but when I look at Hard Disk Sentinel's Log tab it says "No problems logged". Another mystery.

If anyone has any insight into these numbers and can advise me as to my drive's health, I would be grateful.

I also examined that SSD with Samsung Magician:
  • under "Drive Details" the main thing is that I was able to export all the S.M.A.R.T. data as a .csv file that I have attached to this post
  • under "Performance Benchmark" I ran a speed test and got sequential read / write = 1,618 / 1,486 MB/s, random read / write = 339,111 / 292,480 IOPS
    --those speeds are low: the review below claims that I should see sequential read / write = 3,500 / 2,500 MB/s, random read / write = 500,000 / 500,000 IOPS
    https://www.storagereview.com/review/sa ... evo-review
  • under "Diagnostic Scan": The selected drive does not support this feature (too bad, the successor model, the 970 Evo Plus, does)
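
To put "those speeds are low" into numbers, here is a small sketch that expresses each measured figure as a percentage of the rated figure quoted from the review. The dictionary keys and the function name percent_of_rated are made up for illustration; the values are the ones listed above.

```python
# Measured values: the Samsung Magician benchmark run quoted above.
measured = {
    "seq_read_MBps": 1618,
    "seq_write_MBps": 1486,
    "rand_read_IOPS": 339111,
    "rand_write_IOPS": 292480,
}

# Rated values: the figures quoted from the linked review.
rated = {
    "seq_read_MBps": 3500,
    "seq_write_MBps": 2500,
    "rand_read_IOPS": 500000,
    "rand_write_IOPS": 500000,
}


def percent_of_rated(measured, rated):
    """Return each measured value as a percentage of its rated value."""
    return {name: round(100 * measured[name] / rated[name], 1) for name in measured}


for name, pct in percent_of_rated(measured, rated).items():
    print(f"{name}: {pct}% of rated")
```

By this calculation the drive reaches roughly 46% to 68% of the rated figures, depending on the test.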
I am sufficiently concerned that I may send in this drive to Samsung support for evaluation.
hdsentinel
Site Admin
Posts: 3128
Joined: 2008.07.27. 17:00
Location: Hungary

Re: Samsung 970 EVO: HDS says "is PERFECT" but actual S.M.A.

Post by hdsentinel »

According to this information, the SSD may have serious problems, but it may not.

As you saw, Unsafe Shutdowns = 166. Yes, this counter generally reflects hard "power off" situations, but it can also increase when the SSD is not 100% securely seated in its slot: a temporary connection issue, where the system can't communicate with the SSD (and/or the SSD loses power for a short time), can cause the same thing.

While Media and Data Integrity Errors = 870 seems serious, this amount alone is usually not critical. No, this does NOT mean that NAND cells failed: if sectors failed, the SSD would (as hard disk drives do) mark them as "bad" to prevent further use and replace them with sectors from the spare area.

As I do not see the complete status of the SSD, it is hard to say anything for sure, but if the "Available Spare (Percent)" value is 100 (which it may be), then no such reallocation has happened so far.
If the number of Media and Data Integrity Errors were higher and/or there were other indications, then of course the Health % would be lower in Hard Disk Sentinel.

The "Number of Error Information Log Entries" field reflects how many errors the SSD has logged in its internal log. This is completely independent of the Log page of Hard Disk Sentinel, which logs possible changes of some critical attributes/values. In contrast, the internal log of the SSD often logs everything (not only real "hard" errors), for example a command that is not supported, a chipset driver that does not work correctly, or similar.
We are currently working on examining those logs and displaying their contents, but the count alone does not mean trouble.
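
For anyone who wants to keep an eye on these counters outside Hard Disk Sentinel, here is a minimal sketch that pulls them out of a JSON report. It assumes the report comes from smartmontools (for example smartctl -j -a on the NVMe device) and that the key names below match that output; treat both as assumptions and check them against your own report. The function name nvme_counters and the sample string are made up for illustration, and the sample contains only the three values from this thread.

```python
import json

# Key names as assumed for smartctl's JSON output; verify against a real report.
COUNTERS = (
    "unsafe_shutdowns",
    "media_errors",
    "num_err_log_entries",
    "available_spare",
    "percentage_used",
)


def nvme_counters(smartctl_json):
    """Extract the NVMe health counters discussed above from a JSON report.

    Counters missing from the report are returned as None.
    """
    report = json.loads(smartctl_json)
    log = report.get("nvme_smart_health_information_log", {})
    return {key: log.get(key) for key in COUNTERS}


# Hand-written sample holding only the three values reported in this thread.
sample = """
{"nvme_smart_health_information_log":
  {"unsafe_shutdowns": 166, "media_errors": 870, "num_err_log_entries": 42463}}
"""

for name, value in nvme_counters(sample).items():
    print(f"{name}: {value}")
```

Available Spare and Percentage Used print as None here because they were not posted in this thread, which is exactly why the complete report is more useful than a few exported values.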

I'd suggest using the Report menu -> Send test report to developer option, as this makes it possible to check the actual, complete status of the SSD and examine the possibilities.
The complete status means MORE than the S.M.A.R.T. page: the S.M.A.R.T. values alone (especially if exported from other tools) may not give the complete picture. They may not show the current NVMe driver or the functionality/features of the SSD, which can also help to identify what the problem could be and what can be done to improve the situation.

First, I'd recommend using the latest possible Hard Disk Sentinel version, 5.70.6, because with it you can use the internal self-test functions of the SSD: Disk menu -> Short self test and Disk menu -> Extended self test (if supported by the NVMe SSD). This is probably similar to what the other tool attempted (but failed), so you may try them in Hard Disk Sentinel, as they may work even where other tools fail.

I'd also definitely use Disk menu -> Surface test -> Read test. This performs a complete (safe) read scan of the whole SSD to reveal possible issues, such as slower areas and damaged sectors, which may require attention (and may be repaired if required).
If there are no errors (no yellow or red blocks on the disk surface map; some darker green blocks are acceptable) PLUS the attributes do not change (no new problems reported), then the SSD is likely working correctly and something in the operating environment was at fault (e.g. connection, sudden power loss). This is important, because if you replace the SSD and the problem was elsewhere, there is a good chance that you'll see something similar with the new, replacement drive too.
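
The "attributes do not change" part of that check can be sketched as a simple before/after comparison of the counters. This is only an illustration, not how Hard Disk Sentinel does it: the function name changed_counters is made up, the "before" values are the ones posted in this thread, and the "after" snapshot is invented purely to show what a change would look like.

```python
def changed_counters(before, after):
    """Return {name: (old, new)} for every counter whose value differs."""
    return {
        name: (before.get(name), new)
        for name, new in after.items()
        if before.get(name) != new
    }


# Snapshot taken before the Read test (values posted in this thread).
before = {
    "Unsafe Shutdowns": 166,
    "Media and Data Integrity Errors": 870,
    "Number of Error Information Log Entries": 42463,
}

# HYPOTHETICAL snapshot after the test, invented for illustration only.
after = {**before, "Media and Data Integrity Errors": 872}

changes = changed_counters(before, after)
if changes:
    for name, (old, new) in changes.items():
        print(f"{name}: {old} -> {new}")
else:
    print("No attribute changed during the test.")
```

An empty result together with a clean surface map points to the operating environment rather than the SSD; a growing error counter during the test points back at the drive.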

> I was able to export all the S.M.A.R.T. data as a .csv file that I have attached to this post
Sorry, I see no attached files, but as I tried to explain, they might not help at all...

> under "Performance Benchmark" I ran a speed test and got sequential read
Sorry to say, but this is irrelevant. Any drive (even a failing one) can show such (or even higher) values simply because it is NOT tested properly: only a very small fraction of the sectors is checked. As long as we do not test the complete surface area (all sectors), we can't say anything for sure about the real status of the drive.

> under "Diagnostic Scan": The selected drive does not support this feature (too bad, the successor model, the 970 Evo Plus, does)
Maybe this is the internal self-test function (?) and this tool just uses a different name for it.
If the SSD does not support it, then the Short self test / Extended self test functions in Hard Disk Sentinel can't be used either, but the complete Read test mentioned above can still be used to verify the complete surface area and reveal possible issues that would otherwise be missed.


> I am sufficiently concerned that I may send in this drive to Samsung support for evaluation.

I'd definitely perform the tests before that, precisely to confirm whether the issue really is related to the SSD or not.