Crash?

JackN
Posts: 15
Joined: 2011.11.28. 01:31

Post by JackN »

A few weeks ago the health rating of one of my drives dropped from 100% to 99%. No big deal, but I was kind of surprised because the drive wasn’t in use. It only gets used once every 3 months for back-ups. Then two days ago it dropped another 2%. I didn’t like the trend I was seeing so yesterday morning I decided to pull all of the data off of it just in case. Then, just seconds before I was going to start moving data I started getting buried in “Uncorrectable Error” messages and the health rating dropped like a rock. It didn’t take very long for the health rating to drop all the way down to 13%. For some unknown reason the decline stopped there and I spent the next several hours trying to get the data off of the drive. I got lucky and was able to recover all of it except for one large file. Once I had all the data off I performed a destructive surface test just to see what would show up. Oddly enough, the disk health started going back up! It got all the way up to 89% by the time the test was done.

I don’t know what to make of it. I’m afraid to use it. Should I toss it?

Also, is the health rating simply a percentage of how much replacement storage is left?
hdsentinel
Site Admin
Posts: 3115
Joined: 2008.07.27. 17:00
Location: Hungary

Post by hdsentinel »

When the hard disk health changes (even by 1%), it is hard to say for sure, but further changes are usually to be expected.
The hard disk constantly updates its self-monitoring (S.M.A.R.T.) data whenever it finds a new problem.
With normal usage, problems may remain undetected (even for a long time) until the hard disk reads or writes the affected sector.

For example, the page
http://www.hdsentinel.com/hard_disk_cas ... ectors.php
shows such a situation: until the drive is completely filled, problems remain undetected (even for years).

This is why it is generally important to verify the status before the hard disk is used to store actual data (even on new hard disks, as things can happen during transport, shipment, installation in the computer, etc.), either to reveal problems at that point or to confirm that the hard disk is really perfect and can be used.
See
http://www.hdsentinel.com/faq.php#tests
for more details about such tests.

I completely agree with performing the backup when you saw the health decrease (especially when it happened again). It is not surprising that more and more problems are detected during actual use of the hard disk, and when this happens, the health can drop dramatically and quickly.

> I got lucky and was able to recover all of it except for one large file.

Yes, because you reacted to the changes (the new problems and degradation), you had the opportunity to save almost all of the data.

> Once I had all the data off I performed a destructive surface test just to see what would show up.
> Oddly enough, the disk health started going back up! It got all the way up to 89% by the time the test was done.

Yes, generally this is the purpose of the tests in Hard Disk Sentinel: to improve the situation, both usability and health (where possible), and to make the hard disk more stable and usable.

> I don’t know what to make of it. I’m afraid to use it. Should I toss it?

What problem was described in the text description in Hard Disk Sentinel when you saw the lowest health value?
Were weak sectors detected?
Were data communication problems detected?

These may be related to the current operating environment, e.g. they can be caused by power failure / power loss, accidental removal, insufficient power, or cables / connections. These may matter more for external drives, as I understand this may be an external (USB?) hard disk.

Both of these can cause issues, and the tests can help to improve the status (as you saw). But if the original source of the problems remains, new issues may be reported later. Maybe not tomorrow, but maybe in weeks or months.

See
http://www.hdsentinel.com/hard_disk_cases.php
for typical issues; they may help to diagnose, fix and avoid hard disk problems, with Hard Disk Sentinel and in general.

Personally, now that the destructive test has completed and things seem stable, I'd try to
- fill the hard disk with data
- perform the Disk menu -> Surface test -> Refresh data area (read+write+read) test

to verify that the complete surface can be used: that all data sectors can be both read and written, and that after rewriting, the data matches the original. All of this is checked by the Refresh data area (read+write+read) test.

If this succeeds, it confirms that the hard disk can now both record and hold data, so it may still be usable.

Also, personally I'd use it only with constant monitoring and keep empty space available elsewhere, so that I could perform a complete backup immediately when any (even minor) new problem is reported.

If you use the Report menu -> Send test report to developer option, it is possible to check the actual status and verify the currently reported problems, which may help to determine whether the drive can be trusted and used.
Also, if the status changes in the future and you use the Report menu -> Send test report to developer option again, it will be possible to compare the reports and examine the degradation.

> Also, is the health rating simply a percentage of how much replacement storage is left?

No, it is absolutely not.
The health % is calculated from the actual error counters.
Please see Help -> Appendix -> Health calculation, which shows the fundamentals of how the health is calculated.
(The actual calculation is more complicated and depends on the hard disk model and firmware, and we have not even mentioned SSDs, but that page shows the basics and the general concept.)

Post by JackN »

Thanks for the links and good info. Two days ago I did a read test and nothing changed. The health stayed at 89% and nothing new was added to the log. Yesterday I did a write test and again nothing changed. Health stayed at 89% and nothing new was added to the log. Today I did another read test and again nothing changed. Health is still at 89% and nothing new was added to the log. So there hasn’t been any log activity since doing a write test last weekend.

On the grid map of the disk surface, there are numerous slightly darker green squares on the top half with none on the bottom half. Some of the darker squares are lined up in a slanted striping type of pattern. Because this is one of the Seagate single disk 1tb drives, I’m assuming that the problems are all on one side of the disk, with the other side being fine.

I definitely don’t want to use it for current data. If I were to use it I wouldn’t trust it for anything more than back-ups. Is it safe for back-ups, or should I just toss it?

Post by JackN »

Sorry, I forgot to answer your questions.

There are 44 entries in the log. One entry was from the day before, but all the rest were put there the day of the crash. The vast majority (80%) of the problems read “Reported uncorrectable errors”. All of those errors are associated with event #187. The other 20% read “Off line uncorrectable sector count” and “Current pending sector count”. All of those errors are associated with event numbers #197 & #198.
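The numbers 187, 197 and 198 are standard S.M.A.R.T. attribute IDs. A minimal Python sketch of how such a log could be tallied per attribute is shown here. The `summarize` helper and the 35/5/4 split are hypothetical, chosen only to match the 44 entries and the roughly 80%/20% proportions described; they are not taken from the actual log.

```python
from collections import Counter

# Standard S.M.A.R.T. attribute IDs mentioned in the post.
SMART_NAMES = {
    187: "Reported Uncorrectable Errors",
    197: "Current Pending Sector Count",
    198: "Off-line Uncorrectable Sector Count",
}


def summarize(attribute_ids):
    """Return {attribute name: (count, rounded percent of all entries)}."""
    counts = Counter(attribute_ids)
    total = sum(counts.values())
    return {
        SMART_NAMES.get(attr, f"Attribute {attr}"): (n, round(100 * n / total))
        for attr, n in counts.items()
    }


# Hypothetical log: 44 entries, about 80% of them on attribute 187.
example_log = [187] * 35 + [197] * 5 + [198] * 4

for name, (count, percent) in summarize(example_log).items():
    print(f"{name}: {count} entries ({percent}%)")
```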

This is an internal drive. It’s a Seagate single platter 1tb drive. Model ST31000528AS.

A lack of power shouldn’t be the issue. The computer has a Cooler Master 1kw power supply to power the 10 internal HDDs. The computer doesn’t show any signs of being underpowered. Cables should be good. The computer is only four years old. I haven’t had any power outages since last year.

Post by hdsentinel »

The darker areas mean that access to the hard disk in that particular area is slower than expected.
A few such darker green blocks can be fine; they may indicate a problem only if they form a really long, continuous area and the color is really dark, which indicates that the area is consistently slower than expected.
For example, on this image:

[Image]

which also shows problem (red) blocks inside this darker green area.


If you want to be sure and prevent problems, you can create partition(s) that cover only the good area and use only the perfect parts of the disk surface. This limits the usable capacity, but you can be sure that the problematic area will never be read or written again, so even its degradation can't cause problems.
To do this, first find the size of the problematic area: on the disk surface map in Hard Disk Sentinel, if you move the mouse over the last darker green block, the bottom status line immediately shows the MB position of this block. You may also want to add a few more blocks (and their capacity, which is displayed in the upper right area) as a margin.
In Windows Disk Management, create a partition of this size (to cover all such blocks) but do not format it. Then create a partition in the remaining space (this is the one that will be used) and format it as needed. Then delete the first partition.
This way the problematic (slower) area ends up in unpartitioned, unused space, and the real data partition uses only the remaining, 100% perfect data area (with no delays or slower performance).

This technique works best if the number of such suspicious blocks is low and they reside at the very beginning of the disk surface. Losing 2%, 5% or even 10% of the hard disk capacity this way may be acceptable to make sure that data stored on the hard disk (even with reduced capacity) is safe.
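The arithmetic behind the partition sizes can be sketched in a few lines of Python. This is only an illustration: `plan_partitions`, its parameters and the numbers in the example are hypothetical, and the real values must be read from the surface map (the MB position of the last darker block and the per-block capacity).

```python
def plan_partitions(disk_mb, last_dark_block_start_mb, block_mb, margin_blocks=2):
    """Return (placeholder_mb, data_mb) for the two-partition layout.

    placeholder_mb: size of the first, unformatted partition that covers the
                    suspicious area (it is deleted again afterwards).
    data_mb:        size of the partition that will actually hold data.
    """
    # Cover the last dark block itself plus a few extra blocks as a margin.
    placeholder_mb = last_dark_block_start_mb + block_mb * (1 + margin_blocks)
    if placeholder_mb >= disk_mb:
        raise ValueError("suspicious area covers the whole disk")
    return placeholder_mb, disk_mb - placeholder_mb


# Hypothetical values for a 1 TB drive; not measured on the drive in this thread.
placeholder_mb, data_mb = plan_partitions(
    disk_mb=953_869, last_dark_block_start_mb=47_600, block_mb=95
)
lost_percent = 100 * placeholder_mb / (placeholder_mb + data_mb)
print(f"placeholder: {placeholder_mb} MB, data: {data_mb} MB, lost: {lost_percent:.1f}%")
```

If the computed loss is far above the 2-10% range mentioned above, the technique is probably not a good fit for that drive.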


> I’m assuming that the problems are all on one side of the disk, with the other side being fine.

No, we can't assume this.

The disk surface map is not divided that way: both sides (generally, all heads/sides of all platters) are used before the heads advance to the next track.
This is why, if you examine the disk performance graph after the test, you see a curve with the highest transfer speed at the beginning of the hard disk: these sectors are located on the outer tracks of the disk platter, where a single rotation can read/write a higher number of sectors than on the inner tracks (as sectors have a fixed size on the disk platters).
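The outer-versus-inner track effect can be illustrated with a very simplified model in Python. It assumes a constant amount of data per millimetre of track and a fixed rotation speed; the radii, the `bytes_per_mm` density and the function name are illustrative assumptions, not specifications of any real drive.

```python
import math


def track_throughput_mb_s(radius_mm, rpm=7200, bytes_per_mm=3600):
    """Approximate sequential throughput (MB/s) when reading one track.

    Simplified model: sectors have a fixed physical size, so the data held
    by a track grows with its circumference, and one full track passes
    under the head per rotation.
    """
    bytes_per_track = 2 * math.pi * radius_mm * bytes_per_mm
    rotations_per_second = rpm / 60
    return bytes_per_track * rotations_per_second / 1e6


# Assumed radii for the outermost and innermost tracks of a 3.5" platter.
outer = track_throughput_mb_s(radius_mm=46)
inner = track_throughput_mb_s(radius_mm=20)
print(f"outer: {outer:.0f} MB/s, inner: {inner:.0f} MB/s, ratio: {outer / inner:.2f}")
```

In this model the speed ratio equals the ratio of the radii, which is why the performance graph slopes downwards from the beginning of the disk (outer tracks) to the end (inner tracks).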

> I definitely don’t want to use it for current data. If I were to use it I wouldn’t trust it for anything more than back-ups.
> Is it safe for back-ups, or should I just toss it?

I completely agree with you. Personally, I'd also recommend it only for storing secondary data, not for intensive use.

Post by JackN »

Thanks for an explanation of how the map works. Knowing that, now I really don't trust it. I get the feeling a catastrophic failure isn't too far down the road. I'm going to replace it. One of the other drives I have in the same machine has lots of darker spots also. I think I'll replace that one too. Thanks again.

Post by hdsentinel »

Please note that in some cases the background activity of the operating system and/or other tools may cause slightly darker blocks.

Hard Disk Sentinel tries to lock the hard disk drive for the test, to prevent any other software (and the OS) from accessing it during testing, but this may not be possible if there are open files/folders on a partition of that hard disk (for example, if the hard disk contains the system itself).

If the number of such darker areas is low, they are not too dark and they appear at random, it is not a big problem. If you re-run the test, they may no longer appear, which confirms that the hard disk is usable.