Hitachi 7k1000 1TB drive failures
Posted: Sun Feb 06, 2011 4:37 pm
I've always used these drives in my servers as they are cheap and have good performance for their price.
I had 3 in RAID 5 and they ran great; they are still running now. Maybe a month ago I added 2 more drives, 7k1000.c to be exact (honestly not sure what the others are, but they are a few years older). BOTH are failing. I ordered 3 more and added two of them so I could rebuild the RAID onto them and pull out the failing ones... the two new ones are failing too!
I am currently trying the 3rd new one. This time I'm going to give it more burn-in time before I rebuild the RAID onto it. So far so good...
Also, one of the 3 newer ones only had one error, and it seems to have cleared, but I still don't trust these drives anymore.
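For anyone wanting to do a similar burn-in, here is a rough sketch of one possible routine, not necessarily what was done here. It only builds the command lines and prints them so they can be reviewed before running anything as root. The function name burn_in_plan and the device /dev/sdX are placeholders made up for this example; the smartctl and badblocks options used are the standard ones (badblocks without -n or -w is a read-only scan).

```python
import shlex


def burn_in_plan(device):
    """Return argv lists for a simple, read-only burn-in of one drive."""
    return [
        # Baseline: health verdict, attributes and logs before any testing
        ["smartctl", "-a", device],
        # Read-only surface scan of the whole device (badblocks default mode)
        ["badblocks", "-sv", device],
        # Start the drive's extended (long) self-test
        ["smartctl", "-t", "long", device],
        # Once the test has had time to finish: self-test log, then attributes
        ["smartctl", "-l", "selftest", device],
        ["smartctl", "-A", device],
    ]


if __name__ == "__main__":
    # /dev/sdX is a placeholder; substitute the drive being tested
    for argv in burn_in_plan("/dev/sdX"):
        print(shlex.join(argv))
```

Going by the smartctl output further down, the extended self-test on these drives reports a recommended polling time of roughly 167 to 175 minutes, so the self-test log is worth checking only after that long.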
Here's some error output of a few of the drives:
Sure, it's normal for drives to fail every now and then, but 4 in a row?!
Drive 1:
[root@borg rsbackup]# smartctl -a /dev/sde
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright © 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: Hitachi HDS721010CLA332
Serial Number: JP2940HD034DRC
Firmware Version: JP4OA3EA
User Capacity: 1,000,204,886,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Feb 5 20:07:19 2011 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (9988) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 167) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 096 096 016 Pre-fail Always - 262148
2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 97
3 Spin_Up_Time 0x0007 118 118 024 Pre-fail Always - 320 (Average 322)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 24
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 138 138 020 Pre-fail Offline - 31
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 2252
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 23
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 24
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 24
194 Temperature_Celsius 0x0002 253 253 000 Old_age Always - 23 (Lifetime Min/Max 18/31)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 3
SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1 occurred at disk power-on lifetime: 1838 hours (76 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 09 67 0d ad 09
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 28 30 48 0d ad 40 00 20d+04:03:37.681 READ FPDMA QUEUED
60 10 28 80 09 ad 40 00 20d+04:03:37.680 READ FPDMA QUEUED
60 28 20 30 f4 ac 40 00 20d+04:03:37.680 READ FPDMA QUEUED
60 10 00 70 51 10 40 00 20d+04:03:37.680 READ FPDMA QUEUED
60 10 30 48 71 ed 40 00 20d+04:03:37.674 READ FPDMA QUEUED
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
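The register dump for "Error 1" in the drive 1 output can be decoded by hand. Below is a minimal sketch, assuming the usual ATA bit names for the Error and Status registers and a plain 28-bit LBA layout in SN/CL/CH/DH. The failed commands were NCQ reads, which use 48-bit addressing, so the LBA is only the view this summary log gives. The helper names decode_bits and lba28 are made up for this example.

```python
# Bit names follow the usual ATA register layout (an assumption of this sketch;
# some bits carry different names in older or newer ATA revisions).
ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # requested sector ID not found
    0x04: "ABRT",  # command aborted
}
STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",   # legacy name: drive seek complete
    0x08: "DRQ",
    0x01: "ERR",
}


def decode_bits(value, names):
    """Return the names of all known bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]


def lba28(sn, cl, ch, dh):
    """Combine a 28-bit LBA from SN, CL, CH and the low nibble of DH."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers copied from "Error 1" above, in the order ER ST SC SN CL CH DH
er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in "84 51 09 67 0d ad 09".split())

error_flags = decode_bits(er, ERROR_BITS)
status_flags = decode_bits(st, STATUS_BITS)
lba = lba28(sn, cl, ch, dh)
print(error_flags, status_flags, hex(lba))
```

Under that reading, ER=84 is ICRC plus ABRT, i.e. an interface CRC error on an aborted command, which lines up with the UDMA_CRC_Error_Count of 3 in the attribute table for this drive. CRC errors of this kind are often blamed on the cable or port rather than the platters, though this one log entry can't settle that either way.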
Drive 2:
[root@borg rsbackup]# smartctl -a /dev/sdf
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright © 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: Hitachi HDS721010CLA332
Serial Number: JP2940HD032PMC
Firmware Version: JP4OA3EA
User Capacity: 1,000,204,886,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Sat Feb 5 20:08:10 2011 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (10517) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 175) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 094 094 016 Pre-fail Always - 786447
2 Throughput_Performance 0x0005 134 134 054 Pre-fail Offline - 100
3 Spin_Up_Time 0x0007 120 120 024 Pre-fail Always - 301 (Average 329)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 21
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 30
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 2252
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 19
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 21
194 Temperature_Celsius 0x0002 253 253 000 Old_age Always - 23 (Lifetime Min/Max 18/30)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
SMART Error Log Version: 0
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
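To compare the two attribute tables without eyeballing them, here is a small sketch that parses attribute lines in the format shown above and flags the ones usually worth a look. The watch list and the function names are assumptions for this example, not anything smartctl defines. Attribute 1 is left out of the raw-value check because its raw value is vendor specific.

```python
# Attribute IDs whose raw value is a plain count; non-zero is worth a look.
# This watch list is an assumption of the sketch, not a smartctl feature.
WATCHED_IDS = {5, 196, 197, 198, 199}


def parse_attribute(line):
    """Split one 'smartctl -A' table row into its main fields."""
    parts = line.split(None, 9)
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "type": parts[6],
        "raw": parts[9],
    }


def flag_attributes(lines):
    """Return names of attributes at/below threshold or with a watched raw count > 0."""
    flagged = []
    for line in lines:
        attr = parse_attribute(line)
        raw_count = int(attr["raw"].split()[0])
        at_threshold = attr["thresh"] > 0 and attr["value"] <= attr["thresh"]
        if at_threshold or (attr["id"] in WATCHED_IDS and raw_count > 0):
            flagged.append(attr["name"])
    return flagged


# Rows copied from the two dumps above (/dev/sde and /dev/sdf)
drive1 = [
    "1 Raw_Read_Error_Rate 0x000b 096 096 016 Pre-fail Always - 262148",
    "5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0",
    "197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0",
    "198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0",
    "199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 3",
]
drive2 = [
    "1 Raw_Read_Error_Rate 0x000b 094 094 016 Pre-fail Always - 786447",
    "5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0",
    "197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0",
    "198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0",
    "199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0",
]

print(flag_attributes(drive1))
print(flag_attributes(drive2))
```

On these rows the check flags only UDMA_CRC_Error_Count on /dev/sde and nothing on /dev/sdf.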
Do not buy!
Archived topic from Iceteks, old topic ID:5389, old post ID:39873