Checking HDD health with SMART

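While tidying up I turned up a pile of old HDDs, so I want to check the condition of each one.
SMART is the usual way to check an HDD's condition, and the "Disks" utility in Linux Mint can display SMART data too.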

Here I have started the Disks utility and selected the target HDD. The one selected here is an HDD attached externally over eSATA.

The menu near the top right has an item called "SMART Data & Self-Tests"; selecting it appears to show the most recently collected data.

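My father had replaced this HDD and wanted to get rid of it, so I took it off his hands. It looks as if it was once crammed into a place with poor thermal conditions. The flagged attribute is Airflow Temperature rather than the drive temperature itself, so it may not reflect the temperature of the enclosure; perhaps a fan had failed? (Apparently it sat where the afternoon sun hits, but even so?)
Nothing else seems to be wrong.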

That is fine for a quick look, but I would like to keep a record if I can. For that, the CLI is more convenient.

The command-line tools do not seem to be installed by default, so I installed them with:

$ sudo apt-get install smartmontools

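Running the smartctl command with the --info option prints an overview of the device. It also shows the model and serial number, which is handy for keeping track of drives. My target device is /dev/sdb, and the result looked like this. (For any other device, substitute its name for /dev/sdb.)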

$ sudo smartctl --info /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar SE Serial ATA
Device Model:     WDC WD2500JS-19NCB1
Serial Number:    WD-WCANK1638205
Firmware Version: 10.02E01
User Capacity:    250,058,268,160 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Wed Aug 15 05:26:51 2018 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
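
Since the model and serial number are right there in the output, one way to keep records is to name each saved report after the drive's serial number. The following is only a rough sketch: the helper names serial_from_info and log_name and the file naming scheme are made up for illustration and are not part of smartmontools.

```shell
#!/bin/sh
# Extract the serial number from `smartctl --info` output read on stdin.
serial_from_info() {
    awk -F': *' '/^Serial Number:/ { print $2; exit }'
}

# Build a log file name such as smart_WD-WCANK1638205_20180815.txt.
# The naming scheme is an illustrative assumption.
log_name() {
    # $1: serial number, $2: date stamp (YYYYMMDD)
    printf 'smart_%s_%s.txt\n' "$1" "$2"
}

# Usage (needs root and a real device, so it is left commented out):
#   info=$(sudo smartctl --info /dev/sdb)
#   serial=$(printf '%s\n' "$info" | serial_from_info)
#   printf '%s\n' "$info" > "$(log_name "$serial" "$(date +%Y%m%d)")"
```

Keeping the parsing in small functions that read stdin means they can be tried on a saved report without touching the drive.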

The last line of the smartctl --info output shows whether SMART is enabled. If it is disabled, it has to be enabled as follows.

$ sudo smartctl --smart=on /dev/sdb
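
When working through a pile of drives, the check and the enable step could be combined so that --smart=on is only sent when needed. This is a minimal sketch that assumes the "SMART support is:" line keeps the wording shown in the output above; smart_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed (exit status 0) if `smartctl --info` output on stdin
# reports SMART as enabled.
smart_enabled() {
    grep -q '^SMART support is: *Enabled'
}

# Usage (needs root and a real device, so it is left commented out):
#   if ! sudo smartctl --info /dev/sdb | smart_enabled; then
#       sudo smartctl --smart=on /dev/sdb
#   fi
```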

Next, check which tests the device supports.

$ sudo smartctl -c /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 7680) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  90) minutes.
Conveyance self-test routine
recommended polling time: 	 (   6) minutes.
SCT capabilities: 	       (0x103f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.
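
The three "recommended polling time" entries near the end of this output are split over two lines each, which makes them awkward to grep. As a rough sketch, assuming the layout shown above, they could be pulled out with awk; polling_times is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<test kind> <minutes>" for each recommended polling time
# found in `smartctl -c` output read on stdin.
polling_times() {
    awk '
        /^recommended polling time:/ {
            minutes = $0
            sub(/^[^(]*\( */, "", minutes)   # drop everything up to "("
            sub(/\).*$/, "", minutes)        # drop ")" and the rest
            print tolower(kind), minutes
        }
        { kind = $1 }                        # remember first word of each line
    '
}

# Usage (needs root and a real device, so it is left commented out):
#   sudo smartctl -c /dev/sdb | polling_times
```

The test kind is taken from the first word of the line just before each polling-time line, so the sketch depends on that two-line layout staying as it is.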

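According to the entries near the end of the smartctl -c output, the short test takes about 2 minutes, the extended test 90 minutes, and the conveyance test (which checks for damage from transport) 6 minutes.
Ninety minutes is a long wait, so I started the short test.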

$ sudo smartctl -t short /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after (estimated completion time)

Use smartctl -X to abort test.
$

The command returns immediately; the actual test appears to run in the background. To run the extended test, pass long after -t; to run the conveyance test, pass conveyance.
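
Because the command returns before the test has finished, a script has to wait before reading the results. One possible approach, sketched here under assumptions, is to poll the "Self-test execution status" value from smartctl -c. In the ATA status byte the upper four bits hold the state (15 means a test is in progress) and the lower four bits hold the remaining work in tens of percent. The helper names are hypothetical.

```shell
#!/bin/sh
# Print the numeric self-test execution status from `smartctl -c`
# output read on stdin (the number inside the parentheses).
selftest_status_code() {
    awk '/^Self-test execution status:/ {
        code = $0
        sub(/^[^(]*\( */, "", code)
        sub(/\).*$/, "", code)
        print code
        exit
    }'
}

# Succeed if the status byte says a self-test is still in progress
# (upper four bits equal to 15).
selftest_running() {
    code=$(selftest_status_code)
    [ -n "$code" ] && [ $((code / 16)) -eq 15 ]
}

# Usage (needs root and a real device, so it is left commented out):
#   sudo smartctl -t short /dev/sdb
#   while sudo smartctl -c /dev/sdb | selftest_running; do sleep 30; done
#   sudo smartctl -l selftest /dev/sdb
```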

Display the overall health assessment.

$ sudo smartctl -H /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022   043   030   045    Old_age   Always   FAILING_NOW 57

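The assessment came back PASSED, but it seems to be telling me to watch the Airflow Temperature attribute. Note that -H reports the drive's overall health self-assessment rather than the outcome of the self-test just started; the self-test results are in the self-test log.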
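
The reason attribute 190 is marked FAILING_NOW is visible in the table: its normalized VALUE (043) is at or below THRESH (045). As a sketch of that rule, the following awk filter lists every attribute whose VALUE is at or below a non-zero THRESH. It assumes the column layout shown above, and failing_attributes is a hypothetical helper name.

```shell
#!/bin/sh
# Print attributes from a smartctl attribute table (stdin) whose
# normalized VALUE is at or below a non-zero THRESH.
# Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && $6 + 0 > 0 && $4 + 0 <= $6 + 0 {
        print $1, $2, "value=" ($4 + 0), "thresh=" ($6 + 0)
    }'
}

# Usage (needs root and a real device, so it is left commented out):
#   sudo smartctl -A /dev/sdb | failing_attributes
```

The check on the FLAG column (a value starting with 0x) is there so that other numeric rows, such as the error log registers in the full report, are not mistaken for attributes.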

List the most recent self-test results.

$ sudo smartctl -l selftest /dev/sdb
[sudo] password for tom: 
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     25508         -
# 2  Short offline       Completed without error       00%     25321         -
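
For record keeping, the newest entry (# 1) is usually the one that matters. A small sketch, assuming the log layout shown above, that reports whether the latest test completed without error along with its LifeTime(hours) value; latest_selftest is a hypothetical helper name.

```shell
#!/bin/sh
# Read `smartctl -l selftest` output on stdin and print
# "ok <hours>" or "check <hours>" for the most recent entry (# 1).
latest_selftest() {
    awk '/^# *1 / {
        verdict = ($0 ~ /Completed without error/) ? "ok" : "check"
        print verdict, $(NF - 1)   # LifeTime(hours) is the second-to-last column
        exit
    }'
}

# Usage (needs root and a real device, so it is left commented out):
#   sudo smartctl -l selftest /dev/sdb | latest_selftest
```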

Display the detailed information.

$ sudo smartctl -a /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-30-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar SE Serial ATA
Device Model:     WDC WD2500JS-19NCB1
Serial Number:    WD-WCANK1638205
Firmware Version: 10.02E01
User Capacity:    250,058,268,160 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Wed Aug 15 06:51:02 2018 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 7680) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  90) minutes.
Conveyance self-test routine
recommended polling time: 	 (   6) minutes.
SCT capabilities: 	       (0x103f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   211   182   021    Pre-fail  Always       -       4441
  4 Start_Stop_Count        0x0032   093   093   000    Old_age   Always       -       7644
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       25509
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   093   093   000    Old_age   Always       -       7638
190 Airflow_Temperature_Cel 0x0022   045   030   045    Old_age   Always   FAILING_NOW 55
194 Temperature_Celsius     0x0022   095   080   000    Old_age   Always       -       55
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 2
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 7345 hours (306 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 af 24 01 e0  Error: UNC 8 sectors at LBA = 0x000124af = 74927

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 d8 08 af 24 01 00 00      00:00:19.240  READ DMA EXT
  25 d8 08 af 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 a7 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 9f 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 97 24 01 00 00      00:00:17.339  READ DMA EXT

Error 1 occurred at disk power-on lifetime: 7345 hours (306 days + 1 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 af 24 01 e0  Error: UNC 8 sectors at LBA = 0x000124af = 74927

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 d8 08 af 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 a7 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 9f 24 01 00 00      00:00:17.341  READ DMA EXT
  25 d8 08 97 24 01 00 00      00:00:17.339  READ DMA EXT
  25 d8 08 d7 00 00 00 00      00:00:17.339  READ DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     25508         -
# 2  Short offline       Completed without error       00%     25321         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
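
The full report is long, so for a log of many drives it may be enough to save the report to a file and pull out the raw values of a few attributes. The following is a rough sketch that assumes the attribute table layout shown above; attr_raw and the choice of attribute IDs are illustrative assumptions.

```shell
#!/bin/sh
# Print the RAW_VALUE column for one attribute ID from `smartctl -a`
# output read on stdin. Only the first token of the raw value is
# printed, which is enough for plain counters like the ones above.
attr_raw() {
    # $1: attribute ID, e.g. 5 for Reallocated_Sector_Ct
    awk -v id="$1" '$1 == id && $3 ~ /^0x/ { print $10; exit }'
}

# Usage (needs root and a real device, so it is left commented out):
#   sudo smartctl -a /dev/sdb > report.txt
#   for id in 5 9 197 198 199; do
#       printf '%s=%s\n' "$id" "$(attr_raw "$id" < report.txt)"
#   done
```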

I used the Arch Linux S.M.A.R.T. article as a reference (in fact, this follows it almost as is).
