====== S.M.A.R.T. ======

//Self-Monitoring, Analysis and Reporting Technology// (S.M.A.R.T.) is a monitoring system built into hard drives and SSDs. Its main purpose is to detect and report on the health and reliability of the drive so that imminent hardware failures can be anticipated.

When a drive starts to fail, the following SMART statistics will increase:

  * **SMART 5 – Reallocated_Sector_Count**: the number of sectors found to be bad and remapped to a spare area of the disk. New drives have a value of 0. If this number keeps increasing over time, the drive is likely to fail soon.
  * **SMART 187 – Reported_Uncorrectable_Errors**.
  * **SMART 188 – Command_Timeout**.
  * **SMART 197 – Current_Pending_Sector_Count**: the number of unstable sectors (which does not necessarily mean they are defective). It may indicate a hardware problem if the number keeps growing.
  * **SMART 198 – Offline_Uncorrectable**.

===== Example =====

  SMART Attributes Data Structure revision number: 16
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
    1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    25
    3 Spin_Up_Time            POS--K   178   177   021    -    6083
    4 Start_Stop_Count        -O--CK   100   100   000    -    283
    5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
    7 Seek_Error_Rate         -OSR-K   100   253   000    -    0
    9 Power_On_Hours          -O--CK   031   031   000    -    50733
   10 Spin_Retry_Count        -O--CK   100   100   000    -    0
   11 Calibration_Retry_Count -O--CK   100   100   000    -    0
   12 Power_Cycle_Count       -O--CK   100   100   000    -    283
  192 Power-Off_Retract_Count -O--CK   200   200   000    -    96
  193 Load_Cycle_Count        -O--CK   192   192   000    -    26689
  194 Temperature_Celsius     -O---K   122   110   000    -    28
  196 Reallocated_Event_Count -O--CK   200   200   000    -    0
  197 Current_Pending_Sector  -O--CK   200   200   000    -    7
  198 Offline_Uncorrectable   ----CK   100   253   000    -    0
  199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
  200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
                              ||||||_ K auto-keep
                              |||||__ C event count
                              ||||___ R error rate
                              |||____ S speed/performance
                              ||_____ O updated online
                              |______ P prefailure warning

===== Tools =====

On Linux, the ''smartmontools'' package provides tools to control and monitor S.M.A.R.T. devices.

==== Check whether SMART is enabled ====

Installing the ''smartmontools'' package provides the ''smartctl'' command, which can run all kinds of SMART queries. For example, to get information about the drive and find out whether it supports SMART and has it enabled:

  smartctl -i /dev/sda

Example output:

  $ sudo smartctl -i /dev/sda
  smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.12-arch1-1] (local build)
  Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
  === START OF INFORMATION SECTION ===
  Model Family:     Samsung based SSDs
  Device Model:     Samsung SSD 860 EVO 1TB
  Serial Number:    S4CSNF0MA00815A
  LU WWN Device Id: 5 002538 e49a08082
  Firmware Version: RVT03B6Q
  User Capacity:    1.000.204.886.016 bytes [1,00 TB]
  Sector Size:      512 bytes logical/physical
  Rotation Rate:    Solid State Device
  Form Factor:      2.5 inches
  TRIM Command:     Available, deterministic, zeroed
  Device is:        In smartctl database 7.3/5319
  ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
  SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Feb 25 16:19:36 2023 CET
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled

The last two lines show whether the device supports SMART and whether it is enabled.
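The same check can be scripted. The following Python sketch is only an illustration and not part of ''smartmontools'': the function name ''smart_support'' is hypothetical, and it assumes the two ''SMART support is:'' lines are worded as in the example output above.

```python
def smart_support(info_text):
    """Return (available, enabled) parsed from the text of `smartctl -i`."""
    available = False
    enabled = False
    for line in info_text.splitlines():
        line = line.strip()
        # Only the two "SMART support is:" lines are of interest here.
        if not line.startswith("SMART support is:"):
            continue
        value = line.split(":", 1)[1].strip()
        if value.startswith("Available"):
            available = True
        elif value.startswith("Enabled"):
            enabled = True
    return available, enabled


# Sample lines copied from the example output on this page.
sample = """\
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
"""
print(smart_support(sample))  # (True, True)
```

On a real system the text could come from running ''smartctl -i /dev/sda'' (usually as root) through ''subprocess.run(..., capture_output=True, text=True)'' and passing its ''stdout'' to the function.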
If //SMART support is: Disabled// appears, SMART can be enabled with:

  smartctl -s on /dev/sda

==== Review SMART status ====

  smartctl -a /dev/sda

Example output:

  smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.12-arch1-1] (local build)
  Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
  === START OF INFORMATION SECTION ===
  Model Family:     Samsung based SSDs
  Device Model:     Samsung SSD 860 EVO 1TB
  Serial Number:    S4CSNF0MA00815A
  LU WWN Device Id: 5 002538 e49a08082
  Firmware Version: RVT03B6Q
  User Capacity:    1.000.204.886.016 bytes [1,00 TB]
  Sector Size:      512 bytes logical/physical
  Rotation Rate:    Solid State Device
  Form Factor:      2.5 inches
  TRIM Command:     Available, deterministic, zeroed
  Device is:        In smartctl database 7.3/5319
  ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
  SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Feb 25 16:27:20 2023 CET
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED
  General SMART Values:
  Offline data collection status:  (0x00) Offline data collection activity
                                          was never started.
                                          Auto Offline Data Collection: Disabled.
  Self-test execution status:      (   0) The previous self-test routine completed
                                          without error or no self-test has ever
                                          been run.
  Total time to complete Offline
  data collection:                 (    0) seconds.
  Offline data collection
  capabilities:                    (0x53) SMART execute Offline immediate.
                                          Auto Offline data collection on/off support.
                                          Suspend Offline collection upon new
                                          command.
                                          No Offline surface scan supported.
                                          Self-test supported.
                                          No Conveyance Self-test supported.
                                          Selective Self-test supported.
  SMART capabilities:            (0x0003) Saves SMART data before entering
                                          power-saving mode.
                                          Supports SMART auto save timer.
  Error logging capability:        (0x01) Error logging supported.
                                          General Purpose Logging supported.
  Short self-test routine
  recommended polling time:        (   2) minutes.
  Extended self-test routine
  recommended polling time:        (  85) minutes.
  SCT capabilities:              (0x003d) SCT Status supported.
                                          SCT Error Recovery Control supported.
                                          SCT Feature Control supported.
                                          SCT Data Table supported.
  SMART Attributes Data Structure revision number: 1
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
    9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       17816
   12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       935
  177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       29
  179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
  181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
  182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
  183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
  187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
  190 Airflow_Temperature_Cel 0x0032   079   057   000    Old_age   Always       -       21
  195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
  199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
  235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       27
  241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       43631484933
  SMART Error Log Version: 1
  No Errors Logged
  SMART Self-test log structure revision number 1
  No self-tests have been logged.  [To run self-tests, use: smartctl -t]
  SMART Selective self-test log data structure revision number 1
   SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
      1        0        0  Not_testing
      2        0        0  Not_testing
      3        0        0  Not_testing
      4        0        0  Not_testing
      5        0        0  Not_testing
    256        0    65535  Read_scanning was never started
  Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
  If Selective self-test is pending on power-up, resume after 0 minute delay.
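The attribute table in this kind of output can be checked automatically against the failure indicators listed at the top of this page (SMART 5, 187, 188, 197 and 198). The Python sketch below is an illustration under stated assumptions, not an official tool: the names ''WATCHED_IDS'' and ''nonzero_watched'' are hypothetical, and it assumes the ten-column layout printed by ''smartctl -a'', where the raw value is the tenth field. The brief ''FLAGS'' layout shown in the first example has different columns and is not handled.

```python
# Attribute IDs this page lists as indicators of a failing drive.
WATCHED_IDS = {5, 187, 188, 197, 198}


def nonzero_watched(attr_text):
    """Return {id: (name, raw)} for watched attributes with a raw value > 0."""
    found = {}
    for line in attr_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 fields and start with a numeric ID;
        # headers and the self-test log lines are skipped by this filter.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]  # RAW_VALUE column (first token only)
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            found[attr_id] = (fields[1], int(raw))
    return found


# Sample rows copied from the mechanical-drive example on this page.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       83
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""
print(nonzero_watched(sample))  # {188: ('Command_Timeout', 83)}
```

A nonzero raw value is a reason to look more closely at the drive and to watch whether the number keeps growing; on its own it is not proof of an imminent failure.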
Another example, this time with a mechanical drive (the previous one was an SSD):

  $ smartctl -a /dev/sdb
  smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.12-arch1-1] (local build)
  Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
  === START OF INFORMATION SECTION ===
  Model Family:     Seagate Barracuda Green (AF)
  Device Model:     ST2000DL003-9VT166
  Serial Number:    5YD6785T
  LU WWN Device Id: 5 000c50 045faf109
  Firmware Version: CC3C
  User Capacity:    2.000.398.934.016 bytes [2,00 TB]
  Sector Sizes:     512 bytes logical, 4096 bytes physical
  Rotation Rate:    5900 rpm
  Device is:        In smartctl database 7.3/5319
  ATA Version is:   ATA8-ACS T13/1699-D revision 4
  SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
  Local Time is:    Sat Feb 25 16:29:46 2023 CET
  SMART support is: Available - device has SMART capability.
  SMART support is: Enabled
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED
  General SMART Values:
  Offline data collection status:  (0x82) Offline data collection activity
                                          was completed without error.
                                          Auto Offline Data Collection: Enabled.
  Self-test execution status:      (   0) The previous self-test routine completed
                                          without error or no self-test has ever
                                          been run.
  Total time to complete Offline
  data collection:                 (  612) seconds.
  Offline data collection
  capabilities:                    (0x7b) SMART execute Offline immediate.
                                          Auto Offline data collection on/off support.
                                          Suspend Offline collection upon new
                                          command.
                                          Offline surface scan supported.
                                          Self-test supported.
                                          Conveyance Self-test supported.
                                          Selective Self-test supported.
  SMART capabilities:            (0x0003) Saves SMART data before entering
                                          power-saving mode.
                                          Supports SMART auto save timer.
  Error logging capability:        (0x01) Error logging supported.
                                          General Purpose Logging supported.
  Short self-test routine
  recommended polling time:        (   1) minutes.
  Extended self-test routine
  recommended polling time:        ( 338) minutes.
  Conveyance self-test routine
  recommended polling time:        (   2) minutes.
  SCT capabilities:              (0x30b7) SCT Status supported.
                                          SCT Feature Control supported.
                                          SCT Data Table supported.
  SMART Attributes Data Structure revision number: 10
  Vendor Specific SMART Attributes with Thresholds:
  ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate     0x000f   101   099   006    Pre-fail  Always       -       85416
    3 Spin_Up_Time            0x0003   093   092   000    Pre-fail  Always       -       0
    4 Start_Stop_Count        0x0032   093   093   020    Old_age   Always       -       7404
    5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
    7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       4842680032
    9 Power_On_Hours          0x0032   024   024   000    Old_age   Always       -       67349
   10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
   12 Power_Cycle_Count       0x0032   097   097   020    Old_age   Always       -       3482
  183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
  184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
  187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
  188 Command_Timeout         0x0032   100   098   000    Old_age   Always       -       83
  189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
  190 Airflow_Temperature_Cel 0x0022   075   055   045    Old_age   Always       -       25 (Min/Max 17/25)
  191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
  192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       198
  193 Load_Cycle_Count        0x0032   097   097   000    Old_age   Always       -       7688
  194 Temperature_Celsius     0x0022   025   045   000    Old_age   Always       -       25 (0 14 0 0 0)
  195 Hardware_ECC_Recovered  0x001a   005   003   000    Old_age   Always       -       85416
  197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
  198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
  199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
  240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       43904 (43 180 0)
  241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1167924056
  242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1363793857
  SMART Error Log Version: 1
  No Errors Logged
  SMART Self-test log structure revision number 1
  No self-tests have been logged.  [To run self-tests, use: smartctl -t]
  SMART Selective self-test log data structure revision number 1
   SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
      1        0        0  Not_testing
      2        0        0  Not_testing
      3        0        0  Not_testing
      4        0        0  Not_testing
      5        0        0  Not_testing
  Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
  If Selective self-test is pending on power-up, resume after 0 minute delay.

===== Resources =====

  * [[https://en.wikipedia.org/wiki/Self-Monitoring,_Analysis_and_Reporting_Technology|Self-Monitoring, Analysis and Reporting Technology]] (Wikipedia)