Common Maintenance Tasks (Workstations and Servers)

The following items should be completed to maintain the health of your workstation or server. For compute clusters, please see Common Maintenance Tasks (Clusters).

Back up irreplaceable data

Remember that RAID is not a replacement for backups. If your system is stolen, hacked or destroyed by fire, the data on it will be gone forever. Automate this task or you will forget.

For many groups, a weekly or monthly cron job is sufficient. Write a script that calls rsync or tar to copy the files to a separate server, NAS or SAN, and place the script in /etc/cron.weekly/ or /etc/cron.monthly/.
Users with more complex requirements should look at AMANDA or Bacula.
Tape backup systems are still available for those who prefer them. Contact us.
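
The cron-driven approach above can be sketched as follows. This is a minimal example, not a complete backup solution: it uses tar to write a dated archive, and the names backup_dir, BACKUP_SRC and BACKUP_DEST are hypothetical, chosen only for illustration. It assumes the destination is a directory mounted from a separate server, NAS or SAN.

```shell
#!/bin/sh
# Sketch of a weekly backup job. BACKUP_SRC and BACKUP_DEST are hypothetical
# variable names; point them at your data directory and at a mount point on a
# separate server, NAS or SAN.

# Write a dated, compressed archive of directory $1 into directory $2.
backup_dir() {
    src=$1
    dest=$2
    archive="$dest/backup-$(date +%Y%m%d).tar.gz"
    # -C changes into the source first so paths in the archive are relative.
    tar -czf "$archive" -C "$src" . || return 1
    # Reading the archive back catches a truncated or corrupt write.
    tar -tzf "$archive" > /dev/null || return 1
    echo "$archive"
}

# Run only when both locations are configured, so the script is inert by default.
if [ -n "${BACKUP_SRC:-}" ] && [ -n "${BACKUP_DEST:-}" ]; then
    backup_dir "$BACKUP_SRC" "$BACKUP_DEST"
fi
```

To use it, assign BACKUP_SRC and BACKUP_DEST near the top of the script, make the file executable, and copy it into /etc/cron.weekly/. On Debian-based systems, run-parts skips file names that contain a dot, so a name without a .sh extension is the safer choice.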

Verify the health of the drive arrays (RAIDs)

Drive sectors can go bad silently. Regularly scheduled verifies will catch these problems before they lead to data loss. Automate them or you will forget.

Linux Software RAID (mdadm) arrays can easily be put into verify mode. Many distributions (Red Hat, CentOS, Ubuntu) ship their own utilities for scheduling verifies. To start a verify manually, run this line as root for each RAID, replacing md# with the array's device name (for example, md0):
echo check > /sys/block/md#/md/sync_action
Monitor the text file /proc/mdstat and the output of dmesg to follow the progress of each verify.
Hardware RAID controllers provide their own methods for automated verifies and alert notification. Refer to the controller’s manual.
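
For software RAID, the manual command above can be wrapped in a loop so that every array is covered. The following is a sketch under stated assumptions: the function name start_md_checks and the --run argument are hypothetical, and the optional directory argument exists only so the loop can be tried against a scratch directory instead of /sys/block.

```shell
#!/bin/sh
# Sketch: request a check on every md array that is currently idle.
# The optional first argument is a hypothetical override for dry runs;
# the real location is /sys/block.

start_md_checks() {
    sysfs=${1:-/sys/block}
    for action in "$sysfs"/md*/md/sync_action; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$action" ] || continue
        # Only start a check when the array is idle; never interrupt a rebuild.
        if [ "$(cat "$action")" = "idle" ]; then
            echo check > "$action"
            echo "started check: $action"
        fi
    done
}

# Touch the real arrays only when explicitly asked to (requires root).
if [ "${1:-}" = "--run" ]; then
    start_md_checks
fi
```

Run as root with the --run argument, for example from a monthly cron job, the script starts a verify on each idle array and leaves arrays that are already resyncing or rebuilding untouched.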

Monitor system alarms and system health

Preferred: learn how to use the IPMI capability of your system for remote monitoring and management. You’ll spend a lot less time trekking to the datacenter.
Alternative: listen for system alarms and check for warning LEDs.
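
As one possible starting point for IPMI monitoring, the sketch below filters sensor readings down to the ones that need attention. It assumes the ipmitool utility is installed and that its `sdr list` output uses the usual three-column "name | reading | status" layout. The function name sensors_needing_attention and the CHECK_IPMI variable are hypothetical.

```shell
#!/bin/sh
# Sketch: print sensors whose status is neither "ok" nor "ns" (no reading).
# Expects the three-column "name | reading | status" text that
# `ipmitool sdr list` prints, supplied on standard input.

sensors_needing_attention() {
    awk -F'|' '
        NF >= 3 {
            status = $3
            gsub(/[ \t]/, "", status)   # strip padding around the status field
            if (status != "ok" && status != "ns") print
        }'
}

# Query the local BMC only when asked to and when ipmitool is available.
# CHECK_IPMI is a hypothetical switch that keeps the script inert by default.
if [ "${CHECK_IPMI:-0}" = 1 ] && command -v ipmitool > /dev/null 2>&1; then
    ipmitool sdr list | sensors_needing_attention
fi
```

An empty result means every sensor that reported a reading is within its thresholds. The same filter can be fed from a remote system by piping in the output of ipmitool run with its -I lanplus, -H and -U options, which is what removes the need to visit the datacenter.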

Don’t ignore alarms! If you put it off, you’ll soon find that something else is wrong and the system needs major repair.
