HP/HPE Acronyms:
Notes:
1) HPE MSA Best Practice for Controller Firmware Update | https://www.youtube.com/watch?v=exaQMRKjNvA |
2) HPE MSA Storage best practice for expansion module firmware | https://www.youtube.com/watch?v=_a-FaQcWhBc |
3) Updating HPE MSA Storage drive firmware demo | https://www.youtube.com/watch?v=5jodXVECav8&t=111s |
date: 2018-12-20 (updated: 2021-05-20)
1) GUI-based SSA Utility (way cool tool)
Steps:
 1. login as root from the graphical front console
 2. download all files related to ssa-2.65-7.0.x86_64.rpm from https://support.hpe.com
 3. rpm -i ssa-2.65-7.0.x86_64.rpm
 4. /usr/sbin/ssa -local
    (Firefox auto opens with a beautiful colored diagram of your RAID config). See page 18 of this manual
 5. /usr/sbin/ssa -help
    (view all available command-line switches)
---------------------------------------------------------------------------
Tips:
 1. I have used this tool to convert a volume from "8-disk RAID-60" to "8-disk RAID-0" on the fly
    This requires several hours and would definitely impact server performance
 2. my next experiment was to convert from "8-disk RAID-0" to "4-disk RAID-0" on the fly
    I didn't even know this was possible (would not work if the volume was full)
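Repeating step 3 on a server where the package is already present makes "rpm -i" stop with an "already installed" error, so a guarded install can be handy when the same steps are applied to several machines. The following is only a minimal sketch and not part of the original notes: the function name install_if_missing is hypothetical, and it assumes the RPM file sits in the current directory.

```shell
#!/bin/bash
# install_if_missing is a hypothetical helper (not from the original notes).
#   $1 = package name as known to rpm (e.g. "ssa" or "ssacli")
#   $2 = path to the downloaded .rpm file (assumed to exist)
install_if_missing() {
    if rpm -q "$1" >/dev/null 2>&1; then
        # rpm -q exits 0 when the package is already in the rpm database
        echo "-i-$1 is already installed"
    else
        echo "-i-installing $1 from $2"
        rpm -i "$2"
    fi
}

# Example call (run as root):
#   install_if_missing ssa ssa-2.65-7.0.x86_64.rpm
```

The function is only defined here, not called, so pasting the block into a shell changes nothing until the example call is run.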
2) CLI-based SSA Utility (great for scripting)
Steps:
 1. login as root from anywhere
 2. rpm -i ssacli-2.65-7.0.x86_64.rpm
 3. then just type "ssacli" (my typing follows the "=>" prompt)
 4. Notice that the drive in bay-8 is marked "Predictive Failure"
###############################################################################################
[root@localhost ~]# ssacli
Smart Storage Administrator CLI 2.65.7.0
Detecting Controllers...Done.
Type "help" for a list of supported commands.
Type "exit" to close the console.

=> ctrl all show                  # firmware sensitive; does not work on all platforms
                                  # this is the only way to see which controllers were found
                                  # (slot #0 means embedded)
=> set target ctrl slot=0         # or: set target ctrl all
                                  # or: set target ctrl first
=> show config

Smart Array P420i in Slot 0 (Embedded)    (sn: 001438024F5D170)

   Port Name: 1I
   Port Name: 2I
   Internal Drive Cage at Port 1I, Box 2, OK
   Internal Drive Cage at Port 2I, Box 2, OK

   Array A (SAS, Unused Space: 0 MB)
      logicaldrive 1 (1.1 TB, RAID 60, OK)
      physicaldrive 1I:2:1 (port 1I:box 2:bay 1, SAS HDD, 300 GB, OK)
      physicaldrive 1I:2:2 (port 1I:box 2:bay 2, SAS HDD, 300 GB, OK)
      physicaldrive 1I:2:3 (port 1I:box 2:bay 3, SAS HDD, 300 GB, OK)
      physicaldrive 1I:2:4 (port 1I:box 2:bay 4, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:5 (port 2I:box 2:bay 5, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:6 (port 2I:box 2:bay 6, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:7 (port 2I:box 2:bay 7, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:8 (port 2I:box 2:bay 8, SAS HDD, 300 GB, Predictive Failure)

   SEP (Vendor ID PMCSIERA, Model SRCv8x6G) 380 (WWID: 5001438024F5D17F)

=> show status

Smart Array P420i in Slot 0 (Embedded)
   Controller Status: OK
   Cache Status: OK
   Battery/Capacitor Status: OK

=> show config detail
bla...bla...bla...
drive details
bla...bla...bla...
=> exit
[root@localhost ~]#
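The same information can be pulled without the interactive console, which is how the monitoring script further down uses "ssacli ctrl slot=0 show config". As a minimal sketch (not from the original notes), the hypothetical function below reduces that output to the physical drives whose status is not OK. It assumes the line layout shown in the session above, which may differ between ssacli versions and controllers.

```shell
#!/bin/bash
# unhealthy_drives is a hypothetical helper: it reads "show config" output
# on stdin and prints every physicaldrive line that does not end in "OK)".
unhealthy_drives() {
    grep 'physicaldrive' | grep -v 'OK)[[:space:]]*$'
}

# Demo with two lines copied from the session above (no controller needed);
# only the bay 8 line is printed.
unhealthy_drives <<'EOF'
      physicaldrive 2I:2:7 (port 2I:box 2:bay 7, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:8 (port 2I:box 2:bay 8, SAS HDD, 300 GB, Predictive Failure)
EOF

# On a real server (as root), assuming the controller is in slot 0:
#   ssacli ctrl slot=0 show config | unhealthy_drives
```

Because the last grep exits non-zero when nothing is printed, the function's exit status is 0 only when at least one drive needs attention.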
#!/bin/bash
#=============================================================================
# title : raid_monitor.sh
# purpose: inspect the health of drives not visible to Linux
# notes : meant to be run as root since ssacli is not SUDO friendly
# : this script will be run 3 times a day from crontab
# history:
# NSR 20190906 1. original effort
# NSR 20190911 2. more work
# NSR 20190917 3. minor fix in cleanup
# NSR 20191104 4. moved logging to /var/log
# NSR 20200306 5. now do not stop on error (needed if ssacli is not installed)
#=============================================================================
set -vex # tron (v=verbose, e=stop-on-error, x=display data)
STUB="raid_monitor-"
YADA="/var/log/"${STUB}$(date +%Y%m%d.%H%M%S)".trc"
echo "-i-diverting output to file: "${YADA}
exec 1>>${YADA}
exec 2>&1
set +e # do not stop on errors (in this script)
echo "-i-starting: "${0}" at "$(date +%Y%m%d.%H%M%S)
rm -f raid_monitor.tmp
# ssacli is installed with RPM
ssacli ctrl slot=0 show config > raid_monitor.tmp
saved_status=$?
echo "-i-saved_status:"$saved_status
if [ $saved_status != 0 ];
then
    # mail -s "RAID Problem" neil,neil@kawc09.on.bell.ca,neil@kawc96.on.bell.ca <<< "-e-could not execute SSACLI"
    # note: ats_adm_list is an alias defined here: /etc/aliases
    mail -s "RAID Problem-01 on host: "$HOSTNAME ats_adm_list <<< "-e-could not execute SSACLI"
    mail -s "RAID Problem-01 on host: "$HOSTNAME root <<< "-e-could not execute SSACLI"
    exit 1 # a bare "exit" would return the status of the last mail command
fi
# this next script will analyze "raid_monitor.tmp"
/root/raid_analyze_file.sh
saved_status=$?
echo "-i-saved_status:"$saved_status
if [ $saved_status != 0 ];
then
    # note: ats_adm_list is an alias defined here: /etc/aliases
    mail -s "RAID Problem-02 on host: "$HOSTNAME ats_adm_list <<< "-e-one or more drives are not 100% healthy"
    mail -s "RAID Problem-02 on host: "$HOSTNAME root <<< "-e-one or more drives are not 100% healthy"
    exit 1 # a bare "exit" would return the status of the last mail command
fi
#mail -s "RAID Test OKAY host: "$HOSTNAME root <<< "-i-test OKAY"
#-----------------------------------------------------------------------------
#find /var/log -name ${STUB}"*.trc" -a -mtime +2 -exec ls -la {} \;
find /var/log -name ${STUB}"*.trc" -a -mtime +2 -exec rm {} \;
echo "-i-exiting: "${0}" at "$(date +%Y%m%d.%H%M%S)
#
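The header of raid_monitor.sh says it is run three times a day from crontab but does not show the entry. The sketch below prints one possible line for root's crontab. The hours (06:00, 14:00, 22:00), the "cd /root" prefix and the output redirection are all assumptions, not taken from the original notes; only the /root/raid_monitor.sh path comes from the scripts themselves.

```shell
#!/bin/bash
# Hypothetical crontab entry for raid_monitor.sh.
# Assumptions: the three run times, and that the script lives in /root
# (the path named in the header of raid_analyze_file.sh).
CRON_LINE='0 6,14,22 * * * cd /root && /root/raid_monitor.sh >/dev/null 2>&1'
echo "$CRON_LINE"

# To install it, run "crontab -e" as root and paste the printed line.
```

The explicit "cd /root" is there because both scripts refer to raid_monitor.tmp by a relative path. cron normally starts jobs in the owner's home directory, so this mostly makes the assumption visible. The redirection discards the few lines the script prints before its own "exec" diverts output to /var/log; drop it if you want cron to mail them.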
#!/bin/bash
#=============================================================================
# script : raid_analyze_file.sh
# author : Neil Rieck
# created: 2019-09-06
# purpose: 1) Reads a text file (searching for some key words)
#          2) this script is called by /root/raid_monitor.sh
#=============================================================================
#set -vex # verbose, stop-on-error, xpand
echo "-i-starting script: "${0}
MYFILE="./raid_monitor.tmp" # hard coded fname
echo "-i-reading: "${MYFILE}
declare -i line
declare -i good
declare -i bad
line=0
good=0
bad=0
while IFS='' read -r LINE || [[ -n ${LINE} ]]; do
    # echo "-i-data: "${LINE}
    if [[ ${LINE} == *"physical"* ]]; then
        ((line=line+1))
        # test for: "Predictive Failure" or "Failed"
        if [[ ${LINE} == *"Fail"* ]]; then
            echo "-i-bad data: "${LINE}
            ((bad=bad+1))
        fi
        if [[ ${LINE} == *"OK)"* ]]; then
            echo "-i-good data: "${LINE}
            ((good=good+1))
        fi
    fi
done < ${MYFILE}
echo "-i-testing has concluded"
echo "-i-report card"
echo "-i- lines:"$line
echo "-i- bad :"$bad
echo "-i- good :"$good
#
# a little martial arts so we get the best exit value
#
if [ ${line} -eq 0 ] || [ ${bad} -gt 0 ] || [ ${line} -ne ${good} ]; then
    echo "-w-problems were detected"
    rc=99
else
    echo "-i-all is well"
    rc=0
fi
echo "-i-will exit with code: "$rc
echo "-i-exiting script: "${0}
exit $rc
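The counting rules in raid_analyze_file.sh can be tried out without a RAID controller by feeding them a small sample file. The block below is a condensed sketch of the same three tests using grep. It is not the author's script: the function name analyze_config is hypothetical, and the two sample lines are copied from the ssacli session earlier in these notes.

```shell
#!/bin/bash
# analyze_config is a hypothetical, condensed variant of the loop above.
#   $1 = file holding "ssacli ctrl slot=0 show config" output
# Returns 0 when every "physical" line is marked "OK)", otherwise 99.
analyze_config() {
    [ -r "$1" ] || return 99                  # unreadable file counts as a problem
    total=$(grep -c 'physical' "$1" || true)  # grep -c prints 0 on no match
    bad=$(grep 'physical' "$1" | grep -c 'Fail' || true)
    good=$(grep 'physical' "$1" | grep -c 'OK)' || true)
    echo "-i- lines:$total bad:$bad good:$good"
    if [ "$total" -eq 0 ] || [ "$bad" -gt 0 ] || [ "$total" -ne "$good" ]; then
        return 99
    fi
    return 0
}

# Demo: one healthy drive and one drive in "Predictive Failure"
sample=$(mktemp)
cat > "$sample" <<'EOF'
      physicaldrive 2I:2:7 (port 2I:box 2:bay 7, SAS HDD, 300 GB, OK)
      physicaldrive 2I:2:8 (port 2I:box 2:bay 8, SAS HDD, 300 GB, Predictive Failure)
EOF
rc=0
analyze_config "$sample" || rc=$?
echo "-i-exit code: $rc"
rm -f "$sample"
```

For this sample the function counts two drive lines, one bad and one good, so the exit code is 99. As in the original script, a file with no drive lines at all is also treated as a problem.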