
HPE ProLiant DL580 Gen8 Server - qla2xxx [0000:c4:00.1]-801c:8: Abort Command Issued Messages Appear in /var/log/messages File

Object Name: mmr_kc-0123922
Document Type: Support Information
Original owner: KCS - ProLiant Servers
Disclosure level: Public
Version state: final
Environment
FACT: Red Hat Linux
FACT: HPE ProLiant DL580 Gen8 Server
Questions/Symptoms
SYMPTOM: "qla2xxx [0000:c4:00.1]-801c:8: Abort command issued" messages appear in /var/log/messages.
Cause
CAUSE: These messages indicate that an error condition is being returned from the SAN.
Answer/Solution
FIX: For the full document, refer to the following Red Hat KB article (non-HPE site):

"Abort command Issued" messages appear in /var/log/messages file
FIX: Engage the storage vendor to review the FC switch logs for error counters and CRC errors.
FIX: Check for issues with the FC switch, FC cabling, zoning, or storage array.
The error message "qla2xxx [0000:04:00.0]-801c:1: Abort command issued nexus=1:0:0 --  1 2002" is explained below.
qla2xxx is the name of the driver (kernel module).
[0000:04:00.0] is the PCI address (domain:bus:device.function) of the HBA.
801c is a hexadecimal ID that uniquely identifies the part of the driver code where the message originated.
1 (after 801c) is the SCSI host number of the HBA.
"Abort command issued nexus=1:0:0" means the driver aborted the command that was in progress to the SCSI device 1:0:0 (host:target:LUN).
The 1 after the dashes means the driver spent time waiting for the device to respond.
2002 is the return code and means the abort succeeded.
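The field breakdown above can be sketched as a small shell helper. This is only an illustration: the function name parse_abort_line and the output labels (pci, msgid, host, nexus, wait, ret) are hypothetical, and the pattern assumes the bracketed message format shown above, not the older format that appears later in this document.

```shell
# parse_abort_line LINE
# Hypothetical helper: splits a bracketed-format qla2xxx abort message into
# labeled fields. Prints nothing if the line does not match the pattern.
parse_abort_line() {
    printf '%s\n' "$1" | sed -n \
        's/.*qla2xxx \[\([0-9a-fA-F:.]*\)]-\([0-9a-fA-F]*\):\([0-9]*\): Abort command issued nexus=\([0-9:]*\) -- *\([0-9]*\) *\([0-9a-fA-F]*\).*/pci=\1 msgid=\2 host=\3 nexus=\4 wait=\5 ret=\6/p'
}

# Example using the message explained above.
parse_abort_line 'qla2xxx [0000:04:00.0]-801c:1: Abort command issued nexus=1:0:0 --  1 2002.'
```

For the sample message this prints pci=0000:04:00.0 msgid=801c host=1 nexus=1:0:0 wait=1 ret=2002.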
Multiple underlying issues can cause abort messages and a slow SAN.
Initial areas to investigate include SAN related components, such as the switches or storage targets.
Command aborts are almost always caused by command timeouts. When a command times out, the first course of action is to abort it so that any references to it are erased. A command timeout can have many causes: SAN congestion, a flaky target, bad hardware somewhere in the path, or an overloaded target that is dropping commands.

DIAGNOSTIC STEPS
CAUTION: Turning on extended error logging under moderate to heavy IO loads can cause lockups! The debug code logs information to /var/log/messages about IO being processed. These debug messages cause additional IO, which in turn causes more logging. This can get to the point of essentially locking up the system. It is strongly suggested that the messages file be moved off any QLogic-controlled disks to a local disk or via the network to a remote logging point to avoid this issue.

Enable extended logging for the qla2xxx driver to capture any additional error messages when the issue occurs:

$ chmod u+w /sys/module/qla2xxx/parameters/ql2xextended_error_logging
$ echo "1" > /sys/module/qla2xxx/parameters/ql2xextended_error_logging
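Given the caution above, it may help to confirm the setting after changing it and to switch it off again once the messages have been captured. The following is a minimal sketch, not a vendor-supplied procedure: the helper name set_qla_ext_logging is hypothetical, and it assumes that writing 0 to the parameter turns extended logging back off.

```shell
# Default sysfs path, taken from the commands above.
QLA_EXT_LOG=/sys/module/qla2xxx/parameters/ql2xextended_error_logging

# set_qla_ext_logging VALUE [FILE]
# Hypothetical helper: writes VALUE to the parameter file and prints the value
# read back so the change can be confirmed. FILE can be overridden for testing.
set_qla_ext_logging() {
    value=$1
    param=${2:-$QLA_EXT_LOG}
    if [ ! -e "$param" ]; then
        echo "parameter file not found: $param (is qla2xxx loaded?)" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$param" || return 1
    cat "$param"
}

# Typical use, as root:
#   set_qla_ext_logging 1    # enable before reproducing the issue
#   set_qla_ext_logging 0    # assumed to disable it once messages are captured
```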
Check for additional error logging in /var/log/messages when the issue occurs:
Mar 14 00:04:51 hostname kernel: qla2xxx_eh_abort(1): aborting sp ffff8102c5614680 from RISC. pid=1048458952.
Mar 14 00:04:51 hostname kernel: scsi(1): ABORT status detected 0x5-0x0.
Mar 14 00:04:51 hostname kernel: qla2xxx 0000:46:00.0: scsi(1:0:109): Abort command issued -- 1 3e7e36c8 2002.
Increase SCSI debug logging to get more information from the SCSI layer. This can be enabled without a reboot by using sysctl as follows:
$ sysctl -w dev.scsi.logging_level=0x1003
Note: Don't use other values, especially larger values such as 0xffff, unless you know exactly what each bit does. Turning on other values can flood the logs with so many messages that the important messages will be overwritten before ever being saved to disk and also cause huge log files to be created.
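To see why a value such as 0xffff is so much noisier than 0x1003, the logging level can be decoded into its per-facility fields. The sketch below assumes the layout used by the Linux kernel's SCSI logging header (ten 3-bit fields, starting with ERROR at bit 0); that layout and the helper name decode_scsi_logging_level are assumptions not stated in this document, so verify them against your kernel version.

```shell
# decode_scsi_logging_level VALUE
# Hypothetical helper: prints every non-zero 3-bit field of a SCSI
# logging_level value. Field order is an assumption based on the kernel's
# scsi_logging.h (ERROR at bit 0, then 3 bits per facility).
decode_scsi_logging_level() {
    level=$(( $1 ))
    shift_bits=0
    for name in ERROR TIMEOUT SCAN MLQUEUE MLCOMPLETE \
                LLQUEUE LLCOMPLETE HLQUEUE HLCOMPLETE IOCTL; do
        value=$(( (level >> shift_bits) & 7 ))
        if [ "$value" -ne 0 ]; then
            printf '%s=%d\n' "$name" "$value"
        fi
        shift_bits=$(( shift_bits + 3 ))
    done
}

# Decode the value recommended above.
decode_scsi_logging_level 0x1003
```

Under that assumed layout, 0x1003 enables only error logging (level 3) and mid-layer completion logging (level 1), which is why it stays comparatively quiet.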

Open cases with the SAN and fabric switch vendors involved.

With the SCSI logging_level and ql2xextended_error_logging set, wait for a few events to occur, then upload a fresh /var/log/messages file from the affected systems.

Check how many HBAs are present and whether the errors are spread across all of them or confined to one HBA:

Check the HBA PCI IDs:
$ lspci | grep QLogic
02:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)
46:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)
Check the number of errors on each of the cards in /var/log/messages (the counts below are sample values):
$ grep -F 02:00.0 /var/log/messages | grep -c qla2xxx
4
$ grep -F 46:00.0 /var/log/messages | grep -c qla2xxx
86
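When several HBAs are installed, the per-card counts can be gathered in one pass instead of one grep per PCI ID. This is a rough sketch: the function name count_qla2xxx_by_hba is hypothetical, and it assumes each qla2xxx line of interest contains a lowercase bus:device.function address such as 46:00.0.

```shell
# count_qla2xxx_by_hba FILE
# Hypothetical helper: counts qla2xxx log lines per PCI bus:device.function
# address found in FILE and prints "address count", sorted by address.
# Lines that mention qla2xxx but carry no PCI address are not counted.
count_qla2xxx_by_hba() {
    awk '
        /qla2xxx/ && match($0, /[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7]/) {
            count[substr($0, RSTART, RLENGTH)]++
        }
        END { for (addr in count) print addr, count[addr] }
    ' "$1" | sort
}

# Typical use:
#   count_qla2xxx_by_hba /var/log/messages
```

A strongly uneven result, like the sample values above, points toward the path or fabric behind one specific HBA.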
Check whether there is anything unusual on the fabric and paths for the device 46:00.0 (use the value that corresponds to your own environment).

Disclaimer
NOTE: One or more of the links above will take you outside the Hewlett Packard Enterprise website. HPE does not control and is not responsible for information outside of the HPE website.

© Copyright 2016 Hewlett-Packard Development Company, L.P.

Legal Disclaimer: Products sold prior to the November 1, 2015 separation of Hewlett-Packard Company into Hewlett Packard Enterprise Company and HP Inc. may have older product names and model numbers that differ from current models.

Document ID: mmr_kc-0123922-11