HP-UX iSCSI Software Initiator Support Guide: HP-UX 11i v1 & 11i v2 > Appendix D iSCSI Software Interface Driver Statistics

iSCSI Software Interface Driver Statistics

Statistics are maintained in the iSCSI Software Interface Driver (SWD). These statistics are explained in Table D-1 “Software Interface Driver Statistics”.

The Class column (CL) provides message classification. Messages can be informational (I), target errors (T), transient driver errors (D), or connectivity problems (C).

Informational Messages  are counters for driver events. They are not an indication of an error, but should an error occur, they may provide some profiling information.

Target Errors are detected at the initiator and should be reported to HP and/or the target vendor. Not all target errors are reported on the host side. It is the responsibility of the system administrator to monitor any device-specific logs for target issues.

Transient Driver Errors will typically occur when some resource, for example, memory, is in short supply, or something is not configured correctly. The error is considered transient because a retry of the operation, or a correct re-configuration, would typically be successful. I/Os that experience transient errors will be retried, so no data will be lost. Control operations such as an application open, or a task management command, may not be retried (the determination to retry is left to the application or to the administrator). If the system resource load is increased, a small value for a transient driver error statistic may be an indication of problems. Larger values for the transient driver error statistic will start to impact performance.

Connectivity Problems  will typically be network or target availability problems. Connectivity problems are transient in the sense that a network infrastructure engineer can resolve the problem and I/O traffic will resume as before.
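The class codes described above can be combined in a single CL value, for example D/C/T. The following Python fragment is a minimal sketch of how a monitoring script might expand such a value and decide whether a statistic deserves a closer look. The names CLASS_MEANINGS, expand_classes, and needs_attention are hypothetical and are not part of iscsiutil; the rule "non-zero and not purely informational" is an assumption made for illustration, not an HP recommendation.

```python
# Class codes as described in the text above. The dictionary and function
# names are hypothetical, chosen for this illustration only.
CLASS_MEANINGS = {
    "I": "informational",
    "T": "target error",
    "D": "transient driver error",
    "C": "connectivity problem",
}


def expand_classes(cl):
    """Split a CL value such as "D/C/T" into its individual class meanings."""
    codes = [code.strip() for code in cl.split("/")]
    unknown = [code for code in codes if code not in CLASS_MEANINGS]
    if unknown:
        raise ValueError(f"unknown class code(s): {unknown}")
    return [CLASS_MEANINGS[code] for code in codes]


def needs_attention(cl, value):
    """Assumed policy: flag a non-zero counter that has any class other than I."""
    codes = {code.strip() for code in cl.split("/")}
    return value > 0 and bool(codes - {"I"})


print(expand_classes("D/C/T"))
print(needs_attention("I", 5))    # purely informational counter
print(needs_attention("D/C", 1))  # non-zero driver/connectivity counter
```

A statistic classified only as I is never flagged by this sketch, which matches the statement that informational counters are not an indication of an error.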

Table D-1 Software Interface Driver Statistics

iscsiutil Statistic

CL

Description of Field

Software Interface Driver Global Interface Statistics

Number of connection opens

I

The number of TCP connection opens initiated. The statistic is incremented when a call to open a connection is made.

Number of connection closes

I

The number of TCP connection closes initiated. The statistic is incremented when a call to close a connection is made.

Software Interface Driver Connection Statistics

Number of times the login failed

I/T

The number of login failures due to incorrect text/key values and formatting. If this is a transient problem at the target end, the initiator would recover on a successive retry attempt.

Number of exception status class values returned by target

D

The number of iSCSI login phase failures. The reasons for a login failure can be determined by looking at the STM/syslog.log logs for more detailed information. The class of login failures enumerated here consists of interoperability issues (potential protocol violations) between the host and the target.

Number of PDU headers with Protocol errors received by initiator

T

The number of login failures due to protocol violations by the target. The protocol violation occurs when a target sends a PDU login response header and the initiator determines that the response is not protocol compliant.

Number of times iswd daemon failed to open a connection

D/C

The number of times the iswd daemon failed to open a connection to the requested target. The failure can be the result of resource allocation failures, incorrect target configuration, or network infrastructure problems. The exact reason for the failure will be logged in syslog.log.

Number of failures to send a login command due to kernel memory allocation failure

D

The number of attempts to send the Login command that failed due to memory allocation failures. The upper level driver recovery may retry the session open, resulting in a re-attempt to send the Login command. If the memory allocation request succeeds, the Login command will transmit successfully.

Number of asynchronous failures waiting for a login response

D

The number of asynchronous failures experienced while waiting for a login response. The asynchronous failure might be due to a PDU exchange timeout/abort or lack of memory resources. The upper level driver recovery may retry the session open, resulting in a re-attempt to send the Login command. If the memory allocation request succeeds, the Login command will transmit successfully.

Number of asynchronous failures waiting for a logout response

D/C

The number of asynchronous failures experienced while waiting for a logout response. The asynchronous failure might be due to a PDU exchange timeout/abort or lack of memory resources. One additional attempt to complete the Logout is made by requesting the iswd daemon to close the TCP connection and tear down the stream.

Number of unexpected TCP closes in the active state

C

The number of unexpected TCP close events received during the connection ready state. As part of recovery, all the I/Os on this connection are aborted, the connection is closed, and a session reopen is triggered.

Number of timeouts on FIN after sending a logout command

C

The number of timeouts on FIN after the target has sent a logout response. A close is triggered by requesting the iswd daemon to close the TCP connection and tear down the stream.

Number of TCP connection open timeouts

C

The number of TCP connection open timeouts that occurred. The timeout will trigger the freeing of resources. The upper level driver recovery may retry the session open, resulting in a re-attempt to send the Login command. If the memory allocation request succeeds, the Login command will transmit successfully.

Number of unexpected connection closes after a login command

C

The number of unexpected TCP close events received while waiting for a login response. As part of the recovery mechanism, resources are freed, the connection is closed, and a session reopen is triggered.

Number of unexpected connection closes after a logout command

C

The number of unexpected TCP close events received while waiting for a logout response. As part of the recovery mechanism, the Logout PDU is aborted, resources are freed, and the TCP connection is closed.

Number of target authentication timeouts

I/C

The number of target authentication timeouts that occurred during communication with the userspace iradd daemon.

Number of target authentication failures

I/C

The number of target authentication failures for CHAP. Either the target’s CHAP information is not configured in the RADIUS server, or the CHAP information provided by the target is incorrect.

Number of temporary redirection requests

I

Number of temporary login redirections requested by a target device.

Number of permanent redirection requests

I

Number of permanent login redirections requested by a target device.

Number of kernel memory allocation failures

D

The number of memory allocation attempts for a kernel PDU structure that failed. A failure to allocate a PDU structure means that some outbound command, or a NOP-OUT in response to a NOP-IN, could not be completed. As a result:

  • The regular occurrence of this event will have a negative impact on performance.

  • Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI); however, there is no retry policy for native iSCSI commands.

Number of streams message allocation failures

D

The number of streams message memory allocation attempts that failed. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of PDU transmission failures due to an offline connection

D

The number of transmission attempts of native iSCSI commands that failed because the connection on which the I/O was attempted was offline. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of streams message duplication failures

D

The number of streams message duplication operations that failed. As a result of the failure, the iSCSI Software Initiator will not transmit the related iSCSI commands. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of PDU exchange timeouts

D/C/T

The number of iSCSI PDUs successfully transmitted to the target for which no response was received within a specified period of time. If a timeout occurs, the iSCSI Software Initiator will initiate a logout for the session as part of the recovery, and eventually will attempt to login again with the target. The problem could be a network infrastructure problem or a target congestion issue. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of PDU exchanges aborted

C/T

The number of iSCSI PDUs that were aborted by the initiator while waiting for a response from a target. The abort occurs if a response to the iSCSI PDU has not been received at the initiator within a preset timeout period. The failure can also be due to an unexpected close of the TCP connection. The problem could be a network infrastructure problem or a target congestion issue.

Number of PDU exchanges abandoned

D/T

The number of exchanges between an iSCSI initiator and target that were abandoned. This will typically occur when the number of exchanges to complete a negotiation goes beyond a predetermined limit, usually indicating an infinite loop.

Number of I/Os issued on this connection

I

The number of SCSI I/Os that were sent to the networking stack by the iSCSI Software Initiator.

Number of I/O timeouts

D/T

The number of SCSI I/Os that did not complete within a time period preset by an upper level protocol (SCSI). The driver will recover from SCSI I/O timeouts using session level error recovery, for example, tearing down the session and starting over. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of kernel memory allocation failures

D

The number of kernel memory allocation failures for SCSI I/O related data structures. Failure to allocate will result in the I/O not being processed and returned to the SCSI layer. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of streams message allocation failures

D

The number of memory allocation attempts for a kernel driver structure that failed. A failure to allocate the driver SCSI structure means that some SCSI I/O could not be completed. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of I/Os delayed while waiting for resources

D

The number of memory allocation attempts for a kernel driver structure that failed. The iSCSI Software Initiator will retry the allocation request a set number of times. The ultimate failure to allocate the driver structure means that some SCSI I/O could not be completed. The regular occurrence of this event will have a negative impact on performance. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of I/Os failed due to an offline connection

D

The number of SCSI I/Os that were aborted as a result of retrying memory resource allocation when the connection went offline. As a result of this event, the corresponding SCSI I/O will be aborted. This could be a network infrastructure problem. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of I/Os failed due to memory resource constraints

D

The number of SCSI I/Os that failed to acquire memory resources within the maximum number of retries. As a result of the memory allocation failures, the SCSI I/O was failed back to SCSI. Failed I/Os will be retried according to existing SCSI retry policies (same as Fibre Channel and parallel SCSI).

Number of invalid Data-In PDUs received

T

The total number of invalid Data-In PDUs sent by the target. A non-zero value for this stat indicates that the target is not adhering to the iSCSI protocol by not sending Data-In PDUs with:

  • Buffer Offset in increasing offset order with non-overlapping ranges.

  • DataSN in increasing order.

Number of I/O underruns

I

The number of I/Os on a connection where the number of bytes that were received by the initiator did not match the number of bytes that the target sent.

Number of I/O underflows

I

The number of I/Os on a connection where the target sent less data than what was requested by the initiator. A large value for this statistic is normal.

Number of I/O overflows

I

The total number of I/Os on a connection where the target sent more data than what was requested by the initiator. This indicates that the Expected Data Transfer Length in the I/O request was not sufficient.

Number of I/O failures due to response code errors

T

The total number of I/Os that failed due to a response code error in the SCSI response PDU. This means that the target failed to execute the I/Os. A large value here could indicate that the target is not functioning properly.

Number of Data-In PDUs received without data

I

The total number of empty Data-In PDUs received by the initiator.

Number of invalid R2T PDUs received

T

The total number of Request to Transfer (R2T) PDUs from the target that had an incorrect buffer offset or Data Transfer Length in them. A non-zero value for this stat indicates that the target is not adhering to the error recovery policy negotiated by the initiator.

Number of I/Os that failed to respond to an R2T due to kernel memory allocation failures

D

The total number of Request to Transfer (R2T) PDUs that could not be honored by the initiator due to resource constraints at the initiator. A non-zero value for this stat indicates that the initiator is intermittently running out of resources while handling iSCSI I/O traffic.

Number of unexpected R2T PDUs received during a Read I/O

T

The number of Request to Transfer (R2T) requests sent by the target for a Read operation. This is an unexpected behavior.

Number of unexpected Data-In PDUs received during a Write I/O

T

The total number of data-in PDUs received by the initiator while the initiator was doing a write operation to the target. This is an unexpected behavior.

Number of Data-In PDUs with incorrect residual count

T

The total number of I/O requests where the target had an incorrect residual count value and where the status for the I/O was sent as part of the last Data-In PDU. This might happen when the target indicated an underflow condition but the residual count value did not match the expected residual count.

Number of SCSI Response PDUs with incorrect residual count

T

The total number of I/O requests where the target had an incorrect residual count value set. This might happen when the target indicated an underflow condition but the residual count value did not match the expected residual count. The response PDU in this case was sent as a separate PDU by the target.

Number of I/O failures due to streams message concatenation memory allocation failures

D

The total number of I/O failures in the inbound path resulting from a failure of the msgpullup call.

Number of holes seen in the status sequencing

D/T

The total number of PDUs that were received where the status sequence number of the PDU does not match the expected status sequence number. For error recovery level zero, this will cause the initiator to do a Session level logout.

Number of SCSI Async events received

I

The number of asynchronous events received by the initiator.

Number of "target requests logout" Async events received

I

The total number of times the target sent an asynchronous event with the AsyncEvent set as "target requests logout".

Number of "target will drop connection" Async events received

I

The total number of times the target sent an asynchronous event with the AsyncEvent set as "target will drop connection".

Number of "target requests parameter negotiation" Async events received

I

The total number of times the target sent an asynchronous event with the AsyncEvent set as "target requests parameter negotiation".

Number of "vendor specific" Async events received

I

The total number of times the target sent an asynchronous event with the AsyncEvent set as "vendor specific Async event".

Number of Reject PDUs due to Data Digest errors

C

The total number of Reject PDUs sent by the target that had the Reason set as "Data (payload) Digest Error".

Number of Reject PDUs due to SNACK rejects

I

The total number of Reject PDUs sent by the target that had the reason set as "SNACK Reject".

Number of Reject PDUs due to Protocol Errors

D

The total number of Reject PDUs sent by the target that had the Reason set as "Protocol Error".

Number of Reject PDUs due to excessive Immediate Commands

T

The total number of Reject PDUs sent by the target that had the Reason set as "Immediate Command Reject". This typically happens if the target has too many outstanding immediate commands.

Number of Reject PDUs due to "Task in progress"

T

The total number of Reject PDUs sent by the target that had the Reason set as "Task in Progress".

Number of Reject PDUs due to Invalid Data ACK

D

The total number of Reject PDUs sent by the target that had the Reason set as "Invalid Data ACK".

Number of Reject PDUs due to Invalid PDU field

D

The total number of Reject PDUs sent by the target that had the Reason set as "Invalid PDU Field".

Number of Reject PDUs due to Lack of Target Resources

T

The total number of Reject PDUs sent by the target that had the Reason set as "Long Operations Reject". A large value indicates that the target is frequently running out of resources.

Number of Reject PDUs due to Negotiation Resets

I

The total number of Reject PDUs sent by the target that had the Reason set as "Negotiation Reset".

Number of Reject PDUs due to target Waiting for Logout

I

The total number of Reject PDUs sent by the target that had the Reason set as "Waiting for Logout".

Number of Reject PDUs due to Miscellaneous reasons

T

The total number of Reject PDUs sent by the target for which the Reason code does not match any of the currently defined codes.

Time when statistics were last cleared

I

The time that the statistics were last cleared. This provides an indication of the period of time to which the statistics can be applied, and therefore can be used for averaging the statistics. Because each system is different, separate statistic rates can be determined on a per-system basis and used to determine load changes.
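As a rough illustration of the averaging described in the last table entry, the following Python sketch converts a raw counter into an average hourly rate over the period since the statistics were last cleared. The function name events_per_hour and the sample values are hypothetical; how the counter and the clear time are obtained from iscsiutil output is not shown and is left as an assumption.

```python
from datetime import datetime


def events_per_hour(count, last_cleared, now):
    """Average number of events per hour since the statistics were last cleared."""
    elapsed_hours = (now - last_cleared).total_seconds() / 3600.0
    if elapsed_hours <= 0:
        raise ValueError("'now' must be later than 'last_cleared'")
    return count / elapsed_hours


# Illustrative values only, not taken from a real system:
# 36 events counted over the 12 hours since the last clear.
last_cleared = datetime(2005, 3, 1, 8, 0, 0)
now = datetime(2005, 3, 1, 20, 0, 0)
print(events_per_hour(36, last_cleared, now))  # prints 3.0
```

Comparing such a rate against a baseline recorded earlier on the same system is one way to notice a change in load, since absolute counter values alone depend on how long ago the statistics were cleared.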