As Windows NT is more commonly deployed into enterprise environments, support tools are needed that provide quick solutions to problems that arise. Physical distance from the malfunctioning machines, reliability of information obtained from end users, and the determination of the appropriate troubleshooting steps are among many factors that make support of desktops and servers a complex proposition. There are a number of tools available today that attempt to ease the burden on IT professionals by addressing some of these factors.
E-Diagnostics for Windows, an Internet-based NT support tool offered as part of an NT software support contract with Hewlett Packard's Customer Support Organization, provides an end-to-end solution that offers the fastest possible resolution of problems by:
· Guiding customers through a simple question and answer dialog that launches diagnostic probes throughout their environment in an attempt to solve the problem at the customer site.
· Electronically escalating problems that cannot be solved locally to a Hewlett Packard Response Center, automatically transferring system configuration and diagnostic data.
· Frequently updating the diagnostic capabilities of the onsite tools through an automated download mechanism, maximizing the troubleshooting capability of the customer locally.
The remainder of this paper explores the problem of supporting NT, gives examples of currently available point products that partially address these issues, and details the end-to-end solution that HP provides.
A Typical NT Deployment
Most NT environments of significant size share a similar architectural layout:
A typical setup has a central IT group maintain the company-wide networking services such as user accounts (via one or many Primary Domain Controllers) and TCP/IP name resolution (via some combination of DNS, DHCP, and WINS). Field offices, connected to the central office through some kind of WAN, have a Backup Domain Controller to speed the authentication of the desktop machines and provide productivity services like file and print sharing, email, or database access using BackOffice applications. Finally, the end user desktops running NT Workstation, Windows 95, or Windows 98 interact with the local services through a LAN and offer productivity applications.
There are a multitude of variations to this scheme; for example, some companies have only one site, while others have too many people for a single PDC. Nonetheless, the basic layout is similar across the vast majority of businesses.
Support Problem in the Typical Environment
Given this environment, there are many support problems that arise:
· Physical distance from machines - Understanding a defect with a machine almost always requires obtaining data from it. If a central office is in Phoenix and a field office is in Orlando, it is not convenient to walk over to a machine with a problem.
· Reliability of information obtained from users - When an end user describes a problem, if specifics are not correctly relayed to the IT professional trying to determine a solution, the time spent crafting an answer can increase dramatically.
· Determination of diagnostic steps - Not all problem solvers are created equal. Oftentimes, the personnel at the field offices do not have as much knowledge as those at the central office. Based on knowledge of the environment and of NT in general, a more experienced person may take a dramatically different set of steps than a less knowledgeable person.
· Time taken to perform various diagnostic tests - Manually testing different things in an effort to get a clear picture of a problem takes a significant amount of time.
· Analysis of diagnostic test results - Once tests are run, their results must be analyzed and compared to some expected state. Like knowing what steps to perform, the success factor here is linked to the knowledge of the person performing the tests.
· Interaction with a support provider call coordinator - If the problem cannot be solved with local resources, a call can be logged with a support provider like Hewlett Packard. Most support providers have non-technical people answering the phone whose job it is to get information about the problem and then route the call to the appropriate team. Often crucial information can be lost as reported facts are translated into the call tracking system. Not only does this increase the chance that the call will be incorrectly routed, but information often has to be repeated when the correct support engineer gets involved.
· Data collection by the support engineer - Here, the support engineer capable of solving the problem might have to ask some of the same questions the call coordinator did because of the data loss when the problem description was entered. Even when that does not happen, more data is typically needed to get an accurate description of the environment.
· Multi-level internal support - The support picture is further complicated if a company has its own internal helpdesk that attempts to solve problems before the central IT personnel get involved. Typically, helpdesk personnel are not authorized to escalate the problem to the support provider, and any problem data that has already been gathered and stored in the local call tracking system is not preserved as part of the escalation.
A number of tools offer partial solutions to the problems discussed above.
Diagnostic Tools
These tools are typically based on a reasoning system and present the user with a sequential list of questions retrieved from a local database.
The questions can be answered either by direct input from the user or by programmatically querying the system. Based on the answers received, an attempt is made to determine the nature of the problem. Examples of such tools include First Aid from McAfee and System Wizard from Systemsoft.
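The question-driven approach these tools take can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Question` structure, the question text, and the `disk_nearly_full` probe are invented for illustration and do not reflect how First Aid or System Wizard are actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union


@dataclass
class Question:
    """One node in a hypothetical troubleshooting tree."""
    text: str
    # Maps an answer ("yes" or "no") to a follow-up Question or a diagnosis string.
    branches: Dict[str, Union["Question", str]]
    # Optional probe that answers the question by querying the system
    # instead of asking the user.
    probe: Optional[Callable[[], str]] = None


def diagnose(question, ask_user):
    """Walk the tree until a diagnosis (a plain string) is reached."""
    node = question
    while isinstance(node, Question):
        answer = node.probe() if node.probe else ask_user(node.text)
        node = node.branches[answer]
    return node


def disk_nearly_full():
    # Placeholder probe (assumption): a real tool would query the operating system.
    return "no"


tree = Question(
    text="Does the application start at all?",
    branches={
        "no": "Reinstall the application.",
        "yes": Question(
            text="Is the disk nearly full?",
            branches={
                "yes": "Free up disk space.",
                "no": "Escalate to the helpdesk.",
            },
            probe=disk_nearly_full,
        ),
    },
)
```

Calling `diagnose(tree, input)` would prompt at the console for any question without a probe. Questions that carry a probe are answered without user involvement, which is the "programmatic query" path described above.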
The main advantage of these tools is that they are extremely easy to use and give consistency to the problem solving process. These products are typically most useful for diagnosing problems surrounding the use of a productivity desktop machine and are often deployed to all end user machines in an effort to reduce the number of "simple" calls that come into a corporate helpdesk.
The downside to such tools, however, is that the knowledge they contain is limited to solving problems on a single machine. In the typical NT environment, an end user can be experiencing a problem because of a root cause on a machine that is not their own. Additionally, these tools do not integrate with the other parts of the support chain. If the knowledge contained within them does not solve the problem and a call must be logged to escalate the issue, the data collected during the end user's aided diagnosis is lost. Finally, because these tools must be deployed in their entirety on each machine in an environment, the total amount of corporate disk space they consume is relatively high.
Still, for simple desktop problems that are isolated to a single machine, these tools are a good buy.
Remote Access Tools
Typically, a remote access tool allows a system administrator to launch an application on his or her own system that makes a network connection to another machine in the computing environment. On the system administrator's machine, there is a window that has in it a live picture of the remote machine's screen that can be used to interact with the remote system as if the machine were sitting there locally. Examples of such tools are pcAnywhere from Symantec and Remote Desktop from McAfee.
Advantages of these tools are obvious. A system administrator can get access to any machine in the environment, run diagnostic tests, and change configurations without having to walk over to the machine, which is not always possible.
Access, however, is all that is provided. There is no help in determining what steps to take in diagnosing the problem, all data must be collected manually, and no data collected can easily be sent on to the next escalation point. Remote access tools are essential to any IT environment, but they do not provide a big enough solution to make support easier.
· Tools for remote access like pcAnywhere
· Pros: Provides access to machines remotely
· Cons: No help in diagnosing, data collection must be done manually, no escalation path
Electronic Call Logging
Many support providers offer the ability to log a call through a web interface instead of picking up the phone. Typically such interfaces allow the customer to provide a detailed description of their problem along with information regarding how to contact them with a solution or a request for more specific data. An example of this is HP's Software Call Manager which is part of the Electronic Support Center.
This effectively circumvents the role of the call coordinator in the call logging process. The description of the problem can be as specific and as technical as the customer wants it to be. The contents of the description are parsed for key phrases and electronically routed to the appropriate team. Because no accuracy or detail is lost, the amount of follow-up data needed by the support engineer is reduced and the time needed to resolve the issue is far less than when the call is logged by phone.
All data provided, however, must be entered manually by the customer. Additionally, that data sometimes reflects the customer's perception of the problem rather than an unbiased account. Nor does electronically logging a call provide any help in trying to solve the customer's problem before a call has to be logged.
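The key-phrase routing mentioned above can be pictured with a short Python sketch. The team names, the phrases in `ROUTING_TABLE`, and the scoring rule are all assumptions made for illustration; the source does not describe how Software Call Manager actually parses or routes calls.

```python
# Hypothetical routing table: team name -> key phrases that suggest that team.
ROUTING_TABLE = {
    "printing": ("printer", "print queue", "spooler"),
    "networking": ("dns", "wins", "dhcp", "tcp/ip"),
    "security": ("logon", "password", "permission"),
}


def route_call(description, default_team="general"):
    """Pick the team whose key phrases occur most often in the description."""
    text = description.lower()
    scores = {
        team: sum(text.count(phrase) for phrase in phrases)
        for team, phrases in ROUTING_TABLE.items()
    }
    best_team = max(scores, key=scores.get)
    # Fall back to a default queue when no key phrase matched at all.
    return best_team if scores[best_team] > 0 else default_team
```

Plain substring counting is deliberately naive. It is enough to show why a detailed, technical description routes more reliably than a vague one: the more specific terms the customer supplies, the clearer the winning team becomes.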
E-Diagnostics for Windows, an HP support product currently available to NT support contract customers and soon purchasable through the web for non-contract customers, combines elements of the partial solutions to provide a complete, end-to-end solution to the NT support problem. It consists of:
· Diagnostics that can involve more than one machine in an environment
· Remote, automated data collection
· Integration with Software Call Manager for problems that the software does not solve
Installation
Once the product has been downloaded from Hewlett Packard, installation occurs in two phases. On an NT Server machine running Internet Information Server 4.0 or later, the Management Server software is installed. This is the central E-Diagnostics hub in the customer environment. The logical components for all diagnostics reside there, as do the modules that communicate information back to HP should the problem have to be escalated.
All machines that could potentially be diagnosed with E-Diagnostics are called Nodes. In the second phase of installation, each Node must have at least the Diagnostic Installer software installed on it. The Diagnostic Installer enables dynamic installation of data collection components while a diagnostic session is being executed. This module is provided because it is unlikely that all problems will occur on all Nodes, so customers are not forced to install all data collection components on all machines ahead of time. The Diagnostic Installer on the Node being diagnosed communicates with the logical components on the Management Server to ensure that the needed data collection components are present before a diagnosis takes place.
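The install-on-demand idea behind the Diagnostic Installer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the component names, and the `install` callback are hypothetical, and the real product's protocol between the Node and the Management Server is not described in this paper.

```python
def ensure_components(required, installed, install):
    """Install any required data collection component the Node lacks.

    required  -- component names a diagnostic needs (hypothetical names)
    installed -- set of component names already present on the Node
    install   -- callback that installs one component by name
    Returns the list of components that had to be installed.
    """
    missing = [name for name in required if name not in installed]
    for name in missing:
        install(name)
        installed.add(name)
    return missing
```

Because only the missing components are installed, a Node that never experiences a printing problem never receives the printing collectors, which is the disk-space advantage the text describes.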
Process Flow
The process of using E-Diagnostics starts when a System Administrator launches a browser from any machine in the environment and points it to the Management Server and the E-Diagnostics URL. From this Management Client, the System Administrator selects a diagnostic to run from the list provided.
If none of the diagnostics currently available fits the problem that the System Administrator is experiencing, the problem can immediately be escalated to HP by logging a call electronically. In this scenario, however, assume that a printing problem is being experienced. Next, the System Administrator must enter the name of the machine on which the problem is being experienced.
Since configurations can vary greatly between different machines in an environment, it is important to diagnose the problem from the perspective of the machine that is experiencing the problem.
After this first question is answered, the logical components on the Management Server ensure that the needed data collection components have been installed on the Node selected. If not, they are installed using the Diagnostic Installer. Then, the list of currently configured printers is collected from the Node in question. This particular machine has several printers configured on it, so the System Administrator must choose the printer in question.
At this point for this diagnostic, no more data will be required from the System Administrator. The logical components on the Management Server collect whatever data is needed from the data collection components that have been installed on the Node. The logical components follow a specific set of steps in attempting to find the root cause of the problem. When the diagnosis is complete, the results are displayed:
Notice that a cause is given as well as details regarding the exact set of steps that the logical components performed during the course of the diagnostic.
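A result that reports both a cause and the steps taken to reach it could be produced by logic along the following lines. This Python sketch is an assumption-laden illustration, not the product's implementation: the step descriptions, the keys in the collected data, and the cause strings are all hypothetical.

```python
def run_diagnostic(steps, node_data):
    """Run checks in order and stop at the first one that fails.

    steps     -- list of (description, check, cause) tuples
    node_data -- data gathered from the Node by the data collection components
    Returns the cause (or None) plus a trace of every step performed.
    """
    trace = []
    for description, check, cause in steps:
        passed = check(node_data)
        trace.append((description, "passed" if passed else "failed"))
        if not passed:
            return {"cause": cause, "steps": trace}
    return {"cause": None, "steps": trace}


# Hypothetical steps for a printing diagnostic.
PRINTING_STEPS = [
    ("Spooler service is running",
     lambda data: data["spooler_running"],
     "The Spooler service is stopped on the Node."),
    ("Print server answers on the network",
     lambda data: data["server_reachable"],
     "The print server cannot be reached from the Node."),
]

result = run_diagnostic(
    PRINTING_STEPS,
    {"spooler_running": True, "server_reachable": False},
)
```

Because the same ordered steps run regardless of who starts the session, the trace in `result["steps"]` gives the person reading the results a consistent record of what was checked and in what order.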
Suppose that this information did not solve the problem and a call should be logged with HP. The question at the bottom of the results page, "Did this solve your problem?", should be answered with "No, log a call with Hewlett Packard." Selecting this link causes three things to occur. First, the customer is programmatically logged in to the Electronic Support Center behind the scenes. Next, all data that has been collected during the current problem solving session is sent up to the Electronic Support Center so that it can be incorporated into the new call. Finally, a new instance of the browser on the Management Client is launched and pointed directly to the Software Call Manager screen.
There, the System Administrator can enter more information about the problem and submit the call. The data that was sent up to the Electronic Support Center by E-Diagnostics is appended to the call so that the HP support engineer has access to it while attempting to determine a solution.
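One way to picture the data that travels with an escalated call is as a single structured document that bundles the session's answers, steps, and findings. The Python sketch below is purely illustrative: the field names, the machine name, and the use of JSON are assumptions, since the paper does not describe the format E-Diagnostics uses when sending data to the Electronic Support Center.

```python
import json


def build_escalation_payload(diagnostic, node, answers, steps, cause):
    """Bundle everything gathered in a session into one JSON document."""
    payload = {
        "diagnostic": diagnostic,  # which diagnostic was run
        "node": node,              # machine the problem was diagnosed from
        "answers": answers,        # what the System Administrator entered
        "steps": steps,            # checks performed and their outcomes
        "cause": cause,            # cause reported to the customer, if any
    }
    return json.dumps(payload, indent=2, sort_keys=True)


document = build_escalation_payload(
    diagnostic="printing",
    node="ORLANDO-WS12",  # hypothetical machine name
    answers={"printer": "Accounting LaserJet"},
    steps=[["Spooler service is running", "passed"]],
    cause=None,
)
```

Keeping the session data in one document means nothing has to be retyped when the call is logged, which addresses the data loss described earlier for phone-based escalation.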
Benefits
· Easy to Use - System Administrators are guided through a simple question and answer dialog during the course of the diagnosis.
· Consistent Diagnosis and Data Collection - Regardless of the skill of the person attempting the diagnosis, the logical components run the same. For the less experienced person, this allows them to solve problems they otherwise would not be able to. For the more experienced person, data is collected and analyzed for them, reducing the amount of time they spend on a particular problem.
· Integrated Call Escalation - Problems that cannot be solved locally by E-Diagnostics are easily escalated to a Hewlett Packard Response Center, automatically transferring system configuration and diagnostic data.
· E-Diagnostics for Windows
· Determines which diagnostic steps to perform
· Performs diagnostic tests on end user systems remotely
· Perspective from the machine experiencing the problem
· Analyzes the results
· If the problem is not solved, all data is saved and escalated to Software Call Manager.
· Current problem set includes top 3 problem areas HP receives NT calls on:
  · Network connectivity
  · Network security
  · Printing
· Planned for September:
  · Blue screen
  · SQL