Y2K REMEDIATION OF DISTRIBUTED INFRASTRUCTURE
Wayne Bock and Terry McLain
A•R•C
100 Tri-State International
Lincolnshire, IL 60069
847-317-1000
847-317-1008
Most organizations have long-standing, extensively funded and structured Y2K remediation efforts underway for mainframes, legacy systems and embedded processors. They have also paid considerable attention to business specific applications, both on the mainframe and in the distributed environment. The interoperability of these business applications with those of their business partners has been another point of emphasis.
Until early this year, however, a number of organizations neglected to pay significant attention to the Y2K readiness of the general distributed infrastructure—networks, operating systems, shrink-wrapped applications and PC hardware. Some companies chose to wait until 1999 to address this issue, while others belatedly became aware of the issues and complexities of preparing the infrastructure for Y2K.
In the third quarter of 1998, a number of our clients asked A•R•C to investigate and evaluate how we could leverage our well-established and accepted technology deployment service to assist IT organizations in preparing their distributed infrastructure for the millennium. We initially thought it was too late. A few days of investigation convinced us that we were wrong and that many organizations could use help.
We discovered that Y2K initiatives at many organizations tended to fall into one of three categories:
· Had started Y2K activities in their infrastructure environment but were still unsure of the effectiveness and accuracy of inventory attempts.
· Were bogged down in the mechanics of inventory.
· Had not yet developed positions regarding infrastructure hardware and software with respect to Y2K.
Impeding many of these activities were:
· Distributed responsibility: IT is only responsible for a subset of the infrastructure.
· Lack of fully implemented standards and standard images (often an outcome of distributed responsibility).
· An under-appreciation by corporate Y2K steering committees of the complexity of the needed inventory activity in the distributed environment.
In addition, the critical need to involve end users in the process and to determine the level and nature of their participation was often only sketchily understood.
On the other hand, a sizable number of IT organizations appreciated how the need to undertake Y2K infrastructure remediation provided them a great opportunity to make significant strides forward in such areas as asset management and establishing desktop standards, and were keen to take advantage of this situation.
Many IT organizations very much wanted to retain and maintain the value of the Y2K inventory; a difficult thing to do if the procurement, installation, move, add and change processes are not modified to update the inventory (asset) database subsequent to completing the inventory. Establishing a coherent asset database and making such process changes, in parallel with Y2K remediation, is an ambitious undertaking that can influence the overall timeline. Such risks must be carefully weighed.
Given the uncertainties, multiple agendas and risks inherent in the situation described above, we became convinced that we should develop a comprehensive, methodical approach that would be tailored, through assessment, to the specific needs of an organization. By doing this we could help IT organizations make the appropriate decisions to balance risk, cost and quality, thus achieving a controlled state of Y2K readiness with documented risks and gaps, backed up by appropriate contingency plans.
The methodology would provide a template and model for guiding the IT organization through this process. It allows, through collaborative effort with the IT organization, a rational selection of those steps and activities which meet their needs and satisfy their risk profile, while ensuring that the scope of the project remains manageable and focused on an outcome achievable within the immutable constraints of the millennium timeframe. It should be noted that throughout this entire process the ultimate responsibility for Y2K readiness remains with the IT organization.
This paper describes the methodology we developed and our experiences applying it in numerous environments.
One of the key distinctions in our methodology is the use of the term Y2K ready as opposed to Y2K compliant. There is a balance that must be struck between the inherent risks and costs of these two solutions.
In our lexicon, Y2K ready refers to a device or software that has been modified or patched to store and interpret dates such that the year 2000 is correctly processed, and has been tested to verify this. Y2K compliant refers to a device or software that is designed and manufactured to store and interpret four-digit dates and is warranted to do so by the manufacturer.
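In practice, much "Y2K ready" remediation of existing software came down to interpreting two-digit years through a fixed pivot window rather than re-storing four digits. A purely illustrative sketch of such a windowing fix (the function name and pivot value of 50 are our own assumptions, not part of the methodology):

```python
def expand_year(two_digit_year, pivot=50):
    """Interpret a two-digit year using a fixed pivot window.

    Values below the pivot map to 2000-2049; values at or above it
    map to 1950-1999. The pivot of 50 is an illustrative assumption,
    not a universal standard.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year")
    if two_digit_year < pivot:
        return 2000 + two_digit_year
    return 1900 + two_digit_year

# A "Y2K ready" patch applies a window like this and is then tested;
# a "Y2K compliant" product stores four-digit years outright.
print(expand_year(0), expand_year(99))  # 2000 1999
```

The distinction in the text then becomes concrete: a windowed fix must be tested and its readiness maintained, whereas a compliant product carries the manufacturer's warranty.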
We draw this distinction to emphasize the need to maintain the ongoing Y2K readiness of a device or software, once it has been remediated, through effective change management and monitoring processes. The methodology provides the client with mechanical assistance in this area. However, since we do not assume operational control of all areas of the distributed infrastructure, we are unable through the methodology alone to guarantee continuing readiness. IT organizations therefore must make a risk/cost decision between making existing devices or software Y2K ready, or replacing them with those which are Y2K compliant. Because of limited or inefficient change management processes and control, this is a serious concern to many IT organizations. They must evaluate whether their current operations processes are robust enough to maintain readiness.
Some IT organizations see Y2K remediation as a great opportunity to:
· improve change management processes
· introduce lock down control of the desktop
· install standard software images
Their ability to do this within the timeframe depends on how mature their current operations processes are and how much preparation time and effort has already been expended.
A second distinctive element of our methodology is the extensive and rigorous use of position statements. These position statements are policy decisions and supporting rationale, made by the client, that clearly state the specific action to be taken to make a device or software element Y2K ready. Before, during and after the discovery and inventory phases of the methodology, we work with the IT organization to establish, document and further refine a clear position statement for every device type and software element that is discovered in the infrastructure. These decisions are guided, in part, by information developed by the manufacturer. Much of this information is provided to IT organizations through our methodology and supporting tools. The position statements are stored in a database that is readily accessible to the field technicians and the end users, usually through a web interface.
The position statement usually includes one of four options:
Remediate – apply appropriate patches and fixes, etc.
Upgrade – install a new version or engineering change
Replace – install a new product, model or release
Remove (Retire) – eliminate from inventory
There are two significant reasons for this emphasis on rigorously establishing position statements.
· To assist the client in making appropriate risk/cost evaluations through thorough analysis of manufacturer data.
· To ensure that the remedial implementation is executed with a minimum of errors and confusion, and without significant cost overrun.
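To make the structure concrete, a minimal position-statement store of the kind described above might look like the following sketch; the schema, table and column names are our own illustration, not the actual database used in the methodology:

```python
import sqlite3

# Illustrative schema: one position statement per device type or
# software element, constrained to the four remediation options.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE position_statement (
        element    TEXT PRIMARY KEY,   -- device type or software element
        action     TEXT NOT NULL CHECK (action IN
                   ('Remediate', 'Upgrade', 'Replace', 'Remove')),
        rationale  TEXT NOT NULL,      -- supporting policy rationale
        mfr_source TEXT                -- manufacturer information consulted
    )
""")
conn.execute(
    "INSERT INTO position_statement VALUES (?, ?, ?, ?)",
    ("ExampleBIOS 4.5", "Upgrade",
     "Vendor flash update fixes RTC rollover", "vendor bulletin"),
)
# Field technicians and end users would query this, typically
# through a web front end, to find the agreed action.
row = conn.execute(
    "SELECT action FROM position_statement WHERE element = ?",
    ("ExampleBIOS 4.5",),
).fetchone()
print(row[0])  # Upgrade
```

The CHECK constraint mirrors the paper's point that scripts given to field technicians should carry a minimum of decision points: only the four sanctioned actions can be recorded.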
To be effective and successful, technicians in the field who are actually performing the changes need clear, concise scripts and instructions with a minimum of decision points. This is also true for the end users, who should be prepared for what is happening to their desktops. Easy reference to the database of position statements is a great help in achieving these ends. The importance of good and effective management of user expectations is a lesson we have learned from our help desk practice.
End User Awareness
Another distinctive element in our methodology is the emphasis on end user awareness through comprehensive communication, training and support. For a successful outcome it is vital that the end users participate in the activities, are kept informed of progress and, through representatives, have significant input into developing position statements. After all, they understand what is mission-critical in their functional areas, particularly when it comes to databases and spreadsheets. For practical reasons, our methodology does not attempt to remediate either the storage or manipulation of data in spreadsheets or end user databases. It does provide the end users with tools, training in the use of those tools and the technical support that will enable them to remediate those data that they, themselves, consider critical. We strongly advise the IT organization not to attempt this remediation directly. It is interesting to note that some organizations leave end users to their own devices for this class of remediation. One obvious concern about this choice, aside from lost productivity, is the impact it could have on the help desk at the start of the millennium. Planning for extra help desk load at the end of the year, although not a specific element of this methodology, is a very important contingency planning consideration. Consequently, our methodology includes:
· The development of a robust communications plan, most often taking advantage of web technology, to disseminate status reports and the latest position statements.
· A training plan to help end users remediate their own data. We have found the use of video to be very effective.
· Hotline access to an on-site team.
· End user consultation in the establishment of position statements.
Our complete methodology is broken down into five primary phases, listed below, each with a set of tasks that are described in detail later:
· Pre-inventory Assessment
· Inventory and Audit
· Assessment and Remediation Planning
· Remediation Implementation
· Validation
The output and deliverables from each phase serve as inputs to the next phase. However, the phases and the steps within each phase have been structured so that subsets can be selected and combined to meet specific client circumstances. IT organizations respond well to this approach, both because of its flexibility and because it gives the client an opportunity to be aware of what is being omitted and the related risks that this represents. The reasons IT organizations select particular subsets of the methodology are many and varied:
· Availability of internal resources to undertake some of the steps
· Cultural position of IT
· Cost constraints
· Well-established and effective inventory processes already in place
· IT organization's predisposition to certain tools
· Geographical distribution of end users
· Unwillingness to invest in a test lab
· Recent refresh of the desktop
· Outside vendor-provided asset management
A very common subset that we have responded to comprises:
· An inventory and audit
· Standard reports
· Hardware (BIOS) remediation of well-defined and predetermined configurations
In many cases, this can be effectively done in one sweep of the infrastructure, often referred to as a "health check."
Our experience to date has shown that the first four phases of this methodology can be successfully executed in 12 to 18 weeks, depending on the size of the IT organization's inventory.
Pre-inventory Assessment Phase
The principal objectives of this phase are discovery, preparation and information gathering, and analysis that will enable the development of a detailed plan to drive the rest of the phases. Some of the information will have already been collected. One significant output of this phase is a reliable estimate of the cost and resources necessary to complete the assessment and remediation. We have found, somewhat surprisingly, that some IT organizations have considerable difficulty in establishing baseline information. As a result, it is necessary to do primary discovery during the inventory and audit phase. This makes it almost impossible to reliably estimate resources required for the inventory and audit phase. The following activities are carried out during this phase:
· Identify the detailed scope of components and associated requirements
· Identify and review configuration standards and images
· Develop a baseline of existing systems (numbers and locations)
· Review warranty and maintenance status of systems
· Determine the appropriate methodology for inventory and audit (remote or on-site). This can be a function of current operations practices, tools that are already in place and/or operating system levels that are installed, among other variables.
Note: There are some trade-offs to be made between on-site and remote inventories that should be reviewed prior to making a decision on the best practice.
· On-site inventories will do a hard BIOS test whereas remote inventories will only do limited software compliance testing.
· Remote audits will not collect geographical information. This limits reporting options.
· Remote audits may not collect other key information such as serial number.
· Remote audits do not permit affixing an asset tag.
· Review inventory data, if the IT organization has already collected it. This review will determine whether the Y2K readiness project can proceed as planned or whether an additional or new inventory is required.
· Select tools to be used in the inventory and audit phase. If the IT organization is providing tools or has pre-determined which tools should be used, we review their effectiveness and advise on the overall impact. Then the most effective tool is selected and utilized.
· User awareness planning and identification of affected parties
· Identify relationships with other Y2K initiatives, establish single points of contact with MIS and other IT groups, understand their timelines and expectations.
· Identify affected SBU groups, understand their timelines, priorities and expectations.
· Develop an ongoing end user communication plan and training plan, determine methods of contact (e.g., web page), and determine human and technical resources needed to implement plan.
Note: The training plan might include providing tools and training to end users so that they or the client's IT department can remediate data. Training assistance and the tools can be provided by the vendor, but we do not suggest that any vendor or IT organization get directly involved in, or take responsibility for, the actual remediation of these data. If this is the situation with the client, it must be clearly stated in the SOW.
· Review with the IT organization, develop and document:
· Standards for date use and testing
· Remediation prioritization criteria (work with end users)
· Readiness standards
· Preliminary position statements for each application and system element.
· Establish a test laboratory
· Determine configuration to adequately simulate the enterprise.
· Obtain, through review with the IT organization, budget and resources to create lab.
· Set up and configure lab.
· Set up and configure an audit console. (Will depend on the type of inventory audit being undertaken.)
Note: Some IT organizations have been reluctant, mainly for cost reasons, to set up test laboratories. Although it is possible to work around this, the use of test laboratories minimizes the risk of disruption in the inventory phase and, more particularly, in the remediation phase where interoperability testing can forestall a number of issues.
· Develop, for IT management’s review and approval, more refined budgetary planning estimates for all succeeding phases. (It is assumed that budgetary planning estimates were developed in the original statement of work.)
· Develop and produce a detailed statement of work for the next phase, if necessary. This should include a specific price for the next phase. Review with client and obtain approval to proceed.
· Project initiation, planning and on-going project management
· Identify an IT organization single point of contact. This is critical for there is much in this methodology that must be reviewed with the client.
· Set up schedule and format for status reports and status meetings.
· Establish schedule and procedure for “client review and approval to move to next phase of project” for each project phase.
· Install project management tools. Create and distribute project plan.
· Assign resources.
· Conduct project kick-off.
· Set schedule for quality reviews.
Inventory and Audit Phase
This phase results in the detailed collection of information concerning all systems that can be analyzed and audited to help determine the scope and specifics of remediation activity. It is vital to the development of an effective and comprehensive remediation plan. Typically, this information will be collected to a repository, usually an SQL database. This activity can be executed remotely via server-based tools that are activated at logon or by physically running appropriate scripts on each station. The appropriate method will have been determined in the pre-inventory assessment phase. One important consideration is whether to suspend installs, moves, adds and changes (IMACs) during this phase, since the inventory data can be quickly obsoleted unless the IMAC process can be constructed or modified to update the repository. This must be carefully reviewed with the IT organization. The following activities are to be carried out during this phase:
· Obtain and test selected inventory and audit tools. For hands-on audits this may include virus detection and eradication software.
· Select and establish an enterprise audit repository. Normally this will be an SQL database. At some IT organizations a suitable repository may already exist.
· Establish and install a Y2K knowledge base.
· Develop and test hardware and software scripts and audit procedures.
· For remote audits: develop login scripts, install standard image, develop and test audit software rollout plan, obtain client approval of rollout plan. This should include a pilot rollout.
· For hands-on audits: we suggest engaging a technology deployment services provider (TDS) and following their standard procedures. Include a pilot test.
· Select and obtain approval of the pilot group.
· Conduct pilot test, review results and modify rollout or deployment plan accordingly.
· All of the above should result in a set of documented deliverables including:
· For remote audits: configuration of audit server, login script and back up procedures.
· For hands-on audits: inventory and audit scripts, written procedures.
· An implementation plan including timelines and schedules.
· Written approval from the IT organization.
· Enterprise Implementation
· For remote audits: populate collection server, turn on audit software, collect audit data.
· For hands-on audits: execute TDS plan, collect audit data.
· Port audit data into enterprise repository, review supported applications by department, review unsupported applications by department, develop and review, with IT management, position statements for all applications.
· Perform any reconciliation processes that are feasible.
· From the above, the following deliverables should be created:
· General audit reports by department
· Application reports by department for supported applications
· Application reports by department for unsupported applications
· Readiness reports by department with severity assessments
· Readiness report for the enterprise with severity assessments
· Repository containing all collected data
· Develop and produce a detailed statement of work for next phase, if necessary. This should include a specific estimate of resources required for the next phase.
· Communicate plans and status to the end users via selected communications media.
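Assuming the audit data lands in an SQL repository as described, the departmental readiness reports listed above could be produced with a simple aggregation. The table and column names in this sketch are hypothetical illustrations only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE audit_record (
        asset_tag  TEXT PRIMARY KEY,
        department TEXT NOT NULL,
        element    TEXT NOT NULL,    -- e.g. BIOS, OS, application
        ready      INTEGER NOT NULL  -- 1 = passed Y2K readiness test
    )
""")
conn.executemany(
    "INSERT INTO audit_record VALUES (?, ?, ?, ?)",
    [
        ("PC-0001", "Finance", "BIOS", 0),
        ("PC-0002", "Finance", "BIOS", 1),
        ("PC-0003", "Sales",   "BIOS", 1),
    ],
)
# Readiness report by department: ready devices vs. devices audited.
for dept, ready, total in conn.execute("""
    SELECT department, SUM(ready), COUNT(*)
    FROM audit_record
    GROUP BY department
    ORDER BY department
"""):
    print(dept, ready, total)
```

The same repository, kept current through the IMAC process, is what preserves the inventory's value after the Y2K project ends.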
Assessment and Remediation Planning
This phase results in an assessment of non-compliance severity, a further assessment and prioritization of the remediation options and development of budgetary support for the options. It also yields a specific remediation methodology, including tested remediation packages and a sequence for deployment.
Note: The development and deployment of remediation packages may be linear, to optimize the use of human resources. The following activities are to be carried out during this phase:
· Assess non-readiness severity utilizing knowledge base, manufacturers’ web sites and end user input. Classify and prioritize issues by application, develop expanded or modified position statements for each element in the infrastructure. This will be done in conjunction with the client and end users. This activity should result in a complete list of supported and unsupported applications as well as position statements, assumptions and constraints.
· Assess the efficacy of individual remediation options: fix, upgrade, replace or retire and add to the position statements.
· Determine whether remote or on-site remediation, or a combination, is required.
· Develop and test remediation packages, develop a budgetary cost to implement, engage a TDS to help in this process, if necessary. This may include developing alternate plans. The plans may include:
· BIOS and operating system remediation. (Flash the BIOS or install new device drivers; it may be necessary to identify and list the different devices for which you will take different actions, based on manufacturer information.)
· List of devices and device types to be replaced.
· Installation of tools to maintain readiness.
· Supported application remediation packages.
· Unsupported application remediation packages.
· Implementation methodology, remote or on-site.
· Sequence of implementation.
· Pilot test.
· Process for tracking remediation, success and failure reports.
· A contingency plan.
· Review and obtain approval of the IT organization for the implementation of remediation plan.
· Develop and produce a detailed statement of work for next phase, if necessary. This should include a specific estimate of resources needed for the next phase.
· Communicate plans and status to end users via selected communications media.
Remediation Implementation
This phase is relatively straightforward but requires excellent management of resources. For instance, it might make sense to suspend IMACs if this hasn’t already been done. This may be a difficult decision for the IT organization to make but the benefit can be significant. However, the potential gains vs. costs should always be carefully reviewed before any option is selected. A TDS may be very actively engaged during this phase.
· Execute pilot test, make adjustments to the plan as indicated by the pilot test.
· Execute remediation plan. This may be a multi-step process.
· Test remediation.
· Document remediation activities and record status in a remediation tracking database.
· Produce readiness reports by department.
· Develop and produce a detailed statement of work for next phase, if necessary. This should include a specific estimate of resources needed for the next phase.
· Communicate plans and status to end users via selected communications media.
Validation and Project Closure
The purpose of this phase is to document, report and test the status of remediation to ensure the greatest probability of ongoing readiness, even as the dynamic nature of the distributed infrastructure makes warranty of ongoing readiness impractical. The most that can be guaranteed is that at the time a device was tested, it was in a state of readiness. This phase may include a subsequent audit to test readiness. The steps in this phase include:
· Set up and maintain monitoring tools if they have been selected for deployment. Obtain client permission to deploy.
· Select sample of end users and develop image gathering schedule for the enterprise. Produce and document procedures to gather the images.
· Gather images and deploy for testing in the lab.
· Test images in the lab.
· Update and maintain repository of information detailing the status of compliance.
· Develop reports
· Tracking status
· Listing open issues
· Risk analysis
· Close open issues through appropriate action and update readiness reports.
· When appropriate, communicate results, status and compliance to the enterprise, partners and end users.
· Provide the client with all relevant documentation to effect closure.
CONCLUSION
It is premature to draw too many conclusions from actual implementations of this methodology, since most IT organizations are in the midst of execution. However, many have responded very positively to this approach. It is clear to them that this structured approach allows IT organizations to make appropriate risk/cost trade-offs, to anticipate the hidden complexities in this area of the Y2K problem and to optimize the peripheral value without over-committing resources.
It is clear, however, that:
· Many IT organizations have underestimated the complexity of distributed infrastructure remediation. This has led to delayed planning of infrastructure projects.
· A structured methodology is a critical success factor. Tools, although important, are secondary.
· IT organizations must engage the distributed end users and make them aware of Y2K activities and positions, in particular, to ensure that proper contingencies are planned. In this respect, the help desk may be very important.
· Many IT organizations recognize the need to address the Y2K issue as a great opportunity to enhance enterprise system management processes and achieve longer-term goals.
· On the other hand, IT organizations must resist the temptation to plan too many process changes within the limited timeframe imposed by Y2K. This must be carefully planned.
· With a sound methodology, infrastructure remediation can be achieved in 12 to 18 weeks, depending on the size of the IT organization.
As of mid-year, it is still not too late for many IT organizations to implement this methodology.