Network Operations Plan
DRAFT
INTRODUCTION
This plan discusses procedures that the Network Operations group uses to provide superior customer service and control costs on (COMPANY NAME) principal networks. This plan has the following objectives:
1. Evaluate the level of customer service the network is providing, and keep Management informed about service trends.
2. Introduce a uniform reporting structure so all regions report and evaluate their service and cost indicators on the same basis.
3. Establish an information clearing house so techniques that prove effective in one region are communicated to other regions that can profit from their use.
4. Establish a process for measuring the quality of service being provided by other companies, principally local exchange carriers (LECs) and common carriers (AT&T).
5. Avoid the need for each region to develop its own procedures.
6. Provide uniform procedures to support the following sub-objectives:
a. Customers can deal with (COMPANY NAME) in the same manner regardless of the service area.
b. Personnel who move from one region to another are trained in the same operating methods.
c. Operational training costs will be reduced.
7. Identify and investigate troubles that are not readily seen by local systems and processes.
8. Ensure a high degree of network security to support (COMPANY NAME) national security and emergency preparedness commitments.
9. Develop a network technical support team, addressing issues such as switches, networking, routing, radio, and standards.
ELEMENTS OF THE OPERATIONS PLAN
The (COMPANY NAME) Network Operations Plan is divided into five sections:
. Fault Management
. Performance Monitoring
. Security Administration
. Information Management
. Configuration Control
In general, the regional field forces are responsible for all maintenance, surveillance, and control of the All Network, with Network Operations responsible for maintenance, surveillance, and control of the principal networks. In addition, Network Operations is responsible for supplying operating practices and methods to the field forces, and for providing an information clearing house to facilitate the flow of technical data and methods to the field.
Most of the elements and procedures required to implement this plan are not in place today, and it is impractical to implement them all at once. The final section in this plan separates the tasks that must be completed into priority levels, and discusses the recommended order for implementing controls and procedures. The following sections discuss in more detail what comprises each element.
Fault Management
A primary role of Network Support is administering the process of locating and clearing trouble on the network. Field forces in the regions are responsible for all fault management activities on the All Network. Network Operations handles all fault management activities for the Network and SS-7 networks except for on-site repair, which is normally assigned to the field forces.
Network Operations is also responsible for developing and issuing procedures that the field forces use for logging troubles and analyzing patterns. All levels of the maintenance forces need a trouble ticketing procedure with a uniform trouble reporting structure that records such variables as these:
. Circuit ID.
. Time of outage.
. Whom the trouble was reported to for each referral.
. When the trouble was reported.
. When the trouble was cleared.
. What caused the outage.
. What circuit element failed.
. What effect the failure had on service.
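The variables listed above could be captured in a single record structure. The following is a minimal sketch only, not a format adopted by this plan; the class names, field names, circuit ID, and timestamps are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Referral:
    """One referral of the trouble to a group or individual."""
    referred_to: str
    referred_at: datetime


@dataclass
class TroubleTicket:
    """Hypothetical uniform trouble record; field names are assumptions."""
    circuit_id: str
    outage_start: datetime            # time of outage
    reported_at: datetime             # when the trouble was reported
    referrals: List[Referral] = field(default_factory=list)
    cleared_at: Optional[datetime] = None
    cause: Optional[str] = None           # what caused the outage
    failed_element: Optional[str] = None  # what circuit element failed
    service_effect: Optional[str] = None  # effect of the failure on service

    def outage_duration(self) -> Optional[timedelta]:
        """Elapsed time from outage to clearance, or None while still open."""
        if self.cleared_at is None:
            return None
        return self.cleared_at - self.outage_start


# Illustrative values only.
ticket = TroubleTicket(
    circuit_id="CKT-0001",
    outage_start=datetime(2024, 1, 15, 9, 0),
    reported_at=datetime(2024, 1, 15, 9, 10),
)
ticket.referrals.append(
    Referral("Regional field forces", datetime(2024, 1, 15, 9, 20))
)
ticket.cleared_at = datetime(2024, 1, 15, 11, 30)
ticket.cause = "Failed line card"
print(ticket.outage_duration())
```

Keeping the referrals as a list lets the record show whom the trouble was reported to at each hand-off, which the reporting structure calls for.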
There are at least three different types of tickets that are maintained:
· A circuit trouble ticket to track clearance of circuit defects. This ticket is made when trouble is detected on a particular circuit. It is closed when the circuit is restored, even if the restoral is accomplished by taking equipment out of service.
· An equipment trouble ticket to track hardware or software faults that take equipment out of service. The equipment ticket is made when a fault is isolated to a particular item of equipment. It is closed after the equipment is repaired, which can be later than when the service is restored.
· A repair ticket for the field forces to use in recording tests made, equipment removed from service, trouble found, work done to clear the trouble, hours spent, and other such indicators of repair force activity.
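The different closing rules for the ticket types can be expressed as a small check. This is a sketch under stated assumptions: the names are hypothetical, and the closing condition for the repair ticket (closed once the repair work has been recorded) is an assumption, since this plan does not state one.

```python
from enum import Enum


class TicketType(Enum):
    CIRCUIT = "circuit"
    EQUIPMENT = "equipment"
    REPAIR = "repair"


def may_close(ticket_type: TicketType,
              service_restored: bool,
              equipment_repaired: bool,
              repair_work_recorded: bool) -> bool:
    """Return True when a ticket of the given type may be closed."""
    if ticket_type is TicketType.CIRCUIT:
        # Closed on restoral, even if equipment was taken out of service.
        return service_restored
    if ticket_type is TicketType.EQUIPMENT:
        # Closed only after the equipment itself is repaired.
        return equipment_repaired
    # Assumption: a repair ticket closes once the work is recorded.
    return repair_work_recorded


# Service restored by patching to a spare; faulty unit not yet repaired.
print(may_close(TicketType.CIRCUIT, True, False, False))
print(may_close(TicketType.EQUIPMENT, True, False, False))
```

In this example the circuit ticket can be closed while the equipment ticket stays open, which is the case the ticket descriptions distinguish.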
To the maximum degree possible, as procedures are developed, they will use a mechanized process that is, ideally, part of a network management system that provides for trouble logging, clearance, analysis, and escalation.
The following is a brief description of the fault management process, and definitions of the steps in controlling network faults.
Trouble Detection: Network troubles are detected through alarms and diagnostic routines of switches and transmission equipment. In some instances customer problem reports also lead to trouble detection, although the objective of the trouble management process is to discover faults before customers become aware of them. At the trouble detection stage, a trouble ticket is opened and carried through the clearance process.
Service Protection: Network maintenance people sometimes must remove equipment from service or take other action to protect the service from further erosion. Any equipment or facilities removed from service are logged on the trouble ticket.
Trouble Notification: This is the process of notifying the appropriate forces of the trouble so the trouble clearance process can begin. The group or individual to whom the trouble was dispatched and the time of notification are entered on the trouble ticket.
Trouble Verification: This is the process of testing, substituting circuits or equipment, and other steps taken to determine what the trouble is, and the likely cause.
Trouble Isolation: This is the process of determining what jurisdiction has the fault, and in what equipment or circuit the trouble is found.
Repair: When the trouble is located, the equipment or circuit is repaired or the trouble is cleared by temporarily patching to spare facilities.
Repair Verification: This is the process of checking to see that the repair activity cleared the reported trouble.
Return to Service: After the repair is verified, the circuit or equipment is returned to service.
Escalation: Escalation procedures are necessary to trigger certain events when a service-affecting fault has not been cleared after a set amount of time. These procedures also discuss the process for obtaining technical assistance from the manufacturer or other sources.
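The steps above can be sketched as an ordered sequence with a time-based escalation check. This is an illustration only: the strictly linear ordering is a simplification (service protection, for example, is not needed for every trouble), the four-hour threshold is an assumed placeholder for the "set amount of time", and all names are hypothetical.

```python
from datetime import datetime, timedelta
from enum import IntEnum
from typing import Optional


class FaultStep(IntEnum):
    TROUBLE_DETECTION = 1
    SERVICE_PROTECTION = 2
    TROUBLE_NOTIFICATION = 3
    TROUBLE_VERIFICATION = 4
    TROUBLE_ISOLATION = 5
    REPAIR = 6
    REPAIR_VERIFICATION = 7
    RETURN_TO_SERVICE = 8


# Assumed value; the actual interval would be set by the escalation procedure.
ESCALATION_THRESHOLD = timedelta(hours=4)


def next_step(step: FaultStep) -> Optional[FaultStep]:
    """Return the step that follows, or None after return to service."""
    if step is FaultStep.RETURN_TO_SERVICE:
        return None
    return FaultStep(step + 1)


def needs_escalation(opened_at: datetime,
                     now: datetime,
                     service_affecting: bool,
                     cleared: bool,
                     threshold: timedelta = ESCALATION_THRESHOLD) -> bool:
    """True when a service-affecting fault is still open past the threshold."""
    return service_affecting and not cleared and (now - opened_at) >= threshold


opened = datetime(2024, 1, 15, 9, 0)  # illustrative timestamp
print(next_step(FaultStep.REPAIR).name)
print(needs_escalation(opened, opened + timedelta(hours=5), True, False))
```

Escalation is kept outside the step sequence because it can be triggered at any point before the trouble is cleared.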
Performance Monitoring
An effective network management system includes a process for measuring service levels. Service evaluation on the All Network is the responsibility of the regions. If in the future an interregional network with backbone circuits between switches is established, the national office may become involved with network performance monitoring.
Such factors as these will be detected and controlled:
. General network overloads--caused by a change in traffic patterns.
. Focused overloads--caused by a local situation such as a call-in program, storm, etc.
. Switching system overloads.
. Trunk group overloads.
The two principal internal data networks require performance monitoring to ensure that they are delivering the necessary level of service. Service levels are particularly important on the SS-7 network, where call setup can be adversely affected by an overloaded network.
The primary role of Network Operations is to establish the methods and procedures under which network performance is monitored. The goal of this process is to collect as much information as possible from switching systems, multiplexers, and other devices, reducing the amount of field effort required to monitor service.
Internal Service Monitoring
These indicators are tracked internally to determine the service level the network is providing to users inside the company.
Host-to-terminal response time: In the Network, this measurement determines how effectively the network supports on-line users. The measurement does not determine the effect of the host computer processing time. The Network Operations Center establishes response time objectives and provides service-level reports for management and user organizations.
Network availability: Availability is defined as the percentage of up-time on the network. Network Operations establishes availability objectives, and determines actual availability from the data retained in the switching systems.
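Availability as a percentage of up-time can be computed directly from the measurement period and the accumulated outage time. The sketch below assumes both are expressed in minutes; the function name and the sample figures are illustrative and are not taken from any switching system report.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the measurement period during which the network was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


# Illustrative figures: a 30-day period with 432 minutes of outage.
period = 30 * 24 * 60
print(availability_percent(period, 432))  # 99.0
```

An availability objective can then be checked by comparing the result against the target percentage for the same period.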
Customer Service Monitoring
Inter-machine trunk/network congestion detection and control: The regions are responsible for controlling service on the All Network. Blockage reports are available from the Ericsson switching systems, and generally are the source of information for measuring this variable.
Abnormal condition detection and reporting: A plan is needed for reporting abnormal conditions. This plan defines what constitutes an abnormal condition and includes an escalation procedure for reporting to higher-level management and for technical assistance.
Trouble pattern analysis: The objective of the Network Operations Plan is to detect troubles and correct them before customers become aware that there is a problem. Occasionally, however, customers will report trouble that has not been detected by other means. These reports, which are taken by the Customer Care organizations, are analyzed for patterns that could indicate a technical problem.
Vendor Service Evaluation
This portion of the Network Operations Plan reviews the service being provided by major vendors. Service levels are assessed primarily by reviewing information on the trouble tickets. Tickets are analyzed to determine such service factors as these:
. Frequency of troubles
. Repeated troubles
. Cause of trouble
. Length of clearing time
With the information from this process, regional management can take corrective action with vendors.
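The service factors listed above can be derived from closed trouble tickets. The following sketch assumes a simplified ticket record and treats a "repeated trouble" as more than one ticket on the same circuit; that definition, the vendor name, and the sample data are illustrative assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class ClosedTicket:
    """Hypothetical closed-ticket record; field names are assumptions."""
    vendor: str
    circuit_id: str
    cause: str
    reported_at: datetime
    cleared_at: datetime


def vendor_summary(tickets: List[ClosedTicket], vendor: str) -> Dict[str, object]:
    """Summarize trouble frequency, repeats, causes, and clearing time."""
    mine = [t for t in tickets if t.vendor == vendor]
    per_circuit = Counter(t.circuit_id for t in mine)
    # Assumption: a circuit with more than one ticket counts as repeated.
    repeated = sorted(c for c, n in per_circuit.items() if n > 1)
    total_clear = sum((t.cleared_at - t.reported_at for t in mine), timedelta())
    return {
        "trouble_count": len(mine),
        "repeated_circuits": repeated,
        "causes": dict(Counter(t.cause for t in mine)),
        "mean_clearing_time": total_clear / len(mine) if mine else None,
    }


start = datetime(2024, 1, 15, 8, 0)  # illustrative data only
sample = [
    ClosedTicket("LEC-A", "CKT-1", "cable cut", start, start + timedelta(hours=1)),
    ClosedTicket("LEC-A", "CKT-1", "cable cut", start, start + timedelta(hours=2)),
    ClosedTicket("LEC-A", "CKT-2", "card failure", start, start + timedelta(hours=3)),
    ClosedTicket("LEC-B", "CKT-9", "power", start, start + timedelta(hours=8)),
]
summary = vendor_summary(sample, "LEC-A")
print(summary["trouble_count"], summary["repeated_circuits"])
print(summary["mean_clearing_time"])
```

A summary of this kind, produced per vendor and per reporting period, gives regional management a consistent basis for discussions with vendors.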
Security and Disaster Recovery Administration
Security plans will be developed for all networks. For the SS-7 and Network networks, security provisions will be administered by Network Operations. For the All Network, Network Operations will provide procedures for the field forces to use in security administration.
The security plan has three purposes. First, it is to preserve service by preventing unauthorized persons from accessing the network remotely and taking equipment out of service or otherwise affecting service. Second, it is to protect confidential information such as customer databases and routing information from unauthorized access. Third, it will establish uniform procedures for securing the company's physical plant from unauthorized access.
Initially, the security plan will have three elements:
. Network password security control
. Building security
. Records security
Disaster recovery is the process for restoring service quickly in case of some form of natural or man-made disaster. The plan will discuss methods of enabling the network to survive disasters through such provisions as alternate route protection. It will also discuss ways of restoring service quickly for each type of disaster. The plan will include methods for protecting critical records, such as procedures for storing and backing up generic programs and recent change information.
The regional offices will be responsible for preparing specific site-by-site plans, using policies and procedures developed by Network Operations.
Information Flow
Network Operations will be the focal point for information flow from the national office to the field. Initially, information will flow informally. As practice numbering and documentation standards are developed, more formal practices will be developed.
Network Operational Practices
The procedures discussed in this plan will be documented in (COMPANY NAME) practices. The practices will be numbered for indexing and ease of retrieval. To ensure that all field locations have input into practices, drafts will be sent to the field for comment before they are issued. Field forces can request that practices be developed, or they may propose a local practice for adoption. As the need for a practice is identified, the Network Operations Center will send a letter to the field indicating that the practice is planned and inviting comments and suggestions. After the practice has been prepared, it will be sent in draft form to the field for comment, more than once if necessary. After all comments are in, it will be issued.
Network Advisories
A second, less formal class of practices will be called Network Advisories. These will be used to communicate suggestions for improving operations. Anyone in the field can initiate an advisory. Headquarters will put it in the appropriate format, publish it, distribute it, and keep it current. This series of practices will cover such things as information on new releases and new features of the network that we want to sell.
Network Maintenance Information
Network Operations presently maintains and will continue to maintain information that enables the Network Support Center to cover the regional offices outside working hours. This information includes callout lists and lists of critical subsystems.
Configuration Control
Configuration control is the process whereby all locations develop and maintain records of inventories, options, switch settings, and wiring. A complete record of the configuration in each equipment location is essential for troubleshooting, maintenance, and restoration in case of failure. Records are maintained of all circuits and assignments provided over company-owned and leased facilities.
Service Ordering
Each (COMPANY NAME) region is responsible for ordering its own services from LECs using local procedures. In the future, standardized ordering procedures may be needed to maintain centralized records of vendor services, but for the present, this plan assumes that each region will retain complete control of the service ordering process. It is important, however, that service ordering procedures be established regionally to update service records, which are discussed later.
Records Maintenance
A uniform record-keeping process is essential for all equipment locations. All sites will evolve toward a common records base, with standards set for the following:
. Types of records to be maintained
. Uniform drawing numbering and filing plan
. Uniform drawing and records formats
The purpose of the records standards is to ensure that all locations retain records that are required to maintain service. As records are converted in the future to images stored in a computerized database, the standards will facilitate the conversion process.
The following discusses specific types of records that are or will be maintained.
System interconnection diagrams: These records show in block diagram form how all the equipment in a location is cabled together and how it is cabled to alarm and maintenance panels. These records are created by Engineering or vendors when equipment is installed or modified.
Software options: All equipment that operates from stored program control, such as switches, has software options that are set when the system is initialized or modified. These options are established by Engineering and retained until the generic program is updated.
Station wiring lists: Each equipment location has or will have a wiring list to retain records of equipment location (bay or cabinet), hardware switch and option settings, vintage, and installation date. Equipment such as multiplexers, modems, and CSUs has hardware options that are set by straps or DIP switches. Records are created by Engineering at the time of installation or modification. Any option changes must be authorized by Engineering. Changes made in the field to correct trouble conditions must be reported to Engineering and followed with an authorizing order to ensure that the records are kept current.
Service provider records: Records are kept of all circuits provided by other companies such as LECs. These records list the circuit number, what services the circuits carry, service provider name, trouble reporting number, circuit type, installation date, and terminating points.
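The service provider record described above maps naturally onto a fixed record layout keyed by circuit number. The sketch below is one possible layout, not a prescribed format; the field names, the sample provider, the telephone number, and the locations are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ServiceProviderCircuit:
    """Hypothetical record for a circuit provided by another company."""
    circuit_number: str
    services_carried: Tuple[str, ...]
    provider_name: str
    trouble_reporting_number: str
    circuit_type: str
    installation_date: date
    terminating_points: Tuple[str, str]


def index_by_circuit(
    records: Iterable[ServiceProviderCircuit],
) -> Dict[str, ServiceProviderCircuit]:
    """Build a lookup so a trouble can be referred using only the circuit number."""
    return {r.circuit_number: r for r in records}


# Illustrative entry only.
records = index_by_circuit([
    ServiceProviderCircuit(
        circuit_number="CKT-0001",
        services_carried=("SS-7 link",),
        provider_name="Example LEC",
        trouble_reporting_number="555-0100",
        circuit_type="DS1",
        installation_date=date(2024, 1, 15),
        terminating_points=("Site A", "Site B"),
    ),
])
print(records["CKT-0001"].trouble_reporting_number)
```

Indexing by circuit number reflects how the record is used during fault management: the circuit ID on the trouble ticket leads directly to the provider and its trouble reporting number.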
Network Expansion Planning Support
As demand grows, the networks expand, and configurations change, Network Operations will provide input to the Engineering planning process. This input will ensure that equipment and services planned by Engineering are compatible with test equipment, methods, and procedures used in the field.
Network Service Assignments
As services are added, there will often be more than one network to which they can be assigned. The Network Operations Center will determine, in consultation with MIS and Engineering, which network will support the service.
PRIORITIES
The following is a list of major tasks and projects that the Network Operations Center must complete to implement this operations plan.
This list is divided into three categories:
Immediate, Medium Range (one to two years), and Long Range (more than two years)
Immediate
· Develop trouble logging, escalation, and analysis procedures.
· Develop a common circuit trouble ticket.
· Develop a common equipment trouble ticket.
· Develop a repair ticket for the field forces.
· Develop procedures for assigning services to the Network and SS-7 networks.
· Develop procedures for disseminating informal information to the field.
· Develop a process for extracting service indicators remotely from switching systems and compiling them into service reports.
Medium Range
· Develop procedures for integrating customer reports with reports of circuit and equipment outages.
· Develop a mechanized ticket logging process as part of a network management system.
· Develop performance objectives in terms of terminal response time and network availability for the networks.
· Develop performance monitoring and service evaluation procedures for the networks.
· Develop and refine procedures for reporting abnormal network conditions.
· Develop information flow procedures for disseminating formal information to the field.
· Develop record-keeping procedures for vendor services in the networks.
· Develop procedures for evaluating services from local exchange companies and other vendors.
· Develop a reporting process for keeping management informed of network service and cost trends.
· Develop guidelines for the field on what service and equipment records should be kept and the method for retaining them.
Long Range
· Develop a network security plan for all networks.
· Develop a process for including Network Services' requirements in the network planning process.
· Develop a disaster recovery plan for all networks.
· Develop common record-keeping procedures for equipment configurations for All Networks.
· Develop trouble pattern analysis procedures.