IT Operation Management - Overview

Outline

IT service management (ITSM) is a methodology that helps companies plan, develop, implement, and operate IT systems effectively. ITSM originated in ITIL (the IT Infrastructure Library), a set of IT service management standards developed in the 1980s by the CCTA (the UK's Central Computer and Telecommunications Agency). ITIL summarizes British practice in IT management and has become a de facto norm, giving corporate IT departments a standard set of methods that covers planning, development, implementation, and operation and maintenance.

Expert research and practice at a large number of companies show that operation and maintenance accounts for about 80% of the life cycle of an IT project, yet investment at this stage is only 20% of total IT investment. The result is a typical pattern of "high technology consumption" and "emphasizing technology while neglecting service".

 

ISO 20000, the "IT service management system standard", is a standard for an organization's IT service management. It provides a model for establishing, implementing, operating, monitoring, reviewing, maintaining, and improving an IT service management system (ITSMS).

ISO 20000 focuses on managing IT problems through "IT service standardization": it classifies IT problems, identifies the intrinsic links between them, and then plans, implements, and monitors against service level agreements, with an emphasis on communication with customers. The standard also addresses system capacity, the level of management required when the system changes, budgeting, and software control and distribution.

The principles and methods of ISO 20000 are as follows:

First, the integrated process approach

Process: a set of interrelated or interacting activities that transforms inputs into outputs.

The integrated process approach identifies, defines, and manages the processes used within the organization as a system, paying particular attention to the interfaces and interactions between processes, so that they form a coordinated set. For example, a service event triggers the incident management process, which may in turn lead to the problem management and change management processes.

Second, the PDCA (Plan-Do-Check-Act) method of quality management

 

 

 

1. Overview of term definitions

 

     

1.1 Service catalog: written information about the services provided to customers

1.2 Service Level Agreement (SLA)

Service Level Management: a service management process that consists of planning, negotiating, and signing service level agreements (SLAs) and, after signing, monitoring, reporting on, and reviewing the quality of service delivered under them.

Service Level Agreement (SLA): an agreement between a service provider and its customer on the key service targets and the responsibilities of both parties during service provision and support.

Operational Level Agreement (OLA): an agreement between the IT service provider and a specific internal function or role involved in IT service delivery and support, covering a specific IT service.

Support contract (underpinning contract): an agreement between the IT service provider and an external third-party supplier, covering the service and support the supplier provides for a specific service.

Service levels fall into several categories, such as service availability period, service response time, and service completion time.

 

Service Level Agreement
 

Party A:

Party B:
 

This agreement covers the provision and support of the XYZ service (brief service description).

This agreement is valid for 12 months, from _ (year) _ (month) _ (day) to _ (year) _ (month) _ (day). It will be reviewed annually; any changes must be recorded in the attached schedule and confirmed by the signatures of both parties.
 

 

Service level definition: XYZ end-to-end application response time

Effective time: 24 hours a day, Sunday through Saturday, excluding national holidays and user-defined holidays.

Service level indicator: failed responses are less than 1% of the total number of connections.
 

Response time: within 5 seconds.
 

Communication:

(1) Service contact point: support.xxxx.com or 800-XXX

(2) Response time: Party A commits to calling the customer back within 5 minutes of receiving a fault report.

(3) Escalation: if the SLA has been breached for 30 minutes, notify Party B's project manager; if the SLA failure has lasted 4 hours and no recovery method has been found, notify the regional managers of both Party A and Party B.

(4) Escalation management: Party A provides Party B's manager with a monthly SLA failure report. Both parties provide their regional managers with a quarterly report on losses caused by SLA failures.

Responsibilities agreed by both parties: the XYZ application servers are owned by Party A and located in Party A's data center. Party A shall provide Party B with the necessary system access, and Party B shall comply with Party A's security rules.
 

Calculation: response time <= 5 seconds. Response time is the time from the submission of the initial query until all characters are displayed on the user's screen.
 

Measurement interval and reporting cycle: measurements are polled every 1 minute. Reports are produced weekly (cumulative data).
 

Data source: measurements are collected by the XX automation tool. Each measurement point records the response time value together with a date and time stamp.
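The targets in the sample agreement (failed responses under 1% of connections, response time of at most 5 seconds, one-minute polling, weekly cumulative reporting) could be checked along the following lines. This is only a minimal sketch: the `Sample` record layout and the `weekly_report` function are hypothetical, and the sample agreement does not say how the response-time target is aggregated, so the sketch assumes that every successful measurement must be within the limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_RESPONSE_SECONDS = 5.0  # from the sample SLA: response time <= 5 seconds
MAX_ERROR_RATE = 0.01       # from the sample SLA: errors below 1% of connections


@dataclass
class Sample:
    """One polled measurement (hypothetical record layout)."""
    timestamp: datetime
    response_seconds: float
    ok: bool  # False when the request returned an error


def weekly_report(samples: list[Sample]) -> dict:
    """Summarise one week's cumulative samples against the SLA targets."""
    total = len(samples)
    if total == 0:
        raise ValueError("no samples to report on")
    errors = sum(1 for s in samples if not s.ok)
    # Assumption: a successful response slower than the limit counts as a miss.
    slow = sum(1 for s in samples
               if s.ok and s.response_seconds > MAX_RESPONSE_SECONDS)
    error_rate = errors / total
    return {
        "samples": total,
        "error_rate": error_rate,
        "slow_responses": slow,
        "error_target_met": error_rate < MAX_ERROR_RATE,
        "response_target_met": slow == 0,
    }


# Made-up data: 200 one-minute polls with one failed and one slow response.
start = datetime(2024, 1, 1)
samples = [Sample(start + timedelta(minutes=i), 1.2, True) for i in range(200)]
samples[10] = Sample(samples[10].timestamp, 1.2, False)
samples[20] = Sample(samples[20].timestamp, 6.5, True)
print(weekly_report(samples))
```

A real agreement would need to state whether the response-time target applies to every request, to an average, or to a percentile; the check above would change accordingly.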

1.3 Service Availability

The ability of a service to perform its required function at a predetermined time or over a predetermined period.

Availability % = (AST - DT) / AST * 100, where AST is the agreed service time and DT is the time the service was out of service during that period.
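As a worked illustration of the availability formula, the sketch below computes the percentage from an agreed service time and the downtime within it. The function name and the example figures (a 30-day month of round-the-clock service with 3 hours of downtime) are made up for illustration.

```python
def availability_percent(agreed_service_time: float, downtime: float) -> float:
    """Return availability as a percentage of the agreed service time (AST).

    Both arguments must use the same unit (for example hours).
    """
    if agreed_service_time <= 0:
        raise ValueError("agreed service time must be positive")
    if not 0 <= downtime <= agreed_service_time:
        raise ValueError("downtime must be between 0 and the agreed service time")
    return (agreed_service_time - downtime) / agreed_service_time * 100


# Hypothetical figures: 720 agreed hours in the month, 3 hours out of service.
print(f"{availability_percent(720, 3):.2f}%")
```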

 

Two nines (99%): (1 - 99%) * 365 = 3.65 days, meaning that over one year of continuous operation, service interruptions may total at most 3.65 days.

Three nines (99.9%): (1 - 99.9%) * 365 * 24 = 8.76 hours, meaning that over one year of continuous operation, service interruptions may total at most 8.76 hours.

Four nines (99.99%): (1 - 99.99%) * 365 * 24 = 0.876 hours = 52.6 minutes, meaning that over one year of continuous operation, service interruptions may total at most 52.6 minutes.

Five nines (99.999%): (1 - 99.999%) * 365 * 24 * 60 = 5.26 minutes, meaning that over one year of continuous operation, service interruptions may total at most 5.26 minutes.
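The downtime figures above come from multiplying the unavailable fraction (1 minus the availability target) by the length of a year. A short sketch that reproduces the arithmetic, assuming a 365-day year of continuous operation as the text does (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year of continuous operation


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in hours, for an availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(target)
    print(f"{target}%: {hours / 24:.2f} days = {hours:.3f} hours "
          f"= {hours * 60:.2f} minutes")
```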

 

1.4 Service Continuity Plan

 

Provide IT services with as little interruption to the customer's service as possible, and recover in a controlled manner when IT systems have problems.

IT Service Continuity Management: responsible for disaster prevention and for strengthening the resilience and fault tolerance of the IT infrastructure. The organization must ensure that adequate technical, financial, and management resources are available after a disaster so that IT services can continue to operate.

Running one or more business continuity (BC) plan exercises each year is a key component of a business continuity management system (BCMS). The exercise program should include refresher training for emergency response teams, policy reviews and audits, business impact analysis (BIA), risk assessment (RA), BCMS awareness programs, and other activities.

1.5 Availability, Continuity, and SLA

Before signing an SLA, first clarify the customer's availability and continuity requirements. Once the availability and continuity targets in the SLA have been set, service continuity and availability management should ensure that these targets are met and report actual performance against them.

The ultimate goal of service continuity management and availability management is the same: to keep services continuously available without interruption. Although ISO 20000 places service continuity and service availability in the same process under clause 6.3, in an organization's operating practice the two still differ, each with its own focus.

Service availability runs through the entire IT service management process. When the customer states its service availability requirements, the IT service provider evaluates the resources and infrastructure needed to meet them and determines the resources and costs for the customer to choose from and confirm. The IT service provider then develops availability plans and recovery plans based on these requirements, with the aim of restoring the business or service to its normal state in the shortest possible time when an IT service interruption occurs.

Service continuity management first requires a business impact analysis of IT services to clarify the scope that continuity management must focus on. It then requires IT service risk analysis and risk management, which identify organizational weaknesses and potential threats in the delivery of IT services, and the formulation of a service continuity strategy that reduces risk to the minimum acceptable level at the lowest cost. After the IT service continuity plan has been implemented, the strategy needs regular, ongoing risk assessment and testing. Continuity management and change management are closely linked: whenever business requirements, technical parameters, or infrastructure change, the continuity plan must be reviewed and adjusted promptly.

 

Availability concerns overall availability across the whole year, for example 99.9%, which corresponds to 8.76 hours of downtime. Continuity management should maintain a close and continuous link with change management: whenever business requirements, technical data, or the technical architecture change, the continuity plan should be reviewed and adjusted promptly to ensure that it really works.

 

2. The importance of knowledge management

As the saying goes, before the troops and horses move, the provisions go first: a good system or project must be supported by plenty of documentation (knowledge). For example, before a system is built, the requirements documents, system design documents, and implementation documents must be prepared properly. The system is then built according to these early design and implementation documents, and during construction a summary of system-related issues is produced and the implementation documents are updated. After construction is complete, user manuals and operation and maintenance manuals are written according to the system's service capabilities and its users.

Some companies do not require the relevant documentation during delivery. Once the system goes live, problems emerge one after another, leaving operation and maintenance staff running around in circles without knowing where to start; they often take many detours and miss the best opportunity to fix the problem. Documentation also comes in many types, such as configuration documents, implementation documents, design documents, system specification documents, and project management documents. Because of this variety, operation and maintenance staff must be able to write and organize documents. At the same time, they must work strictly according to the existing documents, communicate promptly when problems arise, and update the documents with the corrections.

 

 

Origin blog.csdn.net/syjhct/article/details/100676727