Risk perception in hospital operation and maintenance scenarios

With the development of medical informatization, hospital systems and equipment keep accumulating. While this improves the user experience and makes medical services more efficient, it also poses major challenges for the information department that keeps those systems running stably. The pain points include many departments, numerous application scenarios, a heavy terminal operation and maintenance workload, and strict software compatibility requirements, and hospitals have extremely high requirements for the stability and continuity of systems and technical equipment. How can operation and maintenance improve the efficiency of overall security management and keep the IT environment healthy and stable? How can faults be discovered quickly, located accurately, and their losses avoided? These questions are the core of the hospital operation and maintenance department's work.

The goal of operation and maintenance centers on the security and stability of systems and services. Stability and availability come first. Next come checking for hidden dangers and improving performance and resource efficiency.

In hospital practice, we rely on two means, active monitoring and automatic inspection, to perceive risks in advance and prevent problems before they happen.

1. Active monitoring

There is a mantra in the operations world: you can't protect what you can't see.

Monitoring is the main means of discovering potential risks and abnormalities. The monitored objects include hardware, software, application systems, and so on. Indicator data is collected and analyzed around the clock (7×24) so that abnormalities are discovered in time and responded to quickly.

We build a comprehensive, accurate, and responsive monitoring system based on hospital business scenarios. By sorting out business processes and associating business data with IT data, we collect and manage the data, analyze its trends, and respond to them. Visual displays give an intuitive view of the system's operating status, and a complete alarm mechanism and fault management process ensure that problems are discovered and handled in a timely manner.

1. Data collection & processing

There are three types of data: metrics, traces, and logs. The business system is monitored and managed from the perspective of user experience: by monitoring the status of users' business operations, we obtain data on the performance and availability of the business system.
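As a rough illustration of deriving performance and availability figures from observed user operations, the sketch below summarizes a batch of samples per business operation. The `OperationSample` record, its field names, and the operation names are assumptions made for this example, not part of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass
class OperationSample:
    operation: str      # hypothetical business operation name, e.g. "register_patient"
    duration_ms: float  # observed response time of the operation
    succeeded: bool     # whether the operation completed successfully


def summarize(samples):
    """Return availability (success ratio) and mean duration of successful runs per operation."""
    grouped = {}
    for sample in samples:
        grouped.setdefault(sample.operation, []).append(sample)

    summary = {}
    for operation, items in grouped.items():
        ok = [s for s in items if s.succeeded]
        summary[operation] = {
            "availability": len(ok) / len(items),
            # No successful run means there is no meaningful response time to report.
            "mean_ms": sum(s.duration_ms for s in ok) / len(ok) if ok else None,
        }
    return summary
```

In practice the samples would come from the collected metrics, traces, or logs; here they are simply passed in as a list.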

2. Alarm management

Alarm management includes alarm suppression, aggregation, and shielding to avoid false positives, false negatives, and repeated alarms.

Performance alarm thresholds are set for the monitoring items of each monitored object. When a performance value exceeds its threshold, alarm information is generated and sent to the unified monitoring platform for centralized processing and analysis. The platform can run correlation analysis on related alarms and analyze the impact on the business system to determine the root cause of the alarm.
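The threshold, suppression, and aggregation steps described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the metric names, threshold values, and the five-minute suppression window are hypothetical, and root cause analysis is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    target: str       # monitored object, e.g. a host or a database
    metric: str       # monitoring item, e.g. "cpu_percent"
    value: float
    timestamp: float  # seconds


# Hypothetical thresholds per monitoring item.
THRESHOLDS = {"cpu_percent": 90.0, "disk_used_percent": 85.0}


def evaluate(samples, thresholds=THRESHOLDS):
    """Generate an alarm for every sample whose value exceeds its threshold."""
    return [
        Alarm(s["target"], s["metric"], s["value"], s["timestamp"])
        for s in samples
        if s["metric"] in thresholds and s["value"] > thresholds[s["metric"]]
    ]


def suppress_repeats(alarms, window_seconds=300.0):
    """Drop an alarm if the same (target, metric) already fired within the window."""
    last_sent = {}
    kept = []
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        key = (alarm.target, alarm.metric)
        previous = last_sent.get(key)
        if previous is None or alarm.timestamp - previous >= window_seconds:
            kept.append(alarm)
            last_sent[key] = alarm.timestamp
    return kept


def aggregate_by_target(alarms):
    """Group alarm metrics by monitored object so related alarms are handled together."""
    grouped = {}
    for alarm in alarms:
        grouped.setdefault(alarm.target, []).append(alarm.metric)
    return grouped
```

Suppressing by `(target, metric)` keeps a flapping indicator from flooding the platform, while grouping by target is one simple way to present related alarms together before any correlation analysis.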

3. Visual display

7×24 business monitoring and management, with custom large-screen displays for business monitoring.

The display covers the performance status of each component of the business application, such as the network, the infrastructure, the database, and the middleware, as well as business response. In the event of a business application failure, the correlation between business applications and infrastructure components makes it possible to display and analyze the problem quickly.

Monitoring of the computer room's power and environment, IT infrastructure, Internet of Things equipment, and security is integrated and centralized. Through the large visual screen in the management center, managers can follow the operating status of the system in real time through graphics and dynamic views, which enables unified and efficient management, performance evaluation, and the accumulation of operation and maintenance knowledge. With a business-centric management process, business personnel and operation and maintenance personnel can work together more closely and improve efficiency, further improving the availability of business applications.

2. Automatic inspection

Inspection is an active assessment of IT operation risks that aims to find as many hidden dangers as possible and keep equipment running stably. Targeted early warnings and suggested solutions are also provided to minimize the risk of system operation.

Automatic inspection, as the name suggests, can be run immediately as a real-time task or scheduled as a periodic task. The inspection results can be exported to Word for archiving, and engineers can add suggestions, risk warnings, and so on in the form. Automatic inspection effectively reduces engineers' daily workload, finds problems in time, and helps meet compliance requirements.
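A minimal sketch of an inspection run might look like the following. The check function, the report layout, and the 85% disk limit are assumptions for illustration; the report is rendered as plain text here, and the Word export mentioned above is not reproduced.

```python
import shutil
from datetime import datetime, timezone


def check_disk(path="/", max_used_percent=85.0):
    """Example inspection item: flag the disk when usage exceeds the (hypothetical) limit."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    return used_percent <= max_used_percent, f"{used_percent:.1f}% used on {path}"


def run_inspection(checks):
    """Run every check; a check that raises is reported as a risk instead of aborting the run."""
    results = []
    for name, check in checks.items():
        try:
            ok, detail = check()
        except Exception as exc:
            ok, detail = False, f"check failed to run: {exc}"
        results.append({"item": name, "ok": ok, "detail": detail})
    return results


def render_report(results, engineer_notes=""):
    """Render the results as a plain-text report, with optional notes added by an engineer."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [f"Inspection report {stamp}"]
    for result in results:
        status = "OK" if result["ok"] else "RISK"
        lines.append(f"[{status}] {result['item']}: {result['detail']}")
    if engineer_notes:
        lines.append(f"Engineer notes: {engineer_notes}")
    return "\n".join(lines)
```

Calling `render_report(run_inspection({"disk": check_disk}))` gives a one-off inspection; a periodic inspection could invoke the same call from whatever scheduler is already in use.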

3. Risk perception

Perception, decision-making, and execution are the three elements of operation and maintenance. Improving the risk perception scenario is the key to ensuring O&M security and improving O&M efficiency. The main points of the risk perception scenario are:

1. Online perception of risk status, real-time health quality inspection;

2. Data indicators + algorithm strategy;

3. Establish a closed loop of risk perception, decision-making, and execution;

4. Expert collaboration, online management.

Establish a regular working mechanism for operational risk assessment that coordinates people, events, and tools so that work in each scenario proceeds smoothly. A digital, risk-aware collaboration mechanism is built through a full-stack monitoring system, fast response to abnormalities, efficient AI machine learning algorithms, and online expert collaboration.

1

-- Establish a comprehensive and agile monitoring system

Integrate assets into the monitoring system to monitor the status and performance of each resource node in real time. The aim is no missed alarms, few false positives, and fast response: risk is approached from the monitoring perspective and discovered in real time, across many kinds of risk. This copes efficiently with the hospital's large-scale infrastructure, including network equipment, servers, storage, and applications.

2

-- Quickly discover and locate abnormalities, and respond in a timely manner

Monitor the health of the system, and use system views to display the operating status of each asset, the business topology diagram, the alarm list and trends, and so on. For alarm management, simulate customer behavior to discover risks before customers do, focusing on risk discovery at the business level. Helping engineers diagnose faults quickly is not only "fire fighting"; it also makes it possible to perceive risks in advance and prevent problems before they happen.

3

-- AI machine learning algorithms: accurate and timely

Supports scenarios such as precise alarms, anomaly detection, root cause location, and capacity analysis.

Intelligent anomaly alarms, alarm confirmation based on dynamic thresholds, anomaly detection across massive time-series indicators, and rapid response to faults: problems are found and solutions are provided as well.
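One simple way to illustrate a dynamic threshold is a rolling band around recent values: a point is flagged when it deviates from the mean of the preceding window by more than a multiple of that window's standard deviation. This is only a sketch of the general idea, not the platform's actual algorithm, and the `window`, `k`, and `min_band` parameters are assumptions.

```python
from statistics import mean, stdev


def dynamic_threshold_anomalies(values, window=5, k=3.0, min_band=1.0):
    """Return indices of points lying outside mean +/- band of the preceding `window` points.

    The band is k standard deviations of the window, with `min_band` as a floor so
    that a perfectly flat history does not flag every tiny fluctuation.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")

    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = mean(history)
        band = max(k * stdev(history), min_band)
        if abs(values[i] - center) > band:
            anomalies.append(i)
    return anomalies
```

For example, `dynamic_threshold_anomalies([10, 11, 10, 12, 11, 10, 11, 50, 11, 10])` returns `[7]`, the index of the spike. Because the threshold follows the recent history, the same code adapts to indicators with very different baselines, which a fixed threshold cannot do.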

4

-- Expert collaborative online management

Provide 7×24 online duty, staffed by MOC experts and second-line expert teams, to improve incident response and handling efficiency and greatly reduce labor costs and expert technology costs.

The platform's digital thinking of "connection, data, and empowerment" reshapes the scenario, organizes it around the three elements of "people, events, and tools", and enables efficient collaboration to improve risk perception. Behind the high efficiency is strong technical support. What the LinkSLA intelligent operation and maintenance housekeeper delivers is not only a platform but also a sustainable, continuously improving operation and maintenance model that adds value for users, improves operation and maintenance efficiency, and reduces operating costs.

Origin blog.csdn.net/LinkSLA/article/details/130709209