Basic server troubleshooting methods

1. Power-on fault

  • Definition:
    A fault that occurs between power-on (or reset) and completion of the power-on self-test (POST).
  • Possible fault symptoms

    1. The main unit cannot be powered on (for example, the power supply fan does not turn, or turns once and then stops), powers on only intermittently, trips the breaker at power-on, or has live metal parts on the chassis;
    4. Repeated restarts;
    5. Unable to enter the BIOS; crashes or errors after flashing the BIOS; CMOS power loss; inaccurate clock;
    6. Excessive machine noise, automatic (scheduled) power-on, power supply equipment problems, and other faults.


  • Components that may be involved
    Mains environment; power supply, motherboard, CPU, memory, display card, and any other boards present; BIOS settings (which can be restored to the factory state by discharging the CMOS); the power switch and its cable, and the reset button and its cable, which may themselves be faulty.

  • Judgment points/sequence
    The following text supplements and explains the maintenance judgment process and should be read together with the flow chart. This chapter analyzes only power-on faults; if the diagnosis points to another type of fault, switch to the judgment process for that fault. The same applies to the categories below.
    1. Preparation before maintenance
      1) POST card;
      2) Multimeter;
      3) Test pen (voltage tester);
      4) CPU dummy load.

    2. Environmental inspection
      1) Check the computer equipment:
      A. Whether there is any deformation, discoloration, or unusual smell inside or outside the peripherals and the computer;
      B. The temperature and humidity of the environment;
      C. After power-on, whether any components or other equipment show deformation, discoloration, odor, or abnormal temperature.
      2) Check the mains condition:
      A. Whether the mains voltage is within 220 V ±10% and whether it is stable (that is, whether there are frequent or momentary power failures);
      B. Whether the mains wiring is correct: neutral on the left and live on the right; the neutral wire must not be used as the ground wire (the symptom is a short between neutral and ground); and the neutral wire must not be left floating or loosely connected;
      C. Whether a leakage protector is installed on the supply line (it must be on the live wire), whether there is a ground wire, and so on;
      D. Whether one end of the host power cord is firmly and fully inserted into the mains socket, and whether the other end seats properly in the host power supply, neither too loose nor impossible to insert fully.
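
The 220 V ±10% window in step 2) A works out to roughly 198 to 242 V. As a minimal sketch, assuming the readings are taken by hand with the multimeter and entered as plain numbers, the check could be written in Python as follows. The names `mains_in_range` and `out_of_range` are illustrative and are not part of the original procedure.

```python
NOMINAL_VOLTS = 220.0   # nominal mains voltage given in the text
TOLERANCE = 0.10        # the ±10% window given in the text


def mains_in_range(reading, nominal=NOMINAL_VOLTS, tolerance=TOLERANCE):
    """Return True if one voltage reading lies within nominal ± tolerance."""
    return abs(reading - nominal) <= nominal * tolerance


def out_of_range(readings):
    """Return only the readings that fall outside the allowed window."""
    return [volts for volts in readings if not mains_in_range(volts)]


if __name__ == "__main__":
    # Hypothetical readings taken a few minutes apart.
    samples = [225.0, 190.0, 231.5, 245.0]
    print("Out-of-range readings:", out_of_range(samples))
```

Several readings spread over time say more than a single one, because the text also asks whether the supply is stable.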

2. Display fault

  • Definition
    This type of failure includes not only failures caused by display devices or components, but also abnormal display phenomena caused by other components. In other words, display failures are not necessarily caused by the display equipment, and should be fully observed and judged.

  • Possible fault symptoms
    1. No display at startup; the monitor sometimes or often cannot be powered on;
    2. Color cast, jitter or rolling, blurred display, garbled screen, etc.;
    3. [...] ghosting, crashes, etc.;
    4. Screen parameters cannot be set or modified;
    5. Brightness or contrast cannot be adjusted or has a small adjustable range; screen size or position cannot be adjusted or has a small range;
    6. The display is abnormal after waking from sleep;
    7. The monitor gives off an unusual smell or makes noise.

  • Components that may be involved
    Monitors, graphics cards and their settings; motherboards, memory, power supplies, and other related components. Pay special attention to interference from other devices near the computer and to geomagnetic interference.

  • Judgment points/sequence
    1. Preparation before maintenance
      1) The latest version of the driver for the display card in question.
    2. Environmental inspection
      1) Mains inspection:
      A. Whether the mains voltage is 220 V ±10% at 50 Hz or 60 Hz, and whether the mains is stable;
      B. For the rest, refer to the mains inspection steps under Power-on fault.
      2) Connection check:
      A. Whether the connection between the monitor and the host is firm and correct (when there are two display ports, pay special attention to whether the monitor is connected to the correct one); whether any pins of the cable connector are bent or broken; and whether the display cable itself is intact;
      B. Whether the monitor is correctly connected to the mains, and whether its power indicator is normal (whether it is lit, and in which color);
      C. Whether the display abnormality is related to a missing ground wire. Note in particular that computer maintenance engineers are not allowed to install a ground wire for the user; the user should have it installed by a qualified electrician;
      3) Surrounding and host environment inspection:
      A. Whether the ambient temperature and humidity meet the requirements in the user manual (for example, diamond-tube monitors require an operating temperature of 18 to 40 °C);
      B. Whether there is any unusual smell, smoke, or abnormal sound (such as popping) after the monitor is powered on;
      C. Whether the components on the display card are deformed or discolored, or heat up too quickly;
      D. Whether the display card is seated properly. Check by reseating it and by cleaning the gold fingers of the display card (and of other boards) with an eraser or alcohol; if there is a lot of dust inside the host, clean it out;
      E. Whether there are sources of interference nearby (these include fluorescent lamps, UPS units, speakers, hair dryers, other monitors closer than 50 cm, and other high-power electromagnetic equipment or cables). Note that the orientation of the monitor can also cause interference because of the Earth's magnetic field;
      F. For symptoms such as color cast and jitter, check whether the symptom disappears when the direction and position of the monitor are changed.
      4) Other inspections and precautions:
      A. After the host is powered on, whether it shows normal self-test and operating activity (such as the beep on self-test completion or the hard disk indicator flashing). If so, focus the inspection on the monitor or graphics card;
      B. Do not move or turn the monitor while it is powered on. It is also best not to move it for 2 to 3 minutes after the power is turned off.
    3. Key points for fault diagnosis
      1) Adjust the monitor and graphics card:
      A. Adjust the monitor's OSD options, ideally by using RECALL to return to the factory state, and check whether the fault disappears. For LCD monitors, press the auto config button;
      B. Whether any monitor parameter is set too high or too low (for example H/V-MOIRE, which cannot be restored through RECALL);
      C. Whether the monitor's buttons can adjust each setting, and whether the adjustment range deviates from the monitor's specifications;
      D. Whether the monitor's abnormal sound or smell exceeds its technical specifications (for example, a new monitor gives off an unusual smell when first used, and degaussing at power-on causes a sound and screen jitter; these are normal). For the monitor's specifications, see Appendix 2(2);
      E. Whether the display card's technical specifications allow it to be used in the host (for example, whether an AGP 2.0 card can be used in the host's AGP slot).
      2) BIOS configuration adjustment:
      A. Whether the BIOS settings match the type of display card in use or the port the monitor is connected to (that is, onboard or add-in display card; AGP or PCI display card);
      B. For onboard graphics that do not support automatic allocation of display memory, check whether the display memory size set in the BIOS meets the needs of the application.
      The following checks should be performed under the minimal software system.
      3) Check the monitor/card driver:
      A. Whether the monitor/card driver matches the display device and whether its version is appropriate;
      B. Whether the monitor driver is correct. If the manufacturer provides a driver, it is best to use it;
      C. Whether the appropriate DirectX driver (and motherboard driver) is loaded;
      D. If DirectX [...]. This program can also be used to check the sound card device.
      4) Check display attributes and resources:
      A. Check in the device manager whether other devices have resource conflicts with the display card. If so, remove the conflicting devices first;
      B. Whether the display attributes are set appropriately (for example, an incorrect monitor type, refresh rate, resolution, or color depth can cause ghosting, blurring, a garbled screen, jitter, or even a black screen).
      5) xx system configuration and application check:
      A. Whether the settings in the system's configuration files (such as the System.ini file) are appropriate;
      B. Whether the display card's technical specifications or the display driver's functions meet the needs of the application;
      C. Whether there are other software or hardware conflicts.
      6) Hardware check:
      A. Once the display works normally, add the other components back one at a time to find which one causes the abnormal display;
      B. Swap in different models of display card or monitor to check whether there is a matching problem between them;
      C. Replace the corresponding hardware to check whether a hardware fault causes the abnormal display (recommended replacement order: display card, memory, motherboard).
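
The one-at-a-time procedure in step 6) A can be sketched as a simple loop. This is only an illustration in Python, assuming the engineer's own observation ("is the display still normal?") is supplied as a callback; `find_offending_component` and `display_ok` are hypothetical names, not tools referred to in this guide.

```python
from typing import Callable, Iterable, List, Optional


def find_offending_component(
    minimal: List[str],
    extras: Iterable[str],
    display_ok: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add components back one at a time and return the first one that
    makes the display abnormal, or None if every addition is fine."""
    installed = list(minimal)
    if not display_ok(installed):
        # The isolation only makes sense once the minimal system displays normally.
        raise ValueError("display is already abnormal on the minimal system")
    for part in extras:
        installed.append(part)
        if not display_ok(installed):
            return part
    return None


if __name__ == "__main__":
    # Stand-in for the manual observation: here the sound card is the culprit.
    def observe(parts: List[str]) -> bool:
        return "sound card" not in parts

    culprit = find_offending_component(
        ["motherboard", "CPU", "memory", "display card"],
        ["network card", "sound card", "modem"],
        observe,
    )
    print("Suspect component:", culprit)
```

Stopping at the first failing addition keeps the suspect list to a single part, which can then be confirmed by the replacement method in step 6) C.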

3. Installation failure

  • Definition:
    This type of fault mainly covers faults that occur when installing the xx operating system or application software.

  • Possible fault symptoms
    1. When installing the xx operating system, a crash or error during file copying, or a crash or error during system configuration;
    2. When installing application software, errors, restarts, crashes, etc. (during copying or configuration);
    3. System abnormalities (such as a black screen or failure to start) after a hardware device is installed;
    4. Application software cannot be reinstalled after being uninstalled, or cannot be uninstalled, etc.

  • Components that may be involved
    Disk drives, motherboards, CPUs, memory, and other possible components and software.

  • Judgment points/sequence
    1. Preparation before maintenance
      1) Bring a disk data cable;
      2) Bring the latest version of the appropriate device driver.
    2. Environmental inspection
      1) Software installation
      A. Check the connection and appearance of the hardware devices:
      a) Check whether other devices connected to the host are working normally;
      b) Whether the cables between devices are connected incorrectly or missing, and whether any pins in the plugs and sockets are bent, missing, or shorted;
      c) Check the error message carefully to determine the possible fault location;
      d) Observe whether the system gives off any unusual smell, and check the temperature of the components;
      e) Whether the CPU fan speed is too slow or unstable;
      f) Whether the drive makes any abnormal sound when working.
      B. Other aspects of inspection:
      a) Carefully check the software manual to confirm that the machine's software and hardware configurations meet the requirements of the manual;
      b) Carefully observe whether the installation media is intact.
      2) Equipment installation
      A. Check the connection and appearance of the equipment:
      a) Whether the device and components to be installed are connected correctly, whether the connecting cables are intact, and whether any connector pins are missing, broken, or shorted;
      b) Whether the build quality of the device and components to be installed is good;
      c) The remaining checks are the same as for software installation above.
      B. Driver media check: Whether the driver media used to install the device is intact.
    3. Key points for fault diagnosis
      1) xx operating system installation:
      A. Check the settings in CMOS:
      a) If necessary, restore the factory settings first;
      b) Turn off the BootEasy function, turn off the anti-virus function, and turn off the BIOS write-protection switch;
      c) Pay special attention to the hard disk parameters, the CPU temperature, and so on, and check whether the information displayed during the self-test matches the actual hardware configuration.
      B. Check the installation media and target media:
      a) Check for viruses;
      b) Check whether the partition table is correct and whether the partition is active. Use the Fdisk /mbr command to make sure the master boot record is correct (note that if the machine cannot start after this command is used, the original system had a virus or an error, and the hard disk should be initialized);
      c) Check whether any third-party memory-resident programs are present in the system.
      The following steps are best carried out under the minimal software system (note: under the minimal system, any other drivers needed for installation must be added).
      C. Check the installation process:
      a) If a CAB file error occurs while copying files, try copying the original files to another medium (such as a hard disk) and installing from there. If that succeeds, the original installation medium is the problem, and you can check whether the medium or its drive is faulty; if the files still cannot be copied, check the disk drive, data cable, memory, and other components involved;
      b) If the above problems occur and the fault persists after the installation medium has been replaced, initialize the hard disk first and then reinstall (when initializing, it is best to clear the hard disk partitions completely). If that still does not solve it, consider the hardware;
      c) If an error message, blue screen, or crash occurs during hardware detection in the installation, first restart several times (with a full power-off each time) to see whether it gets through, and second check whether it gets through under the minimal software system. If it does not, check the memory, disk, CPU (including fan), power supply, and other components of the minimal software system in turn. If installation then succeeds, the cause is a fault or configuration problem in a component outside the minimal software system; after installation is complete, add those components back step by step and determine which one is faulty or misconfigured;
      D. Hardware and other issues to watch for:
      a) If the system restarts or powers off during installation, test under the minimal software system. If the fault disappears, finish installing the system, then connect the devices outside the minimal software system one at a time to find which component causes the fault, and resolve it by replacement. If the fault does not disappear, check the power supply, motherboard, and memory in the minimal software system, and even the disk drive;
      b) When installing an operating system such as UNIX on an IDE device, or when installing multiple xx operating systems, note two things. One is the 8.4 GB limit (the UNIX installation must begin within the first 8.4 GB); this requirement does not apply to SCSI devices. The other is the installation order of the operating systems and how they coexist;
      E. For the installation of LEOS, note the following:
      a) After replacing the motherboard for the user, you must first flash a BIOS that supports LEOS;
      b) If you replace the hard drive for the user, check that the spare drive correctly supports DMA66; otherwise problems will occur when installing LEOS;
      c) LEOS is best installed on a brand-new, unpartitioned hard disk. The sequence can follow this plan: new hard disk -> install LEOS -> partition (Fdisk) -> install the xx operating system (Windows XP) -> create one-click recovery. If the original hard disk has partitions, use the Clear.com program to clear them before installing LEOS.
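
The 8.4 GB figure in item D above is not explained in the text. Assuming it refers to the classic BIOS INT 13h cylinder/head/sector addressing ceiling (1024 cylinders, 255 heads, 63 sectors per track, 512-byte sectors), the arithmetic can be sketched in Python as follows. The function name `starts_within_chs_limit` is illustrative only.

```python
# Geometry assumed for the classic BIOS INT 13h CHS limit (an assumption,
# not stated in the guide itself).
CYLINDERS = 1024
HEADS = 255
SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512

CHS_LIMIT_SECTORS = CYLINDERS * HEADS * SECTORS_PER_TRACK
CHS_LIMIT_BYTES = CHS_LIMIT_SECTORS * BYTES_PER_SECTOR


def starts_within_chs_limit(start_sector: int) -> bool:
    """Return True if a partition starting at this sector begins below the limit."""
    return 0 <= start_sector < CHS_LIMIT_SECTORS


if __name__ == "__main__":
    print(f"Limit: {CHS_LIMIT_BYTES} bytes (about {CHS_LIMIT_BYTES / 1e9:.1f} GB)")
    # Hypothetical partition starting 10 GB into the disk.
    ten_gb_sector = 10 * 10**9 // BYTES_PER_SECTOR
    print("Starts within limit:", starts_within_chs_limit(ten_gb_sector))
```

Under this assumption, a UNIX partition that starts beyond sector 16,450,560 on an IDE disk would fall outside the range described in item D.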
      2) Application software installation:
      A. Issues to watch for when installing application software:
      a) For application software installation problems, refer to the xx operating system installation checks above;
      b) Back up the registry before installing.
      B. Software-to-software and software-to-hardware conflict check:
      a) Two methods of isolating software problems can be used. One is to close the running applications under the minimal software system and then install the required software; the other is to close the running applications directly under the original system and then install it. To close existing applications, use msconfig to disable the programs launched at startup from the startup group, autoexec.bat, config.sys, win.ini, and system.ini;
      b) Use the task manager to check whether any abnormal processes are running in the system, and end them;
      c) Where the requirements in the software's technical manual are essentially met but installation still fails, see whether adjusting settings solves it. If not, treat the software as incompatible;
      d) Use other machines (preferably with different configurations) to check whether there is a software/hardware compatibility problem;
      e) Check whether the software is already installed on the system. If it is, uninstall it first and then install. If it cannot be uninstalled normally, remove it manually or by restoring the registry (on Windows XP, the System Restore function can be used);
      f) If necessary, look up relevant information on the Internet, and then contact the software manufacturer to ask whether there are other precautions.
      C. Hardware check:
      If the above steps do not work, consider a hardware problem: check the optical drive, installation media, hard disk cables, and other accessories.
      3) Hardware device installation:
      A. Conflict check:
      a) Whether the installed device or component is detected during the self-test before the system starts, or can be recognized by the xx operating system (non-plug-and-play devices excepted). If it is not recognized, check the BIOS settings and the device itself, including jumpers and the corresponding slot or port;
      b) Check whether the newly installed device conflicts with a device already in the system. Change the driver installation order, remove the corresponding parts or devices from the original system, or change slots, and see whether the fault is eliminated. If not, treat the device as incompatible;
      c) Whether the installed device matches the technical or physical specifications of the existing system;
      d) Check whether some settings in the current system (mainly in the configuration files) are mismatched with the installed components or device drivers;
      B. Driver check: Whether the installed device driver is an appropriate version (that is, not necessarily the latest);
      C. Hardware check:
      a) Whether the installed component or device itself is faulty;
      b) Whether any component in the original system is defective (such as a damaged slot or insufficient power supply capacity).

4. Operation and application faults

  • Definition:
    This type of fault mainly refers to application and system faults that occur after startup and before shutdown.

  • Possible fault symptoms
    1. Unable to wake normally after hibernation;
    2. Blue screens, crashes, illegal operations, and similar faults during system operation;
    3. The system runs slowly;
    4. Running a certain application causes a hardware function to fail;
    5. Games cannot run normally;
    6. Applications cannot be used normally.

  • Components that may be involved
    Motherboard, CPU, memory, power supply, disk, keyboard, connected boards, etc.

  • Judgment points/sequence
    1. Preparation before maintenance
      1) A clean, usable hard disk;
      2) Antivirus software;
      3) Drivers that are as recent as possible, and several BIOS versions;
      4) Disk data cables, etc.
    2. Environmental inspection
      1) Mains and connection inspection:
      A. Whether the mains is normal, whether the connections are tight, and whether there is grounding;
      B. Whether the cables between devices are connected incorrectly or missing.
      2) Peripheral and visual inspection:
      A. Whether other peripherals connected to the host are working normally;
      B. Whether the drive makes abnormal noise when working, and whether the CPU fan speed is too slow or unstable;
      C. Whether there is too much dust in the chassis, causing poor contact at the connectors. Remove the dust first, then wipe the gold fingers with an eraser to remove oxide or dust, and plug the card back in;
      D. Whether the system gives off any unusual smell, and whether component temperatures rise too high or too fast.
      3) Check the display and settings:
      A. Record error messages in detail and judge which parts may be causing the fault;
      B. Check the hard disk settings, system time, and CPU temperature in CMOS, and check whether the hardware information displayed during the self-test matches the machine's configuration;
      C. Read the software's user guide carefully and note its environment requirements.
      4) Communicate fully with the user:
      A. How the user uses the machine;
      B. What happened before the fault;
      C. What operations were performed just before the current fault occurred.
      From this information, make an initial judgment of the possible cause of the fault.
    3. Key points for fault judgment
      1) Check whether the fault is caused by user misoperation.
      A. When the machine crashes, shows a blue screen, or restarts for no apparent reason, first consider whether the user's operation follows the operating specifications and requirements. Ask about and observe the user's method carefully to see whether it is correct and sensible. The engineer then operates the user's machine in the correct way to check whether the reported fault occurs. If it does not, the fault can be attributed to improper operation, and the engineer explains and demonstrates the correct method to the user.
      B. If the fault persists after the above, use the system file checker to check whether any DLL files are missing from the user's system, and try to restore them.
      C. Observe whether there is a pattern to the crashes, blue screens, or unexplained restarts, and look for the likely cause (for example, the machine crashes when a certain program runs, or within a certain time after power-on).
      D. Compare with a fault-free machine that has the same software and hardware: check whether the file sizes on the faulty machine are the same or nearly so, and whether the versions of the main programs match.
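
The comparison in step D can be partly automated. The following Python sketch assumes that the relevant folder from the fault-free machine has been copied somewhere reachable from the faulty one; the function name `size_differences` and the tolerance parameter are illustrative assumptions, and the script compares file sizes only, not program versions.

```python
from pathlib import Path
from typing import List, Tuple


def size_differences(
    reference_root: str, suspect_root: str, tolerance_bytes: int = 0
) -> List[Tuple[str, str]]:
    """List files that are missing on the suspect machine or whose size
    differs from the reference copy by more than tolerance_bytes."""
    reference = Path(reference_root)
    suspect = Path(suspect_root)
    report = []
    for ref_file in sorted(reference.rglob("*")):
        if not ref_file.is_file():
            continue
        relative = ref_file.relative_to(reference)
        candidate = suspect / relative
        if not candidate.is_file():
            report.append((relative.as_posix(), "missing"))
        elif abs(candidate.stat().st_size - ref_file.stat().st_size) > tolerance_bytes:
            report.append((relative.as_posix(), "size differs"))
    return report


if __name__ == "__main__":
    import sys

    # Usage: python compare_sizes.py <reference_folder> <suspect_folder>
    if len(sys.argv) == 3:
        for name, problem in size_differences(sys.argv[1], sys.argv[2]):
            print(f"{problem:12} {name}")
```

A non-zero tolerance reflects the "same or nearly so" wording in step D; files reported here are only candidates for a closer look, not proof of a fault.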
      2) Check whether the fault is caused by a virus or an anti-virus program.
      A. Check whether the user's machine is infected by a virus, and use anti-virus software to remove it;
      B. Check whether the user has installed two or more anti-virus products. If so, recommend keeping one and uninstalling the others;
      C. Check for Trojan horse programs using the latest version of the anti-virus program. Install patches to close security holes in programs, or install a firewall.
      3) Check whether the fault is caused by operating system problems.
      A. Check whether the hard disk has enough free space and whether there are too many temporary files. Tidy up the disk and delete unnecessary files;
      B. For damaged or missing system files, use the system file checker to check and repair them;
      C. Check whether the appropriate system patches are installed (for Windows NT, the service pack version is shown at startup, and SP6 is recommended; for Windows 2000 and Windows XP, check in the system properties; SP3 is recommended for Windows 2000 and SP1 for Windows XP);
      D. Check whether the DirectX driver is working normally, and upgrade DirectX;
      E. Check whether the device drivers are installed correctly and whether their versions are appropriate. Check whether the drivers were installed in the correct order (for example, the motherboard driver first).
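
The service pack recommendations in step C can be kept as a small lookup table. This Python sketch assumes that "recommended" means "at least this service pack", which the text does not state outright, and that the engineer reads the installed service pack number from the startup screen or system properties by hand; `meets_recommended_sp` is an illustrative name.

```python
from typing import Optional

# Recommendations taken from step C above.
RECOMMENDED_SP = {
    "Windows NT": 6,
    "Windows 2000": 3,
    "Windows XP": 1,
}


def meets_recommended_sp(os_name: str, installed_sp: int) -> Optional[bool]:
    """Return True/False for a listed system, or None if the guide gives
    no recommendation for that system."""
    recommended = RECOMMENDED_SP.get(os_name)
    if recommended is None:
        return None
    # Assumption: a newer service pack also satisfies the recommendation.
    return installed_sp >= recommended


if __name__ == "__main__":
    for name, sp in [("Windows NT", 4), ("Windows 2000", 3), ("Windows 98", 0)]:
        print(name, "SP", sp, "->", meets_recommended_sp(name, sp))
```

Returning None for unlisted systems keeps the check from implying a recommendation that the guide does not make.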
      4) Check whether the fault is caused by a software conflict or compatibility problem.
      A. Check whether the user's application software is compatible with the operating system in use (NT/98/2K/XP). Consult the software manual or search the software's website for relevant information, and check whether upgrades or patches are available to install.
      B. Use the task manager to see whether abnormal programs are running in the background on the faulty machine; try closing them and keeping only the most basic background programs.
      C. Check whether shared DLL files are involved on the faulty machine. The problem can be solved by changing the installation order or the installation directory.
      5) Check whether the hardware settings are correct.
      A. First check whether the CMOS settings are correct, and restore the defaults;
      B. In the device manager, check whether the hardware is normal and whether there are interrupt conflicts. If there are, adjust the system resources (for some hardware, read the manual and configure it according to the instructions);
      C. Delete the hardware's driver in the device manager, reinstall it (preferably the correct version), and check whether the hardware returns to normal;
      D. Run a hardware detection program, such as AMI, to detect whether the hardware is faulty;
      E. Under the minimal software system, update the hardware driver again and observe whether the fault disappears.
      6) Check whether it is a compatibility problem.
      A. When a compatibility problem is suspected, check the hardware's specifications and standards and whether the parts are allowed to be used together (for example, when several memory modules are used at once, check whether they come from the same manufacturer, with the same specifications, the same capacity, and the same batch of memory chips).
      B. Read the manual or search the web for the software environment the user's hardware requires, and check whether the current software environment meets it and whether the software and hardware support each other.
      C. Check the user's system resources in the device manager for conflicts. If there are conflicts, adjust the system resources manually.
      D. Check in the device manager whether the hardware drivers on the user's machine are installed correctly, and update to an appropriate driver version (for example, some graphics cards that use the generic driver bundled with Windows 2000 or Windows XP cannot run some large 3D games);
      E. Check the repair BOM, remove any non-Lenovo hardware, and check whether the system works normally. If it does, recommend that the user replace the hardware they added or look up information on that hardware to solve the problem.
      7) Check whether the fault is caused by a network problem.
      A. If the machine crashes, runs slowly, or shows a blue screen while connected to the network, first disconnect it from the network and observe whether the fault disappears. If it does, the network caused the fault.
      B. If the fault is indeed caused by the network, refer to the network section for the judgment and resolution steps.
      8) Check whether the fault is caused by poor hardware performance or damage.
      A. Use the appropriate hardware detection program to check whether the hardware is faulty. If it is, use the replacement method to eliminate the faulty hardware;
      B. For hardware faults the detection program cannot determine, use the replacement method to check.

Origin blog.csdn.net/javascript_good/article/details/132664815