Comprehensive summary of interview questions for Linux operation and maintenance engineers (2023)

 

 

Table of contents

One, linux

1. Linux system startup process

2. linux file type

3. How do centos6 and 7 add the program installed from the source code to the boot self-start?

4. Briefly describe lvm, how to expand the / partition using lvm?

5. Why are the statistical results of du and df inconsistent?

6. How to upgrade the kernel?

Two, mysql

1. Why does the index make the query faster? What are the disadvantages?

2. Difference between left outer join, right outer join, inner join and full join in sql statement

3. Mysql data backup method, how to restore? What is your backup strategy?

4. How to configure the database master-slave synchronization, and whether you encounter data inconsistency in actual work? How to solve?

5. What are the mysql constraints?

6. What is the purpose of the binary log (binlog)?

7. What are the mysql data engines?

8. How to query the storage path of the mysql database?

9. What are the extensions of mysql database files? What is it for?

10. How to change the password of the database user?

11. How to modify user permissions? How to check?

Three, nosql

1. What are the methods of redis data persistence?

2. What are the redis cluster solutions?

3. How does redis perform data backup and recovery?

4. How does MongoDB perform data backup?

5. Why is kafka faster than redis rabbitmq?

Four, docker

1. What are the keywords in the dockerfile? What are they used for?

2. How to reduce the image volume generated by dockerfile?

3. What is the difference between CMD and ENTRYPOINT in dockerfile?

4. What is the difference between COPY and ADD in dockerfile?

5. What are the cs architecture components of docker?

6. What are the types of docker networks?

7. How to configure docker remote access?

8. What are the functions of docker's core technologies: namespaces, cgroups, and the union file system?

9. Command-related: Import and export images, enter containers, set container restart policies, view image environment variables, and view container resource usage

10. What are the ways to build images?

11. What is the difference between docker and vmware virtualization?

Five, kubernetes

1. What are the cluster components of k8s? What is the function?

2. Related to kubectl commands: how to modify the number of replicas, how to roll update and rollback, how to view pod details, and how to enter pod interaction?

3. How to back up etcd data?

4. What are the k8s controllers?

5. What are cluster-level resources?

6. What are the pod statuses?

7. What is the pod creation process?

8. What are the pod restart strategies?

9. What are resource probes?

10. What are requests and limits used for?

11. What does the kubeconfig file contain and what is its purpose?

12. What is the difference between role and clusterrole, rolebinding and clusterrolebinding in RBAC?

13. Why is ipvs more efficient than iptables?

14. What is the purpose of sc pv pvc, and what is the whole process of container mounting and storage?

15. What is the essence of the principle of nginx ingress?

16. Describe the communication process between Pods on different nodes

17. The k8s cluster nodes need to be shut down for maintenance, how to operate

18. The difference between calico and flannel

Six, prometheus

1. What are the advantages of prometheus over zabbix?

2. What are the prometheus components and what are their functions?

3. What are the types of indicators?

4. How to ensure performance when dealing with monitoring of thousands of nodes

5. Briefly describe the entire process from adding node monitoring to grafana graphing

6. Which exporters are used in the work

Seven, ELK

1. How to backup and restore Elasticsearch data?

2. What is the logstash filter plugin used in your project? What functions are implemented?

3. Have you used the built-in module of filebeat? What did you use?

4. What is an elasticsearch shard copy? What are the parameters of your configuration?

Eight, operation and maintenance development

2. Write a script, back up a certain library regularly, then compress and send it to another machine

3. Obtain system information of all hosts in batches

4. Django's mtv mode process

5. How to export and import environment dependent packages in python

6. python create, enter, exit, view virtual environment

7. The difference between flask and django, application scenarios

8. List commonly used git commands

9. How to configure the CICD process of git gitlab jenkins

Nine, Daily work

1. What difficult problems are encountered in daily work, and how to troubleshoot them

2. Daily troubleshooting process

3. Modify the online business configuration file process

4. How much is the business pv? What is the cluster size? How to ensure high service availability?

Ten, Open Questions

1. What do you think is the difference between a junior operation and maintenance engineer and a senior operation and maintenance engineer?

2. What do you think is the future direction of O&M development?


One, linux

1. Linux system startup process

  • Step 1: POST, load BIOS
  • Step 2: Read the MBR
  • Step 3: The boot loader (grub) shows the boot menu
  • Step 4: Load the kernel
  • Step 5: The init process sets the run level according to the /etc/inittab file
  • Step 6: The init process executes rc.sysinit
  • Step 7: Load the kernel modules
  • Step 8: Execute script programs at different run levels
  • Step 9: Execute /etc/rc.d/rc.lo

2. linux file type

Attribute  File type
-  Regular file
d  Directory file
b  Block device file, such as a hard disk; supports random access in units of blocks
c  Character device file, such as a keyboard; supports linear access in units of characters
l  Symbolic link file, also known as a soft link
p  Named pipe file
s  Socket file, used for communication between two processes

3. How do centos6 and 7 add the program installed from the source code to the boot self-start?

  • General method: Edit the /etc/rc.d/rc.local file and add the start service command at the end of the file
  • centos6
    ①Enter the /etc/rc.d/init.d directory;
    ②Create a new service startup script, specify the chkconfig parameter in the script;
    ③Add execution permission;
    ④Execute chkconfig --add to add the service to start automatically;
  • centos7
    ①Enter the /usr/lib/systemd/system directory;
    ②Create a new custom service file containing the [Unit], [Service], and [Install] sections (a unit file only needs to be readable, so execute permission is not required);
    ③Execute systemctl enable service name;

4. Briefly describe lvm, how to expand the / partition using lvm?

  • Function: LVM manages disks dynamically, so volumes can be resized on demand
  • Concepts:

①PV - Physical Volume: The physical volume is the bottom layer of logical volume management. It can be a partition on a physical hard disk, an entire physical hard disk, or a raid device.
②VG - Volume Group: A volume group is built on physical volumes and must contain at least one. After it is created, more physical volumes can be added to it dynamically. A system can have a single volume group or several.
③LV - Logical Volume: A logical volume is built on top of a volume group, and the unallocated space in the volume group can be used to create new logical volumes. After a logical volume is created, its space can be expanded and reduced dynamically. The logical volumes in a system can belong to the same volume group or to different ones.

  • Steps to expand the / partition:

①Add a disk
②Use the fdisk command to partition the newly added disk
③After partitioning, change the partition type to lvm
④Use pvcreate to create a physical volume
⑤Use the vgextend command to add the new physical volume to the volume group that holds the root logical volume
⑥Use the lvextend command to expand the root logical volume
⑦Use xfs_growfs to grow the file system to the new size (for an XFS file system)

5. Why are the statistical results of du and df inconsistent?

  • When a user deletes a large number of files, they are no longer visible in the file system directory, so du no longer counts them.
  • However, if a running process still holds a handle to a deleted file, the file is not actually removed from the disk and the information in the partition's super block does not change, so df still counts it.
  • You can use the lsof command to find such files, which the system marks as deleted. If there are many of them, the results of du and df will be inconsistent.
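
The effect is easy to reproduce. The following Python sketch (assuming a Linux host; the scratch file is a hypothetical temporary file) deletes a file while a handle to it is still open. The directory entry disappears, which is what du looks at, but the data stay readable through the open handle, which is the space df keeps counting.

```python
import os
import tempfile


def deleted_but_open_demo(size=4096):
    """Unlink a file while it is still open and report what remains."""
    fd, path = tempfile.mkstemp()            # hypothetical scratch file
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)                      # directory entry removed: du no longer sees it
        visible = os.path.exists(path)       # False, the name is gone
        info = os.fstat(fd)                  # the inode is still alive through the open fd
        os.lseek(fd, 0, os.SEEK_SET)
        readable = len(os.read(fd, size))    # the blocks are still allocated and readable
        return visible, info.st_size, readable
    finally:
        os.close(fd)                         # only after this can the space be freed


visible, size, readable = deleted_but_open_demo()
print(visible, size, readable)
```

Restarting or stopping the process that holds the handle is what finally releases the space.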

6. How to upgrade the kernel?

  • Method One
# Add a third-party yum repository, then download and install from it.
CentOS 6 yum repository: http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
CentOS 7 yum repository: http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
# Import the elrepo key first, then install the elrepo yum repository:
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
# List the available kernel packages
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available 
yum -y --enablerepo=elrepo-kernel install
  • Method Two
# Install from a downloaded kernel image rpm package.
Official CentOS 6: http://elrepo.org/linux/kernel/el6/x86_64/RPMS/
Official CentOS 7: http://elrepo.org/linux/kernel/el7/x86_64/RPMS/
# Get the download link, then download and install
wget https://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-4.4.185-1.el7.elrepo.x86_64.rpm
rpm -ivh kernel-lt-4.4.185-1.el7.elrepo.x86_64.rpm
# View the default boot order
[root@localhost ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (5.2.2-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (4.4.182-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.21.3.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-957.10.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-e34fb4f1527b4f2d9fc75b77c016b6e7) 7 (Core)
As the output shows, entries are numbered from 0: the newly installed kernel is at position 0 and the older kernels follow it
# Set the default boot entry
[root@localhost ~]# grub2-set-default 0  # 0 is the first entry in the list above
# Reboot to verify

7. How to count the top ten IP addresses in nginx log visits?

awk '{array[$1]++}END{for (ip in array)print ip,array[ip]}' access.log |sort -k2 -rn|head
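
The same counting can be sketched in Python. This is a minimal example that assumes, as the awk command does, that the client IP is the first whitespace-separated field of each line; the sample lines are made up for illustration.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Return the n most frequent client IPs as (ip, hits) pairs."""
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return counts.most_common(n)


# Hypothetical access.log lines in the default combined format
sample = [
    '10.0.0.1 - - [01/Jan/2023:00:00:01 +0800] "GET / HTTP/1.1" 200 612',
    '10.0.0.2 - - [01/Jan/2023:00:00:02 +0800] "GET /a HTTP/1.1" 200 100',
    '10.0.0.1 - - [01/Jan/2023:00:00:03 +0800] "GET /b HTTP/1.1" 404 153',
    '10.0.0.3 - - [01/Jan/2023:00:00:04 +0800] "GET / HTTP/1.1" 200 612',
    '10.0.0.1 - - [01/Jan/2023:00:00:05 +0800] "GET /c HTTP/1.1" 200 612',
    '10.0.0.2 - - [01/Jan/2023:00:00:06 +0800] "GET / HTTP/1.1" 200 612',
]

for ip, hits in top_ips(sample, n=2):
    print(ip, hits)
```

For a real log, pass an open file object instead of the sample list, for example top_ips(open("access.log")).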

8. How to delete the logs from 30 days ago at the end of .log under /var/log/?

find /var/log/ -type f -name '*.log' -mtime +30 -print0 | xargs -0 rm -f
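
For scripts that need a dry run or logging, the same cleanup can be sketched in Python. The function names below are hypothetical, and the age test compares modification time in seconds, which only approximates the whole-day rounding that find applies. The demo runs against a temporary directory rather than /var/log.

```python
import os
import tempfile
import time
from pathlib import Path


def old_logs(root, days=30):
    """Return *.log files under root modified more than `days` days ago."""
    cutoff = time.time() - days * 86400
    return [p for p in Path(root).rglob("*.log")
            if p.is_file() and p.stat().st_mtime < cutoff]


def delete_old_logs(root, days=30, dry_run=True):
    """List the matching files, and delete them unless dry_run is True."""
    names = []
    for path in old_logs(root, days):
        if not dry_run:
            path.unlink()
        names.append(path.name)
    return sorted(names)


# Demo on a scratch directory with two stale files and one fresh one
with tempfile.TemporaryDirectory() as tmp:
    stale = time.time() - 40 * 86400
    for name, mtime in [("old.log", stale), ("new.log", None), ("old.txt", stale)]:
        path = Path(tmp) / name
        path.write_text("sample\n")
        if mtime is not None:
            os.utime(path, (mtime, mtime))   # backdate the modification time
    deleted = delete_old_logs(tmp, days=30, dry_run=False)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(deleted, remaining)
```

Running with dry_run=True first shows what would be removed before anything is deleted.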

9. What modules does ansible have? What is the function?

Module    Function
copy      Copy files to the managed host
cron      Manage scheduled tasks
fetch     Copy files from the managed host to the local machine
file      Manage files and directories
group     Manage user groups
user      Manage users
hostname  Manage the hostname
script    Run a local script on the managed host
service   Manage services (start, stop, restart)
command   Execute commands remotely
shell     Execute commands remotely through a shell; a more capable version of command
yum       Manage packages with yum
setup     Gather host system information

10. Why is nginx faster than apache?

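  • nginx uses the epoll model: the kernel reports only the connections that are ready, so the cost does not grow with the number of idle connections
  • Apache uses the select model: every call scans all monitored connections, so the cost grows with the number of connections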

11. What is the difference between four-layer load and seven-layer load?

  • Layer 4 load balancing forwards traffic based on IP address and port
  • Layer 7 load balancing forwards traffic based on application-layer information such as URLs

12. What are the working modes of lvs? Which has the highest performance?

  • dr: direct routing mode. The request is received by LVS, and the real server that handles it responds to the user directly without going back through LVS. (highest performance)
  • tun: tunnel mode. The client sends packets addressed to the VIP to the LVS server. LVS re-encapsulates the request and sends it to a backend real server, which unpacks it, confirms that it owns the VIP, and processes the request. The real server then responds to the client directly.
  • nat: Both incoming and outgoing packets pass through LVS, which must act as the gateway for the real servers (RS). When a packet arrives, LVS performs destination address translation (DNAT) to change the destination IP to the RS's IP. To the RS, the packet looks as if the client sent it directly. When the RS returns its response, the source IP is the RS IP and the destination IP is the client's IP. The response passes through the gateway (LVS), which performs source address translation (SNAT) to change the source address to the VIP, so to the client the packet appears to come directly from LVS. The client cannot perceive the backend RS.
  • fullnat: similar to nat mode, except that nat mode performs two address translations while fullnat mode performs four.

13. The meaning of each directory of tomcat, how to modify the port, how to modify the number of memory?

  • bin stores the tomcat commands
  • conf stores the tomcat configuration files
  • lib stores the jar packages that tomcat loads at runtime
  • logs stores the logs generated while tomcat runs
  • temp stores temporary files generated while tomcat runs
  • webapps is the site directory
  • work stores the files compiled while tomcat runs
  • To change the port number, edit conf/server.xml
  • To change the JVM memory, edit bin/catalina.sh

14. When nginx reverse proxy, how to make the backend obtain the real access source ip?

# Add the following to the location block:
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

15. What are the nginx load balancing algorithms?

  • rr round robin (the default)
  • weight weighted round robin
  • ip_hash static scheduling algorithm based on a hash of the client IP
  • fair dynamic scheduling algorithm based on backend response time
  • url_hash hash of the requested URL
  • least_conn the minimum number of connections

16. How to conduct stress test?

For example: Simulate 10 users and initiate a total of 100 requests to the Baidu homepage.

# Test command:
ab -n 100 -c 10 https://www.baidu.com/index.htm

17. How does the curl command send https requests? How to view response header information? How to send get and post form information?

  • Send https request:
curl --tlsv1 'https://www.bitstamp.net/api/v2/transactions/btcusd/'
  • View response header information: curl -I <request address>
  • get: curl '<request address>?key1=value1&key2=value2&key3=value3'
  • post: curl -d 'key1=value1&key2=value2&key3=value3' <request address>

Two, mysql

1. Why does the index make the query faster? What are the disadvantages?

By default, a query scans the entire table and adds each row that matches the search conditions to the result set. If a field is indexed, the query first consults the index to locate the rows with the specific value directly, which greatly reduces the number of rows traversed and so significantly speeds up the query.

Disadvantages:

  • Creating and maintaining indexes takes time, which increases with the amount of data
  • The index needs to occupy physical space. In addition to the data space occupied by the data table, each index also occupies a certain amount of physical space. If a clustered index needs to be established, the space required will be larger
  • When rows in the table are added, deleted, or modified, the index must also be maintained dynamically, which slows down data modification
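
The trade-off can be illustrated with a toy model in Python. This is only a sketch: a sorted list searched with bisect stands in for the B+ tree that MySQL actually uses, and the table contents are made up.

```python
import bisect

# Hypothetical table of (id, name) rows
rows = [(i, f"user{i}") for i in range(100_000)]


def full_scan(rows, target_id):
    """No index: examine rows one by one until the value is found."""
    examined = 0
    for row in rows:
        examined += 1
        if row[0] == target_id:
            return row, examined
    return None, examined


# The "index": keys kept in sorted order, each pointing at a row position.
# Keeping this structure sorted on every insert or delete is the maintenance cost.
index = sorted((row[0], pos) for pos, row in enumerate(rows))
keys = [key for key, _ in index]


def index_lookup(rows, target_id):
    """With an index: binary search the keys, then fetch the row directly."""
    i = bisect.bisect_left(keys, target_id)
    if i < len(keys) and keys[i] == target_id:
        return rows[index[i][1]]
    return None


row, examined = full_scan(rows, 99_999)
print(row, examined)
print(index_lookup(rows, 99_999))
```

The scan touches every row to find the last id, while the binary search needs only about 17 comparisons for 100,000 keys. The index list is also a second copy of the key data, which is the extra space the disadvantages above refer to.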

2. Difference between left outer join, right outer join, inner join and full join in sql statement

3. Mysql data backup method, how to restore? What is your backup strategy?

  • Physical full backup
Back up all database files: /var/lib/mysql/*
Back up all binlog files: /var/lib/mysql/mysql-bin.*
Back up the option file: /etc/my.cnf
  • mysqldump logical backup
mysqldump -uroot -p --all-databases > /backup/mysqldump/all.db
  • Physical backup recovery
# Rename the original data directory first
mv /var/lib/mysql /var/lib/mysql.old
cp -a /backups/mysql /var/lib
  • Logical backup recovery
mysql > use db_name
mysql > source /backup/mysqldump/db_name.db

4. How to configure the database master-slave synchronization, and whether you encounter data inconsistency in actual work? How to solve?

Configure a server-id with a unique value for each server

  • Master
Enable the binlog
Create a replication user
Check the master status
  • Slave
Use change master to to set the master information
Use start slave to start replication

5. What are the mysql constraints?

  • not-null constraint
  • unique constraint
  • primary key constraint
  • foreign key constraints

6. What is the purpose of the binary log (binlog)?

The binlog records the changes made to the database, for example DDL operations such as creating a database, creating a table, or modifying a table, as well as DML operations on tables. Once the binlog is enabled, every operation that changes the database is recorded in the binlog binary file as an "event", in chronological order. It is used for master-slave replication and for data recovery.

7. What are the mysql data engines?

  • Commonly used: MyISAM, InnoDB
  • Differences:

(1) InnoDB supports transactions and MyISAM does not, which is very important. A transaction is an advanced processing method: if any statement in a series of inserts, deletes, and updates fails, the whole series can be rolled back, which MyISAM cannot do;
(2) MyISAM suits applications dominated by queries and inserts, while InnoDB suits applications with frequent modifications and high security requirements;
(3) InnoDB supports foreign keys, but MyISAM does not;
(4) MyISAM was the default engine before MySQL 5.5; from 5.5 onward InnoDB is the default;
(5) InnoDB did not support FULLTEXT indexes before MySQL 5.6;
(6) InnoDB does not store the number of rows in a table. For select count(*) from table, InnoDB has to scan the entire table to count the rows, while MyISAM simply reads the saved row count. Note that MyISAM also has to scan the table when the count(*) statement contains a where condition;
(7) For AUTO_INCREMENT columns, InnoDB requires an index in which that column comes first, whereas in a MyISAM table the column can be part of a composite index with other columns;
(8) When clearing the entire table, InnoDB deletes one row at a time, which is very slow, while MyISAM rebuilds the table;
(9) InnoDB supports row locks (in some cases it still locks the whole table, for example update table set a=1 where user like '%lee%').

8. How to query the storage path of the mysql database?

  • Run show variables like 'datadir'; in the mysql client to see the data directory

9. What are the extensions of mysql database files? What is it for?

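  • myisam
.frm: stores the table definition
.myd: stores the table data
.myi: stores the table indexes
  • innodb
.frm: stores the table definition
.ibd: the tablespace (table data and indexes)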

10. How to change the password of the database user?

  • Before mysql8
set password for username@localhost = password('new_password');
mysqladmin -uusername -pold_password password new_password
update mysql.user set password=password('123') where user='root' and host='localhost';
flush privileges;
  • mysql8 and later
# mysql8 enforces a strong password policy at first, so a simple string is rejected. Change it to MyNewPass@123 first;
alter user 'root'@'localhost' identified by 'MyNewPass@123';
# Relax the password policy
set global validate_password.policy=0;
set global validate_password.length=4;
# Change to a simple password
alter user 'root'@'localhost' identified by '1111';

11. How to modify user permissions? How to check?

  • Authorization:
grant all on *.* to user@'%' identified by 'passwd';
  • View permissions
show grants for user@'%';

Three, nosql

1. What are the methods of redis data persistence?

  • RDB (point-in-time snapshots)
  • AOF (append-only file)

2. What are the redis cluster solutions?

  • Official cluster solution (Redis Cluster)
  • twemproxy proxy solution
  • Sentinel mode
  • Codis
  • Client-side sharding

3. How does redis perform data backup and recovery?

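  • Backup
redis 127.0.0.1:6379> SAVE
The BGSAVE command also creates a redis backup file, and it runs in the background.
  • Restore
Move the backup file (dump.rdb) into the redis directory and start the service. The directory can be queried with:
redis 127.0.0.1:6379> CONFIG GET dir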

4. How does MongoDB perform data backup?

mongoexport / mongoimport
mongodump  / mongorestore

5. Why is kafka faster than redis and rabbitmq?

See the Zhihu discussion "Are RabbitMQ, ZeroMQ, and Kafka on the same level? What are the advantages and disadvantages of each?"

Four, docker

1. What are the keywords in the dockerfile? What are they used for?


2. How to reduce the image volume generated by dockerfile?

  • Try to choose a basic system image that meets the needs but is relatively small. For example, you can choose the debian:wheezy or debian:jessie image most of the time, which is only less than 100 megabytes in size;
  • Clean up temporary files such as compilation generated files and installation package cache;
  • Specify the exact version number when installing each software, and avoid introducing unnecessary dependencies;
  • From a security point of view, applications should use system libraries and dependencies as much as possible;
  • If special environment variables are needed while installing the application, restore any values that do not need to be kept once installation is finished;

3. What is the difference between CMD and ENTRYPOINT in dockerfile?

  • Both the CMD and ENTRYPOINT instructions are used to specify the command to run when the container starts.
  • When ENTRYPOINT is specified in exec form, the arguments given by CMD are appended to the argument list of the ENTRYPOINT command.
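
How the two instructions combine can be modelled in a few lines of Python. This is a simplified sketch of the exec-form behaviour only, with a hypothetical helper name; it also models the fact that arguments passed after the image name on docker run replace CMD.

```python
def container_argv(entrypoint=None, cmd=None, run_args=None):
    """Return the command a container would start with (exec form only).

    entrypoint and cmd are the lists from the Dockerfile; run_args are the
    arguments given after the image name on `docker run`, which replace CMD.
    """
    args = list(run_args) if run_args else list(cmd or [])
    return list(entrypoint or []) + args


# ENTRYPOINT ["ping", "-c", "3"] with CMD ["localhost"]
print(container_argv(["ping", "-c", "3"], ["localhost"]))
# Same image started as: docker run image example.com
print(container_argv(["ping", "-c", "3"], ["localhost"], ["example.com"]))
# Image with only CMD ["nginx", "-g", "daemon off;"]
print(container_argv(cmd=["nginx", "-g", "daemon off;"]))
```

This is why ENTRYPOINT is usually used for the fixed executable and CMD for default arguments that users may want to override.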

4. What is the difference between COPY and ADD in dockerfile?

  • Both the COPY and ADD instructions copy resources from the host into the container image
  • The difference is that ADD can also fetch resources from a remote URL, and these are not decompressed
  • If the source is a local compressed archive, ADD decompresses it automatically

5. What are the cs architecture components of docker?

6. What are the types of docker networks?

  • host mode
  • container mode
  • none mode
  • bridge mode

7. How to configure docker remote access?

  • vim /lib/systemd/system/docker.service
  • Add configuration after ExecStart=, note that you need to enter a space first, and then enter -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock

8. What are the functions of docker's core technologies: namespaces, cgroups, and the union file system?

  • namespace: resource isolation
  • cgroup: resource control
  • Union file system: layers file system changes on top of one another as successive commits, and can mount different directories under the same virtual file system

9. Command-related: Import and export images, enter containers, set container restart policies, view image environment variables, and view container resource usage

  • Import an image: docker load -i xx.tar
  • Export an image: docker save -o xx.tar image_name
  • Enter a container: docker exec -it <container> /bin/bash
  • Set the container restart policy: pass the --restart option when starting the container
  • View container environment variables: docker exec {containerID} env
  • View container resource usage: docker stats test2

10. What are the ways to build images?

  • dockerfile
  • The container is submitted as an image

11. What is the difference between docker and vmware virtualization?

Five, kubernetes

1. What are the cluster components of k8s? What is the function?

2. Related to kubectl commands: how to modify the number of replicas, how to roll update and rollback, how to view pod details, and how to enter pod interaction?

  • Modify the number of replicas
kubectl scale deployment redis --replicas=3
  • Rolling update
kubectl set image deployments myapp-deploy myapp=myapp:v2
  • Rollback
kubectl rollout undo deployments myapp-deploy
  • View pod details
kubectl describe pods/<pod-name>
  • Enter a pod interactively
kubectl exec -it <pod-name> -c <container-name> -- bash

3. How to back up etcd data?

4. What are the k8s controllers?

  • ReplicaSet
  • Deployment
  • StatefulSet
  • DaemonSet
  • Job
  • CronJob

5. What are cluster-level resources?

  • Namespace
  • Node
  • ClusterRole
  • ClusterRoleBinding
  • Role and RoleBinding, by contrast, are namespace-level resources

6. What are the pod statuses?

  • Pending: accepted by the cluster, but its containers are not all running yet
  • Running: running
  • Succeeded: terminated normally
  • Failed: terminated abnormally
  • Unknown: status unknown

7. What is the pod creation process?

8. What are the pod restart strategies?

There are three Pod restart strategies, and the default value is Always.

  • Always : When the container fails, kubelet will automatically restart the container;
  • OnFailure: restart when the container terminates and the exit code is not 0;
  • Never : The kubelet will not restart the container regardless of the state
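
The three policies reduce to a small decision rule. The Python sketch below uses a hypothetical helper name and models only the decision described above, not the kubelet's back-off timing.

```python
def should_restart(policy, exit_code):
    """Decide whether a terminated container is restarted under a policy."""
    if policy == "Always":
        return True                 # restarted no matter how it exited
    if policy == "OnFailure":
        return exit_code != 0       # restarted only after a non-zero exit
    if policy == "Never":
        return False                # never restarted
    raise ValueError(f"unknown restart policy: {policy}")


for policy in ("Always", "OnFailure", "Never"):
    print(policy, should_restart(policy, 0), should_restart(policy, 1))
```

A batch task that should run once to completion typically uses OnFailure or Never, while long-running services keep the default Always.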

9. What are resource probes?

  • ExecAction: The operation of executing a command in the container and diagnosing it based on the returned status code is called Exec detection. A status code of 0 means success, otherwise it is in an unhealthy state.
  • TCPSocketAction: Diagnose by trying to establish a connection with a certain TCP port of the container. If the port can be successfully opened, it is normal, otherwise it is unhealthy.
  • HTTPGetAction: Diagnose by initiating an HTTP GET request to a specified path on a specified port of the IP address of the container. If the response code is 2xx or 3xx, it is successful; otherwise, it is a failure
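
The TCP and HTTP checks can be sketched in Python as follows. The function names are hypothetical and the code only imitates what the kubelet does; the demo opens a throwaway listener on the loopback interface so that nothing outside the machine is contacted.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """TCPSocketAction-style check: healthy if the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status_ok(status):
    """HTTPGetAction-style check: 2xx and 3xx response codes count as success."""
    return 200 <= status < 400


# Demo: probe a local listener, then probe the same port after it is closed
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
open_result = tcp_probe("127.0.0.1", port)
server.close()
closed_result = tcp_probe("127.0.0.1", port)

print(open_result, closed_result, http_status_ok(302), http_status_ok(404))
```

In a pod spec these checks are attached to a container as a liveness, readiness, or startup probe.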

10. What are requests and limits used for?

  • The "requests" attribute defines the amount of a resource guaranteed to the container: the container may not use all of it, but that much must be available whenever it is needed
  • The "limits" attribute caps the amount of the resource the container can use, that is, a hard limit

11. What does the kubeconfig file contain and what is its purpose?

Contains cluster parameters (CA certificate, API Server address), client parameters (client certificate and private key), and context information (cluster name, user name). kubectl and other clients use it to locate and authenticate to the API Server.

12. What is the difference between role and clusterrole, rolebinding and clusterrolebinding in RBAC?

  • Role can be defined in a namespace. If you want to cross namespaces, you can create ClusterRole. ClusterRole has the same permission and role control capabilities as Role. The difference is that ClusterRole is at the cluster level.
  • RoleBinding applies to authorization within a namespace, while ClusterRoleBinding applies to cluster-wide authorization

13. Why is ipvs more efficient than iptables?

Both IPVS mode and iptables mode are based on Netfilter, but ipvs uses hash tables while iptables uses a list of rules. iptables was designed for firewalls. The more services a cluster has, the more iptables rules there are, and because rules are matched from top to bottom, efficiency drops. Once the number of services reaches a certain scale, the speed advantage of hash lookups shows, which improves service performance.
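
The difference in lookup cost can be shown with a toy model in Python. This is only an analogy: a list of tuples stands in for a chain of iptables rules and a dict for the ipvs hash table, and the service addresses and backend names are made up.

```python
def match_rule_list(rules, key):
    """iptables-style lookup: walk the rules from top to bottom."""
    for checked, (rule_key, backend) in enumerate(rules, start=1):
        if rule_key == key:
            return backend, checked
    return None, len(rules)


def match_hash_table(table, key):
    """ipvs-style lookup: a single hash lookup, whatever the table size."""
    return table.get(key)


# 5000 hypothetical services, each keyed by (cluster IP, port)
services = [((f"10.96.{i // 256}.{i % 256}", 80), f"backend-{i}") for i in range(5000)]
table = dict(services)
last_key = services[-1][0]

backend, checked = match_rule_list(services, last_key)
print(backend, checked)
print(match_hash_table(table, last_key))
```

In the worst case the rule list is examined 5000 times for one packet, while the hash table finds the same backend in one step.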

14. What is the purpose of sc pv pvc, and what is the whole process of container mounting and storage?

  • PVC: The attributes of the persistent storage that the Pod wants to use, such as storage size, read and write permissions, and so on.
  • PV: Specific properties of Volume, such as Volume type, mount directory, remote storage server address, etc.
  • StorageClass: Acts as a template for PVs. Moreover, only PVs and PVCs that belong to the same StorageClass can be bound together. Of course, another important role of StorageClass is to specify the Provisioner (storage plug-in) of PV. At this time, if your storage plug-in supports Dynamic Provisioning, Kubernetes can automatically create PV for you.

15. What is the essence of the principle of nginx ingress?

  • The ingress controller interacts with the kubernetes api to dynamically detect changes to the ingress rules in the cluster.
  • It then reads the rules, which specify which domain name maps to which service, and generates a piece of nginx configuration from them.
  • The configuration is written to the nginx-ingress-controller pod, which runs an Nginx service; the controller writes the generated configuration into the /etc/nginx.conf file.
  • Nginx is then reloaded so the configuration takes effect. This is how per-domain configuration and dynamic updates are achieved.

16. Describe the communication process between Pods on different nodes

17. The k8s cluster nodes need to be shut down for maintenance, how to operate

  • Evict the pods: kubectl drain <node_name>
  • Check that no pods are running on the node and that the evicted pods are running normally on other nodes
  • Shut down for maintenance
  • After powering on, start the related services (note the startup order)
  • Make the node schedulable again: kubectl uncordon <node_name>
  • Create a test pod and use the node label to test that the node can be scheduled normally

18. The difference between calico and flannel

  • Flannel (simple, mostly used): based on Vxlan technology (overlay network + Layer 2 tunnel), does not support network policies
  • Calico (more complex, less used than Flannel): can also support tunnel networks, but it is a layer-3 tunnel (IPIP), supports network policies
  • The Calico project can independently provide network solutions and network policies for Kubernetes clusters, and can also be combined with flannel, where flannel provides network solutions, and Calico is only used to provide network policies at this time.

Six, prometheus

1. What are the advantages of prometheus over zabbix?

https://blog.csdn.net/wangyiyungw/article/details/85774969

2. What are the prometheus components and what are their functions?

3. What are the types of indicators?

  • Counter
  • Gauge
  • Histogram
  • Summary

4. How to ensure performance when dealing with monitoring of thousands of nodes

  • Reduce collection frequency
  • Reduce the number of days that historical data is kept
  • Using cluster federation and remote storage

5. Briefly describe the entire process from adding node monitoring to grafana graphing

  • The monitored node installs exporter
  • Prometheus server adds monitoring items
  • View prometheus web interface - status - targets
  • Grafana creates graphs

6. Which exporters are used in the work

  • node-exporter monitors linux hosts
  • cAdvisor monitors containers
  • MySQLD Exporter monitors mysql
  • Blackbox Exporter network detection
  • Pushgateway collects custom indicators for monitoring
  • process exporter process monitoring

Seven, ELK

1. How to backup and restore Elasticsearch data?

https://www.cnblogs.com/tcy1/p/13492361.html
https://blog.csdn.net/moxiaomomo/article/details/78401400?locationNum=8&fps=1

2. What is the logstash filter plugin used in your project? What functions are implemented?

  • date date parsing
  • Grok regular matching analysis
  • overwrite write a field
  • dissect delimiter parsing
  • mutate handles fields
  • JSON parsing
  • geoip geographic location analysis
  • ruby modify logstash event

3. Have you used the built-in module of filebeat? What did you use?

4. What is an elasticsearch shard copy? What are the parameters of your configuration?

https://juejin.cn/post/6844903862088777736

Eight, operation and maintenance development

1. Backup all container images in the system

# Back up the image list

2. Write a script, back up a certain library regularly, then compress and send it to another machine

  • The public part defines functions, such as obtaining the timestamp and configuring the alarm interface
  • Use if to judge whether there is an exception and handle it. If the database is large, check whether the task is completed. Check if the generated file size is an empty file
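
The checking and compression steps can be sketched in Python. This is a minimal example under stated assumptions: the function name and file names are hypothetical, the dump is assumed to have been produced already (for example by mysqldump), and the scheduling and the transfer to the other machine are left out.

```python
import gzip
import shutil
import tempfile
import time
from pathlib import Path


def compress_backup(dump_path, dest_dir):
    """Gzip a finished dump into dest_dir after rejecting a missing or empty file."""
    dump_path = Path(dump_path)
    if not dump_path.is_file() or dump_path.stat().st_size == 0:
        raise ValueError(f"backup file missing or empty: {dump_path}")
    stamp = time.strftime("%Y%m%d%H%M%S")               # timestamp for the file name
    target = Path(dest_dir) / f"{dump_path.stem}_{stamp}.sql.gz"
    with open(dump_path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Demo with a scratch directory instead of a real database dump
with tempfile.TemporaryDirectory() as tmp:
    dump = Path(tmp) / "db_name.sql"
    dump.write_bytes(b"CREATE TABLE t (id INT);\n")
    archive = compress_backup(dump, tmp)
    archive_name = archive.name
    with gzip.open(archive, "rb") as fh:
        restored = fh.read()
    empty = Path(tmp) / "empty.sql"
    empty.touch()
    try:
        compress_backup(empty, tmp)
        empty_rejected = False
    except ValueError:
        empty_rejected = True                           # this is where an alarm would fire

print(archive_name, restored, empty_rejected)
```

In a complete script the ValueError branch would call the alarm interface, and the returned archive path would be handed to scp or rsync, scheduled through cron.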

3. Obtain system information of all hosts in batches

  • Use Python's paramiko library to log in to each host over SSH and run the queries
  • Use a shell script that logs in to each host over SSH in a loop and runs the commands
  • Use Ansible's setup module to gather host facts
  • Use the Prometheus node_exporter to collect host resource information
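
As an illustration of the SSH-based options, the sketch below calls the system ssh client from Python's standard library instead of paramiko, so it needs no third-party package. It assumes key-based login is already set up for every host; the command string, the field names, and the function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# One line of output per item: hostname, kernel, CPU count, total memory in MB.
INFO_COMMAND = "hostname; uname -r; nproc; free -m | awk '/^Mem:/ {print $2}'"


def parse_info(output):
    """Turn the four output lines of INFO_COMMAND into a dict."""
    hostname, kernel, cpus, mem_mb = output.strip().splitlines()[:4]
    return {
        "hostname": hostname,
        "kernel": kernel,
        "cpus": int(cpus),
        "mem_mb": int(mem_mb),
    }


def query_host(host):
    """Run INFO_COMMAND on one host over ssh; returns (host, info or error)."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, INFO_COMMAND],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return host, {"error": result.stderr.strip()}
    try:
        return host, parse_info(result.stdout)
    except ValueError:
        return host, {"error": "unexpected output"}


def query_all(hosts, workers=10):
    """Query all hosts in parallel and return a dict keyed by host."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(query_host, hosts))
```

BatchMode=yes makes ssh fail instead of prompting for a password, which keeps a batch run from hanging on a single host.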

4. The request flow in Django's MTV (Model-Template-View) pattern

5. How to export and import a Python environment's dependency packages

  • Export the environment
pip freeze > requirements.txt
  • Import the environment
pip install -r requirements.txt

6. Create, enter, exit, and view a Python virtual environment

  • Install the package
pip3 install virtualenv
  • Check that the installation succeeded
virtualenv --version
  • cd to the directory where you want to create the virtual environment
cd github/test/venv/
  • Create a virtual environment
virtualenv test
  • Activate the virtual environment (the argument is the path to the activate script)
source test/bin/activate
  • Exit the virtual environment
deactivate
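
The same steps can be scripted. The sketch below swaps in the standard library's venv module rather than the virtualenv package used above, and adds a check for whether the current interpreter is running inside a virtual environment. The function names are hypothetical.

```python
import sys
import venv
from pathlib import Path


def create_env(path, with_pip=True):
    """Create a virtual environment at path using the stdlib venv module."""
    venv.create(path, with_pip=with_pip)
    return Path(path)


def is_virtualenv(path):
    """venv writes a pyvenv.cfg file at the root of the environment."""
    return (Path(path) / "pyvenv.cfg").is_file()


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix
```

Activation and deactivation remain shell operations (source and deactivate), because they change the environment of the current shell.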

7. The difference between Flask and Django, and their application scenarios

  • Django is large and full-featured, while Flask includes only the basics. Django's one-stop approach means developers do not need to spend a lot of time choosing the application's infrastructure before development starts: templates, forms, routing, authentication, basic database management, and more are built in. In contrast, Flask is just a core built on two external libraries, the Jinja2 template engine and the Werkzeug WSGI toolkit; most other functionality is added through extensions.
  • Flask is more flexible than Django, because developers choose the components themselves before building the application. This suits scenarios where a standard ORM (Object-Relational Mapping) is a poor fit, or where the application must work with different workflows and templating systems.

8. List commonly used git commands

  • $ git init
  • $ git config
  • $ git add
  • $ git commit
  • $ git branch
  • $ git checkout
  • $ git tag
  • $ git push
  • $ git status
  • $ git log

9. How to configure the CI/CD process with Git, GitLab, and Jenkins

  • The developer pushes code to the GitLab repository with git
  • Jenkins pulls the code from GitLab and triggers an image build
  • The image is uploaded to the private Harbor registry
  • The execution machine downloads the image
  • The image is run
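
The image-related stages above map onto a handful of docker commands. The following Python sketch only builds those command lines so the flow is easy to see; in practice they would live in a Jenkins job or pipeline. The registry host, the project name, and the tagging scheme (first eight characters of the commit hash) are assumptions.

```python
def image_tag(registry, project, app, commit):
    """Build an image reference such as registry/project/app:shortcommit."""
    return f"{registry}/{project}/{app}:{commit[:8]}"


def pipeline_commands(registry, project, app, commit):
    """Return the docker command for each stage, keyed by stage name."""
    tag = image_tag(registry, project, app, commit)
    return {
        "build": ["docker", "build", "-t", tag, "."],       # on the Jenkins agent
        "push": ["docker", "push", tag],                    # to the Harbor registry
        "pull": ["docker", "pull", tag],                    # on the execution machine
        "run": ["docker", "run", "-d", "--name", app, tag],
    }
```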

Nine, daily work

1. What difficult problems have you encountered in daily work, and how did you troubleshoot them?

  • A weak Redis password led to a mining virus infection; troubleshooting and subsequent optimization
  • A program deployed on k8s started a process each time a user uploaded a file but failed to close it in time, so the node exceeded its maximum number of processes

2. Daily troubleshooting process

  • Read the alarm and quickly narrow down the faulty host, the affected service, and the scope of impact
  • Notify the operation and maintenance manager of the failure and start troubleshooting
  • If the fix requires modifying configuration files, restarting servers, or similar operations, inform the relevant developers
  • Complete the troubleshooting

3. Process for modifying the configuration file of an online business

  • Inform the operation and maintenance manager and the business's developers first
  • Back up the previous configuration files and test the change in the test environment
  • Modify the production environment configuration after the test passes
  • Observe whether the production environment is normal and whether any alarm fires
  • Complete the configuration file change

4. What is the business PV (page views)? What is the cluster size? How do you ensure high service availability?

Ten, open questions

1. What do you think is the difference between a junior operation and maintenance engineer and a senior operation and maintenance engineer?

2. What do you think is the future direction of O&M development?

Note: This article is reposted from the IT operation and maintenance technology circle. If there is any infringement, please get in touch to have it removed.

Origin blog.csdn.net/weixin_53678904/article/details/131826794