Automated Host Configuration Management with Ansible

image


image

About the Author

Wang Ao

Former DevOps expert at an exchange in Shanghai


Hello everyone. As you all know, Ansible is a configuration management tool; it has since been acquired and now comes with commercial support.

This is the 10th GOPS conference, and I have gone from attendee to speaker. I hope everyone here gets the chance to stand on this stage too.

image

I have deliberately kept the bar low for this talk; it is a fairly entry-level topic:

  • Five nines or four nines (99.999% or 99.99%) of you should be able to follow along and clearly understand what I do with Ansible and what our company does with it.

  • After listening, 80% of you should be able to use Ansible to reproduce this approach and build a management method that fits your own needs better. Why? Because Ansible is a very simple configuration management tool with a standardized scripting language. Once you know what we did, you can use Ansible to meet the same requirements.

  • The last one, 100% of the effort, is about balancing life. Operations work is painful and stressful, and balancing family and work takes time. Everyone here is young; I just got married myself and expect to run into plenty of problems. I hope all of you manage it better.

image

1. Introduction to the company and technical structure

My previous employer was an exchange in Shanghai; I am now at Sea in Singapore. What follows focuses mainly on the exchange's Ansible practice.

image

The first two points are self-explanatory; different industries have different focuses.

Third, we have retired our minicomputers (midrange Unix servers). Retiring them is a trend among financial institutions, and it makes standardization much easier.

Fourth, the internal and external networks are physically isolated. If you have worked with physical isolation, you know how high the management cost is. You also face a range of security and compliance requirements, so this is a constant balance between efficiency and security.

Fifth, we are gradually moving from mature foreign commercial integrated solutions toward in-house operations R&D built on domestic and open source software.

Sixth, DevOps and AIOps are popular right now, and the "R&D and Operation Integrated DevOps Standard System and Capability Maturity Model" has been accepted as an international standard. It is worth studying and using as a reference as you embrace these changes.

image

This is the exchange's previous solution. For deploying bare metal and virtual machines, the cost was, as we all know, high.

We are also gradually moving to OpenStack, and from plain Bash shell scripts to the Ansible project v1.0.

I hope that at the next operations conference I can invite my former colleagues to present the monitoring platform they developed in-house, which is also ready for commercial use.

image

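Why did we choose Ansible? Probably for the reasons you would expect: it is flexible, it is agentless, and it leaves you free to switch to another solution very easily.

Puppet and SaltStack have a steeper learning curve, and commercial products could not meet our customization needs, so building on top of Ansible became an ideal choice.

image

This is a chart I put together of popularity trends for configuration management (CM) tools. Ansible came out in 2012; it is a latecomer, but its activity is far higher than that of the others.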

image

2. A standardized Ansible learning path

These are some official learning materials we compiled earlier, together with our internal documents.

image

image

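3. Ansible in practice

By now I hope you understand what Ansible is for. I have two points here, and the first is standardization. Standardization is complicated but extremely important; what we focus on is mostly standardization at the software level. Standardization is a necessary and sufficient condition for automation.

So standardization matters. Ansible does not know your requirements, but once you have standards you can implement them quickly on top of Ansible. Even if you do not use Ansible, having standards lets you do much more.

Whether you run your own data center or use colocation, private cloud or public cloud, the one constant is the ability to automate operations. That ability will keep you from ever becoming obsolete.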

image

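3.1 Ansible in practice - Linux

With the minicomputers gone, our goal was to bring Linux and Windows under management.

I will run through the following fairly quickly. The details are not the most important part; the thinking behind them is.

First, the server side (the Ansible control node). Ansible is simple to set up: we built it on RHEL 6 and 7, with Python 2.7.14 on the server side.

Ansible changed a great deal from 2.4 onward, because its Windows support moved up a level. Before 2.4, even the official Ansible documentation said that Ansible was not built to manage Windows.

The managed hosts are mainly RHEL 5/6/7, including the latest releases. Then there is the baseline standard, which many companies want because it is the prerequisite for automation.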

image

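This is a mind map. As you can see, the structure is very simple: Ansible comes down to two parts, the inventory and the playbooks. There are four main functions, which we will cover later.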
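To make the two parts concrete, here is a minimal sketch of an inventory and a playbook. The host names, group names, and file names are hypothetical and are not taken from the exchange's project; the YAML inventory format assumes Ansible 2.4 or later.

```yaml
# inventory.yml (hypothetical hosts, grouped by OS release)
all:
  children:
    rhel7:
      hosts:
        app01.example.com:
        app02.example.com:
    rhel6:
      hosts:
        db01.example.com:
```

```yaml
# site.yml (hypothetical playbook that targets one inventory group)
- hosts: rhel7
  tasks:
    - name: Check that the managed hosts are reachable and have a usable Python
      ping:
```

Under these assumptions the pair would be run with `ansible-playbook -i inventory.yml site.yml`. The inventory answers "which hosts", the playbook answers "what to do on them".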

image

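That covers the server side. On the client side, generally only RHEL 5.5 needs simplejson installed; the other releases need nothing extra, which is very convenient.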
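One common way to handle the simplejson case is the `raw` module, which runs a command over SSH without needing Ansible's Python modules on the target. The sketch below assumes a hypothetical inventory group named `rhel5`, a yum repository that provides `python-simplejson`, and an Ansible release that still supports the Python shipped with RHEL 5.

```yaml
# bootstrap-rhel5.yml (hypothetical file and group names)
- hosts: rhel5
  gather_facts: no      # fact gathering itself needs a JSON library on the target
  become: yes
  tasks:
    - name: Install simplejson with a plain shell command over SSH
      raw: yum install -y python-simplejson
```

After this play succeeds, regular modules can be used against those hosts.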

image

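As you can see, these are some of the most basic plain command-line invocations, and they can be adjusted flexibly.

image

The configuration file is also simple. Our connection methods include traditional password authentication, and further down we made an optimization by enabling a cache.

Below are some of Ansible's variables, each followed by a detailed explanation. I consider these the basics; once you have read them alongside the official documentation, they will be clear.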

image

image

The first main function is user management, including creating users, adding trust relationships, changing passwords, and so on.
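A minimal sketch of those three user-management operations, using the `user` and `authorized_key` modules under their short names as in Ansible 2.x. The account name, key file path, and the `ops_password` variable are hypothetical; in practice the password would come from Ansible Vault or an extra variable.

```yaml
# users.yml (hypothetical names throughout)
- hosts: all
  become: yes
  vars:
    ops_user: opsadmin
    ops_pubkey: "{{ lookup('file', 'files/opsadmin.pub') }}"
  tasks:
    - name: Create the account
      user:
        name: "{{ ops_user }}"
        shell: /bin/bash
        state: present

    - name: Add SSH trust by installing the public key
      authorized_key:
        user: "{{ ops_user }}"
        key: "{{ ops_pubkey }}"
        state: present

    - name: Change the password (the user module expects a hash, not plain text)
      user:
        name: "{{ ops_user }}"
        # Without a fixed salt the hash differs on every run,
        # so this task always reports "changed".
        password: "{{ ops_password | password_hash('sha512') }}"
        update_password: always
```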

image

image

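The second is dynamic configuration. The third is patching: everyone has had a high-risk vulnerability appear out of nowhere and needed to assess and fix it quickly. Red Hat itself also uses Ansible as one of its standard approaches to patching.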
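For the patching case, a sketch along these lines updates a single affected package in small batches. The group name, batch size, and package name are hypothetical and stand in for whatever the advisory actually names.

```yaml
# patch.yml (hypothetical group, batch size, and package)
- hosts: rhel7
  become: yes
  serial: 5                 # patch five hosts at a time so a bad update is caught early
  tasks:
    - name: Update only the package named in the advisory
      yum:
        name: openssl
        state: latest
```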

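Finally, backup and restore is very important, because we cannot guarantee that a run produces the same result on every platform; surprises can happen. We once tested on virtual machines and then ran the same thing on new servers, where it broke them. We had no backup of the configuration files, so the servers had to be reinstalled. That is why backups matter.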
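Many file-editing modules accept `backup: yes`, which saves a timestamped copy of the file next to the original before changing it. A minimal sketch, with a hypothetical baseline item (disabling root SSH login) chosen only for illustration:

```yaml
# baseline-ssh.yml (hypothetical baseline item)
- hosts: rhel7
  become: yes
  tasks:
    - name: Disable root SSH login and keep a timestamped copy of the old file
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PermitRootLogin'
        line: 'PermitRootLogin no'
        backup: yes
      notify: restart sshd

  handlers:
    - name: restart sshd
      service:
        name: sshd
        state: restarted
```

If a run misbehaves on a new platform, the saved copy gives you something to restore from instead of reinstalling.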

These are some simple module usages. I do not think they need much explanation; you can use them as a reference afterwards.

image

image

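Conditional filtering in Ansible is very convenient, but it is also relatively inefficient. If you have your own CMDB, group the hosts in advance and run against those groups rather than having Ansible evaluate the conditions for you. Ansible makes this quick to write, but the price is execution efficiency.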
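The two approaches can be sketched side by side. The `rhel7` group is assumed to have been generated from a CMDB, and the package is a hypothetical stand-in for a real baseline item.

```yaml
# Option A: target every host and let Ansible decide per host.
# Facts must be gathered everywhere, and the task is evaluated (then skipped)
# on every host that does not match.
- hosts: all
  become: yes
  tasks:
    - name: Install chrony on RHEL 7 only
      yum:
        name: chrony
        state: present
      when:
        - ansible_os_family == "RedHat"
        - ansible_distribution_major_version == "7"

# Option B: target a group that was built in advance from the CMDB.
# No condition is needed, and fact gathering can be skipped.
- hosts: rhel7
  become: yes
  gather_facts: no
  tasks:
    - name: Install chrony
      yum:
        name: chrony
        state: present
```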

image

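3.2 Ansible in practice - Windows

Next is Windows. I believe few people use Ansible to manage Windows, because most have no need to. Our company does, so we had to grit our teeth and do it. The management approach is the same as on the Linux side; the only difference is that I recommend installing version 2.4, and the latest 2.6 supports it as well.

image

Here is a Windows security baseline; we wrote out every single item explicitly.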

image

The directory structure here is much the same as before. Installation differs slightly: if the Windows version is older than 2012, you need to upgrade PowerShell on the client.
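On the control side, Windows hosts are reached over WinRM rather than SSH, which is configured through connection variables. The sketch below assumes a hypothetical `windows` inventory group, a hypothetical service account, a vaulted password variable, and NTLM over HTTPS with a self-signed certificate; your environment may call for different transport settings.

```yaml
# group_vars/windows.yml (hypothetical account and vault variable)
ansible_user: ansible_svc
ansible_password: "{{ vault_win_password }}"
ansible_connection: winrm
ansible_port: 5986                              # WinRM over HTTPS
ansible_winrm_transport: ntlm
ansible_winrm_server_cert_validation: ignore    # only for self-signed certificates
```

With these variables in place, `ansible windows -i inventory.yml -m win_ping` is a quick way to confirm that the connection works before running any baseline tasks.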

image

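The configuration structure is similar to Linux. If you look closely later, you will see that everything is tied to Windows and to our baseline, and it is very easy to customize.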

image

image

image

image

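Users work the same way. For changing passwords, there is currently no particularly good approach: the task reports an error after the change even though the change actually succeeded. This is a pitfall.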

image

image

image

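The whole configuration workflow is consistent with Linux. Most of it is written with the native Windows modules, which support the registry, local security policy, advanced audit policy, and so on. Here is the conditional check on the Windows system; it supports Chinese, so you need not worry about it being unfriendly to Chinese text.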
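A sketch of one baseline item for each of the three areas mentioned, using the `win_regedit`, `win_security_policy`, and `win_audit_policy_system` modules. The specific settings and values are hypothetical examples, not the exchange's baseline; note that `win_audit_policy_system` needs Ansible 2.5 or later.

```yaml
# windows-baseline.yml (hypothetical baseline values)
- hosts: windows
  tasks:
    - name: Registry item, lock the session after 15 minutes of inactivity
      win_regedit:
        path: HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System
        name: InactivityTimeoutSecs
        data: 900
        type: dword

    - name: Local security policy item, minimum password length
      win_security_policy:
        section: System Access
        key: MinimumPasswordLength
        value: 12

    - name: Advanced audit policy item, audit logon success and failure
      win_audit_policy_system:
        subcategory: Logon
        audit_type: success, failure
```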

image

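4. Ansible v1.x project summary

The colleague who had led our Puppet work left the company, and after we restarted the automation platform project, Puppet was not pleasant to use either. Add in the many security and compliance requirements, and our final choice of Ansible came from weighing several dimensions.

The version we started with was older than 2.4, where Windows support was a mystery, and we had to push forward under that pressure. On the left is the code we wrote. The original Bash code was extremely complicated and was written by my manager alone. You have to appreciate how hard it is to also support two other operating systems, AIX and HP-UX, and he managed it. We wanted to build on that code, but it was painful; we could not understand it.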

image

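On the right is what we traditionally wrote up in Excel; all of that can now be converted to Ansible. Using it gives us several benefits. First, standardized usage scenarios: we run commodity x86 servers, and Ansible supports both physical and virtual machines.

Second, we are fortunate to have no minicomputers, and we also brought Windows under management. Third, and I think this is very important, we adapted to different configuration scenarios and did not stop at the baseline. Changes used to drag on for a long time and were complicated to make; now we can adjust flexibly at any time.

Fourth, Ansible releases are very stable and upgrading is very easy. You need not worry that upgrading after installation will be troublesome or complicated; it is not. You may only need to upgrade the server side.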

We have now brought together the needs of different roles and used Ansible to break down the walls between departments. We can implement cross-role requirements in quick iterations, and everyone can see the changes.

image

I believe nobody hesitates to use Ansible today, because the community is very active and the knowledge base is rich. Back then we could not see that clearly. What we were thinking about was how to break the ice and how to get the operations staff to like it and adapt to it.

The third, and most painful, point was how to convert the original baseline to Ansible so that even if I left, the rest of the team could pick it up quickly. I think this is the most important point, and it is why we use open source: it gives you standardized support.

The fourth point is that early versions of Ansible were not friendly to Windows, and dealing with that was painful for us. Support is good now, and you can use it with confidence. I have used Ansible to implement all the baseline requirements for Windows 7/2008/2012/10/2016, so there is no need to worry.

Fifth, when the internal network is isolated, there is no route to the external network. Ansible handles this situation very elegantly as well.

Sixth is how to patch Windows and Office. We need to install the patches required by policy and compliance. Our whole company runs on desktop virtualization, and even under domain-controlled management Ansible makes patch updates easy to roll out, in a way that is transparent to ordinary users.
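For Windows patching, the `win_updates` module drives the update agent already configured on each host. The sketch below assumes a hypothetical `windows` group and, for an isolated network, that the hosts are pointed at an internal update source; the categories are examples rather than the exchange's policy.

```yaml
# windows-patch.yml (hypothetical group and update categories)
- hosts: windows
  tasks:
    - name: Install security and critical updates
      win_updates:
        category_names:
          - SecurityUpdates
          - CriticalUpdates
        state: installed
      register: update_result

    - name: Reboot only when the installed updates require it
      win_reboot:
      when: update_result.reboot_required
```

Separating the reboot into its own conditional task makes it easy to move reboots into a maintenance window.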

image

5. Ansible v2.0 project

This was originally just a plan. The company in Singapore has now largely implemented it, and I hope to have a chance to share it in the coming year.

image

6. Friendly Promotion

Here is a friendly plug. I hope more people will stand up and share, because the exchange seems rather mysterious from the outside, yet it does a lot of professional work behind the scenes. For example, LightCam is a monitoring platform developed entirely in-house and ready for commercial use, and the AOP automated operations platform has successfully applied for national patents. We hope more scenarios can be automated. People will always make mistakes, but it is those mistakes that push us to improve our methods.

image

On the project implementation side, you may be working on similar things. One of our projects compares a cross-version upgrade of View with Huawei's FusionCloud. Let's learn and make progress together.



Origin blog.51cto.com/15127563/2664942