This article is licensed under "Attribution 4.0 International (CC BY 4.0)". You are welcome to reprint or adapt it, but you must credit the source.

Author: Su Yang

Created: April 10, 2019 Word count: 7143 Reading time: 15 minutes Link to this article: https://soulteary.com/2019/04/10/gitlab-was-built-with-docker-and-traefik-part-2.html

Use Docker and Traefik to build GitLab (part 2)
The previous article mentioned that some GitLab security configuration issues still needed to be covered. This article briefly discusses how to strengthen a GitLab code repository deployed on the public network.

Background of the problem
Generally speaking, we recommend using GitLab in an intranet environment. Relatively large and complex software such as GitLab holds a lot of sensitive information, and network isolation naturally removes much of the attack surface.

If external access is required, it is recommended to provide restricted access through something like a dedicated tunnel.

As mentioned earlier, the requirements this time are exactly the awkward case:

A dedicated traffic tunnel cannot be used.
An intranet cannot be built.
Public network access has to be provided. That is fine; we will deal with each problem as it comes.

Analyze the fundamentals of the application environment
Setting aside the more devious attack scenarios for now, we can first roughly classify the areas where security can be affected:

System layer (host environment)
Network layer (scenarios with traffic interaction)
Application layer (software runtime environment and related configuration)
User layer (day-to-day software usage)

For two reasons, length and focus, this article concentrates on security reinforcement of the "application layer".

System Reinforcement
There is plenty of material on system reinforcement on the Internet, and I have also covered it in earlier articles; read them if you are interested. To make searching easier, here is a brief list of the conventional measures:

Audit logs for sensitive ports.
Change sensitive system ports, such as the SSH port, and consider opening them only on demand.
Restrict which users can log in and how they log in (for example through a bastion host or key-based authentication), and create an ordinary user to replace root for daily operations.
Purchase cloud security services and hook up monitoring and alerting.

Network reinforcement
Network reinforcement covers a lot of ground; we will approach it from the following points.

Encrypted traffic transmission
Network reinforcement follows one simple principle: except for local traffic, all traffic that can be encrypted with SSL should be transmitted over SSL, including:

System calls across hosts
Calls between applications and databases

Configuring SSL does increase deployment and coding costs. If machine resources are tight, some performance may be lost, and there may be extra expense as well:

SSL certificates used by enterprises are paid for annually and can be very expensive.
If you think the cost of purchasing a certificate is too high, you can use a self-signed certificate instead.

In the long run the various costs of deploying SSL are a negligible one-time investment, while security is the foundation and the bottom line; skipping encryption is not worth the risk.
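
As a rough illustration of the self-signed route, the sketch below wraps the standard `openssl req -x509` command. The function name `make_self_signed_cert`, the host name `gitlab.example.com`, the output directory, and the one-year validity are all assumptions chosen for the example, not values from this setup.

```shell
#!/bin/sh
# Sketch: create a self-signed certificate for one host name.
# Everything here (function name, host, paths, validity) is illustrative.
make_self_signed_cert() {
    host=$1      # common name to put in the certificate
    out_dir=$2   # directory that receives key.pem and cert.pem
    mkdir -p "$out_dir" || return 1
    # -x509 emits a self-signed certificate instead of a signing request;
    # -nodes leaves the private key unencrypted so a server can load it unattended.
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$out_dir/key.pem" -out "$out_dir/cert.pem" \
        -days 365 -subj "/CN=$host" 2>/dev/null
}

# Example call (hypothetical host and directory):
# make_self_signed_cert gitlab.example.com ./certs
```

Every client that talks to the service then has to be told to trust the resulting cert.pem, which is the price paid for skipping a purchased certificate.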

Avoid public DNS resolution
When talking about network services, one thing is often overlooked: DNS resolution.

If your application only serves a small number of people, consider not creating a record in public DNS at all and providing the service only to users who bind the domain in their hosts file.

Combined with Traefik's service discovery, if the other party does not know your service's domain name, then even if your site is scanned by IP, the request only returns 404 Not Found.
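
For the hosts-binding approach, each user adds one line to their hosts file. A minimal sketch, assuming a hypothetical server IP `203.0.113.10` and domain `gitlab.example.com` (neither comes from this setup); the helper name `hosts_entry` is also made up:

```shell
#!/bin/sh
# Sketch: print the hosts-file line that maps a private domain to the server.
# The helper name, the IP and the domain are placeholders.
hosts_entry() {
    ip=$1
    domain=$2
    printf '%s\t%s\n' "$ip" "$domain"
}

# Print the line; each user appends it to /etc/hosts themselves (needs root).
hosts_entry 203.0.113.10 gitlab.example.com

# To compare what a scanner sees with what a user sees (commented out because
# it needs a live server):
#   curl -sk -o /dev/null -w '%{http_code}\n' https://203.0.113.10/
#   curl -sk -o /dev/null -w '%{http_code}\n' -H 'Host: gitlab.example.com' https://203.0.113.10/
```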

Adding network request verification
The previous measure, not exposing a public domain name, already defeats most scanners. But against a targeted attack, this trick is not effective.

At this point it is recommended to put basic user authentication in front of our web system: Basic Auth.

Adding this layer of verification with Traefik is easy and only takes the following two lines:


- "traefik.gitlab.frontend.auth.basic=${BASIC_AUTH}"
- "traefik.gitlab.frontend.auth.basic.removeHeader=true"

These two lines of configuration do the following:

The first line tells Traefik that we want to use Basic authentication and which username and password to check against.
The second line tells Traefik that this authentication is only used when traffic enters Traefik and that the header should not be passed on to the application, which avoids other trouble (for example, applications such as Confluence use the Authorization header in the HTTP request as their own login credentials).

Of course, you also need to create a .env environment configuration file, for example:

BASIC_AUTH=soulteary:$apr1$rgGAffTk$vDZ1tL03og0nZ8XlCfdv80


If you are curious how this string was generated, you can find the answer in the article on building Confluence with Docker.
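
One common way to produce a string of this shape is the Apache MD5 (`apr1`) scheme, which `openssl passwd -apr1` can compute. This is a sketch of that general technique, not necessarily the method described in the linked article; the function name `make_basic_auth` and the sample credentials are made up.

```shell
#!/bin/sh
# Sketch: build a "user:hash" pair in the htpasswd apr1 format.
make_basic_auth() {
    user=$1
    password=$2
    # openssl picks a random salt, so the hash differs on every run.
    hash=$(openssl passwd -apr1 "$password") || return 1
    printf '%s:%s\n' "$user" "$hash"
}

# Example (hypothetical credentials); paste the output after BASIC_AUTH= in .env:
# make_basic_auth soulteary 'change-me'
```

Passing the password as an argument leaves it visible in the shell history and process list, so this form is only suitable for a trusted machine.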

A relatively complete configuration reference is given below:

labels:
  - "traefik.enable=true"
  # GitLab web service
  - "traefik.gitlab.frontend.auth.basic.removeHeader=true"
  - "traefik.gitlab.frontend.auth.basic=${BASIC_AUTH}"
  - "traefik.gitlab.port=80"
  - "traefik.gitlab.frontend.rule=Host:gitlab.${BASEHOST}"
  - "traefik.gitlab.frontend.entryPoints=http,https"
  - "traefik.gitlab.frontend.headers.SSLProxyHeaders=X-Forwarded-For:https"
  - "traefik.gitlab.frontend.headers.STSSeconds=315360000"
  - "traefik.gitlab.frontend.headers.browserXSSFilter=true"
  - "traefik.gitlab.frontend.headers.contentTypeNosniff=true"
  - "traefik.gitlab.frontend.headers.customrequestheaders=X-Forwarded-Ssl:on"
  - "traefik.gitlab.frontend.passHostHeader=true"
  - "traefik.gitlab.frontend.passTLSCert=false"


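**Use a floating IP**
What if the other party does not just try to break in, but also wants to keep you from using the system for a while, for example by hitting you with an outrageous DDoS attack?

As the party under attack, you can use a floating IP to lower the cost of switching IPs at the moment the attack hits and slip away quickly. This works even better together with a CDN service that supports dynamic acceleration.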

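**Application layer**
The application layer also involves a mixed bag of tasks, so let us go through them one at a time.

**User-side traffic encryption**
It is recommended that the system serve no plain HTTP traffic at all, so that user-side traffic cannot be hijacked and exploited.

All traffic leaving for the public network should go over HTTPS. If you also use the Traefik setup mentioned earlier, this is already taken care of by default (see the configuration just above).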

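**Connecting Prometheus performance monitoring**
If you have high availability requirements, you can follow the official documentation to connect Prometheus performance monitoring. If Prometheus is new to you, browse the official online demo first. Covering this properly would take several articles, so I will skip it for now.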

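**Handling CI Runners**
CI may look like an "add-on" that comes when called and leaves when dismissed, but in practice it can cause a denial of service through overly frequent invocation, leak sensitive information through careless CI configuration, or serve as a springboard for an attack that damages production code.

For monitoring GitLab CI Runner activity I recommend timoschwarzer/gitlab-monitor. That said, if you have push notifications configured in the system and only a small number of projects, a push message to your phone may be more convenient and faster.

For CI Runners, offer as few shell-mode Runners as possible and prefer container-mode Runners, which reduces the chance of a Runner job breaking into the host.

In addition, put extra limits on which branches and repositories can trigger a Runner, and avoid overly frequent Runner executions that would work the host machine to death.

Finally, environment variables and configuration used in a Runner should be obtained through encrypted environment variables rather than hard-coded in plain text in configuration files or code. GitLab does this part well; take a look if you are interested.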
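
As a small illustration of the last point, a job script can insist that its secrets arrive through the environment (for example from CI/CD variables defined in the GitLab project settings) and refuse to run otherwise. The helper `require_env` and the variable name `DEPLOY_TOKEN` are hypothetical.

```shell
#!/bin/sh
# Sketch: fail early when a secret was not injected into the job environment.
require_env() {
    for name in "$@"; do
        # printenv prints the value of an exported variable, or nothing if unset.
        if [ -z "$(printenv "$name")" ]; then
            echo "missing required CI variable: $name" >&2
            return 1
        fi
    done
}

# In a job script (DEPLOY_TOKEN is a made-up name):
# require_env DEPLOY_TOKEN || exit 1
```

Keeping the check at the top of the script means a misconfigured job stops before it does any work, and the secret itself never has to appear in the repository.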

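**Monitoring the GitLab SSH port**
Because we let users clone and push code over SSH, the publicly reachable SSH port is exposed to attack.

Below is the port log (cat logs/sshd/current) of a code repository that has been running on the public network for a long time. I have excerpted a representative part of the log and removed the timestamps: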

Invalid user admin from 179.53.182.234
Invalid user user from 183.89.94.13
Invalid user admin from 36.236.233.142
Invalid user admin from 143.255.154.219
Invalid user admin from 85.57.5.107
Invalid user ubnt from 85.57.5.107
Invalid user admin from 85.57.5.107
Invalid user admin from 156.223.73.14
Invalid user support from 200.145.6.88
Invalid user admin from 152.231.118.191
Invalid user user from 171.228.172.27
Invalid user admin from 27.66.79.45
Invalid user Admin from 117.0.57.69
Invalid user admin from 14.207.231.218
Invalid user gitlab from 121.71.20.66
Invalid user admin from 183.157.173.121
Invalid user guest from 37.214.104.206
Invalid user admin from 197.32.190.120
Invalid user Administrator from 125.34.196.43
Invalid user Administrator from 125.34.196.43
Invalid user \243\254git from 112.87.206.54
Invalid user \243\254git from 112.87.206.54
Invalid user admin from 116.118.104.96
Invalid user admin from 14.186.202.33
Invalid user admin from 113.172.217.15
....


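As you can see, a large number of scanners are quietly "keeping an eye on" your system security for you. It is no exaggeration to say that once you leave an opening, your machine and your application are no longer yours; the owners of these scanners can openly use your machine, play with your system, and bully your users...

How can we fend off this kind of malicious scanner? Writing a simple log-detection script actually solves a large part of the problem.

Take the script below: after consulting another article on the topic, I updated it to fit my actual situation so that it can handle GitLab's log format.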

#!/bin/bash

# Maximum number of allowed wrong attempts
LIMIT=3

# Log file to be analyzed
SCAN_LOG="/data/gitlab/logs/sshd/current"

# Record of blocked IPs
LOGFILE="/data/gitlab/logs/bad_gay_22_port.log"

# Log timestamp prefix to match, for example 2019-04-10_12
TIME=$(date '+%Y-%m-%d_%H')

# Scan the current GitLab log for failed logins in the current hour, count them
# per IP, and keep the IPs that exceed the allowed number of attempts.
# $(NF-3) assumes the IP is the fourth field from the end of each log line;
# adjust it if your sshd log format differs.
BLOCK_IP=$(grep "$TIME" "$SCAN_LOG" | grep "Invalid user" | awk '{print $(NF-3)}' | sort | uniq -c | awk -v limit="$LIMIT" '$1 > limit {print $1":"$2}')

for i in $BLOCK_IP
do
    IP=$(echo "$i" | awk -F: '{print $2}')

    # Check whether the IP has already been blocked
    iptables-save | grep INPUT | grep DROP | grep -F -- "$IP" >/dev/null
    # If it has not been blocked yet, block it
    if [ $? -gt 0 ]; then
        iptables -A INPUT -s "$IP" -p tcp --dport 22 -j DROP
        NOW=$(date '+%Y-%m-%d %H:%M')
        echo -e "$NOW : $IP" >>"${LOGFILE}"
    fi
done
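
The counting step can be tried on its own, without touching iptables. The sketch below is a variation rather than the script itself: it reads log lines from standard input and picks the address that follows the word `from`, which matches the log excerpt shown earlier. The function name `offending_ips` is an assumption for the example.

```shell
#!/bin/sh
# Sketch: read sshd log lines on stdin and print every source IP that appears
# in more than LIMIT "Invalid user" lines.
offending_ips() {
    limit=$1
    grep 'Invalid user' \
        | awk '{ for (i = 1; i < NF; i++) if ($i == "from") { print $(i + 1); break } }' \
        | sort | uniq -c \
        | awk -v limit="$limit" '$1 > limit + 0 { print $2 }'
}

# Dry run against the real log (path taken from the script in this article):
# offending_ips 3 < /data/gitlab/logs/sshd/current
```

Running it as a dry run first shows which addresses would be blocked before any firewall rule is added.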


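Save the content above as gitlab_ssh.sh, then make the script executable.

chmod 755 gitlab_ssh.sh

Next, put the script in the GitLab application directory (or anywhere convenient for you to manage), for example /data/gitlab/gitlab_ssh.sh.

Finally, add the script to crontab and run it every 10 minutes (adjust to your own situation).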

echo "*/10 * * * * root /data/gitlab/gitlab_ssh.sh" >>/etc/crontab


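If all goes well, any scanner of this kind that shows up from now on will get to thrash about for at most 10 minutes or so.

As for the script's track record, you can check the log file /data/gitlab/logs/bad_gay_22_port.log: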

2019-04-10 19:33 : 113.172.217.15
2019-04-10 19:33 : 14.186.202.33


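**Monitoring web interface logins**
We already added access authorization at the network layer, but what if that password leaks and you face a targeted attack, such as weak-password scanning against the web interface or the application API?

Using fail2ban reduces the impact of this kind of thing. A reference filter configuration is given below.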

[Init]
maxlines = 6

[Definition]
# The relevant log file is in /var/log/gitlab/gitlab-rails/production.log
# Note that a single failure can appear in the logs up to 3 times with just one login attempt. Adjust your maxfails accordingly.

# Example fail - login via GUI
#Started GET "/" for 10.0.0.91 at 2016-10-25 00:01:24 +0200
#Processing by RootController#index as HTML
#Completed 401 Unauthorized in 69ms (ActiveRecord: 23.7ms)

# Example fail - clone repo via https
#Started GET "//chmielu/test.git/info/refs?service=git-upload-pack" for 10.0.0.91 at 2016-10-25 00:01:09 +0200
#Processing by Projects::GitHttpController#info_refs as */*
#  Parameters: {"service"=>"git-upload-pack", "namespace_id"=>"chmielu", "project_id"=>"test.git"}
#Filter chain halted as :authenticate_user rendered or redirected
#Completed 401 Unauthorized in 50ms (Views: 0.8ms | ActiveRecord: 8.1ms)

failregex = ^Started .* for <HOST> at .*<SKIPLINES>Completed 401 Unauthorized

ignoreregex =
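
Before enabling the filter it can help to see which addresses it would flag. The sketch below approximates the failregex with awk: it remembers the host from each `Started ... for <host>` line and counts it when the next `Completed` line reports 401. It ignores the `maxlines` window, and the function name `failed_login_hosts` is made up, so treat it as a rough preview rather than an equivalent of fail2ban's matching.

```shell
#!/bin/sh
# Sketch: count 401 responses per client address in a Rails production log
# read from stdin. Output is "count host", highest count first.
failed_login_hosts() {
    awk '
        /^Started / {
            host = ""
            for (i = 1; i < NF; i++) {
                if ($i == "for") { host = $(i + 1); break }
            }
        }
        /^Completed / {
            # Only a 401 that follows a Started line is attributed to that host.
            if ($2 == "401" && host != "") count[host]++
            host = ""
        }
        END { for (h in count) print count[h], h }
    ' | sort -rn
}

# Preview against the log file named in the filter comments:
# failed_login_hosts < /var/log/gitlab/gitlab-rails/production.log
```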



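**User layer**
There are actually not many problems at the user level, provided you can commit to sticking with the following measures:

Follow official releases and the changelog, and update the application promptly to reduce exposure to XSS issues and CVE vulnerabilities.
Grant the minimum necessary permissions to reduce the risk of mistaken authorization.
In the system settings, make all projects private, so that the incident a certain cloud platform suffered is not repeated.
Avoid adding too many global Admin roles; assign administrators per group and per project instead.
Only allow code to be cloned and pushed over SSH, and prefer key-based authentication for interacting with the system.
Minimize interaction with external systems: for example, only allow importing external repositories from the sources you consider necessary, and only call services you consider safe and reliable.
Disable the default sign-up flow and use invitations, or register users through SSO/LDAP.
Apply user rate limits (a built-in system feature) according to your actual situation.
Require your users to use randomly generated strong passwords and to change them regularly.

**Finally**
That is all for now on building GitLab on the public network with containers. As for performance monitoring, let me finish the WordPress series I still owe first, and then we can talk about it in detail.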
