GitLab Workhorse
The previous article introduced GitLab's basic functionality and architecture, but it only described the responsibilities of each component and did not explain in detail how a user's request is processed. This section briefly introduces what gitlab-workhorse does.
First, the overall picture: GitLab uses Nginx as the front end to proxy HTTP/HTTPS requests to gitlab-workhorse, which then forwards them to the Unicorn web server. By default, gitlab-workhorse communicates with the front end over a Unix domain socket, but it can also receive requests over TCP. GitLab uses the Unicorn web server to serve dynamic web pages and the API.
1. Nginx as the entry point
As the architecture diagram shows, Nginx is the first stop for HTTP/HTTPS requests entering GitLab.
After downloading the official GitLab CE source, go to ${gitlab-ce root}/lib/support/nginx and open gitlab-ssl to see the Nginx configuration.
By default, GitLab redirects HTTP requests to HTTPS:
## Redirects all HTTP traffic to the HTTPS host
server {
    listen 0.0.0.0:80;
    listen [::]:80 ipv6only=on default_server;
    server_name YOUR_SERVER_FQDN;
    server_tokens off;
    return 301 https://$http_host$request_uri;
    access_log /var/log/nginx/gitlab_access.log gitlab_ssl_access;
    error_log /var/log/nginx/gitlab_error.log;
}
The HTTPS configuration is more involved, so only one highlight is called out here: under location /, the directive proxy_pass http://gitlab-workhorse; shows that, apart from a few static error pages, Nginx passes almost every HTTP/HTTPS request on to the gitlab-workhorse component (over a Unix socket).
A Unix socket is a form of inter-process communication that uses the socket API. It does not need the packing and unpacking or the checksum calculation and verification of network traffic, and it bypasses the network protocol stack, so it is safe and reliable. Technically it is a socket of type AF_UNIX (also called AF_LOCAL), known as a Unix domain socket, and it is used for local communication, that is, IPC. Instead of an IP address and port, it is addressed by a file path:
upstream gitlab-workhorse {
    # GitLab socket file,
    # for Omnibus this would be: unix:/var/opt/gitlab/gitlab-workhorse/socket
    server unix:/home/git/gitlab/tmp/sockets/gitlab-workhorse.socket fail_timeout=0;
}
...
## HTTPS host
server {
    listen 0.0.0.0:443 ssl;
    listen [::]:443 ipv6only=on ssl default_server;
    server_name YOUR_SERVER_FQDN;
    server_tokens off;
    ssl on;
    ssl_certificate /etc/nginx/ssl/gitlab.crt;
    ssl_certificate_key /etc/nginx/ssl/gitlab.key;
    ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 5m;
    real_ip_recursive off; ## If you enable 'on'
    access_log /var/log/nginx/gitlab_access.log gitlab_ssl_access;
    error_log /var/log/nginx/gitlab_error.log;
    location / {
        client_max_body_size 0;
        gzip off;
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_redirect off;
        proxy_http_version 1.1;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Ssl on;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade_gitlab_ssl;
        proxy_pass http://gitlab-workhorse;
    }
    error_page 404 /404.html;
    error_page 422 /422.html;
    error_page 500 /500.html;
    error_page 502 /502.html;
    error_page 503 /503.html;
    location ~ ^/(404|422|500|502|503)\.html$ {
        root /home/git/gitlab/public;
        internal;
    }
}
2. GitLab-workhorse
So what is gitlab-workhorse? The official explanation is that it is a smart reverse proxy for GitLab that handles "heavy" HTTP requests such as file uploads and downloads, Git push/pull and Git archive downloads. In practice it does somewhat more than that.
+-------+  +------------------+  +---------+
|       |  |                  |  |         |
| NGINX +->| gitlab-workhorse +->| Unicorn |
|       |  |                  |  |         |
+-------+  +------------------+  +---------+
In the list below, "Rails" refers to the GitLab Rails application running on the Unicorn web server:
- Workhorse can handle some requests without involving Rails at all, such as static JS/CSS resource files.
- Workhorse can modify responses sent by Rails. For example, if Rails uses send_file, gitlab-workhorse opens the file on disk and sends its contents to the client as the response body.
- Workhorse can take over a request after asking Rails for permission. For example, a git clone requires the client's permissions to be confirmed first; once Rails has confirmed them, Workhorse takes over the git clone request.
- Workhorse can modify a request before passing it to Rails. For example, when handling a Git LFS upload, Workhorse first asks Rails whether the current user has permission, then stores the request body in a temporary file, and then sends Rails a modified request that contains the temporary file path.
- Workhorse can manage long-lived WebSocket connections on behalf of Rails.
- Workhorse does not connect to the database directly; it communicates only with Rails and (optionally) Redis.
- All requests that reach Workhorse are assumed to have been forwarded by an upstream proxy (Nginx).
- Workhorse does not accept HTTPS connections.
- Workhorse does not clean up idle client connections.
- All requests to Rails are assumed to pass through Workhorse.
For example, Unicorn is relatively inefficient at serving static resource files, so these are handed to Workhorse. For reasons of space, only one example is covered here: gzip-compressed resource files.
The function func (s *Static) ServeExisting in ${gitlab-workhorse root}/internal/staticpages/servefile.go defines how Workhorse handles static resource files.
Suppose a request arrives for the relative URL /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js. Because it carries the /assets prefix, Workhorse handles it as a static resource request, as the following code shows:
// ${gitlab-workhorse root}/internal/upstream/routes.go
// Serve assets
route(
	"", `^/assets/`,
	static.ServeExisting(
		u.URLPrefix,
		staticpages.CacheExpireMax,
		NotFoundUnless(u.DevelopmentMode, proxy),
	),
	withoutTracing(), // Tracing on assets is very noisy
),
The following figure shows a request for the static JS file /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js, which is served gzip-compressed.
If the request carries the header Accept-Encoding: gzip, Workhorse reads the gzip file that corresponds to the requested static resource (static resource files are pre-compressed on the server) and sends the compressed content to the browser (not directly; the response still passes through the Nginx server). When the browser receives the response, it checks whether the content is compressed and decompresses it if so. If the request headers do not indicate gzip support, Workhorse reads the uncompressed source file instead.
// ${gitlab-workhorse root}/internal/staticpages/servefile.go
// ... some code omitted
file := filepath.Join(s.DocumentRoot, prefix.Strip(r.URL.Path))
// ... some code omitted
// Serve pre-gzipped assets
if acceptEncoding := r.Header.Get("Accept-Encoding"); strings.Contains(acceptEncoding, "gzip") {
	content, fi, err = helper.OpenFile(file + ".gz")
	if err == nil {
		w.Header().Set("Content-Encoding", "gzip")
	}
}
Judging from the figure, the .gz file is less than 1/3 the size of the source file, which greatly reduces the server's network bandwidth usage.
Note also that when static resource files are requested, the request is not forwarded to the Unicorn web server; Workhorse handles it itself. This is the main reason the Workhorse component exists: to make up for the weaknesses of the Unicorn web server. It brings a well-known saying to mind:
Any problem in computer science can be solved by another layer of indirection
In other words, the same well-worn trick of adding an intermediate layer shows up everywhere.
Finally, a summary: the Workhorse component was first created to solve the problem of git-over-HTTP/HTTPS timeouts. Put differently, Workhorse handles the requests that the Unicorn server is not good at, while requests such as dynamic page rendering are proxied by Workhorse to Unicorn, which is good at handling them. The Nginx server at the very front is mainly there for purposes such as HTTPS configuration.
Appendix
Reference links
GitLab Workhorse official repository
Reproduced from: https://juejin.im/post/5cf680f86fb9a07ed5248cda