GitLab Series 2: GitLab Workhorse

GitLab Workhorse

The previous article introduced GitLab's basic functionality and architecture, but it only covered the responsibilities of the individual components and did not explain in detail how a user's request is processed. This article briefly looks at what gitlab-workhorse does.


First, the overview: GitLab uses Nginx as the front end to proxy HTTP/HTTPS requests to gitlab-workhorse, and gitlab-workhorse then forwards the requests to the Unicorn web server. The front end and gitlab-workhorse communicate over a Unix domain socket by default, although forwarding requests over TCP is also supported. GitLab uses the Unicorn web server to serve the dynamic web pages and the API.

1. The Nginx entry point

As the architecture diagram shows, the first stop for an HTTP/HTTPS request entering GitLab is Nginx.

After downloading the official GitLab CE source, go to ${gitlab-ce root directory}/lib/support/nginx and open gitlab-ssl to see the Nginx configuration.


By default, GitLab redirects HTTP requests to HTTPS:

## Redirects all HTTP traffic to the HTTPS host
server {
  listen 0.0.0.0:80;
  listen [::]:80 ipv6only=on default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;
  return 301 https://$http_host$request_uri;
  access_log  /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log   /var/log/nginx/gitlab_error.log;
}

The HTTPS configuration is more involved, so here is just the highlight: under location /, the line proxy_pass http://gitlab-workhorse; means that, apart from a few static pages, Nginx passes almost every HTTP/HTTPS request on to the gitlab-workhorse component (over a Unix socket).

A Unix socket is a socket-based form of inter-process communication. It needs no packing and unpacking of network packets, no checksum calculation or verification, and no trip through the network protocol stack, which makes it safe and reliable. A Unix socket is a socket of type AF_UNIX (also called AF_LOCAL), known as a Unix domain socket. It is used for local communication, that is, IPC, and is addressed by a file path instead of an IP address and port.

upstream gitlab-workhorse {
  # GitLab socket file,
  # for Omnibus this would be: unix:/var/opt/gitlab/gitlab-workhorse/socket
  server unix:/home/git/gitlab/tmp/sockets/gitlab-workhorse.socket fail_timeout=0;
}
...
## HTTPS host
server {
  listen 0.0.0.0:443 ssl;
  listen [::]:443 ipv6only=on ssl default_server;
  server_name YOUR_SERVER_FQDN;
  server_tokens off;

  ssl on;
  ssl_certificate /etc/nginx/ssl/gitlab.crt;
  ssl_certificate_key /etc/nginx/ssl/gitlab.key;

  ssl_ciphers "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-DES-CBC3-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4";
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  ssl_prefer_server_ciphers on;
  ssl_session_cache shared:SSL:10m;
  ssl_session_timeout 5m;
  real_ip_recursive off;    ## If you enable 'on'
  access_log  /var/log/nginx/gitlab_access.log gitlab_ssl_access;
  error_log   /var/log/nginx/gitlab_error.log;

  location / {
    client_max_body_size 0;
    gzip off;
    proxy_read_timeout      300;
    proxy_connect_timeout   300;
    proxy_redirect          off;

    proxy_http_version 1.1;

    proxy_set_header    Host                $http_host;
    proxy_set_header    X-Real-IP           $remote_addr;
    proxy_set_header    X-Forwarded-Ssl     on;
    proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
    proxy_set_header    X-Forwarded-Proto   $scheme;
    proxy_set_header    Upgrade             $http_upgrade;
    proxy_set_header    Connection          $connection_upgrade_gitlab_ssl;

    proxy_pass http://gitlab-workhorse;
  }

  error_page 404 /404.html;
  error_page 422 /422.html;
  error_page 500 /500.html;
  error_page 502 /502.html;
  error_page 503 /503.html;
  location ~ ^/(404|422|500|502|503)\.html$ {
    root /home/git/gitlab/public;
    internal;
  }
}

2. GitLab-workhorse

So what is gitlab-workhorse? The official explanation is that it is a smart reverse proxy for GitLab that handles "heavy" HTTP requests such as file uploads and downloads, Git push/pull, and Git archive downloads. In practice it does rather more than that.

+-------+  +------------------+  +---------+
|       |  |                  |  |         |
| NGINX +->| gitlab-workhorse +->| Unicorn |
|       |  |                  |  |         |
+-------+  +------------------+  +---------+

In the points below, "Rails" refers to the Rails application running on the Unicorn web server:

  1. Workhorse can handle some requests without involving Rails at all, such as static JS/CSS resource files.
  2. Workhorse can modify responses sent by Rails. For example, if Rails uses send_file, gitlab-workhorse opens the file on disk and sends its contents to the client as the response body.
  3. Workhorse can take over a request after asking Rails for permission. For example, handling git clone requires checking the client's permissions first; once Rails has confirmed them, Workhorse takes over the git clone request.
  4. Workhorse can modify a request before passing it to Rails. For example, when handling a Git LFS upload, Workhorse first asks Rails whether the user has permission, then stores the request body in a temporary file, and then sends Rails a modified request that contains the temporary file path.
  5. Workhorse can manage long-lived WebSocket connections on behalf of Rails.
  6. Workhorse does not connect to the database directly; it communicates only with Rails and (optionally) Redis.
  7. All requests that reach Workhorse are assumed to have been forwarded by an upstream proxy such as Nginx.
  8. Workhorse does not accept HTTPS connections.
  9. Workhorse does not clean up idle client connections.
  10. All requests to Rails are assumed to pass through Workhorse.

For example, Unicorn handles static resource files relatively inefficiently, so they are handed to Workhorse. To keep this article short, only one example is examined here: gzip-compressed resource files.

The function func (s *Static) ServeExisting in ${gitlab-workhorse root directory}/internal/staticpages/servefile.go defines how Workhorse handles static resource files.

Suppose the request has the relative URL /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js. Because the path starts with the assets prefix, Workhorse handles it as a static resource request, as the code shows:

// ${gitlab-workhorse root directory}/internal/upstream/routes.go
// Serve assets
route(
    "", `^/assets/`,
    static.ServeExisting(
        u.URLPrefix,
        staticpages.CacheExpireMax,
        NotFoundUnless(u.DevelopmentMode, proxy),
    ),
    withoutTracing(), // Tracing on assets is very noisy
),

The figure below shows that the request for the static JS file /assets/locale/zh_CN/app-3396bd500e53f89d971d8c31ba7275f1c9ae2899062d4a7aeef14339084f44bd.js is served with gzip:


If the request carries the header Accept-Encoding: gzip, Workhorse reads the gzip file that corresponds to the requested static resource (the server stores pre-compressed copies of its static files) and sends the compressed content to the browser (not directly; the response still passes through the Nginx server). When the browser receives the response, it checks whether the content is compressed and decompresses it if so. If the request header does not indicate gzip support, Workhorse reads the source file instead.


// ${gitlab-workhorse root directory}/internal/staticpages/servefile.go
// ... some code omitted
file := filepath.Join(s.DocumentRoot, prefix.Strip(r.URL.Path))
// ... some code omitted
// Serve pre-gzipped assets
if acceptEncoding := r.Header.Get("Accept-Encoding"); strings.Contains(acceptEncoding, "gzip") {
    content, fi, err = helper.OpenFile(file + ".gz")
    if err == nil {
        w.Header().Set("Content-Encoding", "gzip")
    }
}


As the figure suggests, the .gz file is less than one third the size of the source file, which greatly reduces the server's network bandwidth usage.


Note also that when a static resource file is requested, the request is not forwarded to the Unicorn web server; Workhorse handles it itself. This is the main reason the Workhorse component exists: to make up for the shortcomings of the Unicorn web server. It brings a well-known saying to mind:

Any problem in computer science can be solved by another layer of indirection

In other words, adding an intermediate layer solves the problem; it really is the same old trick every time.

Finally, a summary: the Workhorse component was first created to solve the git-over-HTTP/HTTPS timeout problem. Put differently, Workhorse handles the requests that the Unicorn server is not good at, while requests such as dynamic page rendering are proxied by Workhorse to Unicorn, which handles those well. The Nginx server at the very front is mainly there for things like HTTPS configuration.

Appendix

Reference links

GitLab Workhorse official repository


Reproduced from: https://juejin.im/post/5cf680f86fb9a07ed5248cda

Origin blog.csdn.net/weixin_34326558/article/details/91479012