Concurrent requests to the GitLab API return 500 Internal Server Error

Configuration Environment

GitLab 14.9, deployed with Docker, using the built-in PostgreSQL database.

Problem background

Recently I noticed that when our in-house system calls the GitLab API, concurrent requests for the same resource (in my case, creating the same subgroup concurrently) fail with a 500 error. At first I assumed the 500 was caused by bad data from our own code and did not pay much attention to it. The problem has recently reappeared, so I decided to track down the cause.

Simulate the error-generating environment

To verify that concurrent calls really are the cause, I ran a small stress test: 5 threads started within 0.01 seconds, each requesting creation of the same subgroup.
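
A test of this kind can be sketched roughly as follows. This is not the original test script: `GITLAB_URL`, `TOKEN`, `create_subgroup`, and `fire_concurrently` are hypothetical names, and the standard library's urllib is used only to keep the sketch free of dependencies. A barrier holds all threads until every one is ready, so the requests are sent at nearly the same moment.

```python
import json
import threading
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor

GITLAB_URL = "https://gitlab.example.com"  # hypothetical host, replace with your own
TOKEN = "REPLACE_ME"  # hypothetical personal access token


def create_subgroup(name, parent_id):
    """Send POST /api/v4/groups and return the HTTP status code."""
    body = json.dumps({"name": name, "path": name, "parent_id": parent_id}).encode()
    request = urllib.request.Request(
        f"{GITLAB_URL}/api/v4/groups",
        data=body,
        method="POST",
        headers={"PRIVATE-TOKEN": TOKEN, "Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses are raised as HTTPError


def fire_concurrently(send, workers=5):
    """Call send() from `workers` threads released together; return the results."""
    barrier = threading.Barrier(workers)

    def worker(_):
        barrier.wait()  # block until every thread has reached this point
        return send()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, range(workers)))


# Example call (needs a reachable GitLab instance and a valid token):
# statuses = fire_concurrently(lambda: create_subgroup("gcptest071902", 234))
```

The returned list holds one status code per thread, which makes it easy to see how many of the concurrent requests succeeded.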


The error reproduced immediately, even at this low level of concurrency. Only the first request returned normally; the other 4 requests returned 500 Internal Server Error.

Find the cause of the error

Our own system reported the 500 error.

The GitLab nginx log, /var/log/gitlab/nginx/gitlab_access.log, confirms that GitLab really did return 500 for this request.

GitLab includes a gitlab-rails module, and the cause of the error turned up in the api_json.log file under that module:

{
	"time": "2023-07-19T05:19:06.173Z",
	"severity": "INFO",
	"duration_s": 0.08902,
	"db_duration_s": 0.0557,
	"view_duration_s": 0.03332,
	"status": 500,
	"method": "POST",
	"path": "/api/v4/groups",
	"params": [{
		"key": "path",
		"value": "gcptest071902"
	}, {
		"key": "name",
		"value": "gcptest071902"
	}, {
		"key": "parent_id",
		"value": 234
	}],
	"host": "wx8vm00001.apac.bosch.com",
	"remote_ip": "10.4.103.206, 127.0.0.1",
	"ua": "python-requests/2.28.1",
	"route": "/api/:version/groups",
	"user_id": 1,
	"username": "root",
	"exception.class": "ActiveRecord::RecordNotUnique",
	"exception.message": "PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint \"index_namespaces_name_parent_id_type\"\nDETAIL:  Key (name, parent_id, type)=(gcptest071902, 234, Group) already exists.\n",
	"exception.backtrace": ["lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:67:in `block (2 levels) in <class:ConnectionProxy>'", "lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:77:in `transaction'", "app/services/groups/create_service.rb:39:in `block in execute'", "lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:77:in `transaction'", "lib/gitlab/database.rb:309:in `block in transaction'", "lib/gitlab/database.rb:308:in `transaction'", "app/models/concerns/cross_database_modification.rb:99:in `transaction'", "app/services/groups/create_service.rb:38:in `execute'", "lib/api/groups.rb:63:in `create_group'", "lib/api/groups.rb:207:in `block (2 levels) in <class:Groups>'", "lib/api/api_guard.rb:213:in `call'", "lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'", "lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'", "lib/gitlab/middleware/memory_report.rb:13:in `call'", "lib/gitlab/middleware/speedscope.rb:13:in `call'", "lib/gitlab/request_profiler/middleware.rb:17:in `call'", "lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'", "lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'", "lib/gitlab/metrics/web_transaction.rb:46:in `run'", "lib/gitlab/metrics/rack_middleware.rb:16:in `call'", "lib/gitlab/jira/middleware.rb:19:in `call'", "lib/gitlab/middleware/go.rb:20:in `call'", "lib/gitlab/etag_caching/middleware.rb:21:in `call'", "lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'", "lib/gitlab/database/query_analyzer.rb:46:in `within'", "lib/gitlab/middleware/query_analyzer.rb:11:in `call'", "lib/gitlab/middleware/multipart.rb:173:in `call'", "lib/gitlab/middleware/read_only/controller.rb:50:in `call'", "lib/gitlab/middleware/read_only.rb:18:in `call'", "lib/gitlab/middleware/same_site_cookies.rb:27:in `call'", "lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'", "lib/gitlab/middleware/basic_health_check.rb:25:in `call'", "lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'", "lib/gitlab/middleware/request_context.rb:21:in `call'", "lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'", "config/initializers/fix_local_cache_middleware.rb:11:in `call'", "lib/gitlab/middleware/compressed_json.rb:26:in `call'", "lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'", "lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'", "lib/gitlab/metrics/requests_rack_middleware.rb:77:in `call'", "lib/gitlab/middleware/release_env.rb:13:in `call'"],
	"exception.sql": "/*application:web,correlation_id:01H5P9CWAR3B4123CHYQSZKYH5,endpoint_id:POST /api/:version/groups,db_config_name:main*/ INSERT INTO \"namespaces\" (\"name\", \"path\", \"created_at\", \"updated_at\", \"type\", \"visibility_level\", \"description_html\", \"parent_id\", \"cached_markdown_version\") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9) RETURNING \"id\"",
	"queue_duration_s": 0.009978,
	"redis_calls": 5,
	"redis_duration_s": 0.001121,
	"redis_read_bytes": 609,
	"redis_write_bytes": 304,
	"redis_cache_calls": 3,
	"redis_cache_duration_s": 0.000656,
	"redis_cache_read_bytes": 609,
	"redis_cache_write_bytes": 198,
	"redis_shared_state_calls": 2,
	"redis_shared_state_duration_s": 0.000465,
	"redis_shared_state_write_bytes": 106,
	"db_count": 15,
	"db_write_count": 3,
	"db_cached_count": 0,
	"db_replica_count": 0,
	"db_primary_count": 15,
	"db_main_count": 15,
	"db_main_replica_count": 0,
	"db_replica_cached_count": 0,
	"db_primary_cached_count": 0,
	"db_main_cached_count": 0,
	"db_main_replica_cached_count": 0,
	"db_replica_wal_count": 0,
	"db_primary_wal_count": 0,
	"db_main_wal_count": 0,
	"db_main_replica_wal_count": 0,
	"db_replica_wal_cached_count": 0,
	"db_primary_wal_cached_count": 0,
	"db_main_wal_cached_count": 0,
	"db_main_replica_wal_cached_count": 0,
	"db_replica_duration_s": 0.0,
	"db_primary_duration_s": 0.056,
	"db_main_duration_s": 0.056,
	"db_main_replica_duration_s": 0.0,
	"cpu_s": 0.043004,
	"mem_objects": 15895,
	"mem_bytes": 1253816,
	"mem_mallocs": 4559,
	"mem_total_bytes": 1889616,
	"pid": 413885,
	"correlation_id": "01H5P9CWAR3B4123CHYQSZKYH5",
	"meta.user": "root",
	"meta.client_id": "user/1",
	"meta.caller_id": "POST /api/:version/groups",
	"meta.remote_ip": "10.4.103.206",
	"meta.feature_category": "subgroups",
	"content_length": "68",
	"request_urgency": "default",
	"target_duration_s": 1
}

The same error, with the same correlation_id, also appears in the exceptions_json.log file under this module:

{
	"severity": "ERROR",
	"time": "2023-07-19T05:19:06.171Z",
	"correlation_id": "01H5P9CWAR3B4123CHYQSZKYH5",
	"exception.class": "ActiveRecord::RecordNotUnique",
	"exception.message": "PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint \"index_namespaces_name_parent_id_type\"\nDETAIL:  Key (name, parent_id, type)=(gcptest071902, 234, Group) already exists.\n",
	"exception.backtrace": ["lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:67:in `block (2 levels) in \u003cclass:ConnectionProxy\u003e'", "lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:77:in `transaction'", "app/services/groups/create_service.rb:39:in `block in execute'", "lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `block in write_using_load_balancer'", "lib/gitlab/database/load_balancing/load_balancer.rb:112:in `block in read_write'", "lib/gitlab/database/load_balancing/load_balancer.rb:172:in `retry_with_backoff'", "lib/gitlab/database/load_balancing/load_balancer.rb:110:in `read_write'", "lib/gitlab/database/load_balancing/connection_proxy.rb:125:in `write_using_load_balancer'", "lib/gitlab/database/load_balancing/connection_proxy.rb:77:in `transaction'", "lib/gitlab/database.rb:309:in `block in transaction'", "lib/gitlab/database.rb:308:in `transaction'", "app/models/concerns/cross_database_modification.rb:99:in `transaction'", "app/services/groups/create_service.rb:38:in `execute'", "lib/api/groups.rb:63:in `create_group'", "lib/api/groups.rb:207:in `block (2 levels) in \u003cclass:Groups\u003e'", "lib/api/api_guard.rb:213:in `call'", "lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call'", "lib/gitlab/middleware/rails_queue_duration.rb:33:in `call'", "lib/gitlab/middleware/memory_report.rb:13:in `call'", "lib/gitlab/middleware/speedscope.rb:13:in `call'", "lib/gitlab/request_profiler/middleware.rb:17:in `call'", "lib/gitlab/database/load_balancing/rack_middleware.rb:23:in `call'", "lib/gitlab/metrics/rack_middleware.rb:16:in `block in call'", "lib/gitlab/metrics/web_transaction.rb:46:in `run'", "lib/gitlab/metrics/rack_middleware.rb:16:in `call'", "lib/gitlab/jira/middleware.rb:19:in `call'", "lib/gitlab/middleware/go.rb:20:in `call'", "lib/gitlab/etag_caching/middleware.rb:21:in `call'", "lib/gitlab/middleware/query_analyzer.rb:11:in `block in call'", "lib/gitlab/database/query_analyzer.rb:46:in `within'", "lib/gitlab/middleware/query_analyzer.rb:11:in `call'", "lib/gitlab/middleware/multipart.rb:173:in `call'", "lib/gitlab/middleware/read_only/controller.rb:50:in `call'", "lib/gitlab/middleware/read_only.rb:18:in `call'", "lib/gitlab/middleware/same_site_cookies.rb:27:in `call'", "lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call'", "lib/gitlab/middleware/basic_health_check.rb:25:in `call'", "lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call'", "lib/gitlab/middleware/request_context.rb:21:in `call'", "lib/gitlab/middleware/webhook_recursion_detection.rb:15:in `call'", "config/initializers/fix_local_cache_middleware.rb:11:in `call'", "lib/gitlab/middleware/compressed_json.rb:26:in `call'", "lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call'", "lib/gitlab/middleware/sidekiq_web_static.rb:20:in `call'", "lib/gitlab/metrics/requests_rack_middleware.rb:77:in `call'", "lib/gitlab/middleware/release_env.rb:13:in `call'"],
	"exception.sql": "/*application:web,correlation_id:01H5P9CWAR3B4123CHYQSZKYH5,endpoint_id:POST /api/:version/groups,db_config_name:main*/ INSERT INTO \"namespaces\" (\"name\", \"path\", \"created_at\", \"updated_at\", \"type\", \"visibility_level\", \"description_html\", \"parent_id\", \"cached_markdown_version\") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9) RETURNING \"id\"",
	"user.username": "root",
	"tags.program": "web",
	"tags.locale": "en",
	"tags.feature_category": "subgroups",
	"tags.correlation_id": "01H5P9CWAR3B4123CHYQSZKYH5"
}
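
Finding such entries by hand is tedious, so a small filter can help. The sketch below assumes the log file holds one JSON object per line (the entries above are pretty-printed for readability); `failed_requests` is a hypothetical helper, and the field names are the ones visible in the entries above.

```python
import json


def failed_requests(lines, status=500):
    """Yield (correlation_id, exception class, message) for entries with the given status."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if entry.get("status") == status:
            yield (
                entry.get("correlation_id"),
                entry.get("exception.class"),
                entry.get("exception.message"),
            )


# Example call (the path is an assumption, adjust it to your installation):
# with open("/var/log/gitlab/gitlab-rails/api_json.log") as log:
#     for correlation_id, exc_class, message in failed_requests(log):
#         print(correlation_id, exc_class, message)
```

The correlation_id returned for each entry can then be searched for in exceptions_json.log to find the matching exception record.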

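Initial analysis

The backtrace passes through GitLab's database load balancing code. Database load balancing in GitLab is designed to provide high availability and performance for database connections, and the write that failed here was routed through it. The exception itself, however, is ActiveRecord::RecordNotUnique: PostgreSQL rejected the INSERT into namespaces because a row with (name, parent_id, type) = (gcptest071902, 234, Group) already existed, which violates the unique index index_namespaces_name_parent_id_type. This matches the stress test, in which the first request created the subgroup and the other concurrent requests for the same subgroup failed.
The logs also show that the INSERT ran inside a database transaction. A transaction is the mechanism that guarantees the atomicity, consistency, isolation and durability of database operations, and GitLab uses transactions to ensure data consistency and integrity.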
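
The way a unique index turns concurrent creation into an error can be shown in miniature. The sketch below is only an illustration under stated assumptions: it uses SQLite from the Python standard library instead of PostgreSQL, a cut-down table that merely borrows the names from the log, and it assumes (not confirmed from GitLab's source) that each request checks for an existing row before inserting. The interleaving is written out sequentially so the result is deterministic.

```python
import sqlite3


def simulate_race():
    """Two 'requests' both check first, then both insert the same key."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE namespaces (name TEXT, parent_id INTEGER, type TEXT)")
    db.execute(
        "CREATE UNIQUE INDEX index_namespaces_name_parent_id_type "
        "ON namespaces (name, parent_id, type)"
    )
    row = ("gcptest071902", 234, "Group")

    def exists():
        cursor = db.execute(
            "SELECT 1 FROM namespaces WHERE name = ? AND parent_id = ? AND type = ?",
            row,
        )
        return cursor.fetchone() is not None

    # Both requests run their existence check before either one inserts,
    # so both conclude that the subgroup does not exist yet.
    checks = [exists(), exists()]

    outcomes = []
    for _ in checks:
        try:
            db.execute("INSERT INTO namespaces VALUES (?, ?, ?)", row)
            outcomes.append("created")
        except sqlite3.IntegrityError:
            # The unique index rejects the second insert.
            outcomes.append("IntegrityError")
    db.close()
    return checks, outcomes
```

Both checks report that no row exists, the first insert succeeds, and the second is rejected by the index, which is the same pattern as one successful request followed by duplicate-key failures.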

More generally, a failed database transaction can be caused by:

  • Database connection problem: GitLab cannot connect to the database normally, for example because of a database server failure, network problems, or misconfiguration.

  • Database locking: If multiple transactions operate on the same database resource at the same time and compete for a lock, one of them may fail.

  • Database deadlock: If multiple transactions wait for each other to release locks and form a circular wait, a deadlock occurs and one of the transactions fails.

  • Transaction timeout: If a transaction runs longer than the timeout configured in the database, it may fail.

Possible solutions:

  • Use an external database and adjust its transaction-related settings.

  • Check the status and performance of the database server to ensure that the database server is up and performing properly.

  • Check the configuration and settings of the database connection to ensure that the GitLab server can properly connect to the database.

  • Check and optimize database indexes and query statements to reduce the possibility of database locks and deadlocks.

  • Adjust database transaction settings, such as increasing transaction timeouts, to accommodate longer transaction processing times.

  • Check the logs on the database server and GitLab server for other error messages that might cause the transaction to fail.


Most of the measures above involve changes we are not in a position to make, and we are not sure whether this problem has been fixed in newer versions of GitLab. You can try upgrading GitLab to see whether that solves it, or add a global lock at the business layer (a distributed lock in a distributed architecture) so that requests arriving within the same few milliseconds cannot operate on the same resource concurrently, which should solve the problem.
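
A business-layer lock of this kind could look like the following minimal sketch. It only covers a single process: `KeyedLock` and `create_once` are hypothetical names, the `exists` and `create` callables stand in for the real GitLab API calls, and a deployment with several application instances would need a distributed lock instead.

```python
import threading


class KeyedLock:
    """Hand out one lock per resource key, so unrelated keys do not block each other."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def get(self, key):
        with self._guard:  # protect the dictionary itself
            if key not in self._locks:
                self._locks[key] = threading.Lock()
            return self._locks[key]


def create_once(locks, key, exists, create):
    """Create the resource unless it already exists; return True if created here."""
    with locks.get(key):
        if exists(key):
            return False
        create(key)
        return True


# Example: serialize creation of the same subgroup under the same parent.
# locks = KeyedLock()
# create_once(locks, (234, "gcptest071902"), subgroup_exists, create_subgroup)
```

Keying the lock on (parent_id, name) serializes only requests for the same subgroup, so other requests still run in parallel. The dictionary of locks grows with the number of distinct keys, which a long-running service would need to prune.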

Origin blog.csdn.net/weixin_44388689/article/details/131806512