In-depth analysis of KubeSphere back-end source code

In this article, we will debug and explore the KubeSphere backend module architecture using the ssh remote plugin for vscode.

Prerequisites

  • Install vscode and the ssh remote container plugins;
  • Install the kubernetes container "operating system" and the KubeSphere >= v3.1.0 cloud "control panel" on the remote host;
  • Install Go >= 1.16;
  • Install the ks components that need to be debugged, such as devops, kubeedge or whatever, on KubeSphere. Components that are activated by default, such as monitoring, do not need to be activated again.

Configure launch file

$ cat .vscode/launch.json
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "ks-apiserver",
            "type": "go",
            "request": "launch",
            "mode": "auto",
            "program": "${workspaceFolder}/cmd/ks-apiserver/apiserver.go"
        }
        
    ]
}

ks-apiserver debug dependency files

Configure kubesphere.yaml under the relative path cmd/ks-apiserver/.

First, look at the cm configuration file in the cluster:

$ kubectl -n kubesphere-system get cm kubesphere-config -oyaml

Because the above configmap is missing the kubeconfig-related configuration, we need to copy the yaml above and add that configuration to it, as shown below.

Why add a kubeconfig file?

Mainly because k8s needs such a file when creating a client; inside the container, the in-cluster config is used instead, so there is no need to add it there.

If you are interested, you can take a look at the example of client-go:

https://github.com/kubernetes/client-go/blob/master/examples/in-cluster-client-configuration/main.go#L41

https://github.com/kubernetes/client-go/blob/master/examples/out-of-cluster-client-configuration/main.go#L53
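
To make the difference concrete, here is a minimal sketch (my own, not KubeSphere code) of how a client-go rest.Config is usually built: from a kubeconfig path when debugging outside the cluster, and from the in-cluster config when running as a Pod.

package main

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// buildConfig returns an out-of-cluster config when a kubeconfig path is given,
// and falls back to the in-cluster service account config otherwise.
func buildConfig(kubeconfig string) (*rest.Config, error) {
	if kubeconfig != "" {
		// debugging outside the cluster, e.g. from vscode
		return clientcmd.BuildConfigFromFlags("", kubeconfig)
	}
	// running inside the cluster, which is what the ks-apiserver Pod does
	return rest.InClusterConfig()
}

func main() {
	config, err := buildConfig("/root/.kube/config")
	if err != nil {
		panic(err)
	}
	// the clientset is created from the same rest.Config in both cases
	if _, err := kubernetes.NewForConfig(config); err != nil {
		panic(err)
	}
}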

So the complete configuration startup file is as follows:

$ cat ./cmd/ks-apiserver/kubesphere.yaml
kubernetes:
  kubeconfig: "/root/.kube/config"
  master: https://192.168.88.6:6443
  qps: 1e+06
  burst: 1000000
authentication:
  authenticateRateLimiterMaxTries: 10
  authenticateRateLimiterDuration: 10m0s
  loginHistoryRetentionPeriod: 168h
  maximumClockSkew: 10s
  multipleLogin: True
  kubectlImage: kubesphere/kubectl:v1.20.0
  jwtSecret: "Xtc8ZWUf9f3cJN89bglrTJhfUPMZR87d"
  oauthOptions:
    clients:
    - name: kubesphere
      secret: kubesphere
      redirectURIs:
      - '*'
network:
  ippoolType: none
monitoring:
  endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
  enableGPUMonitoring: false
gpu:
  kinds:
  - resourceName: nvidia.com/gpu
    resourceType: GPU
    default: True
notification:
  endpoint: http://notification-manager-svc.kubesphere-monitoring-system.svc:19093
  
kubeedge:
  endpoint: http://edge-watcher.kubeedge.svc/api/

gateway:
  watchesPath: /var/helm-charts/watches.yaml
  namespace: kubesphere-controls-system

Apart from kubernetes, each top-level key represents a ks component that is activated by default or has been enabled in our cluster. Now we can start debugging with F5.

Before debugging, you may ask, why is this configuration file placed in /cmd/ks-apiserver/kubesphere.yaml?

Let's first explore how ks-apiserver runs.

Start ks-apiserver

View the logic of cmd/ks-apiserver/app/server.go:

// Load configuration from file
conf, err := apiserverconfig.TryLoadFromDisk()

The logic of TryLoadFromDisk is as follows:

viper.SetConfigName(defaultConfigurationName) // kubesphere
viper.AddConfigPath(defaultConfigurationPath) // /etc/kubesphere

// Load from current working directory, only used for debugging
viper.AddConfigPath(".")

// Load from Environment variables
viper.SetEnvPrefix("kubesphere")
viper.AutomaticEnv()
viper.SetEnvKeyReplacer(strings.NewReplacer(".", "_"))

// After the configuration above, single-step debugging shows that the file paths read by ReadInConfig are
// v.configPaths:["/etc/kubesphere","/root/go/src/kubesphere.io/kubesphere/cmd/ks-apiserver"]
if err := viper.ReadInConfig(); err != nil {
	if _, ok := err.(viper.ConfigFileNotFoundError); ok {
		return nil, err
	} else {
		return nil, fmt.Errorf("error parsing configuration file %s", err)
	}
}

conf := New() // initialize the configuration of each component

// unmarshal the config file read from the actual path into the conf struct
if err := viper.Unmarshal(conf); err != nil {
	return nil, err
}

return conf, nil

The above comment explains that you need to add kubesphere.yaml to the specified path to start the ks-apiserver command line.
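
For reference, the lookup order above can be reproduced with a small standalone viper sketch; note that, because of SetEnvPrefix and the key replacer, an environment variable such as KUBESPHERE_MONITORING_ENDPOINT (a hypothetical example) would override the monitoring.endpoint key from the file.

package main

import (
	"fmt"
	"strings"

	"github.com/spf13/viper"
)

func main() {
	// same lookup order as TryLoadFromDisk: /etc/kubesphere first, then the working directory
	viper.SetConfigName("kubesphere")
	viper.AddConfigPath("/etc/kubesphere")
	viper.AddConfigPath(".")

	// environment variables with the KUBESPHERE_ prefix take precedence;
	// "." in config keys maps to "_" in the variable names
	viper.SetEnvPrefix("kubesphere")
	viper.AutomaticEnv()
	viper.SetEnvKeyReplacer(strings.NewReplacer(".", "_"))

	if err := viper.ReadInConfig(); err != nil {
		panic(err)
	}
	fmt.Println(viper.GetString("monitoring.endpoint"))
}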

Going further down, the cobra.Command package is used to integrate the command line:

func Run(s *options.ServerRunOptions, ctx context.Context) error {
	// NewAPIServer starts an apiserver instance with the given configuration and binds the instantiated clients of each component
	// This step also registers some custom GVKs with k8s via AddToScheme, which are finally exposed as the apis API
	// runtimecache and runtimeClient are initialized from rest.Config and the scheme
	apiserver, err := s.NewAPIServer(ctx.Done())
	if err != nil {
		return err
	}
	
	// PrepareRun mainly uses restful-go to integrate the kapis API
	// The previous step bound each component's client, so this step can call those clients to access the corresponding component's server
	// Guess what the pluggable backend architecture in 4.0 will look like?
	err = apiserver.PrepareRun(ctx.Done())
	if err != nil {
		return nil
	}
	
	// run the informers to sync resources and start ks-apiserver listening for requests
	return apiserver.Run(ctx)
}

s.NewAPIServer(ctx.Done()) is mainly to create an apiserver instance. In this step of creating an apiserver instance, it also registers the ks-customized GVK to k8s through the scheme, and exposes it as the API of the apis request path.

PrepareRun mainly uses the restful-go framework to integrate the proxy requests or integrated services of the various submodules, and exposes the API functions of the kapis request path.

apiserver.Run(ctx) synchronizes resources and starts server monitoring.

Each of these is described separately below.

NewAPIServer

The first is to bind various clients and informers:

// call each component's NewForConfig method to assemble the clientset
kubernetesClient, err := k8s.NewKubernetesClient(s.KubernetesOptions)
if err != nil {
	return nil, err
}
apiServer.KubernetesClient = kubernetesClient
informerFactory := informers.NewInformerFactories(kubernetesClient.Kubernetes(), kubernetesClient.KubeSphere(),kubernetesClient.Istio(), kubernetesClient.Snapshot(), kubernetesClient.ApiExtensions(), kubernetesClient.Prometheus())
apiServer.InformerFactory = informerFactory
...
// bind the clients of the ks components according to kubesphere.yaml or the kubesphere-config configmap
...

After the initial binding is completed, a server will be started to respond to requests, so an address is bound here:

...
server := &http.Server{
	Addr: fmt.Sprintf(":%d", s.GenericServerRunOptions.InsecurePort),
}

if s.GenericServerRunOptions.SecurePort != 0 {
	certificate, err := tls.LoadX509KeyPair(s.GenericServerRunOptions.TlsCertFile, s.GenericServerRunOptions.TlsPrivateKey)
	if err != nil {
		return nil, err
	}

	server.TLSConfig = &tls.Config{
		Certificates: []tls.Certificate{certificate},
	}
	server.Addr = fmt.Sprintf(":%d", s.GenericServerRunOptions.SecurePort)
}

sch := scheme.Scheme
if err := apis.AddToScheme(sch); err != nil {
	klog.Fatalf("unable add APIs to scheme: %v", err)
}
...

Pay attention to this step, apis.AddToScheme(sch), which registers the GVKs we defined into k8s.

By the way, GVK refers to Group, Version, Kind (and GVR to Group, Version, Resource), for example:

{Group: "", Version: "v1", Resource: "namespaces"}
{Group: "", Version: "v1", Resource: "nodes"}
{Group: "", Version: "v1", Resource: "resourcequotas"}
...
{Group: "tenant.kubesphere.io", Version: "v1alpha1", Resource: "workspaces"}
{Group: "cluster.kubesphere.io", Version: "v1alpha1", Resource: "clusters"}
...

Scheme manages the relationship between GVK and Type: one GVK can only correspond to one reflect.Type, while one reflect.Type may correspond to multiple GVKs. In addition, Scheme also aggregates converters and cloners to convert between different versions of a structure and to obtain a copy of a structure's value; due to limited space, interested readers can explore this in depth.
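
As a minimal sketch of this GVK ↔ Type mapping (using the built-in core/v1 Namespace type for illustration, not a KubeSphere CRD):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

func main() {
	sch := runtime.NewScheme()

	// register the Go type corev1.Namespace under the GVK {"", "v1", "Namespace"}
	gv := schema.GroupVersion{Group: "", Version: "v1"}
	sch.AddKnownTypes(gv, &corev1.Namespace{})

	// ask the scheme which GVKs this Go type maps to
	gvks, _, err := sch.ObjectKinds(&corev1.Namespace{})
	if err != nil {
		panic(err)
	}
	fmt.Println(gvks) // e.g. [/v1, Kind=Namespace]
}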

Returning to the text, let's see how to inject the scheme:

// AddToSchemes may be used to add all resources defined in the project to a Scheme
var AddToSchemes runtime.SchemeBuilder

// AddToScheme adds all Resources to the Scheme
func AddToScheme(s *runtime.Scheme) error {
	return AddToSchemes.AddToScheme(s)
}

AddToSchemes is an alias of the type []func(*Scheme) error; by implementing the corresponding init() method in an interface file under package apis that imports the implemented versioned API, it can be injected into the Scheme.

for example:

$ cat pkg/apis/addtoscheme_dashboard_v1alpha2.go
package apis
import monitoringdashboardv1alpha2 "kubesphere.io/monitoring-dashboard/api/v1alpha2"
func init() {	
  AddToSchemes = append(AddToSchemes, monitoringdashboardv1alpha2.SchemeBuilder.AddToScheme)
}

That is, the versioned resources integrated by the plug-ins we develop must implement the xxx.SchemeBuilder.AddToScheme function before they can be registered in the scheme and finally exposed as apis to access API services.

So far, the clients corresponding to all submodules have been bound to this apiserver.

PrepareRun

Next, we discuss how PrepareRun registers kapis and binds the handler.

This is achieved mainly through the restful-go framework.

In restful-go, a container holds webservices with specific GVRs; a webservice can bind multiple routes, and both containers and webservices can add custom interceptors by calling the Filter method.
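
For readers unfamiliar with the framework, here is a minimal standalone sketch of the container/webservice/filter relationship (the import path may be github.com/emicklei/go-restful/v3 in newer versions; the route and handler below are made up for illustration):

package main

import (
	"log"
	"net/http"

	restful "github.com/emicklei/go-restful"
)

func main() {
	container := restful.NewContainer()
	container.Router(restful.CurlyRouter{})

	// a container-level filter, analogous to logRequestAndResponse
	container.Filter(func(req *restful.Request, resp *restful.Response, chain *restful.FilterChain) {
		log.Printf("%s %s", req.Request.Method, req.Request.URL.Path)
		chain.ProcessFilter(req, resp)
	})

	// one webservice per API group, with routes bound to callback functions
	ws := new(restful.WebService)
	ws.Path("/kapis/example.kubesphere.io/v1alpha1").Produces(restful.MIME_JSON)
	ws.Route(ws.GET("/hello").To(func(req *restful.Request, resp *restful.Response) {
		resp.WriteAsJson(map[string]string{"message": "hello"})
	}))
	container.Add(ws)

	// the container itself is an http.Handler
	log.Fatal(http.ListenAndServe(":8080", container))
}

With these concepts in mind, here is PrepareRun: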

func (s *APIServer) PrepareRun(stopCh <-chan struct{}) error {
  // the container holds the webservices that own specific GVRs
	s.container = restful.NewContainer()
	// add a request/response logging interceptor
	s.container.Filter(logRequestAndResponse)
	s.container.Router(restful.CurlyRouter{})
	
	// bind a logging handler for recovered panics
	s.container.RecoverHandler(func(panicReason interface{}, httpWriter http.ResponseWriter) {
		logStackOnRecover(panicReason, httpWriter)
	})
	
	// build a webservice for each API group, then bind callback functions according to the routing rules
  // the binding is completed via AddToContainer
	s.installKubeSphereAPIs()
	
	// register metrics: ks_server_request_total, ks_server_request_duration_seconds
	// bind the metrics handler
	s.installMetricsAPI()
	
	// add monitoring counters for valid requests
	s.container.Filter(monitorRequest)

	for _, ws := range s.container.RegisteredWebServices() {
		klog.V(2).Infof("%s", ws.RootPath())
	}
	
	s.Server.Handler = s.container
	
	// add the interceptors of each call chain, used for authentication and request dispatching
	s.buildHandlerChain(stopCh)

	return nil
}

The above mainly uses the restful-go framework to bind a container to s.Server.Handler and add various interceptors.

In the s.installKubeSphereAPIs() step, the GVRs are installed and the kapis proxies are bound, which is implemented as follows:

// call the AddToContainer method of each api group to register kapis with the container:
urlruntime.Must(monitoringv1alpha3.AddToContainer(s.container, s.KubernetesClient.Kubernetes(), s.MonitoringClient, s.MetricsClient, s.InformerFactory, s.KubernetesClient.KubeSphere(), s.Config.OpenPitrixOptions))

// In detail, the AddToContainer method implemented by each component
// adds routes to a webservice carrying GroupVersion information, binding different handlers to different route paths
ws := runtime.NewWebService(GroupVersion)
// bind a callback function to the sub-route
ws.Route(ws.GET("/kubesphere").
    To(h.handleKubeSphereMetricsQuery).
    Doc("Get platform-level metric data.").
    Metadata(restfulspec.KeyOpenAPITags, []string{constants.KubeSphereMetricsTag}).
    Writes(model.Metrics{}).
    Returns(http.StatusOK, respOK, model.Metrics{})).
    Produces(restful.MIME_JSON)

We know that apis requests correspond to k8s, while in ks, kapis corresponds to proxied requests for the sub-components, whose responses are provided either by ks-apiserver itself or by the target component server it forwards to. So how does ks-apiserver distinguish these requests?

The answer is that they are dispatched through buildHandlerChain.

buildHandlerChain

As mentioned above, buildHandlerChain builds interceptors for various services, which are arranged as follows.

handler = filters.WithKubeAPIServer(handler, s.KubernetesClient.Config(), &errorResponder{})

if s.Config.AuditingOptions.Enable {
	handler = filters.WithAuditing(handler,
		audit.NewAuditing(s.InformerFactory, s.Config.AuditingOptions, stopCh))
}

handler = filters.WithAuthorization(handler, authorizers)
if s.Config.MultiClusterOptions.Enable {
	clusterDispatcher := dispatch.NewClusterDispatch(s.InformerFactory.KubeSphereSharedInformerFactory().Cluster().V1alpha1().Clusters())
	handler = filters.WithMultipleClusterDispatcher(handler, clusterDispatcher)
}

handler = filters.WithAuthentication(handler, authn)
handler = filters.WithRequestInfo(handler, requestInfoResolver)

The WithRequestInfo filter defines the following logic:

info, err := resolver.NewRequestInfo(req)
---
func (r *RequestInfoFactory) NewRequestInfo(req *http.Request) (*RequestInfo, error) {
   ...
   defer func() {
		prefix := requestInfo.APIPrefix
		if prefix == "" {
			currentParts := splitPath(requestInfo.Path)
			//Proxy discovery API
			if len(currentParts) > 0 && len(currentParts) < 3 {
				prefix = currentParts[0]
			}
		}
    // requests can be distinguished by whether the api route path carries apis or kapis
		if kubernetesAPIPrefixes.Has(prefix) {
			requestInfo.IsKubernetesRequest = true
		}
	}()
	
	...
	// URL forms: /clusters/{cluster}/*
	if currentParts[0] == "clusters" {
		if len(currentParts) > 1 {
			requestInfo.Cluster = currentParts[1]
		}
		if len(currentParts) > 2 {
			currentParts = currentParts[2:]
		}
	}
	...
}

There is a lot of code, so I won't take screenshots one by one. The general meaning can be seen from the comments:

// NewRequestInfo returns the information from the http request.  If error is not nil, RequestInfo holds the information as best it is known before the failure
// It handles both resource and non-resource requests and fills in all the pertinent information for each.
// Valid Inputs:
//
// /apis/{api-group}/{version}/namespaces
// /api/{version}/namespaces
// /api/{version}/namespaces/{namespace}
// /api/{version}/namespaces/{namespace}/{resource}
// /api/{version}/namespaces/{namespace}/{resource}/{resourceName}
// /api/{version}/{resource}
// /api/{version}/{resource}/{resourceName}
//
// Special verbs without subresources:
// /api/{version}/proxy/{resource}/{resourceName}
// /api/{version}/proxy/namespaces/{namespace}/{resource}/{resourceName}
//
// Special verbs with subresources:
// /api/{version}/watch/{resource}
// /api/{version}/watch/namespaces/{namespace}/{resource}
//
// /kapis/{api-group}/{version}/workspaces/{workspace}/{resource}/{resourceName}
// /
// /kapis/{api-group}/{version}/namespaces/{namespace}/{resource}
// /kapis/{api-group}/{version}/namespaces/{namespace}/{resource}/{resourceName}
// With workspaces:
// /kapis/clusters/{cluster}/{api-group}/{version}/namespaces/{namespace}/{resource}
// /kapis/clusters/{cluster}/{api-group}/{version}/namespaces/{namespace}/{resource}/{resourceName}

Through the information defined by the route, you can distinguish what level the request is, and which server the request is to be distributed to.
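
Stripped of details, the apis/kapis distinction boils down to a check on the first path segment, roughly like the following sketch (the helper below is simplified, not the actual RequestInfoFactory code):

package main

import (
	"fmt"
	"strings"
)

// isKubernetesRequest mimics the prefix check in WithRequestInfo: requests whose
// first path segment is "api" or "apis" are treated as native Kubernetes requests,
// while "kapis" requests are handled by ks-apiserver itself or proxied to a component.
func isKubernetesRequest(path string) bool {
	kubernetesAPIPrefixes := map[string]bool{"api": true, "apis": true}
	parts := strings.Split(strings.Trim(path, "/"), "/")
	return len(parts) > 0 && kubernetesAPIPrefixes[parts[0]]
}

func main() {
	fmt.Println(isKubernetesRequest("/apis/tenant.kubesphere.io/v1alpha1/workspaces"))      // true
	fmt.Println(isKubernetesRequest("/kapis/monitoring.kubesphere.io/v1alpha3/kubesphere")) // false
}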

We add breakpoints to the callback functions of each filter, and then do a small experiment to see what the interception sequence of the interceptor is.

Suppose the service on the remote cloud host has been started on port 9090, and you have granted the anonymous globalrole access to the ClusterDashboard resource type under the monitoring.kubesphere.io group. Of course, you can also test directly with an account that has access rights.

Next, let's send a kapis request to see how the link jumps:

curl -d '{"grafanaDashboardUrl":"https://grafana.com/api/dashboards/7362/revisions/5/download", "description":"this is a test dashboard."}' -H "Content-Type: application/json" localhost:9090/kapis/monitoring.kubesphere.io/v1alpha3/clusterdashboards/test1/template

The test results are as follows:

WithRequestInfo -> WithAuthentication -> WithAuthorization -> WithKubeAPIServer
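
The observed order is the reverse of the order in which buildHandlerChain wraps the handlers: the filter applied last wraps all the others, so it runs first at request time. Here is a minimal sketch of this wrapping pattern (plain net/http, not the actual filters package):

package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// with wraps a handler and prints its name before delegating to the next one
func with(name string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Println("enter", name)
		next.ServeHTTP(w, r)
	})
}

func main() {
	var handler http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Println("final handler")
	})
	// wrapped in the same order as buildHandlerChain
	handler = with("WithKubeAPIServer", handler)
	handler = with("WithAuthorization", handler)
	handler = with("WithAuthentication", handler)
	handler = with("WithRequestInfo", handler)

	req := httptest.NewRequest(http.MethodGet, "/kapis/test", nil)
	handler.ServeHTTP(httptest.NewRecorder(), req)
	// prints: WithRequestInfo, WithAuthentication, WithAuthorization, WithKubeAPIServer, final handler
}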

Run

This method mainly does two things, one is to start the informers to synchronize resources, and the other is to start the ks apiserver.

func (s *APIServer) Run(ctx context.Context) (err error) {
  // start the informer factories, including the k8s and ks informers
	// sync resources, including the k8s and ks GVRs
	// check whether each GVR exists: warn if it does not, sync it if it does
	err = s.waitForResourceSync(ctx)
	if err != nil {
		return err
	}

	shutdownCtx, cancel := context.WithCancel(context.Background())
	defer cancel()

	go func() {
		<-ctx.Done()
		_ = s.Server.Shutdown(shutdownCtx)
	}()
	
	// start the server
	klog.V(0).Infof("Start listening on %s", s.Server.Addr)
	if s.Server.TLSConfig != nil {
		err = s.Server.ListenAndServeTLS("", "")
	} else {
		err = s.Server.ListenAndServe()
	}

	return err
}

So far, after calling the Run method, ks-apiserver is started.

Now let's make a brief summary:

  • Create a ks-apiserver instance according to the configuration file. The instance calls three key methods, namely NewAPIServer, PrepareRun and Run methods;
  • NewAPIServer binds the client of each module through the given configuration, registers the custom GVK with Scheme, and exposes the apis routing service;
  • PrepareRun registers and binds the kapis routes and callback functions through the restful-go framework, which either respond directly or query the component servers, merge the data, and return it to the client;
  • Finally, call the Run method to synchronize resources and start the ks-apiserver service;

GVK exploration practice

Obviously, we only need to pay attention to the AddToContainer method of each module.

iam.kubesphere.io

pkg/kapis/iam/v1alpha2/register.go

From the code comments, this module manages the CRUD of accounts and roles, such as users, clustermembers, globalroles, clusterroles, workspaceroles, roles, workspace groups, workspace members, and devops members.

Now we can hit a breakpoint in the handler to request these apis.

$ curl "localhost:9090/kapis/iam.kubesphere.io/v1alpha2/users"
$ curl "localhost:9090/kapis/iam.kubesphere.io/v1alpha2/clustermembers"
$ curl "localhost:9090/kapis/iam.kubesphere.io/v1alpha2/users/admin/globalroles"
...

kubeedge.kubesphere.io

pkg/kapis/kubeedge/v1alpha1/register.go

The proxy used in the code forwards the request:

func AddToContainer(container *restful.Container, endpoint string) error {
	proxy, err := generic.NewGenericProxy(endpoint, GroupVersion.Group, GroupVersion.Version)
	if err != nil {
		return nil
	}

	return proxy.AddToContainer(container)
}

That is, requests to kapis/kubeedge.kubesphere.io will be forwarded to http://edge-watcher.kubeedge.svc/api/, i.e. the edge-watcher service under the kubeedge namespace, where the related interfaces are integrated.
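
Conceptually, such a generic proxy just strips the kapis group/version prefix and forwards the remaining path to the in-cluster endpoint. A minimal sketch with net/http/httputil (not the actual generic.NewGenericProxy implementation):

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// the endpoint configured in kubesphere.yaml for kubeedge
	endpoint, err := url.Parse("http://edge-watcher.kubeedge.svc/api/")
	if err != nil {
		log.Fatal(err)
	}
	prefix := "/kapis/kubeedge.kubesphere.io/v1alpha1"

	proxy := httputil.NewSingleHostReverseProxy(endpoint)
	http.Handle(prefix+"/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// strip the prefix so /kapis/kubeedge.kubesphere.io/v1alpha1/xxx becomes /api/xxx on the target
		r.URL.Path = strings.TrimPrefix(r.URL.Path, prefix)
		proxy.ServeHTTP(w, r)
	}))
	log.Fatal(http.ListenAndServe(":9090", nil))
}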

Regarding the integration of an edge computing platform, besides quickly installing and integrating a mainstream edge framework, an adapter similar to an edge-shim could also be integrated, which probably needs to be considered from the following aspects:

  • Proxy endpoint: The current kubeedge uses proxy mode forwarding;
  • Health check interface: at least make sure that the components in the cloud have been successfully deployed;
  • Support for observable components such as events, long-term logs, and auditing;
  • Other edge auxiliary functions, such as file or configuration delivery;

notification.kubesphere.io

pkg/kapis/notification/v2beta1/register.go

The APIs under this group mainly implement the CRUD of the global or tenant-level config and receiver resources of notification.

config resource

Configuration of the parameters for connecting to the notification channels, divided into global and tenant-level config resources;

receiver resources

Configuration information for the receivers, distinguishing global and tenant-level receivers;

We pick a callback function to analyze:

ws.Route(ws.GET("/{resources}").
		To(h.ListResource).
		Doc("list the notification configs or receivers").
		Metadata(KeyOpenAPITags, []string{constants.NotificationTag}).
		Param(ws.PathParameter("resources", "known values include configs, receivers, secrets")).
		Param(ws.QueryParameter(query.ParameterName, "name used for filtering").Required(false)).
		Param(ws.QueryParameter(query.ParameterLabelSelector, "label selector used for filtering").Required(false)).
		Param(ws.QueryParameter("type", "config or receiver type, known values include dingtalk, email, slack, webhook, wechat").Required(false)).
		Param(ws.QueryParameter(query.ParameterPage, "page").Required(false).DataFormat("page=%d").DefaultValue("page=1")).
		Param(ws.QueryParameter(query.ParameterLimit, "limit").Required(false)).
		Param(ws.QueryParameter(query.ParameterAscending, "sort parameters, e.g. ascending=false").Required(false).DefaultValue("ascending=false")).
		Param(ws.QueryParameter(query.ParameterOrderBy, "sort parameters, e.g. orderBy=createTime")).
		Returns(http.StatusOK, api.StatusOK, api.ListResult{Items: []interface{}{}}))
		
func (h *handler) ListResource(req *restful.Request, resp *restful.Response) {
	// name of the tenant or user
	user := req.PathParameter("user")
	// resource type: configs/receivers/secrets
	resource := req.PathParameter("resources")
	// notification channel: dingtalk/slack/email/webhook/wechat
	subresource := req.QueryParameter("type")
	q := query.ParseQueryParameter(req)

	if !h.operator.IsKnownResource(resource, subresource) {
		api.HandleBadRequest(resp, req, servererr.New("unknown resource type %s/%s", resource, subresource))
		return
	}

	objs, err := h.operator.List(user, resource, subresource, q)
	handleResponse(req, resp, objs, err)
}

Let's look at the logic of the list object:

// List objects.
func (o *operator) List(user, resource, subresource string, q *query.Query) (*api.ListResult, error) {
	if len(q.LabelSelector) > 0 {
		q.LabelSelector = q.LabelSelector + ","
	}

	filter := ""
	// if no tenant name is given, fetch the global objects
	if user == "" {
		if isConfig(o.GetObject(resource)) {
		    // type=default means global for config resources
			filter = "type=default"
		} else {
		    // type=global means global for receiver resources
			filter = "type=global"
		}
	} else {
	// otherwise, bind the tenant name to the filter
		filter = "type=tenant,user=" + user
	}
	// assemble the filter labels
	q.LabelSelector = q.LabelSelector + filter
	...
	// get the specified resources under the cluster or namespace via the filter labels
	res, err := o.resourceGetter.List(resource, ns, q)
	if err != nil {
		return nil, err
	}

	if subresource == "" || resource == Secret {
		return res, nil
	}

	results := &api.ListResult{}
    ...
}

In this way, the CRUD of the tenant-level notification alarm CR configuration is implemented. These CRs are classified as follows:

  • config is divided into two levels: global type = default, tenant type = tenant;
  • receiver is divided into two levels: global type = global, tenant type = tenant;

So how are config and receiver bound to each other, and how are alerts sent to tenants through channels?

https://github.com/kubesphere/notification-manager/blob/master/pkg/webhook/v1/handler.go#L45

https://github.com/kubesphere/notification-manager/blob/master/pkg/notify/notify.go#L66

notification-manager is referred to as nm below, and I will briefly answer this here, slightly out of context.

In terms of function:

  • The globally configured receiver sends all alerts to its defined recipient list through the configured channel, and the receiver configured with tenant information can only send alerts under the current ns through the channel;
  • In a receiver, you can further filter alarm messages by configuring the alertSelector parameter;
  • Customize the message template by modifying the configmap named notification-manager-template;

The process from alert to notification:

  • nm uses port 19093 and API path /api/v2/alerts to receive alerts sent from Alertmanager;
  • The callback function receives the alerts, converts them into notification template data, and groups the alert data by namespace;
  • It traverses all Receivers and starts a goroutine for each ns to send messages; since each ns here corresponds to multiple notification channels, a waitgroup is also used to orchestrate the tasks concurrently, as sketched below;
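
A minimal sketch of that fan-out pattern (the namespace/receiver data below is made up; the real notification-manager code is linked above):

package main

import (
	"fmt"
	"sync"
)

func main() {
	// hypothetical receivers grouped by namespace
	receiversByNamespace := map[string][]string{
		"default":           {"email", "slack"},
		"kubesphere-system": {"webhook"},
	}

	var wg sync.WaitGroup
	for ns, receivers := range receiversByNamespace {
		for _, r := range receivers {
			wg.Add(1)
			// one goroutine per namespace/receiver pair sends the notification concurrently
			go func(ns, receiver string) {
				defer wg.Done()
				fmt.Printf("namespace=%s receiver=%s sent\n", ns, receiver)
			}(ns, r)
		}
	}
	wg.Wait()
}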

monitoring.kubesphere.io

pkg/kapis/monitoring/v1alpha3/register.go

Monitoring metrics are divided into levels such as platform, node, workspace, namespace, and pod. You can obtain not only the aggregate statistics, but also metrics such as those of all pods/containers under a node/namespace/workspace.

Let's look at the callback function and take handleNamedMetricsQuery as an example:

  • Traverse the legal metric indicators under the given indicator level, and filter the indicator name according to the metricFilter in the request parameter;
  • Judging whether it is a range query or a real-time query, call the relevant methods in the monitoring package, and request the backend through the corresponding client to get the result and return it;

The code is as follows:

func (h handler) handleNamedMetricsQuery(resp *restful.Response, q queryOptions) {
	var res model.Metrics

	var metrics []string
	// q.namedMetrics is an array of full metric names, with promql expr definitions, grouped by monitoring level
	// the monitoring level is determined one stack frame up according to monitoring.Levelxxx, e.g. monitoring.LevelPod
	for _, metric := range q.namedMetrics {
		if strings.HasPrefix(metric, model.MetricMeterPrefix) {
			// skip meter metric
			continue
		}
		// filter by the metric name in the request parameters
		ok, _ := regexp.MatchString(q.metricFilter, metric)
		if ok {
			metrics = append(metrics, metric)
		}
	}
	if len(metrics) == 0 {
		resp.WriteAsJson(res)
		return
	}
	
	// decide whether this is a range query or an instant query, and call the related function
	// the prometheus client is mainly used to query promql; metrics of edge nodes are currently queried through the metrics server
	if q.isRangeQuery() {
		res = h.mo.GetNamedMetricsOverTime(metrics, q.start, q.end, q.step, q.option)
	} else {
		res = h.mo.GetNamedMetrics(metrics, q.time, q.option)
		if q.shouldSort() {
			res = *res.Sort(q.target, q.order, q.identifier).Page(q.page, q.limit)
		}
	}
	resp.WriteAsJson(res)
}

Now let's shift our attention to:

pkg/models/monitoring/monitoring.go:156

Taking GetNamedMetricsOverTime as an example, here is how the query results of prometheus and metrics-server are merged before being returned:

func (mo monitoringOperator) GetNamedMetricsOverTime(metrics []string, start, end time.Time, step time.Duration, opt monitoring.QueryOption) Metrics {
    // get the prometheus client query results; sync.WaitGroup is used to query concurrently, one goroutine per metric, and the results are merged and returned
	ress := mo.prometheus.GetNamedMetricsOverTime(metrics, start, end, step, opt)
	// if metrics-server is enabled
	if mo.metricsserver != nil {

		// merge edge node data
		edgeMetrics := make(map[string]monitoring.MetricData)

		for i, ressMetric := range ress {
			metricName := ressMetric.MetricName
			ressMetricValues := ressMetric.MetricData.MetricValues
			if len(ressMetricValues) == 0 {
				// this metric has no prometheus metrics data
				if len(edgeMetrics) == 0 {
					// start to request monitoring metricsApi data
					mr := mo.metricsserver.GetNamedMetricsOverTime(metrics, start, end, step, opt)
					for _, mrMetric := range mr {
						edgeMetrics[mrMetric.MetricName] = mrMetric.MetricData
					}
				}
				if val, ok := edgeMetrics[metricName]; ok {
					ress[i].MetricData.MetricValues = append(ress[i].MetricData.MetricValues, val.MetricValues...)
				}
			}
		}
	}

	return Metrics{Results: ress}
}

In addition, the monitoring package also defines the interface methods of each monitoring query client, which can be explored as needed:

  • GetMetric(expr string, time time.Time) Metric

  • GetMetricOverTime(expr string, start, end time.Time, step time.Duration) Metric

  • GetNamedMetrics(metrics []string, time time.Time, opt QueryOption) []Metric

  • GetNamedMetricsOverTime(metrics []string, start, end time.Time, step time.Duration, opt QueryOption) []Metric

  • GetMetadata(namespace string) []Metadata

  • GetMetricLabelSet(expr string, start, end time.Time) []map[string]string

tenant.kubesphere.io

Before talking about the api, a digression: by the degree of isolation, multi-tenancy can be divided into soft isolation (Soft Multi-tenancy) and hard isolation (Hard Multi-tenancy).

  • Soft isolation is more for multi-tenant needs within an enterprise;
  • Hard isolation is more for service providers that provide services to the outside world, and requires stricter isolation as a security guarantee.

The more important part of this group is implementing tenant-level queries of logs/audits/events:

Take the query log as an example:

func (h *tenantHandler) QueryLogs(req *restful.Request, resp *restful.Response) {
    // tenant information carried in the query context
	user, ok := request.UserFrom(req.Request.Context())
	if !ok {
		err := fmt.Errorf("cannot obtain user info")
		klog.Errorln(err)
		api.HandleForbidden(resp, req, err)
		return
	}
	// parse the query parameters, e.g. which ns/workload/pod/container the query belongs to, the time range, and whether it is a histogram query
	queryParam, err := loggingv1alpha2.ParseQueryParameter(req)
	if err != nil {
		klog.Errorln(err)
		api.HandleInternalError(resp, req, err)
		return
	}
	// export data
	if queryParam.Operation == loggingv1alpha2.OperationExport {
		resp.Header().Set(restful.HEADER_ContentType, "text/plain")
		resp.Header().Set("Content-Disposition", "attachment")
		// verify whether the account has permission
		// the admin account can export logs of all ns, while a tenant can only export logs of its own ns
		// assemble the loggingclient to export the logs
		err := h.tenant.ExportLogs(user, queryParam, resp)
		if err != nil {
			klog.Errorln(err)
			api.HandleInternalError(resp, req, err)
			return
		}
	} else {
		// verify whether the account has permission
		// the admin account can view logs of all ns, while a tenant can only view logs of its own ns
		// assemble the loggingclient to return the logs
		result, err := h.tenant.QueryLogs(user, queryParam)
		if err != nil {
			klog.Errorln(err)
			api.HandleInternalError(resp, req, err)
			return
		}
		resp.WriteAsJson(result)
	}
}

Due to limited space, only the above GVRs have been debugged. If you are interested, you can explore further~
