How to use Prometheus to monitor your application

As a complete open-source monitoring solution, Prometheus has become the de facto standard for monitoring thanks to its many powerful features and its open ecosystem, and it is widely deployed in applications worldwide. So how do we use Prometheus to build effective monitoring for our own application? In fact, all we need to do is have the application expose its monitoring data in the format Prometheus expects; the subsequent collection, processing, and storage of that data are all handled automatically by Prometheus.

Generally, Prometheus targets fall into two categories. If we control the application's source code, the Prometheus SDKs (available for many languages) can be integrated directly, so that the application's internal state is exposed in the standard Prometheus format. For applications such as MySQL or Nginx, modifying the code is quite difficult, but they already expose monitoring data in some form of their own. For these, we need an intermediate component that reads the raw monitoring data through the application's existing interface and converts it into the standard Prometheus format. Such intermediate components are called Exporters, and the community provides a large number of ready-made Exporters that can be used directly.

Using an HTTP server written in Go as an example, this article shows how to use the Prometheus Golang SDK to gradually add various monitoring metrics and expose the application's internal state in the standard way. The content is divided into two parts, basic and advanced. The first part is usually sufficient for most needs; if you want more customization, the second part provides a good reference.

1. Basics

The SDKs Prometheus provides for various languages are already quite polished. To make a Go program monitorable by Prometheus, all we need to do is import the client_golang package and add a few lines of code:

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}

As you can see, the code above simply starts an HTTP server and registers the default HTTP handler provided by client_golang on the /metrics path. Run the program and access that endpoint, and you get results like the following:

$ curl http://127.0.0.1:8080/metrics
...
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 7
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.12.1"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 418912
...

The default handler registered by the Prometheus Golang SDK automatically provides a series of metrics about the Go runtime and the application process. So even though we have not yet registered any custom metrics, a set of metrics is already exposed. The exposition format is very uniform: the line beginning with # HELP explains what the metric is for, the line beginning with # TYPE states the metric's type, and the following lines contain the metric's actual values. As long as Prometheus is given the address and path of a scrape target (usually /metrics), it can scrape monitoring data like that shown above. For users, custom metrics only require updating the corresponding values as the program runs; aggregating the data and exposing it to Prometheus is handled automatically by the SDK. Below, we add custom metrics of each type, driven by concrete needs.

For an HTTP server, knowing the current rate of incoming requests is very important. Prometheus supports a data type called Counter, which is essentially a monotonically increasing counter. If we define a Counter that records the cumulative number of HTTP requests received, its growth over a recent time window is exactly the receive rate. Moreover, Prometheus defines a custom query language, PromQL, that makes it easy to statistically analyze monitoring samples, including computing the rate of a Counter. The modified program looks like this:

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	http_request_total = promauto.NewCounter(
		prometheus.CounterOpts{
			Name: "http_request_total",
			Help: "The total number of processed http requests",
		},
	)
)

func main() {
	http.HandleFunc("/", func(http.ResponseWriter, *http.Request) {
		http_request_total.Inc()
	})
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}

We use the NewCounter method from the promauto package to define a Counter metric; filling in the name and help text is all it takes to create it. Note that Counter metric names should end with the _total suffix; otherwise, when Prometheus is integrated with other systems, the metric may not be recognized. Whenever a request hits the root path, the handler calls the metric's Inc() method to add one; we can also call the Add() method to add any non-negative amount.

Run the modified program again, access the root path a number of times, and then access the /metrics path; you can see that the newly defined metric has been successfully exposed:

$ curl http://127.0.0.1:8080/metrics | grep http_request_total
# HELP http_request_total The total number of processed http requests
# TYPE http_request_total counter
http_request_total 5
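Once Prometheus is scraping this counter, the receive rate discussed above can be computed with PromQL's rate() function. A minimal sketch (the 5m window is an arbitrary choice for illustration):

```
rate(http_request_total[5m])
```

This returns the per-second average rate of increase of the counter over the last five minutes, and it is robust to counter resets caused by process restarts.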

Monitoring only the cumulative number of processed requests is clearly not enough; we usually also want to know the number of requests currently being processed. Prometheus's Gauge type differs from Counter in that it can decrease as well as increase, which makes it a good fit for the number of in-flight requests. We therefore add the following code:

...
var (
	...
	http_request_in_flight = promauto.NewGauge(
		prometheus.GaugeOpts{
			Name: "http_request_in_flight",
			Help: "Current number of http requests in flight",
		},
	)
)
...
http.HandleFunc("/", func(http.ResponseWriter, *http.Request) {
	http_request_in_flight.Inc()
	defer http_request_in_flight.Dec()
	http_request_total.Inc()
})
...

Operating on a Gauge is much like operating on a Counter; the only difference is that Gauge additionally supports the Dec() and Sub() methods for decreasing the metric's value.
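Because a Gauge represents a momentary value rather than a cumulative one, rate() does not apply to it; in PromQL we usually read it directly or smooth it over time. A sketch, assuming the gauge defined above:

```
# current in-flight requests per instance
http_request_in_flight

# average in-flight requests over the last five minutes
avg_over_time(http_request_in_flight[5m])
```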

For a network service, knowing the average latency is important, but often we also want to know the distribution of response times. Prometheus's Histogram type provides good support for this need. The new code is as follows:

...
var (
	...
	http_request_duration_seconds = promauto.NewHistogram(
		prometheus.HistogramOpts{
			Name: "http_request_duration_seconds",
			Help: "Histogram of latencies for HTTP requests",
			// Buckets: []float64{.1, .2, .4, 1, 3, 8, 20, 60, 120},
		},
	)
)
...
http.HandleFunc("/", func(http.ResponseWriter, *http.Request) {
	now := time.Now()
	http_request_in_flight.Inc()
	defer http_request_in_flight.Dec()
	http_request_total.Inc()

	time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
	http_request_duration_seconds.Observe(time.Since(now).Seconds())
})
...

After accessing the HTTP server's root path a few more times, the /metrics path returns the following:

# HELP http_request_duration_seconds Histogram of latencies for HTTP requests
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.005"} 0
http_request_duration_seconds_bucket{le="0.01"} 0
http_request_duration_seconds_bucket{le="0.025"} 0
http_request_duration_seconds_bucket{le="0.05"} 0
http_request_duration_seconds_bucket{le="0.1"} 3
http_request_duration_seconds_bucket{le="0.25"} 3
http_request_duration_seconds_bucket{le="0.5"} 5
http_request_duration_seconds_bucket{le="1"} 8
http_request_duration_seconds_bucket{le="2.5"} 8
http_request_duration_seconds_bucket{le="5"} 8
http_request_duration_seconds_bucket{le="10"} 8
http_request_duration_seconds_bucket{le="+Inf"} 8
http_request_duration_seconds_sum 3.238809838
http_request_duration_seconds_count 8

The data a Histogram exposes is much more complex than that of a Counter or Gauge. The last two entries, ending in _sum and _count, record the sum of all observed response times and the total number of responses, respectively. The rows above them record cumulative counts per bucket: the number of responses within 0.005 seconds, within 0.01 seconds, within 0.025 seconds, and so on; the +Inf bucket counts responses of any duration, so its value always equals _count. Clearly, Histogram metrics present the distribution of the data nicely. The bucket boundaries shown here are the defaults; values such as 0.005 and 0.01 are typical for measuring the latency of a network service. For a specific scenario we can also customize them, as in the commented-out Buckets field in the code above (the +Inf bucket is always added automatically).
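Because the buckets are cumulative, PromQL can estimate arbitrary quantiles from them on the server side with histogram_quantile(). For example, a sketch estimating the 90th-percentile latency over the last five minutes (the window choice is illustrative):

```
histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[5m]))
```

The accuracy of the estimate depends on how well the bucket boundaries match the actual distribution, which is another reason to customize Buckets for your workload.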

Similar to Histogram, Prometheus defines a Summary type, which describes the distribution of the data from another angle. For response latency, we may want to know the median, the 90th percentile, and so on. A Summary metric is defined and used as follows:

...
var (
	...
	http_request_summary_seconds = promauto.NewSummary(
		prometheus.SummaryOpts{
			Name: "http_request_summary_seconds",
			Help: "Summary of latencies for HTTP requests",
			// Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001, 0.999: 0.0001},
		},
	)
)
...
http.HandleFunc("/", func(http.ResponseWriter, *http.Request) {
	now := time.Now()
	http_request_in_flight.Inc()
	defer http_request_in_flight.Dec()
	http_request_total.Inc()

	time.Sleep(time.Duration(rand.Intn(1000)) * time.Millisecond)
	http_request_duration_seconds.Observe(time.Since(now).Seconds())
	http_request_summary_seconds.Observe(time.Since(now).Seconds())
})
...

The definition and use of a Summary resemble those of a Histogram; the final result we get is as follows:

$ curl http://127.0.0.1:8080/metrics | grep http_request_summary
# HELP http_request_summary_seconds Summary of latencies for HTTP requests
# TYPE http_request_summary_seconds summary
http_request_summary_seconds{quantile="0.5"} 0.31810446
http_request_summary_seconds{quantile="0.9"} 0.887116164
http_request_summary_seconds{quantile="0.99"} 0.887116164
http_request_summary_seconds_sum 3.2388269649999994
http_request_summary_seconds_count 8

As before, _sum and _count record the total latency and the number of requests. Unlike a Histogram, the remaining lines of a Summary report quantiles directly: the median response time is 0.31810446 seconds, the 90th percentile is 0.887116164 seconds, and so on. We can customize which quantiles a Summary reports through the Objectives field, as in the commented-out line in the program above. Somewhat confusingly, it is a map whose keys are the quantiles and whose values are the permitted error. For example, with an objective of 0.5: 0.05, the value 0.31810446 seconds falls somewhere between the 0.45 and 0.55 quantiles of the observed data, not exactly at 0.5.

In fact, the Counter, Gauge, Histogram, and Summary types described above are all the metric types Prometheus supports (strictly speaking, there is also an Untyped type, indicating an unknown type). The two basic types, Counter and Gauge, are the most commonly used; combined with PromQL's powerful capabilities for analyzing and processing monitoring data, they already yield extremely rich monitoring information.

Sometimes, however, we want to measure a metric across several characteristic dimensions. For example, for the number of HTTP requests received, we may wish to know the count for each path. Suppose the paths / and /foo can be accessed; defining two separate Counters such as http_request_root_total and http_request_foo_total is clearly not a good approach. On the one hand it scales poorly: every new path requires defining a new metric, and there is often more than one dimension of interest (for example, how many requests returned a given status code), which this scheme cannot express at all. On the other hand, separately named metrics are awkward for PromQL to aggregate and analyze.

Prometheus's solution to this problem is to attach a label to the metric for each characteristic dimension; a label is essentially a key-value pair. A metric can carry multiple labels, and a metric together with a specific set of label values uniquely identifies a time series. For counting requests per path, the standard Prometheus solution looks like this:

...
var (
	http_request_total = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_request_total",
			Help: "The total number of processed http requests",
		},
		[]string{"path"},
	)
)
...
http.HandleFunc("/", func(http.ResponseWriter, *http.Request) {
	...
	http_request_total.WithLabelValues("root").Inc()
	...
})
http.HandleFunc("/foo", func(http.ResponseWriter, *http.Request) {
	...
	http_request_total.WithLabelValues("foo").Inc()
	...
})

Counter is used as the example here; the operations are the same for the other three types. This time we call NewCounterVec to define the metric, declaring a label named path; in the handlers for / and /foo, the WithLabelValues method specifies the label values root and foo respectively. If no time series exists yet for a given label value, a new one is created; after that, the operations are no different from an ordinary Counter. The final /metrics exposure looks like this:

$ curl http://127.0.0.1:8080/metrics | grep http_request_total
# HELP http_request_total The total number of processed http requests
# TYPE http_request_total counter
http_request_total{path="foo"} 9
http_request_total{path="root"} 5

As you can see, the metric http_request_total now corresponds to two time series, counting requests for the paths foo and root respectively. What if we then want the total across all paths? Do we need to define a path value total to represent the overall count? Clearly not: PromQL can easily aggregate a metric across any dimension, and the following query gives the total number of requests:

sum(http_request_total)

Labels are a simple yet powerful tool; in theory there is no limit on how many labels a Prometheus metric can carry. But the number of labels should not grow unchecked, because every extra label is one more thing users must consider when writing PromQL queries. A general rule of thumb: for any label we add, summing and averaging the metric over that label should remain meaningful.
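Labels combine naturally with the aggregation operators: grouping clauses such as by keep the chosen dimensions while summing away the rest. For example, a sketch of the per-path request rate rather than a single total (window choice illustrative):

```
sum by (path) (rate(http_request_total[5m]))
```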

2. Advanced

Based on the content above, we can define all kinds of metrics in our own applications and make sure Prometheus can scrape and process them. But sometimes we need more customization. Although the highly packaged APIs are convenient, they add things we may not want, such as the Go runtime and process metrics that the default handler registers. And when writing our own Exporter, how do we use the existing components to convert an application's native metrics into standard Prometheus metrics? To answer these questions, we need a somewhat deeper understanding of the internal mechanisms of the Prometheus SDK.

In the Prometheus SDK, Registry and Collector are the two core objects. A Collector can contain one or more Metrics; it is in fact a Go interface with the following two methods:

type Collector interface {
	Describe(chan<- *Desc)
	Collect(chan<- Metric)
}

Briefly, the Describe method sends a description of each Metric in the Collector through a channel, and the Collect method sends the concrete data of each Metric through a channel. Defining a Collector alone is not enough; we must also register it in a Registry. The Registry calls the Collector's Describe method to ensure the newly added Metrics do not conflict with those already registered. The Registry in turn must be associated with a concrete handler, so that when a user accesses the /metrics path, the handler calls the Collect method of every Collector registered in the Registry, obtains the metric data, and returns it.

The root cause of how conveniently we could define metrics above is that the promauto package does a great deal of packaging for us. For example, promauto.NewCounterVec is implemented as follows:

http_request_total = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "http_request_total",
		Help: "The total number of processed http requests",
	},
	[]string{"path"},
)

---

// client_golang/prometheus/promauto/auto.go
func NewCounterVec(opts prometheus.CounterOpts, labelNames []string) *prometheus.CounterVec {
	c := prometheus.NewCounterVec(opts, labelNames)
	prometheus.MustRegister(c)
	return c
}

---

// client_golang/prometheus/counter.go
func NewCounterVec(opts CounterOpts, labelNames []string) *CounterVec {
	desc := NewDesc(
		BuildFQName(opts.Namespace, opts.Subsystem, opts.Name),
		opts.Help,
		labelNames,
		opts.ConstLabels,
	)
	return &CounterVec{
		metricVec: newMetricVec(desc, func(lvs ...string) Metric {
			if len(lvs) != len(desc.variableLabels) {
				panic(makeInconsistentCardinalityError(desc.fqName, desc.variableLabels, lvs))
			}
			result := &counter{desc: desc, labelPairs: makeLabelPairs(desc, lvs)}
			result.init(result) // Init self-collection.
			return result
		}),
	}
}

A Counter (or a CounterVec, i.e. a Counter with labels) is in fact a concrete implementation of Collector; the description its Describe method provides is nothing more than the metric's name, help text, and label names as defined. After constructing the metric, promauto also calls prometheus.MustRegister(c) to register it. The prometheus package provides a default Registry, and prometheus.MustRegister registers the Collector directly into it. If we use promhttp.Handler() to serve the /metrics path, it associates the default Registry with the handler and registers the Go Collector and the Process Collector into the default Registry. So if we do not want these metrics injected automatically, we only need to construct our own handler.

Of course, the Registry and Collectors can also be fully custom. In particular, when writing an Exporter, we usually define all the metrics in a single Collector, fill them in from the results of calls to the application's native monitoring interface, and return them. With the SDK mechanisms above understood, we can implement a simple Exporter skeleton as follows:

package main

import (
	"net/http"
	"math/rand"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

type Exporter struct {
	up *prometheus.Desc
}

func NewExporter() *Exporter {
	namespace := "exporter"
	up := prometheus.NewDesc(
		prometheus.BuildFQName(namespace, "", "up"),
		"If scrape target is healthy",
		nil, nil,
	)
	return &Exporter{
		up: up,
	}
}

func (e *Exporter) Describe(ch chan<- *prometheus.Desc) {
	ch <- e.up
}

func (e *Exporter) Scrape() (up float64) {
	// Scrape raw monitoring data from the target; data format
	// conversion may be needed here.
	rand.Seed(time.Now().UnixNano())
	return float64(rand.Intn(2))
}

func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
	up := e.Scrape()
	ch <- prometheus.MustNewConstMetric(e.up, prometheus.GaugeValue, up)
}

func main() {
	registry := prometheus.NewRegistry()
	exporter := NewExporter()
	registry.MustRegister(exporter)
	http.Handle("/metrics", promhttp.HandlerFor(registry, promhttp.HandlerOpts{}))
	http.ListenAndServe(":8080", nil)
}

In this simplest possible Exporter, we create a new Registry, manually register the exporter Collector into it, and build a handler from the Registry to serve the /metrics path. When initializing the exporter, we only need to call NewDesc() to fill in the descriptive information for the metrics. When a user accesses /metrics, the full chain of calls eventually reaches Collect, which is where the native application's monitoring interface would be accessed to obtain the data; in a real Exporter, that work belongs in the Scrape() method. Finally, the returned native data is wrapped with MustNewConstMetric() into the Metric we need and sent back on the channel. Accessing /metrics on this Exporter yields:

$ curl http://127.0.0.1:8080/metrics
# HELP exporter_up If scrape target is healthy
# TYPE exporter_up gauge
exporter_up 1

3. Summary

As this article shows, with a small amount of development against the Prometheus SDK, an application can be effectively monitored by Prometheus, and we can then enjoy all the convenience the Prometheus monitoring ecosystem brings. The SDK also provides multiple levels of abstraction: in general, the highly packaged APIs quickly meet our needs, while for more custom requirements there are lower-level, more flexible APIs available.

For the sample code in this article, as well as detailed guidance on instrumenting applications for Prometheus and on writing Exporters, see the official Prometheus and client_golang documentation.


Origin www.cnblogs.com/YaoDD/p/11391316.html