Mainstream Enterprise Full-Link Monitoring Systems: OpenTelemetry (2)


4. Deployment (Python)

Now let’s connect a demo program to OTel step by step, using a Python application as the example.

Preparation (1/5)

The first step is to set up the environment and install the packages. With Python 3.6.8 installed on CentOS 7, use pip to install the packages required by OTel.

pip3 install flask
pip3 install opentelemetry-distro
cd /opt/
mkdir otel-demo
cd otel-demo

opentelemetry-distro brings together the API, SDK, opentelemetry-bootstrap, and opentelemetry-instrument packages, which will be used later.

Create HTTP Server (2/5)

[root@node-138 otel-demo]# cat app.py
#!/usr/bin/env python

from flask import Flask,request

app=Flask(__name__)

@app.route("/greeting")
def greeting():
    user = request.args.get("user", "DaoCloud")
    return "hello, Cloud Native! This is %s."%user

We will first use automatic instrumentation to connect OTel to this small Flask application.

Automatic instrumentation (3/5)

This approach replaces cumbersome manual wiring and automatically generates some basic telemetry. The simplest method is used here:

opentelemetry-bootstrap -a install

This command detects the installed libraries and automatically installs the corresponding instrumentation packages, so developers can "plug in" without code changes. Running it without arguments only lists them:

[root@node-138 otel-demo]# opentelemetry-bootstrap
opentelemetry-instrumentation-aws-lambda==0.33b0
opentelemetry-instrumentation-dbapi==0.33b0
opentelemetry-instrumentation-logging==0.33b0
opentelemetry-instrumentation-sqlite3==0.33b0
opentelemetry-instrumentation-urllib==0.33b0
opentelemetry-instrumentation-wsgi==0.33b0
opentelemetry-instrumentation-flask==0.33b0
opentelemetry-instrumentation-jinja2==0.33b0

The help output of opentelemetry-instrument shows many options. Each corresponds to an environment variable that OTel can read, and values passed on the command line take precedence over the values configured through the environment.

[root@node-138 ~]# opentelemetry-instrument -h
usage: opentelemetry-instrument [-h]
                                [--attribute_count_limit ATTRIBUTE_COUNT_LIMIT]
                                [--attribute_value_length_limit ATTRIBUTE_VALUE_LENGTH_LIMIT]
...
                                command ...

...
  --exporter_otlp_traces_certificate EXPORTER_OTLP_TRACES_CERTIFICATE
  --exporter_otlp_traces_compression EXPORTER_OTLP_TRACES_COMPRESSION
  --exporter_otlp_traces_endpoint EXPORTER_OTLP_TRACES_ENDPOINT
...
  --traces_exporter TRACES_EXPORTER
  --version             print version information

...
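The precedence described above (command-line flags over OTEL_* environment variables) can be sketched in plain Python. This is a hypothetical illustration of the rule, not the agent's actual code:

```python
import os

def resolve(cli_value, env_name, default=None):
    """A command-line flag, if given, wins over the matching OTEL_* env var."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name, default)

os.environ["OTEL_TRACES_EXPORTER"] = "otlp"
print(resolve(None, "OTEL_TRACES_EXPORTER"))       # otlp (from the environment)
print(resolve("console", "OTEL_TRACES_EXPORTER"))  # console (the flag wins)
```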

At this point, the application is already hooked into OTel. Let's start the service and see the effect.

opentelemetry-instrument --traces_exporter console --metrics_exporter console flask run

After the startup command is executed, open another window and send a request to the HTTP Server to see the final effect.

[root@node-138 ~]# curl 127.0.0.1:5000/greeting
hello, Cloud Native! This is DaoCloud.

Main window

[root@node-138 otel-demo]# opentelemetry-instrument --traces_exporter console --metrics_exporter console flask run
You are using Python 3.6. This version does not support timestamps with nanosecond precision and the OpenTelemetry SDK will use millisecond precision instead. Please refer to PEP 564 for more information. Please upgrade to Python 3.7 or newer to use nanosecond precision.
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [30/Aug/2023 08:44:57] "GET /greeting HTTP/1.1" 200 -
{
    "name": "/greeting",
    "context": {
        "trace_id": "0x72c633526437624bfb78df6f5e1210c3",
        "span_id": "0x2b65163734329c43",
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2023-08-30T00:44:57.362599Z",
    "end_time": "2023-08-30T00:44:57.364329Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.server_name": "127.0.0.1",
        "http.scheme": "http",
        "net.host.port": 5000,
        "http.host": "127.0.0.1:5000",
        "http.target": "/greeting",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "curl/7.29.0",
        "net.peer.port": 47178,
        "http.flavor": "1.1",
        "http.route": "/greeting",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.12.0",
            "telemetry.auto.version": "0.33b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}
{"resource_metrics": [{"resource": {"attributes": {"telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.12.0", "telemetry.auto.version": "0.33b0", "service.name": "unknown_service"}, "schema_url": ""}, "scope_metrics": [{"scope": {"name": "opentelemetry.instrumentation.flask", "version": "0.33b0", "schema_url": ""}, "metrics": [{"name": "http.server.active_requests", "description": "measures the number of concurrent HTTP requests that are currently in-flight", "unit": "requests", "data": {"data_points": [{"attributes": {"http.method": "GET", "http.host": "127.0.0.1:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1"}, "start_time_unix_nano": 1693356297362655744, "time_unix_nano": 1693356352010139136, "value": 0}], "aggregation_temporality": 2, "is_monotonic": false}}, {"name": "http.server.duration", "description": "measures the duration of the inbound HTTP request", "unit": "ms", "data": {"data_points": [{"attributes": {"http.method": "GET", "http.host": "127.0.0.1:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1", "net.host.port": 5000, "http.status_code": 200}, "start_time_unix_nano": 1693356297364467968, "time_unix_nano": 1693356352010139136, "count": 1, "sum": 2, "bucket_counts": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], "explicit_bounds": [0.0, 5.0, 10.0, 25.0, 50.0, 75.0, 100.0, 250.0, 500.0, 1000.0], "min": 2, "max": 2}], "aggregation_temporality": 2}}], "schema_url": ""}], "schema_url": ""}]}

The above are, respectively, the trace and metrics data for one request. This information comprises the "built-in" observations that automatic instrumentation provides: the life-cycle data of the /greeting request, plus metrics such as http.server.active_requests and http.server.duration. But this alone is far from enough to build real observability for production applications.
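The http.server.duration histogram above can be understood with a small stdlib sketch of how a value is assigned to a bucket. Each bucket i covers the interval (bounds[i-1], bounds[i]], and bucket_counts has one more slot than explicit_bounds for overflow values:

```python
from bisect import bisect_left

# The explicit_bounds (in ms) from the http.server.duration output above.
EXPLICIT_BOUNDS = [0.0, 5.0, 10.0, 25.0, 50.0, 75.0, 100.0, 250.0, 500.0, 1000.0]

def bucket_index(duration_ms, bounds=EXPLICIT_BOUNDS):
    """Bucket i covers (bounds[i-1], bounds[i]]; index len(bounds) is overflow."""
    return bisect_left(bounds, duration_ms)

counts = [0] * (len(EXPLICIT_BOUNDS) + 1)
counts[bucket_index(2)] += 1  # the single 2 ms request from the output above
print(counts)  # [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0] -- matches bucket_counts
```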

Add observation items manually (4/5)

Automatic instrumentation provides some basic observations, but this data mostly comes from the "edges" of the system, such as inbound and outbound HTTP requests, and cannot show what happens inside the application. Below, we build on automatic instrumentation and manually add some observation capabilities.

  • Traces
[root@node-138 otel-demo]# cat app.py
#!/usr/bin/env python

from flask import Flask,request
from opentelemetry import trace

app=Flask(__name__)
tracer=trace.get_tracer(__name__)

@app.route("/greeting")
def greeting():
  with tracer.start_as_current_span("greeting") as greeting_span:
    user = request.args.get("user", "DaoCloud")
    greeting_words = "hello, Cloud Native! This is %s."%user
    greeting_span.set_attribute("greeting.content",greeting_words)
    greeting_span.set_attribute("greeting.person",user)
    return greeting_words

Initialize a tracer and create a span (this span is a child of the span created by automatic instrumentation).

Run the service again and send a request.

[root@node-138 otel-demo]# opentelemetry-instrument --traces_exporter console --metrics_exporter console flask run
You are using Python 3.6. This version does not support timestamps with nanosecond precision and the OpenTelemetry SDK will use millisecond precision instead. Please refer to PEP 564 for more information. Please upgrade to Python 3.7 or newer to use nanosecond precision.
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployme                                                          nt.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [30/Aug/2023 08:57:40] "GET /greeting HTTP/1.1" 200 -
{
    "name": "greeting",
    "context": {
        "trace_id": "0x5e489ef3e6d5ba767676bb23ff31842f",
        "span_id": "0xb2fbc81eb8ae804d",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x7553887517e8ff6f",                                # parent_id
    "start_time": "2023-08-30T00:57:40.131915Z",
    "end_time": "2023-08-30T00:57:40.132000Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "greeting.content": "hello, Cloud Native! This is DaoCloud.", # new attribute
        "greeting.person": "DaoCloud"                                 # new attribute
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.12.0",
            "telemetry.auto.version": "0.33b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}
{
    "name": "/greeting",
    "context": {
        "trace_id": "0x5e489ef3e6d5ba767676bb23ff31842f",
        "span_id": "0x7553887517e8ff6f",                              # span_id
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2023-08-30T00:57:40.130697Z",
    "end_time": "2023-08-30T00:57:40.132480Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.server_name": "127.0.0.1",
        "http.scheme": "http",
        "net.host.port": 5000,
        "http.host": "127.0.0.1:5000",
        "http.target": "/greeting",
        "net.peer.ip": "127.0.0.1",
        "http.user_agent": "curl/7.29.0",
        "net.peer.port": 47186,
        "http.flavor": "1.1",
        "http.route": "/greeting",
        "http.status_code": 200
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "telemetry.sdk.language": "python",
            "telemetry.sdk.name": "opentelemetry",
            "telemetry.sdk.version": "1.12.0",
            "telemetry.auto.version": "0.33b0",
            "service.name": "unknown_service"
        },
        "schema_url": ""
    }
}

You can see that the parent_id of the manually created greeting span matches the span_id of the /greeting span, indicating the parent-child relationship.
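That linkage can be sketched with plain Python over the console output above (the span IDs below are hard-coded from the example output):

```python
# The two spans printed above, reduced to the fields that matter here.
spans = [
    {"name": "greeting", "span_id": "0xb2fbc81eb8ae804d",
     "parent_id": "0x7553887517e8ff6f"},
    {"name": "/greeting", "span_id": "0x7553887517e8ff6f", "parent_id": None},
]

def children_of(parent_name, spans):
    """Return names of spans whose parent_id equals parent_name's span_id."""
    parent = next(s for s in spans if s["name"] == parent_name)
    return [s["name"] for s in spans if s["parent_id"] == parent["span_id"]]

print(children_of("/greeting", spans))  # ['greeting']
```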

  • Metrics
[root@node-138 otel-demo]# cat app.py
#!/usr/bin/env python

from flask import Flask,request
from opentelemetry import trace,metrics

app=Flask(__name__)
tracer=trace.get_tracer(__name__)
meter=metrics.get_meter(__name__)
greeting_counter=meter.create_counter(
    "greeting_counter",
    description="The number of greeting times of each person",
)

@app.route("/greeting")
def greeting():
  with tracer.start_as_current_span("greeting") as greeting_span:
    user = request.args.get("user", "DaoCloud")
    greeting_words = "hello, Cloud Native! This is %s."%user
    greeting_span.set_attribute("greeting.content",greeting_words)
    greeting_span.set_attribute("greeting.person",user)
    greeting_counter.add(1, {"greeting.persion": user})
    return greeting_words

Initialize a meter and create a Counter that counts the number of calls per user parameter.

Run the service again, this time sending multiple different requests.

curl http://localhost:5000/greeting  # 1
curl http://localhost:5000/greeting	 # 2
curl http://localhost:5000/greeting	 # 3
curl http://localhost:5000/greeting?user=sulun  # 4

Observation results

[root@node-138 otel-demo]# opentelemetry-instrument --traces_exporter console --                                                          metrics_exporter console flask run
...
{"resource_metrics": [{"resource": {"attributes": {"telemetry.sdk.language": "python", "telemetry.sdk.name": "opentelemetry", "telemetry.sdk.version": "1.12.0", "telemetry.auto.version": "0.33b0", "service.name": "unknown_service"}, "schema_url": ""}, "scope_metrics": [{"scope": {"name": "opentelemetry.instrumentation.flask", "version": "0.33b0", "schema_url": ""}, "metrics": [{"name": "http.server.active_requests", "description": "measures the number of concurrent HTTP requests that are currently in-flight", "unit": "requests", "data": {"data_points": [{"attributes": {"http.method": "GET", "http.host": "127.0.0.1:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1"}, "start_time_unix_nano": 1693357979607876096, "time_unix_nano": 1693357993432316672, "value": 0}, {"attributes": {"http.method": "GET", "http.host": "localhost:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1"}, "start_time_unix_nano": 1693357979607876096, "time_unix_nano": 1693357993432316672, "value": 0}], "aggregation_temporality": 2, "is_monotonic": false}}, {"name": "http.server.duration", "description": "measures the duration of the inbound HTTP request", "unit": "ms", "data": {"data_points": [{"attributes": {"http.method": "GET", "http.host": "127.0.0.1:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1", "net.host.port": 5000, "http.status_code": 200}, "start_time_unix_nano": 1693357979609903616, "time_unix_nano": 1693357993432316672, "count": 3, "sum": 4, "bucket_counts": [0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0], "explicit_bounds": [0.0, 5.0, 10.0, 25.0, 50.0, 75.0, 100.0, 250.0, 500.0, 1000.0], "min": 1, "max": 2}, {"attributes": {"http.method": "GET", "http.host": "localhost:5000", "http.scheme": "http", "http.flavor": "1.1", "http.server_name": "127.0.0.1", "net.host.port": 5000, "http.status_code": 200}, "start_time_unix_nano": 1693357979609903616, "time_unix_nano": 1693357993432316672, "count": 1, "sum": 1, "bucket_counts": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0], "explicit_bounds": [0.0, 5.0, 10.0, 25.0, 50.0, 75.0, 100.0, 250.0, 500.0, 1000.0], "min": 1, "max": 1}], "aggregation_temporality": 2}}], "schema_url": ""}, {"scope": {"name": "app", "version": "", "schema_url": ""}, "metrics": [{"name": "greeting_counter", "description": "The number of greeting times of each person", "unit": "", "data": {"data_points": [{"attributes": {"greeting.persion": "DaoCloud"}, "start_time_unix_nano": 1693357979609319424, "time_unix_nano": 1693357993432316672, "value": 3}, {"attributes": {"greeting.persion": "sulun"}, "start_time_unix_nano": 1693357979609319424, "time_unix_nano": 1693357993432316672, "value": 1}], "aggregation_temporality": 2, "is_monotonic": true}}], "schema_url": ""}], "schema_url": ""}]}

This time, three default calls and one call with a user parameter were sent; the Counter we configured recorded them and produced the metric data above.
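What the Counter instrument does per attribute set can be mimicked with the stdlib: each add(1, {...}) increments the series keyed by its attributes. A minimal sketch of that aggregation:

```python
from collections import Counter

requests = ["DaoCloud", "DaoCloud", "DaoCloud", "sulun"]  # the four curls above

series = Counter()
for user in requests:
    # Mirrors greeting_counter.add(1, {"greeting.persion": user}) in app.py:
    # one monotonic series per distinct attribute set.
    series[("greeting.persion", user)] += 1

print(series[("greeting.persion", "DaoCloud")])  # 3
print(series[("greeting.persion", "sulun")])     # 1
```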

The steps above are the initial process of introducing OTel and manually adding trace and metric data. As for the Logs module: since we have so far only exported to the console, sending logs to the console looks the same as ordinary log printing, so this section does not demonstrate it. Next, let's look at how the Logs module behaves once OTLP is involved.

Send OTLP protocol data to Collector (5/5)

The key functions of the Collector were introduced earlier. Here are some of the benefits a Collector brings:

  • It lets multiple services share one telemetry pipeline, reducing the cost of switching exporters.
  • It can aggregate the same trace data across hosts and services.
  • It acts as a staging point in front of the backend, making it possible to analyze and filter data before it lands there.

Let's demonstrate wiring up a simple Collector.

First, create a tmp/ directory alongside app.py, and in it create an otel-collector-config.yaml file.

[root@node-138 tmp]# cat otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  logging:
    loglevel: debug
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
      processors: [batch]
    metrics:
      receivers: [otlp]
      exporters: [logging]
      processors: [batch]

Next, pull the basic Collector image provided by OTel and start a Collector.

docker run -p 4317:4317 -v /opt/otel-demo/tmp/otel-collector-config.yaml:/etc/otel-collector-config.yaml  otel/opentelemetry-collector:latest  --config=/etc/otel-collector-config.yaml

The command is straightforward: it mounts the configuration file and publishes the Collector's default gRPC port, 4317.

docker run -p 4317:4317 -v /opt/otel-demo/tmp/otel-collector-config.yaml:/etc/otel-collector-config.yaml  otel/opentelemetry-collector:latest  --config=/etc/otel-collector-config.yaml
2023-08-30T01:48:04.874Z        info    service/telemetry.go:84 Setting up own telemetry...
...
2023-08-30T01:48:04.878Z        info    [email protected]/otlp.go:83 Starting GRPC server    {"kind": "receiver", "name": "otlp", "data_type": "metrics", "endpoint": "0.0.0.0:4317"}

Next, the telemetry captured automatically and manually needs to be serialized in the OTLP format and sent to the Collector. This requires an OTLP exporter, and the OTel community provides a ready-made package for it.

pip3 install opentelemetry-exporter-otlp

Now launch the application again.

opentelemetry-instrument flask run

There is no longer any need to specify parameters such as --traces_exporter console, because automatic instrumentation again does the work: the instrument agent detects the newly installed package and, on the next startup, switches to the OTLP/gRPC exporter for you. Its default target port is also 4317.

When the service is requested again, the Collector process receives OTLP-format data, while the original Flask process no longer prints telemetry.

  • Traces
Resource SchemaURL:
Resource attributes:
     -> telemetry.sdk.language: Str(python)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(1.12.0)
     -> telemetry.auto.version: Str(0.33b0)
     -> service.name: Str(unknown_service)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope app
Span #0
    Trace ID       : 9d67e3b1f2e6f40a0925fd7951bc0942
    Parent ID      : f73a56bacf0c1a05
    ID             : 5e764b6a40cab9c3
    Name           : greeting
    Kind           : Internal
    Start time     : 2023-08-30 01:49:22.416530944 +0000 UTC
    End time       : 2023-08-30 01:49:22.416732416 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> greeting.content: Str(hello, Cloud Native! This is DaoCloud.)
     -> greeting.person: Str(DaoCloud)
ScopeSpans #1
ScopeSpans SchemaURL:
InstrumentationScope opentelemetry.instrumentation.flask 0.33b0
Span #0
    Trace ID       : 9d67e3b1f2e6f40a0925fd7951bc0942
    Parent ID      :
    ID             : f73a56bacf0c1a05
    Name           : /greeting
    Kind           : Server
    Start time     : 2023-08-30 01:49:22.414059776 +0000 UTC
    End time       : 2023-08-30 01:49:22.417396224 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> http.method: Str(GET)
     -> http.server_name: Str(127.0.0.1)
     -> http.scheme: Str(http)
     -> net.host.port: Int(5000)
     -> http.host: Str(127.0.0.1:5000)
     -> http.target: Str(/greeting)
     -> net.peer.ip: Str(127.0.0.1)
     -> http.user_agent: Str(curl/7.29.0)
     -> net.peer.port: Int(47980)
     -> http.flavor: Str(1.1)
     -> http.route: Str(/greeting)
     -> http.status_code: Int(200)
        {"kind": "exporter", "data_type": "traces", "name": "logging"}
2023-08-30T01:49:34.750Z        info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 3, "data points": 3}
2023-08-30T01:49:34.751Z        info    ResourceMetrics #0
  • Metrics
Resource SchemaURL:
Resource attributes:
     -> telemetry.sdk.language: Str(python)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(1.12.0)
     -> telemetry.auto.version: Str(0.33b0)
     -> service.name: Str(unknown_service)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope opentelemetry.instrumentation.flask 0.33b0
Metric #0
Descriptor:
     -> Name: http.server.active_requests
     -> Description: measures the number of concurrent HTTP requests that are currently in-flight
     -> Unit: requests
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.method: Str(GET)
     -> http.host: Str(127.0.0.1:5000)
     -> http.scheme: Str(http)
     -> http.flavor: Str(1.1)
     -> http.server_name: Str(127.0.0.1)
StartTimestamp: 2023-08-30 01:49:22.414117888 +0000 UTC
Timestamp: 2023-08-30 01:49:34.58335872 +0000 UTC
Value: 0
Metric #1
Descriptor:
     -> Name: http.server.duration
     -> Description: measures the duration of the inbound HTTP request
     -> Unit: ms
     -> DataType: Histogram
     -> AggregationTemporality: Cumulative
HistogramDataPoints #0
Data point attributes:
     -> http.method: Str(GET)
     -> http.host: Str(127.0.0.1:5000)
     -> http.scheme: Str(http)
     -> http.flavor: Str(1.1)
     -> http.server_name: Str(127.0.0.1)
     -> net.host.port: Int(5000)
     -> http.status_code: Int(200)
StartTimestamp: 2023-08-30 01:49:22.417513728 +0000 UTC
Timestamp: 2023-08-30 01:49:34.58335872 +0000 UTC
Count: 1
Sum: 3.000000
Min: 3.000000
Max: 3.000000
ExplicitBounds #0: 0.000000
ExplicitBounds #1: 5.000000
ExplicitBounds #2: 10.000000
ExplicitBounds #3: 25.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 75.000000
ExplicitBounds #6: 100.000000
ExplicitBounds #7: 250.000000
ExplicitBounds #8: 500.000000
ExplicitBounds #9: 1000.000000
Buckets #0, Count: 0
Buckets #1, Count: 1
Buckets #2, Count: 0
Buckets #3, Count: 0
Buckets #4, Count: 0
Buckets #5, Count: 0
Buckets #6, Count: 0
Buckets #7, Count: 0
Buckets #8, Count: 0
Buckets #9, Count: 0
Buckets #10, Count: 0
ScopeMetrics #1
ScopeMetrics SchemaURL:
InstrumentationScope app
Metric #0
Descriptor:
     -> Name: greeting_counter
     -> Description: The number of greeting times of each person
     -> Unit:
     -> DataType: Sum
     -> IsMonotonic: true
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> greeting.persion: Str(DaoCloud)
StartTimestamp: 2023-08-30 01:49:22.416660736 +0000 UTC
Timestamp: 2023-08-30 01:49:34.58335872 +0000 UTC
Value: 1
        {"kind": "exporter", "data_type": "metrics", "name": "logging"}
  • Logs
    The Logs SDK and related components are still under development, so connecting logs from the app to the Collector is not yet as convenient as for the other two signals.

Edit the otel-collector-config.yaml file to add a logs pipeline. In the application, on top of the steps above, create a log_emitter_provider, attach an OTLP exporter to it (this step is essential for the Collector to receive the logs the application emits), create a log_emitter from it, and finally add a LoggingHandler built on that log_emitter to the logging handlers. (Note that in a Flask application you also need to keep a Stream-type handler among the logging handlers, otherwise werkzeug's output is affected.)
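The Collector-side change can be sketched as a logs pipeline added to otel-collector-config.yaml; this fragment is an assumption consistent with the full configuration shown in section 5 below:

```yaml
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [logging]
```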

Rerun the Collector container and application:

docker run -p 4317:4317 -v /opt/otel-demo/tmp/otel-collector-config.yaml:/etc/otel-collector-config.yaml  otel/opentelemetry-collector:latest  --config=/etc/otel-collector-config.yaml

opentelemetry-instrument flask run

Request the /greeting interface to see the results in the Collector.
Notice that the Collector only receives log messages at INFO level and above; this is because loglevel=INFO was set earlier in the otel-collector-config.yaml file.
Flask and werkzeug are used as examples here; logs from other libraries, frameworks, or the system would also be sent to the Collector.
Logs recorded within a span context carry the trace ID and span ID after passing through the OTLP pipeline, allowing all observation data to be linked together.

5. Deploy OpenTelemetry + Prometheus + Grafana

This time, docker-compose is used for rapid deployment.

Theory: collect trace and metric data through the OpenTelemetry Collector, expose the metrics in a format Prometheus can scrape, and finally display them in Grafana.

Modify the configuration files (1/5)

otel-collector-config.yaml

[root@node-138 tmp]# cat otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:
        cors:
          allowed_origins:
            - http://*
            - https://*
exporters:
  logging:
    loglevel: debug
  prometheus:
    endpoint: "0.0.0.0:8889"
    const_labels:
      label1: value1
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging]
      processors: [batch]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
      processors: [batch]
    logs:
      receivers: [otlp]
      exporters: [logging]

docker-compose.yaml

[root@node-138 tmp]# cat docker-compose.yaml
version: '3.3'

services:
    otel-collector:
        image: otel/opentelemetry-collector:0.50.0
        command: ["--config=/etc/otel-collector-config.yaml"]
        volumes:
            - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
        ports:
            - "1888:1888"   # pprof extension
            - "8888:8888"   # Prometheus metrics exposed by the collector
            - "8889:8889"   # Prometheus exporter metrics
            - "13133:13133" # health_check extension
            - "4317:4317"        # OTLP gRPC receiver
            - "4318:4318"        # OTLP http receiver
            - "55670:55679" # zpages extension
    prometheus:
        container_name: prometheus
        image: prom/prometheus:latest
        volumes:
            - ./prometheus.yaml:/etc/prometheus/prometheus.yml
        ports:
            - "9090:9090"
    grafana:
        container_name: grafana
        image: grafana/grafana
        ports:
            - "3000:3000"

prometheus.yaml

[root@node-138 tmp]# cat prometheus.yaml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

  - job_name: 'otel-collector'
    scrape_interval: 10s
    static_configs:
      - targets: ['192.168.17.138:8889']
      - targets: ['192.168.17.138:8888']

Start the containers (2/5)

[root@node-138 tmp]# docker-compose up -d
[root@node-138 tmp]# docker ps
CONTAINER ID   IMAGE                                 COMMAND                  CREATED       STATUS       PORTS                                                                                                                                                       NAMES
417ec6e93078   otel/opentelemetry-collector:0.50.0   "/otelcol --config=/…"   2 hours ago   Up 2 hours   0.0.0.0:1888->1888/tcp, 0.0.0.0:4317-4318->4317-4318/tcp, 0.0.0.0:8888-8889->8888-8889/tcp, 0.0.0.0:13133->13133/tcp, 55678/tcp, 0.0.0.0:55670->55679/tcp   tmp_otel-collector_1
a2a8014cb136   grafana/grafana                       "/run.sh"                2 hours ago   Up 2 hours   0.0.0.0:3000->3000/tcp                                                                                                                                      grafana
1064669ca884   prom/prometheus:latest                "/bin/prometheus --c…"   2 hours ago   Up 2 hours   0.0.0.0:9090->9090/tcp                                                                                                                                      prometheus

Start business services (3/5)

We still use the Python Flask program from the previous section. Note again that no exporter parameters need to be specified; the Collector will receive the metric data for us.

[root@node-138 tmp]# cat ../app.py
#!/usr/bin/env python

from flask import Flask,request
from opentelemetry import trace,metrics

app=Flask(__name__)
tracer=trace.get_tracer(__name__)
meter=metrics.get_meter(__name__)
greeting_counter=meter.create_counter(
    "greeting_counter",
    description="The number of greeting times of each person",
)

@app.route("/greeting")
def greeting():
  with tracer.start_as_current_span("greeting") as greeting_span:
    user = request.args.get("user", "DaoCloud")
    greeting_words = "hello, Cloud Native! This is %s."%user
    greeting_span.set_attribute("greeting.content",greeting_words)
    greeting_span.set_attribute("greeting.person",user)
    greeting_counter.add(1, {"greeting.persion": user})
    return greeting_words
[root@node-138 tmp]# opentelemetry-instrument flask run

Access the service:

curl http://localhost:5000/greeting  # 1
curl http://localhost:5000/greeting	 # 2
curl http://localhost:5000/greeting	 # 3
curl http://localhost:5000/greeting?user=sulun  # 4

View data in Prometheus (4/5)

Visit the Prometheus web UI at http://192.168.17.138:9090/ to view the metric data; the greeting_counter information is visible there. (Screenshots omitted.)

Check out Grafana (5/5)

(Grafana dashboard screenshot omitted.)

Origin blog.csdn.net/u010230019/article/details/132580704