InfluxDB: Using templates, scheduled tasks, dashboards, service process parameters, and migrating data

Using InfluxDB Templates

What are InfluxDB Templates

An InfluxDB template is a YAML configuration file. It bundles a complete set of dashboards, Telegraf configurations, and alert configurations. InfluxDB templates strive to be usable out of the box: download the YAML file, import it into InfluxDB, and everything from data collection to monitoring and alerting is created for you.

InfluxDB officially maintains a collection of templates on GitHub. Before developing your own, it is worth browsing there to see if something can be used directly.

https://github.com/influxdata/community-templates

Example: Rapid deployment using templates

In this example, we will use a community template to quickly set up Docker monitoring. To complete this example, you should already be familiar with Docker.

(1) Find the Docker template document

Visit https://github.com/influxdata/community-templates from the previous section, find the docker template directory, and click into it.

As you can see, there is a section titled Quick install which contains detailed configuration instructions.


(2) Install template

Use influx-cli to install the template.

influx apply -f https://raw.githubusercontent.com/influxdata/community-templates/master/docker/docker.yml

After the command executes, the following prompt appears, asking whether you want to create the resources listed.


The resources referred to here include the bucket to be created in your InfluxDB, the scheduled tasks and alert tasks, the dashboards, and so on. If you are working on a production InfluxDB with similar workloads, review this list carefully to avoid duplicate bucket names.

After confirming everything looks right, type y and press Enter.


If the output that follows ends with Stack ID: xxxx, the installation succeeded! A Stack here is simply an instance of a template.

(3) View the installation results

Now, let's open the InfluxDB Web UI and take a look at what the template imported.

1) Bucket

There is a bucket called docker:


2) telegraf configuration

The template created a Telegraf configuration named Docker Monitor. You may need to modify this configuration to match your Docker setup.


3) Dashboard

The template also created a Docker dashboard for us, but there is no data in our Bucket yet, so its charts display nothing for now.


4) Alarm rules

The template also sets up 4 alert rules for us. According to their descriptions, they fire when:

  • container CPU usage exceeds 80% for 15 minutes

  • container disk usage exceeds 80%

  • container memory usage exceeds 80% for 15 minutes

  • a container exits with a non-zero (abnormal) status


(4) Run Telegraf to collect data

Now we are going to run Telegraf with the configuration file from the docker template. Earlier, though, we wrote a host_tel.sh start/stop script in the ~/bin directory, which ensures that only one telegraf instance runs globally. So, to avoid confusion, we need to stop the previous telegraf now.

host_tel.sh stop

Next, we write a new script.

Still in the ~/bin directory, create the docker_tel.sh file. Type the following:

#!/bin/bash
export INFLUX_TOKEN=h106QMEj47juNUco-6T-op1Tzz0IeMh5MhBIDT8vUdv1R3BVeAzMvWGq2DtmJIcyuPwvPmHTLbZLTbnKxz3UKA==
export INFLUX_HOST=http://localhost:8086/
export INFLUX_ORG=atguigu
/opt/module/telegraf-1.23.4/usr/bin/telegraf --config http://localhost:8086/api/v2/telegrafs/09edf888eeeb6000

As the docker template requires, we must set three environment variables before running telegraf: INFLUX_TOKEN, INFLUX_HOST, and INFLUX_ORG.

Then, we modify the execution permissions of docker_tel.sh.

 chmod 755 ./docker_tel.sh

Finally, start docker_tel.sh:

./docker_tel.sh

(5) Check the template effect

First, you can check whether there is data in the docker bucket in DataExplorer.


As shown in the figure, the data has been successfully entered into InfluxDB.

Next, we can take a look at the status of the dashboard, as shown below. The dashboard also displays data successfully.


(6) Run a docker container

Run a docker starter container using the command below.

docker run -dp 80:80 docker/getting-started

If the docker/getting-started image is not present on your host, Docker will pull it from Docker Hub. Depending on your network, the pull may be slow; if it fails, consider configuring a registry mirror.

In addition, after the container is running, it will take some time for telegraf to collect data.

(7) View the dashboard again

As shown in the figure below, the number of images and containers has changed from 0 to 1, and the system's memory usage has risen.


(8) Delete stack (template instance)

On the Web UI, click the button on the left toolbar, then click TEMPLATES at the top. You will see the list of installed templates; each entry has a delete button on its right, which lets you remove a stack quickly.


After deletion, all resources involved in the stack will disappear.

You can also delete it via influx-cli:

influx stacks remove -o atguigu --stack-id=09ee20c80d692000

influx-cli's stack support is full-featured: you can also use it to rename a stack, list all stacks under an organization, and so on. See the sketch below.
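For example, listing and renaming might look like the following sketch. The listing form matches the -o flag used above, but the --stack-name flag on influx stacks update is an assumption on my part, so verify the exact flags with influx stacks -h on your version.

# List all stacks under the atguigu organization
influx stacks -o atguigu

# Rename a stack (the --stack-name flag is assumed; verify with influx stacks -h)
influx stacks update --stack-id 09ee20c80d692000 --stack-name docker-monitoring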

Disadvantages of InfluxDB templates

(1) FLUX compatibility

We saw earlier that many InfluxDB templates embed FLUX scripts. However, the FLUX version compiled into each InfluxDB release differs, and FLUX itself is still changing rapidly: its standard library is not yet stable, and later FLUX versions may drop functions and APIs from earlier ones. As a result, FLUX compatibility across versions is poor.

For example, InfluxDB 2.4 ships FLUX 0.179, while InfluxDB 2.0 shipped FLUX 0.131.0; four minor releases put the FLUX versions more than 40 releases apart. The most typical example is the sample-data template, which no longer works after InfluxDB 2.3. If you load it, you will be told there is a problem on line 11 of the configuration file.

That is because the csv.from(url: ...) call this template relies on has since been removed.


In addition, the official template repository is in a state of disrepair and sees little maintenance, so you may need to modify templates by hand.

(2) An ecosystem weaker than Grafana's

Grafana is a framework specializing in monitoring dashboards; it supports monitoring tasks and many databases as data sources. Its community is more active than InfluxDB's, so the Grafana ecosystem offers richer, easier-to-use templates. Grafana also supports InfluxDB as a data source, so an InfluxDB + Grafana stack is a viable choice: InfluxDB handles reads and writes, while Grafana handles display and alerting.

The picture below is a template provided by the Grafana community. It can be seen that after filtering, there are 1117 dashboards that support InfluxDB as a data source.


In short, InfluxDB cannot keep up with Grafana in template count, community activity, or template freshness.

Scheduled tasks

What is a scheduled task

An InfluxDB task is a scheduled FLUX script that queries data, modifies or aggregates it in some way, and then writes the data back to InfluxDB or performs other operations.

Example: Convert data to JSON and send it to another application

(1) Ways to create tasks

There are many ways to create tasks: DataExplorer, Notebook, the HTTP API, and influx-cli. Here we will use influx-cli, to show what a scheduled task really is under the hood.

(2) Requirements for this task

Our scheduled task must meet the following requirements.

  1. Run every 30 seconds

  2. Query the first data point of the last 30 seconds

  3. Convert the data to JSON

  4. Send the JSON over HTTP to SimpleHttpPostServer

SimpleHttpPostServer is a minimal HTTP POST server the author wrote in Go. Its job is to receive a POST request, convert the request body to a string, and print it. You can visit the GitHub address https://github.com/realdengziqi/simpleHttpPostServer to download the source and build it yourself, or download the pre-built linux-x64 executable from the release page.

(3) Start simpleHttpPostServer

After downloading simpleHttpPostServer, cd into its directory and run the program. Once running, it blocks the terminal.

(4) Test whether simpleHttpPostServer is working properly

We can use curl to verify that simpleHttpPostServer is working. Send a POST request to simpleHttpPostServer using the following command.

curl -X POST http://localhost:8080/ -d '{"hello":"world"}'

After that, check the terminal occupied by simpleHttpPostServer. If a new piece of data appears in the terminal, it means that simpleHttpPostServer is working normally.

(5) Write FLUX script in DataExplorer

We will first write the query-and-convert-to-JSON logic in DataExplorer.

What we want to query is the go_goroutines measurement in the test_init bucket, which reflects the number of goroutines (lightweight threads) currently in our InfluxDB process. Open DataExplorer and write the following code:

import "json"
import "http"
from(bucket: "test_init")
     |> range(start:-30s)
     |> filter(fn: (r) => r["_measurement"] == "go_goroutines")
     |> first(column: "_value")
     |> map(
         fn: (r) => {
             status_code = http.post(url:"http://localhost:8080/", data:json.encode(v:r))
             return {r with status_code:status_code}
     }
 )

Code explanation:

  • from -> range -> filter: specify the data source and select the series we want; the start parameter of range is set to -30s.

  • first(): data returned by a FLUX query is sorted by time in ascending order by default, so first() combined with the query above fetches only the earliest data point of the last 30 seconds.

  • map(): we send the data inside the map function. Note that map must return a record, and the output record cannot be identical to the input record, so the final return uses the with syntax to add a field to each output record: the status code of the HTTP response we send. Inside the anonymous function, http.post sends one piece of JSON-formatted data to http://localhost:8080/.

(6) Run the code and observe the effect

Now, we click the SUBMIT button to execute this FLUX query script and observe the data returned by DataExplorer and the output of simpleHttpPostServer.

1) Observe DataExplorer

After clicking SUBMIT, click the View Raw Data toggle and focus on the table-format output. As you can see, there is now an extra status_code column in the data, and its value is 200. Moreover, because of the first() function, the query produced only one row.


2) Observe SimpleHttpPostServer

As you can see, our httpServer successfully received the JSON.


So far, our FLUX script meets the requirements. The remaining question is how to run this script as a scheduled task.

(7) Configure timing scheduling

Now, insert an option line before the query logic:

option task = {name: "example_task", every: 30s, offset: 0m}

This declares a task named example_task that runs every 30 seconds. The offset is set to 0m for now; we will explain its significance later.

import "json"
import "http"

option task = { name: "example_task", every: 30s, offset:0m }

from(bucket: "test_init")
     |> range(start:-30s)
     |> filter(fn: (r) => r["_measurement"] == "go_goroutines")
     |> first(column: "_value")
     |> map(
         fn: (r) => {
             status_code = http.post(url:"http://localhost:8080/", data:json.encode(v:r))
             return {r with status_code:status_code}
     }
 )

(8) Use influx-cli to create tasks

Although we wrote an option in DataExplorer, it only takes effect when the script is registered as a task through InfluxDB's HTTP API. Clicking SUBMIT merely runs the query again; the option task line does nothing there. So we will first create the task with influx-cli.

First, copy the Flux script and create a file in /opt/module/examples named example_task.flux. Paste in the script written earlier.

import "json"
import "http"
option task = { name: "example_task", every: 30s, offset:0m }
from(bucket: "test_init")
     |> range(start:-30s)
     |> filter(fn: (r) => r["_measurement"] == "go_goroutines")
     |> first(column: "_value")
     |> map(
         fn: (r) => {
             status_code = http.post(url:"http://localhost:8080/", data:json.encode(v:r))
             return {r with status_code:status_code}
     }
 )

Use the following command to create a FLUX task.

./influx task create --org atguigu -f /opt/module/examples/example_task.flux

As you can see, our task has been successfully created.


(9) View scheduled tasks on the Web UI

Click the button on the left toolbar to view the task list. You can see that the task has been successfully created.


Click on the name of the task to view the task execution details.


On the details page, you can see the task's scheduled times, start times, durations, and other information. Click the EDIT TASK button at the top to see the current task definition; you can also modify the definition directly here.


(10) Check the effect of scheduled tasks on the receiving end

The receiving end of the data is simpleHttpPostServer. As you can see, it now receives one piece of JSON data every 30 seconds.


(11) Use DataExplorer to create tasks

This time, we will use DataExplorer to create the task. First, delete the existing task.


Open DataExplorer, edit the FLUX script, and paste the query script we wrote before. Note that the option line needs to be deleted. As shown below:


After completing the above operation, click the SAVE AS button on the upper right side of the DataExplorer page and select the TASK tab in the pop-up dialog box.


  • Name: fill in example_task

  • Every: fill in 30s

  • Offset: can be left empty, which defaults to 0. The 20m shown here is just front-end placeholder rendering and has nothing to do with actual task execution.

  • Output Bucket: note that our script does not write any data back to InfluxDB, yet a task created from the Web UI must have an Output Bucket set, which forces a write-back. This is exactly why we did not use the Web UI to create the task earlier.

After configuration, click SAVE AS TASK.

(12) Check the task details again (note the Web UI's hidden changes)

Now, let's go back to the task list again. It can be seen that the task has been successfully created and is running normally.


Click the EDIT TASK button to view our FLUX script. You will be surprised to find that the code has been modified: a to() call has been forcibly appended to a script that originally wrote nothing back to the database, and an option task line has been prepended. In other words, the Web UI's point-and-click flow only saves you from typing a few lines by hand.


In general, it comes down to whether the developer can accept code being modified implicitly. If that behavior is unacceptable, creating tasks with influx-cli is strongly recommended.

The late data problem

The offset field in option exists specifically to help us handle late-arriving data.

First, consider the late-arrival scenario shown below. Our scheduled task queries the last 30 seconds of data on each run, and the schedule interval is also 30 seconds.

Due to network delay, a data point that should have been written at 1m20s does not arrive until 1m32s. But our query already ran at 1m30s, so the 1m20s point is missed. Now suppose we set the offset to 5 seconds, as shown below:

The task's execution is delayed by 5 seconds, but it still queries the original time range. The run scheduled for 1m30s now executes at 1m35s, so the late data point can be captured.
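Concretely, only the option line needs to change; a minimal sketch based on the example_task defined earlier:

option task = {name: "example_task", every: 30s, offset: 5s}

The task still runs once per 30-second window; offset only shifts when each run fires, not which time range it queries.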

Cron expressions

In fact, InfluxDB's scheduled tasks also support cron expressions. The option is then written as follows (note that cron takes the place of every; a task uses one or the other):

option task = {
     // ...
     cron: "0 * * * *",
}

crontab is a Linux tool for scheduling commands, and the cron expression is the time-interval syntax that originated with it. The expression "0 * * * *" above, for example, fires at minute 0 of every hour. For background, see the crontab tutorial on runoob.com:

https://www.runoob.com/linux/linux-comm-crontab.html

If you are proficient enough with cron to write an expression correctly in one go, great. If you are unsure what to write during development, an online cron generator can help.

Here, I recommend an open-source cron generator hosted on gitee. For example, https://gitee.com/toktok/easy-cron is a cron generation tool based on node.js. You can clone and deploy the code yourself, or use its online demo at http://www.easysb.cn/open/easy-cron/index.html

This tool supports cron expressions with 5, 6, or 7 fields.


Supplement: The nature of InfluxDB scrape tasks

The scrape tasks we set up in the Web UI earlier are, behind the scenes, just FLUX scripts executed on a schedule; InfluxDB merely exposes them separately in its API.

import "experimental/prometheus"
prometheus.scrape(url: "http://localhost:8086/metrics")

In current FLUX versions, the experimental/prometheus library provides the ability to scrape data in Prometheus format. For details, please refer to https://docs.influxdata.com/flux/v0.x/prometheus/scrape-prometheus/
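Putting the pieces together, a scrape task is plausibly nothing more than the following sketch. The task name and the destination bucket metrics_bucket are assumptions for illustration; the bucket would have to exist first.

import "experimental/prometheus"

// hypothetical task name and interval
option task = {name: "scrape_influxdb_metrics", every: 10s}

// scrape the metrics endpoint and write the result back to a bucket
prometheus.scrape(url: "http://localhost:8086/metrics")
    |> to(bucket: "metrics_bucket")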

InfluxDB dashboard

What is the InfluxDB Dashboard

Click the button on the left toolbar to enter the InfluxDB dashboard management page, shown below:


Here I open the System dashboard. Note that its contents depend on Example 2, which we completed earlier.


This is a dashboard that monitors host hardware and network resources. Each Cell in the dashboard is backed by a FLUX query: the query executes, and the UI renders the result as a chart. InfluxDB runs these queries the moment you open the dashboard.

Dashboard controls

(1) Manual refresh

Click the refresh button in the upper right to re-run all of the dashboard's queries. Since the underlying FLUX scripts usually query a recent window of time, manual refresh is genuinely useful.

(2) Turn on automatic refresh

The ENABLE AUTO REFRESH button on the upper right can turn on the automatic refresh of the dashboard:


(3) Switch display time zone

The Local button lets you choose whether dates and times are displayed in the local time zone or in UTC.

(4) Set the query range

Specify the length of time in the past to query data.


(5) Add a Cell

A Cell is one of the charts that make up the dashboard. Adding a chart corresponds to the ADD CELL button in the upper left corner.

(6) Add a Note

A Note is also a dashboard module and supports Markdown syntax. It corresponds to the ADD NOTE button in the upper left corner.

(7) Display variables

If the dashboard contains queries that reference variables, a drop-down menu appears at the top of the dashboard. Use it to pick a variable value and make the dashboard display the corresponding data. This corresponds to Show Variables in the upper left corner.


(8) Turn on annotations

You can hold down shift and drag with the left mouse button to add annotation lines to a dashboard chart. Toggling annotations on and off controls the visibility of these lines.


(9) Full screen and dark mode

This function is on the ... button in the upper left corner, as shown in the figure:


Example: Make an interactive and dynamic dashboard

This example builds a dashboard for metrics related to CPU usage; it relies on Example 2, so please complete it on top of Example 2.

(1) Requirements

Users would like a drop-down menu on our dashboard to choose which CPU's usage to view. The indicator to monitor is usage_user, and the dashboard should display the maximum, minimum, and median CPU usage for every 1-minute window.

(2) Create variables

We won't go into why variables are needed here.

Hover the mouse over the button on the left and select Variables in the pop-up bar, as shown in the figure:

Click the CREATE VARIABLE button in the upper right corner, select New Variable, and a dialog box for creating variables will pop up. With the Type in the upper right corner set to Query, type the following into the script editing area:

import "influxdata/influxdb/schema"
schema.tagValues(bucket: "example02",tag:"cpu")

This script queries the tag values of the cpu tag in the example02 bucket. In the upper left corner, give the variable a name; here we enter CPU.

(3) Create a new dashboard

Back on the dashboard management page, click the CREATE DASHBOARD button to create a new dashboard, then click the ADD CELL button in the upper left corner.


(4) Create a new cell


As you can see, the familiar DataExplorer appears again. Once in, switch directly to SCRIPT EDITOR and type the following.

basedata = from(bucket: "example02")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "cpu")
    |> filter(fn: (r) => r["_field"] == "usage_user")
    |> filter(fn: (r) => r["cpu"] == v.CPU)

basedata
    |> aggregateWindow(every: 1m, fn: median, createEmpty: false)
    |> yield(name: "median")

basedata
    |> aggregateWindow(every: 1m, fn: max, createEmpty: false)
    |> yield(name: "max")

basedata
    |> aggregateWindow(every: 1m, fn: min, createEmpty: false)
    |> yield(name: "min")

Click SUBMIT to see the effect.


(5) Optimize display effect

The default visualization type is Graph; we will now switch it to Band, a line chart with upper and lower bounds.


After switching the chart type, click CUSTOMIZE. Under Aggregate Functions, set the Upper Column Name to max, the Main Column Name to median, and the Lower Column Name to min.


That is all it takes to build a banded line chart. Finally, click the checkmark in the upper right corner to save.

(6) Check the effect

As you can see, a drop-down menu named CPU appears at the top of the dashboard. It can control the entire dashboard, but only for cells whose FLUX query references the v.CPU variable we defined.


Use the drop-down menu to select a different CPU to display the corresponding data.


Example: More flexible variables and dashboards

(1) Requirements

In the previous example, we can dynamically adjust the sequence displayed in the dashboard through a variable named CPU.

But the dashboard in the previous example has a flaw: as shown in the figure, it can display only one series at a time. What if we want to compare the performance of two CPUs? The dashboard from the previous example is not up to it.


Now, we hope that the dashboard can display the working status of the two CPUs at the same time to facilitate visual comparison.

(2) Create variables

Click Settings → Variables on the left toolbar to enter the variable configuration page:


Click the CREATE VARIABLE button in the upper right corner of the page. A dialog window for creating variables will pop up on the Web UI.


Select CSV from the Type drop-down menu in the upper right corner. (In the previous example we created a Query-type variable; Query variables change dynamically with the state of the data. Map and CSV variables cannot: they are static, and their values change only if manually updated again via the API or Web UI.)

Give the variable a name in the upper left corner. In this demonstration, we set the variable name to cpuxxx.

The main area in the middle sets the variable's values. The format is nominally CSV, but there is no need to arrange values in rows and columns: the values simply need to be separated by commas, and line breaks work as separators too. As shown in the figure, we separate the values with line breaks.

Here, we set the values to cpu0, cpu1, cpu2, cpu3, and cpu1|cpu2. Note! cpu1|cpu2 is a regular expression meaning cpu1 or cpu2.


The Select A Default drop-down menu in the lower right corner sets the variable's default value; set it to cpu0. With that, our cpuxxx variable is created.


(3) Modify the FLUX script (add regular filtering)

First, as shown in the figure, click the gear button, then click the Configure button in the pop-up menu to modify the current Cell.


Now we need to modify the query script from the previous example: the filter should match the cpu tag against the value of the cpuxxx variable, treated as a regular expression.

Below is the final script. This time we keep only the 1-minute median and do not display the maximum and minimum values.

import "regexp"

basedata = from(bucket: "example02")
    |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
    |> filter(fn: (r) => r["_measurement"] == "cpu")
    |> filter(fn: (r) => r["_field"] == "usage_user")
    |> filter(fn: (r) => regexp.matchRegexpString(r: regexp.compile(v: v.cpuxxx), v: r["cpu"]))

basedata
    |> aggregateWindow(every: 1m, fn: median, createEmpty: false)
    |> yield(name: "median")

Code explanation:

  • regexp.compile(v: v.cpuxxx): note that dashboard variables in InfluxDB are always strings, so to do regular-expression matching we must first convert the string into a regular expression. The compile function in the regexp package does exactly that.

  • regexp.matchRegexpString: determines whether a string matches a regular expression, returning true on a match and false otherwise.

  • With this in place, setting the cpuxxx variable to cpu1|cpu2 displays the two series we want at the same time.

Finally, click the √ button in the upper right corner to save the modified cell.

(4) View the final effect

Back on the dashboard, you can see that the variable drop-down menu at the top has changed from cpu to cpuxxx: the dashboard automatically detects which variables its cells use and adjusts accordingly. The picture below shows the effect after the modification.


Now select cpu1|cpu2, and you can see two series appear in the cell.


InfluxDB service process parameters (usage of the influxd command)

influxd command list

After InfluxDB is unpacked, the influxd binary in the extraction directory is the startup command for the InfluxDB server process.

For details, please refer to: https://docs.influxdata.com/influxdb/v2.4/reference/cli/influxd/

  • downgrade: downgrade the metadata format to match an older release

  • help: print help information for the influxd command

  • inspect: inspect the database's on-disk data

  • print-config: print influxd's complete configuration for the current environment (deprecated in 2.4)

  • recovery: recover operator access to InfluxDB; manage tokens, organizations, and users

  • run: run the influxd service (the default subcommand)

  • upgrade: upgrade from InfluxDB 1.x to InfluxDB 2.4

  • version: print the current version of InfluxDB

You do not have to use the influxd command to view InfluxDB's current configuration; the influx-cli command works too:

influx server-config

The inspect subcommand

You can use the following command to view help information for the inspect subcommand.

./influxd inspect -h


You will find that the inspect subcommand itself contains many subcommands.

The tsi, tsm, and wal appearing here all relate to InfluxDB's underlying storage engine. As a first taste, you can use the following command to get an overview of how InfluxDB stores its data.

./influxd inspect report-tsm

The execution result is shown in the figure below.


The information displayed covers InfluxDB's data storage situation, such as how many series exist in the entire instance, how many series each bucket holds, and so on.

In addition, there is the more important export-lp subcommand, which can export all the data in a bucket as InfluxDB line protocol. We will demonstrate its use in detail in a later example.

The recovery subcommand

You can first use the following command to view the help information of the recovery subcommand.

./influxd recovery -h

As shown in the figure, the influxd recovery command is mainly used to repair or regenerate the operator credentials required to operate InfluxDB.


There are three subcommands under recovery, namely auth, org and user. They are related to token, organization and user respectively.

The following mainly explains the usage of the auth subcommand. Use the following command to further view the help information of the auth subcommand.

./influxd recovery auth -h

The returned results are as shown below:


You can see that it has two subcommands.

  • create-operator: Create a new operator token for a user.

  • list: List all tokens in the current database.

Use the following command to create an operator-token again for user tony.

./influxd recovery auth create-operator --username tony --org atguigu

After the command executes, the terminal displays the content shown in the figure below. You can see that an operator token named tony's Recovery Token has been created.
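To double-check, the list subcommand mentioned above prints all tokens in the current database:

./influxd recovery auth list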

influxd common configuration items

There are many configuration options for influxd; for details, see https://docs.influxdata.com/influxdb/v2.4/reference/config-options/#assets-path

The following are some commonly used options:

  • bolt-path: the path to the BoltDB file.

  • engine-path: the path to the InfluxDB storage engine files.

  • sqlite-path: the path to the SQLite file. InfluxDB also uses SQLite, which stores some metadata about task execution.

  • flux-log-enabled: whether to log Flux queries; the default is false.

  • log-level: the log level, supporting debug, info, error, etc. The default is info.
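To make this concrete, here is a sketch of starting influxd with several of these options overridden on the command line (the paths are hypothetical):

./influxd --log-level=debug --bolt-path=/data/influxdb/influxd.bolt --engine-path=/data/influxdb/engine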

How to configure influxd

There are 3 ways to configure influxd. Here we will demonstrate each of them using the http-bind-address option.

(1) Command line parameters

Before performing the following operations, remember to stop the currently running influxd; you can kill the influxd process with the command below. Otherwise, the old process keeps the BoltDB database locked and no other process can access it. (You could also change the BoltDB path instead, but that is more trouble.)

ps -ef | grep influxd | grep -v grep | awk '{print $2}' | xargs kill

When starting InfluxDB with the influxd command, configuration items can be passed as command-line flags. For example:

./influxd --http-bind-address=:8088

You can try accessing port 8088 to check whether the service is listening on that port.
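For example, InfluxDB 2.x exposes a /health endpoint, so a quick check could be:

curl http://localhost:8088/health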

(2) Environment variables

Similarly, kill the previous influxd process first. Run the following command:

ps -ef | grep influxd | grep -v grep | awk '{print $2}' | xargs kill

Users can declare an environment variable to configure influxd, for example:

export INFLUXD_HTTP_BIND_ADDRESS=:8089

Now, let's start influxd and see the effect.

Finally, because we used the export command, the environment variable was set only for the current shell session. If the current session is disposable, you can simply close it; otherwise, destroy the variable with the unset command.

unset INFLUXD_HTTP_BIND_ADDRESS

(3) Configuration file

You can also place a config file in the directory where influxd lives; it can be config.json, config.toml, or config.yaml. influxd recognizes all three formats, as long as the contents are valid, and detects the file automatically at startup.

Create a config.json file in the InfluxDB installation directory.

vim /opt/module/influxdb2_linux_amd64/config.json

Edit the following:

{
    "http-bind-address": ":9090"
}

Remember to stop the previous InfluxDB process before starting it.

ps -ef | grep influxd | grep -v grep | awk '{print $2}' | xargs kill

Now start it again and see the effect.

./influxd

You can see that the port has changed to 9090; the configuration took effect.

How does the time series database store usernames and passwords?

InfluxDB ships with BoltDB, a key-value database written in Go. BoltDB's functionality is deliberately limited, focusing on storing and reading values, and that same simplicity keeps it small and lightweight.

InfluxDB stores usernames, passwords, tokens, and similar information in this key-value database. By default, the BoltDB data lives in a single file named influxd.bolt under the ~/.influxdbv2/ path.

The path of this file can be changed through influxd's bolt-path configuration option.
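Following the environment-variable naming pattern shown earlier (INFLUXD_ plus the option name in upper snake case), relocating the file might look like this sketch; the path is hypothetical:

export INFLUXD_BOLT_PATH=/data/influxdb/influxd.bolt
./influxd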

Migrate data from InfluxDB OSS

Export data from InfluxDB

To export data from InfluxDB, you must use the influxd command (note: not the influx command). In InfluxDB 2.x, data is exported per bucket.

Here are sample commands:

influxd inspect export-lp \
  --bucket-id 12ab34cd56ef \
  --engine-path ~/.influxdbv2/engine \
  --output-path path/to/export.lp \
  --start 2022-01-01T00:00:00Z \
  --end 2022-01-31T23:59:59Z \
  --compress

Parameter explanation:

  • influxd inspect: influxd is the command-line tool for operating the InfluxDB server process; inspect is the influxd subcommand for examining on-disk data.

  • export-lp: short for "export to line protocol"; it exports data as line protocol and is a subcommand of inspect.

  • bucket-id: required; the ID of the bucket to export.

  • engine-path: required, but it has the default value ~/.influxdbv2/engine, so if your data directory is ~/.influxdbv2/engine you may omit it.

  • output-path: required; the location of the output file.

  • start: optional; the start time of the data to export.

  • end: optional; the end time of the data to export.

  • compress: recommended; when enabled, influxd compresses the output with gzip.

Example: Export data from InfluxDB

This time, let's try exporting the data in test_init; as of now, that bucket should hold the most data.

(1) First, use influx-cli or the Web UI to look up the ID of the bucket we want to export. Here we use the Web UI, where we can see that the ID of the test_init bucket is 0a2e821ccd12854a.


(2) So, we run the following command and try to export the data.

./influxd inspect export-lp \
  --bucket-id 0a2e821ccd12854a \
  --output-path ./oh.lp

This command exports the data in the test_init bucket, in InfluxDB line protocol format, to the oh.lp file in the current directory.

Under normal circumstances, the program will output a series of read and write information.

(3) Use the following command to view the files and their sizes in the current path.

ls -lh

The -h flag makes ls print file sizes in human-readable units such as MB and GB.


As you can see, the data file oh.lp we exported is 1.5G in size.

(4) Now, we use the tail command to view the contents of the file.

tail -15 ./oh.lp

The output of the command is the last 15 lines of the file, and you can see that it is full of data of the InfluxDB line protocol.


However, note a characteristic of the InfluxDB line protocol: across the whole file, the measurement names repeat constantly, the tag sets repeat at a high rate, and the field values change little. Such highly repetitive data is ideal for compression algorithms.
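A few hypothetical lines illustrate the repetition: the measurement and tag set are identical from line to line, and only the field value and timestamp change.

cpu,cpu=cpu0,host=node1 usage_user=1.25 1672531200000000000
cpu,cpu=cpu0,host=node1 usage_user=1.31 1672531210000000000
cpu,cpu=cpu0,host=node1 usage_user=1.27 1672531220000000000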

Example: Compress when exporting data

(1) Now, we re-run the data export command, this time adding the --compress flag at the end. Don't worry about the oh.lp file already existing in the directory; the program will simply overwrite it.

./influxd inspect export-lp \
  --bucket-id 0a2e821ccd12854a \
  --output-path ./oh.lp \
  --compress

(2) Use the ls command to check the file size again.


You can see that the file has shrunk from the earlier 1.5G to 91M; the compression ratio is excellent.
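Note that the file is now gzip data rather than plain text, so to peek at its contents you would decompress on the fly, for example:

gzip -dc ./oh.lp | tail -15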


Origin blog.csdn.net/qq_44766883/article/details/131580208