Nginx log statistics and the sort command in detail

The sort command sorts lines of text according to different data types. Its syntax and common options are:
  sort [-bcfMnr] [-t separator] [-k key] [-o output_file] [source_file]
Note: sort works on the contents of text files, using the line as the unit of sorting.

Parameters:
  -b Ignore leading blanks at the start of each line.
  -c Check whether the file is already sorted.
  -f Ignore case when sorting.
  -M Sort by three-letter month abbreviation.
  -n Sort by numeric value.
  -o <output file> Write the sorted result to the specified file.
  -r Sort in reverse order.
  -t <separator> Specify the field separator used for sorting.
  -k Select which field to sort on.

Examples:

  1. Count UV (unique visitors) by IP

awk '{print $1}'  access.log|sort | uniq -c |wc -l

Note that the {print $1} part must be enclosed in single quotes, not double quotes: inside double quotes the shell would expand $1 itself before awk ever sees it.
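As a small illustration of the UV count, the sketch below builds a made-up three-line log (the addresses and paths are invented for the example, and it assumes the client IP is the first whitespace-separated field) and counts the distinct IPs. It uses sort -u, which should give the same count as sort | uniq -c | wc -l with one step fewer.

```shell
# Made-up sample standing in for access.log
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [14/Mar/2015:21:00:01 +0800] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [14/Mar/2015:21:00:02 +0800] "GET /about.html HTTP/1.1" 200 128
10.0.0.1 - - [14/Mar/2015:21:00:03 +0800] "GET /index.html HTTP/1.1" 200 512
EOF

# sort -u drops duplicate IPs, so wc -l counts unique visitors
uv=$(awk '{print $1}' "$log" | sort -u | wc -l | tr -d ' ')
echo "UV: $uv"
rm -f "$log"
```

The tr -d ' ' step only strips the padding that some wc implementations put in front of the number.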

2. Count PV (page views) by URL requests

awk '{print $7}' access.log|wc -l

3. Find the most frequently visited URLs

awk '{print $7}' access.log|sort | uniq -c |sort -n -k 1 -r|more

4. Find the IPs that visit most frequently

awk '{print $1}' access.log|sort | uniq -c |sort -n -k 1 -r|more
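For scripts, where paging with more is not practical, a variant that keeps only the top entry may be handy. The sketch below is one possible form, run against an invented sample log in which field 7 is the request path; head -n 1 and the final awk are additions for the example, not part of the commands above.

```shell
# Made-up sample standing in for access.log (field 7 is the request path)
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [14/Mar/2015:21:00:01 +0800] "GET /index.html HTTP/1.1" 200 512
10.0.0.2 - - [14/Mar/2015:21:00:02 +0800] "GET /about.html HTTP/1.1" 200 128
10.0.0.3 - - [14/Mar/2015:21:00:03 +0800] "GET /index.html HTTP/1.1" 200 512
EOF

# Count each URL, sort by the count (field 1) numerically in reverse,
# keep only the first line, then strip the count column
top_url=$(awk '{print $7}' "$log" | sort | uniq -c | sort -k 1,1nr | head -n 1 | awk '{print $2}')
echo "$top_url"
rm -f "$log"
```

Replacing $7 with $1 gives the most active IP in the same way.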

5. View the log entries for a given time period

 cat  access.log| sed -n '/14\/Mar\/2015:21/,/14\/Mar\/2015:22/p'|more

Note: the slashes in the timestamps must be escaped with a backslash "\".
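If the escaped slashes are hard to read, sed also accepts a different delimiter for an address when it is introduced with a backslash. The sketch below, on an invented five-line log, uses # as the delimiter; the log contents are assumptions made up for the example.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [14/Mar/2015:20:59:00 +0800] "GET /a HTTP/1.1" 200 1
10.0.0.1 - - [14/Mar/2015:21:10:00 +0800] "GET /b HTTP/1.1" 200 1
10.0.0.1 - - [14/Mar/2015:21:50:00 +0800] "GET /c HTTP/1.1" 200 1
10.0.0.1 - - [14/Mar/2015:22:05:00 +0800] "GET /d HTTP/1.1" 200 1
10.0.0.1 - - [14/Mar/2015:22:30:00 +0800] "GET /e HTTP/1.1" 200 1
EOF

# \#...# uses '#' as the regex delimiter, so '/' needs no escaping
matched=$(sed -n '\#14/Mar/2015:21#,\#14/Mar/2015:22#p' "$log" | awk '{print $7}' | paste -s -d ' ' -)
echo "$matched"
rm -f "$log"
```

Note that the range ends at the first line matching the second pattern, so exactly one 22:xx entry is printed along with the 21:xx entries.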

For Tomcat logs:

sed -n '/2018-02-23 23:31/,/2018-02-23 23:32/p' catalina.out | more    ## piping to more displays the output a page at a time instead of scrolling past

Extended:

# Search for a keyword
sed -n '/2019-06-27 09:55/,/2019-06-27 13:59/p' laravel.log-20190628 | egrep 'request_body":"{\\"Mobile'

# Search for a keyword and show 3 lines of context around each match

sed -n '/2019-06-27 09:55/,/2019-06-27 13:59/p' laravel.log-20190628 | egrep -C 3 'request_body":"{\\"Mobile'
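To see what the context options do, the sketch below compares -A (lines after the match) with -C (lines before and after) on a made-up five-line file. It assumes a grep that supports these options, as GNU and BSD grep do; the file contents are invented for the example.

```shell
f=$(mktemp)
printf 'line1\nline2\nrequest_body here\nline4\nline5\n' > "$f"

# -F treats the pattern as a fixed string, so no regex escaping is needed
after=$(grep -F -A 1 'request_body' "$f" | wc -l | tr -d ' ')
around=$(grep -F -C 1 'request_body' "$f" | wc -l | tr -d ' ')
echo "-A 1: $after lines, -C 1: $around lines"
rm -f "$f"
```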

1 How sort works

sort treats each line of the file as a unit and compares the lines with one another. Lines are compared character by character, starting from the first character, by ASCII code value, and are then output in ascending order.

[rocrocket@rocrocket programming]$ cat seq.txt
banana
apple
pear
orange
[rocrocket@rocrocket programming]$ sort seq.txt
apple
banana
orange
pear
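Comparison strictly by ASCII value is what sort does in the C locale; under other locale settings the collation order may differ. The sketch below forces the C locale with LC_ALL=C on a made-up input to show the byte ordering, in which uppercase letters come before lowercase ones.

```shell
# With LC_ALL=C, comparison is by byte value, so 'C' (67) sorts before 'a' (97)
sorted=$(printf 'banana\nCherry\napple\n' | LC_ALL=C sort | paste -s -d ' ' -)
echo "$sorted"
```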

2 sort -u option

It simply removes duplicate lines from the output.

[rocrocket@rocrocket programming]$ cat seq.txt
banana
apple
pear
orange
pear
[rocrocket@rocrocket programming]$ sort seq.txt
apple
banana
orange
pear
pear
[rocrocket@rocrocket programming]$ sort -u seq.txt
apple
banana
orange
pear

The duplicate pear is mercilessly removed by the -u option.

3 sort -r option

sort sorts in ascending order by default; to get descending order, just add -r.

[rocrocket@rocrocket programming]$ cat number.txt
1
3
5
2
4
[rocrocket@rocrocket programming]$ sort number.txt
1
2
3
4
5
[rocrocket@rocrocket programming]$ sort -r number.txt
5
4
3
2
1

4 sort -o option

Since sort writes to standard output by default, you need redirection to write the result to a file, in the form sort filename > newfile.

However, if you want to write the sorted result back to the original file, redirection will not do.

[rocrocket@rocrocket programming]$ sort -r number.txt > number.txt
[rocrocket@rocrocket programming]$ cat number.txt
[rocrocket@rocrocket programming]$
Look, number.txt has been emptied: the shell truncates the file before sort ever reads it.

This is where the -o option comes in. It solves the problem, so you can safely write the result back to the original file. This is perhaps the only advantage of -o over redirection.

[rocrocket@rocrocket programming]$ cat number.txt
1
3
5
2
4
[rocrocket@rocrocket programming]$ sort -r number.txt -o number.txt
[rocrocket@rocrocket programming]$ cat number.txt
5
4
3
2
1
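The same in-place sort can be tried without touching a real file. The sketch below works on a temporary file created with mktemp (the file and its contents are just for the example).

```shell
f=$(mktemp)
printf '1\n3\n5\n2\n4\n' > "$f"
# -o may name one of the input files; the sorted result replaces its contents
sort -r -o "$f" "$f"
result=$(paste -s -d ' ' "$f")
echo "$result"
rm -f "$f"
```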

5 sort -n option

Have you ever seen 10 sorted before 2? I certainly have. This happens because sort compares the numbers as characters: it first compares 1 with 2, and 1 is obviously smaller, so 10 is placed before 2. This is just sort's usual behavior.

To change this, use the -n option to tell sort to "sort by numeric value"!

[rocrocket@rocrocket programming]$ cat number.txt
1
10
19
11
2
5
[rocrocket@rocrocket programming]$ sort number.txt
1
10
11
19
2
5
[rocrocket@rocrocket programming]$ sort -n number.txt
1
2
5
10
11
19

6 sort -t and -k options

Suppose a file has the following content:

[rocrocket@rocrocket programming]$ cat facebook.txt
banana:30:5.5
apple:10:2.5
pear:90:2.3
orange:20:3.4

This file has three columns separated by colons: the first column is the type of fruit, the second is the quantity, and the third is the price.

Now I want to sort by quantity, that is, by the second column. How can sort do that?

Fortunately, sort provides the -t option, which is followed by the field separator. (Does it remind you of the -d option of cut and paste?)

Once the separator is specified, -k selects the column.

[rocrocket@rocrocket programming]$ sort -n -k 2 -t : facebook.txt
apple:10:2.5
orange:20:3.4
banana:30:5.5
pear:90:2.3

We used the colon as the separator and sorted numerically on the second column in ascending order. The result is just what we wanted.
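The same approach works for the price column. The sketch below is one possible variant: it sorts the same data by the third column in descending order, writing the key as -k 3,3nr so that only that field is compared numerically and in reverse. LC_ALL=C is set only to make sure the dot is read as the decimal point.

```shell
f=$(mktemp)
printf 'banana:30:5.5\napple:10:2.5\npear:90:2.3\norange:20:3.4\n' > "$f"
# -k 3,3 limits the key to the third field; n = numeric, r = descending
by_price=$(LC_ALL=C sort -t : -k 3,3nr "$f" | cut -d : -f 1 | paste -s -d ' ' -)
echo "$by_price"
rm -f "$f"
```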

7 Other commonly used sort options

-f converts lowercase letters to uppercase for comparison, that is, it ignores case.

-c checks whether the file is already sorted; if it is not, it prints information about the first out-of-order line and returns 1.

-C checks whether the file is already sorted; if it is not, it prints nothing and only returns 1.

-M sorts by month, for example JAN is less than FEB, and so on.

-b ignores all leading blanks on each line and starts comparing from the first visible character.
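Two of these options are easy to try out. The sketch below checks a deliberately unsorted temporary file with -C and reads the result from the exit status, then sorts three month abbreviations with -M. It assumes a sort that supports -M, as GNU sort does; the inputs are made up for the example.

```shell
f=$(mktemp)
printf '1\n3\n2\n' > "$f"
# -C is silent and reports only through the exit status (0 = sorted, 1 = not sorted)
if sort -n -C "$f"; then status=0; else status=$?; fi
echo "exit status: $status"

# -M orders by month name rather than alphabetically
months=$(printf 'MAR\nJAN\nFEB\n' | LC_ALL=C sort -M | paste -s -d ' ' -)
echo "$months"
rm -f "$f"
```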

Sometimes, when reading scripts, you will see the sort command followed by things like -k1,2 or -k1.2 -k3.4, which can look baffling. Today, let's get to the bottom of the -k option!

 

1 Prepare the sample data

$ cat facebook.txt
google 110 5000
baidu 100 5000
guge 50 3000
sohu 100 4500

The first field is the company name, the second field is the number of employees, and the third field is the average salary of the employees.

2 I want to sort this file alphabetically by company name, that is, by the first field (facebook.txt has three fields):

$ sort -t ' ' -k 1 facebook.txt
baidu 100 5000
google 110 5000
guge 50 3000
sohu 100 4500

See, simply specifying -k 1 is enough. (Strictly speaking this is not quite right, as you will see later.)

3 I want to sort facebook.txt by number of employees

$ sort -n -t ' ' -k 2 facebook.txt
guge 50 3000
baidu 100 5000
sohu 100 4500
google 110 5000

No explanation needed; I'm sure you understand.

However, there is a catch: sohu and baidu have the same number of employees, 100. What happens then? By default, ties are broken by comparing the lines in ascending order starting from the first field, so baidu comes before sohu.

4 I want to sort facebook.txt by number of employees, and where the number of employees is the same, by average salary in ascending order:

$ sort -n -t ' ' -k 2 -k 3 facebook.txt
guge 50 3000
sohu 100 4500
baidu 100 5000
google 110 5000

Look, adding -k 2 -k 3 solves the problem. sort supports setting the priority of the sort fields in this way: it sorts by the second field first and, where that is equal, by the third. (If you like, you can keep adding keys to set as many levels of priority as you need.)
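A slightly stricter way to write the same two-level sort is to bound each key to a single field and attach n to each key. The sketch below is one such variant on the same data; it is an illustration, not the command used above.

```shell
f=$(mktemp)
printf 'google 110 5000\nbaidu 100 5000\nguge 50 3000\nsohu 100 4500\n' > "$f"
# Key 1: field 2 only, numeric. Key 2: field 3 only, numeric.
order=$(sort -t ' ' -k 2,2n -k 3,3n "$f" | cut -d ' ' -f 1 | paste -s -d ' ' -)
echo "$order"
rm -f "$f"
```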

5 I want to sort facebook.txt by salary in descending order, and where the salary is the same, by number of employees in ascending order (this one is a bit harder):

$ sort -n -t ' ' -k 3r -k 2 facebook.txt
baidu 100 5000
google 110 5000
sohu 100 4500
guge 50 3000

A small trick is used here: look closely and you will see a lowercase r quietly added after -k 3. Think back to the previous article; can you work out the answer? Here it is: r has the same effect as the -r option, namely reverse order. Because sort is ascending by default, r is needed here to sort the third field (the average salary) in descending order. You can also add n here, which means this field is sorted numerically. For example:

$ sort -t ' ' -k 3nr -k 2n facebook.txt
baidu 100 5000
google 110 5000
sohu 100 4500
guge 50 3000

Look, we removed the leading -n option and added it to each -k option instead.

6 The exact syntax of the -k option

Before going any further, we need a little theory. You need to understand the syntax of the -k option, which is as follows:

[ FStart [ .CStart ] ] [ Modifier ] [ , [ FEnd [ .CEnd ] ] [ Modifier ] ]

The comma (",") divides the syntax into two parts: the Start part and the End part.

First, keep one idea in mind: "if the End part is not set, End is taken to be the end of the line." This is important, but easy to overlook.

The Start part itself has three parts. The Modifier part is the option-like letters, such as n and r, that we discussed earlier. Here we focus on FStart and CStart.

CStart can be omitted; if it is, the key starts at the beginning of the field. The earlier -k 2 and -k 3 are examples with CStart omitted.

In FStart.CStart, FStart is the field to use, and CStart is the position of the character within the FStart field at which the sort key begins.

Similarly, in the End part you can set FEnd.CEnd. If .CEnd is omitted, the key ends at the last character of that field, the "end of the field". Setting CEnd to 0 (zero) also means the end of the field.
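To make the FStart.CStart,FEnd.CEnd notation concrete, the sketch below sorts three made-up dates in YYYY-MM-DD form by month only, using characters 6 through 7 of the first field as the key. The dates are invented for the example.

```shell
# Key = characters 6 through 7 of field 1, i.e. the month in YYYY-MM-DD
by_month=$(printf '2015-03-14\n2014-11-02\n2016-01-30\n' | LC_ALL=C sort -k 1.6,1.7 | paste -s -d ' ' -)
echo "$by_month"
```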

7 On a whim, sort starting from the second letter of the company name:

$ sort -t ' ' -k 1.2 facebook.txt
baidu 100 5000
sohu 100 4500
google 110 5000
guge 50 3000

Look, we used -k 1.2, which sorts on the string that starts at the second character of the first field and, since no End part is given, runs to the end of the line. baidu comes first because its second letter is a. The second character of both sohu and google is o, but the h in sohu comes before the second o in google, so they take second and third place respectively. guge can only come fourth.

8 On another whim, sort only by the second letter of the company name and, where that is the same, by salary in descending order:

$ sort -t ' ' -k 1.2,1.2 -k 3,3nr facebook.txt
baidu 100 5000
google 110 5000
sohu 100 4500
guge 50 3000

Because we sort only on the second letter, we write -k 1.2,1.2, which means "only" the second letter. (If you ask, "Why not just use -k 1.2?", the answer is that it omits the End part, so the key would run from the second letter all the way to the end of the line.) For the salary we likewise use -k 3,3, the most precise form, which means "only" this field; if the trailing 3 is omitted, the key becomes "from the start of the third field to the end of the last field".

9 Which options can be used in the modifier part?

You can use b, d, f, i, n, or r.

You are surely already familiar with n and r.

b ignores the leading blanks of this field.

d sorts this field in dictionary order (that is, only blanks, letters and digits are considered).

f ignores case when sorting this field.

i ignores non-printable characters and sorts only on printable ones. (Some ASCII characters are non-printable: for example, \a is the alert, \b is backspace, \n is newline, \r is carriage return, and so on.)
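The effect of the b modifier is easiest to see when no -t is given and columns are padded with several blanks. The sketch below, on two made-up lines, compares -k 2 with -k 2b; it forces the C locale so that blanks are compared as ordinary bytes.

```shell
f=$(mktemp)
printf 'a   2\nb 1\n' > "$f"
# Without b, field 2 includes its leading blanks: "   2" sorts before " 1"
plain=$(LC_ALL=C sort -k 2 "$f" | cut -c 1 | paste -s -d ' ' -)
# With b, leading blanks are skipped: "1" sorts before "2"
with_b=$(LC_ALL=C sort -k 2b "$f" | cut -c 1 | paste -s -d ' ' -)
echo "without b: $plain; with b: $with_b"
rm -f "$f"
```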

10 An example of using -u together with -k:

$ cat facebook.txt
google 110 5000
baidu 100 5000
guge 50 3000
sohu 100 4500

This is the original facebook.txt file.

$ sort -n -k 2 facebook.txt
guge 50 3000
baidu 100 5000
sohu 100 4500
google 110 5000

$ sort -n -k 2 -u facebook.txt
guge 50 3000
baidu 100 5000
google 110 5000

When we sort numerically on the number-of-employees field and then add -u, the sohu line is deleted! It turns out that -u looks only at the field set with -k: when two keys are found to be identical, the later line is deleted along with its key.

$ sort  -k 1 -u facebook.txt
baidu 100 5000
google 110 5000
guge 50 3000
sohu 100 4500

$ sort  -k 1.1,1.1 -u facebook.txt
baidu 100 5000
google 110 5000
sohu 100 4500

The same reasoning applies to this example: guge, which also begins with g, does not survive.

$ sort -n -k 2 -k 3 -u facebook.txt
guge 50 3000
sohu 100 4500
baidu 100 5000
google 110 5000

Ah! With two levels of sort priority set, -u does not delete any lines. It turns out that -u weighs all the -k keys together: a line is deleted only if every key is the same, and as long as one key differs the line is kept :) (If you don't believe it, add the line sina 100 4500 and try it yourself.)
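The suggested experiment can be run as below. The sketch appends the sina 100 4500 line to a temporary copy of the data and counts the surviving lines; LC_ALL=C is added only to keep number parsing independent of the locale.

```shell
f=$(mktemp)
printf 'google 110 5000\nbaidu 100 5000\nguge 50 3000\nsohu 100 4500\nsina 100 4500\n' > "$f"
# sohu and sina agree on both keys (100 and 4500), so -u keeps only one of them
count=$(LC_ALL=C sort -n -k 2 -k 3 -u "$f" | wc -l | tr -d ' ')
echo "$count lines remain out of 5"
rm -f "$f"
```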

11 The strangest sort:

$ sort -n -k 2.2,3.1 facebook.txt
guge 50 3000
baidu 100 5000
sohu 100 4500
google 110 5000

The key starts at the second character of the second field and ends at the first character of the third field.

It is tempting to think that this extracts 0 3 from the first line, 00 5 from the second, 00 4 from the third and 10 5 from the fourth. It does not: when no -t is given, each field includes the blank in front of it, so the second character of the second field is its first digit, and the keys are really 50, 100, 100 and 110, each followed by the blank that begins the third field.

And because of -n, sort reads only the leading number of each key and ignores everything after the first non-numeric character.

So 50 is certainly first and 110 is certainly last. But why does the baidu line come before the sohu line? (You can do your own experiments and think about it.)

The answer: the cross-field setting is an illusion. With -n, sort compares only the number in the second field, and the character at the start of the third field never takes part in the comparison. When it finds that 100 and 100 are equal, sort falls back to comparing the lines from the first field, so of course baidu comes before sohu. This can be confirmed with an example:

$ sort -n -k 2.2,3.1 -k 1,1r facebook.txt
guge 50 3000
sohu 100 4500
baidu 100 5000
google 110 5000
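One way to see why the characters after the number never take part: under -n, sort reads only the leading number of a key and ignores whatever follows it. The sketch below, on two made-up lines, shows that sort -n treats 100 5000 and 100 4500 as equal, so -u collapses them into a single line.

```shell
# Under -n both lines have the value 100, so -u treats them as duplicates
count=$(printf '100 5000\n100 4500\n' | LC_ALL=C sort -n -u | wc -l | tr -d ' ')
echo "$count"
```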

12 Sometimes you will see symbols like +1 -2 after the sort command. What are they?

Here is how the latest sort manual explains this syntax:

On older systems, 'sort' supports an obsolete origin-zero syntax '+POS1 [-POS2]' for specifying sort keys.  POSIX 1003.1-2001 (*note Standards conformance::) does not allow this; use '-k' instead.

So this ancient notation has been retired, and from now on you can rightly look down on scripts that still use it!

(In case you run into this notation in old scripts: the plus sign marks the Start part and the minus sign marks the End part. Most importantly, this notation counts from zero: what we called the first field is field 0 here, and what we called the second character is character 1 here. Got it?)
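As a rough guide to converting an old script, sort +1 -2 corresponds to sort -k 2,2. The sketch below runs only the modern form on the sample data, since current systems may reject the old one; the old form appears only in a comment.

```shell
f=$(mktemp)
printf 'google 110 5000\nbaidu 100 5000\nguge 50 3000\nsohu 100 4500\n' > "$f"
# Old scripts: sort +1 -2 file   (zero-based: start at field 1, stop before field 2)
# Modern form: sort -k 2,2 file  (one-based: field 2 only)
first=$(LC_ALL=C sort -k 2,2 "$f" | head -n 1)
echo "$first"
rm -f "$f"
```

Without n the key is compared as text, which is why 100 comes before 50 here.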

Origin www.linuxidc.com/Linux/2019-09/160713.htm