[Linux shell] Introduction to grep, sed, and awk

Today I want to talk about working with formatted text. To be honest, this is one area where I feel Linux is more powerful than Windows.

On Linux, many files and the output of many commands are highly structured. Once we understand that structure, we can use a few commands to extract exactly what we want. This post introduces the following commands:

Command   Effect
grep      Finds the lines that contain a given string
sed       Edits a standard output stream
awk       Processes fields

One. Grep command

The grep command finds the lines that contain a matching string. Without any extra options it already accepts regular expressions, and it works in a pipeline.

grep pattern
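Here is a minimal sketch of grep in a pipeline. The sample text and the variable names are made up for illustration; the pattern pie$ is an ordinary regular expression meaning "ends with pie".

```shell
# Hypothetical sample input: three lines of text.
sample='apple pie
banana split
cherry pie'

# grep reads from the pipe and keeps only the lines matching the regex "pie$".
pie_lines=$(printf '%s\n' "$sample" | grep 'pie$')
printf '%s\n' "$pie_lines"
```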

One of the most commonly used options is -v, which inverts the match: grep prints the lines that do not match the pattern.

grep -v pattern

(A good exercise with grep -v is to remove all blank lines from a file; try it yourself.)
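One possible way to do the blank-line exercise is sketched below. The sample text is hypothetical, and the pattern ^[[:space:]]*$ is my own choice so that lines containing only spaces or tabs also count as blank.

```shell
# Hypothetical sample input with empty lines between the words.
sample='first

second

third'

# -v inverts the match, so every line that is NOT blank is kept.
non_blank=$(printf '%s\n' "$sample" | grep -v '^[[:space:]]*$')
printf '%s\n' "$non_blank"
```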

To see the context around a match, use the -A, -B, or -C option.

-A followed by a number prints that many lines after each matching line, and -B prints that many lines before it. -C prints that many lines both before and after.

grep -A/B/C num pattern
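A small sketch of the three context options, using a hypothetical five-line input with a single match so the difference is easy to see:

```shell
# Hypothetical sample input; only the middle line matches.
sample='line1
line2
target
line4
line5'

after=$(printf '%s\n' "$sample" | grep -A 1 'target')    # match plus 1 line after
before=$(printf '%s\n' "$sample" | grep -B 1 'target')   # 1 line before plus match
around=$(printf '%s\n' "$sample" | grep -C 1 'target')   # 1 line on each side

printf '%s\n' "$after" '--' "$before" '--' "$around"
```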

Finally, an important point about grep is that it supports two regular-expression styles: BRE (basic regular expressions) and ERE (extended regular expressions). The difference is that ERE treats additional symbols such as | and {} as special without needing a backslash. To use extended regular expressions, you can use:

# Even though grep -E exists, I prefer to use egrep directly
# (note: recent GNU grep releases warn that egrep is obsolescent and suggest grep -E)
egrep
grep -E
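To make the BRE/ERE difference concrete, here is a sketch with a hypothetical three-line input. It relies on the fact that an unescaped | is an ordinary character in BRE and means "or" in ERE.

```shell
# Hypothetical sample input.
sample='cat
dog
bird'

# ERE: | means alternation, so lines containing "cat" or "dog" match.
ere=$(printf '%s\n' "$sample" | grep -E 'cat|dog')

# BRE: the unescaped | is a literal character, so grep looks for the
# string "cat|dog" and finds nothing (|| true hides grep's exit status 1).
bre=$(printf '%s\n' "$sample" | grep 'cat|dog' || true)

printf 'ERE result:\n%s\n' "$ere"
printf 'BRE result: [%s]\n' "$bre"
```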

 

Two. Sed command

sed is a pipeline command: it takes a standard output stream and edits it. It can add, insert, delete, and replace lines, among other operations.

(1) Add a line

There are two commands for adding a line: i and a. i inserts the content before the given line, and a appends it after the given line. (These are sed commands written inside the script, not the command-line option -i.)

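# Note that sed's usage is completely different from grep's
# The following shows how sed adds a line; 2 is just an example line number
# (sed line numbers start at 1, and this one-line form assumes GNU sed)
sed "2a content"
sed "2i content"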
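Here is a sketch of both commands on a hypothetical three-line input. It assumes GNU sed, which accepts the text on the same line as a or i; other sed implementations may require the a\ form with the text on the next line. Line numbers in sed start at 1.

```shell
# Hypothetical sample input.
sample='one
two
three'

# a: append the text AFTER line 2 (GNU sed one-line form).
appended=$(printf '%s\n' "$sample" | sed '2a inserted')

# i: insert the text BEFORE line 2 (GNU sed one-line form).
inserted=$(printf '%s\n' "$sample" | sed '2i inserted')

printf '%s\n' "$appended" '--' "$inserted"
```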

(2) Delete a line

# Two forms are shown below
# Delete one specific line (2 is just an example line number)
sed "2d"
# Delete every line in a range (1 and 2 are example line numbers)
sed "1,2d"
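A quick sketch of both delete forms on a hypothetical three-line input. Note that d takes only an address (a line number or a range) and no content after it.

```shell
# Hypothetical sample input.
sample='one
two
three'

# Delete only line 2.
without_second=$(printf '%s\n' "$sample" | sed '2d')

# Delete lines 1 through 2.
without_first_two=$(printf '%s\n' "$sample" | sed '1,2d')

printf '%s\n' "$without_second" '--' "$without_first_two"
```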

(3) Replace a line

Replacing a line works exactly like deleting a line, except that the command changes from d to c and is followed by the new content:

sed "2c content"
sed "1,2c content"
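The sketch below shows c on a hypothetical three-line input, again assuming GNU sed's one-line form. With a range, GNU sed replaces the whole range with a single copy of the new text.

```shell
# Hypothetical sample input.
sample='one
two
three'

# Replace line 2 with new text.
one_line=$(printf '%s\n' "$sample" | sed '2c replaced')

# Replace lines 1 through 2 with a single line of new text.
whole_range=$(printf '%s\n' "$sample" | sed '1,2c replaced')

printf '%s\n' "$one_line" '--' "$whole_range"
```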

(4) Replace part of a line

Note how this differs from replacing a whole line: the substitution here is the same find-and-replace you do inside vim, and the command is the same s used in vim:

sed "s/text to be replaced/replacement text/g"

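For example, to delete all comment lines that begin with #, my approach is:

# A small trick is used here: replacing the matched text with nothing has the effect of deleting it
# (this leaves an empty line behind; sed "/^#/d" removes the line itself)
sed "s/^#.*$//g"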
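To see what the substitution really does to comment lines, here is a sketch on a hypothetical config-like input. It compares the s approach with the d command applied to a regex address, which is an alternative I am adding for contrast.

```shell
# Hypothetical sample input with two comment lines.
sample='# a comment
keep this
# another comment
keep that'

# s replaces the comment text with nothing, so an empty line stays behind.
blanked=$(printf '%s\n' "$sample" | sed 's/^#.*$//')

# d with a regex address removes the matching lines entirely.
removed=$(printf '%s\n' "$sample" | sed '/^#/d')

printf '%s\n' "$blanked" '--' "$removed"
```

If the empty lines are unwanted, the second form (or piping the first through grep -v to drop blank lines) gives the cleaner result.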

 

Three. Awk command

awk processes the individual fields of each line of output. Here is a brief introduction to its concepts:

(1) Form

awk 'condition {operation1} condition {operation2} ...'

Note what awk and sed have in common: the operations that follow the command name must be wrapped in quotes. For awk, use single quotes so that the shell does not expand $1 and similar variables.

The difference is that, inside the quotes, awk can contain several conditions and sub-operations, and each sub-operation is wrapped in {}.
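As a rough illustration of the form, the sketch below uses two condition {operation} pairs in one awk program. The input is hypothetical; $1 and $2 are awk's names for the first and second field of each line.

```shell
# Hypothetical sample input: a number and a word on each line.
sample='1 apple
2 banana
3 cherry'

# Two condition{operation} pairs: each line is checked against both.
labelled=$(printf '%s\n' "$sample" |
  awk '$1 == 1 {print "first:", $2} $1 > 1 {print "later:", $2}')

printf '%s\n' "$labelled"
```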

(2) Built-in variables

awk has built-in variables, and they are the main reason awk is so well suited to processing fields.

Variables that represent fields:

  1. $0: a special variable that refers to the entire line.
  2. $1, $2, ...: these refer to the contents of the first field, the second field, and so on.

There are also some special variables, such as:

  1. NF: the total number of fields in the current line.
  2. NR: the number of the line awk is currently processing.
  3. FS: the current field separator. The default is a single space, which makes awk split on runs of whitespace.

These built-in variables can be used both in conditions and in operations.
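The variables above can be tried out with a sketch like this one. Both inputs are hypothetical; the -F option used in the second command is the usual command-line way to set FS.

```shell
# Hypothetical sample input with a different number of fields per line.
sample='a b c
d e'

# NR = current line number, NF = number of fields, $NF = the last field.
summary=$(printf '%s\n' "$sample" | awk '{print NR, NF, $NF}')

# -F: sets FS to ":" so that colon-separated text splits into fields.
names=$(printf 'root:x:0\nalice:x:1000\n' | awk -F: '{print $1}')

printf '%s\n' "$summary" '--' "$names"
```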

(3) Condition Type

Speaking of conditions, we have to mention Booleans. As it happens, awk uses a set of Boolean operators consistent with those of C++, so I will not list them here.

awk also has keywords that can be used as conditions, such as BEGIN and END. Their operations run before the first line is read and after the last line has been processed, respectively.
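Here is a sketch of BEGIN, END, and a Boolean condition on a hypothetical list of numbers. The variable name sum is my own; awk variables start out empty, which counts as 0 in arithmetic.

```shell
# Hypothetical sample input: one number per line.
scores='10
20
30'

# BEGIN runs before any input is read, END after all input is processed.
report=$(printf '%s\n' "$scores" |
  awk 'BEGIN {print "start"} {sum += $1} END {print "total", sum}')

# A C-style Boolean condition: from line 2 onward AND value greater than 15.
big=$(printf '%s\n' "$scores" | awk 'NR >= 2 && $1 > 15 {print $1}')

printf '%s\n' "$report" '--' "$big"
```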

(4) Operations

awk has built-in operations; you can look them up with man awk. For the most part, you write them roughly the way you would write a shell script.

You can even skip the condition and write the logic directly inside the {}:

# A statement is used directly here, which shows it is not that different from writing shell
# printf works much as in C++: you have to add the newline yourself
awk '{if (NR==1) printf("Hello World\n")}'
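For comparison, the sketch below runs the if version next to the same check written as a condition in front of the {}. The two-line input is hypothetical; both forms should print the greeting once, for the first line only.

```shell
# Hypothetical sample input with two lines.
sample='a
b'

# Logic written inside the braces with if.
with_if=$(printf '%s\n' "$sample" | awk '{if (NR==1) printf("Hello World\n")}')

# The same check written as a condition before the braces.
with_condition=$(printf '%s\n' "$sample" | awk 'NR==1 {printf("Hello World\n")}')

printf '%s\n' "$with_if" "$with_condition"
```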

(5) Built-in functions

Personally, I think the built-in functions are the essence of awk. awk has many of them; the best known is print, which writes its arguments to the standard output stream.

# Here print is used much as in Python
# It adds a newline automatically, and you can concatenate whatever you want to output
# Watch out for matching quotes
awk '{print $1 "==" $2}'
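A short sketch of the two common ways to combine values in print, using a hypothetical one-line input. Placing values next to each other concatenates them, while a comma separates them with the output field separator (a space by default).

```shell
# Concatenation: values placed side by side are joined with nothing in between.
joined=$(printf 'key value\n' | awk '{print $1 "==" $2}')

# Comma: values are separated by a single space by default.
spaced=$(printf 'key value\n' | awk '{print $1, $2}')

printf '%s\n' "$joined" "$spaced"
```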

awk also has string-handling functions such as split and substr; see the documentation for details.

Consider this question:

Here are two approaches. One uses index to find the position of the separator and substr to extract the substring:

awk '{print substr($1, index($1, "/") + 1) "==" $2}'

The other uses split to break the field into an array and then picks out the corresponding element:

awk '{split($1, str_arr, "/")}{print str_arr[2] "==" $2}'
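To check that the two commands agree, here is a sketch that runs both on a made-up input. The input format (a first field containing one "/" and a second field) is my assumption, inferred from the commands themselves.

```shell
# Hypothetical sample input: "prefix/name value" on each line.
sample='home/alice 100
var/log 200'

# index finds the position of "/", substr takes everything after it.
via_substr=$(printf '%s\n' "$sample" |
  awk '{print substr($1, index($1, "/") + 1) "==" $2}')

# split cuts $1 at "/" into str_arr; element 2 is the part after the slash.
via_split=$(printf '%s\n' "$sample" |
  awk '{split($1, str_arr, "/")} {print str_arr[2] "==" $2}')

printf '%s\n' "$via_substr" '--' "$via_split"
```

The two only agree when the first field contains exactly one "/". With more slashes, substr keeps everything after the first one, while str_arr[2] holds only the second piece.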

 

Origin blog.csdn.net/qq_43338695/article/details/103845369