Introduction to grep, sed, and awk (the Three Musketeers) and their basic usage

Introduction to grep

  1. grep is one of the most commonly used text processing tools in Linux.
  2. Together with sed and awk, it is known as one of the "Three Musketeers" of Linux.
  3. grep works like opening a txt file in Windows and pressing "Ctrl+F" to find a string in the text; you can think of grep as a text search tool.
  4. The name grep comes from the ed command g/re/p: globally search for a regular expression and print the matching lines.

Basic usage of grep

# grep [options] regular-expression [target files]
grep [OPTIONS] PATTERN [FILE...]
# grep [options] -e regular-expression, or -f a file containing regular expressions
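The three ways of supplying a pattern can be sketched as follows. This is a minimal illustration, not part of the original article: the file names users.txt and patterns.txt and their contents are hypothetical sample data.

```shell
# Create a small sample file and a pattern file (both names are hypothetical).
workdir=$(mktemp -d)
printf 'root:x:0:0\nbin:x:1:1\ndaemon:x:2:2\nadm:x:3:4\n' > "$workdir/users.txt"
printf 'root\nadm\n' > "$workdir/patterns.txt"

# A single pattern given positionally.
grep 'bin' "$workdir/users.txt"

# Two patterns with -e: a line is printed if either pattern matches.
grep -e 'root' -e 'daemon' "$workdir/users.txt"

# Patterns read from a file with -f, one pattern per line.
grep -f "$workdir/patterns.txt" "$workdir/users.txt"
```

With this sample data the first command prints the bin line, the second prints the root and daemon lines, and the third prints the root and adm lines.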

Common parameters

  • -v: Invert the match; display the lines that do not match the pattern

  • -n: display the line number of the matching result

  • -i: Ignore case when searching

  • -o: Display only the part of each line that matches the pattern

  • -e: Specify a pattern; repeat -e to give several patterns, which are combined with logical OR

  • -E: Use extended regular expressions; grep -E is equivalent to egrep

  • -w: Match the pattern only as a whole word

  • -c: Count the matching lines; the result is the number of lines that match, not the number of individual matches

  • -q: Quiet mode; nothing is printed, and the exit status shows whether a match was found

  • -l: Do not display the matching lines; display only the names of the files that contain a match

  • -A<number of lines> or --after-context=<number of lines>: In addition to each matching line, display the given number of lines that follow it.

  • -b or --byte-offset: Before each matching line, print the byte offset of that line within the file.
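The options above can be tried on a small file. The sketch below assumes GNU grep; the log-like sample lines are hypothetical and exist only for illustration.

```shell
# Hypothetical sample data used only for this illustration.
sample=$(mktemp)
printf 'Error: disk full\nwarning: low memory\nerror: timeout\ninfo: errors logged\ninfo: done\n' > "$sample"

grep -n 'error' "$sample"         # matching lines, prefixed with line numbers
grep -i -c 'error' "$sample"      # count of lines matching case-insensitively
grep -v 'info' "$sample"          # lines that do NOT contain "info"
grep -w 'error' "$sample"         # "error" as a whole word, so "errors" is skipped
grep -o 'low [a-z]*' "$sample"    # only the matched text, not the whole line
grep -A1 'warning' "$sample"      # the matching line plus one line after it
grep -E 'disk|timeout' "$sample"  # extended regex alternation
grep -l 'done' "$sample"          # only the name of the file that matches
grep -b 'warning' "$sample"       # matching line prefixed with its byte offset

# -q prints nothing; the exit status says whether a match was found.
if grep -q 'timeout' "$sample"; then
    echo "timeout found"
fi
```

Because -q reports its result only through the exit status, it is the natural choice inside if statements and other shell conditions.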

Introduction to sed

sed is a stream editor, mainly used to filter and replace text content.

Working principle:

  1. sed is a non-interactive stream editor. It processes text one line at a time: each line is read into a memory buffer called the pattern space.
  2. The original file is not modified by default; to modify it, add the -i option.
  3. sed has a pattern space and a hold space; the content of the pattern space is printed to standard output by default.
  4. Each line that sed reads stays in the pattern space while the commands are applied to it; the hold space keeps its content from one line to the next.
  5. Basic and extended regular expressions are both supported (extended ones with -E or -r); the exception is the y command, which maps characters literally.
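The points above can be observed directly. This is a minimal sketch assuming GNU sed (the -i syntax differs on BSD sed); the three-line file is hypothetical sample data, and the reversing one-liner is a commonly cited idiom rather than something from the original article.

```shell
# Hypothetical three-line file for illustration.
notes=$(mktemp)
printf 'one\ntwo\nthree\n' > "$notes"

# Without -i the edited text goes to standard output; the file is untouched.
sed 's/two/2/' "$notes"
cat "$notes"

# Each line is loaded into the pattern space and printed automatically.
# With -n nothing is printed unless a command such as p asks for it.
sed -n '2p' "$notes"

# Hold space: G appends the hold space to the pattern space and h copies the
# pattern space into the hold space, so this prints the lines in reverse order.
sed -n '1!G;h;$p' "$notes"

# With -i (GNU sed syntax) the file itself is rewritten.
sed -i 's/two/2/' "$notes"
cat "$notes"
```

The first cat still shows "two"; only after the -i run does the file contain "2".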

Basic usage of sed

sed [OPTIONS] 'ADDRESS and COMMAND' FILE

Commonly used options

  • -i: edit the file in place
  • -n: suppress the default output (the automatic printing of the pattern space)

Commonly used commands

  • p: print

  • d: delete

  • s: substitute (replace the matched string); with the g flag every match on a line is replaced (without g, only the first match on each line is replaced)

  • y: transliterate characters (one-to-one correspondence)

  • i: Insert (before matching line)

  • a: Append (after matching line)

  • c: change (the entire matching line is replaced)

  • r: read from a file

  • w: write to a file

  • q: quit; with an address, sed exits as soon as it reaches the first line that matches
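Most of the commands listed above can be sketched on one small file. The example assumes GNU sed, whose one-line forms of i, a, and c are an extension (POSIX sed expects a backslash and a newline before the text); the configuration-style sample lines are hypothetical.

```shell
# Hypothetical sample data used only for this illustration.
conf=$(mktemp)
printf 'name=alpha\nmode=test test\n# comment\nport=80\n' > "$conf"

sed -n '/^port/p' "$conf"                  # p: print only lines starting with "port"
sed '/^#/d' "$conf"                        # d: delete comment lines
sed 's/test/prod/' "$conf"                 # s: replace the first "test" on each line
sed 's/test/prod/g' "$conf"                # s with g: replace every "test" on each line
sed 'y/abc/ABC/' "$conf"                   # y: map a->A, b->B, c->C
sed '/^port/i # listening port' "$conf"    # i: insert a line before the match
sed '/^port/a # end of settings' "$conf"   # a: append a line after the match
sed '/^mode/c mode=prod' "$conf"           # c: replace the whole matching line
sed '2q' "$conf"                           # q: quit after line 2
```

None of these commands changes the sample file, because -i is not used; each one only writes its result to standard output.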

Introduction to awk

  1. Awk is an excellent text processing tool. (The name is derived from the first letters of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.)

  2. An awk program is made of pattern-action pairs: the pattern describes what awk looks for in the data, and the action is a series of commands that are executed when a matching record is found.

Basic usage of awk

-F specifies the field separator; for example, -F ':' sets it to a colon.

Syntax format:

awk  -F  'SEPARATOR'  '/PATTERN/{ACTION}'   FILE
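The syntax above can be sketched with a passwd-style file. The three account lines, including the user alice, are hypothetical sample data and not taken from a real system.

```shell
# Hypothetical passwd-style sample data.
accounts=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\nalice:x:1000:1000::/home/alice:/bin/bash\n' > "$accounts"

# The pattern /bash$/ selects lines ending in "bash"; the action prints field 1.
awk -F ':' '/bash$/ {print $1}' "$accounts"

# A pattern can also be a comparison: print name and UID where the UID is >= 1000.
awk -F ':' '$3 >= 1000 {print $1, $3}' "$accounts"
```

The first command prints root and alice; the second prints "alice 1000".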

awk built-in variables

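Variable     Description
$0           The current record (as a single variable)
$1 to $n     The nth field of the current record; fields are separated by FS
FS           Input field separator; the default is a space
NF           Number of fields in the current record, i.e. the number of columns
NR           Number of records read so far, i.e. the line number, starting from 1
RS           Input record separator; the default is a newline
OFS          Output field separator; the default is also a space
ORS          Output record separator; the default is a newline
ARGC         Number of command-line arguments
ARGV         Array of command-line arguments
FILENAME     Name of the current input file
IGNORECASE   If true, matching ignores case (gawk only)
ARGIND       Index in ARGV of the file currently being processed (gawk only)
CONVFMT      Number conversion format; the default is %.6g
ENVIRON      Array of UNIX environment variables
ERRNO        UNIX system error message (gawk only)
FIELDWIDTHS  Whitespace-separated list of input field widths (gawk only)
FNR          Record number within the current file
OFMT         Output format for numbers; the default is %.6g
RSTART       Start position of the string matched by the match() function
RLENGTH      Length of the string matched by the match() function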
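A few of these variables are easier to tell apart when they are printed side by side. The sketch below uses two hypothetical files, first.txt and second.txt; it sets OFS with the standard -v option, which the original article does not cover.

```shell
# Two hypothetical input files.
dir=$(mktemp -d)
printf 'a b c\nd e\n' > "$dir/first.txt"
printf 'f g h i\n' > "$dir/second.txt"

# NR counts records across all files, FNR restarts at 1 for each file,
# and NF is the number of fields in the current record.
awk '{print FILENAME, NR, FNR, NF}' "$dir/first.txt" "$dir/second.txt"

# OFS is placed between the comma-separated arguments of print.
awk -v OFS='-' '{print $1, $NF}' "$dir/first.txt"
```

For the single line of second.txt, NR is 3 while FNR is 1. The second command prints "a-c" and "d-e".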

awk commonly used variables

  • $NF: the last field (NF itself holds the number of fields in the current record)
[root@localhost ~]# awk -F ':' '{print $NF}' /etc/passwd
/bin/bash
/sbin/nologin
/sbin/nologin
/sbin/nologin
/sbin/nologin
  • $(NF-1): the second-to-last field
[root@localhost ~]# awk -F ':' '{print $(NF-1)}' /etc/passwd
/root
/bin
/sbin
/var/adm
/var/spool/lpd
  • NR: the number of the record (line) currently being processed
Print the second line:
[root@localhost ~]# awk -F ":" 'NR==2 {print}' /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
Print every line after the third:
[root@localhost ~]# awk -F ':' 'NR>3 {print}' /etc/passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown

awk built-in functions

  • toupper() is used to convert strings to uppercase
[root@localhost ~]# awk -F ':' '{print toupper($1)}' /etc/passwd
ROOT
BIN
DAEMON
ADM
LP
  • tolower() is used to convert strings to lowercase
[root@localhost ~]# awk '{print tolower($1)}' /tmp/aa.txt
root
bin
daemon
adm

awk if else statement

  • If $1 is "root", print the first field; otherwise print the second field
[root@localhost ~]# awk -F : '{if ($1=="root") print $1;else print $2}' /etc/passwd
root
x
x
x
x

awk advanced usage: BEGIN and END

awk '
BEGIN { actions }
/pattern/ { actions }
/pattern/ { actions }
……….
END { actions }
' filenames 
  • BEGIN pattern: awk executes the action specified in BEGIN once, before reading any input line.
  • END pattern: awk executes the action specified in END once, after all input has been read and before it exits.
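The BEGIN and END patterns can be sketched with a header line and a summary line. The score data and the variable names passed and total are hypothetical and chosen only for this illustration.

```shell
# Hypothetical sample data: a name and a score per line.
scores=$(mktemp)
printf 'alice 90\nbob 75\ncarol 81\n' > "$scores"

awk '
BEGIN { print "name score" }          # runs once, before any input is read
$2 >= 80 { print $1, $2; passed++ }   # runs for each line whose second field is >= 80
{ total += $2 }                       # no pattern: runs for every line
END { print "passed:", passed, "average:", total / NR }   # runs once, after all input
' "$scores"
```

In the END block NR still holds the number of the last record read, so total / NR is the average over all lines.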

Origin blog.csdn.net/Q274948451/article/details/108926674