1.cut command
The default field separator for cut is the tab character (the Tab key).
1.1 Command Format
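cut [options] filename
-f field list: extract the given column(s)
-d delimiter: split columns on the specified delimiter
-c character range: select by character position instead of by delimiter (the first character of a line is position 1). "n-" means from the nth character to the end of the line; "n-m" means from the nth character to the mth character; "-m" means from the 1st character to the mth character.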
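The three range forms of -c can be tried on a single string. This is a minimal sketch; the sample string and the variable names are made up for illustration.

```shell
# Character ranges with cut -c; positions are counted from 1.
line='abcdefgh'

from_third=$(printf '%s\n' "$line" | cut -c 3-)         # 3rd character to end of line
second_to_fourth=$(printf '%s\n' "$line" | cut -c 2-4)  # 2nd through 4th character
up_to_third=$(printf '%s\n' "$line" | cut -c -3)        # 1st through 3rd character

echo "$from_third"        # cdefgh
echo "$second_to_fourth"  # bcd
echo "$up_to_third"       # abc
```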
1.2 Example command
Data (columns must be tab-separated for the cut -f examples to work)
vi student.txt
ID Name gender Mark
1 Liming M 86
2 Sc M 90
3 Tg M 83
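cut -f 2 student.txt    # extract the second column
grep -v 'Name' student.txt | cut -f 2    # show the second column without the header row
cut -f 2,3 student.txt    # extract columns 2 and 3
cut -d ':' -f 1,2,3 /etc/passwd    # split each line of /etc/passwd on ":" and extract columns 1, 2 and 3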
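The examples above can be reproduced without touching student.txt or /etc/passwd. The sketch below assumes a temporary file created with mktemp and a made-up passwd-style line. Note that cut after a pipe takes no filename, otherwise it would read the file again and ignore the piped input.

```shell
# Build a tab-separated copy of the sample data in a temporary file.
tmp=$(mktemp)
printf 'ID\tName\tgender\tMark\n1\tLiming\tM\t86\n2\tSc\tM\t90\n3\tTg\tM\t83\n' > "$tmp"

# Drop the header row, then take column 2 from the piped input.
names=$(grep -v 'Name' "$tmp" | cut -f 2)

# With a separator other than tab, -d must name it explicitly.
name_and_uid=$(printf 'root:x:0:0\n' | cut -d ':' -f 1,3)

echo "$names"
echo "$name_and_uid"   # root:0
rm -f "$tmp"
```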
2.awk command
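2.1 printf formatted output
printf 'output-type output-format' content
Output types:
%ns: output a string. n is a number giving the minimum field width in characters.
%ni: output an integer. n is a number giving the minimum field width in characters.
%m.nf: output a floating-point number. m is the total field width and n is the number of decimal places. For example, %8.2f outputs a field 8 characters wide with 2 decimal places; the decimal point counts toward the width, which leaves 5 positions for the integer part.
Output formats:
\a: sound the alert (bell)
\b: backspace
\f: form feed
\n: newline
\r: carriage return
\t: horizontal tab (the Tab key)
\v: vertical tab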
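A short sketch of how the width specifiers pad their arguments. The brackets are only there to make the padding visible, and the values are arbitrary.

```shell
LC_ALL=C; export LC_ALL   # keep "." as the decimal point regardless of locale

# %5s and %3i pad on the left; %8.2f rounds to 2 decimals inside 8 characters.
padded=$(printf '[%5s][%3i][%8.2f]' ab 7 3.14159)
echo "$padded"   # [   ab][  7][    3.14]

# \t and \n in the format string become a tab and a newline.
printf 'Name\tMark\nSc\t90\n'
```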
2.2 demo data
vi student.txt
ID Name PHP Linux MySQL Average
1 Liming 82 95 86 87.66
2 Sc 74 96 87 85.66
3 Tg 99 83 93 91.66
printf '%s' $(cat student.txt)    # no separators, so the output runs together
printf '%s\t %s\t %s\t %s\t %s\t %s\t \n' $(cat student.txt)    # output in the specified format
printf '%i\t %s\t %i\t %i\t %i\t %8.2f\t \n' \
$(cat student.txt | grep -v Name)    # convert the fields to the specified types
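These commands work because printf reuses its format string until all arguments are consumed. A minimal sketch with four literal arguments standing in for the words of the file:

```shell
# Two conversions in the format, four arguments: the format is applied twice.
rows=$(printf '%s\t%s\n' ID Name 1 Liming)
echo "$rows"

# The unquoted $(cat file) is split on spaces, tabs and newlines alike,
# so the line structure of the file is lost; only the \n in the format
# string puts the rows back together.
```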
2.3 awk command format
awk 'condition1 {action1} condition2 {action2} ...' filename
awk '{printf $2 "\t" $6 "\n"}' student.txt    # output columns 2 and 6
2.4 awk conditions
| Condition type | Condition | Description |
| --- | --- | --- |
| awk reserved words | BEGIN | Runs when the awk program starts, before any data has been read. The action after BEGIN is executed only once, at the beginning of the program. |
| | END | Runs after awk has processed all the data, when execution is about to end. The action after END is executed only once, at the end of the program. |
| Relational operators | > | Greater than |
| | < | Less than |
| | >= | Greater than or equal to |
| | <= | Less than or equal to |
| | == | Equal to. Tests whether two values are equal; to assign a value to a variable, use "=" instead. |
| | != | Not equal to |
| | A~B | Tests whether string A contains a substring that matches the expression B |
| | A!~B | Tests whether string A does not contain a substring that matches the expression B |
| Regular expressions | /regex/ | Plain characters can be written between the slashes, and regular expressions are supported as well |
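The condition types listed above can be exercised on the demo data. This is a sketch that assumes a temporary copy of the data; NR (the current line number) and && (logical AND) are standard awk but are used here only to skip the header row.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
ID Name PHP Linux MySQL Average
1 Liming 82 95 86 87.66
2 Sc 74 96 87 85.66
3 Tg 99 83 93 91.66
EOF

# Relational condition: names whose PHP score is below 90 (NR > 1 skips the header).
low_php=$(awk 'NR > 1 && $3 < 90 {print $2}' "$tmp")
# Match condition: the Average of rows whose Name field matches /^T/.
t_avg=$(awk '$2 ~ /^T/ {print $6}' "$tmp")
# Bare regular expression: it is tested against the whole line.
sc_id=$(awk '/Sc/ {print $1}' "$tmp")

echo "$low_php"
echo "$t_avg"
echo "$sc_id"
rm -f "$tmp"
```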
awk execution flow
1) If there is a BEGIN condition, the action defined for BEGIN is executed first.
2) awk then reads the first line and assigns its data to the variables $0, $1, $2, and so on. $0 holds the entire line, $1 the first field, and $2 the second field.
3) awk decides whether to perform each action based on its condition. If the condition is met, the action is performed; otherwise awk moves on to the next line. If there is no condition, the action is performed for every line.
4) awk reads the next line and repeats the steps above.
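The flow can be made visible by having each stage print something. A minimal sketch with three made-up input lines; the END block, described in the conditions table, runs after the last line.

```shell
report=$(printf '82\n74\n99\n' | awk '
BEGIN { print "start" }                        # runs once, before any input is read
      { sum += $1; print "line " NR ": " $0 }  # no condition: runs for every line
END   { print "sum=" sum }                     # runs once, after the last line
')
echo "$report"
```

The output starts with "start", lists the three lines in order, and ends with "sum=255".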
awk built-in variables
| Variable | Role |
| --- | --- |
| $0 | The entire line that awk has just read. awk reads data line by line, and $0 holds the whole of the current line. |
| $n | The nth field of the current line. |
| NF | The total number of fields (columns) in the current line. |
| NR | The number of the line awk is currently processing, counted across all input. |
| FS | The input field separator. By default awk splits fields on whitespace (runs of spaces and tabs). To use another separator (e.g. ":"), set the FS variable. |
| ARGC | The number of command-line arguments. |
| ARGV | An array of the command-line arguments. |
| FNR | The record number within the current input file (starting at 1 for each file). |
| OFMT | The output format for numbers (default %.6g). |
| OFS | The output field separator (default: a space). |
| ORS | The output record separator (default: newline). |
| RS | The input record separator (default: newline). |
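A small sketch showing several of these variables at once, on two made-up colon-separated lines. The commas in print are what insert OFS between the values.

```shell
out=$(printf 'a:b:c\nd:e\n' |
  awk 'BEGIN {FS=":"; OFS="-"} {print NR, NF, $1, $NF}')
echo "$out"
# 1-3-a-c
# 2-2-d-e
```

Here $NF is the last field of the line, because NF holds the field count.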
2.5 Examples
cat student.txt | grep -v Name | \
awk '$6 >= 87 {printf $2 "\n" }'    # if the value in column 6 is at least 87, print column 2
awk '$2 ~ /Sc/ {printf $6 "\n"}' student.txt    # get Sc's average score
cat /etc/passwd | grep "/bin/bash" | \
awk 'BEGIN {FS=":"} {printf $1 "\t" $3 "\n"}'    # list the name and UID of users who can log in
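One thing to keep in mind when setting FS: an assignment inside an ordinary action only affects lines read afterwards, because the current line has already been split. The sketch below uses two made-up passwd-style lines to compare that with setting FS in BEGIN.

```shell
# FS set in a normal action: the first line was already split on whitespace.
in_action=$(printf 'root:x:0\nuser:x:1000\n' | awk '{FS=":"} {print $1}')
# FS set in BEGIN: in effect before the first line is read.
in_begin=$(printf 'root:x:0\nuser:x:1000\n' | awk 'BEGIN {FS=":"} {print $1}')

echo "$in_action"   # first line comes out whole: root:x:0, then user
echo "$in_begin"    # root, then user
```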
3.sed command
sed is mainly used to select, replace, delete, and add data.
3.1 Syntax
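sed [options] '[action]' filename
Options:
-n: sed normally prints every line to the screen. With this option, only lines explicitly printed by a sed command (such as p) are output.
-e: allow multiple sed editing commands to be applied to the input.
-f script-file: read sed commands from a sed script file. Very similar to awk's -f option.
-r: enable extended regular expressions in sed.
-i: write sed's changes directly back to the file being read instead of printing them to the screen.
Actions:
a \: append. Adds one or more lines after the current line. When adding multiple lines, every line except the last must end with "\" to show that the data continues.
c \: change. Replaces the original line with the text after c. When replacing with multiple lines, every line except the last must end with "\" to show that the data continues.
i \: insert. Inserts one or more lines before the current line. When inserting multiple lines, every line except the last must end with "\" to show that the data continues.
d: delete the specified lines.
p: print the specified lines.
s: substitute one string for another. The format is "line-range s/old string/new string/g" (similar to the substitute format in vim).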
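The p, d and s actions can be tried on piped input, which leaves no file to modify by accident. A minimal sketch with three made-up lines:

```shell
input=$(printf 'one\ntwo\nthree')

second=$(printf '%s\n' "$input" | sed -n '2p')         # -n plus p: only line 2
without_second=$(printf '%s\n' "$input" | sed '2d')    # everything except line 2
replaced=$(printf '%s\n' "$input" | sed '3s/e/E/g')    # substitute on line 3 only

echo "$second"
echo "$without_second"
echo "$replaced"
```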
3.2 Exercises
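sed -n '2p' student.txt    # print the second line
sed '2,4d' student.txt    # delete lines 2-4 from the output; the file itself is not modified
sed -i '2,4d' student.txt    # delete lines 2-4 and modify the file
sed '2a hello' student.txt    # append "hello" after the second line
sed '2i hello \
world' student.txt    # insert two lines before the second line; the trailing \ continues the text onto the next line
cat student.txt | sed '2c No such person'    # replace the second line with the given text
sed 's/old string/new string/g' filename    # string substitution
sed '3s/74/99/g' student.txt    # replace the string on the third line
sed '4s/^/#/g' student.txt    # comment out line 4
sed -e 's/Liming//g ; s/Tg//g' student.txt    # run multiple commands with the -e option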
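Because -i overwrites the file, it can help to preview the result first and keep a backup. The sketch below assumes GNU sed, where a suffix attached to -i (here .bak) saves the original under that extension. The temporary file and its contents are made up.

```shell
tmp=$(mktemp)
printf 'ID Name\n1 Liming\n2 Sc\n3 Tg\n' > "$tmp"

# Two commands in one run, each given with -e; the file on disk is untouched.
preview=$(sed -e 's/Liming/LM/' -e '4s/^/#/' "$tmp")

# In-place edit that keeps the original as "$tmp.bak" (GNU sed suffix form).
sed -i.bak '2,3d' "$tmp"
after=$(cat "$tmp")
backup_lines=$(wc -l < "$tmp.bak" | tr -d ' ')

echo "$preview"
echo "$after"
echo "$backup_lines"   # 4
rm -f "$tmp" "$tmp.bak"
```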
4.sort command
4.1 Command Format
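sort [options] filename
Options:
-f: ignore case
-b: ignore leading blanks on each line
-n: sort numerically; the default is to sort as strings
-r: reverse the sort order
-u: remove duplicate lines, like the uniq command
-t: specify the field separator; by default fields are separated by blanks (spaces and tabs)
-k n[,m]: sort by the specified field range, starting at field n and ending at field m (default: end of line)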
4.2 Test Sample
sort /etc/passwd    # sort the user information file
sort -r /etc/passwd    # reverse sort
sort -t ":" -k 3,3 /etc/passwd    # split each line on ":" and sort by the third field
sort -n -t ":" -k 3,3 /etc/passwd    # sort by the third field as a number
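The difference between the last two commands shows up as soon as the numbers have different lengths. A minimal sketch on made-up input, with the locale pinned to C so the string order is predictable:

```shell
LC_ALL=C; export LC_ALL

as_text=$(printf '10\n9\n2\n' | sort)        # compared character by character
as_number=$(printf '10\n9\n2\n' | sort -n)   # compared by numeric value
by_field=$(printf 'b:20\na:100\nc:3\n' | sort -n -t ':' -k 2,2)

echo "$as_text"     # 10, 2, 9
echo "$as_number"   # 2, 9, 10
echo "$by_field"    # c:3, b:20, a:100
```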
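uniq
The uniq command removes duplicate lines. It only collapses adjacent duplicates, so on sorted input it gives the same result as "sort -u". The command format is as follows:
[root@localhost ~]# uniq [options] filename
Options:
-i: ignore case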
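uniq compares each line only with the one before it, which is why it is usually paired with sort. A small sketch on made-up input:

```shell
raw=$(printf 'a\nb\na\n' | uniq)            # the two "a" lines are not adjacent
sorted=$(printf 'a\nb\na\n' | sort | uniq)  # sorting first makes them adjacent
sort_u=$(printf 'a\nb\na\n' | sort -u)      # same result in one command
nocase=$(printf 'A\na\nb\n' | uniq -i)      # "A" and "a" count as duplicates

echo "$raw"      # a, b, a
echo "$sorted"   # a, b
echo "$nocase"   # A, b
```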
The wc counting command
[root@localhost ~]# wc [options] filename
Options:
-l: count lines only
-w: count words only
-m: count characters only
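A quick sketch of the three counters on two made-up lines. Some wc implementations pad the number with spaces, so tr strips them before the value is used.

```shell
text='hello world
foo'

lines=$(printf '%s\n' "$text" | wc -l | tr -d ' ')
words=$(printf '%s\n' "$text" | wc -w | tr -d ' ')
chars=$(printf '%s\n' "$text" | wc -m | tr -d ' ')   # newlines are counted too

echo "$lines $words $chars"   # 2 3 16
```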