sed: The Old Guard of Shell Programming

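Beyond vi/vim, the text-editing world has two more siblings: sed and gawk.

sed is short for stream editor. Unlike an interactive editor, it edits a stream of data according to rules you supply in advance, before it starts processing the data.

sed is a little like Kafka: it processes data one line at a time, smoothly and without delay. Kafka is of course far more powerful: it captures data in real time, supports more complex processing logic, and can deliver the results almost anywhere. sed reads text (from files or standard input) and can only produce plain text, so what it implements is a text-to-text transformation.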

The sed command format:
sed options script file

The full documentation is here:

https://www.gnu.org/software/sed/manual/sed.html

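The three most commonly used options are shown here:

sed -e 's/cat/dog/' logfile.txt
sed -f Wordchange.sed logfile.txt
sed -e 's/cat/dog/' -i logfile.txt

-e: specifies a command expression; s/cat/dog/ replaces cat with dog in the text (only the first occurrence on each line unless the g flag is added);

-f: if there are many expressions and they change often, storing them in a file is more efficient; -f names that command file;

-i: in-place, i.e. modify the file directly and save it. Without -i, sed writes the modified result to standard output, that is, the screen, and leaves the file untouched.

Most of the work revolves around the script: the commands in it carry out the text transformation. The options are mostly optional behaviors, such as editing the file in place instead of printing the result, or reading multiple commands from a file.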
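The three options can be tried side by side on a throwaway file. This is a minimal sketch: the temporary directory and the file names in it are made up for the illustration, and the -i form shown is the GNU sed one (BSD/macOS sed expects an explicit suffix argument after -i).

```shell
# Work in a throwaway directory so nothing real is modified.
workdir=$(mktemp -d)
printf 'the cat sat\nanother cat\n' > "$workdir/logfile.txt"

# -e: the script is given on the command line; the result goes to stdout.
out_e=$(sed -e 's/cat/dog/' "$workdir/logfile.txt")

# -f: the same script, read from a command file.
printf 's/cat/dog/\n' > "$workdir/wordchange.sed"
out_f=$(sed -f "$workdir/wordchange.sed" "$workdir/logfile.txt")

# -i: edit the file itself (GNU sed syntax); nothing is printed.
sed -i -e 's/cat/dog/' "$workdir/logfile.txt"
out_i=$(cat "$workdir/logfile.txt")

printf '%s\n' "$out_e"
rm -rf "$workdir"
```

All three variables end up holding the same two lines; only the place where the result lands differs.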

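To win the revolution, the route has to be right. So the first thing to conquer is the script. The manual offers a guide that goes from simple to advanced: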

• sed script overview:        sed script overview
• sed commands list:        sed commands summary
• The "s" Command:        sed’s Swiss Army Knife
• Common Commands:        Often used commands
• Other Commands:        Less frequently used commands
• Programming Commands:        Commands for sed gurus
• Extended Commands:        Commands specific of GNU sed
• Multiple commands syntax:        Extension for easier scripting
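sed script overview
[line address]X[options]

Single command or multiple commands, one line or many, inline or in a command file: every command follows the format above.

line address is the range of lines to act on, for example '30,50' for lines 30 through 50;

X is a single-letter command. That is simple enough but hard to remember, so keeping a list of the X commands on your desktop for quick lookup may help;

options are the optional arguments of the single-letter command.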

sed -e '/^foo/d' -e 's/hello/world/' input.txt > output.txt

echo 's/hello/world/' > script2.sed
sed -e '/^foo/d' -f script2.sed input.txt > output.txt

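/^foo/ is more flexible than a line number (every line in a text file has one): ^foo matches the lines that begin with foo;

d is the delete command, which removes the matching lines;

-e and -f can each be given multiple times, which is how multiple commands are specified.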
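To see the [line address]X[options] shape with both kinds of address, here is a small sketch that feeds sed from seq and printf instead of real files; the sample text is invented for the illustration.

```shell
# Numeric range address: delete lines 2 through 4.
range_out=$(seq 6 | sed -e '2,4d')

# Regex address plus a second -e: drop lines starting with foo,
# then run a substitution on whatever is left.
multi_out=$(printf 'foo line\nhello there\n' | sed -e '/^foo/d' -e 's/hello/world/')

printf '%s\n' "$range_out" "$multi_out"
# prints 1, 5, 6 and then "world there"
```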

Common sed commands

There are too many commands to cover, so let's pick a few common ones.

Swiss Army Knife - s
[root@centos00 _data]# echo 'this is a cat dog' | sed -e 's/cat/fat/'
this is a fat dog
[root@centos00 _data]#

The s command must follow this format exactly:

s/original word/replaced word/

Otherwise you get an error like this:

sed: -e expression #1, char 9: unterminated `s' command
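A quick way to convince yourself that the closing delimiter matters is to compare the two forms and look at the exit status. This is only a sketch; the variable names are mine, and /dev/null is used as input because sed rejects the broken script before reading anything.

```shell
# Well-formed: three delimiters, so sed accepts the script.
good=$(echo 'this is a cat dog' | sed -e 's/cat/fat/')

# Closing delimiter missing: sed refuses the script and exits non-zero.
if sed -e 's/cat/fat' /dev/null 2>/dev/null; then
    bad_status=0
else
    bad_status=$?
fi

echo "$good"                    # this is a fat dog
echo "exit status: $bad_status"
```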
Other common commands
{#;d;q;p;n}

q - quit: exit at the current line (of the file being processed) without processing any further lines

[root@centos00 _data]# seq 5
1
2
3
4
5
[root@centos00 _data]# seq 5 | sed 3q
1
2
3
[root@centos00 _data]#

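seq is the sequence command; it generates a sequence of numbers;

3q is a single-letter sed command with an address: 3 means the third line, and q means quit

d - delete: removes the lines that match the address, which can be a line number or a pattern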

[root@centos00 _data]# seq 5 | sed 3d
1
2
4
5
[root@centos00 _data]#

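p - print: prints the current line. It is normally combined with the -n option; without -n, sed already prints every line automatically, so the matching line shows up twice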

[root@centos00 _data]# seq 5 | sed 3p
1
2
3
3
4
5
[root@centos00 _data]# seq 5 | sed 3p -n
3
[root@centos00 _data]#

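I could not find where the letter n in the -n option comes from, so for now I read it as "no print". The manual lists --quiet and --silent as its long forms and describes it as disabling the automatic printing of the pattern space.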
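The interplay between autoprint and p is easy to check with seq as input. A minimal sketch, with variable names chosen just for this example:

```shell
# Without -n: autoprint plus the explicit p, so line 3 appears twice.
with_autoprint=$(seq 5 | sed '3p')

# With -n: autoprint is off, so only the explicit p produces output.
only_p=$(seq 5 | sed -n '3p')

# A range address works the same way.
range_p=$(seq 5 | sed -n '2,4p')

printf '%s\n' "$only_p"   # 3
```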

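n - next line: prints the current line (unless -n is in effect), reads the next input line into the pattern space, and continues with the remaining commands on that line. The effect is to edit every other line; each additional n skips one more line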

[root@centos00 _data]# seq 5 | sed 'n;s/./new line/'
1
new line
3
new line
5
[root@centos00 _data]#

[root@centos00 _data]# seq 5 | sed 'n;n;s/./new line/'
1
2
new line
4
5
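[root@centos00 _data]#

{#;d;q;p;n} - grouping commands with { and ;

The previous example, 'n;n;s/./new line/', already shows that ";" chains several commands so that they run one after another on each line. To apply a group of commands only to the lines matching an address, wrap them in "{}":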

[root@centos00 _data]# seq 5 | sed -n '2{s/./new line/;p}'
new line
[root@centos00 _data]#
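The difference the braces make shows up as soon as the address is a pattern. In this sketch (the three-word input is made up), the first command groups s and p under the address, while the second leaves p outside of it:

```shell
# Grouped: both s and p apply only to the line matching /beta/.
grouped=$(printf 'alpha\nbeta\ngamma\n' | sed -n '/beta/{s/beta/BETA/;p}')

# Not grouped: the address covers only s; p then runs on every line.
ungrouped=$(printf 'alpha\nbeta\ngamma\n' | sed -n '/beta/s/beta/BETA/;p')

printf '%s\n' "$grouped"     # BETA
printf '%s\n' "$ungrouped"   # alpha, BETA, gamma
```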

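These examples raise a question. In the single-letter command format

[line address]X[options]

line address can be a numeric line number, or it can select the lines that satisfy a condition. So what kinds of conditions can go into [line address]?

For example, how would I write an expression that prints the even-numbered lines?

The n command can certainly solve this, but what we want to explore here is [line address].

# Using a single-letter command: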

[root@centos00 _data]# seq 8 | sed -n '{n;p}'
2
4
6
8
[root@centos00 _data]#

With ~ (first~step, a GNU sed extension) you can print every Nth line:

[root@centos00 _data]# seq 8 | sed -n '0~2p'
2
4
6
8
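The first~step form takes any starting line and any step. A short sketch, assuming GNU sed (other implementations may not accept this address form):

```shell
# first~step is a GNU sed extension: start at line "first", then every "step" lines.
odd=$(seq 8 | sed -n '1~2p')           # lines 1, 3, 5, 7
every_third=$(seq 10 | sed -n '0~3p')  # lines 3, 6, 9

printf '%s\n' "$odd" "$every_third"
```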

[line address] can also be a regular expression:

[root@centos00 _data]# seq 20 | sed -n '/[2]/p'
2
12
20
[root@centos00 _data]#

/regexp/ is the way to write a regular expression address; here it simply prints the lines that contain the character 2.
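Regular expression addresses can be anchored, and two of them can delimit a range. The following sketch uses invented BEGIN/END marker lines to show both:

```shell
# Anchored regex: only lines that start with 2 (so 12 no longer matches).
starts_with_2=$(seq 20 | sed -n '/^2/p')

# Two regex addresses form a range: from the first match through the second.
block=$(printf 'a\nBEGIN\nb\nEND\nc\n' | sed -n '/BEGIN/,/END/p')

printf '%s\n' "$starts_with_2" "$block"
```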

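In IT, reading theory without practicing gets you nowhere. Take me as an example: while I worked with Oracle I used sed every day and was fairly fluent, then I switched to SQL Server for a few years, and when I came back to sed I was rusty. The boxer keeps his fists moving, the singer keeps singing, and the writer keeps coding: it is all true.

Here are 20 exercises from the documentation, noted down for future reference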
  • 1 Joining lines
  • 2 Centering Lines
  • 3 Increment a Number
  • 4 Rename Files to Lower Case
  • 5 Print bash Environment
  • 6 Reverse Characters of Lines
  • 7 Text search across multiple lines
  • 8 Line length adjustment
  • 9 Reverse Lines of Files
  • 10 Numbering Lines
  • 11 Numbering Non-blank Lines
  • 12 Counting Characters
  • 13 Counting Words
  • 14 Counting Lines
  • 15 Printing the First Lines
  • 16 Printing the Last Lines
  • 17 Make Duplicate Lines Unique
  • 18 Print Duplicated Lines of Input
  • 19 Remove All Duplicated Lines
  • 20 Squeezing Blank Lines
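A few of these exercises reduce to one-liners built from commands already covered. The sketch below is my own simplified take on counting lines and printing the first and last lines, not a copy of the manual's scripts:

```shell
# Counting lines: = prints the current line number, $ addresses the last line.
line_count=$(seq 7 | sed -n '$=')

# Printing the first lines: quit after line 3.
first_three=$(seq 7 | sed 3q)

# Printing the last line only.
last_line=$(seq 7 | sed -n '$p')

printf '%s\n' "$line_count" "$first_three" "$last_line"
```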

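sed in depth

I think the hardest questions you eventually run into with sed are these:

  1. How fast is sed when substituting across a million lines of text?
  2. Using sed as an ETL tool, connected to MySQL, Oracle and the like, for interactive operations
  3. Can sed fail, and how do you handle it, for example when processing a million rows breaks partway?

And all of this is only the beginning!

Substitute - the s command in detail

sed 's/pattern/replacement/' inputfile

This is the classic usage.

In practice, though, it does not behave quite the way we might expect: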

[root@centos00 _data]# cat hw.txt
this is the profession tool on the professional platform
this is the man on the earth

[root@centos00 _data]# sed 's/the/a/' hw.txt
this is a profession tool on the professional platform
this is a man on the earth

[root@centos00 _data]#

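We specified a pattern, but only the first occurrence on each line was replaced.

That is why the s command takes flags:

s/pattern/replacement/flag

number: replace only the Nth match of the pattern on each line;
g: replace every match of the pattern;
p: print the line if a substitution was made (normally used together with -n);
w filename: write the lines on which a substitution was made to the given file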
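The numeric, g and p flags can be compared on a single line of input. A minimal sketch reusing the second line of hw.txt as sample text (the extra "plain line" input is invented to show that p skips lines without a match):

```shell
line='this is the man on the earth'

# Numeric flag: replace only the 2nd match on the line.
second=$(echo "$line" | sed 's/the/a/2')

# g flag: replace every match.
all=$(echo "$line" | sed 's/the/a/g')

# p flag with -n: print only the lines on which a substitution happened.
printed=$(printf 'plain line\n%s\n' "$line" | sed -n 's/the/a/p')

printf '%s\n' "$second" "$all" "$printed"
```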

Replace every match of the pattern:

[root@centos00 _data]# sed 's/the/a/g' hw.txt
this is a profession tool on a professional platform
this is a man on a earth

Write the result to another text file:

[root@centos00 _data]# sed 's/the/a/w dts.txt' hw.txt
this is a profession tool on the professional platform
this is a man on the earth

[root@centos00 _data]# cat dts.txt
this is a profession tool on the professional platform
this is a man on the earth
[root@centos00 _data]#

Changing the delimiter:

[root@centos00 _data]# sed 's!/bin/bash!/bin/csh!' /etc/passwd
root:x:0:0:root:/root:/bin/csh
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync

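! can also serve as the delimiter. Since / collides with the path separator, escaping it would require a lot of \ characters, which is hard to read.

@ works as a delimiter too: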

[root@centos00 _data]# sed 's@/bin/bash@/bin/csh@' /etc/passwd
root:x:0:0:root:/root:/bin/csh
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync

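Which raises the question: how many other characters can be used as a delimiter?

According to the official documentation, almost any single character can be the delimiter (POSIX excludes backslash and newline): whatever character follows s is used as the delimiter: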

https://www.gnu.org/software/sed/manual/html_node/The-_0022s_0022-Command.html

[root@centos00 _data]# sed 's6a6the6g' dts.txt
this is the profession tool on the professionthel plthetform
this is the mthen on the etherth
[root@centos00 _data]#

See, it's true. The first character after the s command is taken as the delimiter.
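To make the readability argument concrete, here is the same path substitution written twice, once with | as the delimiter and once with / plus escaping. It is just a sketch on a one-line input:

```shell
# With | as the delimiter, the slashes in the paths need no escaping.
with_pipe=$(echo '/bin/bash' | sed 's|/bin/bash|/bin/csh|')

# With / as the delimiter, every slash in the pattern and replacement
# has to be escaped with a backslash.
with_slash=$(echo '/bin/bash' | sed 's/\/bin\/bash/\/bin\/csh/')

printf '%s\n' "$with_pipe" "$with_slash"   # both print /bin/csh
```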

This answer goes a bit deeper:

There are two levels of interpretation here: the shell, and sed.

In the shell, everything between single quotes is interpreted literally, except for single quotes themselves. You can effectively have a single quote between single quotes by writing '\'' (close single quote, one literal single quote, open single quote).

Sed uses basic regular expressions. In a BRE, in order to have them treated literally, the characters $.*[\]^ need to be quoted by preceding them by a backslash, except inside character sets ([…]). Letters, digits and (){}+?| must not be quoted (you can get away with quoting some of these in some implementations). The sequences \(, \), \n, and in some implementations \{, \}, \+, \?, \| and other backslash+alphanumerics have special meanings. You can get away with not quoting $^] in some positions in some implementations.

Furthermore, you need a backslash before / if it is to appear in the regex outside of bracket expressions. You can choose an alternative character as the delimiter by writing, e.g., s~/dir~/replacement~ or \~/dir~p; you'll need a backslash before the delimiter if you want to include it in the BRE. If you choose a character that has a special meaning in a BRE and you want to include it literally, you'll need three backslashes; I do not recommend this, as it may behave differently in some implementations.

In a nutshell, for sed 's/…/…/':

Write the regex between single quotes.
Use '\'' to end up with a single quote in the regex.
Put a backslash before $.*/[\]^ and only those characters (but not inside bracket expressions).
Inside a bracket expression, for - to be treated literally, make sure it is first or last ([abc-] or [-abc], not [a-bc]).
Inside a bracket expression, for ^ to be treated literally, make sure it is not first (use [abc^], not [^abc]).
To include ] in the list of characters matched by a bracket expression, make it the first character (or first after ^ for a negated set): []abc] or [^]abc] (not [abc]] nor [abc\]]).

In the replacement text:

& and \ need to be quoted by preceding them by a backslash, as do the delimiter (usually /) and newlines.
\ followed by a digit has a special meaning. \ followed by a letter has a special meaning (special characters) in some implementations, and \ followed by some other character means \c or c depending on the implementation.
With single quotes around the argument (sed 's/…/…/'), use '\'' to put a single quote in the replacement text.

If the regex or replacement text comes from a shell variable, remember that

The regex is a BRE, not a literal string.
In the regex, a newline needs to be expressed as \n (which will never match unless you have other sed code adding newline characters to the pattern space). But note that it won't work inside bracket expressions with some sed implementations.
In the replacement text, &, \ and newlines need to be quoted.
The delimiter needs to be quoted (but not inside bracket expressions).
Use double quotes for interpolation: sed -e "s/$BRE/$REPL/".
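The advice about shell variables can be turned into two small helpers. This is a sketch under assumptions: escape_bre and escape_repl are hypothetical names of my own, they assume / is the delimiter of the final s command, and they do not handle values that contain newlines.

```shell
# Hypothetical helper: escape a literal string for use as a BRE
# between / delimiters. The bracket expression lists ] \ / . * ^ $ [
escape_bre() {
    printf '%s\n' "$1" | sed 's,[]\\/.*^$[],\\&,g'
}

# Hypothetical helper: escape a literal string for use as replacement
# text, where \ & and the delimiter / are special.
escape_repl() {
    printf '%s\n' "$1" | sed 's,[\\&/],\\&,g'
}

needle='a.b/c*'   # should match literally, not as a regex
repl='x&y/z'      # should be inserted literally

bre=$(escape_bre "$needle")
safe_repl=$(escape_repl "$repl")

# Double quotes so the shell interpolates the escaped values.
result=$(printf '%s\n' 'see a.b/c* here, not aXb/c' | sed "s/$bre/$safe_repl/g")
printf '%s\n' "$result"   # see x&y/z here, not aXb/c
```

The helpers use , as their own delimiter so that / can sit inside the bracket expressions without any extra escaping.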

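Using addresses

Line addressing:

The first kind is numeric addressing: use explicit line numbers such as 1, 2 or 4 to identify the lines to match: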

[root@centos00 _data]# sed '1s6a6the6g' dts.txt
this is the profession tool on the professionthel plthetform
this is a man on the earth
[root@centos00 _data]# sed '2s6a6the6g' dts.txt
this is a profession tool on the professional platform
this is the mthen on the etherth
[root@centos00 _data]#

The second kind uses a regular expression, which is of course more flexible:

[root@centos00 _data]# sed '/platform/s6a6the6g' dts.txt
this is the profession tool on the professionthel plthetform
this is a man on the earth

Running several commands on the addressed line:

[root@centos00 _data]# sed '/platform/{
s6a6the6g
s6on6above6g
}' dts.txt
this is the professiabove tool above the professiabovethel plthetform
this is a man on the earth
[root@centos00 _data]# sed '
/platform/ 
{s6a6the6g
s6on6above6g
}' dts.txt
sed: -e expression #1, char 11: unknown command: `
'

[root@centos00 _data]#

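I have already covered single-line commands, but applying several commands on separate lines to the same address is slightly different. Where the braces open and close matters: in the failing example above the address /platform/ is followed by a newline, so sed finds no command on that line; the opening { has to sit on the same line as the address. As Capote put it, a single misplaced punctuation mark can change the meaning of a sentence. The same care is needed here.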
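Both accepted layouts can be checked against each other. The sketch below inlines the two lines of dts.txt as a variable and uses / as the delimiter; the one-line form with } directly after the last command is what GNU sed accepts, as in the earlier '2{s/./new line/;p}' example.

```shell
input='this is a profession tool on the professional platform
this is a man on the earth'

# Multi-line form: the opening brace stays on the same line as the address.
multi=$(printf '%s\n' "$input" | sed '/platform/{
s/a/the/g
s/on/above/g
}')

# One-line form: the same commands separated by semicolons.
one_line=$(printf '%s\n' "$input" | sed '/platform/{s/a/the/g;s/on/above/g}')

printf '%s\n' "$multi"
```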

The official documentation has a section on how sed works, which I find quite interesting:

6.1 How sed Works
sed maintains two data buffers: the active pattern space, and the auxiliary hold space. Both are initially empty.

sed operates by performing the following cycle on each line of input: first, sed reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.

When the end of the script is reached, unless the -n option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.

Unless special commands (like ‘D’) are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands ‘h’, ‘H’, ‘x’, ‘g’, ‘G’ to move data between both buffers).

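When sed processes text line by line, it maintains two buffers: the pattern space and the hold space.

The pattern space holds the current line with its trailing newline removed. Once the line has been processed, the contents of the pattern space are discarded and the next line is read in. It is temporary storage: every new cycle clears the text in the pattern space.

The hold space, by contrast, keeps its contents from one cycle to the next. It starts out empty and only changes when a command such as h, H, x, g or G copies data between the two buffers, so it can be used to carry an earlier line forward.

The more advanced sections that follow gradually bring in the pattern space and the hold space.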
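Two short scripts make the hold space tangible. They are sketches built only from the h, H, x and G commands named in the manual excerpt, run on seq output:

```shell
# Reverse the lines: on every line but the first, append the hold space
# to the pattern space (G); save the result back (h); print at the end ($p).
reversed=$(seq 3 | sed -n '1!G;h;$p')

# Join the lines: accumulate every line in the hold space (H), then on the
# last line swap it in (x), turn newlines into commas and drop the leading one.
joined=$(seq 3 | sed -n 'H;${x;s/\n/,/g;s/^,//;p}')

printf '%s\n' "$reversed" "$joined"
# prints 3, 2, 1 and then 1,2,3
```

In both cases the pattern space is refilled on every cycle, and only the hold space carries earlier lines forward.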

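Advanced sed

#### Multiline commands

Searching for a pattern across a whole text file means dealing with multiple lines. The pattern may not sit on a single line: it may be split across two adjacent lines, or the search may need to cover a wider range, up to the whole document. So multiline handling becomes a necessity.

A hard-coded multiline example, written with n;n;…: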

[root@centos00 _data]# sed  '{/professional/{n;d}}' dts.txt
this is a profession tool on the professional platform
this is a man on the earth

i like better man
[root@centos00 _data]#

This finds the line containing professional and deletes the line that follows it.
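The same idea on a small invented input, next to a plain blank-line delete for contrast. With {n;d} only the line right after the match goes away; with /^$/d every blank line does:

```shell
sample='keep professional

next

last'

# n moves on to the line after the match, d deletes that one line only.
after_match=$(printf '%s\n' "$sample" | sed '/professional/{n;d}')

# An empty-line address removes every blank line instead.
no_blanks=$(printf '%s\n' "$sample" | sed '/^$/d')

printf '%s\n' "$after_match"
```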

Here n; simply makes the targeting more flexible. Without n;, deleting the blank line with ^$ would remove every blank line:

[root@centos00 _data]# sed  '{/^$/d}' dts.txt
this is a profession tool on the professional platform
this is a man on the earth
i like better man
[root@centos00 _data]#

A regular expression is used here, so a brief explanation:

A regular expression is a tool that filters text by pattern matching.

There are two regular expression engines on Linux:
BRE - the Basic Regular Expression engine
ERE - the Extended Regular Expression engine

sed uses the BRE engine by default (GNU sed can switch to ERE with -E), and as I understand it only a subset of what BRE offers, so it is very fast but limited in features;

gawk uses the ERE engine. It is a heavily armed editing tool (in fact a programmable one), so its expressions are richer, though it may be slower.

##### Anchor characters:

Start of line: ^
End of line: $
Blank line: ^$

Multiline matching

[root@centos00 Documents]# sed '/first/{N;s/\n/ /;s/line/user/g}' MultiLine.txt
this is the header line
this is the first user  this is the second user
this is the third line
this is the end


[root@centos00 Documents]# sed '/first/{N;s/\n/ /;s/first.*second/user/g}' MultiLine.txt
this
!important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;header&nbsp;line<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;user&nbsp;line<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;third&nbsp;line&nbsp;<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;end<br><br><br>[<span class="hljs-symbol" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">root@</span>centos00&nbsp;Documents]#<br></code></pre><br><p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 1.5em 0px; word-wrap: inherit !important; word-break: inherit 
!important;">第一个例子,我们先找有&nbsp;first&nbsp;存在的那行,接着将下一行的文本也附加到找到的这行来(其实是存在于&nbsp;pattern&nbsp;space),然后对于这行中的换行符(\n)做了替换处理,要不两行还是显示两行,替换了换行符,将所有&nbsp;line&nbsp;文本替换为&nbsp;user;</p><br><p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 1.5em 0px; word-wrap: inherit !important; word-break: inherit !important;">第二个例子更有意思,除了连接符合条件行的两行之外,还用“.”通配符,替换了整个包含符合条件的文本,从而实现了两行搜索。</p><br><blockquote style="line-height: inherit; display: block; padding: 15px 15px 15px 1rem; font-size: 0.9em; margin: 1em 0px; color: rgb(129, 145, 152); border-left: 6px solid rgb(220, 230, 240); background: rgb(242, 247, 251); overflow: auto; word-wrap: inherit !important; word-break: inherit !important;"><br>&nbsp;&nbsp;<p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 0px; word-wrap: inherit !important; word-break: inherit !important;">当然还可以连着搜索三行:</p><br></blockquote><br><pre style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><code style="padding: 2px 4px; margin: 0px 2px; color: rgb(233, 105, 0); background: rgb(248, 248, 248); line-height: 18px; font-size: 14px; font-weight: normal; word-spacing: 0px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; display: block !important; white-space: pre !important; overflow: auto !important; word-wrap: inherit !important; word-break: inherit !important;">[<span class="hljs-symbol" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">root@</span>centos00&nbsp;Documents]#&nbsp;sed&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit 
!important;">'/first/{N;N;s/\n/&nbsp;/g;s/first.*third/user/g}'</span>&nbsp;MultiLine.txt<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;header&nbsp;line<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;user&nbsp;line&nbsp;<br><span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">this</span>&nbsp;<span class="hljs-keyword" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(248, 35, 117); word-wrap: inherit !important; word-break: inherit !important;">is</span>&nbsp;the&nbsp;end<br><br><br>[<span class="hljs-symbol" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">root@</span>centos00&nbsp;Documents]#<br></code></pre><br><blockquote style="line-height: inherit; display: block; padding: 15px 15px 15px 1rem; font-size: 0.9em; margin: 1em 0px; color: rgb(129, 145, 152); border-left: 6px solid rgb(220, 230, 240); background: rgb(242, 247, 251); overflow: auto; word-wrap: inherit !important; word-break: inherit 
!important;"><br>&nbsp;&nbsp;<p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 0px; word-wrap: inherit !important; word-break: inherit !important;"><strong style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; font-weight: bold; word-wrap: inherit !important; word-break: inherit !important;"><em style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; font-style: italic; font-weight: bold; word-wrap: inherit !important; word-break: inherit !important;">这里可以想象如果是整个文本文件呢?</em></strong></p><br></blockquote><br><h4&nbsp;id="h-4" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">反转文本顺序</span><br><p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 1.5em 0px; word-wrap: inherit !important; word-break: inherit !important;">要实现文本文件的行顺序反转,需要用到两个概念:</p><br><blockquote style="line-height: inherit; display: block; padding: 15px 15px 15px 1rem; font-size: 0.9em; margin: 1em 0px; color: rgb(129, 145, 152); border-left: 6px solid rgb(220, 230, 240); background: rgb(242, 247, 251); overflow: auto; word-wrap: inherit !important; word-break: inherit !important;"><br>&nbsp;&nbsp;<ol style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; padding-left: 32px; list-style-type: decimal; word-wrap: inherit !important; word-break: inherit !important;"><br>&nbsp;&nbsp;<li style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; margin-bottom: 0.5em; word-wrap: inherit !important; word-break: inherit !important;"><span style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; 
word-wrap: inherit !important; word-break: inherit !important;">Hold&nbsp;space&nbsp;保持空间</span></li><br>&nbsp;&nbsp;<li style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; margin-bottom: 0.5em; word-wrap: inherit !important; word-break: inherit !important;"><span style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">排除命令!</span></li><br>&nbsp;&nbsp;</ol><br></blockquote><br><p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 1.5em 0px; word-wrap: inherit !important; word-break: inherit !important;">Hold&nbsp;space&nbsp;的概念很有意思,和&nbsp;pattern&nbsp;space&nbsp;一样的是他们都被&nbsp;sed&nbsp;用来存储临时数据,不一样的是&nbsp;hold&nbsp;space&nbsp;保留的数据,时效性更长一些,而&nbsp;pattern&nbsp;space&nbsp;的数据在存储下一行数据之前,会被清空。且两种空间之间的数据可以互相交换。</p><br><p style="font-size: inherit; color: inherit; line-height: inherit; padding: 0px; margin: 1.5em 0px; word-wrap: inherit !important; word-break: inherit !important;">sed&nbsp;编辑器的&nbsp;hold&nbsp;space&nbsp;命令:</p><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><table style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; display: table; width: 100%; text-align: left; word-wrap: inherit !important; word-break: inherit !important;"><thead style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: solid; border-top-color: rgb(204, 204, 204); 
background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><th style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; font-weight: bold; background-color: rgb(240, 240, 240); word-wrap: inherit !important; word-break: inherit !important;">命令</th><th style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; font-weight: bold; background-color: rgb(240, 240, 240); word-wrap: inherit !important; word-break: inherit !important;">解释</th></tr></thead><tbody style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border: 0px; word-wrap: inherit !important; word-break: inherit !important;"><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: solid; border-top-color: rgb(204, 204, 204); background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">h</td><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">将模式空间复制到保持空间</td></tr><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; 
border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: solid; border-top-color: rgb(204, 204, 204); background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">H</td><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">将模式空间附加到保持空间</td></tr><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: solid; border-top-color: rgb(204, 204, 204); background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">g</td><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">将保持空间复制到模式空间</td></tr><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: 
solid; border-top-color: rgb(204, 204, 204); background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">G</td><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">将保持空间附加到模式空间</td></tr><tr style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; border-width: 1px 0px 0px; border-right-style: initial; border-bottom-style: initial; border-left-style: initial; border-right-color: initial; border-bottom-color: initial; border-left-color: initial; border-image: initial; border-top-style: solid; border-top-color: rgb(204, 204, 204); background-color: white; word-wrap: inherit !important; word-break: inherit !important;"><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">x</td><td style="color: inherit; line-height: inherit; margin: 0px; font-size: 1em; border: 1px solid rgb(204, 204, 204); padding: 0.5em 1em; text-align: left; word-wrap: inherit !important; word-break: inherit !important;">交换模式空间和保持空间的内容</td></tr></tbody></table><br><h3&nbsp;id="h-5" style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><span style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; padding: 0px; word-wrap: inherit !important; word-break: inherit !important;">将文件中内容按行倒序:</span><br><pre style="font-size: inherit; color: inherit; line-height: inherit; margin: 0px; 
padding: 0px; word-wrap: inherit !important; word-break: inherit !important;"><code style="padding: 2px 4px; margin: 0px 2px; color: rgb(233, 105, 0); background: rgb(248, 248, 248); line-height: 18px; font-size: 14px; font-weight: normal; word-spacing: 0px; letter-spacing: 0px; font-family: Consolas, Inconsolata, Courier, monospace; border-radius: 0px; display: block !important; white-space: pre !important; overflow: auto !important; word-wrap: inherit !important; word-break: inherit !important;">[<span class="hljs-symbol" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">root@</span>centos00&nbsp;Documents]#&nbsp;cat&nbsp;seqnumber.txt<br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">1</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">2</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">3</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">4</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">5</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit 
!important;">6</span><br>[<span class="hljs-symbol" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">root@</span>centos00&nbsp;Documents]#&nbsp;sed&nbsp;-n&nbsp;<span class="hljs-string" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(238, 220, 112); word-wrap: inherit !important; word-break: inherit !important;">'{G;h;s/\n//g;$p}' seqnumber.txt
654321
[root@centos00 Documents]#

在本例中,G;h;就是利用了 pattern, hold space 的命令,做出两空间中数据的移动。


这里特别要注意的是 

p 中 就是寻到最后一行数据。

排除命令:

有两个作用,一是对符合条件的行不执行命令,二是对不符合条件的那些行则坚决执行这些命令

[root@centos00 Documents]# sed -n '{G;h;$p}'&nbsp;seqnumber.txt</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">6</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">5</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">4</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">3</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">2</span><br><span class="hljs-number" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(174, 135, 250); word-wrap: inherit !important; word-break: inherit !important;">1</span><br><br>[root@centos00&nbsp;Documents]<span class="hljs-comment" style="font-size: inherit; line-height: inherit; margin: 0px; padding: 0px; color: rgb(128, 128, 128); word-wrap: inherit !important; word-break: inherit !important;">#&nbsp;sed&nbsp;-n&nbsp;'{1!G;h;$p}' seqnumber.txt
6
5
4
3
2
1
[root@centos00 Documents]#

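1!G means that G is skipped on the first line only. When the first line is read, the hold space is still empty, so G would append an empty line (look at the first result: it ends with a blank line). The first line therefore runs only h;, every other line runs G;h; in turn, and the last line also runs p.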
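
The following is a minimal sketch of the same 1!G;h;$p script, with the state of both buffers spelled out in comments. The three-line input and the variable names are made up for illustration.

```shell
# Hypothetical three-line input, made up for this illustration.
input=$(printf 'a\nb\nc')

# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the pattern space (all lines so far, newest first) to the hold space
# $p  : on the last line, print the pattern space; -n suppresses all other output
#
# line 1: pattern "a"         -> hold "a"
# line 2: pattern "b\na"      -> hold "b\na"
# line 3: pattern "c\nb\na"   -> printed by $p
reversed=$(printf '%s\n' "$input" | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"   # prints c, b, a on separate lines
```

On systems with GNU coreutils, tac produces the same result. The sed version is mainly a way to see the hold space at work.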

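Changing the flow:

Branch command:
[address]b[label]

[address] is an address expression. label marks a position in the script, that is, the group of commands that follows it.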

[root@centos00 Documents]# cat MultiLine.txt 
this is the header line
this is the first line 
this is the second line
this is the third line 
this is the end


[root@centos00 Documents]# sed '{ /second/bchg;s/[ ]is[ ]/ was /g;:chg s/line/user/ }' MultiLine.txt
this was the header user
this was the first user 
this is the second user
this was the third user 
this was the end


[root@centos00 Documents]#

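Note that the commands are run in order for every line, but a line that matches the address jumps ahead and runs only the commands after the label. In the script above, is is replaced with was only on lines that do not contain second, while line is replaced with user on every line.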
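
A small self-contained sketch of the same idea. The log lines, the label name mark and the variable names are hypothetical. The script is split across several -e options so that each command, including the label, stands on its own.

```shell
# Hypothetical log lines, made up for this illustration.
log=$(printf 'error: job failed\ninfo: job done')

# /^error/b mark  : lines starting with "error" jump to the :mark label
# s/job/task/     : reached only by lines that did not jump
# :mark           : the label; everything after it runs for every line
# s/$/ [checked]/ : append a marker to the end of each line
result=$(printf '%s\n' "$log" |
  sed -e '/^error/b mark' -e 's/job/task/' -e ':mark' -e 's/$/ [checked]/')

printf '%s\n' "$result"
# error: job failed [checked]
# info: task done [checked]
```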

For readability, a space may be placed between b and the label, as in [address]b [label]:

[root@centos00 Documents]# sed '{ /second/b chg;s/[ ]is[ ]/ was /g;:chg s/line/user/ }' MultiLine.txt
this was the header user
this was the first user 
this is the second user
this was the third user 
this was the end


[root@centos00 Documents]#

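If no label follows the branch command, a matching line skips all the remaining commands and jumps straight to the end of the script, so nothing is done to it.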

[root@centos00 Documents]# sed '{ /second/b;s/[ ]is[ ]/ was /g;:chg s/line/user/ }' MultiLine.txt
this was the header user
this was the first user 
this is the second line
this was the third user 
this was the end


[root@centos00 Documents]#
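
A label-less b is a handy way to protect some lines from the rest of a script. The sketch below assumes a made-up two-line config snippet and leaves comment lines untouched while normalizing the others.

```shell
# Hypothetical config text, made up for this illustration.
config=$(printf '# port = 1\nport = 80')

# /^#/b    : comment lines branch to the end of the script and are printed unchanged
# s/ = /=/ : all other lines get the spaces around "=" removed
normalized=$(printf '%s\n' "$config" | sed -e '/^#/b' -e 's/ = /=/')

printf '%s\n' "$normalized"
# # port = 1
# port=80
```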

A label does not have to sit near the end of the script. It can also be placed before the commands, in which case branching to it creates a loop:

[root@centos00 Documents]# echo 'this,is,a,header,line,' | sed ':rmc s/,/ / ; b rmc ;' 
^C
[root@centos00 Documents]# echo 'this,is,a,header,line,' | sed ':rmc s/,/ / ; /,/b rmc ;' 
this is a header line 
[root@centos00 Documents]#

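To avoid an infinite loop, guard the branch with a condition. Here the address /,/ lets the loop continue only while a comma remains, so it stops once all of them have been replaced.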
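
The same guarded loop answers the earlier question about matching across a whole file. The sketch below keeps appending lines with N until the last line is reached, then replaces every embedded newline. The input and the label name next are made up for illustration.

```shell
# Hypothetical three-line input, made up for this illustration.
text=$(printf 'one\ntwo\nthree')

# :next     : loop target
# N         : append the next input line to the pattern space
# $!b next  : unless this is the last line, go back and append another one
# s/\n/ /g  : once the whole input is in the pattern space, join it with spaces
joined=$(printf '%s\n' "$text" | sed -e ':next' -e 'N' -e '$!b next' -e 's/\n/ /g')

printf '%s\n' "$joined"   # prints: one two three
```

The guard is the negated address $!, which stops the loop on the last line. Because the whole input ends up in the pattern space, this approach suits small files better than very large ones.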

Test command:
[root@centos00 Documents]# cat sed_t.sed
{

        s/second/sec/
        t
        s/[ ]is[ ]/ was /
        ;

}


[root@centos00 Documents]# sed -f sed_t.sed MultiLine.txt
this was the header line
this was the first line 
this is the sec line
this was the third line 
this was the end


[root@centos00 Documents]#

The test command gives an if-then-else structure:

if 
    s/second/sec/ 

else 
    s/[ ]is[ ]/ was /

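If s/second/sec/ made no substitution, s/[ ]is[ ]/ was / is executed. If it did, t branches past it.

t is written the same way as b:

[address]t [label]

Here, however, the condition that matters is not an address but the s/// command that runs just before t: t branches only if a substitution has succeeded since the current line was read (or since the last t). Conceptually (this is notation for illustration, not sed syntax):

[s/second/sec/]t [label]

That is the full form. The earlier example omitted the label, so t jumps to the end of the script and no further command is applied to that line.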

[root@centos00 Documents]# cat sed_t_header.sed
{
    s/header/beginning/
    t chg
    s/line/user/
    :chg 
    s/beginning/beginning header/
}

[root@centos00 Documents]# sed -f sed_t_header.sed MultiLine.txt
this is the beginning header line
this is the first user 
this is the second user
this is the third user 
this is the end


[root@centos00 Documents]#

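Note that in a script that uses t, the commands are still run in order. The command after :chg is applied to every line; it simply has no effect on lines that do not contain beginning.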
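
Combined with a label placed before the substitution, t gives a "repeat until nothing changes" loop. The sketch below inserts thousands separators into a number. The input value and the label name again are made up, and \1 and \2 are the back-references described in the pattern substitution section.

```shell
# :again               : loop target
# s/...\1,\2/          : put a comma before the last group of three digits
#                        that is still preceded by another digit
# t again              : if that substitution succeeded, try again;
#                        when it fails, fall through and print the line
grouped=$(echo '1234567' |
  sed -e ':again' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 't again')

echo "$grouped"   # prints 1,234,567
```

No explicit guard address is needed here: the loop ends on its own as soon as the substitution no longer matches.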

Pattern substitution
The ampersand (&) operator
[root@centos00 Documents]# echo 'the cat is sleeping in his hat' | sed 's/.at/"&"/g'
the "cat" is sleeping in his "hat"
[root@centos00 Documents]#

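The "." matches any single character, so both cat and hat match. & stands for the entire string matched by the pattern, which is wrapped in double quotes here.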
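
Two more short illustrations of &, using made-up input strings. The first wraps every number in brackets. The second shows that a literal ampersand in the replacement has to be escaped as \&.

```shell
# & is replaced by whatever the pattern matched, here each run of digits
bracketed=$(echo 'order 66 ships in 3 days' | sed 's/[0-9][0-9]*/[&]/g')
echo "$bracketed"   # prints: order [66] ships in [3] days

# \& inserts a literal ampersand instead of the matched text
literal=$(echo 'cats and dogs' | sed 's/and/\&/')
echo "$literal"     # prints: cats & dogs
```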

Using \( \) to reuse sub-patterns in the replacement
[root@centos00 Documents]# sed 's/this\(.*line\)/that\1/;p;' -n MultiLine.txt
that is the header line
that is the first line 
that is the second line
that is the third line 
this is the end


[root@centos00 Documents]#

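The interesting part is that \1, \2, \3 and so on (up to \9) refer to the text matched by each \( \) group, counted from left to right. In the replacement, the text referenced through \1, \2, ... is carried over unchanged, while the rest of the match is replaced.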
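
With two groups, the references can also be reordered. A minimal sketch with a made-up name string:

```shell
# \1 captures the surname, \2 the given name; the replacement swaps them
swapped=$(echo 'Smith, John' | sed 's/\([A-Za-z]*\), \([A-Za-z]*\)/\2 \1/')
echo "$swapped"   # prints: John Smith
```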

Example:

Prefix each line with its line number:
[root@centos00 Documents]# cat MultiLine.txt
this is the header line
this is the first line 
this is the second line
this is the third line 
this is the end


[root@centos00 Documents]# sed ' = ' MultiLine.txt | sed 'N;s/\n//g' 
1this is the header line
2this is the first line 
3this is the second line
4this is the third line 
5this is the end
6
7
[root@centos00 Documents]#
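
In the output above the number is glued to the text, and the two trailing blank lines are numbered as well. The sketch below shows two possible variations on made-up input: one puts a space between the number and the text, the other numbers only non-blank lines.

```shell
# Variation 1: replace the newline that "=" leaves behind with a space
sample=$(printf 'alpha\nbeta')
numbered=$(printf '%s\n' "$sample" | sed '=' | sed 'N;s/\n/ /')
printf '%s\n' "$numbered"
# 1 alpha
# 2 beta

# Variation 2: /./ matches only non-blank lines, so blank lines get no number
sparse=$(printf 'alpha\n\nbeta')
numbered_nonblank=$(printf '%s\n' "$sparse" | sed '/./=' | sed '/./N;s/\n/ /')
printf '%s\n' "$numbered_nonblank"
# 1 alpha
#
# 3 beta
```

In the second variation the blank line still counts toward the numbering but is printed without a number.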

Reposted from blog.csdn.net/wujiandao/article/details/82557947