Using sed on Linux to delete blank lines

One, Introduction to the sed editor

The sed editor is known as a stream editor, the opposite of an ordinary interactive text editor. In an interactive text editor (such as vim), you use keyboard commands to insert, delete or replace text interactively. A stream editor instead edits a data stream according to a set of rules supplied in advance, before it processes the data.
The sed editor processes the data stream according to commands that are either entered on the command line or stored in a command text file (i.e., a script file). The sed editor performs the following actions:

  • Reads one line of data at a time from the input
  • Matches that data against the supplied editor commands
  • Modifies the data in the stream as the commands specify
  • Outputs the new data to STDOUT

Once the stream editor has matched all of the commands against a line of data, it reads the next line and repeats the process. It stops after all of the data has been processed.
The sed command format is as follows:

sed options script file
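
The two ways of supplying commands described above can be sketched as follows. This is a minimal illustration, not part of the original article: the scratch directory, the file names demo.txt and demo.sed, and the substitution itself are all assumptions chosen for the example.

```shell
# Hypothetical scratch directory and file names, used only for this sketch.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/demo.txt"

# 1) Script supplied directly on the command line with -e.
from_cli=$(sed -e 's/two/2/' "$workdir/demo.txt")

# 2) The same script stored in a script file and loaded with -f.
printf 's/two/2/\n' > "$workdir/demo.sed"
from_file=$(sed -f "$workdir/demo.sed" "$workdir/demo.txt")

# Both invocations produce the same three lines: one, 2, three.
printf '%s\n' "$from_cli"
rm -r "$workdir"
```

In both cases sed only writes the result to STDOUT; the input file itself is left unchanged.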

Two, Deleting blank lines with sed

Note: this section assumes the reader has a basic knowledge of regular expressions; if not, please study them first.

1. Delete consecutive blank lines

If a text has a varying number of blank lines between its lines, it is tiring to read and does not look good. To get a fixed spacing, for example exactly one blank line between lines of text, we can use the following script:

/./,/^$/!d

The range /./ to /^$/ starts at any line that contains at least one character and ends at the next blank line. Lines inside this range are not deleted; lines outside it, that is, the extra blank lines, are.
Example

~$ cat example1
Such stories set us thinking, 

wondering what we should do under similar circumstances. 


What events, what experiences, 

what associations should we crowd into those last hours as mortal beings, 


what regrets?
~$ sed '/./,/^$/!d' example1
Such stories set us thinking, 

wondering what we should do under similar circumstances. 

What events, what experiences, 

what associations should we crowd into those last hours as mortal beings, 

what regrets?
~$

The irregular runs of blank lines become uniform. In this example nothing follows the last line of text; if blank lines did follow it, they would be reduced to a single blank line.
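
The behaviour for blank lines after the last line of text can be checked with a small sketch. The function name squeeze_blank and the sample input are assumptions made for this illustration.

```shell
# squeeze_blank is a hypothetical helper name wrapping the script from the text.
squeeze_blank() {
    sed '/./,/^$/!d'
}

# Three blank lines between a and b, and two blank lines after b.
# The output is a, one blank line, b, and exactly one trailing blank line.
printf 'a\n\n\n\nb\n\n\n' | squeeze_blank
```

Because a range only starts at a line that contains a character, blank lines before the first line of text are deleted entirely by the same script.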

2. Remove leading blank lines

Several blank lines at the beginning of a text are clearly a nuisance when reading. The idea for deleting leading blank lines is quite similar to the one above; we can use the following script:

/./,$!d

The range in this script starts at the first line that contains a character and runs to the end of the data stream. No line inside the range is deleted, so only the blank lines before the first line of real content are removed.
Example

~$ cat example2



Such stories set us thinking, 

wondering what we should do under similar circumstances. 
What events, what experiences, 
what associations should we crowd into those last hours as mortal beings, 
what regrets?
~$ sed '/./, $!d' example2
Such stories set us thinking, 

wondering what we should do under similar circumstances. 
What events, what experiences, 
what associations should we crowd into those last hours as mortal beings, 
what regrets?
~$
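
Two edge cases of this script can be sketched as follows: blank lines inside the text are kept, and an input made up only of blank lines produces no output at all. The function name strip_leading_blank and the sample inputs are assumptions for this illustration.

```shell
# strip_leading_blank is a hypothetical helper name wrapping the script from the text.
strip_leading_blank() {
    sed '/./,$!d'
}

# Leading blank lines are removed; the blank line between a and b is kept.
printf '\n\n\na\n\nb\n' | strip_leading_blank

# Input with no text at all: the range never starts, so every line is deleted.
printf '\n\n\n' | strip_leading_blank
```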

3. Remove trailing blank lines

In the section on deleting consecutive blank lines we mentioned that blank lines after the last line of text are reduced to a single blank line. That may not be the result we want. If the output should not end with a blank line, we can use the following script:

sed '{
:start
/^\n*$/{$d; N; b start}
}'

This script may look a little complicated. Note the braces nested inside the script's outer braces: they group commands, and the group is applied to the specified address. The script also uses the branch command b with a label to jump, much like a loop in C. It means: if the pattern space contains nothing but newlines and the last line has been reached, delete it; otherwise append the next line and jump back to the label to test again.
Example

~$ cat example3
Such stories set us thinking, 

wondering what we should do under similar circumstances. 

What events, what experiences, 
what associations should we crowd into those last hours as mortal beings, 
what regrets?

~$ sed '{
:start
/^\n*$/{$d; N; b start}
}' example3
Such stories set us thinking, 

wondering what we should do under similar circumstances. 

What events, what experiences, 
what associations should we crowd into those last hours as mortal beings, 
what regrets?
~$
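
The loop can be traced on an input that ends with several blank lines. In this sketch the script is split into separate -e pieces so that the label and the closing brace each stand on their own; that split, the function name strip_trailing_blank and the sample input are assumptions made for this illustration, not the form used in the article.

```shell
# strip_trailing_blank is a hypothetical helper name.
strip_trailing_blank() {
    sed -e ':start' \
        -e '/^\n*$/{' \
        -e '$d' \
        -e 'N' \
        -e 'b start' \
        -e '}'
}

# The blank line between a and b is kept (N joins it to "b" and the pair is printed);
# the three blank lines at the end accumulate in the pattern space and are deleted
# once the last line is reached.
printf 'a\n\nb\n\n\n\n' | strip_trailing_blank
```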

4. Remove HTML tags

We often fetch web pages from the command line to get information, but they contain a lot of HTML tags that get in the way of the main content. To format a page as plain text, we can use the following script:

s/<[^>]*>//g ; /^$/d

Example
~$ cat example4
<html>
<head>
<title>This is the page</title>
</head>
<body>
<p>
This is the first line in the web page
This should provide some useful
information to use in our sed script
</p>
</body>
</html>
~$ sed -e 's/<[^>]*>//g ; /^$/d' example4
This is the page
This is the first line in the web page
This should provide some useful
information to use in our sed script
~$

Note: in example4 the line break comes immediately after the content of each line, with no trailing spaces or tabs. Readers can try adding spaces or tabs after some of the tags; the result will be different.
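
The experiment suggested in the note can be sketched as follows. The function name strip_tags and the two tiny inputs are assumptions for this illustration.

```shell
# strip_tags is a hypothetical helper name wrapping the script from the text.
strip_tags() {
    sed -e 's/<[^>]*>//g ; /^$/d'
}

# No whitespace after the tags: the emptied tag lines are deleted, only "text" remains.
printf '<p>\ntext\n</p>\n' | strip_tags

# One space after each tag: the emptied lines still hold that space,
# so /^$/ does not match them and all three lines are printed.
printf '<p> \ntext\n</p> \n' | strip_tags | wc -l
```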

Looking closely, none of the content here is indented, whereas ordinary HTML is usually indented to show its structure, as in the following example:

~$ cat example5
<html>
	<head>
		<title>This is the page</title>
	</head>
		<body>
		<p>
		This is the first line in the web page
		This should provide some useful
		information to use in our sed script
		</p>
	</body>
</html>
~$ sed -e 's/<[^>]*>//g ; /^$/d' example5
	
		This is the page
	
		
		
		This is the first line in the web page
		This should provide some useful
		information to use in our sed script
		
	
~$

The tags are deleted, but the blank lines are still there. Why? Save the output to a file and inspect its contents with cat -t:

~$ sed -e 's/<[^>]*>//g ; /^$/d' example5 > test
~$ cat -t test 
^I
^I^IThis is the page
^I
^I^I
^I^I
^I^IThis is the first line in the web page
^I^IThis should provide some useful
^I^Iinformation to use in our sed script
^I^I
^I
~$

You can see that each apparently blank line actually contains the non-printing character ^I, which represents a tab. When /^$/d is applied, these lines do not match, so they are not deleted. The rule therefore has to be modified:

s/<[^>]*>//g ; /^[[:space:]]*$/d

The difference is that the blank-line match now has [[:space:]]* in the middle. [[:space:]] matches any whitespace character, including space, tab, NL, FF, VT and CR. The result is as follows:
~$ sed -e 's/<[^>]*>//g ; /^[[:space:]]*$/d' example5
		This is the page
		This is the first line in the web page
		This should provide some useful
		information to use in our sed script
~$
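
The character class can be checked on its own with a line built from several of the whitespace characters listed above. The sample input is an assumption for this illustration.

```shell
# First line: space, tab, CR, FF and VT only. Second line: ordinary text.
# The first line matches /^[[:space:]]*$/ and is deleted, so only "kept" is printed.
printf ' \t\r\f\v\nkept\n' | sed -e '/^[[:space:]]*$/d'
```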

You can see that the blank lines are gone, but each line still starts with whitespace. We can remove that leading whitespace as well by extending the rule above:

s/<[^>]*>//g ; /^[[:space:]]*$/d ; s/^[[:space:]]*//g

The added substitution matches the spaces or tabs at the start of each line and replaces them with nothing, which deletes them:
~$ sed -e 's/<[^>]*>//g ; /^[[:space:]]*$/d; s/^[[:space:]]*//g' example5
This is the page
This is the first line in the web page
This should provide some useful
information to use in our sed script
~$

The result is now much better, and you can redirect the output to a file of your choice.
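
Putting the final rule into a small function and redirecting its output might look like the sketch below. The function name html_to_text, the scratch directory and the file names page.html and page.txt are all assumptions made for this illustration.

```shell
# html_to_text is a hypothetical helper: strip tags, drop whitespace-only lines,
# then remove leading whitespace from the lines that remain.
html_to_text() {
    sed -e 's/<[^>]*>//g ; /^[[:space:]]*$/d ; s/^[[:space:]]*//' "$1"
}

# Hypothetical scratch directory and a tab-indented sample page.
workdir=$(mktemp -d)
printf '<html>\n\t<head>\n\t\t<title>Hello</title>\n\t</head>\n\t<body>\n\t\t<p>\n\t\tfirst line\n\t\t</p>\n\t</body>\n</html>\n' > "$workdir/page.html"

# Redirect the plain-text result to a file.
html_to_text "$workdir/page.html" > "$workdir/page.txt"
result=$(cat "$workdir/page.txt")
printf '%s\n' "$result"
rm -r "$workdir"
```

The substitution here omits the g flag used in the article; since the pattern is anchored at the start of the line it can match only once per line, so the result is the same.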


Origin blog.csdn.net/Secur17y/article/details/100895144