Three Musketeers of Text Processing for Linux
> grep: text filtering tool (filters text by pattern)
> grep variants: grep, egrep, fgrep (fgrep does not support regular expression search)
> sed: stream editor, a text editing tool
> awk: text report generator; on Linux it is implemented by gawk
- ### grep
grep searches the target files for lines that match a pattern and displays the matching lines.
grep comes in three variants:
> 1: fgrep does not support regular expressions, but searches extremely fast
> 2: grep supports basic regular expressions
> 3: egrep supports extended regular expressions
Usage: grep [OPTIONS] PATTERN [FILE...]
Common grep options:
--color=auto: colorize the matched text
-v: show lines not matched by the pattern
-i: ignore character case
-n: show the line number of each matching line
-c: count the number of matching lines
-o: show only the matched strings
-q: quiet mode, do not output anything
-A #: after, also show the # lines after each match
-B #: before, also show the # lines before each match
-C #: context, also show the # lines before and after each match
-e PATTERN: specify a pattern; several -e options are combined with logical OR
-w: match whole words only
-E: use extended regular expressions (ERE), equivalent to egrep
-F: treat the pattern as a fixed string (no regular expressions), equivalent to fgrep
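The options above are easiest to compare on a small file. The following is a minimal sketch, assuming GNU grep; the file name `users.txt` and its contents are made up for the demonstration and are not part of the original notes.

```shell
#!/bin/bash
# Hypothetical sample data in a temporary directory (removed when the shell exits).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/users.txt"
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
    'Alice:x:1000:1000:Alice:/home/alice:/bin/bash' \
    'rooter:x:1001:1001::/home/rooter:/bin/sh' > "$sample"

grep -n 'bash' "$sample"                # matching lines, prefixed with their line numbers
grep -c 'bash' "$sample"                # number of matching lines
grep -v 'bash' "$sample"                # lines that do NOT match
grep -i 'alice' "$sample"               # match regardless of case
grep -o 'home/[a-z]*' "$sample"         # only the matched part of each line
grep -w 'root' "$sample"                # "root" as a whole word; "rooter" does not count
grep -A 1 '^daemon' "$sample"           # the match plus one line after it
grep -e 'daemon' -e 'Alice' "$sample"   # lines matching either pattern (logical OR)
grep -q 'nologin' "$sample" && echo 'found nologin'   # -q prints nothing, only sets the exit status
```

Because `-q` produces no output, it is mainly useful in conditions, where only the exit status of grep matters.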
Regular expression metacharacters:
```
.           matches any single character
[]          matches any single character within the specified range
[^]         matches any single character outside the specified range
[:alnum:]   letters and digits
[:alpha:]   any uppercase or lowercase English letter, i.e. A-Z, a-z
[:lower:]   lowercase letters
[:upper:]   uppercase letters
[:blank:]   whitespace characters (space and tab)
[:space:]   horizontal and vertical whitespace characters (a wider range than [:blank:])
[:cntrl:]   non-printable control characters (backspace, delete, bell...)
[:digit:]   decimal digits
[:xdigit:]  hexadecimal digits
[:graph:]   printable non-whitespace characters
[:print:]   printable characters
[:punct:]   punctuation characters
```
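Note that a character class such as `[:digit:]` is only valid inside a bracket expression, so in a pattern it is written `[[:digit:]]`. A small sketch of this, assuming GNU grep and a made-up sample file:

```shell
# Hypothetical sample data in a temporary directory (removed when the shell exits).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/classes.txt"
printf '%s\n' 'abc' 'ABC' 'abc123' 'a b' '!!!' > "$sample"

grep 'a.c' "$sample"                      # . stands for any single character
grep '[[:digit:]]' "$sample"              # lines containing a digit
grep '[[:upper:]]' "$sample"              # lines containing an uppercase letter
grep '[[:blank:]]' "$sample"              # lines containing a space or tab
grep '[^[:alnum:][:blank:]]' "$sample"    # lines with a character that is neither alphanumeric nor blank
```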
Number of matches (repetition):
```
*        matches the preceding character any number of times, including 0
         greedy mode: matches as long a string as possible
.*       any characters of any length
\?       matches the preceding character 0 or 1 times
\+       matches the preceding character at least 1 time
\{n\}    matches the preceding character exactly n times
\{m,n\}  matches the preceding character at least m times and at most n times
\{,n\}   matches the preceding character at most n times
\{n,\}   matches the preceding character at least n times
```
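The repetition operators can be compared side by side on a list of similar words. This is a sketch assuming GNU grep (where `\?` and `\+` are available in basic regular expressions); the word list is invented for the example.

```shell
# Hypothetical sample data in a temporary directory (removed when the shell exits).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample="$workdir/words.txt"
printf '%s\n' 'gd' 'god' 'good' 'goood' 'gooood' > "$sample"

grep 'go*d' "$sample"          # zero or more "o": every line matches
grep 'go\?d' "$sample"         # zero or one "o": gd, god
grep 'go\+d' "$sample"         # one or more "o": everything except gd
grep 'go\{2\}d' "$sample"      # exactly two "o": good
grep 'go\{2,3\}d' "$sample"    # two or three "o": good, goood
grep 'go\{3,\}d' "$sample"     # three or more "o": goood, gooood

# Greedy mode: .* extends the match as far as it can, up to the last "o".
echo 'good food' | grep -o 'o.*o'
```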
Position anchors:
```
Position anchors: constrain where in the line a match may occur
^               start-of-line anchor; used at the leftmost end of the pattern
$               end-of-line anchor; used at the rightmost end of the pattern
^PATTERN$       the pattern must match the entire line
^$              empty line
^[[:space:]]*$  blank line (empty or whitespace only)
\< or \b        word-start anchor; used on the left side of a word pattern
\> or \b        word-end anchor; used on the right side of a word pattern
\<PATTERN\>     matches the whole word
```
Grouping:
```
\(\)    bundles one or more characters together so they are processed as a unit, e.g. \(root\)\+
        The text matched by the pattern inside each group is recorded by the regular expression engine in internal variables named \1, \2, \3, ...
```
- #### Extended regular expressions
Extended regular expressions are basically the same as basic regular expressions but more convenient to write: the usage is the same with the backslash escapes removed, e.g. ? + {m,n} ( ) instead of \? \+ \{m,n\} \( \), while the word anchors \< \> \b keep their backslash.

Shell script basics
Programming styles fall into two categories:
> 1: Procedural style: instruction-centered, data serves the instructions
> 2: Object-oriented style: data-centered, instructions serve the data
High-level languages fall roughly into two types:
> Compiled: high-level language --> compiler --> object code
> e.g. Java, C#
> Interpreted: high-level language --> interpreter --> machine code
> e.g. shell, Perl, Python
- ### Shell script:
A shell script is a text file that contains commands or declarations and conforms to a certain format.
> Format requirements for a shell script:
> the first line must be the shebang line
> the shebang is written as #! followed by the path of an interpreter; its role is to tell the system which interpreter should run the program
> Examples:
> #!/bin/bash
> #!/usr/bin/python
> #!/usr/bin/perl
> Shell scripts are used to:
> automate common commands
> perform system administration and troubleshooting
> create simple applications
> process text or files
Simple scripts can be composed of simple commands.
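To see the shebang line at work, the sketch below writes a two-line script to a temporary file, makes it executable, and runs it directly. The file name `hello.sh` is hypothetical, and the sketch assumes that `/bin/bash` exists and that the temporary directory allows execution.

```shell
# Hypothetical script file in a temporary directory (removed when the shell exits).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
script="$workdir/hello.sh"

# Line 1 is the shebang, line 2 is the only command of the script.
printf '%s\n' '#!/bin/bash' 'echo "hello"' > "$script"

head -n 1 "$script"    # show the shebang line
chmod +x "$script"     # direct execution requires execute permission
"$script"              # the system reads the shebang and starts /bin/bash on the file
bash "$script"         # naming the interpreter explicitly; the shebang is then just a comment
```

Running the file directly relies on the shebang, while `bash "$script"` works even without it, because the interpreter is chosen on the command line.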
Basic structure of the script
> #!SHEBANG
>
> FUNCTION_DEFINITIONS
> MAIN_CODE
Example: a simple script whose purpose is to display hello
```
#!/bin/bash
#---------------------------
# my first script
# owner :zhangxiao
#---------------------------
echo "hello"
```
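As a slightly larger sketch, the script below follows the shebang / function definitions / main code layout and reuses the position anchors, word anchors and grouping from the grep notes. It assumes GNU grep; the function names and the sample data are hypothetical and only serve the illustration.

```shell
#!/bin/bash
#---------------------------
# sketch: script layout plus grep anchors and grouping
# (function names and sample data are hypothetical)
#---------------------------

# FUNCTION_DEFINITIONS

# Count lines that are empty or contain only whitespace.
count_blank_lines() {
    grep -c '^[[:space:]]*$' "$1"
}

# Print lines of file $2 that contain $1 as a whole word (GNU grep \< and \>).
lines_with_word() {
    grep "\<$1\>" "$2"
}

# Print lines whose first and last colon-separated fields are identical.
# ERE group plus backreference; in BRE this would be '^\([^:]\+\):.*:\1$'.
same_first_and_last() {
    grep -E '^([^:]+):.*:\1$' "$1"
}

# MAIN_CODE
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
data="$workdir/data.txt"
printf '%s\n' 'root:x:0:root' '' 'rooter:x:1:bash' '   ' 'sync:x:5:sync' > "$data"

echo "blank lines: $(count_blank_lines "$data")"
lines_with_word root "$data"
same_first_and_last "$data"
```

Keeping each pattern in its own small function makes it possible to test the patterns one at a time before combining them in the main code.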