sort: sorting file contents

Whether for everyday work or for interviews, sort is one of the basic Linux commands worth mastering. The -k option in particular is a frequent source of confusion, so this post takes a careful look at the sort command.

Syntax:

sort [-bcdfimMnr] [-o <output file>] [-t <separator>] [+<start field>-<end field>] [--help] [--version] [file]

Options:

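-b: ignore leading blanks on each line; almost mandatory when the number of blanks varies ("-n" implies this option; in testing, the other options appeared to imply it as well)
-c: check whether the file is already sorted; if not, report the first line that is out of order
-C: like "-c", but print no diagnostic; an exit status of 1 indicates that the file is not sorted
-d: consider only letters, digits and blanks, ignoring all other characters
-f: treat lowercase letters as uppercase
-h: compare human-readable numbers (e.g. 2K, 1G)
-i: consider only ASCII characters in the octal range 040 to 176 (ASCII itself spans octal 0 to 177), ignoring non-printable characters such as backspace, form feed and carriage return
-k: specify the field (key) to sort on
-m: merge several already sorted files; this only merges and does not sort
-M: compare the first three letters as month abbreviations
-n: sort by numeric value
-o <output file>: write the sorted result to the given file
-r: sort in descending order
-u: of lines that compare equal, output only one
-t <separator>: specify the field separator; by default fields are separated by the empty string between a non-blank character and a blank character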
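
Two of the options above, -M and -o, are not exercised in the examples that follow, so here is a minimal sketch of them. The file name months.txt and its contents are made up for illustration, and LC_ALL=C is set on the assumption that English month abbreviations are wanted.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)

# -M: compare the first three letters as month abbreviations.
printf 'Mar\nJan\nFeb\n' > "$workdir/months.txt"
by_month=$(LC_ALL=C sort -M "$workdir/months.txt")
printf '%s\n' "$by_month"

# -o: write the result to a file. Naming the input file itself is fine,
# because sort reads all of its input before opening the output file.
LC_ALL=C sort -M -o "$workdir/months.txt" "$workdir/months.txt"
in_place=$(cat "$workdir/months.txt")

rm -rf "$workdir"
```

Both variables end up holding Jan, Feb, Mar, one per line. A plain redirect such as sort file > file would instead truncate the file before sort reads it, which is why -o is the safer way to sort in place.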

Rather than explain each option in the abstract, go straight to examples. First, the raw data to be sorted:

cat sort.log

a   mac     2000    500 2K
d   winxp   4000    300 3G
e   bsd     1000    600 4M
b   linux   1000    200 5K
f   SUSE    4000    300 6M
g   winxp   500     300 3G
c   win7    2000    100 7G
c   Debian  600     200 8K

1. Report the line at which the file first goes out of order

sort -c sort.log; echo $?

sort: sort.log:4: disorder: b linux 1000 200 5K
1

sort -C sort.log; echo $?

1

An exit status of 1 indicates that the file is not sorted.

2. Default sort (whole lines compared in ascending ASCII order)

sort sort.log

a   mac     2000    500 2K
b   linux   1000    200 5K
c   Debian  600     200 8K
c   win7    2000    100 7G
d   winxp   4000    300 3G
e   bsd     1000    600 4M
f   SUSE    4000    300 6M
g   winxp   500     300 3G
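
The exit status of sort -C from step 1 is mainly useful inside scripts. A minimal sketch, assuming a throwaway file data.txt (a made-up name) and the C locale so that the comparison is plain byte order:

```shell
workdir=$(mktemp -d)
printf 'b\na\nc\n' > "$workdir/data.txt"

# sort -C prints nothing; its exit status says whether the file is sorted.
if LC_ALL=C sort -C "$workdir/data.txt"; then
    status_before="sorted"
else
    status_before="unsorted"
    # Sort in place only when it is actually needed.
    LC_ALL=C sort -o "$workdir/data.txt" "$workdir/data.txt"
fi

# Check again after the conditional sort.
if LC_ALL=C sort -C "$workdir/data.txt"; then
    status_after="sorted"
else
    status_after="unsorted"
fi

echo "before: $status_before, after: $status_after"
rm -rf "$workdir"
```

The check and the sort must use the same locale and the same key options, otherwise a file can be reported as unsorted even though it was sorted under different rules.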

3. Now for the hard part: the -k syntax that confuses so many people. First, the syntax of a key definition:

[ FStart [ .CStart ] ] [ Modifier ] [ , [ FEnd [ .CEnd ] ][ Modifier ] ]

The comma (",") splits this syntax into two parts, a Start part and an End part, each made of three pieces. The Modifier piece holds options such as n and r and may be omitted. FStart and FEnd name the fields to use. CStart is the character position within the FStart field at which the key begins, and CEnd is the character position within the FEnd field at which the key ends. .CStart and .CEnd may be omitted, meaning respectively "from the start of the field" and "to the end of the field"; setting CEnd to 0 also means the end of the field. If the End part is left out altogether, the key runs to the end of the line. Enough theory; on to some examples.

3.1 Sort on the third column; without n, the comparison is by ASCII characters

sort -t $'\t' -k 3 sort.log

b   linux   1000    200 5K
e   bsd     1000    600 4M
c   win7    2000    100 7G
a   mac     2000    500 2K
d   winxp   4000    300 3G
f   SUSE    4000    300 6M
g   winxp   500     300 3G
c   Debian  600     200 8K

3.2 With n added, the sort is by numeric value

sort -t $'\t' -k 3n sort.log

g   winxp   500     300 3G
c   Debian  600     200 8K
b   linux   1000    200 5K
e   bsd     1000    600 4M
a   mac     2000    500 2K
c   win7    2000    100 7G
d   winxp   4000    300 3G
f   SUSE    4000    300 6M
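
The difference between 3.1 and 3.2 is easy to reproduce on a smaller input. The sketch below uses three made-up two-column rows, limits the key to the second field with an explicit FEnd, and sets LC_ALL=C (an assumption, to get plain byte-order comparison).

```shell
tab=$(printf '\t')
data=$(printf 'a\t2000\ng\t500\nb\t1000\n')

# Without n: character comparison, so "1000" < "2000" < "500".
as_text=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2 \
    | cut -f 1 | paste -s -d ' ' -)

# With n: numeric comparison, so 500 < 1000 < 2000.
as_number=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2n \
    | cut -f 1 | paste -s -d ' ' -)

echo "text order: $as_text"
echo "numeric order: $as_number"
```

Only the first column is printed, so the two orders can be compared at a glance: b a g for the text comparison and g b a for the numeric one.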

3.3 Without FEnd, multiple -k options work from a later field to an earlier one, but not from an earlier field to a later one

Back to front, with multiple -k; the output matches expectations:

sort -t $'\t' -k 3n -k 1 sort.log

g   winxp   500     300 3G
c   Debian  600     200 8K
b   linux   1000    200 5K
e   bsd     1000    600 4M
a   mac     2000    500 2K
c   win7    2000    100 7G
d   winxp   4000    300 3G
f   SUSE    4000    300 6M

Back to front, with multiple -k; where the third column is equal, rows are ordered by the first column descending, and the output again matches expectations:

sort -t $'\t' -k 3n -k 1r sort.log

g   winxp   500     300 3G
c   Debian  600     200 8K
e   bsd     1000    600 4M
b   linux   1000    200 5K
c   win7    2000    100 7G
a   mac     2000    500 2K
f   SUSE    4000    300 6M
d   winxp   4000    300 3G
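
The tie-breaking role of the second -k can be checked on a reduced input. This is a sketch with three made-up rows, using the same FEnd-less key forms as the commands above and assuming the C locale.

```shell
tab=$(printf '\t')
data=$(printf 'b\t1000\ne\t1000\ng\t500\n')

# Primary key: column 2, numeric. Ties broken by column 1 ascending.
ascending=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2n -k 1 \
    | cut -f 1 | paste -s -d ' ' -)

# Same primary key, but ties broken by column 1 descending.
descending=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2n -k 1r \
    | cut -f 1 | paste -s -d ' ' -)

echo "ties ascending:  $ascending"
echo "ties descending: $descending"
```

The row with 500 comes first in both cases; only the two rows that share 1000 swap places.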

Now from front to back:

sort -t $'\t' -k 1 -k 3n sort.log

a   mac     2000    500 2K
b   linux   1000    200 5K
c   Debian  600     200 8K
c   win7    2000    100 7G
d   winxp   4000    300 3G
e   bsd     1000    600 4M
f   SUSE    4000    300 6M
g   winxp   500     300 3G

sort -t $'\t' -k 1 -k 3nr sort.log

a   mac     2000    500 2K
b   linux   1000    200 5K
c   Debian  600     200 8K
c   win7    2000    100 7G
d   winxp   4000    300 3G
e   bsd     1000    600 4M
f   SUSE    4000    300 6M
g   winxp   500     300 3G

Comparing the output of sort -t $'\t' -k 1 -k 3n sort.log with that of sort -t $'\t' -k 1 -k 3nr sort.log shows that, where the first column is equal, the result is the same whether the third column is sorted ascending or descending: the later -k has no effect. The reason is that -k 1 without FEnd makes the first key run from field 1 to the end of the line, so it already compares the whole line. With FEnd specified:

sort -t $'\t' -k 1,1 -k 3nr sort.log

a   mac     2000    500 2K
b   linux   1000    200 5K
c   win7    2000    100 7G
c   Debian  600     200 8K
d   winxp   4000    300 3G
e   bsd     1000    600 4M
f   SUSE    4000    300 6M
g   winxp   500     300 3G
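
The effect of FEnd on the first key can be isolated with three rows taken from the data above. This is a sketch under the assumption of the C locale; only the second column of the result is printed.

```shell
tab=$(printf '\t')
data=$(printf 'c\twin7\t2000\nc\tDebian\t600\na\tmac\t2000\n')

# -k 1 spans field 1 to end of line, so the second key is never reached.
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 1 -k 3nr \
    | cut -f 2 | paste -s -d ' ' -)

# -k 1,1 stops at field 1, so equal first fields fall through to -k 3nr.
closed_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 1,1 -k 3nr \
    | cut -f 2 | paste -s -d ' ' -)

echo "-k 1   -k 3nr: $open_key"
echo "-k 1,1 -k 3nr: $closed_key"
```

With the open-ended key the two c rows are ordered by the rest of the line (Debian before win7); with -k 1,1 they are ordered by the third column descending (2000 before 600).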

3.4 Option scope

Options written immediately after a field (such as the "n" in "-k3n", or the "n" and "r" in "-k2nr") are called private options of that field. Options written outside a field with a dash (such as "-n" and "-r") are global options. A field with no private options inherits the global ordering options, which include but are not limited to "bfnrhM". Except for "b", an option has the same effect whether it is attached to FStart or to FEnd; "b" applies to FStart when attached to FStart, and to FEnd when attached to FEnd.

sort -t $'\t' -k1r,2 sort.log

As the output shows, both columns are sorted in reverse order:

g   winxp   500     300 3G
f   SUSE    4000    300 6M
e   bsd     1000    600 4M
d   winxp   4000    300 3G
c   win7    2000    100 7G
c   Debian  600     200 8K
b   linux   1000    200 5K
a   mac     2000    500 2K
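
The inheritance rule in 3.4 can be seen by moving options between the global position and the key. A minimal sketch on three made-up numbers with no ties, assuming GNU sort and the C locale:

```shell
data=$(printf '1000\n500\n600\n')

# The key has no private options, so it inherits the global -n and -r.
inherited=$(printf '%s\n' "$data" | LC_ALL=C sort -n -r -k 1,1 \
    | paste -s -d ' ' -)

# The key carries its own "n", so it does not inherit the global -r.
private=$(printf '%s\n' "$data" | LC_ALL=C sort -r -k 1,1n \
    | paste -s -d ' ' -)

echo "global -n -r, plain key: $inherited"
echo "global -r, key with n:   $private"
```

In the second command the global -r still applies to the final whole-line comparison, but with no equal keys in this input that comparison never decides anything.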

3.5 Notes on numeric sorting with n

The "n" option recognizes only a number: an optional leading minus sign "-", digits, and an optional decimal point. When it meets any other character, the key ends right there, so by default a numeric key does not compare across fields. sort then performs one last step: lines whose keys compare equal are ordered by the default rules applied to the entire line. This is known as the last-resort comparison.

sort -t $'\t' -k3n sort.log

Where the third column is equal, the whole lines are finally arranged in ascending ASCII order:

g   winxp   500     300 3G
c   Debian  600     200 8K
b   linux   1000    200 5K
e   bsd     1000    600 4M
a   mac     2000    500 2K
c   win7    2000    100 7G
d   winxp   4000    300 3G
f   SUSE    4000    300 6M

sort -t $'\t' -k3,4n -s sort.log

With -s added, the final whole-line comparison is skipped and lines with equal keys keep their original order (the two 1000 rows compare equal, so e stays ahead of b):

g   winxp   500     300 3G
c   Debian  600     200 8K
e   bsd     1000    600 4M
b   linux   1000    200 5K
a   mac     2000    500 2K
c   win7    2000    100 7G
d   winxp   4000    300 3G
f   SUSE    4000    300 6M
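
The effect of -s is easiest to see with exactly one pair of equal keys. A sketch on three made-up rows, assuming the C locale:

```shell
tab=$(printf '\t')
data=$(printf 'e\t1000\nb\t1000\ng\t500\n')

# Default: equal keys are broken by comparing the whole lines, so b precedes e.
default_order=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2n \
    | cut -f 1 | paste -s -d ' ' -)

# -s: equal keys keep their input order, so e stays ahead of b.
stable_order=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2n -s \
    | cut -f 1 | paste -s -d ' ' -)

echo "default: $default_order"
echo "stable:  $stable_order"
```

A stable sort is what makes it possible to sort in several passes, least significant key first, without later passes disturbing earlier ones.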

3.6 Sort by the Nth character of a field

sort -t $'\t' -k2.3,2.3 sort.log

This sorts on the third character of the second column:

c   Debian  600     200 8K
a   mac     2000    500 2K
e   bsd     1000    600 4M
b   linux   1000    200 5K
c   win7    2000    100 7G
d   winxp   4000    300 3G
g   winxp   500     300 3G
f   SUSE    4000    300 6M
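
Character-position keys are sensitive to the collation rules in effect. The output above places SUSE last, which is what a locale-aware collation would typically give; in the C locale, uppercase letters sort before lowercase ones. A small sketch with four of the names from the second column, where forcing LC_ALL=C is an assumption made to get a reproducible result:

```shell
names=$(printf 'Debian\nmac\nbsd\nSUSE\n')

# Key = third character of field 1: "b", "c", "d", "S".
# In the C locale "S" (uppercase) sorts before every lowercase letter.
c_order=$(printf '%s\n' "$names" | LC_ALL=C sort -k 1.3,1.3 \
    | paste -s -d ' ' -)

echo "C locale: $c_order"
```

Running the same command without LC_ALL=C may move SUSE elsewhere, depending on the system locale.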

4. -h: compare human-readable numbers (e.g. 2K, 1G)

sort -t $'\t' -k5h sort.log

a   mac     2000    500 2K
b   linux   1000    200 5K
c   Debian  600     200 8K
e   bsd     1000    600 4M
f   SUSE    4000    300 6M
d   winxp   4000    300 3G
g   winxp   500     300 3G
c   win7    2000    100 7G
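
To see what -h adds over -n, the two can be run on the same suffixed values. A sketch with three made-up sizes, assuming GNU sort (where -h is an extension) and the C locale:

```shell
sizes=$(printf '2K\n1G\n3M\n')

# -h understands the unit suffixes: K < M < G.
human=$(printf '%s\n' "$sizes" | LC_ALL=C sort -h | paste -s -d ' ' -)

# -n stops at the first non-numeric character, so only 2, 1 and 3 are compared.
numeric=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n | paste -s -d ' ' -)

echo "sort -h: $human"
echo "sort -n: $numeric"
```

Note that -h compares the suffix first and the number second, so it assumes each value is written with the largest unit that fits (as du -h and ls -lh do).
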
5. Difference between sort -u and sort | uniq

When -k is used to specify the sort key, the two are not equivalent: sort -u deduplicates on the key, while uniq by default deduplicates on the entire line.

sort -t $'\t' -k2,2 -u sort.log

e   bsd     1000    600 4M
c   Debian  600     200 8K
b   linux   1000    200 5K
a   mac     2000    500 2K
f   SUSE    4000    300 6M
c   win7    2000    100 7G
d   winxp   4000    300 3G

sort -t $'\t' -k2,2 sort.log|uniq

e   bsd     1000    600 4M
c   Debian  600     200 8K
b   linux   1000    200 5K
a   mac     2000    500 2K
f   SUSE    4000    300 6M
c   win7    2000    100 7G
d   winxp   4000    300 3G
g   winxp   500     300 3G

sort -t $'\t' -k2,2 -u sort.log deduplicates on the second column, whereas sort -t $'\t' -k2,2 sort.log | uniq deduplicates on whole lines (uniq can, of course, also be made to deduplicate on the second column).
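
The difference shows up directly in the number of lines each pipeline emits. A sketch on three made-up rows, two of which share the same second column, assuming the C locale:

```shell
tab=$(printf '\t')
data=$(printf 'd\twinxp\ng\twinxp\na\tmac\n')

# sort -u with a key: rows with the same second column collapse into one.
by_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2 -u \
    | wc -l | tr -d ' ')

# sort | uniq: only rows that are identical as whole lines collapse.
by_line=$(printf '%s\n' "$data" | LC_ALL=C sort -t "$tab" -k 2,2 | uniq \
    | wc -l | tr -d ' ')

echo "sort -u keeps $by_key lines, sort | uniq keeps $by_line lines"
```

Here sort -u keeps 2 lines and sort | uniq keeps all 3, because no two input lines are identical in full.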

That wraps up these notes on sort; corrections from more experienced readers are welcome.

Author: smoke_zl. Link: https://www.jianshu.com/p/c4d159a98dd8. Source: Jianshu. Copyright belongs to the author; for reproduction in any form, please contact the author for authorization and credit the source.

Origin www.cnblogs.com/wilson403/p/11162809.html