Linux shell: merge multiple files and delete duplicate lines

Table of Contents

Source File

Merge files

Delete duplicate rows and display

File Union & Intersection & Complement


cat a.txt b.txt | sort | uniq > h.txt

Source File

1. First, enter "cd tmp" to change to the directory that contains the files (tmp in this example).

LinuxCombinFile1.png

2. Enter "cat a.txt" to output the contents of the a.txt file.

LinuxCombinFile2.png

3. Enter "cat b.txt" to output the contents of the b.txt file.

LinuxCombinFile3.png

Merge files

4. Enter "cat a.txt b.txt > c.txt" to append the contents of b.txt below the contents of a.txt and write the result to c.txt. Then enter "cat c.txt" to output the contents of the c.txt file.
   Note: The content of the original a.txt file is in the blue box, and the content of the original b.txt file is in the yellow box.

LinuxCombinFile4.png

5. Enter "paste a.txt b.txt > d.txt" to place each line of b.txt to the right of the corresponding line of a.txt and write the result to d.txt. Then enter "cat d.txt" to output the contents of the d.txt file.
   Note: The content of the original a.txt file is in the blue box, and the content of the original b.txt file is in the yellow box.

LinuxCombinFile5.png
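
The difference between steps 4 and 5 can be reproduced with the sketch below. The file contents are hypothetical (the real a.txt and b.txt appear only in the screenshots) and the temporary directory is an assumption made so the example does not touch existing files.

```shell
# Hypothetical sample data, not the article's actual files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 333 777 > a.txt
printf '%s\n' 111 aaa aaa bbb bbb 777 > b.txt

# Vertical merge: the lines of b.txt follow the lines of a.txt.
cat a.txt b.txt > c.txt
wc -l < c.txt      # 10 (4 lines from a.txt plus 6 from b.txt)

# Horizontal merge: line N of a.txt and line N of b.txt are joined by a tab.
# When a.txt runs out of lines, the left column is simply empty.
paste a.txt b.txt > d.txt
cat d.txt
```

With cat the output grows downward; with paste it has as many lines as the longer of the two inputs.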

6. Enter "cat a.txt b.txt | sort | uniq > e.txt" to merge the contents of a.txt and b.txt, delete duplicate lines, and write the result to e.txt. Then enter "cat e.txt" to output the contents of the e.txt file.
   Note the difference between e.txt and the c.txt file in the figure above.

LinuxCombinFile6.png
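
As a minimal sketch of step 6, the following uses hypothetical file contents and also shows "sort -u", which sorts and removes duplicates in a single command. LC_ALL=C is an assumption added here to pin the sort order; the article does not set it.

```shell
# Hypothetical sample data, not the article's actual files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 333 777 > a.txt
printf '%s\n' 111 aaa aaa bbb bbb 777 > b.txt

# Merge, sort, then drop repeated lines.
cat a.txt b.txt | LC_ALL=C sort | uniq > e.txt

# Same result in one step: sort accepts several files and -u drops duplicates.
LC_ALL=C sort -u a.txt b.txt > e2.txt

cat e.txt                                 # lines: 111, 222, 333, 777, aaa, bbb
cmp e.txt e2.txt && echo "same result"
```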

Delete duplicate rows and display

7. Enter the "cp b.txt f.txt" command to copy b.txt to a new file named f.txt, then enter the "cat f.txt" command to display the contents of the file.

LinuxCombinFile7.png

8. Enter "sort f.txt | uniq" and press Enter to display the file with duplicate lines removed (a line that is repeated several times is shown only once).

LinuxCombinFile8.png

9. The "sort f.txt | uniq" pipeline only removes duplicate lines from the displayed result; it does not modify the file. Enter the "cat f.txt" command to view the contents of the file and confirm that they are the same as before.

LinuxCombinFile16.png
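
The reason steps 8 and 9 sort before calling uniq is that uniq only compares adjacent lines. A small sketch with a hypothetical three-line file (not the article's f.txt) makes this visible:

```shell
# Hypothetical file in which the duplicate lines are not adjacent.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' aaa bbb aaa > f.txt

uniq f.txt                    # prints aaa, bbb, aaa: no adjacent repeats, nothing removed
LC_ALL=C sort f.txt | uniq    # prints aaa, bbb: sorting brings the repeats together

# Both commands wrote to the terminal only; the file still has three lines.
wc -l < f.txt
```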

10. Enter the "cp b.txt g.txt" command to copy b.txt to a new file named g.txt, then enter the "cat g.txt" command to display the contents of the file.

LinuxCombinFile9.png

11. Enter "sort g.txt | uniq -u" and press Enter to display only the lines that occur exactly once (any line that has a duplicate is not shown at all).

LinuxCombinFile10.png

12. The "sort g.txt | uniq -u" pipeline only filters the displayed result; it does not modify the file. Enter the "cat g.txt" command to view the contents of the file and confirm that they are the same as before.

LinuxCombinFile17.png
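
The two uniq variants from this section can be compared side by side as follows. The contents of g.txt are hypothetical, and the final "sort -u -o" line is an addition not found in the article: it is one way to actually rewrite the file if that is what is wanted.

```shell
# Hypothetical sample data, not the article's actual g.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 aaa aaa bbb bbb 777 > g.txt

# uniq keeps one copy of every line.
LC_ALL=C sort g.txt | uniq > one_copy.txt       # lines: 111, 777, aaa, bbb

# uniq -u keeps only the lines that occur exactly once.
LC_ALL=C sort g.txt | uniq -u > only_once.txt   # lines: 111, 777

# Neither pipeline changed g.txt. To rewrite it, let sort write back with -o.
LC_ALL=C sort -u -o g.txt g.txt
cat g.txt                                       # lines: 111, 777, aaa, bbb
```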

 

File Union & Intersection & Complement

Note: For operations such as file complement and intersection, each input file must not contain duplicate lines of its own.

13. Enter the "cat a.txt b.txt | sort | uniq > h.txt" command to merge the a.txt and b.txt files into h.txt (if the two source files have duplicate lines, only one copy is kept). Then enter "cat h.txt" to view the contents of the h.txt file.

LinuxCombinFile11.png

14. Enter the "cat a.txt b.txt | sort | uniq -c > h1.txt" command to merge the a.txt and b.txt files into h1.txt (the -c parameter prefixes each line with its number of occurrences). Then enter "cat h1.txt" to view the contents of the h1.txt file.

LinuxCombinFile19.png
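
A short sketch of what the -c output can be used for, again with hypothetical file contents. The awk filter is an addition for illustration and assumes the lines themselves contain no whitespace.

```shell
# Hypothetical sample data, not the article's actual files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 333 777 > a.txt
printf '%s\n' 111 aaa aaa bbb bbb 777 > b.txt

# Each output line is "<count> <line>"; the padding before the count varies by platform.
cat a.txt b.txt | LC_ALL=C sort | uniq -c > h1.txt
cat h1.txt

# List the lines that were seen more than once.
awk '$1 > 1 { print $2 }' h1.txt    # lines: 111, 777, aaa, bbb
```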

15. Enter the "cat a.txt b.txt | sort | uniq -d > i.txt" command (the -d parameter indicates that only duplicate lines are displayed) to output the intersection of a.txt and b.txt to i.txt, then enter "cat i.txt" to view the contents of the file.
     Note: Because b.txt itself contains duplicate lines, lines that are repeated only within b.txt are also reported, so the content of the output file is incorrect.

LinuxCombinFile12.png

16. Enter the "sort b.txt | uniq > b1.txt" command to delete the duplicate lines of b.txt (only one copy of each line is kept) and write the result to b1.txt, then enter "cat b1.txt" to view the contents of the file.

LinuxCombinFile13.png

17. Enter the "cat a.txt b1.txt | sort | uniq -d > j.txt" command (the -d parameter indicates that only duplicate lines are displayed) to output the intersection of a.txt and b1.txt to j.txt, then enter "cat j.txt" to view the contents of the file.
     Note: Because there are no duplicate lines in b1.txt, the content of the output file is correct.

LinuxCombinFile14.png
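
Steps 15 to 17 can be replayed with the sketch below. The file contents are hypothetical but chosen to match the description (111 and 777 are shared; aaa and bbb are repeated inside b.txt). The "comm -12" line is an alternative not used in the article; comm expects both inputs to be sorted.

```shell
# Hypothetical sample data, not the article's actual files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 333 777 > a.txt
printf '%s\n' 111 aaa aaa bbb bbb 777 > b.txt

# Without cleaning b.txt first, its internal duplicates leak into the result.
cat a.txt b.txt | LC_ALL=C sort | uniq -d > i.txt     # lines: 111, 777, aaa, bbb

# De-duplicate each input, then intersect.
LC_ALL=C sort -u a.txt > a1.txt
LC_ALL=C sort -u b.txt > b1.txt
cat a1.txt b1.txt | LC_ALL=C sort | uniq -d > j.txt   # lines: 111, 777

# Alternative: comm -12 prints only the lines common to both sorted files.
LC_ALL=C comm -12 a1.txt b1.txt > j2.txt
cmp j.txt j2.txt && echo "same intersection"
```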

18. Enter the "cat a.txt b.txt | sort | uniq -u > k.txt" command (the -u parameter means that only lines occurring exactly once are displayed). This removes the intersection of a.txt and b.txt (111 and 777) and writes the remaining lines to k.txt. Enter "cat k.txt" to view the contents of the file.
    Note: Because b.txt itself contains duplicate lines (aaa and bbb are repeated), those lines are removed as well, so the content of the output file is incorrect.

LinuxCombinFile15.png

19. Enter the "cat a.txt b1.txt | sort | uniq -u > k1.txt" command (the -u parameter means that only lines occurring exactly once are displayed). This removes the intersection of a.txt and b1.txt (111 and 777) and writes the remaining lines to k1.txt. Enter "cat k1.txt" to view the contents of the file.
    Note: Because there are no duplicate lines in the b1.txt file, the content of the output file is correct.

LinuxCombinFile18.png
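
What "uniq -u" produces in steps 18 and 19 is the set of lines that belong to exactly one of the two files. The sketch below reproduces that with hypothetical file contents and adds two comm calls, which are not part of the article, to get the one-sided complements separately.

```shell
# Hypothetical sample data, not the article's actual files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 111 222 333 777 > a.txt
printf '%s\n' 111 aaa aaa bbb bbb 777 > b.txt
LC_ALL=C sort -u a.txt > a1.txt
LC_ALL=C sort -u b.txt > b1.txt

# With the raw b.txt, aaa and bbb are dropped because they repeat inside b.txt.
cat a.txt b.txt | LC_ALL=C sort | uniq -u > k.txt      # lines: 222, 333

# With de-duplicated inputs, every line outside the intersection survives.
cat a1.txt b1.txt | LC_ALL=C sort | uniq -u > k1.txt   # lines: 222, 333, aaa, bbb

# One-sided complements with comm (inputs must be sorted).
LC_ALL=C comm -23 a1.txt b1.txt > only_a.txt           # lines: 222, 333
LC_ALL=C comm -13 a1.txt b1.txt > only_b.txt           # lines: aaa, bbb
```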

 

 

Origin blog.csdn.net/whatday/article/details/113773283