[Shell command collection: file management] Linux split command tutorial: splitting files


Shell command column: Full analysis of Linux Shell commands


Description


The split command splits files on Linux systems. It breaks a large file into multiple smaller files that are easier to transmit, store, or process. The following is a detailed description of the split command:

Split command syntax

split [OPTIONS] [INPUT_FILE] [OUTPUT_PREFIX]

split command options

  • -b <size>: Specify the size of each output file. A suffix such as K, M, or G may follow the number; without a suffix, the size is in bytes.
  • -l <lines>: Specify the number of lines in each output file.
  • -a <suffix length>: Specify the suffix length of the output file name, the default is 2.
  • -d: Use numbers as suffix for output file names instead of the default letters.
  • --verbose: Print a message as each output file is created.
  • --help: Display help information.

Example of split command

  1. Split a file into chunks of a specified size:
split -b 1M largefile.txt output

This command splits the largefile.txt file into chunks of 1MB each and generates multiple output files prefixed with output.

  2. Split a file into chunks of a specified number of lines:
split -l 1000 largefile.txt output

This command splits the largefile.txt file into chunks of 1000 lines each and generates multiple output files prefixed with output.

  3. Split the file into chunks of a specified size, using a number as suffix:
split -b 10M -d largefile.txt output

This command splits the largefile.txt file into chunks of 10MB each and generates multiple output files with numbers as suffixes.

Precautions

  • The split command will generate multiple output files with file names consisting of the specified prefix and suffix.
  • If no input file is specified, data is read from standard input.
  • If no output file prefix is specified, it defaults to "x".
  • The split file blocks are named in alphabetical order by default. Use the -d option to change to numerical order.
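
The notes above can be checked with a short experiment. This is a minimal sketch assuming GNU coreutils (split and seq) and a scratch directory from mktemp; the prefix "part" is a made-up name for illustration.

```shell
# Scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)

# No input file and no prefix: split reads standard input and
# names the pieces with the default prefix "x" (xaa, xab, xac).
(cd "$workdir" && seq 1 25 | split -l 10)

# "-" also means standard input; -d switches to numeric suffixes
# (part00, part01, part02). "part" is a hypothetical prefix.
(cd "$workdir" && seq 1 25 | split -l 10 -d - part)

ls "$workdir"
```

The 25 input lines give two full 10-line pieces and a final piece holding the remaining 5 lines.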

This concludes the overview of the split command. Use it to split large files into small pieces to meet different needs.


Syntax format

split [OPTION]... [INPUT [PREFIX]]

Parameter Description

  • -b, --bytes=SIZE: Specify the size of each block after splitting. The supported units are K, M, and G. The default unit is bytes.
  • -l, --lines=NUMBER: Specify the number of lines in each block after splitting.
  • -a, --suffix-length=N: Specify the suffix length of the output file name, the default is 2.
  • -d, --numeric-suffixes: Use numbers as suffix for output file names.
  • --verbose: Print a message as each output file is created.
  • --help: Display help information and exit.
  • --version: Display version information and exit.

Error conditions

  • If no input file is specified, split does not fail: it reads from standard input.
  • If the specified input file does not exist, split displays an error message and exits.
  • If the specified split size or number of lines is invalid, split displays an error message and exits.
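
In a script, these failures can be detected through the exit status. The following is a sketch assuming GNU split, which exits with a non-zero status on error; the file names are placeholders created in a scratch directory.

```shell
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/input.txt"

# Nonexistent input file: split reports the problem on stderr
# and returns a non-zero exit status.
missing_status=0
split -l 2 "$workdir/no-such-file" "$workdir/out_" 2>/dev/null || missing_status=$?

# Invalid line count ("abc" is not a number): also a non-zero status.
invalid_status=0
split -l abc "$workdir/input.txt" "$workdir/out_" 2>/dev/null || invalid_status=$?

echo "missing input: exit status $missing_status"
echo "invalid count: exit status $invalid_status"
```

Capturing the status with `|| status=$?` keeps the check working even in scripts that run under `set -e`.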

Precautions

There are some things to note when using the Linux Shell’s split command:

  1. The input file must exist: When you name an input file, make sure that it exists and that you have permission to read it. If no file is named, or the name is -, split reads from standard input instead.

  2. Selection of split size or number of lines: The split command can split the file by size or by number of lines. Choose a size that fits the actual need, such as a transfer or storage limit. When splitting by lines, note that a file with no more lines than the limit produces only a single output file.

  3. Output file name suffix length: Use the -a option to specify the suffix length of the output file name. The default is 2. Adjust the length to the number of chunks you expect: if the length given with -a is too short, split runs out of suffixes and stops with an error before the whole input has been written.

  4. Output file name suffix format: Use the -d option to use numbers as the suffix of the output file names. Without this option, letters are used by default. Select the suffix format that suits your needs.

  5. Output file name prefix: Set the prefix of the output file names with the PREFIX argument. If no prefix is specified, it defaults to "x".

  6. The storage location of the output files: By default, split writes the pieces to the current working directory. To store them elsewhere, include the directory path in the output prefix. The directory must already exist.

  7. Split file order: split writes the chunks in the order in which they appear in the input. With the default two-letter suffix, the file names run aa, ab, ac through az, then ba, bb, bc, and so on up to zz, so sorting the names alphabetically reproduces the original order. If you need a different naming scheme, rename the files afterwards with other commands or a script.

  8. Display of the splitting process: Use the --verbose option to print a message as each output file is created. This is helpful when you want to follow the progress of a long split.

  9. Error handling: When using the split command, pay attention to handling possible error conditions, such as the input file does not exist, the split size or the number of lines is invalid, etc. Handling errors promptly can avoid unnecessary problems.

The above are some precautions when using the split command of Linux Shell. Adjust and use it according to the actual situation.
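
The effect of the suffix length (item 3) can be seen with a small input. This is a sketch assuming GNU split and seq; the prefixes short_ and long_ are made up for the illustration.

```shell
workdir=$(mktemp -d)
seq 1 30 > "$workdir/input.txt"

# 30 one-line chunks need 30 names, but -a 1 offers only a..z (26),
# so split gives up with an error and a non-zero exit status.
short_status=0
split -l 1 -a 1 "$workdir/input.txt" "$workdir/short_" 2>/dev/null || short_status=$?

# Two-letter suffixes give 26 * 26 = 676 names, which is plenty here.
split -l 1 -a 2 "$workdir/input.txt" "$workdir/long_"
long_count=$(ls "$workdir" | grep -c '^long_')

echo "-a 1 exit status: $short_status"
echo "-a 2 pieces written: $long_count"
```

A quick way to size the suffix is to estimate the number of chunks first: a suffix of length N allows 26^N alphabetic names, or 10^N numeric names with -d.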


Underlying implementation

The underlying implementation of the split command works by reading the contents of an input file, splitting it into fixed-size chunks or fixed-number-of-line chunks, and writing each chunk to a separate output file.

The specific implementation process is as follows:

  1. The split command first opens the input file and reads its contents.

  2. Determine the size or number of lines of each split block from the specified split size or line count.

  3. Create output files and assign a unique file name to each output file.

  4. Reads data from the input file and writes the data to the current output file until the limit of split size or number of lines is reached.

  5. If there is remaining data to be written, create a new output file and write the remaining data to that file.

  6. Repeat the above steps until the entire input file is split.

Under the hood, the split command uses file operation functions to read and write data. It also uses some algorithms to calculate the split chunk size or number of lines and generate a unique output file name.

It should be noted that the underlying implementation of the split command may vary depending on different operating systems. In different Linux distributions or other Unix systems, there may be some differences in details. But generally speaking, the underlying implementation of the split command realizes file splitting by reading and writing files.
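
To make the steps above concrete, here is a deliberately naive shell re-creation of line-based splitting. It is only an illustration of the read, count, and switch-file loop, not the actual coreutils implementation; the function name naive_split_lines and its two-digit numeric naming are assumptions made for this sketch.

```shell
# naive_split_lines LINES INPUT PREFIX
# Text-only illustration: it does not handle NUL bytes and always
# ends the last line with a newline.
naive_split_lines() {
  lines_per_chunk=$1
  input=$2
  prefix=$3
  written=0   # lines written to the current output file
  index=0     # number of output files created so far
  out=

  while IFS= read -r line || [ -n "$line" ]; do
    # Open a new output file when none exists yet or the last one is full.
    if [ "$written" -eq 0 ]; then
      out=$(printf '%s%02d' "$prefix" "$index")
      : > "$out"
      index=$((index + 1))
    fi
    printf '%s\n' "$line" >> "$out"
    written=$(( (written + 1) % lines_per_chunk ))
  done < "$input"
}

# Compare the sketch with the real command on a 7-line file.
workdir=$(mktemp -d)
seq 1 7 > "$workdir/input.txt"
naive_split_lines 3 "$workdir/input.txt" "$workdir/naive_"
split -l 3 -d "$workdir/input.txt" "$workdir/real_"
ls "$workdir"
```

Both runs produce three pieces (3, 3, and 1 lines). The real command works on raw bytes in large buffers, which is why it is far faster than a line-by-line shell loop.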


Example

Example 1

Split the file into chunks of the specified size and name the output file using the default alphabetical suffix.

split -b 100M largefile.txt output

This command splits the largefile.txt file into chunks of 100MB each and generates multiple output files prefixed with output.
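
Because the alphabetical suffixes sort in creation order, the pieces can be joined back together with cat. The sketch below assumes GNU split and seq and uses a small generated file with 1K chunks in place of a real 100MB split; restored.txt is a made-up name.

```shell
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/largefile.txt"

# Split into 1 KiB pieces: outputaa, outputab, ...
split -b 1K "$workdir/largefile.txt" "$workdir/output"

# The glob expands in sorted order, which matches the order of creation,
# so concatenating the pieces rebuilds the original byte for byte.
cat "$workdir"/output?? > "$workdir/restored.txt"

if cmp -s "$workdir/largefile.txt" "$workdir/restored.txt"; then
  echo "pieces reassemble to the original file"
fi
```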

Example 2

Splits the file into chunks of the specified number of lines and names the output file using the default alphabetical suffix.

split -l 5000 largefile.txt output

This command splits the largefile.txt file into chunks of 5000 lines each and generates multiple output files prefixed with output.

Example 3

Split the file into chunks of the specified size, naming the output file using a number as a suffix.

split -b 50M -d largefile.txt output

This command splits the largefile.txt file into chunks of 50MB each and generates multiple output files with numbers as suffixes.

Example 4

Split the file into chunks of the specified size, and specify a suffix length of 3 for the output file name.

split -b 1G -a 3 largefile.txt output

This command splits the largefile.txt file into chunks of 1GB each and generates multiple output files with output as the prefix, and the output file names have a suffix length of 3.

Example 5

Splits the file into chunks of the specified number of lines and displays details of the splitting process.

split -l 2000 --verbose largefile.txt output

This command splits the largefile.txt file into chunks of 2000 lines each and generates multiple output files prefixed with output while displaying details of the splitting process.

Example 6

Splits the file into chunks of the specified size and reads the input file from standard input.

cat largefile.txt | split -b 500M - output

This command will pipe the contents of the largefile.txt file to the split command, split it into chunks of 500MB size each, and name the output file with the default alphabetical suffix.

Example 7

Split the file into chunks of the specified size and save the output file in the specified directory.

split -b 100M largefile.txt /path/to/output/output

This command splits the largefile.txt file into chunks of 100MB each and saves the output file in the specified directory. The output file name is prefixed with output.
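
Note that split does not create the target directory. The following sketch, assuming GNU split and seq, shows the failure and the fix; the chunks directory is a hypothetical stand-in for /path/to/output.

```shell
workdir=$(mktemp -d)
seq 1 100 > "$workdir/largefile.txt"
outdir="$workdir/chunks"   # hypothetical output directory

# The directory does not exist yet, so split cannot create its
# first output file and exits with a non-zero status.
nodir_status=0
split -l 40 "$workdir/largefile.txt" "$outdir/output" 2>/dev/null || nodir_status=$?

# Create the directory first, then split into it.
mkdir -p "$outdir"
split -l 40 "$workdir/largefile.txt" "$outdir/output"
ls "$outdir"
```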



Conclusion

During our exploration, we have gained an in-depth understanding of the power and wide application of Shell commands. However, learning these techniques is just the beginning. The real power comes from how you integrate them into your daily routine to increase efficiency and productivity.

Psychology tells us that learning is a continuous and active process. So, I encourage you to not only read and understand these commands, but also practice them. Try creating your own commands and gradually master shell programming so that it becomes part of your daily routine.

Also, remember that sharing is a very important part of the learning process. If you found this blog helpful, please feel free to like and leave a comment. Sharing the problems or interesting experiences you encountered when using Shell commands can help more people learn from them.
In addition, I also welcome you to bookmark this blog and come back to check it anytime. Because review and repeated practice are also the keys to consolidating knowledge and improving skills.

Finally, remember: anyone can become a shell programming expert through continued study and practice. I look forward to seeing you make further progress on this journey!


Read my CSDN homepage and unlock more exciting content: Bubble’s CSDN homepage


Origin blog.csdn.net/qq_21438461/article/details/131362571