Working with .csv files in the Linux shell

When using the shell to process CSV files, you can use the following commands and tricks to perform common operations:

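  1. Read CSV files: Use the cat command to print the contents of a CSV file. A pipe ( |) passes that output on to another command.
   cat file.csv    # Print the contents of the CSV file
  2. Extract specific columns: Use the cut command to extract specific columns from the CSV file.
   cut -d ',' -f 1,3 file.csv    # Extract columns 1 and 3 (comma as the field delimiter)
  3. Filter rows: Use the grep command to filter rows in the CSV file based on certain criteria.
   grep "keyword" file.csv    # Print the rows that contain the given keyword
  4. Sort data: Use the sort command to sort the data in the CSV file. The -t option sets the field delimiter; without it, sort does not split the line on commas.
   sort -t ',' -k 2,2n file.csv    # Sort numerically by column 2
  5. Statistical calculation: Use the awk command to run calculations over the data.
   awk -F ',' '{sum += $3} END {print sum}' file.csv    # Compute the sum of column 3
  6. Save the results: Use the redirection operators ( > to overwrite, >> to append) to write the processing results to a new file. Do not redirect to the same file the command is reading, because the shell truncates that file before the command runs; write to a new file and rename it afterwards if you want to replace the original.
   grep "keyword" file.csv > filtered_file.csv    # Write the rows that contain the keyword to a new file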
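
The commands above can be combined on a small sample file. The following is a minimal sketch, assuming a file with a header row and no quoted fields; the file name sales.csv and its contents are made up for illustration.

```shell
# Build a small sample file in a temporary directory (hypothetical data).
workdir=$(mktemp -d)
csv="$workdir/sales.csv"
printf '%s\n' \
  'name,region,amount' \
  'alice,north,30' \
  'bob,south,12' \
  'carol,north,7' > "$csv"

# Skip the header row, then keep only the rows whose region is "north".
north_rows=$(tail -n +2 "$csv" | grep ',north,')

# Extract columns 1 and 3 from those rows.
north_cols=$(printf '%s\n' "$north_rows" | cut -d ',' -f 1,3)

# Sort the data rows numerically by column 3; -t sets the comma delimiter.
sorted=$(tail -n +2 "$csv" | sort -t ',' -k 3,3n)

# Sum column 3, using NR > 1 to skip the header row.
total=$(awk -F ',' 'NR > 1 {sum += $3} END {print sum}' "$csv")

printf '%s\n' "$north_cols" "$sorted" "$total"
rm -rf "$workdir"
```

Note that cut, sort, and awk -F ',' split on every comma, so this approach only suits files whose fields contain no quoted commas.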

The following are several common ways to check the character encoding of a CSV file:

  1. Use a text editor: Open the CSV file in a text editor (such as Notepad++, Sublime Text, Visual Studio Code, etc.). The current encoding is usually shown in the status bar at the bottom of the editor or in the settings.

  2. Use command-line tools: On the command line, the file command detects a file's type and encoding. Use the following command to view the encoding of the file:

   file -i file.csv

This command will output the MIME type and encoding information of the file.

  3. Use third-party tools: Third-party tools such as enca and chardet can also detect a file's character encoding automatically.
   enca -L none file.csv    # Detect the file's character encoding with enca
   chardet file.csv    # Detect the file's character encoding with chardet

Note that none of these methods is fully reliable, especially when the file carries no explicit encoding marker such as a byte order mark. If different tools report different encodings for the same CSV file, you may need to investigate further, for example by converting the file from each candidate encoding and checking which result reads correctly.
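
One way to do that further checking is to test whether the file decodes without errors as a candidate encoding. The sketch below uses iconv for this; the helper name decodes_as and the sample files are assumptions made for illustration, not part of any tool.

```shell
# Hypothetical helper: succeeds only if file $2 decodes cleanly as encoding $1.
decodes_as() {
  iconv -f "$1" -t UTF-8 "$2" > /dev/null 2>&1
}

workdir=$(mktemp -d)
# The same two Chinese characters written as UTF-8 bytes and as GBK bytes.
printf 'id,name\n1,\344\270\255\346\226\207\n' > "$workdir/utf8.csv"
printf 'id,name\n1,\326\320\316\304\n' > "$workdir/gbk.csv"

if decodes_as UTF-8 "$workdir/utf8.csv"; then utf8_check=valid; else utf8_check=invalid; fi
if decodes_as UTF-8 "$workdir/gbk.csv"; then gbk_check=valid; else gbk_check=invalid; fi

echo "utf8.csv as UTF-8: $utf8_check"
echo "gbk.csv as UTF-8: $gbk_check"
rm -rf "$workdir"
```

A clean decode only rules an encoding in as a candidate. Some encodings, such as ISO-8859-1, accept almost any byte sequence, so the converted text still needs to be read and checked.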

Here are examples of converting a CSV file from UTF-8 to other encodings. To convert in the opposite direction, swap the values of -f (source encoding) and -t (target encoding):

	iconv -f UTF-8 -t GBK file.csv > converted_file.csv
	iconv -f UTF-8 -t UTF-16 file.csv > converted_file.csv
	iconv -f UTF-8 -t ASCII file.csv > converted_file.csv
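
A conversion can be checked by converting the result back and comparing it with the original. The following is a minimal sketch, assuming a UTF-8 sample file with made-up contents. It uses UTF-16 for the round trip, and the same pattern should apply to GBK where the installed iconv supports it.

```shell
workdir=$(mktemp -d)
src="$workdir/file.csv"
# Sample UTF-8 data (hypothetical) containing non-ASCII characters.
printf '%s\n' 'id,city' '1,北京' '2,Zürich' > "$src"

# Convert UTF-8 -> UTF-16, then back to UTF-8 by swapping -f and -t.
iconv -f UTF-8 -t UTF-16 "$src" > "$workdir/utf16.csv"
iconv -f UTF-16 -t UTF-8 "$workdir/utf16.csv" > "$workdir/roundtrip.csv"

# cmp -s is silent and reports through its exit status whether the files match.
if cmp -s "$src" "$workdir/roundtrip.csv"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "round trip: $roundtrip"

# ASCII cannot represent these characters. glibc iconv typically reports an
# error here; other implementations may substitute characters instead.
if iconv -f UTF-8 -t ASCII "$src" > "$workdir/ascii.csv" 2>/dev/null; then
  echo "ASCII conversion reported success; inspect the output for substitutions"
else
  echo "ASCII conversion failed: the file contains non-ASCII characters"
fi
rm -rf "$workdir"
```

Checking the exit status of iconv before replacing a file helps avoid keeping a truncated or partially converted result.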

Origin blog.csdn.net/qq_38202733/article/details/131570807