Introduction to CRLF/LF line breaks (\r\n, \n) on Windows and Linux

Question:

Sometimes, when a file that was edited on Windows is opened with vim on Linux, an extra "^M" appears at the end of each line. What is going on?

1. CR/LF introduction

CR is short for Carriage Return;

LF is short for Line Feed.

CR and LF are holdovers from the days when computer terminals were teleprinters. Teletypewriters work just like regular typewriters.

At the end of each line, the CR command moves the print head back to the left. The LF command advances the paper one line.

Although the days of rolling paper terminals are over, the CR and LF commands still exist and are still used as delimiters by many applications and network protocols.

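Linux (Unix) and macOS use "\n" as the newline character by default (classic Mac OS, before Mac OS X, used a lone "\r", which is the "Mac format" handled by mac2unix in Appendix 2);

Windows uses "\r\n" as the newline sequence by default.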

2. Unix (Linux) newline character

The newline character under Linux is "\n".
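
A quick way to see which byte "\n" really is, sketched here with the standard printf, od and tr utilities (the variable names lf_hex and lf_dec are only for illustration):

```shell
# Dump the single byte that "\n" produces as hex.
# od -An omits the offset column, -tx1 prints each byte as two hex digits;
# tr then strips the padding spaces and od's own trailing newline.
lf_hex=$(printf '\n' | od -An -tx1 | tr -d ' \n')

# Convert the hex digits to decimal with shell arithmetic.
lf_dec=$((0x$lf_hex))

echo "LF = 0x$lf_hex = $lf_dec"
```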

"\n" corresponds to LF in the ASCII table; its ASCII value is 10, which is 0x0a in hexadecimal.

 

3. Newline character on Windows

The newline character under Windows is "\r\n".
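
To check whether a file actually uses Windows line endings, one possible approach is to count the lines that end in a CR. The helper name count_crlf_lines and the sample content below are made up for this sketch:

```shell
# count_crlf_lines FILE: print how many lines of FILE end in CR LF.
# Hypothetical helper for illustration, not part of any standard tool.
count_crlf_lines() {
    # Command substitution only strips trailing newlines, so the CR survives.
    cr=$(printf '\r')
    # grep -c exits non-zero when the count is 0; "|| true" keeps that harmless.
    grep -c "${cr}\$" "$1" || true
}

# Sample file: two lines with Windows endings, one with a Unix ending.
sample=$(mktemp)
printf 'one\r\ntwo\r\nthree\n' > "$sample"

crlf_count=$(count_crlf_lines "$sample")
echo "lines ending in CRLF: $crlf_count"

rm -f "$sample"
```

A count of 0 suggests the file already uses Unix line endings; a count lower than the total number of lines suggests mixed line endings.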

"\r" corresponds to CR in the ASCII table; its ASCII value is 13, which is 0x0d in hexadecimal.

When vim treats the file as Unix format, it displays each "\r" as "^M", which is where the extra characters in the question come from.

4. Unix/Windows newline conversion

4.1 Conversion tools on Linux

  1. dos2unix: converts Windows-style newlines to Unix-style newlines
  2. unix2dos: converts Unix-style newlines to Windows-style newlines
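
When dos2unix and unix2dos are not installed, roughly the same conversion can be approximated with sed. This is a sketch of a substitute technique, not how dos2unix itself works; the function names crlf_to_lf and lf_to_crlf are hypothetical, and binary files or files without a trailing newline are not considered:

```shell
# crlf_to_lf SRC DST: copy SRC to DST, dropping the CR at the end of each line.
crlf_to_lf() {
    cr=$(printf '\r')
    sed "s/${cr}\$//" "$1" > "$2"
}

# lf_to_crlf SRC DST: copy SRC to DST, ending every line with CR LF.
# An existing CR is stripped first so that lines never end in CR CR LF.
lf_to_crlf() {
    cr=$(printf '\r')
    sed "s/${cr}\$//; s/\$/${cr}/" "$1" > "$2"
}

# Round trip on a small sample file in a scratch directory.
work=$(mktemp -d)
printf 'a\r\nb\r\n' > "$work/dos.txt"
crlf_to_lf "$work/dos.txt" "$work/unix.txt"
lf_to_crlf "$work/unix.txt" "$work/back.txt"

# Hex dumps make the line endings visible (0d = CR, 0a = LF).
unix_hex=$(od -An -tx1 "$work/unix.txt" | tr -d ' \n')
back_hex=$(od -An -tx1 "$work/back.txt" | tr -d ' \n')
echo "unix: $unix_hex"
echo "back: $back_hex"

rm -rf "$work"
```

Writing to a separate destination file, rather than editing in place, keeps the original untouched if the result turns out to be wrong.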

4.2 Converting between CRLF and LF on Windows

4.2.1 Using dos2unix/unix2dos

Download the Windows build of dos2unix/unix2dos from SourceForge (version 7.5.1 is the one referenced here):

https://sourceforge.net/projects/dos2unix

For usage, refer to the manual shipped in the package:

dos2unix-7.5.1-win64-nls/share/doc/dos2unix-7.5.1/dos2unix.htm

in particular the EXAMPLES and RECURSIVE CONVERSION chapters (see Appendix 2).

4.2.2 Using a code editor

Commonly used code editors on Windows generally support converting between CRLF and LF.

In VS Code, for example, you can choose LF or CRLF in the lower right corner of the window;

other editors work similarly.

To change the default line ending, modify it in the editor's settings.

5. Git configuration options related to line breaks

5.1 core.autocrlf

The core.autocrlf option accepts three values:

  • true: convert to LF on commit and to CRLF on checkout
  • false (default): no conversion; files are committed and checked out exactly as they are
  • input: convert to LF on commit; no conversion on checkout
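
The effect of input can be observed in a scratch repository. The sketch below assumes git is installed; the temporary repository and the file name dos.txt are placeholders, and core.safecrlf is switched off only to keep the output quiet:

```shell
# Throwaway repository so that no real configuration is touched.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.autocrlf input
git -C "$repo" config core.safecrlf false

# A two-line file with Windows (CRLF) line endings.
printf 'a\r\nb\r\n' > "$repo/dos.txt"
git -C "$repo" add dos.txt

# Hex dump of the working-tree file and of the blob staged in the index
# (0d = CR, 0a = LF). ":dos.txt" names the staged version of the file.
worktree_hex=$(od -An -tx1 "$repo/dos.txt" | tr -d ' \n')
staged_hex=$(git -C "$repo" cat-file -p :dos.txt | od -An -tx1 | tr -d ' \n')
echo "working tree: $worktree_hex"
echo "staged blob:  $staged_hex"

rm -rf "$repo"
```

The working-tree file keeps its CRLF endings, while the staged copy that would be committed contains only LF.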

5.2 core.eol

The core.eol option sets the line ending used in the working directory for files that git treats as text. According to the git documentation, it is ignored when core.autocrlf is set to true or input.

  • lf: use LF as the line ending.
  • crlf: use CRLF as the line ending.
  • native (default): use the operating system's native line ending.

5.3 core.safecrlf

The core.safecrlf option guards against irreversible line-ending conversion, which is what typically happens to files with mixed newlines. It only has an effect when line-ending conversion is active. It accepts three values:

  • false: no check; irreversible conversions are allowed silently.
  • warn (default): check, and print a warning when a conversion would be irreversible.
  • true: check, print an error message, and refuse the operation when a conversion would be irreversible.
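
As a rough illustration of the true value, the sketch below (assuming git is installed, with a throwaway repository and a made-up file name) combines it with core.autocrlf = input. A CRLF file cannot be restored by checkout under that setting, so git add is expected to refuse it:

```shell
# Throwaway repository so that no real configuration is touched.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.autocrlf input
git -C "$repo" config core.safecrlf true

# A file with Windows (CRLF) line endings.
printf 'a\r\nb\r\n' > "$repo/dos.txt"

# Record whether git accepts the file; its error message is discarded here.
if git -C "$repo" add dos.txt 2>/dev/null; then
    add_result=accepted
else
    add_result=rejected
fi
echo "git add: $add_result"

rm -rf "$repo"
```

Dropping the 2>/dev/null redirection shows the message git prints when it refuses the file.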

5.4 Git configuration suggestions

Some commands to view and set the git configuration

# Show the git config settings
git config -l

# Show where each git config setting is defined
git config --list --show-origin

# Set an option globally
git config --global core.autocrlf true

5.4.1

Development environment: Windows

Code compilation/running environment: Windows

Recommended configuration: core.autocrlf = true

5.4.2

Development environment: Windows

Code compilation/running environment: Linux/Mac

Recommended configuration: core.autocrlf = input

5.4.3

Development environment: Linux/Mac

Code compilation/running environment: Linux/Mac

Recommended configuration: core.autocrlf = false (keep the default configuration)

5.4.4

Development environment: Linux/Mac

Code compilation/running environment: Windows

Recommended configuration: core.autocrlf = true

My own configuration keeps the defaults.

My work situation:

About 99% of the time, code is committed on Linux and runs on Linux;

Only very occasionally is a bat script committed from Linux.

So I keep the default configuration.

For bat scripts committed from Linux, I convert them to CRLF manually.

Appendix 1. ASCII code table

Appendix 2. How to use dos2unix (excerpt from the dos2unix manual)

EXAMPLES
    Read input from 'stdin' and write output to 'stdout':

        dos2unix < a.txt
        cat a.txt | dos2unix

    Convert and replace a.txt. Convert and replace b.txt:

        dos2unix a.txt b.txt
        dos2unix -o a.txt b.txt

    Convert and replace a.txt in ascii conversion mode:

        dos2unix a.txt

    Convert and replace a.txt in ascii conversion mode, convert and replace
    b.txt in 7bit conversion mode:

        dos2unix a.txt -c 7bit b.txt
        dos2unix -c ascii a.txt -c 7bit b.txt
        dos2unix -ascii a.txt -7 b.txt

    Convert a.txt from Mac to Unix format:

        dos2unix -c mac a.txt
        mac2unix a.txt

    Convert a.txt from Unix to Mac format:

        unix2dos -c mac a.txt
        unix2mac a.txt

    Convert and replace a.txt while keeping original date stamp:

        dos2unix -k a.txt
        dos2unix -k -o a.txt

    Convert a.txt and write to e.txt:

        dos2unix -n a.txt e.txt

    Convert a.txt and write to e.txt, keep date stamp of e.txt same as
    a.txt:

        dos2unix -k -n a.txt e.txt

    Convert and replace a.txt, convert b.txt and write to e.txt:

        dos2unix a.txt -n b.txt e.txt
        dos2unix -o a.txt -n b.txt e.txt

    Convert c.txt and write to e.txt, convert and replace a.txt, convert and
    replace b.txt, convert d.txt and write to f.txt:

        dos2unix -n c.txt e.txt -o a.txt b.txt -n d.txt f.txt

RECURSIVE CONVERSION
    In a Unix shell the find(1) and xargs(1) commands can be used to run
    dos2unix recursively over all text files in a directory tree. For
    instance to convert all .txt files in the directory tree under the
    current directory type:

        find . -name '*.txt' -print0 |xargs -0 dos2unix

    The find(1) option "-print0" and corresponding xargs(1) option -0 are
    needed when there are files with spaces or quotes in the name. Otherwise
    these options can be omitted. Another option is to use find(1) with the
    "-exec" option:

        find . -name '*.txt' -exec dos2unix {} \;

    In a Windows Command Prompt the following command can be used:

        for /R %G in (*.txt) do dos2unix "%G"

    PowerShell users can use the following command in Windows PowerShell:

        get-childitem -path . -filter '*.txt' -recurse | foreach-object {dos2unix $_.Fullname}

References:

Baidu Encyclopedia: CRLF

[git series 4/4] How to set core.autocrlf | core.safecrlf (the meaning and best practices of configuration values) - CSDN Blog

Does Git automatic line break (autocrlf) input convert line breaks from LF to CRLF | Geek Notes

Problems and solutions of ^M in Shell script - CSDN Blog

SourceForge dos2unix: https://sourceforge.net/projects/dos2unix

Reposted from: blog.csdn.net/ever_who/article/details/133419705