FatFs, Part 2: Path Name Rules, Character Encoding, Code Pages, and Volume Management in Detail

Preface

The content of this article comes from the official FatFs documentation, with some extra sections and content added. It is not a verbatim translation of the official FatFs documents. If you want the official documents, follow the links in the Reference section at the end of this article.

Path name format

  The path name format in FatFs is similar to the file name specification of the DOS/Windows APIs: [drive#:][/]directory/file. FatFs supports both long file names (LFN) and 8.3 format file names (SFN). LFN can be used when FF_USE_LFN >= 1. As in the DOS/Windows APIs, subdirectories are separated with \ or /, and duplicate separators are skipped and ignored. The only difference is that the drive prefix that specifies the logical drive takes the form digit + colon. When the drive prefix is omitted, the default drive is assumed (drive 0 or the current drive). For example: 0:/folder/myfile.exe.
  A control character (\0 to \x1F) is recognized as the end of the path name. In the LFN configuration, leading or embedded spaces are valid as part of the name, but in the non-LFN configuration a space is recognized as the end of the path name. Both configurations ignore trailing spaces and dots.
  In the default configuration (FF_FS_RPATH == 0), FatFs has no concept of a current directory the way an operating system's file system does. Every object on a volume is always specified by its full path name starting from the root directory. Dot directory names ("." and "..") are not allowed. The leading separator is ignored; it may be present or omitted. The default drive is fixed to drive 0.
  When relative paths are enabled (FF_FS_RPATH >= 1), a path with a leading separator is followed from the root directory; otherwise it is followed from the current directory of the drive, which is set by the f_chdir function. Dot directory names are also allowed in path names. The default drive is the current drive set by the f_chdrive function.

Path name    | FF_FS_RPATH == 0                         | FF_FS_RPATH >= 1
file.txt     | A file in the root directory of drive 0  | A file in the current directory of the current drive
/file.txt    | A file in the root directory of drive 0  | A file in the root directory of the current drive
(empty)      | The root directory of drive 0            | The current directory of the current drive
/            | The root directory of drive 0            | The root directory of the current drive
2:           | The root directory of drive 2            | The current directory of drive 2
2:/          | The root directory of drive 2            | The root directory of drive 2
2:file.txt   | A file in the root directory of drive 2  | A file in the current directory of drive 2
../file.txt  | Invalid name                             | A file in the parent directory
.            | Invalid name                             | This directory
..           | Invalid name                             | Parent directory of the current directory (*)
dir1/..      | Invalid name                             | The current directory
/..          | Invalid name                             | The root directory (always stays at the top level)
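  To make the FF_FS_RPATH == 0 column more concrete, here is a minimal sketch of how a path could be split into a drive number and the remainder. The function split_drive_prefix is a hypothetical helper written for this article, not a FatFs function, and it only models the rules stated above (digit + colon prefix, default drive 0, leading and duplicate separators ignored).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of FatFs: split "[drive#:][/]rest"
 * following the FF_FS_RPATH == 0 rules described in the text.
 * Returns the drive number (0 when the prefix is omitted) and stores
 * in *rest a pointer to the first character after the prefix and any
 * leading separators. */
static int split_drive_prefix(const char *path, const char **rest)
{
    int drive = 0;                      /* default drive is fixed to 0 */

    assert(path != NULL && rest != NULL);

    /* A single digit followed by a colon is a drive prefix. */
    if (isdigit((unsigned char)path[0]) && path[1] == ':') {
        drive = path[0] - '0';
        path += 2;
    }
    /* The leading separator is ignored; duplicates are skipped too. */
    while (*path == '/' || *path == '\\') {
        path++;
    }
    *rest = path;
    return drive;
}
```

  With this sketch, "file.txt" and "/file.txt" both resolve to drive 0 with the remainder "file.txt", and "2:/" resolves to drive 2 with an empty remainder, matching the rows of the table.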

  In addition, the drive prefix can be a predefined arbitrary string. When FF_STR_VOLUME_ID == 1, an arbitrary string volume ID can be used as the drive prefix, for example "flash:file1.txt", "ram:temp.dat" or "sd:". When FF_STR_VOLUME_ID == 2, a Unix style drive prefix can be used, e.g. "/flash/file1.txt", "/ram/temp.dat" or "/usb". However, a path cannot traverse drives, as in "/flash/../ram/temp.dat". The Unix style drive prefix can also blur the distinction between a volume ID and a file name. For example, does "/flash" mean a file named "flash" in the root directory with no drive prefix, or the drive prefix "flash"? If the string after the leading slash matches any volume ID, it is treated as a drive prefix.

(*) Note: In this release, the double dot name ".." cannot follow the parent directory on exFAT volumes. It works as "." and stays there.
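  The matching rule for Unix style prefixes can be sketched as follows. The volume ID table and the function match_unix_prefix are assumptions made for illustration; in FatFs the ID strings come from the configuration, and this sketch compares case-sensitively for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical volume ID table for this sketch only. */
static const char *const volume_ids[] = { "flash", "ram", "sd" };
#define VOLUME_COUNT (sizeof volume_ids / sizeof volume_ids[0])

/* Return the volume index for a Unix style prefix such as "/flash/x",
 * or -1 when the first path segment matches no volume ID (the path
 * then names an object without a drive prefix). */
static int match_unix_prefix(const char *path)
{
    size_t i, len;

    assert(path != NULL);
    if (path[0] != '/') {
        return -1;                      /* no leading slash, no prefix */
    }
    len = strcspn(path + 1, "/");       /* length of the first segment */
    for (i = 0; i < VOLUME_COUNT; i++) {
        if (strlen(volume_ids[i]) == len &&
            strncmp(path + 1, volume_ids[i], len) == 0) {
            return (int)i;              /* whole segment equals an ID */
        }
    }
    return -1;
}
```

  Because the whole first segment must equal a volume ID, "/flash" is taken as the drive prefix here, while "/flashy/x" is not.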

8.3 format file names

  The 8.3 format file name, also known as the short file name (SFN), is the naming convention used in the era of DOS and FAT12/FAT16. Its structure is shown below:
[Figure: structure of an 8.3 format file name]

  • 8: the main part of the file or directory name is at most 8 bytes. A directory name has no extension; a file name can have an extension that indicates the file type.
  • .: the separator between the file name and the extension.
  • 3: the extension part of the file name is at most 3 bytes.

When a name does not fit the 8.3 format, the excess characters are dropped. The default DOS approach replaces them with ~x, where x is a number, e.g. 123456~8.TXT. If a file has no extension, a trailing dot carries no meaning: "file." and "file" name the same file.
  In addition, the FAT file system itself is not case sensitive, but the vast majority of systems convert names to uppercase when storing them. FatFs does too.
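  As a rough illustration of the length limits above, the following sketch checks whether a name fits the 8.3 pattern. The function fits_8_3 is a hypothetical helper, not a FatFs API, and it checks lengths only; it does not validate individual characters or generate ~x names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when `name` fits the 8.3 pattern by length
 * alone (base name of 1 to 8 bytes, optional extension of up to
 * 3 bytes, at most one dot). */
static bool fits_8_3(const char *name)
{
    const char *dot;
    size_t base_len, ext_len;

    assert(name != NULL);
    dot = strrchr(name, '.');           /* last dot separates the extension */
    if (dot == NULL) {
        base_len = strlen(name);
        ext_len = 0;
    } else {
        base_len = (size_t)(dot - name);
        ext_len = strlen(dot + 1);
        if (memchr(name, '.', base_len) != NULL) {
            return false;               /* a second dot in the base name */
        }
    }
    return base_len >= 1 && base_len <= 8 && ext_len <= 3;
}
```

  Note that "file." passes with an empty extension, which is consistent with the trailing dot carrying no meaning.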

Long File Names

  A long file name (LFN) is any file name whose base name (not including the extension) exceeds 8 bytes or whose extension exceeds 3 bytes.

Legal characters and case sensitivity

  In the FAT file system, the legal characters for an object name (file or directory name) are 0-9 A-Z ! # $ % & ' ( ) - @ ^ _ ` { } ~ and the extended characters. Which extended character codes are valid depends on the code page configuration. On systems with LFN support, + , ; = [ ] and spaces are also legal in object names. Spaces and dots can be placed anywhere in the name except at its end.
  Object names on a FAT volume are not case sensitive; they are compared in a case-insensitive manner. For example, the three names file.txt, File.Txt and FILE.TXT are the same. The same rule applies to extended characters. When an object is created on a FAT volume, the name converted to uppercase is recorded in the SFN entry and, when the LFN feature is enabled, the original name is recorded in the LFN entry.
  In MS-DOS and PC DOS for CJK (DOS/DBCS), extended characters are recorded in the SFN entry without case conversion and are compared case-sensitively. When a DOS/DBCS system creates an object with extended characters in its name on a volume, this can cause compatibility problems with Windows; therefore, object names with DBCS extended characters should not be used on FAT volumes shared by these systems. FatFs is case sensitive for extended characters only in the non-LFN configuration with DBCS (DOS/DBCS specification). In the LFN configuration, FatFs is case insensitive for extended characters (Windows NT specification).

Character Encoding

The first stage

  Computers were first invented in the United States, and character encoding was one of the first problems encountered there. After much thought, a scheme was settled on that uses one byte (8 bits), which can take 256 (2 to the 8th power) different states, to represent the characters of the language, which was enough for English. Early ASCII defined only 128 characters (95 of them printable), and the remaining 128 codes, those with the most significant bit set to 1, were left unused.

The second stage

  In the second stage, computers became more widespread and the original 128 characters were no longer enough, so the unused upper 128 codes were put to work. The characters in codes 128-255 are called the "extended character set". When IBM defined code page 437 for the IBM PC, these upper 128 codes came into use. However, as computers continued to spread, 256 characters were not enough either, which leads to the third stage.

The third stage

  ASCII cannot represent the characters of other languages such as Chinese, Japanese and Korean. The solution people came up with was to use two bytes to represent one character. Since ASCII leaves the upper 128 codes free, the first byte takes a value from this upper half and, combined with the second byte, can represent up to 128 x 256 = 32768 characters. When a program processes a string, a byte < 128 is treated as ASCII, while a byte >= 128 is read together with the next byte and converted into one character. China's first character encoding, GB2312, implements this scheme, and other countries have their own implementations. GB2312 later proved insufficient and was extended into GBK; later still, because GBK could not represent certain minority languages, it was extended again into GB18030.
  This encoding method is called DBCS (Double-Byte Character Set).
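  The decoding rule described above can be sketched in a few lines of C. This is a simplified illustration under the rule stated in the text (any byte >= 0x80 is a lead byte); real encodings such as GB2312 restrict the valid lead and trail byte ranges further. The function name dbcs_char_count is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in a NUL-terminated DBCS string using the
 * simplified rule from the text: a byte below 0x80 is one ASCII
 * character; any other byte is a lead byte that forms one character
 * together with the byte after it. */
static size_t dbcs_char_count(const unsigned char *s)
{
    size_t count = 0;

    assert(s != NULL);
    while (*s != 0) {
        if (*s >= 0x80 && s[1] != 0) {
            s += 2;                     /* lead byte + trail byte */
        } else {
            s += 1;                     /* ASCII, or a dangling lead byte */
        }
        count++;
    }
    return count;
}
```

  For instance, the 6-byte GB2312 sequence 'A', 0xD6 0xD0, 0xCE 0xC4, 'B' (the letters A and B around the two characters of "中文") counts as 4 characters.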

The fourth stage

  Each country has its own DBCS implementation, so character encodings differ from country to country. Even within the Chinese-speaking world, different regions (such as Hong Kong and Taiwan) use different DBCS implementations. At this point ISO (the International Organization for Standardization) stepped in to solve the problem. Their approach was simple and blunt: abolish all regional encoding schemes and create a new encoding that covers the letters and symbols of every culture on Earth. They called it the "Universal Multiple-Octet Coded Character Set", UCS for short, commonly known as Unicode. By analogy, the original regional encodings are like dialects, while Unicode is a single official language for the whole world.
  Note that Unicode only specifies the code assigned to each character; it does not say how those codes are stored. There are several implementations, mainly UTF-8, UTF-16 and UTF-32. If you are interested in character sets, you can search for more information on your own.
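  To show the difference between a Unicode code and one of its stored forms, here is a minimal sketch that encodes a single code point as UTF-8. The function name utf8_encode is made up for this example; the bit layout follows the standard UTF-8 scheme (1 to 4 bytes per code point).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out[4].
 * Returns the number of bytes written, or 0 for an invalid code point
 * (a surrogate, or a value above U+10FFFF). */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                    /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0;                   /* surrogates are not characters */
        }
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 11110xxx + three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

  For example, U+4E2D (中) becomes the three bytes E4 B8 AD in UTF-8, whereas the same character is the two bytes D6 D0 in GB2312.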

Code page (Code Page)

  A code page, also known as a code table, is a concept introduced by IBM. Back then IBM was the leading computer maker and very well known. To handle character input flexibly, IBM introduced tables that define the encoding used by a computer. A code page is a list of selected character codes arranged in a particular order.
  Later, as Microsoft grew, it also adopted code pages to deal with the different regional encodings. For each regional encoding, the Windows operating system has a corresponding code page. In this way Windows can display the text of a region according to that region's encoding. Newer Windows systems appear to have moved to Unicode.
  Character encoding affects the string functions we use; choosing the wrong encoding can make string functions fail in unpredictable ways. For example, many Windows API functions distinguish between character sets, as anyone who has done Win32 programming knows well. Fortunately, unless you work at a low level, you usually do not have to care about character encoding, because the system handles it for you. With a file system such as FatFs, however, character encoding must be taken into account.

Unicode API

  Depending on the user configuration, FatFs inputs and outputs path names in either ANSI/OEM code or Unicode. In FatFs functions, the type of the parameters that specify path names is defined as TCHAR. By default it is an alias of char, and the code set of the path name string is the ANSI/OEM code specified by FF_CODE_PAGE. When FF_LFN_UNICODE is set to 1 or greater, the TCHAR type switches to the proper type to support Unicode strings. When the Unicode API is selected by this option, the full-featured LFN specification is supported, and Unicode specific characters (such as ✝☪✡☸☭) can also be used in path names. The option also affects the data types and encoding of the string I/O functions. To define a text string, the macros _T(s) and _TEXT(s) can be used to select ANSI/OEM or Unicode automatically. The following code shows examples of how text strings are defined.

 f_open(fp, "filename.txt", FA_READ);      /* ANSI/OEM string (char) */
 f_open(fp, L"filename.txt", FA_READ);     /* UTF-16 string (WCHAR) */
 f_open(fp, u8"filename.txt", FA_READ);    /* UTF-8 string (char) */
 f_open(fp, U"filename.txt", FA_READ);     /* UTF-32 string (DWORD) */
 f_open(fp, _T("filename.txt"), FA_READ);  /* Changed by configuration (TCHAR) */
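  The idea behind TCHAR and _T(s) can be imitated with a few preprocessor lines. The sketch below is not the FatFs header: MY_LFN_UNICODE, my_tchar, MY_T and my_tcslen are hypothetical names, and the wide branch uses the compiler's wchar_t (whose width is platform dependent) rather than the 16-bit WCHAR type that FatFs defines itself.

```c
#include <assert.h>
#include <stddef.h>     /* size_t, wchar_t */

/* Hypothetical stand-in for the FF_LFN_UNICODE option:
 * 0 selects ANSI/OEM (char), 1 selects wide strings. */
#define MY_LFN_UNICODE 0

#if MY_LFN_UNICODE == 1
typedef wchar_t my_tchar;
#define MY_T(s) L ## s          /* paste the L prefix onto the literal */
#else
typedef char my_tchar;
#define MY_T(s) s               /* leave the literal unchanged */
#endif

/* Length of a string in code units, whichever type my_tchar is. */
static size_t my_tcslen(const my_tchar *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

  Code written against my_tchar and MY_T() compiles unchanged in either mode, which is the reason the last f_open line above is the portable one.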

Volume Management

  To work correctly, FatFs requires a dynamic work area, the file system object, for each volume (logical drive). It is registered to or unregistered from the FatFs module with the f_mount function. By default, each logical drive is bound to the physical drive with the same drive number, and the volume mount process searches the drive for a FAT volume. It reads boot sectors and checks whether each is a FAT boot sector, in this order: sector 0 as SFD format, then the first, second, third and fourth partitions as FDISK format.
  When the configuration option FF_MULTI_PARTITION == 1 is specified, each individual logical drive is bound to the partition on the physical drive specified by the volume management table. The user must define the volume management table (PARTITION VolToPart[FF_VOLUMES];) to resolve the mapping between logical drives and partitions. The following code is an example of a volume management table.
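  To give a feel for the FDISK format part of that scan, here is a minimal sketch that reads the start sector of one partition from an MBR sector. It is not the FatFs mount code; the function names are made up, and it relies only on the standard MBR layout (four 16-byte entries at offset 446, boot signature 0x55 0xAA at offset 510).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE   512
#define MBR_TABLE_OFS 446   /* start of the four 16-byte partition entries */
#define MBR_SIG_OFS   510   /* boot signature bytes 0x55, 0xAA */

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the start sector (LBA) of partition `index` (0..3) in an MBR
 * sector, or 0 when the signature is missing or the entry is unused. */
static uint32_t mbr_partition_start(const uint8_t *sector, int index)
{
    const uint8_t *entry;

    assert(sector != NULL && index >= 0 && index < 4);
    if (sector[MBR_SIG_OFS] != 0x55 || sector[MBR_SIG_OFS + 1] != 0xAA) {
        return 0;                       /* not a valid boot sector */
    }
    entry = sector + MBR_TABLE_OFS + 16 * index;
    if (entry[4] == 0) {
        return 0;                       /* partition type 0: unused entry */
    }
    return load_le32(entry + 8);        /* start LBA field of the entry */
}
```

  A mount routine would then read the sector at the returned LBA and check whether it is a FAT boot sector, moving on to the next entry if it is not.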

Example: "0:", "1:" and "2:" are tied to three pri-partitions on the physical drive 0 (fixed drive)
         "3:" is tied to an FAT volume on the physical drive 1 (removable drive)

PARTITION VolToPart[FF_VOLUMES] = {
    {0, 1},     /* "0:" ==> physical drive 0, 1st partition */
    {0, 2},     /* "1:" ==> physical drive 0, 2nd partition */
    {0, 3},     /* "2:" ==> physical drive 0, 3rd partition */
    {1, 0}      /* "3:" ==> physical drive 1, auto detection */
};
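  How such a table is consulted can be sketched as a simple lookup. The types and names below (part_entry, vol_to_part, lookup_volume) are hypothetical mirrors of the table above written for this article, not FatFs definitions; the meaning of the two fields (physical drive, then partition with 0 for auto detection) is taken from the comments in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one table entry: physical drive number and
 * partition number (0 = auto detection, 1..4 = that partition). */
typedef struct {
    unsigned char pd;
    unsigned char pt;
} part_entry;

#define VOLUMES 4

static const part_entry vol_to_part[VOLUMES] = {
    {0, 1},     /* "0:" */
    {0, 2},     /* "1:" */
    {0, 3},     /* "2:" */
    {1, 0}      /* "3:" */
};

/* Return the binding of logical drive `ld`, or NULL when the logical
 * drive number is out of range. */
static const part_entry *lookup_volume(int ld)
{
    if (ld < 0 || ld >= VOLUMES) {
        return NULL;
    }
    return &vol_to_part[ld];
}
```

  Looking up logical drive 1 yields physical drive 0, partition 2, while logical drive 3 yields physical drive 1 with auto detection.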

The following figure shows FatFs volume management:
[Figure: FatFs volume management]
Several points need to be considered when using the multi-partition configuration.

  • A physical drive that has two or more mounted partitions must be non-removable. Changing the media while the system is operating is prohibited.
  • Only the four primary partitions can be specified. Extended partitions are not supported.
  • Windows does not support multiple volumes on removable storage. It only recognizes the first partition.

Reference

  1. FatFs official website document http://elm-chan.org/fsw/ff/doc/filename.html
  2. The official character encoding in Windows

Origin blog.csdn.net/ZCShouCSDN/article/details/96475193