iOS Development Exploration - Base64 Encoding

Reprint: Original address https://www.jianshu.com/p/b8a5e1c770f9

Base64 encoding principle

Base64 encoding is called Base64 because it uses 64 characters to encode arbitrary data. Similarly, there are Base32 and Base16 encoding. The 64 characters used by the standard Base64 encoding are:

Base64 encoding table: index values 0-25 map to A-Z, 26-51 to a-z, 52-61 to 0-9, 62 to +, and 63 to /.

These 64 characters are basic, printable characters that form a common subset of many character encodings (such as ASCII). Only the last two characters are special: because different variants choose them differently, there are many variants of Base64 encoding, such as Base64 URL encoding.
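
The difference between the standard table and the URL variant can be seen in a small sketch. It is written in Python for brevity and uses only the standard `base64` module; the two input bytes are an assumption, picked so that index values 62 and 63 both occur.

```python
import base64

data = b"\xfb\xff"  # sample bytes chosen so that indexes 62 and 63 both appear

standard = base64.b64encode(data)          # uses '+' and '/' for indexes 62 and 63
url_safe = base64.urlsafe_b64encode(data)  # uses '-' and '_' instead

print(standard)  # b'+/8='
print(url_safe)  # b'-_8='
```

Only the two special characters differ; everything else in the output is identical.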

Base64 encoding is essentially a scheme for converting binary data into text. Data that is not already binary is first converted into binary form; then every 6 consecutive bits are read as a number (2 to the 6th power = 64), the character at that index is looked up in the table above, and the characters are joined into the final text string.
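
The steps above can be sketched in a few lines. This is a minimal Python illustration, not the article's Objective-C code: the helper name `encode_whole_groups` is hypothetical, and the sketch deliberately handles only inputs whose length is a multiple of 3 bytes.

```python
import string

# Standard Base64 index table: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_whole_groups(data: bytes) -> str:
    """Encode input whose length is a multiple of 3 bytes (no padding logic)."""
    if len(data) % 3 != 0:
        raise ValueError("this sketch only handles lengths divisible by 3")
    # Write every byte as 8 binary digits and join them into one bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Read the bit string 6 bits at a time; each value indexes the table.
    return "".join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))

print(encode_whole_groups(b"Man"))  # TWFu
```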

Suppose we want to Base64-encode Hello!. Using the ASCII table, the conversion process is shown in the following figure:

[Figure: conversion process for Hello!]

As you can see, the Base64 encoding of Hello! is SGVsbG8h: the original string is 6 characters long and the encoded string is 8 characters long. Every 3 original characters become 4 Base64 characters, so the encoded length is 4/3 of the original length. This ratio matters. A smaller ratio would require a larger character set, which is not what we want; a larger ratio means more characters to transmit and a longer transmission time. Base64 is widely used because it strikes a good balance between character set size and length ratio, which suits a wide range of scenarios.
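
To make the trade-off between character set size and length ratio concrete, the following Python sketch compares Base16, Base32, and Base64 from the standard `base64` module on the same input. The 30-byte sample is an assumption, chosen so that none of the three encodings needs padding.

```python
import base64

data = bytes(range(30))  # 30 sample bytes; this length avoids padding in all three encodings

encoders = {
    "Base16": base64.b16encode,
    "Base32": base64.b32encode,
    "Base64": base64.b64encode,
}
lengths = {name: len(encode(data)) for name, encode in encoders.items()}

for name, length in lengths.items():
    print(f"{name}: {len(data)} bytes -> {length} characters")
```

The encoded lengths are 60, 48, and 40 characters, that is, ratios of 2, 8/5, and 4/3: the larger the character set, the shorter the output.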

Do you think the principle of Base64 encoding is very simple?

There is one point to note, though: Base64 encodes every 3 original characters into 4 characters. What if the length of the original string is not divisible by 3? In that case the original data is padded with zero-valued bytes until its length is a multiple of 3.

Taking Hello!! as an example, the conversion process is:

 

[Figure: conversion process for Hello!!]

Note: The binary 0 values with a blue background in the figure are the added padding.

The Base64 encoding of Hello!! is SGVsbG8hIQAA. The last two zero-valued bytes were added only for the sake of Base64 encoding and correspond to nothing in the original string, so the last two characters AA of the result carry no valid information. They need special handling to avoid decoding errors.

Standard Base64 encoding replaces each of these trailing A characters with the = character, so the encoded result becomes SGVsbG8hIQ==. Because = is not in the Base64 index table, it serves as an end marker: on encountering = during decoding, you know that the Base64 encoded string has ended.
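
The effect of the zero fill and of the = replacement can be checked with Python's standard `base64` module. This is only an illustrative sketch; appending the two zero bytes by hand is an assumption made to reproduce the intermediate SGVsbG8hIQAA form.

```python
import base64

# "Hello!!" is 7 bytes, so two zero bytes are needed to fill the last 3-byte group.
zero_filled = base64.b64encode(b"Hello!!" + b"\x00\x00")
standard = base64.b64encode(b"Hello!!")

print(zero_filled)  # b'SGVsbG8hIQAA'
print(standard)     # b'SGVsbG8hIQ=='

# Decoding the zero-filled form yields two bytes that were never in the input.
print(base64.b64decode(zero_filled))  # b'Hello!!\x00\x00'
print(base64.b64decode(standard))     # b'Hello!!'
```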

If Base64 encoded strings will not be concatenated with one another before transmission, the trailing = can also be omitted. During decoding, if the length of the Base64 encoded string is not divisible by 4, append = characters until it is, and then decode.
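
Restoring omitted padding before decoding might look like the following Python sketch. The helper name `decode_unpadded` is hypothetical, and the sketch assumes the input is otherwise valid Base64 text.

```python
import base64

def decode_unpadded(text: str) -> bytes:
    """Restore omitted '=' padding, then decode (hypothetical helper)."""
    missing = -len(text) % 4  # characters needed to reach a multiple of 4
    return base64.b64decode(text + "=" * missing)

print(decode_unpadded("SGVsbG8hIQ"))  # b'Hello!!'
```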

Decoding is the reverse of encoding, with one thing to watch: convert the trailing = characters back into A characters, which become the corresponding 6-bit binary 0 values, and then convert the bits back into the original characters. The zero values that came from padding are discarded because they carry no valid information.
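
The decoding steps can be sketched the same way. This is a minimal Python illustration under stated assumptions: the helper name `decode_padded` is hypothetical, the input is standard Base64 text whose length is a multiple of 4, and invalid characters are not handled.

```python
import string

# Standard Base64 index table: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def decode_padded(text: str) -> bytes:
    """Decode standard Base64 text whose length is a multiple of 4."""
    stripped = text.rstrip("=")
    pad = len(text) - len(stripped)  # number of '=' characters (0, 1 or 2)
    stripped += "A" * pad            # '=' becomes 'A', i.e. six zero bits
    # Turn every character into its 6-bit index and join the bits.
    bits = "".join(f"{ALPHABET.index(ch):06b}" for ch in stripped)
    # Read the bits back 8 at a time to recover the bytes.
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return raw[:len(raw) - pad]      # drop the zero bytes that came from padding

print(decode_padded("SGVsbG8hIQ=="))  # b'Hello!!'
```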

The following is the sample code

1. Implemented as an NSString+Base64 category

 

```
#import "NSString+Base64.h"
@implementation NSString (Base64)
// ...
@end
```

 

2. Call method implementation

 

3. Test results

Base64 encoding test results

Do not misuse

Some people misuse Base64 encoding for data encryption or data verification without fully understanding it.

Base64 is a data encoding method whose purpose is to make data conform to the requirements of a transmission protocol. Standard Base64 encoding and decoding are fully reversible without any additional information. Even a Base64-like encoding designed for encryption with your own custom character set is easy to crack in most scenarios.
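
A short Python sketch illustrates why Base64 offers no secrecy. The sample payload and the reversed "custom" table are assumptions made purely for illustration.

```python
import base64
import string

token = base64.b64encode(b"password=123456")
# No key is involved: anyone who sees the token can reverse it.
print(base64.b64decode(token))  # b'password=123456'

# A custom character set is only a fixed one-to-one substitution of the standard table.
STANDARD = (string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/").encode()
CUSTOM = STANDARD[::-1]  # hypothetical custom table: the standard one reversed

obfuscated = token.translate(bytes.maketrans(STANDARD, CUSTOM))
# Whoever learns or guesses the table undoes the substitution just as easily.
recovered = obfuscated.translate(bytes.maketrans(CUSTOM, STANDARD))
print(base64.b64decode(recovered))  # b'password=123456'
```

The custom table changes how the text looks, but the mapping never varies, so it protects nothing once the table is known.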

For data encryption, use a dedicated encryption algorithm for which no fast attack is known. For example, the symmetric algorithm AES-128-CBC requires a key, and as long as the key is not leaked it is usually difficult to crack. You can also use an asymmetric algorithm such as RSA, which relies on the enormous amount of computation needed to factor a large integer, so that data encrypted with the public key can be decrypted quickly only with the private key.

For data verification, likewise use a dedicated message authentication code algorithm such as HMAC, which builds the authentication code from a one-way hash function and a key. The computation is irreversible and deterministic, and its purpose is to prevent data from being tampered with or forged in transit. The original data is transmitted together with the authentication code; the receiver uses the same key and the same algorithm to regenerate the authentication code from the received data and compares it with the transmitted one to verify the data's validity.
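
The send-and-verify flow described above might look like this in Python, using the standard `hmac` and `hashlib` modules. The key, the message, and the choice of SHA-256 as the hash function are all assumptions for illustration; `verify` is a hypothetical helper.

```python
import hashlib
import hmac

key = b"shared-secret-key"        # hypothetical key known to both sides
message = b"amount=100&to=alice"  # hypothetical payload

# Sender: compute the authentication code and transmit it with the message.
tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver: recompute the code from the received data and compare."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)  # constant-time comparison

print(verify(key, message, tag))                 # True
print(verify(key, b"amount=900&to=alice", tag))  # False
```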

Summary

Base64 balances the size of the character set against the length of the encoded data, and the last two characters of its character set can be swapped flexibly to meet different needs, which makes it applicable to a wide range of scenarios.
