Android basics review notes 2: encoding, encryption, and login authorization

1. Cryptography

Origin: ancient warfare

In ancient wars, messengers rode on horseback to deliver letters, and the senders constantly worried that a messenger would be captured and the letter read.

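Transposition cipher: the cipher rod (scytale)

One of the earliest encryption methods.

A strip of cloth is wound around a rod and the message is written along the rod; once unwound, the strip shows the letters in scrambled order. The sender and the receiver each hold a rod of the same size, so only they can read it.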

Substitution cipher

Characters are replaced according to a code table.

2. Modern cryptography

Modern cryptography works not only on text but on any kind of binary data.

Symmetric encryption:

Much like a substitution cipher.

Principle:

Use a key and an encryption algorithm to transform the data; the meaningless data that results is the ciphertext. Use the key and the decryption algorithm to reverse the ciphertext and obtain the original data.

Process:

The sender turns the original data into unreadable ciphertext using the encryption algorithm and the key. The receiver gets the ciphertext. Both of us hold the key and nobody else does, so the receiver can run the decryption algorithm with the key and recover the original data. The ciphertext is useless to anyone else who obtains it.
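The process above can be sketched in Java with the standard javax.crypto API. This is a minimal illustration, not a hardened implementation: the class name AesDemo, the choice of AES in GCM mode, the 128-bit key, and the 12-byte nonce are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class AesDemo {
    // Generates a random 128-bit AES key; both sides must hold this same key.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // A fresh 12-byte nonce per message; it is not secret and travels with the ciphertext.
    static byte[] newNonce() {
        byte[] nonce = new byte[12];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    static byte[] encrypt(SecretKey key, byte[] nonce, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, nonce, plain);
    }

    static byte[] decrypt(SecretKey key, byte[] nonce, byte[] cipherText) {
        return run(Cipher.DECRYPT_MODE, key, nonce, cipherText);
    }

    // The same key drives both directions; only the mode differs.
    private static byte[] run(int mode, SecretKey key, byte[] nonce, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, key, new GCMParameterSpec(128, nonce));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] nonce = newNonce();
        byte[] cipherText = encrypt(key, nonce, "meet at dawn".getBytes(StandardCharsets.UTF_8));
        byte[] plain = decrypt(key, nonce, cipherText);
        System.out.println(new String(plain, StandardCharsets.UTF_8)); // meet at dawn
    }
}
```

The input is a byte array, so the same code handles text and binary files alike.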

Why does this matter for computers? Because the network we communicate over is completely untrustworthy. Between my home and yours there may be a great many intermediate nodes, and any of them can easily take the data passing through. Why? When I send you a message over the Internet, it does not travel along one private road; it spreads outward like a broadcast and passes through many nodes on the way.

How is it different from a traditional substitution cipher? It can encrypt binary data, not just text.

Classic algorithms:

DES (deprecated because its key is too short), AES

Why does a short key lead to deprecation? Because a short key can be cracked.
What does cracking a symmetric key mean? Think of the letters in ancient wars: if I know your cipher rod or your code table, then when I intercept your ciphertext I can turn it back into the original. For a modern cipher, the attacker is after the key. Given one matching pair of plaintext and ciphertext, you try candidate keys; when a candidate decrypts the ciphertext back into the known plaintext, the crack has succeeded.
Where there is cracking there is anti-cracking. The idea is straightforward: make cracking more expensive. The best an encryption algorithm can do is ensure that its key can only be found by exhaustive search. You cannot make cracking impossible, but if it would take a thousand or ten thousand years, the key is uncrackable in practice.

Asymmetric encryption:

Principle:

Use the public key to encrypt the data and obtain the ciphertext; use the private key to decrypt the ciphertext and obtain the original data.
Extended use: digital signature

Classic algorithms:

RSA (usable for encryption/decryption and for signing/verification), DSA (designed only for signing, cannot encrypt, but signs quickly), elliptic-curve algorithms (the elliptic-curve form of DSA is called ECDSA)

How does it work? It relies on mathematics. I can explain it briefly, but not fully, because even the simplest asymmetric algorithm is fairly complicated. RSA is one of the simpler ones and is still hard to follow. So here is a toy example. It has loopholes, but it shows the idea.
Suppose the two of us agree that only ten symbols may be sent, the digits 0-9. I want to send you the message 110, but if it is intercepted I am finished, so I transform it first. My encryption algorithm adds a number to each digit, and that number is the key. My encryption key is 4 and my decryption key is 6. Encryption: 110 -> 554, which I send out. The decryption algorithm is also addition, but with 6: 554 -> 110, and the message is restored.
The example has loopholes (anyone who knows the 4 can work out the 6 as 10 - 4, whereas real algorithms make that derivation infeasible), but it shows the principle. Everything depends on overflow: if overflow were not allowed, asymmetric encryption could not work. Whenever a digit overflows, discard the leading digit. This is a key point of asymmetric encryption.
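The toy scheme can be written out in a few lines of Java. The class and method names are made up for this sketch; the keys 4 and 6 and the message 110 come from the example above.

```java
public class ToyAsymmetric {
    // Adds `key` to every decimal digit and discards the overflow (arithmetic mod 10).
    static String shiftDigits(String digits, int key) {
        StringBuilder out = new StringBuilder();
        for (char c : digits.toCharArray()) {
            int digit = c - '0';
            out.append((digit + key) % 10);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String cipherText = shiftDigits("110", 4);   // encrypt with key 4
        String restored = shiftDigits(cipherText, 6); // decrypt with key 6
        System.out.println(cipherText); // 554
        System.out.println(restored);   // 110
    }
}
```

Encryption and decryption use the same algorithm with different keys, and it only works because 4 + 6 overflows back to 0.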

Now suppose there are bad actors along the route who can see our messages. We can encrypt, but symmetric encryption leaves one hard problem. Isn't symmetric encryption already secure, so why do we need asymmetric encryption? One very important reason is the question of how to get the key to the other party. It is not like handing you a house key when we meet; I have to send the key to you. If couriers routinely opened parcels to look inside, would you dare mail your house key? You would not, and communication on the Internet is the same. Once both sides hold the key, symmetric encryption works perfectly well; the difficulty is delivering the key. And often two people decide to chat on the spur of the moment, for example by adding each other as friends. Should I run to your house to hand over a key first? This is a major benefit of asymmetric encryption: I can transmit the key directly over the network without any security risk.
Here is why there is no risk. A sends its encryption key to B, and B sends its encryption key to A. When A wants to send a message to B, A encrypts the original data with B's encryption key; B receives the ciphertext and decrypts it with B's own decryption key to get the original data. B sends messages to A in the same way.
Now suppose C intercepts the ciphertext. C certainly cannot read it. What if C also captured the two encryption keys while they were being transmitted? C still cannot read it. The ciphertext from A to B was encrypted with B's encryption key, and no key that C holds can decrypt it, because asymmetric encryption does not use the same key for encryption and decryption. Capturing the two encryption keys is useless. The decryption key stays in its owner's hands: it is never given to anyone and never sent anywhere. The encryption key can be disclosed freely. The key you disclose is called the public key, and the decryption key is called the private key.
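As a rough sketch of the A-to-B exchange, the standard Java security API can generate a key pair and encrypt with the public half. The class name RsaDemo, the 2048-bit key size, and the OAEP padding choice are assumptions of this example, not something the notes prescribe.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

public class RsaDemo {
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // B generates the pair, publishes the public key, and keeps the private key.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone holding the public key can encrypt.
    static byte[] encrypt(PublicKey publicKey, byte[] plain) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, publicKey);
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the holder of the private key can decrypt.
    static byte[] decrypt(PrivateKey privateKey, byte[] cipherText) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, privateKey);
            return cipher.doFinal(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keysOfB = newKeyPair();
        byte[] fromA = encrypt(keysOfB.getPublic(), "hello B".getBytes(StandardCharsets.UTF_8));
        byte[] readByB = decrypt(keysOfB.getPrivate(), fromA);
        System.out.println(new String(readByB, StandardCharsets.UTF_8)); // hello B
    }
}
```

C may copy both the public key and the ciphertext, but the private key never leaves B, so C has nothing to decrypt with.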

Extended use of asymmetric encryption: digital signature

Can the public key undo what the private key did?

We know the private key can decrypt what the public key encrypted. Does it work the other way round, so that the public key can decrypt what the private key encrypted? Yes, it does.
One thing to note, though: the roles of the public key and the private key cannot be swapped. In many algorithms the public key can be computed from the private key. In the elliptic-curve algorithm, which Bitcoin uses, the public key is derived from the private key, so holding the private key amounts to holding both. The two keys are peers in the data transformation, but because the public key can often be computed by others, it is the one to publish. Publishing it does no harm, as long as the private key stays private and cannot be computed from what you published.
RSA is another case: in practice everyone uses the same value for part of the public key. Not the whole public key, but its most important component, is the same for you and for me; as I remember it, the value is 65537. So the public key and the private key cannot be exchanged. In the encryption math, however, they are equivalent: what one encrypts, the other can decrypt.
This property is what makes signing and verification possible.

Signing and verification:

A common use is digital (electronic) signing online. I have a key pair of my own. Suppose I state that I owe Xiaozhuang a hundred dollars. I can sign that by hand, or I can sign it electronically. How does electronic signing work? I transform the statement with my private key using the encryption algorithm. Anyone who receives the result applies my public key to it, and if that restores the original statement, it proves the statement was indeed written by me.

In practice there is one more step. Holding only the signature data is inconvenient, because you would have to recover the original with the public key every time you want to read it. So the original data and the signature data are usually sent together: you read the original data directly, and you verify it by checking that the signature data, transformed with the public key, restores the same data.
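The sign-then-verify idea can be shown with textbook RSA and deliberately tiny numbers. This is only a sketch: the modulus 3233 (61 x 53), public exponent 17, and private exponent 2753 are a classic classroom key chosen for this example, far too small for real use, and real signatures also add padding and hashing.

```java
import java.math.BigInteger;

public class ToyRsaSignature {
    static final BigInteger N = BigInteger.valueOf(3233); // modulus, 61 * 53
    static final BigInteger E = BigInteger.valueOf(17);   // public exponent (published)
    static final BigInteger D = BigInteger.valueOf(2753); // private exponent (kept secret)

    // The signer transforms the message with the private key.
    static BigInteger sign(BigInteger message) {
        return message.modPow(D, N);
    }

    // Anyone can transform the signature with the public key and compare
    // the result with the message that was sent alongside it.
    static boolean verify(BigInteger message, BigInteger signature) {
        return signature.modPow(E, N).equals(message);
    }

    public static void main(String[] args) {
        BigInteger message = BigInteger.valueOf(65);
        BigInteger signature = sign(message);
        System.out.println(verify(message, signature));                // true
        System.out.println(verify(BigInteger.valueOf(66), signature)); // false
    }
}
```

The message and the signature travel together; the verifier never needs the private exponent.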

3. Base64

An encoding algorithm that converts binary data into a string made up of 64 characters (A-Z, a-z, 0-9, +, /).

What is binary data? Broadly speaking, all computer data is binary data: electrical signals carry only 1s and 0s and a bit has only two states, so text documents, movies, Word documents, and pictures are all binary data. One special kind is text data (character data), which is plain text such as strings or the text stored in a .txt file. Everything other than text data is what we usually call binary data in everyday conversation; that is binary data in the narrow sense. So in the narrow sense there are two types of data: text data and binary data.

Base64 code table: the values 0-25 map to A-Z, 26-51 to a-z, 52-61 to 0-9, 62 to +, and 63 to /.

2 to the 6th power is 64, so each Base64 character carries 6 bits.

How does the conversion work? It re-cuts your data. Normally a byte has 8 bits, each bit being 0 or 1. To turn the data into a string, Base64 cuts the bit stream into groups of 6 bits instead and maps each group to one character of the code table. For example, Man converts to TWFu: the bytes 01001101 01100001 01101110 are regrouped as 010011 010110 000101 101110, that is 19, 22, 5, 46, which the table maps to T, W, F, u.

After Base64 conversion the data gets larger: every 3 bytes become 4 characters.
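The regrouping can be checked in Java, both by hand and with the standard java.util.Base64 encoder. The class name Base64Demo and the helper encodeThreeBytes are made up for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Demo {
    static final String TABLE =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes exactly three bytes (24 bits) as four 6-bit groups, by hand.
    static String encodeThreeBytes(byte b0, byte b1, byte b2) {
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        StringBuilder out = new StringBuilder();
        for (int shift = 18; shift >= 0; shift -= 6) {
            out.append(TABLE.charAt((bits >> shift) & 0x3F));
        }
        return out.toString();
    }

    // The library encoder, for comparison and for inputs of any length.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] man = "Man".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeThreeBytes(man[0], man[1], man[2])); // TWFu
        System.out.println(encode(man));                              // TWFu
        System.out.println(encode(new byte[300]).length());           // 400
    }
}
```

The last line shows the growth: 300 bytes of input become 400 characters of output.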

How could the data not grow after conversion?
Then don't use Base64; use a "Base256". But why was Base64 created in the first place? Because there are not 256 commonly usable characters; if there were, nobody would need Base64. What is the purpose of Base64? To convert binary data (in the narrow sense) into text data, that is, into a string.

Uses:

Give the original data the properties of a string, so that it can be placed in a URL, saved to a text file, or sent as text through ordinary chat software (convert non-string data into a string).
Turn a string that is readable to the human eye into one that is not, to reduce the risk of casual peeping.

When is Base64 needed? Take an early scenario: e-mail has just been invented and cannot carry pictures. The computer can store pictures, so how do I send you one? I convert it with Base64. The file gets bigger, but that does not matter; it just sends more slowly. You then run the Base64 decoding algorithm, and the picture has been passed as text. Another example: the two of us build a new chat program, and with our limited skills it cannot upload pictures. No problem: I convert the picture to Base64 and send you the Base64 text, and you decode it locally to get the picture back. In both cases the channel has a restriction, as old e-mail did, and Base64 is the way around it.

Is transmitting pictures with Base64 "encryption" safer and more efficient? No. Security can only come from encryption, and Base64 is not encryption. As for efficiency, the data grows by one third after Base64 conversion, so how could it be efficient? Storing, transmitting, and reading all become slower, and it uses more bandwidth: the transfer takes one third longer, and whatever was queued behind it has to wait. Base64 is not efficient at all; on the contrary, it is inefficient. Avoid it when you can, because the data grows by a third every time you apply it.

Base58

Base58 is a variant of Base64 that removes the digit 0 and uppercase O, uppercase I and lowercase l, and the two symbols + and /. It is used for Bitcoin and other cryptocurrency addresses. What is special about such an address? It may be copied by hand, so the easily confused characters are removed; + and / are removed so that double-clicking selects the whole address for copying.

4. URL encoding

URL encoding (percent-encoding) is a separate scheme rather than a Base64 variant:
the reserved characters in a URL are encoded as a percent sign "%" followed by the byte value in hexadecimal.
Example: the reserved character & is converted to %26.

Purpose: eliminate ambiguity and avoid parsing mistakes

It is also used for Chinese text.

The address bar displays the Chinese characters, but after copying the URL they are gone, replaced by percent-encoded bytes:

https://www.google.com/search?q=%E6%A4%AD%E5%9C%86%E6%9B%B2%E7%BA%BF%E7%AE%97%E6%B3%95&oq=%E6%A4%AD%E5%9C%86%E6%9B%B2%E7%BA%BF%E7%AE%97%E6%B3%95&aqs=chrome..69i57j0l2.7739j0j7&sourceid=chrome&ie=UTF-8
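A small Java sketch of percent-encoding, using java.net.URLEncoder and URLDecoder. The class name UrlEncodingDemo is an assumption of this example; the Chinese query is written with Unicode escapes so the source file stays ASCII, and it is the same text that appears in the URL above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlEncodingDemo {
    // Percent-encodes a query value as UTF-8 bytes.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses the percent-encoding.
    static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("a&b")); // a%26b
        // The six Chinese characters of the search query, as Unicode escapes.
        String query = "\u692d\u5706\u66f2\u7ebf\u7b97\u6cd5";
        System.out.println(encode(query));
        System.out.println(decode(encode(query)).equals(query)); // true
    }
}
```

One caveat: URLEncoder implements the HTML form variant of the scheme, so it writes a space as + rather than %20.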

5. Compression and decompression

Compression: store the data in a different way to reduce the storage space.
Decompression: restore the compressed data to its original form so that it can be used.
Common compression algorithms: DEFLATE (the compression algorithm used by zip), JPEG, MP3.
Is compression encoding?
What exactly is encoding? Converting format A into format B in such a way that B can be converted back without losing or adding any information. Compression is also a kind of encoding.
The next section explains this in detail.
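DEFLATE is available in the Java standard library through java.util.zip. The following sketch compresses a byte array and restores it; the class name DeflateDemo and the 1 KB buffer size are choices made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateDemo {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish(); // no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int count = deflater.deflate(buffer);
            out.write(buffer, 0, count);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input; stop instead of looping forever
                }
                out.write(buffer, 0, count);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = new byte[64 * 64 * 3]; // a highly repetitive input
        Arrays.fill(raw, (byte) 0xFF);
        byte[] packed = compress(raw);
        System.out.println(packed.length < raw.length);               // true
        System.out.println(Arrays.equals(decompress(packed), raw));   // true
    }
}
```

Decompression returns exactly the original bytes, which is what makes this kind of compression an encoding in the sense defined above.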

6. Encoding and decoding of media data

What is media data? Pictures, video, audio, and so on. What is a codec for them? It converts the raw data into an encoded format that can be stored. For pictures, for example, the image data is encoded into a file.
How might an image be encoded?
Suppose I have a 64x64 picture that is pure white. In memory it may be a bitmap. When I want to encode and save it, how do I write it out?
ffffff represents one white pixel
ffffffffffff... (64 x 6 hex digits)
ffffffffffff... (64 x 6 hex digits)
ffffffffffff... (64 x 6 hex digits)
... 64 rows

That is the encoded picture, but it is annoyingly large. We can compress it, and compress while encoding. How? There are many compression methods. What is the essence of DEFLATE and the various other algorithms?

Example: aaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaabbbbbbbb...bbbbbb
After a rough kind of compression: text:a=100;b=20
The 64x64 white picture above could likewise become: image:64*64;ffffff=[0,0]-[63,63]
Audio and video compression works on the same idea. This is only an illustration and the scheme is not rigorous. A good compression algorithm should not make the file larger, whereas with the scheme above, once the variety of values increases, the output may end up larger than the original.
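The rough scheme in the example is a form of run-length encoding. A minimal Java sketch follows; the class name RunLengthDemo and the exact output format are made up to match the a=100;b=20 notation above.

```java
public class RunLengthDemo {
    // Collapses each run of identical characters into "char=count", joined by ';'.
    static String compress(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int run = 1;
            while (i + run < text.length() && text.charAt(i + run) == c) {
                run++;
            }
            if (out.length() > 0) {
                out.append(';');
            }
            out.append(c).append('=').append(run);
            i += run;
        }
        return out.toString();
    }

    // Builds a string of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress(repeat('a', 100) + repeat('b', 20))); // a=100;b=20
        System.out.println(compress("abab")); // a=1;b=1;a=1;b=1
    }
}
```

The second call shows the weakness mentioned above: with no long runs, the 4-character input grows to 15 characters.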

Lossy compression and lossless compression: you can also reduce the number of pixels or the number of colors, in which case some data is lost after compression. That is lossy compression; lossless compression restores the original exactly.

7. Serialization

Suppose there is an object in my Java program's memory, and it has several attributes.

It lives in memory, and I need to send it out: over the network, or to local storage on the phone. But memory is messy, with all kinds of internal formats, so how do I store it? I convert it into a linear format that can be parsed generically. That is serialization.
Such an object can be serialized to JSON. JSON is only one option; you could also serialize it to XML and so on. All that matters is that the result is linear and can be stored and transmitted.

Serialization: the process of converting data objects (usually in memory, such as objects in the JVM) into a byte sequence.
Deserialization: converting the byte sequence back into objects in memory.
Purpose: to exchange in-memory data with the outside world (storage and transmission)
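As one possible illustration, Java's built-in object streams turn an in-memory object into a byte sequence and back. The User class and its fields are hypothetical, invented for this sketch; JSON or XML would be equally valid target formats.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    // A hypothetical in-memory object with a couple of attributes.
    static class User implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Object in memory -> linear byte sequence.
    static byte[] serialize(User user) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Byte sequence -> a new object in memory.
    static User deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (User) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new User("Ann", 30));
        User restored = deserialize(data);
        System.out.println(restored.name + " " + restored.age); // Ann 30
    }
}
```

The byte array is what would be written to storage or sent over the network.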

Does serialization belong to encoding?

Strictly speaking, serialization is not encoding. Encoding converts format A into format B, where both are well-defined formats. The source of serialization is not a format; it is a pile of memory. So it is not encoding in the strict sense, although the word encoding is not always used that strictly.

8. Hash

Definition:

Convert data of any size into data within a specified size range (usually very small).
For example, we have two hundred classmates and give each student a number: 001, 002, 003... That process is a hashing process, and each person's number is a hash value.

Role:

Serves as a digest, a digital fingerprint of the data

For example, use the length of a string directly as its hash:
"haha" -> 4
"pa" -> 2
The value after the arrow is the hash value. This is a very bad hash, because
"hehe" -> 4
collides with "haha". A hash is meant to identify data, so it needs a very low collision rate, and that is what the design of a hash algorithm is about. The input may be very large or very small; in every case you need to get the result quickly and without collisions.
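The bad length hash is easy to reproduce, and Java's own String.hashCode() makes a handy comparison. The class name LengthHashDemo and the method lengthHash are invented for this sketch.

```java
public class LengthHashDemo {
    // The "very bad hash" from the example: the string's length.
    static int lengthHash(String text) {
        return text.length();
    }

    public static void main(String[] args) {
        // Different strings, same hash value: a collision.
        System.out.println(lengthHash("haha") == lengthHash("hehe")); // true
        // String.hashCode() mixes in every character, so these two differ.
        System.out.println("haha".hashCode() == "hehe".hashCode());   // false
    }
}
```

String.hashCode() still collides for some inputs, since it has only 32 bits, but far less often than the length does.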

Classic algorithms:

MD5 (basically abandoned for security purposes because it is too easy to break), SHA1, SHA256

Practical uses:

Data integrity verification:

For example, when we download an installation package, the publisher may list a hash value for verification and state whether it is SHA1, SHA256, or MD5; sometimes several values are provided. What is it for? The uploader has a source file of, say, 5 GB, whose MD5 comes out as 7788. Your downloaded copy may be damaged: the download tool may have gone wrong along the way, or someone may have tampered with your network traffic. So after downloading, you compute the hash of your copy as well. If its MD5 is also 7788, the file you downloaded is intact; if it comes out as 2567, the download failed. A hash is like extracting a characteristic value from a pile of data. The same data gives the same result however many times you compute it, so the value can serve as a fingerprint.
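The download check can be sketched with java.security.MessageDigest. The class name ChecksumDemo and the helper isIntact are assumptions of this example, and SHA-256 is used here in place of MD5.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChecksumDemo {
    // Computes the SHA-256 digest of the data as a lowercase hex string.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compares the hash of the downloaded bytes with the value the publisher listed.
    static boolean isIntact(byte[] downloaded, String publishedHex) {
        return sha256Hex(downloaded).equalsIgnoreCase(publishedHex);
    }

    public static void main(String[] args) {
        byte[] original = "installer bytes".getBytes(StandardCharsets.UTF_8);
        String published = sha256Hex(original);
        byte[] damaged = "installer bytez".getBytes(StandardCharsets.UTF_8);
        System.out.println(isIntact(original, published)); // true
        System.out.println(isIntact(damaged, published));  // false
    }
}
```

For a real multi-gigabyte file you would feed the digest in chunks with update() rather than load everything into one array.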

(I will revise this part if my understanding turns out to be wrong.)

Quick lookup:

This is where hashCode() and HashMap come in.
In Java, when you override the equals method you must also override the hashCode method.

hashCode() is used as a quick pre-check for equality: if two hash codes differ, the objects cannot be equal, so equals() only needs to run when the hash codes match.
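A short sketch of the equals/hashCode contract. The Point class is hypothetical, invented for this example; without the hashCode override, the HashMap lookup below would miss.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class HashCodeDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Point)) return false;
            Point p = (Point) other;
            return x == p.x && y == p.y;
        }

        // Equal objects must return equal hash codes, so hash the same fields equals() compares.
        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Map<Point, String> labels = new HashMap<>();
        labels.put(new Point(1, 2), "home");
        // A different but equal Point finds the entry: the map narrows the search
        // by hash code first, then confirms the match with equals().
        System.out.println(labels.get(new Point(1, 2))); // home
    }
}
```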

Privacy protection:

A few years ago a website, a technical blogging site in fact, had its user database dumped. Afterwards many people pointed out that it had stored passwords in plaintext, which leaked users' private data. Why does that matter? When user data is stolen and the passwords are in plaintext, the thief will try your username and password on other websites, and many more accounts get compromised. So what is non-plaintext storage? When the server receives the account and password, it converts the password with MD5 and stores only the result. At the next login, the submitted password is converted with MD5 again and compared with the stored value. After a theft, the thief cannot use the stored values to log in to other websites.
There is also something called salting. Attackers improve too. Why is a hashed password harder to exploit? First, a hash cannot be submitted to the website as a password, because it would be hashed again into a different value; second, the original password cannot be computed backwards from it. But attackers have plenty of time. They hash commonly used passwords one by one and store the mapping, and after stealing a database they look the hashes up. For example, if the MD5 of 123456 is ddddd, the attacker's table says that ddddd corresponds to 123456. Such a precomputed mapping table is called a rainbow table, and it can crack hash-stored passwords to a certain extent. Salting defeats it. What is salting? Each website defines a salt of its own and keeps it strictly confidential; it would be miserable if the salt were taken along with the database. When computing the MD5 or SHA1, instead of hashing 123456 you append the salt, say 333, and hash 123456333. The result is a completely different value. Every website's salt is different, so the rainbow table fails. In practice the salt is not 333 but a long string of unusual characters.
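A minimal sketch of salted hashing follows. The class name SaltedHashDemo, the use of SHA-256, and the 16-byte random salt are assumptions of this example. It generates a fresh salt per stored password, which is a common variation on the per-website salt described above, and production systems generally prefer a deliberately slow function such as PBKDF2 or bcrypt over a single hash pass.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class SaltedHashDemo {
    // A long random salt instead of something guessable like "333".
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Hashes password + salt, as in the 123456333 example.
    static String saltedHash(String password, byte[] salt) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(password.getBytes(StandardCharsets.UTF_8));
            digest.update(salt);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Login check: hash the attempt with the same salt and compare with the stored value.
    static boolean matches(String attempt, byte[] salt, String storedHash) {
        return saltedHash(attempt, salt).equals(storedHash);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        String stored = saltedHash("123456", salt);
        System.out.println(matches("123456", salt, stored)); // true
        System.out.println(matches("654321", salt, stored)); // false
    }
}
```

Because the salt changes the hash of even a common password, a table precomputed for plain 123456 no longer matches the stored value.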

Is Hash encoding?

No. What is encoding? You convert the data one way and can convert it back without any loss. What is a hash? A hash extracts characteristics of the data and cannot be reversed.

Is a hash encryption? People say MD5 is "irreversible encryption".
It is not encryption. Baidu Baike is mistaken; it is an irreversible transformation. Someone coined the term "irreversible encryption", and it sounds reasonable, but it changes the meaning of the word. Encryption means converting format A into format B that others cannot understand and that can be restored. If you stretch the word to mean only "make it unreadable to others", then fine, MD5 is an irreversible encryption. But the word has its own definition and should not be used loosely.

9. Hash and asymmetric encryption

The signing scheme described earlier has a shortcoming: the signature data is as large as the original data, because the original must be recoverable from it. That is fine for a small file, but for a 10 GB video the signature would also be 10 GB, which is bloated. In practice the data is hashed first and the hash value is signed.
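Java's java.security.Signature class performs exactly this hash-then-sign combination. The sketch below uses the standard algorithm name SHA256withRSA; the class name HashThenSignDemo and the 2048-bit key size are choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class HashThenSignDemo {
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashes the data with SHA-256 and signs the hash with the private key.
    static byte[] sign(PrivateKey privateKey, byte[] data) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Re-hashes the received data and checks it against the signature with the public key.
    static boolean verify(PublicKey publicKey, byte[] data, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newKeyPair();
        byte[] data = new byte[1_000_000]; // stands in for a large file
        byte[] signature = sign(keys.getPrivate(), data);
        System.out.println(signature.length);                           // 256
        System.out.println(verify(keys.getPublic(), data, signature));  // true
        data[0] = 1; // tamper with the data
        System.out.println(verify(keys.getPublic(), data, signature));  // false
    }
}
```

With a 2048-bit key the signature is 256 bytes no matter how large the data is.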

10. Character set

Meaning: a map from integers to the text symbols of the real world

Varieties:

ASCII:

128 characters, 1 byte (a byte can hold 256 values, but ASCII uses only 128 of them)

ISO-8859-1:

Extends ASCII, 1 byte

Unicode:

More than 130,000 characters, multi-byte
UTF-8: an encoding of Unicode
UTF-16: an encoding of Unicode
What does "an encoding of Unicode" mean?
For example, take three characters: 中国人.
Suppose "中" corresponds to 00000001,
"国" corresponds to 00001111, and
"人" corresponds to 11111111.

Those are their code values, but when actually storing them we may not write them that way:
中 01
国 001111
人 11111111
This is a bit shorter; it might be UTF-8.

Another way might be:
中 0001
国 1111
人 11111111
This might be UTF-16.

That is, the character set may be the same while the concrete encodings differ; these are different encodings. The bit patterns above are made up, of course. One real difference between UTF-8 and UTF-16 is that UTF-16 uses at least 16 bits per character, so it has nothing as short as the patterns above. They are only an illustration.
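The real byte counts are easy to inspect in Java. In this sketch the class name CharsetDemo and its helpers are invented, and the character 中 is written as the Unicode escape \u4e2d so that the source file stays ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Number of bytes the text occupies in the given encoding.
    static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // The encoded bytes as a lowercase hex string.
    static String hex(String text, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            out.append(String.format("%02x", b));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the same Unicode character in both encodings
        System.out.println(hex(zhong, StandardCharsets.UTF_8));        // e4b8ad
        System.out.println(hex(zhong, StandardCharsets.UTF_16BE));     // 4e2d
        System.out.println(byteCount("A", StandardCharsets.UTF_8));    // 1
        System.out.println(byteCount("A", StandardCharsets.UTF_16BE)); // 2
    }
}
```

The character set (Unicode) is the same in every line; only the byte representation differs.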

GBK

GBK/GB2312/GB18030: standards developed in China, multi-byte, each both a character set and an encoding

Origin blog.csdn.net/qq_36333289/article/details/109049480