SDU Crypto School - Computational Indistinguishability 1

Encryption: Computational security 1-4

Speaker: Li Zengpeng (Shandong University)

Reference materials: Jonathan Katz, Yehuda Lindell, Introduction to Modern Cryptography - Principles and Protocols.


What is encryption

First, the purpose of an encryption scheme is to enable secure communication between Alice and Bob. Specifically, for any given plaintext $m$, Alice encrypts it to get $c$ and sends it to Bob over an insecure channel. Bob decrypts $c$ to recover $m$, while a third party, Eve, cannot obtain any information about $m$.



To achieve this, many so-called cipher schemes have been devised since ancient times, such as the Caesar cipher, the Vigenère cipher, the rail fence cipher, the Enigma machine, and so on.

One common characteristic of these schemes is that no rigorous theory guarantees their security. They are less the science of cryptography than the art of cryptography, carrying the brilliant imagination of the ancients.

The second common feature of these schemes is that they have all been broken. According to Prof. Ding Jintai, the Argentines were still using a cipher system similar to the Enigma machine during the Falklands War, so their battlefield information was presumably one-way transparent to the British.


Provable security is an important step in cryptography from art to science.

Another principle in designing modern cryptosystems is that the security of a cryptosystem should not depend on keeping the encryption scheme or algorithm secret, but only on keeping the key secret (Kerckhoffs's principle, 1883).

The reasons are as follows:

  1. It is easier to protect a key than an algorithm;
  2. It is easier to change a key than an algorithm;
  3. A standardized encryption algorithm is easier to deploy widely;
  4. A public algorithm can be analyzed openly, which helps uncover vulnerabilities.

So we can rewrite the general picture of the encryption system according to the Kerckhoffs principle:

  • $m$ is the plaintext message
  • the key $k$ is drawn from a uniform distribution
  • $c = Enc_k(m)$ is exposed to the attacker

Note that, compared with the most general picture of an encryption system, here we introduce the key $k$, and we hope that the attacker cannot obtain any information about $k$ or $m$.


Perfect security

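Some notation: $\mathcal K$, $\mathcal M$, and $\mathcal C$ denote the key space, plaintext space, and ciphertext space, and $\Pi = (Gen, Enc, Dec)$ denotes an encryption scheme consisting of key generation, encryption, and decryption algorithms.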

The question now is, how do we define the security of a cryptographic system within the framework of the Kerckhoffs principle?


Idea one:

Definition. If the attacker has no way to compute $k$, then the scheme is secure.

It is true that the key stays unknown to others... but is that enough?

A counterexample is $Enc_k(m) = m$. In this case no information about $k$ is exposed, but $m$ is completely naked, and the encryption protects about as much as the emperor's new clothes.



Idea two:

Definition. If the attacker has no way to compute $m$, then the scheme is secure.

But if the attacker only needs to compute some features of $m$ from the ciphertext in order to infer enough information, is the scheme still secure?

For example, suppose I have to guess whether a place name written with four Chinese characters is Qiqihar or Urumqi. Knowing just a few of the four characters, or merely whether any character is repeated, is enough to guess the answer.


Idea three:

Definition. If the attacker cannot obtain any information about $m$, then the scheme is secure.

But the attacker may already have some prior information, such as knowing that the message $m$ is written in English.


Idea four:

Definition. If the attacker cannot obtain any additional information about $m$, then the scheme is secure.

This is more reasonable, but how do we express it formally?

An example is described using probability distributions/information theory:

Before the communication: $m$ is in Alice's hands, and the attacker knows nothing about $m$ beyond a prior distribution over $m$.

After the communication: $m$ is still in Alice's hands, and the attacker knows $c = Enc_k(m)$. Conditioned on knowing $c$, the attacker's distribution over $m$ has not changed at all from the prior. This is what "the attacker cannot obtain any additional information about $m$" means.

Thus, we get a definition of perfect security formulated with conditional probability: for every plaintext and every ciphertext, the attacker's distribution over $m$ given $c$ equals its prior.
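
In symbols, this condition is usually written as follows. This is the standard formulation from Katz and Lindell; the random variables $M$ (plaintext) and $C$ (ciphertext) are notation assumed here, since the original formula did not survive in the text.

$$\forall m \in \mathcal M,\ \forall c \in \mathcal C \text{ with } \Pr[C = c] > 0: \qquad \Pr[M = m \mid C = c] = \Pr[M = m]$$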

Equivalently, we have several definitions:

  • The distribution of the plaintext $m$ and the distribution of $Enc_k(m)$ are independent.
  • The distribution of $Enc_k(m)$ does not depend on $m$.
  • For any $m_0, m_1$, the ciphertexts $Enc_k(m_0)$ and $Enc_k(m_1)$ have the same distribution.

An equivalent definition of perfect security is adversarial indistinguishability. Here, we introduce an adversary $\mathcal{A}$ and define what indistinguishability against $\mathcal{A}$ means (roughly speaking, given plaintexts $m_0, m_1$, one of which is picked at random and encrypted to get $c$, $\mathcal{A}$ has no way of knowing which of $m_0, m_1$ is the plaintext corresponding to $c$).

To define perfect security this way, we introduce an indistinguishability experiment, or game, denoted $PrivK^{eav}_{\mathcal A, \Pi}$. Here $\Pi = (Gen, Enc, Dec)$ is a cryptosystem. The game considers only an eavesdropping attacker $\mathcal A$, who has a single ciphertext $c$ and tries to determine some information about the plaintext from it.

Definition. $PrivK^{eav}_{\mathcal A, \Pi}$ is defined as follows:

  1. Initialize the encryption scheme (e.g., generate the key $k$).
  2. The attacker outputs a pair of plaintexts $m_0, m_1 \in \mathcal M$.
  3. Without the attacker's knowledge, the communicating party picks $b \in \{0, 1\}$ uniformly at random and returns $c = Enc_k(m_b)$.
  4. Based on $c$, the attacker outputs a guess $b'$ of $b$. Note that $b'$ may be drawn from a probability distribution.
  5. If $b = b'$, the attacker wins (return 1); otherwise the attacker loses (return 0).

If the system is secure, then no matter how the attacker processes $c$, the final result is no better than random guessing.

So we can define indistinguishability:

Definition. A cryptosystem $(Gen, Enc, Dec)$ defined over the plaintext space $\mathcal M$ is indistinguishable if, for every attacker $\mathcal A$:

$$P[PrivK^{eav}_{\mathcal A, \Pi} = 1] = \frac{1}{2}$$
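
The five steps of the game can be sketched in Python. This is a minimal illustration rather than code from the lecture: the names `privk_eav`, `toy_gen`, `identity_enc`, and `ReadPlaintextAdversary` are hypothetical, and the scheme plugged in is the insecure counterexample $Enc_k(m) = m$ from earlier, against which the attacker wins every round instead of half of them.

```python
import secrets

def privk_eav(gen, enc, adversary):
    """Run one round of the eavesdropping indistinguishability game.

    Returns 1 if the adversary guesses the hidden bit b, otherwise 0.
    """
    k = gen()                           # step 1: initialize the scheme
    m0, m1 = adversary.choose()         # step 2: attacker picks two plaintexts
    b = secrets.randbelow(2)            # step 3: hidden uniform bit
    c = enc(k, (m0, m1)[b])             #         challenge ciphertext
    b_guess = adversary.guess(c)        # step 4: attacker outputs b'
    return 1 if b_guess == b else 0     # step 5: win (1) or lose (0)

# Hypothetical toy scheme: the counterexample Enc_k(m) = m (key is ignored).
def toy_gen():
    return secrets.randbits(8)

def identity_enc(k, m):
    return m

class ReadPlaintextAdversary:
    """Attacker that simply reads the plaintext off the 'ciphertext'."""
    def choose(self):
        return 0, 1

    def guess(self, c):
        return c  # c equals m_b, and m_b == b for the pair (0, 1)

wins = sum(privk_eav(toy_gen, identity_enc, ReadPlaintextAdversary())
           for _ in range(1000))
print(wins / 1000)  # empirical win rate against the identity "cipher"
```

A scheme satisfying the definition would hold every adversary to a win rate of exactly 1/2 in expectation.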


One-time pad

In reality, there is an encryption scheme with perfect security: One Time Pad (OTP).

For a binary string $m$, we randomly generate a binary string $k$ of the same length as $m$, where each bit of $k$ is independent and uniformly distributed. Then $c = Enc_k(m) = m \oplus k$, where $\oplus$ denotes the XOR operation.

The security of the OTP is easy to see: for every $m$, each bit of $c$ is an independent uniform bit, so $c$ is uniformly distributed regardless of $m$.
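
For very short strings, this claim can be checked by brute force. The sketch below is only an illustration under an assumed 3-bit message length; `otp_enc`, `otp_dec`, and `ciphertext_distribution` are hypothetical helper names. It enumerates every key and confirms that each plaintext produces exactly the same (uniform) ciphertext distribution.

```python
from collections import Counter
from itertools import product

N_BITS = 3  # small enough to enumerate every key

def otp_enc(k, m):
    """One-time pad on bit tuples: c = m XOR k, bit by bit."""
    return tuple(mi ^ ki for mi, ki in zip(m, k))

def otp_dec(k, c):
    """Decryption is the same XOR, since (m XOR k) XOR k = m."""
    return tuple(ci ^ ki for ci, ki in zip(c, k))

all_strings = list(product((0, 1), repeat=N_BITS))

def ciphertext_distribution(m):
    """Count how often each ciphertext occurs when k ranges over all keys."""
    return Counter(otp_enc(k, m) for k in all_strings)

distributions = [ciphertext_distribution(m) for m in all_strings]

# Every plaintext yields the same distribution: each ciphertext exactly once.
uniform = Counter({c: 1 for c in all_strings})
print(all(d == uniform for d in distributions))
```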

We generalize the OTP:

Let $(G, +)$ be an additive group, and let $\mathcal K = \mathcal M = \mathcal C = G$. Then

  • $Enc_k(m) = m + k$
  • $Dec_k(c) = c - k$

constitute the one-time pad.
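
As a small sketch of the generalized scheme, take the additive group $\mathbb{Z}_{26}$. This choice of group, and the function names `gen`, `enc`, `dec`, are assumptions made for illustration. The checks confirm correctness and that, for each fixed $m$, the map $k \mapsto m + k$ hits every group element exactly once.

```python
import secrets

N = 26  # hypothetical choice: the additive group Z_26 (e.g. letters a..z)

def gen():
    """Sample a key uniformly from the group."""
    return secrets.randbelow(N)

def enc(k, m):
    return (m + k) % N

def dec(k, c):
    return (c - k) % N

# Correctness: Dec_k(Enc_k(m)) = m for every key and message.
print(all(dec(k, enc(k, m)) == m for k in range(N) for m in range(N)))

# For each m, k -> m + k is a bijection of the group, so a uniform key
# gives a uniform ciphertext regardless of m.
print(all(sorted(enc(k, m) for k in range(N)) == list(range(N))
          for m in range(N)))
```

As with the binary OTP, a fresh uniform key is needed for every group element that is encrypted.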

But OTP is not practical because:

  • The key must be as long as the plaintext.

  • The key is one-time: it must be destroyed after use, otherwise information may leak.

    For example, in the OTP, if the same key encrypts $m_1, m_2$ to $c_1, c_2$, then the attacker learns $m_1 \oplus m_2 = c_1 \oplus c_2$, which leaks information about the plaintexts.
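
The key-reuse leak is easy to reproduce. In this sketch the two messages and the helper `xor_bytes` are made up for illustration; the point is only that the key cancels out of $c_1 \oplus c_2$.

```python
import secrets

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"attack at dawn"
m2 = b"retreat at six"                 # same length as m1
k = secrets.token_bytes(len(m1))       # one key, wrongly used twice

c1 = xor_bytes(m1, k)
c2 = xor_bytes(m2, k)

# The key cancels out: c1 XOR c2 == m1 XOR m2, with no knowledge of k.
print(xor_bytes(c1, c2) == xor_bytes(m1, m2))

# If the attacker later learns m1, then m2 follows immediately.
print(xor_bytes(xor_bytes(c1, c2), m1))
```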

In addition, we can prove that any perfectly secure cryptosystem must satisfy $|\mathcal K| \geq |\mathcal M|$.

(The reason is simple: if $|\mathcal K| < |\mathcal M|$, an attacker who originally had to guess $m$ can now guess $k$ instead. Guessing $k$ succeeds with higher probability than guessing $m$, which means that the encryption process leaks information.)
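
The informal argument can be made precise. The following is a sketch of the standard proof as given in Katz and Lindell; the set $\mathcal M(c)$ and the random variables $M$, $C$ are notation introduced here. Suppose $|\mathcal K| < |\mathcal M|$, let $M$ be uniform over $\mathcal M$, and fix any ciphertext $c$ with $\Pr[C = c] > 0$. Then:

$$\mathcal M(c) = \{\, Dec_k(c) : k \in \mathcal K \,\}, \qquad |\mathcal M(c)| \leq |\mathcal K| < |\mathcal M|$$

$$\Rightarrow\ \exists\, m' \in \mathcal M \setminus \mathcal M(c): \qquad \Pr[M = m' \mid C = c] = 0 \neq \frac{1}{|\mathcal M|} = \Pr[M = m']$$

So seeing $c$ rules out at least one plaintext, which contradicts perfect security.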

Therefore, as far as key length is concerned, the OTP is already the most efficient scheme among perfectly secure ones.


"Can you stop, Azu?"


Limiting the ability of attackers: indistinguishability, semantic security

One idea to make a more practical cryptographic system is to limit the capabilities of the attacker.

Restriction methods include:

  1. Limiting the computational power of attackers - computational security.
  2. Other methods: Quantum cryptography, memory-bound models, etc.

The characteristic of quantum cryptography is that the attacker (Eve) cannot eavesdrop without the communicating parties (Alice and Bob) noticing. As soon as the attacker eavesdrops, the quantum state is disturbed, and the communicating parties can detect the eavesdropping from the content transmitted via the quantum states.


(It should be noted that quantum cryptography and quantum computing are not the same thing)


The mainstream approach is to limit the attacker's computing power. After all, if an attacker can mount a brute-force attack regardless of the cost, then any practical cipher (one whose keys are shorter than its messages) can in principle be broken.

Here comes the question: how to define the limit on the attacker's computing power?

Can the attacker buy several V100 GPUs and run them until they wear out, or rent several cloud servers for a year?

Even with these computing resources fixed, what attack algorithm does the attacker run?

So a definition in such concrete terms is not very elegant in theory, and it is not convenient for analysis.

Next, we abstract the calculation model. Specifically, we can use a Turing machine to describe the attacker's capabilities:

A system $X$ is $(t, \epsilon)$-secure if every Turing machine running for at most $t$ steps breaks it with probability at most $\epsilon$.

In fact, there are many Turing machine models, and essentially all of them can be proved equivalent to the most general Turing machine. The Church-Turing thesis further tells us that any algorithmically computable problem is also computable by a Turing machine. So we can abstract the various Turing-machine-based models into algorithms and carry out all discussions in terms of algorithms, which spares us from working out how the tapes and read-write heads of Turing machines move around.


For the aforementioned $(t, \epsilon)$-security, we make the two parameters concrete.

  • The Turing machine runs for $t$ steps, where $t$ should correspond to "efficient computation"
  • $\epsilon$ is a very small number (close to 0)

The way to define these things is to use the idea of asymptotic analysis.

Let's state the conclusion directly. What we consider small is a function of $n$ that is eventually smaller than the reciprocal of any polynomial, that is, smaller than every $1/poly(n)$.
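
As a numerical illustration, take the function $2^{-n}$ and compare it with $1/n^c$ for a few exponents $c$. The helper name `last_exception` and the search limit are choices made for this sketch, and a finite search only illustrates the idea; it proves nothing about all $n$.

```python
def last_exception(c, search_limit=1000):
    """Largest n <= search_limit with 2**(-n) >= n**(-c), i.e. 2**n <= n**c.

    Beyond the returned n (within the searched range), 2**(-n) stays
    strictly below 1/n**c. Returns 0 if there is no exception at all.
    """
    worst = 0
    for n in range(1, search_limit + 1):
        if 2 ** n <= n ** c:   # exact integer comparison, no rounding
            worst = n
    return worst

for c in (1, 2, 5):
    print(c, last_exception(c))
```

A larger exponent $c$ pushes the threshold further out, but for each fixed polynomial there is some point after which $2^{-n}$ stays below its reciprocal.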

What we think of as efficient computations are computations with polynomial complexity.

This notion of "small", defined via polynomials, has good closure properties.
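
For reference, the usual formal statement of this notion and two of its standard closure properties, following Katz and Lindell, can be written as below. The symbols $p$, $N$, $\epsilon_1$, $\epsilon_2$ are notation chosen here, and whether these are exactly the properties the lecture listed is an assumption.

$$\epsilon \text{ is small (negligible)} \iff \forall \text{ polynomials } p\ \ \exists N\ \ \forall n > N:\ \ \epsilon(n) < \frac{1}{p(n)}$$

$$\epsilon_1, \epsilon_2 \text{ negligible} \ \Rightarrow\ \epsilon_1 + \epsilon_2 \text{ negligible}$$

$$\epsilon \text{ negligible},\ p \text{ a positive polynomial} \ \Rightarrow\ p \cdot \epsilon \text{ negligible}$$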


Based on the above, we further define computational security. First, we define the private-key encryption scheme:

Definition. A private-key encryption scheme consists of three probabilistic polynomial-time (PPT) algorithms $(Gen, Enc, Dec)$, where:

  1. $Gen$: takes the security parameter $n$ as input and returns a key $k$. It is generally assumed that $|k| \geq n$.
  2. $Enc$: takes a key $k$ and a plaintext message $m \in \{0, 1\}^* = \bigcup_{N=1}^\infty \{0, 1\}^N$ as input and outputs a ciphertext, which may be randomized.
  3. $Dec$: takes a key $k$ and a ciphertext $c$ as input, and outputs $m$ or an error.

Requirement: for any $n, k, m$, we have $Dec_k(Enc_k(m)) = m$.

We still make no assumptions about the attacker's strategy, for example which algorithm the attacker uses. The security definition we devise next defends against any attack within the given computing power.

In perfect security, the attacker cannot obtain any information about the plaintext from the ciphertext. After limiting the computational power of the attacker, we will define semantic security. Semantic security is not so convenient to work with, so we have another notion called indistinguishability. Indistinguishability is equivalent to semantic security, but it is much easier to use.

We set up a similar indistinguishability game when the attacker's capabilities are limited. Compared with the game introduced for perfect security, we change two things:

  1. The attacker is restricted to polynomial time.
  2. The attacker's success probability may exceed $\frac{1}{2}$ only by a small amount: it is bounded by $\frac{1}{2} + \epsilon(n)$.

So we get the game:

Definition. The adversarial indistinguishability experiment $PrivK^{eav}_{\mathcal A, \Pi}(n)$:

  1. The attacker $\mathcal A$, given the security parameter, outputs two plaintexts $m_0, m_1$ with $|m_0| = |m_1|$.
  2. The defender generates the key $k$ and a random bit $b$, and computes $c = Enc_k(m_b)$.
  3. After receiving $c$, the attacker computes and outputs a bit $b'$.
  4. If $b' = b$, the attacker wins, written $PrivK^{eav}_{\mathcal A, \Pi}(n) = 1$.

Based on this we have the definition of computational indistinguishability:

Definition. An encryption scheme $\Pi = (Gen, Enc, Dec)$ is computationally indistinguishable under eavesdropping attacks (EAV-secure) if, for every PPT attacker $\mathcal A$, there exists a sufficiently small function $\epsilon$ such that for all $n$,

$$P[PrivK^{eav}_{\mathcal A, \Pi}(n) = 1] \leq \frac{1}{2} + \epsilon(n)$$

where $\epsilon(n)$ is a small quantity in the sense defined earlier.
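
To see the definition reject a flawed scheme, here is a minimal sketch of $PrivK^{eav}_{\mathcal A, \Pi}(n)$. Everything named below is hypothetical: the toy scheme `RepeatKeyScheme` stretches an $n$-bit key by repeating it to cover a $2n$-bit message, and `HalvesAdversary` exploits exactly that. The adversary wins every round, so the bound $\frac{1}{2} + \epsilon(n)$ fails and the toy scheme is not EAV-secure.

```python
import secrets

class RepeatKeyScheme:
    """Toy (insecure) scheme: XOR a 2n-bit message with the key repeated twice."""

    @staticmethod
    def gen(n):
        return [secrets.randbelow(2) for _ in range(n)]   # uniform n-bit key

    @staticmethod
    def enc(k, m):
        pad = k + k                                       # key used twice
        return [mi ^ pi for mi, pi in zip(m, pad)]

    @staticmethod
    def dec(k, c):
        return RepeatKeyScheme.enc(k, c)                  # XOR is its own inverse

class HalvesAdversary:
    """Distinguishes by checking whether the two ciphertext halves agree."""

    def choose(self, n):
        m0 = [0] * (2 * n)         # equal halves   -> equal ciphertext halves
        m1 = [0] * n + [1] * n     # unequal halves -> unequal ciphertext halves
        return m0, m1              # |m0| = |m1|, as the game requires

    def guess(self, c):
        half = len(c) // 2
        return 0 if c[:half] == c[half:] else 1

def privk_eav(scheme, adversary, n):
    m0, m1 = adversary.choose(n)          # step 1: attacker picks plaintexts
    k = scheme.gen(n)                     # step 2: key, random bit, challenge
    b = secrets.randbelow(2)
    c = scheme.enc(k, (m0, m1)[b])
    b_guess = adversary.guess(c)          # step 3: attacker outputs b'
    return 1 if b_guess == b else 0       # step 4: 1 means the attacker wins

n = 16
trials = 500
wins = sum(privk_eav(RepeatKeyScheme, HalvesAdversary(), n) for _ in range(trials))
print(wins / trials)  # empirical win rate; a secure scheme would stay near 1/2
```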

An equivalent definition: suppose we split the experiment $PrivK^{eav}_{\mathcal A, \Pi}(n)$ into two shadow clones that differ only in the choice of the bit $b$, one choosing 0 and the other choosing 1. Then the attacker cannot actually tell which shadow clone it is fighting.

Writing this definition out formally is left as an exercise; see also the reference book.

Note that the encryption schemes involved here do not need to hide the length of the plaintext. For the time being, we assume that the plaintexts all have equal length. For the specific reasons behind this assumption, you may wish to refer to the relevant content in the book, since this issue does not affect the main narrative for now.


To introduce semantic security, we first take a closer look at what computational indistinguishability means.

  • Computational indistinguishability means that the attacker cannot, through any computational tricks, gain a significant advantage in guessing any particular bit of the plaintext.

  • Computational indistinguishability also means that the attacker cannot, through any computational tricks, learn any function $f(m)$ of the plaintext, where $m$ ranges over an arbitrary set $\mathcal S \subset \{0, 1\}^l$ and $f(m)$ outputs a single bit (0 or 1). That is, the probability that the attacker computes $f(m)$ from the ciphertext $c$ should not differ much from the probability when the attacker sees nothing. (Otherwise information is leaked.)

Based on the above description, we give the definition of semantic security:

Definition. An encryption scheme $(Enc, Dec)$ is semantically secure under eavesdropping attacks if, for any PPT algorithm $\mathcal A$, there is a PPT algorithm $\mathcal A'$ such that the following quantity is small (an $\epsilon(n)$ in the sense defined earlier):

$$\left| P[\mathcal A(1^n, Enc_k(m), h(m)) = f(m)] - P[\mathcal A'(1^n, |m|, h(m)) = f(m)] \right|$$

where $h(m)$ is external information that the attacker already has. In semantic security, the attacker's success in computing a function $f(m)$ of $m$ with algorithm $\mathcal A$, given $c = Enc_k(m)$ and $h(m)$, should be about the same as that of some algorithm $\mathcal A'$ computing $f(m)$ given only $h(m)$ and $|m|$.

Theorem. Computational indistinguishability and semantic security are equivalent.


Pseudorandomness and stream ciphers

A pseudorandom number generator $G$ is an efficient (polynomial-time), deterministic algorithm that transforms a short bit string (the seed) sampled from the uniform distribution into a long bit string that appears to be sampled from the uniform distribution.

The significance of studying pseudorandom numbers is to simulate randomness in a statistical sense. To detect where a pseudorandom sequence fails to be random, one can construct various statistical tests.

One problem is how to judge the randomness of pseudorandom numbers. By analogy with computational security, we can define a computational notion of randomness: a good pseudorandom number generator needs to fool all statistical tests that can be computed efficiently.

So, we give the formal definition:

Definition. Let $l$ be a polynomial and let $G$ be a deterministic polynomial-time algorithm such that for any $n$ (the security parameter, here the length of the seed) and any input $s \in \{0, 1\}^n$ (the random seed), $G(s)$ is a string of length $l(n)$. We say $G$ is a pseudorandom number generator if:

  1. For any $n$, $l(n) > n$.
  2. For any probabilistic polynomial-time algorithm $D$, there exists a small quantity $\epsilon(n)$ such that
    $$|P[D(G(s)) = 1] - P[D(r) = 1]| \leq \epsilon(n)$$

Here we call $l$ the expansion factor of the pseudorandom number generator. We think of $D$ as a randomness distinguisher: $D(\cdot) = 1$ indicates that it considers the input random, and otherwise it considers the input not random. Both $s \in \{0, 1\}^n$ and $r \in \{0, 1\}^{l(n)}$ are sampled from the uniform distribution.
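
To make condition 2 concrete, consider a deliberately bad generator. In this sketch the toy generator `G` (which outputs the seed twice) and the distinguisher `D` are made up for illustration, with an assumed seed length of $n = 4$ so that both probabilities can be computed exactly by enumeration.

```python
from fractions import Fraction
from itertools import product

n = 4  # seed length, small enough to enumerate everything

def G(s):
    """Toy generator with expansion factor l(n) = 2n: output the seed twice."""
    return s + s

def D(w):
    """Distinguisher: output 1 ("looks random") iff the two halves differ."""
    half = len(w) // 2
    return 1 if w[:half] != w[half:] else 0

seeds = list(product((0, 1), repeat=n))
strings = list(product((0, 1), repeat=2 * n))

p_prg = Fraction(sum(D(G(s)) for s in seeds), len(seeds))        # P[D(G(s)) = 1]
p_uniform = Fraction(sum(D(r) for r in strings), len(strings))   # P[D(r) = 1]

print(p_prg, p_uniform, abs(p_prg - p_uniform))
```

The gap here is $1 - 2^{-n}$, which is not small, so this `G` fails the definition even though it satisfies $l(n) > n$.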

In fact, the output of a pseudorandom number generator is obviously far from uniform. For example, suppose we have an $n$-bit seed and $l(n) = 2n$. The deterministic map $G$ sends the $2^n$ seeds to at most $2^n$ values in the image, while the space $\{0, 1\}^{2n}$ has $2^{2n}$ elements, so at least $2^{2n} - 2^n$ values are never reached.

Another question is whether pseudorandom number generators really exist. We do not actually know how to prove their existence unconditionally, but we have very good reasons to believe they exist. First, if one-way functions exist (a one-way function $f$ has the property that computing $f(x)$ from $x$ is easy, but computing a preimage from $f(x)$ is hard; a typical candidate is a hash function), then we can construct a pseudorandom number generator from a one-way function. The assumption that one-way functions exist is a weak assumption. In addition, there are already many practical pseudorandom number generators (such as stream ciphers), and no efficient distinguisher $D$ is known for them.

Stream cipher

A stream cipher is a bit different from a pseudo-random number generator: it can generate any number of "random" bits.

We define a stream cipher as a pair of functions $(Init, GetBits)$:

  • $Init$ takes as input a seed $s$ and an (optional) initialization vector $IV$, and outputs an initial state $x_0$
  • $GetBits$ takes as input a state $x_k$, outputs a bit $y_k$, and updates the state to $x_{k+1}$

So, if we initialize the stream cipher and then run $GetBits$ $l$ times, we get a candidate "pseudorandom number generator", denoted $G_l$.

Roughly speaking, a stream cipher that does not take an $IV$ is secure if, for any polynomial $l(n)$ with $l(n) > n$, $G_l$ is a pseudorandom number generator with expansion factor $l$.
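
The $(Init, GetBits)$ interface can be sketched as follows. The concrete state update is an assumption made for illustration, not something from the lecture: a 16-bit linear feedback shift register (LFSR). An LFSR is linear and has a fixed state size, so it is not a secure stream cipher; it only shows the shape of the interface and how $G_l$ is obtained from it.

```python
def init(seed, iv=0):
    """Init: map a 16-bit seed (and optional IV) to an initial state x_0.

    Toy choice: XOR seed and IV. An all-zero LFSR state would stay zero
    forever, so it is replaced by 1.
    """
    state = (seed ^ iv) & 0xFFFF
    return state if state != 0 else 1

def get_bits(state):
    """GetBits: output one bit y_k and return the updated state x_{k+1}."""
    y = state & 1
    # Feedback taps for the polynomial x^16 + x^14 + x^13 + x^11 + 1.
    feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    next_state = (state >> 1) | (feedback << 15)
    return y, next_state

def G_l(seed, l):
    """Run GetBits l times after Init to obtain an l-bit output string."""
    state = init(seed)
    out = []
    for _ in range(l):
        y, state = get_bits(state)
        out.append(y)
    return out

bits = G_l(0xACE1, 32)
print(len(bits), bits[:8])
```

Because the state is carried from call to call, the same two functions can produce as many output bits as the caller asks for, which is the difference from a fixed-length generator noted above.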

Origin blog.csdn.net/weixin_43466027/article/details/132185588