These 10 lines of code that compare strings for equality left me completely confused. If you don't believe me, come and see

Code Ape Stone

Sorry for using this title to lure you into clicking, but since you are here, you might as well read to the end and see whether you learn something. (If you do, please leave a comment. If you don't, I will live-stream ** next weekend. Haha, as if you would believe that.)

Supplementary note: the redesign of WeChat official accounts has had a considerable impact on many account owners. Judging by the backend data so far, it has had little effect on me, because mine is a tiny account anyway and its read counts were pitiful to begin with. A picture is the proof (a phrase I just picked up in a chat group).

Let's go straight to the code:


boolean safeEqual(String a, String b) {
   if (a.length() != b.length()) {
       return false;
   }
   int equal = 0;
   for (int i = 0; i < a.length(); i++) {
       equal |= a.charAt(i) ^ b.charAt(i);
   }
   return equal == 0;
}

The code above is a Java translation of the original Scala version. The Scala code, which is what first caught this programmer's attention, is as follows:


def safeEqual(a: String, b: String) = {
  if (a.length != b.length) {
    false
  } else {
    var equal = 0
    for (i <- Array.range(0, a.length)) {
      equal |= a(i) ^ b(i)
    }
    equal == 0
  }
}

This source code looks strange at first. The function checks whether two strings are equal. The first part is easy to understand: if the lengths differ, the strings cannot be equal, so it returns immediately.
Now look at the rest. With a little thought you can see the trick: it uses exclusive OR (1^1=0, 1^0=1, 0^1=1, 0^0=0) to compare the strings character by character, bit by bit. If every character matches, the two strings must be equal, and the variable equal, which accumulates the XOR results with OR, must be 0; otherwise it is non-zero.
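To make the accumulation concrete, here is a small sketch. The class and method names are made up for illustration, and it assumes both strings already have the same length, as the length check guarantees in safeEqual.

```java
// Illustrative sketch only: XorAccumulateDemo and accumulate are hypothetical names.
class XorAccumulateDemo {
    // Returns the OR of all per-character XOR results.
    // Assumes a and b have the same length.
    static int accumulate(String a, String b) {
        int equal = 0;
        for (int i = 0; i < a.length(); i++) {
            // XOR is 0 only when the two characters are identical.
            equal |= a.charAt(i) ^ b.charAt(i);
        }
        return equal;
    }

    public static void main(String[] args) {
        System.out.println(accumulate("abc", "abc")); // 0: every character matches
        System.out.println(accumulate("abc", "abd")); // 7: 'c' ^ 'd' = 0x63 ^ 0x64
        System.out.println(accumulate("abc", "xbd")); // 31: 25 | 7, non-zero but not 1
    }
}
```

Once any bit of equal is set it can never be cleared again, which is why a single final test against 0 is enough.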

Think about it again?


for (i <- Array.range(0, a.length)) {
  if ((a(i) ^ b(i)) != 0) // or a(i) != b(i)
    return false
}

We often talk about performance optimization. From an efficiency standpoint, shouldn't the function return false as soon as it finds a single position where the characters differ (that is, where the XOR result is non-zero), as shown above?
There must be something more to it...

Think about it again?


Taken together with the method name safeEqual, you might guess that this has something to do with security.

The code at the beginning of this article comes from Play Framework, where it is used to verify that the data in the cookie (session) is legitimate, including checking its signature. That is where this article started.

I had heard of improving efficiency through lazy evaluation and similar techniques, but this is the first time I have seen a result that is already known being deliberately returned late!
The JDK has a similar method. The following code is taken from java.security.MessageDigest:


public static boolean isEqual(byte[] digesta, byte[] digestb) {
   if (digesta == digestb) return true;
   if (digesta == null || digestb == null) {
       return false;
   }
   if (digesta.length != digestb.length) {
       return false;
   }

   int result = 0;
   // time-constant comparison
   for (int i = 0; i < digesta.length; i++) {
       result |= digesta[i] ^ digestb[i];
   }
   return result == 0;
}

As the comment says, the goal is a time-constant comparison: for inputs of the same length, the running time does not depend on where the first difference is.
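As a usage sketch, this is roughly how MessageDigest.isEqual could be used to check a signature over session data. It is not Play Framework's actual implementation: the class name, the sign and verify helpers, and the choice of HmacSHA256 are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative sketch only: names and the HMAC algorithm are assumptions.
class SignatureCheckSketch {
    // Computes an HMAC-SHA256 signature of data under the given key.
    static byte[] sign(byte[] key, String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    // Recomputes the signature and compares it with the presented one
    // using the JDK's time-constant comparison instead of Arrays.equals.
    static boolean verify(byte[] key, String data, byte[] presented) {
        return MessageDigest.isEqual(sign(key, data), presented);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(key, "user=alice");
        System.out.println(verify(key, "user=alice", signature)); // true
        System.out.println(verify(key, "user=bob", signature));   // false
    }
}
```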
But what is the risk if the time spent on this comparison is not constant? (Background music starts playing in my head: "Kid, do you have a lot of question marks?")

The truth is revealed


After digging further, it turns out that this is done to prevent a Timing Attack. (The term has more than one translation in Chinese.)

Timing Attack


A timing attack is a type of side-channel attack (Side Channel Attack, SCA for short). A side-channel attack takes a "crooked path": instead of attacking the algorithm head-on, it exploits flaws in the software or hardware design.
This kind of attack achieves its goal through channels such as power consumption, timing, and electromagnetic leakage. In many physically isolated environments it is often surprisingly successful. According to one encyclopedia, this newer kind of attack is far more effective than traditional mathematical cryptanalysis.
With the approach in this article, calling safeEqual("abcdefghijklmn", "xbcdefghijklmn") (only the first character differs) takes the same time as calling safeEqual("abcdefghijklmn", "abcdefghijklmn") (two identical strings). This prevents an attacker from brute-forcing the string being compared by varying the input many times and collecting statistics on the running time.
Give an example
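One way to picture this, as a minimal sketch rather than a real attack: the hypothetical earlyExitEqual below returns at the first mismatch, and the counter steps stands in for the running time that a real attacker would have to estimate from many noisy measurements. The class name, the alphabet, and the assumption that the attacker already knows the length of the secret are all made up for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only: a deterministic simulation, not a real exploit.
class TimingAttackSketch {
    // Stand-in for elapsed time: characters inspected by the last comparison.
    static int steps;

    // Hypothetical early-exit comparison, the "efficient" version.
    static boolean earlyExitEqual(String a, String b) {
        steps = 0;
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            steps++;
            if (a.charAt(i) != b.charAt(i)) {
                return false; // leaks the position of the first mismatch
            }
        }
        return true;
    }

    // Recovers the secret one position at a time. The secret is used only
    // as the hidden argument of the comparison, never read directly, apart
    // from its length, which the attacker is assumed to know.
    static String recover(String secret, String alphabet) {
        char[] guess = new char[secret.length()];
        Arrays.fill(guess, alphabet.charAt(0));
        for (int pos = 0; pos < guess.length; pos++) {
            for (int k = 0; k < alphabet.length(); k++) {
                guess[pos] = alphabet.charAt(k);
                boolean match = earlyExitEqual(secret, new String(guess));
                // A wrong character at pos stops after exactly pos + 1 steps;
                // a right one runs longer (or matches completely).
                if (match || steps > pos + 1) {
                    break;
                }
            }
        }
        return new String(guess);
    }

    public static void main(String[] args) {
        String alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
        System.out.println(recover("s3cr3t", alphabet)); // s3cr3t
    }
}
```

In this simulation the 6-character secret falls after at most 6 × 36 = 216 comparisons instead of up to 36^6 blind guesses. Against safeEqual the loop always visits every character, so the step count would be the same for every guess and the inner loop would get no signal.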


Origin blog.51cto.com/15072927/2607592