Uncovering the Sync Block Index (Part 2): How an Object's HashCode Is Obtained

Reprinted from: http://www.cnblogs.com/yuyijq/archive/2009/08/13/1545617.html

 

Digression: Wanting to be an early adopter, I eagerly installed Win7, but the excitement soon turned to frustration. I write my blog with Live Writer, and several drafts, each about 80% complete, were lost because I had no backup. There was nothing for it but to rewrite them.

A Small Experiment with Visual Studio + SOS

At first glance the title looks a little strange: what does the sync block index have to do with HashCode? Judging by the names, the two are worlds apart, and before I knew the details I thought so too. Only after digging in did I realize how closely related they are. Let's first use Visual Studio + SOS to observe something. Here is the sample code that will serve as our guinea pig:

   1: using System;
   2: public class Program
   3: {
   4:     static void Main()
   5:     {
   6:         Foo f = new Foo();
   7:         Console.WriteLine(f.GetHashCode());
   8:  
   9:         Console.ReadLine();
  10:     }
  11: }
  12: // Just a simple, empty class
  13: public class Foo
  14: {
  15:  
  16: }

(To debug with Visual Studio + SOS, first turn on "Enable unmanaged code debugging" on the Debug tab of the project properties.)

We set breakpoints on lines 7 and 9 and press F5 to run. When the program stops at the first breakpoint (f.GetHashCode() has not been executed yet), we enter the following in the Immediate window of Visual Studio:

    .load sos.dll
    extension C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos.dll loaded
    !dso
    PDB symbol for mscorwks.dll not loaded
    OS Thread Id: 0x1730 (5936)
    ESP/REG  Object   Name
    0013ed78 01b72d58 Foo
    0013ed7c 01b72d58 Foo
    0013efc0 01b72d58 Foo
    0013efc4 01b72d58 Foo

We load the SOS module with .load sos.dll and then run !dso, which gives us the memory address of the object f of type Foo: 01b72d58. Next we open the Memory window from the Visual Studio Debug menu to look at the contents of f's object header:

[Image: Memory window at the address of f, with the sync block index bytes 00 00 00 00 highlighted]

The highlighted 00 00 00 00 is where the sync block index lives. Its value is still 0 at this point (this will be explained later). Press F5 again and the program runs to the next breakpoint, by which time f.GetHashCode() has been called. If you look carefully, you will find that the value of the object's sync block index has changed:

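[Image: Memory window after GetHashCode() has run, with the sync block index bytes 4a 73 78 0f shown in red]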

The Visual Studio Memory window has a nice feature: it shows memory that has changed in red. What was originally 00 00 00 00 has become 4a 73 78 0f. So obtaining the HashCode does seem to involve the sync block index; otherwise, why would its value change when GetHashCode is called? Let's look at the output of Console.WriteLine(f.GetHashCode()):

[Image: console output of f.GetHashCode(): 58225482]

It is not obvious whether the two values are related, so let's convert both to binary. Note that the Memory window shows the bytes 4a 73 78 0f in little-endian order, with the low-order byte on the left and the high-order byte on the right, whereas the decimal number is written the usual way, with the high-order digit on the left. So 4a 73 78 0f is actually 0x0f78734a.

0x0f78734a: 000011 11011110000111001101001010

  58225482: 000000 11011110000111001101001010
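
The conversion can be double-checked in code. The following is a minimal C++ sketch, not part of the original experiment; the helper names `from_little_endian` and `low_bits` are made up for illustration. It reassembles the four bytes from the Memory window and compares the low 26 bits with the printed hash code, which is the comparison the next section picks up:

```cpp
#include <array>
#include <cstdint>

// Reassemble a 32-bit value from four bytes in the order the Memory window
// shows them, i.e. lowest-order byte first (little-endian).
// (Hypothetical helper, not from the article or from Rotor.)
constexpr std::uint32_t from_little_endian(const std::array<std::uint8_t, 4>& b) {
    return static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Keep only the lowest `count` bits of `value`. `count` must be below 32.
// (Hypothetical helper.)
constexpr std::uint32_t low_bits(std::uint32_t value, unsigned count) {
    return value & ((std::uint32_t{1} << count) - 1);
}

// Bytes read from the Memory window after GetHashCode() was called.
constexpr std::array<std::uint8_t, 4> header_bytes{0x4a, 0x73, 0x78, 0x0f};
constexpr std::uint32_t header_word = from_little_endian(header_bytes);

// Both checks are evaluated at compile time.
static_assert(header_word == 0x0f78734a, "bytes are stored low-order first");
static_assert(low_bits(header_word, 26) == 58225482u,
              "low 26 bits equal the value printed by GetHashCode()");
```

If either check were wrong, the file would simply fail to compile, so no program needs to be run.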

Rotor Source Code

After padding both values to 32 bits with leading zeros, we find that their lower 26 bits are exactly the same. Is this a coincidence or by design? To find out, I had to bring out the Rotor source code and see what it reveals. As usual, let's start with the managed code:

    public virtual int GetHashCode()
    {
        return InternalGetHashCode(this);
    }
    [MethodImpl(MethodImplOptions.InternalCall)]
    internal static extern int InternalGetHashCode(object obj);

As mentioned in the first article of this series, methods marked with the [MethodImpl(MethodImplOptions.InternalCall)] attribute are implemented in native code. In Rotor, this code is located in sscli20\clr\src\vm\ecall.cpp:

    FCFuncElement("InternalGetHashCode", ObjectNative::GetHashCode)

    FCIMPL1(INT32, ObjectNative::GetHashCode, Object* obj) {
        DWORD idx = 0;
        OBJECTREF objRef(obj);
        idx = GetHashCodeEx(OBJECTREFToObject(objRef));
        return idx;
    }
    FCIMPLEND

    INT32 ObjectNative::GetHashCodeEx(Object *objRef)
    {
        // This loop exists because we're inspecting the header dword of the object
        // and it may change under us because of races with other threads.
        // On top of that, it may have the spin lock bit set, in which case we're
        // not supposed to change it.
        // In all of these case, we need to retry the operation.
        DWORD iter = 0;
        while (true)
        {
            DWORD bits = objRef->GetHeader()->GetBits();

            if (bits & BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX)
            {
                if (bits & BIT_SBLK_IS_HASHCODE)
                {
                    // Common case: the object already has a hash code
                    return  bits & MASK_HASHCODE;
                }
                else
                {
                    // We have a sync block index. This means if we already have a hash code,
                    // it is in the sync block, otherwise we generate a new one and store it there
                    SyncBlock *psb = objRef->GetSyncBlock();
                    DWORD hashCode = psb->GetHashCode();
                    if (hashCode != 0)
                        return  hashCode;

                    hashCode = Object::ComputeHashCode();

                    return psb->SetHashCode(hashCode);
                }
            }
            else
            {
                // If a thread is holding the thin lock or an appdomain index is set, we need a syncblock
                if ((bits & (SBLK_MASK_LOCK_THREADID | (SBLK_MASK_APPDOMAININDEX << SBLK_APPDOMAIN_SHIFT))) != 0)
                {
                    objRef->GetSyncBlock();
                    // No need to replicate the above code dealing with sync blocks
                    // here - in the next iteration of the loop, we'll realize
                    // we have a syncblock, and we'll do the right thing.
                }
                else
                {
                    DWORD hashCode = Object::ComputeHashCode();

                    DWORD newBits = bits | BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX | BIT_SBLK_IS_HASHCODE | hashCode;

                    if (objRef->GetHeader()->SetBits(newBits, bits) == bits)
                        return hashCode;
                    // Header changed under us - let's restart this whole thing.
                }
            }
        }
    }

There is a lot of code, but most of it is AND, OR, and shift operations. The value being operated on comes from this line: objRef->GetHeader()->GetBits(), which reads the object header word, that is, the sync block index.
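
The retry loop in GetHashCodeEx follows a common lock-free pattern: read the header word, build the new value, and publish it only if the word has not changed in the meantime. Below is a rough C++ sketch of that pattern using std::atomic in place of Rotor's header type. `FakeHeader` and `get_or_install_hash` are hypothetical names, the `candidate` argument stands in for the result of Object::ComputeHashCode(), and the sync block and thin-lock branches are deliberately left out. The constant values are the ones from the Rotor definitions quoted in this article.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Values as defined in the Rotor source quoted in this article.
constexpr std::uint32_t BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX = 0x08000000;
constexpr std::uint32_t BIT_SBLK_IS_HASHCODE             = 0x04000000;
constexpr std::uint32_t MASK_HASHCODE                    = (1u << 26) - 1;

// Hypothetical stand-in for the object header: a single atomic 32-bit word.
struct FakeHeader {
    std::atomic<std::uint32_t> bits{0};
};

// Simplified sketch of the loop. It models only two cases: "a hash code is
// already stored" and "the flag bits are clear". Headers holding a sync block
// index or thin-lock data are NOT modeled here.
std::uint32_t get_or_install_hash(FakeHeader& header, std::uint32_t candidate) {
    // The candidate must fit into the 26-bit hash field.
    assert((candidate & ~MASK_HASHCODE) == 0);
    for (;;) {
        std::uint32_t bits = header.bits.load();
        if (bits & BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX) {
            // Sketch limitation: a sync block index is not handled.
            assert(bits & BIT_SBLK_IS_HASHCODE);
            return bits & MASK_HASHCODE;  // hash already stored in the header
        }
        std::uint32_t new_bits = bits | BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX
                                      | BIT_SBLK_IS_HASHCODE | candidate;
        // Publish only if the header still holds the value we read.
        if (header.bits.compare_exchange_strong(bits, new_bits)) {
            return candidate;
        }
        // Another thread changed the header first: loop and re-read it.
    }
}
```

Calling `get_or_install_hash` on a fresh `FakeHeader` with the candidate 0x0378734a leaves 0x0f78734a in the header, the same word seen in the Memory window. A second call returns the stored hash and ignores its new candidate.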

Think about it: when the first breakpoint is hit, the value of the sync block index is still 0x00000000, so the first call to GetHashCode() must execute the following code:

    DWORD hashCode = Object::ComputeHashCode();
    DWORD newBits = bits | BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX | BIT_SBLK_IS_HASHCODE | hashCode;
    if (objRef->GetHeader()->SetBits(newBits, bits) == bits)
        return hashCode;

This code computes a hash value with Object's ComputeHashCode method (this article is not concerned with the hash algorithm, so the implementation of ComputeHashCode is not discussed here). It then performs several OR operations. ORing in the original bits preserves whatever was already there, which shows that the sync block index plays other roles too, such as the lock described in the previous article. Finally it replaces the old bits in the sync block index with the new value. So far this tells us nothing about how the hash is read back. But what happens if we call GetHashCode() on this object again? The sync block index is no longer 0x00000000 but 0x0f78734a. Let's look at the values of several of the constants involved:

    #define BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX    0x08000000
    #define BIT_SBLK_IS_HASHCODE            0x04000000
    #define HASHCODE_BITS                   26
    #define MASK_HASHCODE                   ((1<<HASHCODE_BITS)-1)

Recall how the hash code was stored: DWORD newBits = bits | BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX | BIT_SBLK_IS_HASHCODE | hashCode;

Both BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX and BIT_SBLK_IS_HASHCODE are therefore set in the header, so on the second call the first two if conditions are true and the returned hash code is bits & MASK_HASHCODE.

MASK_HASHCODE is 1 shifted left by 26 bits (a 1 followed by 26 zeros, 100000000000000000000000000) minus 1, which gives 00000011111111111111111111111111: the lower 26 bits are all 1 and the upper 6 bits are 0. This mask is then ANDed with the sync block index, so its only effect is to extract the lower 26 bits of the sync block index. Thinking back to the experiment at the beginning of this article, the match turns out to be no coincidence.

Together with the previous article, this shows that the sync block index not only serves as a lock but sometimes also takes on the job of storing the HashCode. The sync block index is in fact a 32-bit structure: the upper 6 bits are control bits, and the meaning of the lower 26 bits depends on how those upper 6 bits are set. The upper 6 bits act like a row of small switches, some on (1) and some off (0). Each combination has its own meaning, and from it the program knows what the lower 26 bits are being used for. The design is ingenious: memory usage is compact, and the scheme can be handled and extended flexibly.
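
To make the "switches" idea concrete, here is a small C++ sketch that classifies a header word using only the two flag bits shown in this article. `HeaderKind`, `DecodedHeader`, and `decode_header` are hypothetical names, not Rotor's. Two simplifications are my own assumptions: the sync block index is taken to occupy the same lower 26 bits as the hash code, and the remaining control bits (thin lock, appdomain index, and so on) are not interpreted at all.

```cpp
#include <cstdint>

// Flag values and hash width as defined in the Rotor source quoted above.
constexpr std::uint32_t BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX = 0x08000000;
constexpr std::uint32_t BIT_SBLK_IS_HASHCODE             = 0x04000000;
constexpr std::uint32_t MASK_HASHCODE                    = (1u << 26) - 1;

// Hypothetical classification of what the lower 26 bits currently hold.
enum class HeaderKind { HashCode, SyncBlockIndex, Other };

struct DecodedHeader {
    HeaderKind kind;
    std::uint32_t payload;  // lower 26 bits; left at 0 for HeaderKind::Other
};

constexpr DecodedHeader decode_header(std::uint32_t bits) {
    if ((bits & BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX) == 0) {
        // Neither a hash code nor a sync block index: an empty header or
        // other data that this sketch does not interpret.
        return DecodedHeader{HeaderKind::Other, 0};
    }
    if ((bits & BIT_SBLK_IS_HASHCODE) != 0) {
        return DecodedHeader{HeaderKind::HashCode, bits & MASK_HASHCODE};
    }
    // Assumption: the sync block index uses the same 26-bit field.
    return DecodedHeader{HeaderKind::SyncBlockIndex, bits & MASK_HASHCODE};
}

// The two header values observed in the experiment.
static_assert(decode_header(0x00000000).kind == HeaderKind::Other,
              "before GetHashCode(): nothing stored");
static_assert(decode_header(0x0f78734a).kind == HeaderKind::HashCode,
              "after GetHashCode(): both flag bits are set");
static_assert(decode_header(0x0f78734a).payload == 58225482u,
              "payload is the printed hash code");
```

The payload is extracted with a single mask in both flagged cases, which is why one 32-bit word can switch roles without changing the object's size.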

 

Postscript

As in the previous article, I have isolated one topic so that it can be explained more simply. Here I assumed that the sync block index is used only to store the hash code and that it starts out clean (in the experiment above, its initial value was 0). In practice the sync block index may carry more responsibilities at once, such as locking as well as storing the HashCode, and then the situation becomes more complicated. A later article will explain this in more detail.
