JD Digits first-round internship interview experience

JD Digits (11 AM, January 13, 2021)

The interviewer was great: patient throughout, he listened to my answers and worked through them with me. It was a very rewarding interview experience.
Total interview time: 43 minutes

  1. Self introduction

    (We didn't talk about my project... maybe the tech stack looked too simple)

  2. To start with, what do you know about the JVM?

    Answer: I don't know it

  3. A question about TCP/IP

    Answer: No (I didn't hear it clearly, and I didn't prepare for this one, so I just said no)

    My resume says I am familiar with multithreading, so he went on to ask...

  4. Have you learned about CAS?

    Answer: √

    CAS (Compare-and-Swap) is a technique commonly used to implement concurrent algorithms. Many classes in the Java concurrency packages are built on CAS.

    CAS operations rely on methods of the Unsafe class. The CAS methods in Unsafe (such as compareAndSwapInt) are declared native and map directly onto low-level hardware support. CAS is a CPU concurrency primitive: a primitive executes as one uninterruptible unit, which means CAS is an atomic instruction.

    CAS needs to have 3 operands: memory address V, old expected value A, and target value B to be updated.

    When the CAS instruction is executed, if and only if the value of the memory address V is equal to the expected value A, the value of the memory address V is modified to B, otherwise it does nothing. The entire comparison and replacement operation is an atomic operation.
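    The three operands can be seen in a minimal sketch using AtomicInteger from java.util.concurrent.atomic. The class name CasDemo and the helper incrementWithCas are made up for illustration; only compareAndSet is the real JDK call.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasDemo {
    // Typical CAS retry loop: read A, compute B, try CAS(V, A, B), retry on failure.
    static int incrementWithCas(AtomicInteger v) {
        while (true) {
            int expected = v.get();      // old expected value A
            int target = expected + 1;   // target value B
            if (v.compareAndSet(expected, target)) {
                return target;           // V still held A, so it was swapped to B
            }
            // Another thread changed V in between; loop and try again.
        }
    }

    public static void main(String[] args) {
        AtomicInteger v = new AtomicInteger(5);
        System.out.println(v.compareAndSet(5, 6)); // expected value matches, V becomes 6
        System.out.println(v.compareAndSet(5, 7)); // expected value is stale, V is untouched
        System.out.println(incrementWithCas(v));
    }
}
```

    The retry loop is also where the first disadvantage of CAS comes from: under heavy contention a thread may spin many times before its compareAndSet succeeds.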

  5. What are the disadvantages of CAS?

    Answer: √ I gave three

    1. Spinning in a retry loop for a long time hurts performance

    2. It only guarantees atomicity for a single shared variable

    3. The ABA problem

  6. Can you tell me about the ABA problem?

    Answer: √ I gave a brief description and also used an example to illustrate it.

    The CAS process is usually as follows:

    1. First read A from address V
    2. Calculate the target value B according to A
    3. Change the value of address V from A to B in an atomic manner through CAS

    The value read in step 1 is A, and the modification is successful in step 3. Can we say that its value has not been changed by other threads between steps 1 and 3?

    If its value has been changed to B during this period, and then changed back to A, the CAS operation will mistakenly believe that it has never been changed. This vulnerability is called the "ABA" problem of CAS operations.
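    A small sketch of the scenario, assuming the interleaving from another thread is simulated inline so the result is deterministic. The class and method names are hypothetical; AtomicReference.compareAndSet is the real API.

```java
import java.util.concurrent.atomic.AtomicReference;

class AbaDemo {
    // Returns true if the final CAS succeeds even though V went A -> B -> A in between.
    static boolean casSucceedsAfterAba() {
        AtomicReference<String> v = new AtomicReference<>("A");

        String seen = v.get();              // step 1: read A from V

        // What "another thread" does between steps 1 and 3 (run inline here):
        v.compareAndSet("A", "B");          // A -> B
        v.compareAndSet("B", "A");          // B -> A

        // Step 3: the CAS only compares the current value with A, so it cannot
        // tell that V was modified twice in the meantime.
        return v.compareAndSet(seen, "C");
    }

    public static void main(String[] args) {
        System.out.println(casSucceedsAfterAba());
    }
}
```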

  7. How to solve the ABA problem?

    Answer: √

    To solve this problem, the Java concurrency package provides a stamped atomic reference class, AtomicStampedReference, which ensures the correctness of CAS by versioning the variable's value. Before using CAS, consider whether the ABA problem actually affects the correctness of your concurrent program. If you do need to solve it, switching to traditional mutual-exclusion synchronization may be more efficient than using atomic classes.

  8. Then explain the example you just gave using the atomic reference AtomicReference. How would you solve it?

    Answer: Half √. I don't think I hit the point. Thinking about it afterwards, he was probably trying to lead me to say that AtomicReference alone cannot fully solve the ABA problem, and that the real solution is AtomicStampedReference with its version number mechanism (I'm a fool...)

    The java.util.concurrent.atomic package in the JDK provides AtomicStampedReference. By attaching a stamp, similar to a version number, to the reference, it solves the ABA problem in the CAS mechanism.

    Read the source code!!!

    First of all, the compareAndSet method of AtomicStampedReference takes four parameters. From the source code:

    compareAndSet(V expectedReference, V newReference, int expectedStamp, int newStamp)

    1. expectedReference: the expected value
    2. newReference: the new value to set
    3. expectedStamp: the expected stamp (version)
    4. newStamp: the new stamp (version)

    Here we can see that the Pair stores only the value (reference) and the stamp. The key lies in the casPair method, which uses the CAS mechanism to update the reference and the stamp together.
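    To see the stamp at work, here is a minimal sketch of the same A -> B -> A scenario with AtomicStampedReference. The class name, the starting stamp of 0, and the "increment the stamp by one on every update" convention are assumptions for illustration; the interleaving is again simulated inline.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedAbaDemo {
    // Returns true if the final CAS succeeds after an A -> B -> A change.
    static boolean casSucceedsAfterAba() {
        AtomicStampedReference<String> v = new AtomicStampedReference<>("A", 0);

        // Read the reference and its stamp together.
        int[] stampHolder = new int[1];
        String seenRef = v.get(stampHolder);
        int seenStamp = stampHolder[0];

        // "Another thread" changes A -> B -> A, bumping the stamp each time.
        v.compareAndSet("A", "B", v.getStamp(), v.getStamp() + 1);
        v.compareAndSet("B", "A", v.getStamp(), v.getStamp() + 1);

        // The reference is A again, but the stamp has moved on, so this CAS fails.
        return v.compareAndSet(seenRef, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(casSucceedsAfterAba());
    }
}
```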
  9. Let me tell you then: CAS relies on methods of the Unsafe class, and underneath it is implemented in C++. Can you tell me how it is implemented in C++?

    Answer: What?? C++?? I'm out

    The fully qualified name of the Unsafe class is sun.misc.Unsafe

  1. Do you know the thread pool in multithreading?

    Answer: I said a few simple things, and he didn't give me a hard time...

    I have no answer to give here. I suggest understanding these questions:

    1. What is a thread pool?
    2. Why do thread pools exist? What problem do they mainly solve, and what are the benefits?
    3. Read the source code
    4. How to tune a thread pool
    5. How to create a thread pool
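    As a starting point for the last question, here is a hedged sketch of creating a pool directly with ThreadPoolExecutor. All the sizes (2 core threads, 4 maximum, a queue of 100, 30 seconds keep-alive) and the choice of CallerRunsPolicy are arbitrary values picked for illustration, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadPoolDemo {
    // Submits taskCount trivial tasks and returns how many of them ran.
    static int runTasks(int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                  // corePoolSize
                4,                                  // maximumPoolSize
                30, TimeUnit.SECONDS,               // keep-alive for idle non-core threads
                new ArrayBlockingQueue<>(100),      // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // what to do when saturated

        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(done::incrementAndGet);
        }

        pool.shutdown();                            // stop accepting new tasks
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(50));
    }
}
```

    Threads are reused across tasks instead of being created and destroyed per task, which is the main benefit the questions above are getting at.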

  2. Do you know the volatile keyword?

    Answer: √ (From other interview write-ups I noticed that JD Digits interviewers always ask this, so I reviewed it carefully last night)

    volatile is a lightweight synchronization mechanism provided by the Java virtual machine. A variable declared volatile is read from main memory every time it is used, rather than from each thread's working memory. It has three characteristics:

    • Guarantees memory visibility

    • Does not guarantee atomicity

    • Prohibits instruction reordering

    1. Memory visibility

    Each thread works on a shared variable by copying it from main memory into its own working memory, operating on the copy, and then writing it back to main memory. Visibility means that once one thread has written its change back, other threads see the new value.

    2. No guarantee of atomicity

    Atomicity means indivisible and complete: while a thread is carrying out an operation, it cannot be interrupted or split, and the whole operation either succeeds or fails together. volatile does not provide this.

    For example: create 20 threads and have each one execute i++ 1000 times, and updates can be lost. (How to fix it? 1. Add synchronized 2. Use the atomic class method AtomicInteger.getAndIncrement())

    3. Prohibition of instruction reordering ("ordering")

    In a multithreaded environment, threads execute in an interleaved fashion. Because the compiler may reorder instructions as an optimization, there is no guarantee that two threads see the variables in a consistent state, and the result is unpredictable. volatile forbids this reordering around the variable.
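    The 20-thread example from point 2 can be sketched as follows. The class and field names are made up for illustration. The volatile counter may or may not lose updates on a given run, so only the AtomicInteger total is predictable.

```java
import java.util.concurrent.atomic.AtomicInteger;

class VolatileCounterDemo {
    static volatile int plain = 0;                       // visible, but i++ is not atomic
    static final AtomicInteger atomic = new AtomicInteger();

    // Runs 20 threads x 1000 increments; returns {volatile total, atomic total}.
    static int[] run() {
        plain = 0;
        atomic.set(0);
        Thread[] threads = new Thread[20];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    plain++;                             // read, add, write: three steps
                    atomic.getAndIncrement();            // one atomic CAS-based step
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { plain, atomic.get() };
    }

    public static void main(String[] args) {
        int[] totals = run();
        System.out.println("volatile int: " + totals[0] + " (may be below 20000)");
        System.out.println("AtomicInteger: " + totals[1]);
    }
}
```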

  3. Talk about "memory visibility" among those characteristics?

    Answer: √ Heh, I had studied this a lot, so I could easily explain it in my own words

    When a variable is declared volatile, a modification to it is immediately flushed to main memory, and when other threads need to read the variable they read the new value from main memory. Ordinary variables offer no such guarantee.

    Suggestion: when the interviewer asks this question, start from the Java memory model, because it is the reason visibility problems exist in the first place, and that leads naturally into the visibility guarantee of volatile!
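    A common way to demonstrate visibility is a stop flag. The sketch below uses made-up names and arbitrary sleep and join timeouts. With volatile the worker is guaranteed to see the write; without it, the loop might never observe the change, although that depends on the JVM and hardware.

```java
class VisibilityDemo {
    static volatile boolean running = true;   // remove volatile and the worker may spin forever

    // Returns true if the worker thread noticed the flag change and exited.
    static boolean stopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait until another thread clears the flag
            }
        });
        worker.setDaemon(true);               // never keep the JVM alive because of the demo
        worker.start();
        try {
            Thread.sleep(50);                 // let the worker enter its loop
            running = false;                  // write is published to other threads
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(stopsAfterFlagCleared());
    }
}
```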

  4. Can you tell me how it is implemented at the bottom level?

    Answer: Uh... I don't know... (he saw how smoothly I was answering and started grilling me)

    Earlier we said that when the computing tasks of multiple processors all involve the same main memory area, it may lead to inconsistencies in their respective cache data. An example is the sharing of variables among multiple CPUs.

    If this happens, whose cached data will prevail when syncing back to main memory?

    To solve this consistency problem, each processor must follow a protocol when accessing its cache and act according to that protocol on every read and write. Such protocols include MSI, MESI (Illinois Protocol), MOSI, Synapse, Firefly, and Dragon Protocol.

    MESI (cache coherence protocol)

    When a CPU writes data and finds that the variable is shared, that is, a copy of it exists in other CPUs' caches, it sends a signal telling the other CPUs to invalidate their cache line for that variable. When another CPU later needs to read the variable and finds that its cache line is invalid, it reads the value from memory again.

    How does a processor find out that its data is invalid?

    Snooping

    Each processor checks whether its cached values are stale by snooping the traffic on the bus. When a processor sees that the memory address behind one of its cache lines has been modified, it marks that cache line as invalid. The next time the processor operates on that data, it reads it from system memory into its cache again.

    Bus storm

    volatile relies on the MESI cache coherence protocol, which requires constant snooping of the bus, and CAS keeps retrying in a loop. The resulting useless traffic can push bus bandwidth to its peak.

    So don't use volatile in large quantities. Whether to use volatile or a lock depends on the scenario.

  5. Do you understand AQS?

    Answer: I only said a few words (I haven't gotten that far in my studies yet...)

  6. What is the difference between a thread and a process?

    Answer: √

    1. A process is the basic unit of resource allocation; a thread is the basic unit of CPU scheduling and execution.
    2. A process has its own resource space: every time a process is started, the system allocates an address space for it. A thread is not a unit of resource allocation; multiple threads share the resources of their process and use the same address space.
    3. A process can contain multiple threads
  7. How do processes communicate with each other? (That's what I get for feeling pleased with myself... let's see if I know this one)

    Answer: I said IPC. I had only come across it by chance and had forgotten the details

    Interprocess communication (IPC) is a set of programming interfaces that lets programmers coordinate different processes so they can run simultaneously in one operating system and pass and exchange information with each other.

    IPC methods include pipes (PIPE), message queues, signals, shared memory, and sockets.

    Are you some kind of multithreading fiend? I can't take it anymore

  8. Do you know about reentrant locks? I mean the reentrant locks in JUC

    Answer: Um... I wasn't sure if this was what he meant: synchronized and ReentrantLock

    https://blog.csdn.net/w8y56f/article/details/89554060

  9. Tell me about ReentrantLock

    Answer: √ I covered almost everything and gave a simple comparison with synchronized

    https://zhuanlan.zhihu.com/p/65727594

  10. Tell me about your understanding of fair locks and unfair locks

    Answer: √

    Put simply:

    1. Fair lock: guarantees strict fairness, first come, first served
    2. Unfair lock: not fair at all, threads can jump the queue

    The standard answer:

    Fair lock: Threads acquire the lock in the order in which they requested it. A thread joins the queue directly, and only the thread at the head of the queue can get the lock.

    • Advantages: Every thread eventually gets the resource, so no thread starves in the queue.
    • Disadvantages: Throughput drops considerably. Every thread except the first in the queue is blocked, and the cost of waking up blocked threads is high.

    Unfair lock: When a thread wants the lock, it first tries to acquire it directly. If that succeeds, it takes the lock right away; if it fails, it joins the waiting queue.

    • Advantages: It reduces the overhead of waking up threads, so overall throughput is higher. Fewer threads need to be suspended and woken up.
    • Disadvantages: As you may have noticed, a thread in the middle of the queue may go a long time without acquiring the lock, or never acquire it at all, and starve.
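    In ReentrantLock, the choice between the two is a constructor argument. The sketch below uses hypothetical class and method names and also shows the reentrancy from the earlier question: the thread that holds the lock can acquire it again without blocking.

```java
import java.util.concurrent.locks.ReentrantLock;

class FairLockDemo {
    // Acquires the lock twice from the same thread and reports the hold count.
    static int reentrantHoldCount(ReentrantLock lock) {
        lock.lock();
        try {
            lock.lock();                       // same thread again: does not block
            try {
                return lock.getHoldCount();
            } finally {
                lock.unlock();
            }
        } finally {
            lock.unlock();                     // every lock() needs a matching unlock()
        }
    }

    public static void main(String[] args) {
        ReentrantLock unfair = new ReentrantLock();      // default: unfair (nonfair)
        ReentrantLock fair = new ReentrantLock(true);    // true: fair, FIFO ordering

        System.out.println(unfair.isFair());
        System.out.println(fair.isFair());
        System.out.println(reentrantHoldCount(fair));
    }
}
```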
  11. Which one do you think is more efficient? Why?

    Answer: I think I must have gotten it wrong. He spent a long time analyzing it with me

    A fair lock has to maintain a queue. When a later thread wants the lock, even if the lock is free, it must first check whether other threads are waiting. If there are, it is suspended and added to the back of the queue, and the thread at the front of the queue is woken up. Compared with an unfair lock, that is one extra suspend and wake-up.

    The overhead of thread switching is the real reason unfair locks are more efficient than fair locks: unfair locks reduce the chance that a thread gets suspended, so later threads have some chance of avoiding the suspension overhead.

    Okay, I think you could deepen your understanding of multithreaded concurrency...

  12. Let's talk about collections. Do you know HashMap?

    Answer: √ (Bring it on. I have studied this thoroughly since my last ByteDance interview)

    HashMap stores key-value pairs in an array of Entry objects. Each key-value pair forms one Entry (the name in JDK 1.7; JDK 1.8 renamed it Node). Entry is actually a singly linked list node: it has a next pointer that links to the next Entry. In JDK 1.8, when a linked list grows longer than 8, it is converted into a red-black tree!

  13. Walk me through the whole process of its put method.

    Answer: √

    1. Call the key's hashCode() to get the hash value for the key, then compute its array index:
    • If there is no hash collision, put the entry directly into the array
    • If there is a hash collision, append it to the end of the linked list in that bucket
    2. If the length of the linked list exceeds the threshold (TREEIFY_THRESHOLD == 8), the linked list is converted into a red-black tree. When the number of nodes drops to 6 or fewer (UNTREEIFY_THRESHOLD == 6), the red-black tree is converted back into a linked list;

    3. If the key already exists in a node, replace its value;

    4. If the number of key-value pairs exceeds the threshold (12 with the default capacity and load factor), call the resize method to grow the array

    Just describe it step by step, based on the source code you have read and your own understanding.

  14. How does it expand? Let's talk about the expansion mechanism

    Answer: √

    Default capacity = 16, load factor = 0.75f

    Once the size exceeds threshold = capacity * load factor, the capacity is doubled.

    On resize, the index of each entry is recalculated. Depending on whether its index stays the same after the resize, each entry goes into either the low linked list or the high linked list, and the two lists are moved to the new array
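    The low/high split can be sketched with plain integer arithmetic. This is an illustration of the idea with made-up names, not the JDK source: after doubling, an entry either stays at its old index or moves to old index + old capacity, and a single bit of the hash decides which.

```java
class ResizeSplitDemo {
    // Index of a hash in a table of the given power-of-two capacity.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // New index after doubling, decided by the one extra bit that the new mask exposes.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0
                ? oldIndex              // "low" list: index unchanged
                : oldIndex + oldCap;    // "high" list: index shifted by oldCap
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int newCap = oldCap << 1;       // doubling is a left shift by one bit
        for (int hash : new int[] { 5, 21, 37, 53 }) {
            System.out.println(hash + ": old index " + indexFor(hash, oldCap)
                    + ", new index " + newIndex(hash, oldCap)
                    + ", check " + indexFor(hash, newCap));
        }
    }
}
```

    All four sample hashes share bucket 5 in the 16-slot table; after doubling they split between bucket 5 and bucket 21 without any full recomputation of the hash.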

  15. How is the specific code implemented?

    Answer: √

    Shift the original capacity left by one bit (newCap = oldCap << 1)

  16. Why must the capacity after expansion be a power of 2?

    Answer: √ Because the AND operation is used, which makes computing the bucket index more efficient

    HashMap finds the bucket for a key with a bitwise AND instead of the traditional modulo operation

    hash % length == hash & (length - 1) holds only when length is a power of 2.
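    The identity is easy to check with a few numbers. In this sketch the class and method names are made up, and Math.floorMod stands in for the mathematical modulo so that negative hash values are handled too (Java's % operator can return a negative result).

```java
class IndexDemo {
    // Bucket index via bitwise AND; only meaningful when length is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int[] hashes = { 13, 29, 100, -7 };

        // length = 16 (power of two): AND and modulo agree.
        for (int h : hashes) {
            System.out.println(h + ": and=" + indexFor(h, 16)
                    + " mod=" + Math.floorMod(h, 16));
        }

        // length = 12 (not a power of two): the mask 11 is binary 1011, so bit 2
        // is always dropped and the results no longer match the modulo.
        for (int h : hashes) {
            System.out.println(h + ": and=" + indexFor(h, 12)
                    + " mod=" + Math.floorMod(h, 12));
        }
    }
}
```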

  17. You said that if the length of the linked list reaches 8, it will be converted to a red-black tree. Why is it 8?

    Answer: √ I covered several angles: the Poisson distribution from probability theory, and lookup performance. It should be right... because he didn't say anything

    1. According to the Poisson distribution, with a load factor of 0.75 the probability that a single hash bucket holds 8 elements is less than one in a million
    2. At length 8, the cost of searching the linked list outweighs the cost of maintaining a balanced red-black tree, so it is better to convert.
  18. What is the search time complexity of the red-black tree?

    Answer: √

    O(log N)

    Hmm... you understand this pretty well, so I won't ask you more

    What I thought to myself: no, wait... don't you like asking about multithreading? Aren't you going to ask about the thread safety of HashMap?

  19. (Then he turned to algorithms and started tormenting me again) Do you know the classic TopK problem? How would you solve it?

    Answer: Half √. It should be right, but I was shaken... I answered min-heap, quickselect, and hashing
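    Of the three approaches, the min-heap one is the shortest to write down. This is a sketch under the assumption that "TopK" means the k largest numbers in an int array; the class and method names are made up. It keeps a heap of at most k elements, so it runs in O(n log k) time and O(k) extra space.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class TopKDemo {
    // Returns the k largest values in descending order.
    static int[] topK(int[] nums, int k) {
        if (k <= 0) {
            return new int[0];
        }
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap: smallest on top
        for (int n : nums) {
            if (heap.size() < k) {
                heap.offer(n);
            } else if (n > heap.peek()) {
                heap.poll();          // evict the smallest of the current top k
                heap.offer(n);
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();  // smallest comes out first, so fill from the back
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topK(new int[] { 3, 1, 5, 12, 2, 11 }, 3)));
    }
}
```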

  20. Walk me through the hash approach

    Answer: I gave an example and described the process (I'm not sure it was right)

  21. Then analyze the time complexity of your hash approach

    Answer: I'm done for... I didn't even understand the process I had just described. Did he?

    Questions I asked in return

    1. Which department is this? The JD Digits technology middle platform

    2. Could you give me a brief evaluation of my interview? Or tell me what I should pay attention to in the future?

      Answer: You need to learn more about multithreaded concurrency (such as thread pools, AQS, etc.), and I recommend preparing well for algorithms.

    My own feeling: I have definitely failed. Still, having answered some of the questions, I feel that the daily preparation over this period has paid off; at least I now know which areas need deeper study. After all, I really haven't been preparing for long (I only started seriously on January 9, so this was just the fourth day). Keep working hard!!!


Origin blog.csdn.net/weixin_44723496/article/details/112585718