The CopyOnWrite Idea

If you don't know this kind of "advanced stuff," you deserve to end up as cannon fodder in interviews... [Shi Shan's Architecture Notes]

Original author: Shi Shan


Table of Contents

1. What problems arise in a read-heavy, write-light scenario?

2. Introducing the CopyOnWrite idea to solve the problem

3. How Kafka's source code uses the CopyOnWrite idea

Today let's talk about a very hard-core piece of technical knowledge: what the CopyOnWrite idea is, how it is embodied in the Java concurrency package, and how the Kafka kernel source code uses it to optimize concurrent performance.

In interviews, CopyOnWrite can be the killer question an interviewer uses to sink a candidate, or the secret weapon a candidate uses to win an offer; it is a fairly advanced piece of knowledge.

1. What problems arise in a read-heavy, write-light scenario?

Imagine we have an ArrayList in memory. An ArrayList is definitely not thread-safe by default, so if multiple threads read and write it concurrently there may be problems.

So the question is: how do we make access to this ArrayList thread-safe?

There is a very simple approach: put thread synchronization around every access to the ArrayList.

For example, the code that accesses the ArrayList can be placed in synchronized blocks, so that only one thread can operate on it at a time (a minimal sketch of that option follows this paragraph); alternatively, a read-write lock, ReadWriteLock, can be used for the control. Either works.
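A minimal sketch of the coarse-grained synchronized option (not from the original article; the class name SynchronizedListSketch is made up) is simply to wrap the list, so that every call serializes on one monitor:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class SynchronizedListSketch {
        public static void main(String[] args) {
            // Every method call on this wrapper acquires the same intrinsic lock.
            List<String> list = Collections.synchronizedList(new ArrayList<>());
            list.add("hello");                  // writers and...
            System.out.println(list.get(0));    // ...readers all serialize on that lock
        }
    }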

Let's assume we use a read-write lock, ReadWriteLock, to control access to the ArrayList.

That way, multiple read requests can read data from the ArrayList at the same time, but reads and writes are mutually exclusive, and writes are mutually exclusive with each other.

The code would look roughly like this:

                   

    private final List<Object> list = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public Object read() {
        lock.readLock().lock();
        try {
            return list.get(0);   // read from the ArrayList
        } finally {
            lock.readLock().unlock();
        }
    }

    public void write(Object value) {
        lock.writeLock().lock();
        try {
            list.add(value);      // write to the ArrayList
        } finally {
            lock.writeLock().unlock();
        }
    }

Think about it: what's wrong with code like the above?

The biggest problem is that the write lock and the read lock are mutually exclusive. Suppose writes are very infrequent and reads are very frequent, that is, a classic read-heavy, write-light scenario.

Then whenever an occasional write operation takes the write lock, won't the flood of incoming read operations be blocked and unable to execute?

This is the biggest problem you can run into with read-write locks.
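To see the problem concretely, here is a small demo (not from the original article; the class name ReadWriteLockStallDemo is made up) in which one slow writer holds the write lock and every reader is parked until it finishes:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.locks.ReadWriteLock;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    public class ReadWriteLockStallDemo {
        private static final List<Integer> list = new ArrayList<>();
        private static final ReadWriteLock lock = new ReentrantReadWriteLock();

        public static void main(String[] args) throws InterruptedException {
            // Writer: holds the write lock for two seconds to simulate a slow update.
            new Thread(() -> {
                lock.writeLock().lock();
                try {
                    list.add(42);
                    Thread.sleep(2000);
                } catch (InterruptedException ignored) {
                } finally {
                    lock.writeLock().unlock();
                }
            }).start();

            Thread.sleep(100); // let the writer grab the lock first

            // Readers: all of them block on the read lock until the writer releases it.
            for (int i = 0; i < 5; i++) {
                new Thread(() -> {
                    lock.readLock().lock();
                    try {
                        System.out.println("read " + list.size() + " element(s)");
                    } finally {
                        lock.readLock().unlock();
                    }
                }).start();
            }
        }
    }

Run it and the five reads only print after the two-second write completes; in a read-heavy workload those stalls add up quickly.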

2. Introducing the CopyOnWrite idea to solve the problem

This is where the CopyOnWrite idea comes in.

The idea is: don't use read-write locks at all, get rid of the locks entirely. Locks are the problem: a mutually exclusive lock can hurt performance, because one request blocking another leaves requests stuck and unable to execute.

So how can it guarantee safety under multi-threaded concurrency?

Very simply, as the name suggests, by "copying on write": when writing data, perform the write on a copy.

Reading data without a lock is actually fine, because concurrent reads do not affect one another.

The real problem is writing. Since you are not allowed to lock when writing, you need a different strategy.

Say your list stores its data in an underlying array. When you want to modify that array, you first make a copy of it.

Then you apply the modification to the copy; throughout the process you are only ever operating on the copy.

In that case, can't read operations keep executing normally at the same time? The write operation doesn't affect the read operations at all!

Picture the process: the writer copies the array and modifies the copy, while readers keep reading the old array undisturbed.

The key question is: once the writer thread has finished modifying its copy of the array, how do we make the reader threads see the change?

Here is the key point, so pay attention: this is where the volatile keyword has to come in.

A previous article explained the use of the volatile keyword. Its core effect is that as soon as one thread writes to a volatile variable, other threads immediately read the most recent value of that variable's reference; that is the central role of volatile.

So once the writer thread has produced the modified copy of the array, it uses a volatile write to assign the modified copy to the volatile variable that references the array.

As soon as that assignment to the volatile variable happens, it becomes immediately visible to the reader threads, and they see the latest array.
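Before looking at the JDK source, here is the pattern distilled into a minimal sketch (not from the original article; the class name CopyOnWriteBox is made up): copy the array, modify the copy, then publish it with a single volatile write:

    import java.util.Arrays;

    public class CopyOnWriteBox {
        // Readers always dereference this volatile field, so they see the latest published array.
        private volatile Object[] array = new Object[0];

        // Writers serialize on the intrinsic lock so only one copy is in flight at a time.
        public synchronized void add(Object element) {
            Object[] copy = Arrays.copyOf(array, array.length + 1); // copy the current array
            copy[copy.length - 1] = element;                        // modify only the copy
            array = copy;                                           // the volatile write publishes it
        }

        // Lock-free read: a plain array access on whatever array is currently published.
        public Object get(int index) {
            return array[index];
        }
    }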

Below is the JDK's CopyOnWriteArrayList source.

Look at how it writes data: it copies the array, modifies the copy, and then assigns the modified copy back through the volatile variable so that other threads see it immediately.

                   

    // This is the core array; it is only ever replaced through this volatile reference.
    // As soon as the latest array is assigned to it, other threads immediately see the new array.
    private transient volatile Object[] array;

    public boolean add(E e) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;

            // make a copy of the array
            Object[] newElements = Arrays.copyOf(elements, len + 1);

            // modify the copy, e.g. append the new element to it
            newElements[len] = e;

            // then assign the copy back to the volatile variable
            setArray(newElements);
            return true;
        } finally {
            lock.unlock();
        }
    }

Then a thought: since updates go through a copy, what if multiple threads want to update at the same time? Wouldn't producing multiple copies cause problems?

Of course multiple threads must not update simultaneously, and that is exactly why the source above adds a lock: only one thread can perform an update at a time.

So does that update have any effect on read operations?

Absolutely not, because a read is nothing more than a read of the array and involves no locks at all. And as soon as the update is assigned to the volatile-modified variable, reader threads immediately see the latest array; that is what volatile guarantees.

                   

    private E get(Object[] a, int index) {
        // the read path is just a plain array access
        return (E) a[index];
    }

   

This perfectly solves the read-heavy, write-light problem we described earlier.

With mutually exclusive read and write locks, the write lock can block a large number of reads and hurt concurrent performance.

CopyOnWriteArrayList instead trades space for time: it performs each update on a copy to avoid locking reads, and the final assignment to the volatile variable guarantees visibility, so updates have no effect whatsoever on the reader threads!
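As a quick usage note (not from the original article), this is also why a CopyOnWriteArrayList can be iterated while another thread keeps writing to it, without a ConcurrentModificationException: the iterator reads the array snapshot that was current when it was created:

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    public class CopyOnWriteArrayListDemo {
        public static void main(String[] args) throws InterruptedException {
            List<String> list = new CopyOnWriteArrayList<>();
            list.add("a");
            list.add("b");

            // A writer keeps appending while the main thread iterates.
            Thread writer = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    list.add("x" + i);
                }
            });
            writer.start();

            // The iterator walks the snapshot it started with and never throws
            // ConcurrentModificationException.
            for (String s : list) {
                System.out.println(s);
            }

            writer.join();
            System.out.println("final size = " + list.size());
        }
    }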

3. How Kafka's source code uses the CopyOnWrite idea

In Kafka's kernel source there is a scenario like this: when a client writes data to Kafka, it first writes the messages into a local in-memory buffer; the buffered messages are then packed into a Batch and sent to the Kafka server in one go, which helps improve throughput.

Without further ado, let's take a look.

So what data structure does Kafka use for this in-memory buffer? Let's look at the source code:

                   

    private final ConcurrentMap<TopicPartition, Deque<RecordBatch>> batches;

    // initialized in the constructor:
    this.batches = new CopyOnWriteMap<>();

This is the core data structure used to store the messages written into the memory buffer. Fully understanding it requires explaining many concepts from Kafka's kernel source, which we won't get into here.

What matters is that CopyOnWriteMap is a class Kafka implements itself, and it is built on the CopyOnWrite idea.

Let's look at the source code of this CopyOnWriteMap:

                   

    // the typical pattern: an ordinary Map held behind a volatile reference
    private volatile Map<K, V> map;

    @Override
    public synchronized V put(K k, V v) {
        // on an update, copy the map, update the copy,
        // then write the copy back to the volatile variable
        Map<K, V> copy = new HashMap<K, V>(this.map);
        V prev = copy.put(k, v);
        this.map = Collections.unmodifiableMap(copy);
        return prev;
    }

    @Override
    public V get(Object k) {
        // reads simply dereference the volatile variable and read the map, with no locking
        return map.get(k);
    }

So Kafka implements this core data structure using the CopyOnWriteMap idea, because the map's key-value pairs are in fact not updated very often.

In other words, the TopicPartition-to-Deque key-value pairs are updated at a very low frequency.

But the get operation serves high-frequency read requests: the Deque for a TopicPartition is constantly read out so that batches can be enqueued into and dequeued from that queue. For this map, the high-frequency operation is get.

So Kafka uses the CopyOnWrite idea to implement this map, which avoids blocking the high-frequency reads whenever a key-value pair is updated, achieves lock-free reads, and optimizes concurrent performance.
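To make that access pattern concrete, here is a rough, hypothetical sketch (not Kafka's actual code; the class name BufferPoolSketch and its fields are made up, and a ConcurrentHashMap stands in for Kafka's internal CopyOnWriteMap so the sketch compiles on its own): the map is written only the first time a partition is seen, and read on every single append afterwards:

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class BufferPoolSketch {
        // In Kafka this map is a CopyOnWriteMap keyed by TopicPartition.
        private final ConcurrentMap<String, Deque<byte[]>> batches = new ConcurrentHashMap<>();

        public void append(String topicPartition, byte[] record) {
            // High-frequency path: almost every call is a plain get on an existing key.
            Deque<byte[]> deque = batches.get(topicPartition);
            if (deque == null) {
                // Low-frequency path: the partition is seen for the first time, so the map is updated.
                batches.putIfAbsent(topicPartition, new ArrayDeque<>());
                deque = batches.get(topicPartition);
            }
            // The deque itself is not thread-safe, so callers synchronize on it to enqueue/dequeue.
            synchronized (deque) {
                deque.addLast(record);
            }
        }
    }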

I believe that after reading this article, you have your own feel for the CopyOnWrite idea and its application scenarios, including its implementation in the JDK and its use in Kafka's source code.

If you can clearly explain the idea and how it is reflected in the JDK, and then go further and walk the interviewer through the underlying source of a well-known open-source project like Kafka, it will certainly leave a strong impression and be a big plus in the interview.


Origin blog.csdn.net/suifeng629/article/details/93738910