[Java intern] Daily interview questions: 15 questions about the source code of ConcurrentHashMap, how many questions can you stick to?

[Java intern] Daily interview questions: 15 questions about the source code of ConcurrentHashMap, how many questions can you stick to?

  • Approaching the autumn recruitment, prepare for the summer internship, I wish everyone a little better every day! Day11
  • This article summarizes the interview questions related to ConcurrentHashMap , which will be updated daily~
  • Those who are not familiar with the source code of ConcurrentHashMap can refer to my previous blog: ConcurrentHashMap source code analysis article catalog
  • Volume? The volume is right, Java is just such a volume list!


1. Please describe what the storage data structure of ConcurrentHashMap looks like?

  • The internal map structure of ConcurrentHashMap is the same as HashMap, and both are composed of: array + linked list + red-black tree .

  • The data storage unit of ConcurrentHashMap and HashMap are also the same, that is, the Node structure:

    static class Node < K , V > implements Map . Entry < K , V > { //hash value final int hash; //key final K key; //value volatile V val; //rear drive node volatile Node<K, V> next; .... } Copy code
  • The difference between ConcurrentHashMap and HashMap is that the former supports concurrent expansion, and its internal locks ( spin lock + CAS + synchronized + segment lock ) are used to ensure thread safety.


2. Can the load factor of ConcurrentHashMap be newly specified?

  • The load factor of ordinary HashMap can be modified, but ConcurrentHashMap cannot, because its load factor is used
    final
    Keyword modification, the value is fixed 0.75 :
//Load factor: indicates how full the hash table is~ In ConcurrentHashMap, this attribute is a fixed value of 0.75 and cannot be modified~ private static final float LOAD_FACTOR = 0.75f ; Copy code

3. In general, the Node.hash field of the node must be >=0. Why?

In other words, how many situations are there for the hash value of a Node node? Analyze different situations?

  • in case
    Node.hash = -1
    , Indicating that the current node is **FWD (ForWardingNode) ** node (representing a node that has been migrated).
  • in case
    Node.hash = -2
    , Which means that the current node has been treeized , and the current node is a TreeBin object, and the TreeBin object agent operates the red-black tree.
  • in case
    Node.hash> 0
    , Indicating that the current node is a normal Node node, which may be a linked list or a single Node .

4. Can you briefly describe the function of the sizeCtl field in ConcurrentHashMap (the meaning under different circumstances)?

sizeCtr
Namely Size Control, this field must be carefully understood. If this field is not understood, the entire ConcurrentHashMap source code may be confused.

When sizeCtl == 0, what does it mean?

  • izeCtl == 0
    , Is the default value, which means that the default capacity of 16 is used for initialization when the hash list is initialized for the first time .

When sizeCtl == -1, what does it mean?

  • sizeCtl == -1
    Indicates that the current scattered linked list is being initialized . A thread is initializing the current hash table ( table ).

  • The hash list of ConcurrentHashMap is initialized lazily. Under concurrent conditions, it must be ensured that it can only be initialized once, so

    sizeCtl == -1
    It is equivalent to an "identity lock" to prevent multiple threads from initializing the hash table.

  • The specific initialization operation is in

    initTable()
    In the method, it will be modified by CAS
    sizeCtl
    The value is
    -1
    , Which means that there is already a thread that is initializing the scatter list, and other threads cannot initialize again, they can only wait!

    //SIZECTL: expected value, initially 0 //this means the current ConcurrentHashMap object //sc means the sizeCtrl to be modified //-1 means change sc to -1 U.compareAndSwapInt( this , SIZECTL, sc, -1 ); Copy code
  • If CAS is modified

    sizeCtl = -1
    The thread that succeeded in the operation can then execute the logic of initializing the hash list. The thread that failed to modify the CAS will continue to spin check here until the initialization of the scattered link list is completed.

  • Here the thread that failed CAS will follow the following logic, that is, the spinning thread will pass

    Thread.yield();
    Method, release the CPU resources for a short time, and give the CPU resources to hungry threads to use. The purpose is to reduce the consumption of spin on CPU performance.

After initializing the hash table, when map.sizeCtl> 0, what does it mean?

  • sizeCtl> 0
    , It means the next threshold for triggering expansion after initialization or expansion is completed .
  • such as,
    sizeCtl = 12
    When the number of data in the scattered linked list
    >=12
    When the time, the expansion operation will be triggered.

When sizeCtl <0 && sizeCtl != -1, what does it mean?

  • sizeCtl <0 && sizeCtl != -1
    , It means that the current scattered linked list is in the state of expansion .
  • -(1 + nThreads)
    , Which means that n threads are expanding together.
  • At this moment,
    sizeCtl
    The high 16 bits represent the expansion identification stamp , and the low 16 bits represent the number of threads participating in concurrent expansion :
    1 + nThread
    , That is, the number of threads currently participating in concurrent expansion is
    n
    A.

5. Could you please talk about the function of the expansion mark and its calculation method?

  • According to the length of the old watch
    tab.length
    Get the unique identification stamp for expansion.
  • Assuming 16 -> 32 expansion in this way, the function of the expansion flag is that under the condition of multi-threaded concurrent expansion, all threads with 16 -> 32 expansion can participate in concurrent expansion.
//The fixed value is 16, which is related to expansion. When calculating expansion, an expansion identification stamp will be generated according to the attribute value private static int RESIZE_STAMP_BITS = 16 ; /** * When the table array is expanded, an expansion identification stamp is calculated. When concurrent expansion is required, the current thread must obtain the expansion identification stamp to participate in the expansion: */ static final int resizeStamp ( int n) { //RESIZE_STAMP_BITS: a fixed value of 16, related to expansion. When calculating expansion, an expansion identification stamp will be generated according to the attribute value return Integer.numberOfLeadingZeros(n) | ( 1 << (RESIZE_STAMP_BITS - . 1 )); } Copy code
  • sizeCtl <0 && sizeCtl != -1
    At this time
    sizeCtl
    The high 16 bits represent the expansion identification stamp , and the low 16 bits represent the number of threads participating in concurrent expansion :
    1 + nThread
    , That is, the number of threads currently participating in concurrent expansion is
    n
    A.

6. How does ConcurrentHashMap ensure the thread safety of writing data?

This question is actually about how to add data to ConcurrentHashMap to ensure thread safety is achieved.

  • ConcurrentHashMap internally uses locks ( spin lock + CAS + synchronized + segment lock ) to ensure thread safety.

The specific process of adding data is as follows:

  • 1. determine whether the hash list has been initialized. If it is not initialized, initialize the hash list first, and then perform the write operation.

  • When writing data to the bucket, first determine whether the bucket is empty. If it is an empty bucket, write the newly added data node into the bucket directly through CAS . If the CAS write fails, it means that other threads have already written data in the current bucket position, and the current thread has failed to compete, return to the spin position and spin waiting.

  • If the current bucket is not empty, you need to determine the type of the head node in the current bucket:

    • If the hash value of the head node in the bucket is
      -1
      , Which means that the head node of the current bucket position is the FWD node, and the scattered linked list is currently in the process of capacity expansion. At this time, the current thread needs to assist in the expansion.
    • If the conditions of and are not met, it means that the storage of the current bucket position may be a linked list, or it may be the proxy object TreeBin of the red-black tree . In this case, synchronized will be used to lock the head node in the bucket to ensure that the write operation in the bucket is thread-safe.

7. Describe the hash addressing algorithm in ConcurrentHashMap

The addressing algorithm of ConcurrentHashMap is not much different from HashMap:

  • 1. get its HashCode value through the Key of the Node node, and then pass the HashCode value

    spread(int h)
    The method performs a detour operation, and then obtains the final Hash value.

    /** * Algorithm for calculating Node node hash value: parameter h is the hash value * eg: * h binary is --> 1100 0011 1010 0101 0001 1100 0001 1110 * (h >>> 16) --> 0000 0000 0000 0000 1100 0011 1010 0101 * (h ^ (h >>> 16)) --> 1100 0011 1010 0101 1101 1111 1011 1011 * Note: (h ^ (h >>> 16)) The purpose is to allow the high 16 bits of h to also participate in the addressing calculation, so that the obtained hash values are more dispersed and reduce the occurrence of hash conflicts~ * ------------------------------------------------- ----------------------------- * HASH_BITS --> 0111 1111 1111 1111 1111 1111 1111 1111 * (h ^ (h >>> 16)) --> 1100 0011 1010 0101 1101 1111 1011 1011 * (h ^ (h >>> 16)) & HASH_BITS --> 0100 0011 1010 0101 1101 1111 1011 1011 * Note: (h ^ (h >>> 16)) The result obtained is & HASH_BITS, the purpose is to make the result of the hash value always a positive number */ static final int spread ( int h) { //Let the original hash value XOR ^ the original hash value to the right by 16 bits, and then with & on HASH_BITS(0x7fffffff: Binary is 31 1) return (h ^ (h >>> 16 )) & HASH_BITS; } Copy code
  • After obtaining the final hash value, use the addressing formula:

    index = (tab.length -1) & hash
    Obtain the subscript of the bucket position.


8. How does ConcurrentHashMap count the amount of data in the current hash table?

ConcurrentHashMap counts the number of stored data through

addCount(long x, int check)
The method is implemented essentially with the help of the LongAdder atomic class. (Reference article: LongAdder source code analysis )

Why does ConcurrentHashMap not use ConcurrentHashMap and why not use AtomicLong to count the amount of hash table data? How about counting the amount of hash table data?

  • Because the AtomicLong atomic class auto-increment operation is implemented based on CAS, the implementation based on CAS will cause a problem, that is, when a large number of threads perform CAS operations at the same time, only one thread can execute successfully, and all other threads will enter spin due to failure. State, spin itself is a
    while(true)
    The cycle is very consuming system resources.

So how does LongAdder ensure that the performance is still efficient under a large amount of concurrency?

First look

LongAdder
Schematic diagram of the operation: (Picture reference from the interviewer asked me LongAdder, I was surprised... )

LongAdder
Use a " segmented " approach to reduce
CAS
The frequency of failure, typically using space for time :

  • LongAdder
    There is a global variable
    volatile long base;
    Value, when the concurrency is not high, it is passed
    CAS
    To operate directly
    base
    Value if
    CAS
    Fails, then for
    LongAdder
    middle
    Cell[]
    In the array
    Cell
    get on
    CAS
    Operation to reduce the probability of failure.

As in the current class

base = 10
, There are three threads for
CAS
Atomic add operation 1 , the thread performs a successful case. 11 = Base , the two threads, the thread three failed starts directed to
Cell[]
In the array
Cell
Add 1 to the element , the same is true
CAS
Operation, at this time the array
index=1
with
index=2
in
Cell
of
value
Both are set to 1.

After the execution is completed, the accumulated data is counted:

sum = 11 + 1 + 1 = 13
,use
LongAdder
The accumulation operation is finished, the flowchart is as follows:

(Picture reference from the interviewer asking me LongAdder, I was shocked... )


9. What are the preprocessing tasks performed by the thread that triggers the expansion condition?

  • First of all, the thread that triggers the expansion condition, the first thing to do is to modify it through CAS

    sizeCtl
    Field value, make the high 16 bits a unique identification stamp for expansion, and the low 16 bits are ( number of threads participating in expansion + 1 ), indicating that there is a thread that is expanding logic!

    • Note : The high 16 -bit expansion unique identification stamp here is calculated based on the length of the current scattered link list, and its highest bit is 1 . Then finally got
      sizeCtl
      It should be a negative number.
  • Then, thread the current condition of the trigger expansion will create a new list of bulk, the size of the original old scattered list of 2 times. And assign the reference of the new scattered list to

    map.nextTable
    Field.

  • And because ConcurrentHashMap supports multi-threaded concurrent expansion, it is necessary to let the threads assisting in the expansion know the progress of the migration of the old bulk list data to the new bulk list. For this ConcurrentHashMap provides a

    transferIndex
    Field, used to record the total migration progress of the old scattered list data! The migration progress starts from the high-level bucket and continues to the bucket where the subscript is 0 .


10. How to mark the barrels in the old loose linked list after migration?

  • For the buckets that have been migrated in the old scattered linked list, a ForwardingNode object needs to be used to represent the nodes in the bucket. This kind of Node is special and is used to indicate that the data in the current bucket has been migrated to the bucket of the new scattered linked list.

What are the functions of ForwardingNode?

  • The ForwardingNode object provides a method for querying the target data in the new scattered linked list.
    find()
    method.
  • When a thread is just querying the target element in the old scattered linked list at this time, it just encounters the FWD node stored in the current bucket , then it can pass the FWD node
    find()
    The method is redirected to the new scattered list to query the target element!

11. How to deal with a new write request when the hash table is being stored?

There are two cases to consider here:

  • If the current thread performs a write operation, the bucket accessed according to the addressing algorithm is not an FWD node (that is, the data in the current bucket has not been migrated). Then first judge whether the bucket is empty, and if it is empty, the data will be written in CAS mode. And if the bucket is not empty, it may be a linked list or TreeBin , this time you need to pass

    synchronized
    The keyword locks the head node of the bucket and then performs the write operation.

  • If the current thread performs a write operation, the bucket accessed according to the addressing algorithm is the FWD node (that is, the data in the current bucket has been migrated).

    • When the FWD node is encountered, it means that the scattered list is being expanded at this time. At this time, the current thread needs to be added to assist the expansion of the scattered list (
      helpTransfer(tab, f);
      The purpose of assisting in expansion is to minimize the time spent on data migration for expansion).
    • After the current thread is added to the expansion assistance, ConcurrentHashMap will be based on the global
      transferIndex
      Field to assign migration tasks to the current thread (need to be responsible for migrating the bucket interval of the old scattered list) For example, let the current thread be responsible for migrating the data in the bucket position [0-4] in the old scattered linked list to the new scattered linked list.
    • Until the current thread is unable to allocate the task to be responsible for the migration, it exits to assist in the expansion, that is, the expansion ends. At this time, the current thread can continue to execute the logic of writing data!

12. During the expansion period, how does the expansion worker thread maintain the lower 16 bits of sizeCtl?

  • Every thread that performs expansion tasks (including assisting expansion), it will be updated before it starts working
    sizeCtl
    The low 16 bits, that allow low 16 Wei
    +1
    , Indicating that a new thread is added to perform expansion.
  • sizeCtl
    16 16
    -1
  • sizeCtl
    16
    -1
    The following value is 1 , indicating that the current thread is the last thread to exit the concurrent expansion.

13. When the linked list in the bucket is upgraded to a red-black tree, and the current red-black tree is being accessed by a reader thread, what should I do if a new write thread request is made?

  • The writing thread will be blocked because the red-black tree is special. Newly written data may trigger the self-balancing of the red-black tree, which will cause the tree structure to change and affect the reading result of the reader thread!

Reading data and writing data on the red-black tree are mutually exclusive. The specific principle analysis is as follows:

  • We know that the red and black trees in ConcurrentHashMap are represented by TreeBin , and there is an Int type inside TreeBin
    state
    Field.
  • When the reader thread is reading data, it will use CAS to change
    state
    value
    +4
    (Indicating that a read lock is added), after reading the data, use CAS to
    state
    value
    -4
    .
  • If the writer thread writes data to the red-black tree, it will first check
    state
    Is the value equal to
    0
    , If it is 0 , it means that no reader thread is retrieving data. At this time, data can be written directly, and the writing thread will also use CAS.
    state
    The field value is set to
    1
    (Indicating write lock is added).
  • If the writer thread checks
    state
    Value is not
    0
    , This time it will
    park()
    Suspend the current thread so that it is waiting to be awakened. When suspending the writer thread, the writer thread will first
    state
    The second bit of the value is set to 1 (binary is
    10
    ), converted to decimal is 2, which means there is a writing thread waiting to be awakened.

Conversely, when there are writer threads on the red-black tree that are performing write operations, what should be done if there is a new reader thread request?

  • A linked list structure is retained inside the TreeBin object, which is designed for this situation. At this time, the new reader thread is allowed to access the data on the linked list without going through the red-black tree.

14. After suspending the waiting writing thread, when will it be awakened and then continue to perform the writing operation?

  • In the previous question, we analyzed: When the reader thread is reading data, it will use CAS to change
    state
    value
    +4
    (Indicating that a read lock is added), after reading the data, use CAS to
    state
    value
    -4
    .
  • when
    state
    After the value is subtracted by 4 , the reader thread will check it first
    state
    Is the value 2 if it is 2 , it means that there is a writing thread waiting to be awakened in the suspension waiting, this time call
    unsafe.unpark()
    Method to wake up the writing thread.

15. Could you please tell me about the lastRun mechanism in ConcurrentHashMap?


The summarized interview questions are also time-consuming. The articles will be updated from time to time, sometimes a few more a day. If you help you review and consolidate your knowledge points, please support three consecutive times, and you will be updated with billions of points in the future!


In order to help more Xiaobai advanced Java engineers from zero, a set of "Java Engineers Learning and Growth Knowledge Graph" was obtained from the CSDN official side, size

870mm x 560mm
, After unfolding, it is the size of a desk, or it can be folded into the size of a book. Interested friends can learn about it. Of course, bloggers articles are always free anyway~