How to use HashMap thread-safely in Java?
Why is HashMap not thread-safe, and how should it be used correctly?
Reference: http://yemengying.com/2016/05/07/threadsafe-hashmap/#SynchronizedMap
I. Why HashMap is not thread-safe
(1) Problems that can occur with HashMap in JDK 1.7
I do not have JDK 1.7 at hand, so this part only examines the possible problems from the source code; there is no code that actually reproduces the errors.
1. Insertion can lose the element at the head of a bucket
Look at the addEntry method in JDK 1.7:
void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}
Suppose two threads, A and B, call addEntry on the same array slot (the same bucket) at the same time. Both threads first read the current head of that bucket:
Entry<K,V> e = table[bucketIndex];
Then one of the threads (say thread A) writes its new element to the head of the bucket first:
table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
At this point the head of the bucket is already thread A's new element, but thread B does not see this. Working from the stale head it read earlier, B writes its own new element as the head, so thread A's new element is overwritten and lost.
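To make the interleaving concrete, here is a minimal, self-contained sketch (my own simplified model of head insertion, not the JDK source) in which two threads race on a single shared bucket; the final node count usually ends up lower than expected because one thread's write replaces the other's.

import java.util.concurrent.CountDownLatch;

// Simplified model of JDK 1.7-style head insertion (not the real HashMap code):
// both threads read the same head, build their node on top of it, then write it back,
// so the later write silently discards the node the other thread just inserted.
public class HeadInsertRaceSketch {

    static class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    static Node head; // a single shared bucket

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        Runnable insert = () -> {
            try { start.await(); } catch (InterruptedException ignored) { }
            for (int i = 0; i < 100000; i++) {
                Node observed = head;           // step 1: read the current head (like table[bucketIndex])
                head = new Node(i, observed);   // step 2: write the new head; may overwrite a concurrent write
            }
        };
        Thread a = new Thread(insert);
        Thread b = new Thread(insert);
        a.start();
        b.start();
        start.countDown();
        a.join();
        b.join();

        int count = 0;
        for (Node n = head; n != null; n = n.next) {
            count++;
        }
        // With two threads inserting 100000 nodes each, anything below 200000 means lost inserts.
        System.out.println("expected 200000, actual " + count);
    }
}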
2. Removal can overwrite other updates
Look at the removeEntryForKey method in JDK 1.7:
final Entry<K,V> removeEntryForKey(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    int i = indexFor(hash, table.length);
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;

    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++;
            size--;
            if (prev == e)
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }

    return e;
}
The basic idea is to locate the entry to be removed and then re-link the list so that the following entry takes its place, keeping the chain intact.
The problem: if two threads operate on the same HashMap at the same time, one inserting and the other removing, and both touch the same bucket, the thread that writes the bucket later can overwrite the earlier thread's re-linking, so one of the two operations is silently lost.
3. Resizing can lose elements
Look at the resize method in JDK 1.7:
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
When the table becomes too small for the number of entries, HashMap calls resize to grow the array.
The problem: if one thread triggers a resize while inserting and another thread is modifying the map at the same time, their writes can again overwrite each other and entries can be lost.
4. Infinite loop
HashMap's resize can also cause an infinite loop. Roughly speaking, entries are moved to the new table with head insertion, which reverses the order of each bucket; if one thread is suspended after reading an entry and its next pointer while another thread completes the whole transfer, the re-linking done when the first thread resumes can leave two entries pointing at each other. Once such a cycle exists, any traversal of that bucket loops forever.
For a detailed analysis of the infinite-loop problem, see: http://www.importnew.com/25070.html
The infinite loop can be triggered by the following code (the problem no longer exists in JDK 1.8):
testHashMapLoop.java
package com.test.testHashMapLoop;

import java.util.HashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class testHashMapLoop {

    public static void main(String[] args) {
        // Start five threads that all put into the same HashMap concurrently.
        HashMapThread hashMapThread1 = new HashMapThread();
        HashMapThread hashMapThread2 = new HashMapThread();
        HashMapThread hashMapThread3 = new HashMapThread();
        HashMapThread hashMapThread4 = new HashMapThread();
        HashMapThread hashMapThread5 = new HashMapThread();
        hashMapThread1.start();
        hashMapThread2.start();
        hashMapThread3.start();
        hashMapThread4.start();
        hashMapThread5.start();
    }
}

class HashMapThread extends Thread {

    private static AtomicInteger atomicInteger = new AtomicInteger(0);
    // Initial capacity of 1 forces frequent resizing.
    private static HashMap<Integer, Integer> hashMap = new HashMap<Integer, Integer>(1);

    public void run() {
        while (atomicInteger.get() < 100000) {
            hashMap.put(atomicInteger.get(), atomicInteger.get());
            atomicInteger.incrementAndGet();
        }
    }
}
The symptom is that the process keeps running and has to be terminated manually.
You can locate the problem with jps (to find the process id) and jstack (to dump the thread stacks, which will show the threads spinning inside HashMap).
(2) Problems that can occur with HashMap in JDK 1.8
Let's experiment and see whether the problems above still exist in JDK 1.8.
1. Lost elements from concurrent insertion and resizing
With few elements the concurrency problem is hard to reproduce, and with many elements HashMap will resize anyway, so the two issues are hard to test separately; they are tested together here.
TestPut.java
package testHashMap;

import java.util.HashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class TestPut {

    public static void main(String[] args) {
        // Pre-populate the map with 10000 entries.
        HashMap<Integer, String> hashMap = new HashMap<>();
        for (int i = 0; i < 10000; i++) {
            hashMap.put(i, String.valueOf(i));
        }
        System.out.println(hashMap.size());

        // Three tasks insert 10000 entries each, concurrently.
        Phaser phaser = new Phaser(3);
        ExecutorService executorService = Executors.newCachedThreadPool();
        PutTask putTask1 = new PutTask(phaser, hashMap, 10000, 20000);
        PutTask putTask2 = new PutTask(phaser, hashMap, 20000, 30000);
        PutTask putTask3 = new PutTask(phaser, hashMap, 30000, 40000);
        executorService.execute(putTask1);
        executorService.execute(putTask2);
        executorService.execute(putTask3);

        phaser.awaitAdvance(phaser.getPhase());
        System.out.println("all thread had arrived");
        System.out.println(hashMap.size());
        executorService.shutdown();
    }
}

class PutTask implements Runnable {

    private Phaser phaser;
    private HashMap<Integer, String> hashMap;
    private int start;
    private int end;

    public PutTask(Phaser phaser, HashMap<Integer, String> hashMap, int start, int end) {
        this.phaser = phaser;
        this.hashMap = hashMap;
        this.start = start;
        this.end = end;
    }

    @Override
    public void run() {
        for (int i = start; i < end; i++) {
            hashMap.put(i, String.valueOf(i));
        }
        System.out.println(Thread.currentThread().getName() + " has arrived");
        phaser.arrive();
    }
}
The result is:
10000
pool-1-thread-1 has arrived
pool-1-thread-2 has arrived
pool-1-thread-3 has arrived
all thread had arrived
39258
The expected result is 10000 + 10000 * 3 = 40000, so insertion clearly suffers from a concurrency problem.
2. Removal can overwrite other updates
TestPut.java
package testHashMap;

import java.util.HashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class TestPut {

    public static void main(String[] args) {
        // Pre-populate the map with 10000 entries.
        HashMap<Integer, String> hashMap = new HashMap<>();
        for (int i = 0; i < 10000; i++) {
            hashMap.put(i, String.valueOf(i));
        }
        System.out.println(hashMap.size());

        // One task inserts 10000 entries while the other removes 5000 of them.
        Phaser phaser = new Phaser(2);
        ExecutorService executorService = Executors.newCachedThreadPool();
        PutTask putTask = new PutTask(phaser, hashMap, 10000, 20000);
        RemoveTask removeTask = new RemoveTask(phaser, hashMap, 15000, 20000);
        executorService.execute(putTask);
        executorService.execute(removeTask);

        phaser.awaitAdvance(phaser.getPhase());
        System.out.println("all thread had arrived");
        System.out.println(hashMap.size());
        executorService.shutdown();
    }
}

class PutTask implements Runnable {

    private Phaser phaser;
    private HashMap<Integer, String> hashMap;
    private int start;
    private int end;

    public PutTask(Phaser phaser, HashMap<Integer, String> hashMap, int start, int end) {
        this.phaser = phaser;
        this.hashMap = hashMap;
        this.start = start;
        this.end = end;
    }

    @Override
    public void run() {
        for (int i = start; i < end; i++) {
            hashMap.put(i, String.valueOf(i));
        }
        System.out.println(Thread.currentThread().getName() + " has arrived");
        phaser.arrive();
    }
}

class RemoveTask implements Runnable {

    private Phaser phaser;
    private HashMap<Integer, String> hashMap;
    private int start;
    private int end;

    public RemoveTask(Phaser phaser, HashMap<Integer, String> hashMap, int start, int end) {
        this.phaser = phaser;
        this.hashMap = hashMap;
        this.start = start;
        this.end = end;
    }

    @Override
    public void run() {
        for (int i = start; i < end; i++) {
            hashMap.remove(i);
        }
        System.out.println(Thread.currentThread().getName() + " has arrived");
        phaser.arrive();
    }
}
The result is:
10000
pool-1-thread-1 has arrived
pool-1-thread-2 has arrived
all thread had arrived
14990
The expected result is 10000 + 10000 - 5000 = 15000, so removal also clearly suffers from a concurrency problem.
II. How to use HashMap thread-safely
From the above we already know that HashMap is not thread-safe, so under concurrency we need a different solution.
Speaking of this, I remember that a long time ago, when I was implementing a captcha feature, I used a plain HashMap directly. Only later, when something felt off, did I switch to Redis. Looking back, that was a really poor decision.
(1) Solutions
I know of the following three solutions:
1. Hashtable
Hashtable achieves thread safety by marking its methods synchronized.
Look at Hashtable's get method:
@SuppressWarnings("unchecked")
public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}
Hashtable guarantees thread safety by putting a synchronized lock on every method.
But as we all know, these synchronized methods serialize all access on a single lock and perform poorly under concurrency, so Hashtable is rarely used nowadays.
2. ConcurrentHashMap
In JDK 1.7, ConcurrentHashMap achieves thread safety with segment locks; JDK 1.8 abandoned segment locks and guarantees thread safety by other means. I will study this in a separate article. Under concurrency, ConcurrentHashMap is a good solution.
Rewrite TestPut.java:
package testHashMap;

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class TestPut {

    public static void main(String[] args) {
        // Pre-populate the map with 10000 entries.
        ConcurrentHashMap<Integer, String> concurrentHashMap = new ConcurrentHashMap<>();
        for (int i = 0; i < 10000; i++) {
            concurrentHashMap.put(i, String.valueOf(i));
        }
        System.out.println(concurrentHashMap.size());

        // One task inserts 10000 entries while the other removes 5000 of them.
        Phaser phaser = new Phaser(2);
        ExecutorService executorService = Executors.newCachedThreadPool();
        PutTask putTask = new PutTask(phaser, concurrentHashMap, 10000, 20000);
        RemoveTask removeTask = new RemoveTask(phaser, concurrentHashMap, 15000, 20000);
        executorService.execute(putTask);
        executorService.execute(removeTask);

        phaser.awaitAdvance(phaser.getPhase());
        System.out.println("all thread had arrived");
        System.out.println(concurrentHashMap.size());
        executorService.shutdown();
    }
}

class PutTask implements Runnable {

    private Phaser phaser;
    private ConcurrentHashMap<Integer, String> concurrentHashMap;
    private int start;
    private int end;

    public PutTask(Phaser phaser, ConcurrentHashMap<Integer, String> concurrentHashMap, int start, int end) {
        this.phaser = phaser;
        this.concurrentHashMap = concurrentHashMap;
        this.start = start;
        this.end = end;
    }

    @Override
    public void run() {
        for (int i = start; i < end; i++) {
            concurrentHashMap.put(i, String.valueOf(i));
        }
        System.out.println(Thread.currentThread().getName() + " has arrived");
        phaser.arrive();
    }
}

class RemoveTask implements Runnable {

    private Phaser phaser;
    private ConcurrentHashMap<Integer, String> concurrentHashMap;
    private int start;
    private int end;

    public RemoveTask(Phaser phaser, ConcurrentHashMap<Integer, String> concurrentHashMap, int start, int end) {
        this.phaser = phaser;
        this.concurrentHashMap = concurrentHashMap;
        this.start = start;
        this.end = end;
    }

    @Override
    public void run() {
        for (int i = start; i < end; i++) {
            concurrentHashMap.remove(i);
        }
        System.out.println(Thread.currentThread().getName() + " has arrived");
        phaser.arrive();
    }
}
The result is:
10000
pool-1-thread-2 has arrived
pool-1-thread-1 has arrived
all thread had arrived
20000
3. SynchronizedMap
A synchronized wrapper map can be obtained from the synchronizedMap method of the Collections class:
public static <K,V> Map<K,V> synchronizedMap(Map<K,V> m) {
    return new SynchronizedMap<>(m);
}
The concrete implementation is:
private static class SynchronizedMap<K,V>
    implements Map<K,V>, Serializable {
    private static final long serialVersionUID = 1978198479659022715L;

    private final Map<K,V> m;     // Backing Map
    final Object mutex;           // Object on which to synchronize

    SynchronizedMap(Map<K,V> m) {
        this.m = Objects.requireNonNull(m);
        mutex = this;
    }

    SynchronizedMap(Map<K,V> m, Object mutex) {
        this.m = m;
        this.mutex = mutex;
    }

    ......
}
It can be used like this:
Map<String, Integer> synchronizedMap = Collections.synchronizedMap(new HashMap<String, Integer>());
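Note that, as the Collections.synchronizedMap documentation points out, the wrapper only synchronizes individual calls; iterating over the returned map must be done while manually synchronizing on the map. A minimal sketch of the iteration idiom:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapIteration {

    public static void main(String[] args) {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<String, Integer>());
        map.put("a", 1);
        map.put("b", 2);

        // Single calls such as get/put are synchronized by the wrapper, but iteration is a
        // sequence of calls, so the map's lock must be held for the whole traversal to avoid
        // ConcurrentModificationException or inconsistent reads.
        synchronized (map) {
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                System.out.println(entry.getKey() + " = " + entry.getValue());
            }
        }
    }
}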
(2) Performance comparison
Write a test class, TestPerformance.java, to compare them:
package testHashMap;

import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TestPerformance {

    public final static int THREAD_POOL_SIZE = 5;

    public static Map<String, Integer> HashTable = null;
    public static Map<String, Integer> ConcurrentHashMap = null;
    public static Map<String, Integer> SynchronizedMap = null;

    public static void main(String[] args) throws InterruptedException {
        HashTable = new Hashtable<>();
        test(HashTable);

        ConcurrentHashMap = new ConcurrentHashMap<>();
        test(ConcurrentHashMap);

        SynchronizedMap = Collections.synchronizedMap(new HashMap<String, Integer>());
        test(SynchronizedMap);
    }

    public static void test(final Map<String, Integer> map) throws InterruptedException {
        System.out.println("Test started for: " + map.getClass());
        long averageTime = 0;
        // Run 5 rounds; in each round, 5 threads each perform 500K random get+put operations.
        for (int i = 0; i < 5; i++) {
            long startTime = System.nanoTime();
            ExecutorService executorService = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
            for (int j = 0; j < THREAD_POOL_SIZE; j++) {
                executorService.execute(new Runnable() {
                    @SuppressWarnings("unused")
                    @Override
                    public void run() {
                        for (int i = 0; i < 500000; i++) {
                            Integer integer = (int) Math.ceil(Math.random() * 550000);
                            Integer crunchifyValue = map.get(String.valueOf(integer));
                            map.put(String.valueOf(integer), integer);
                        }
                    }
                });
            }
            executorService.shutdown();
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
            long entTime = System.nanoTime();
            long totalTime = (entTime - startTime) / 1000000L;
            averageTime += totalTime;
            System.out.println("2500K entried added/retrieved in " + totalTime + " ms");
        }
        System.out.println("For " + map.getClass() + " the average time is " + averageTime / 5 + " ms\n");
    }
}
The result is:
Test started for: class java.util.Hashtable
2500K entried added/retrieved in 1836 ms
2500K entried added/retrieved in 1746 ms
2500K entried added/retrieved in 1793 ms
2500K entried added/retrieved in 1758 ms
2500K entried added/retrieved in 1684 ms
For class java.util.Hashtable the average time is 1763 ms

Test started for: class java.util.concurrent.ConcurrentHashMap
2500K entried added/retrieved in 2292 ms
2500K entried added/retrieved in 609 ms
2500K entried added/retrieved in 669 ms
2500K entried added/retrieved in 587 ms
2500K entried added/retrieved in 655 ms
For class java.util.concurrent.ConcurrentHashMap the average time is 962 ms

Test started for: class java.util.Collections$SynchronizedMap
2500K entried added/retrieved in 1910 ms
2500K entried added/retrieved in 1628 ms
2500K entried added/retrieved in 2794 ms
2500K entried added/retrieved in 1728 ms
2500K entried added/retrieved in 1638 ms
For class java.util.Collections$SynchronizedMap the average time is 1939 ms
In terms of performance, ConcurrentHashMap is the better choice.
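One usage note (my own addition, not part of the benchmark above): ConcurrentHashMap makes each individual call thread-safe, but a check-then-act sequence written as separate get and put calls is still a race; atomic methods such as putIfAbsent or computeIfAbsent should be used instead. A minimal sketch:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentCounter {

    private static final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Racy version: two threads can both see null and both put a fresh counter,
    // so one thread's increments can be lost.
    static void incrementRacy(String key) {
        AtomicInteger counter = counters.get(key);
        if (counter == null) {
            counter = new AtomicInteger();
            counters.put(key, counter);
        }
        counter.incrementAndGet();
    }

    // Safe version: computeIfAbsent creates the counter atomically per key.
    static void incrementSafe(String key) {
        counters.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
    }

    public static void main(String[] args) {
        incrementSafe("requests");
        incrementSafe("requests");
        System.out.println(counters.get("requests").get()); // prints 2
    }
}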
III. Summary
Next time I will analyze ConcurrentHashMap.