HashMap
与 LinkedHashMap
在通过 values()
函数遍历时是否存在性能差异?
HashMap
与 LinkedHashMap
在通过 values()
函数遍历时是否存在性能差异?
Iterator
的nextEntry
实现更优,因此LinkedHashMap
在遍历方面必须更快。values
实现开始逐步分析。 HashMap
的values
实现如下:public Collection<V> values() {
Collection<V> vs = values;
return (vs != null ? vs : (values = new Values()));
}
LinkedHashMap
是 HashMap
的一个扩展,并继承了相同的实现。Values
的 Iterator
实现。HashMap
,它继承自 java.util.HashMap.HashIterator
。private final class ValueIterator extends HashIterator<V> {
public V next() {
return nextEntry().value;
}
}
对于LinkedHashMap
而言,它是从java.util.LinkedHashMap.LinkedHashIterator
继承而来的。
private class ValueIterator extends LinkedHashIterator<V> {
public V next() { return nextEntry().value; }
}
因此,本质上的区别在于nextEntry
的实现。
对于LinkedHashMap
,只需调用e.after,其中e是Entry
,
但对于HashMap
,需要遍历Entry[]
数组以查找下一个元素。
更新:HashMap
中nextEntry()
的代码如下:
final Entry<K,V> nextEntry() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
Entry<K,V> e = next;
if (e == null)
throw new NoSuchElementException();
if ((next = e.next) == null) {
Entry[] t = table;
while (index < t.length && (next = t[index++]) == null)
;
}
current = e;
return e;
}
Entry[] 不是一个连续的存储空间(其中可能有空值)。如果您查看上面的代码,它所做的就是将next指向current,并通过迭代Entry[]找到下一个next。
但是我认为这种性能提升将以插入成本为代价。请查看两个类中的addEntry方法作为练习。
我写了一个小型性能分析程序,创建了100万个键(整数)和Boolean.TRUE进行比较,重复执行100次。发现如下结果:
HashMap:-
Create: 3.7sec
Iterate: 1.1sec
Access: 1.5sec
Total: 6.2sec
LinkedHashMap:-
Create: 4.7sec (30% slower)
Iterate: 0.5sec (50% faster)
Access: 0.8sec (50% faster)
Total : 6.0sec
虽然垃圾回收没有执行,因此在一定程度上会影响数字的准确性,但我认为 LinkedHashMap 比 HashMap 更优秀,在未来的代码中我会使用它。
LinkedHashMap
更有意义,但我不清楚访问速度如何更快。对我来说,LinkedHashMap
的get(Object key)
函数与HashMap
的get(Object key)
函数本质上执行相同的操作,只不过在返回结果之前还会检查一个afterNodeAccess()
函数,这样做怎么可能会有更快的性能表现呢? - MarcusLinkedHashMap
将某个任务的处理时间从 25 秒缩短到了 8 秒!我为每个插入的条目遍历整个映射表(以进行一些特定于值的关系修复),因此时间复杂度为 O(N^2)。(86M 次 nextNode()
调用,哈希表大小为 1M,约有 13K 个元素) - Mark Jeronimus基本上不太重要,问题是:你需要什么。如果元素的顺序很重要,你就需要使用 LinkedHashMap
。否则,你就不需要它,可以使用 HashMap
。
最好的建议是“不要害怕尝试”,但我相信它们非常相似。获取设置值的getter为O(1),每个迭代步骤也是如此。遍历链表与遍历哈希桶一样简单,可能略微偏向于链表。
我在一个单元测试中尝试了10000次迭代values(),结果是806毫秒对比902毫秒。两者几乎相同。
是的,在所有遍历HashMap
和LinkedHashMap
上都存在相同的性能差异: HashMap
将花费与条目数量和哈希表大小成比例的时间,而LinkedHashMap
只需要花费与条目数量成比例的时间。
Code...
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
public class MapTest
{
public static void main(String[] args)
{
int iterations = 1000000;
long time1, time2, time3;
System.nanoTime(); //Just to make sure any possible overhead is done...though there shouldn't really be any
int sequential[] = new int[iterations]; //Counting up from 0
int random[] = new int[iterations]; //Same set of values, but randomized (no duplicates)
HashSet<Integer> addedRandoms = new HashSet<>();
for (int i = 0; i < iterations; i++)
{
sequential[i] = i;
int randomVal = random(iterations);
while (addedRandoms.contains(randomVal)) randomVal = random(iterations); //Get another random instead of sequentially finding another unused value, to prevent clumping
addedRandoms.add(randomVal);
random[i] = random(iterations);
}
HashMap<Integer, Integer> hashMap = new HashMap<>();
LinkedHashMap<Integer, Integer> linkedHashMap = new LinkedHashMap<>();
int gcRuns = 0, prevGCRuns;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
prevGCRuns = gcRuns;
//Test
time1 = System.nanoTime();
for (int i : sequential) hashMap.put(i, 0);
time2 = System.nanoTime();
for (int i : sequential) linkedHashMap.put(i, 0);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), no overwrites", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : random) hashMap.put(i, 0);
time2 = System.nanoTime();
for (int i : random) linkedHashMap.put(i, 0);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), all overwrites (exactly one per entry, random order)", time1, time2, time3, prevGCRuns);
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (int i : sequential) hashMap.put(i, 0);
time2 = System.nanoTime();
for (int i : sequential) linkedHashMap.put(i, 0);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Put: sequential key (from 0 to " + (iterations - 1) + "), all overwrites (exactly one per entry, sequential order)", time1, time2, time3, prevGCRuns);
//Empty maps
hashMap = new HashMap<>();
linkedHashMap = new LinkedHashMap<>();
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (int i : random) hashMap.put(i, 0);
time2 = System.nanoTime();
for (int i : random) linkedHashMap.put(i, 0);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Put: random key (between 0 and " + (iterations - 1) + " inclusive), no overwrites", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : sequential) hashMap.get(i);
time2 = System.nanoTime();
for (int i : sequential) linkedHashMap.get(i);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Sequential get, randomized internal keys", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : random) hashMap.get(i);
time2 = System.nanoTime();
for (int i : random) linkedHashMap.get(i);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Random get, randomized internal keys", time1, time2, time3, prevGCRuns);
//Set sequential keys
hashMap = new HashMap<>();
linkedHashMap = new LinkedHashMap<>();
for (int i : sequential)
{
hashMap.put(i, 0);
linkedHashMap.put(i, 0);
}
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (int i : sequential) hashMap.get(i);
time2 = System.nanoTime();
for (int i : sequential) linkedHashMap.get(i);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Sequential get, sequential internal keys", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : random) hashMap.get(i);
time2 = System.nanoTime();
for (int i : random) linkedHashMap.get(i);
time3 = System.nanoTime();
prevGCRuns = printAndReset("Random get, sequential internal keys", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : hashMap.values()) ;
time2 = System.nanoTime();
for (int i : linkedHashMap.values()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("values() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);
//Set random keys
hashMap = new HashMap<>();
linkedHashMap = new LinkedHashMap<>();
for (int i : random)
{
hashMap.put(i, 0);
linkedHashMap.put(i, 0);
}
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (int i : hashMap.values()) ;
time2 = System.nanoTime();
for (int i : linkedHashMap.values()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("values() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (int i : hashMap.keySet()) ;
time2 = System.nanoTime();
for (int i : linkedHashMap.keySet()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("keySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);
//Set sequential keys
hashMap = new HashMap<>();
linkedHashMap = new LinkedHashMap<>();
for (int i : sequential)
{
hashMap.put(i, 0);
linkedHashMap.put(i, 0);
}
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (int i : hashMap.keySet()) ;
time2 = System.nanoTime();
for (int i : linkedHashMap.keySet()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("keySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);
//Test
time1 = System.nanoTime();
for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
time2 = System.nanoTime();
for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("entrySet() iteration, sequential internal keys", time1, time2, time3, prevGCRuns);
//Set random keys
hashMap = new HashMap<>();
linkedHashMap = new LinkedHashMap<>();
for (int i : random)
{
hashMap.put(i, 0);
linkedHashMap.put(i, 0);
}
//Attempt GC
System.gc();
prevGCRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) prevGCRuns += gcBean.getCollectionCount();
//Test
time1 = System.nanoTime();
for (Map.Entry<Integer, Integer> entry : hashMap.entrySet()) ;
time2 = System.nanoTime();
for (Map.Entry<Integer, Integer> entry : linkedHashMap.entrySet()) ;
time3 = System.nanoTime();
prevGCRuns = printAndReset("entrySet() iteration, randomized internal keys", time1, time2, time3, prevGCRuns);
}
protected static int printAndReset(String description, long time1, long time2, long time3, int prevGCRuns)
{
System.out.println(description);
System.out.println("HashMap: " + (time2 - time1) + " nanos");
System.out.println("LinkedHashMap: " + (time3 - time2) + " nanos");
int gcRuns = 0;
for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) gcRuns += gcBean.getCollectionCount();
System.out.println("GC during test: " + (gcRuns != prevGCRuns));
System.out.println();
return gcRuns;
}
public static int random(int maxvalue)
{
return (int) ((double) maxvalue * Math.random());
}
}
输出...
Put: sequential key (from 0 to 999999), no overwrites
HashMap: 30190070 nanos
LinkedHashMap: 44618672 nanos
GC during test: false
Put: random key (between 0 and 999999 inclusive), all overwrites (exactly one per entry, random order)
HashMap: 118536111 nanos
LinkedHashMap: 110828524 nanos
GC during test: false
Put: sequential key (from 0 to 999999), all overwrites (exactly one per entry, sequential order)
HashMap: 25070771 nanos
LinkedHashMap: 23569874 nanos
GC during test: false
Put: random key (between 0 and 999999 inclusive), no overwrites
HashMap: 93353708 nanos
LinkedHashMap: 106686445 nanos
GC during test: false
Sequential get, randomized internal keys
HashMap: 38817600 nanos
LinkedHashMap: 39499373 nanos
GC during test: false
Random get, randomized internal keys
HashMap: 42636179 nanos
LinkedHashMap: 51062653 nanos
GC during test: false
Sequential get, sequential internal keys
HashMap: 19986540 nanos
LinkedHashMap: 19502021 nanos
GC during test: false
Random get, sequential internal keys
HashMap: 58083205 nanos
LinkedHashMap: 59547574 nanos
GC during test: false
values() iteration, sequential internal keys
HashMap: 21284921 nanos
LinkedHashMap: 18383069 nanos
GC during test: false
values() iteration, randomized internal keys
HashMap: 19616868 nanos
LinkedHashMap: 15392964 nanos
GC during test: false
keySet() iteration, randomized internal keys
HashMap: 18054895 nanos
LinkedHashMap: 16067725 nanos
GC during test: false
keySet() iteration, sequential internal keys
HashMap: 18764430 nanos
LinkedHashMap: 18604873 nanos
GC during test: false
entrySet() iteration, sequential internal keys
HashMap: 18493825 nanos
LinkedHashMap: 18067752 nanos
GC during test: false
entrySet() iteration, randomized internal keys
HashMap: 16252707 nanos
LinkedHashMap: 13175517 nanos
GC during test: false