The main difference is that elements need not be converted to interface{}'s for pushing and popping. This reduces the overhead of convertion from value to interface and vice verser. The trick is to use the last element as the in/out place so the interface doesn't need to touch the element. Push, Pop and Remove are replaced with PushLast, PopToLast and RemoveToLast, respectively.
An example of a heap with integers is like this:
type IntHeap []int
func (h *IntHeap) Pop() int {
heap.PopToLast((*sort.IntSlice)(h))
res := (*h)[len(*h) - 1]
*h = (*h)[:len(*h) - 1]
return res
}
func (h *IntHeap) Push(x int) {
*h = append(*h, x)
heap.PushLast((*sort.IntSlice)(h))
}
Pushing and poping the elements are done by calling the corresponding methods in the type IntHeap other than calling heap.Push(h, el). The code looks much clearer for me.
type IntHeap []int
func (h *IntHeap) Pop() int {
heap.PopToLast((*sort.IntSlice)(h))
res := (*h)[len(*h) - 1]
*h = (*h)[:len(*h) - 1]
return res
}
func (h *IntHeap) Push(x int) {
*h = append(*h, x)
heap.PushLast((*sort.IntSlice)(h))
}
Pushing and poping the elements are done by calling the corresponding methods in the type IntHeap other than calling heap.Push(h, el). The code looks much clearer for me.
In the case where the element is a struct, the benchmark shows about 10~15% of performance increase by removing the convertion:
BenchmarkDataHeap_Plus 100 10541648 ns/op
BenchmarkDataHeap_Pkg 100 11921974 ns/op
where each element is of the type:
type Data struct {
Value string
Priority int
}
The closure (func) version of functions are also defined as PushLastF, PopToLast and RemoveToLast, respectively. Using them can make the calling faster than using the interface.
The benchmark shows a 20~30% peformance improvement by using closure version vs the builtin heap:
BenchmarkDataHeap_F 200 8875933 ns/op
Heaps of int/string/float64 are predefined with support of customized less function. Here is the benchmark results (similar performance increase):
BenchmarkBuiltinIntHeap 300 5680077 ns/op
BenchmarkPlusIntHeap 300 4049263 ns/op
BenchmarkPlusIntHeap_Less 300 4933297 ns/op