Monday, July 13, 2015

A light and fast type for serializing to a byte slice in Go

Sometimes we need to serialize some data structure and form a byte slice ([]byte). The builtin library provides a type in bytes package named Buffer. While in GolangPlus, a type named ByteSlice can be an better alternative in package.

Actually, ByteSlice is nothing but a renamed []byte, i.e. simply

  type ByteSlice []byte

while a Buffer contains much more fields:

  type Buffer struct {
    buf []byte
    off int
    runeBytes [utf8.UTFMax]byte
    bootstrap [64]byte
    lastRead  readOp

which means much more memory usage. In some situation, this could be a problem.

Preparing a ByteSlice is much lighter than preparing a Buffer. Here is the benchmark for serializing a 10-byte data:

  BenchmarkByteSliceWrite10       20000000 101 ns/op
  BenchmarkBytesBufferWrite10_New  3000000 460 ns/op
  BenchmarkBytesBufferWrite10_Def  3000000 474 ns/op

BenchmarkBytesBufferWrite10_New initializes the buffer with bytes.NewBuffer and a 10-byte-long byte slice and BenchmarkBytesBufferWrite10_Def just defines a zero value bytes.Buffer variable. The more than 4 times advantage of ByteSlice over Buffer is caused by the difference of intializing the object.

Writing to a *ByteSlice is appending to the slice. For example, (*ByteSlice).WriteByte is implemented as follow:

  func (s *ByteSlice) WriteByte(c byte) error {
    *s = append(*s, c)
    return nil

Comparing it with the implementation of Buffer.WriteByte:

  func (b *Buffer) WriteByte(c byte) error {
    b.lastRead = opInvalid
    m := b.grow(1)
    b.buf[m] = c
    return nil

Which is much more complicated. Here is the benchmark showing the efficiency difference:

  BenchmarkByteSliceWrite1k   200000  9971 ns/op
  BenchmarkBytesBufferWrite1k 100000 11933 ns/op

At the same time, Buffer doesn't create the overhead for nothing. It supports UnreadByte and UnreadRune which are not supported by ByteSlice (they do need extra memory to support). But if one doesn't need them, which is most of the case for me, ByteSlice is obvious a better choice.

No comments: