<h1>Yi and Code</h1>
Yi DENG<br />
<h2>A C++ Pitfall I was just caught by</h2>
Posted 2019-08-28<br />
<div style="text-align: justify;">
It has been a long time since C++ last surprised me. I'm not saying my C++ code was totally bug free, but the mistakes were usually caused by some kind of carelessness and could be easily identified, understood and fixed. Until today... So I think I should share it.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Let's say we have a vector of strings:</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;">std::vector<std::string> tokens;</span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Now I iterate over each element of the vector.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;">for (int i = 0; i < tokens.size(); i++) {</span></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;"> ...</span></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;">}</span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The index i is useful within the body of the loop. Everything goes as expected.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Then, for some reason, I want to run the loop body one more time than the number of tokens. So I change the code to:</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;">for (int i = -1; i < tokens.size(); i++) {</span></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;"> ...</span></div>
<div style="text-align: justify;">
<span style="font-family: "courier new" , "courier" , monospace;">}</span></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Is this correct? No. Even worse, it does not crash. The loop is only supposed to generate some signals along with others, and the code was experimental, with no test cases covering it yet. The bug was not found until I happened to carefully check the generated data. I went back and read the code again and again. In the real code the -1 was not a constant but the result of an expression. All of a sudden, I realized the problem.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
What is the problem then? The unsigned integer. For some reason, <span style="font-family: "courier new" , "courier" , monospace;">std::vector::size</span> (like all other sizes in std) returns a <span style="font-family: "courier new" , "courier" , monospace;">size_t</span>, which is an unsigned integer type, while i is a signed integer. In the condition expression <span style="font-family: "courier new" , "courier" , monospace;">i < tokens.size()</span>, C++ first converts i to the unsigned type before comparing the two values. So <span style="font-family: "courier new" , "courier" , monospace;">-1</span> becomes <span style="font-family: "courier new" , "courier" , monospace;">0xFFFFFFFFFFFFFFFF</span> (with a 64-bit <span style="font-family: "courier new" , "courier" , monospace;">size_t</span>), which fails the condition, and the loop body never runs.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
This nearly wasted several days of my time. It makes me miss Go, which requires an explicit conversion in this situation.</div>
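<div style="text-align: justify;">
For comparison, here is a minimal Go sketch of the same loop. The helper names <code>iterations</code> and <code>wrapped</code> are made up for this illustration. In Go, <code>len</code> returns a signed <code>int</code>, and comparing an <code>int</code> with an unsigned value is a compile error, so the wrap-around only happens when the conversion is written out explicitly.</div>

```go
package main

import "fmt"

// iterations counts how many times a loop running from start up to
// (but not including) n executes. Both values are signed ints.
func iterations(start, n int) int {
	count := 0
	for i := start; i < n; i++ {
		count++
	}
	return count
}

// wrapped shows what a signed value becomes after an explicit conversion
// to an unsigned 64-bit integer, mirroring the implicit C++ conversion.
func wrapped(i int64) uint64 {
	return uint64(i)
}

func main() {
	tokens := []string{"a", "b", "c"}

	// len(tokens) is an int, so starting at -1 gives one extra iteration.
	fmt.Println(iterations(-1, len(tokens))) // prints 4

	// The conversion has to be spelled out, which makes the wrap visible.
	fmt.Printf("%#x\n", wrapped(-1)) // prints 0xffffffffffffffff

	// The following would not compile (mismatched types int and uint):
	//   var u uint = 3
	//   i := -1
	//   _ = i < u
}
```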
<h2>A light and fast type for serializing to a byte slice in Go</h2>
Posted 2015-07-13<br />
Sometimes we need to serialize a data structure into a byte slice (<span style="font-family: "courier new" , "courier" , monospace;">[]byte</span>). The standard library provides a type named <span style="font-family: "courier new" , "courier" , monospace;"><a href="http://golang.org/pkg/bytes/#Buffer" target="_blank">Buffer</a></span> in the <span style="font-family: "courier new" , "courier" , monospace;">bytes</span> package. In <a href="https://github.com/golangplus" target="_blank">GolangPlus</a>, a type named <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> in the <span style="font-family: "courier new" , "courier" , monospace;"><a href="http://github.com/golangplus/bytes">github.com/golangplus/bytes</a></span> package can be a better alternative.<br />
<br />
Actually, <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> is nothing but a named type defined on <span style="font-family: "courier new" , "courier" , monospace;">[]byte</span>, i.e. simply<br />
<pre><code>
type ByteSlice []byte
</code></pre>
<br />
while a <span style="font-family: "courier new" , "courier" , monospace;">Buffer</span> contains many more fields:<br />
<pre><code>
type Buffer struct {
	buf       []byte
	off       int
	runeBytes [utf8.UTFMax]byte
	bootstrap [64]byte
	lastRead  readOp
}</code></pre>
<br />
<div>
which means much more memory usage. In some situations, this could be a problem.</div>
<div>
<br /></div>
<div>
Preparing a <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> is much cheaper than preparing a <span style="font-family: "courier new" , "courier" , monospace;">Buffer</span>. Here is the benchmark for serializing 10 bytes of data:</div>
<div>
<br /></div>
<div>
<div>
<pre> BenchmarkByteSliceWrite10 20000000 101 ns/op
BenchmarkBytesBufferWrite10_New 3000000 460 ns/op
BenchmarkBytesBufferWrite10_Def 3000000 474 ns/op
</pre>
</div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">BenchmarkBytesBufferWrite10_New</span> initializes the buffer with <span style="font-family: "courier new" , "courier" , monospace;">bytes.NewBuffer</span> and a 10-byte-long byte slice, and <span style="font-family: "courier new" , "courier" , monospace;">BenchmarkBytesBufferWrite10_Def</span> just defines a zero-value <span style="font-family: "courier new" , "courier" , monospace;">bytes.Buffer</span> variable. The more than fourfold advantage of <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> over <span style="font-family: "courier new" , "courier" , monospace;">Buffer</span> comes from the difference in initializing the object.</div>
<div>
<br /></div>
<div>
Writing to a <span style="font-family: "courier new" , "courier" , monospace;">*ByteSlice</span> simply appends to the slice. For example, <span style="font-family: "courier new" , "courier" , monospace;">(*ByteSlice).WriteByte</span> is implemented as follows:</div>
<pre><code>
func (s *ByteSlice) WriteByte(c byte) error {
	*s = append(*s, c)
	return nil
}</code></pre>
<div>
<br /></div>
<div>
Compare it with the implementation of <span style="font-family: "courier new" , "courier" , monospace;">Buffer.WriteByte</span>:</div>
<div>
<div>
<pre><code>
func (b *Buffer) WriteByte(c byte) error {
	b.lastRead = opInvalid
	m := b.grow(1)
	b.buf[m] = c
	return nil
}</code></pre>
</div>
<div>
<br /></div>
<div>
which is much more complicated. Here is the benchmark showing the efficiency difference:</div>
<div>
<br /></div>
<div>
<div>
<pre> BenchmarkByteSliceWrite1k 200000 9971 ns/op
BenchmarkBytesBufferWrite1k 100000 11933 ns/op
</pre>
</div>
<div>
<br /></div>
<div>
At the same time, <span style="font-family: "courier new" , "courier" , monospace;">Buffer</span> doesn't incur the overhead for nothing. It supports <span style="font-family: "courier new" , "courier" , monospace;">UnreadByte</span> and <span style="font-family: "courier new" , "courier" , monospace;">UnreadRune</span>, which <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> does not (they need extra memory to support). But if one doesn't need them, which is the case most of the time for me, <span style="font-family: "courier new" , "courier" , monospace;">ByteSlice</span> is obviously a better choice.</div>
</div>
</div>
</div>
<h2>An alternative design for "container/heap" in Go.</h2>
Posted 2015-06-17<br />
Here is an alternative design of the heap package in Go:<br />
<div>
<br /></div>
<div>
<a href="https://github.com/golangplus/container/tree/master/heap" target="_blank">github.com/golangplus/container/heap</a> (<a href="http://godoc.org/github.com/golangplus/container/heap" target="_blank">GoDoc</a>)</div>
<div>
<br /></div>
<div>
The main difference is that elements need not be converted to <span style="font-family: Courier New, Courier, monospace;">interface{}</span> values for pushing and popping. This removes the overhead of converting values to interfaces and vice versa. The trick is to use the last element as the in/out slot, so the interface never needs to touch the element. <span style="font-family: Courier New, Courier, monospace;">Push</span>, <span style="font-family: Courier New, Courier, monospace;">Pop</span> and <span style="font-family: Courier New, Courier, monospace;">Remove</span> are replaced with <span style="font-family: Courier New, Courier, monospace;">PushLast</span>, <span style="font-family: Courier New, Courier, monospace;">PopToLast</span> and <span style="font-family: Courier New, Courier, monospace;">RemoveToLast</span>, respectively.</div>
<div>
<br /></div>
<div>
An example of a heap with integers is like this:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">type IntHeap []int</span><br />
<span style="font-family: Courier New, Courier, monospace;">func (h *IntHeap) Pop() int {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> heap.PopToLast((*sort.IntSlice)(h))</span><br />
<span style="font-family: Courier New, Courier, monospace;"> res := (*h)[len(*h) - 1]</span><br />
<span style="font-family: Courier New, Courier, monospace;"> *h = (*h)[:len(*h) - 1]</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;"> return res</span><br />
<span style="font-family: Courier New, Courier, monospace;">}</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">func (h *IntHeap) Push(x int) {</span><br />
<span style="font-family: Courier New, Courier, monospace;"> *h = append(*h, x)</span><br />
<span style="font-family: Courier New, Courier, monospace;"> heap.PushLast((*sort.IntSlice)(h))</span><br />
<span style="font-family: Courier New, Courier, monospace;">}</span><br />
<br />
Pushing and popping elements is done by calling the corresponding methods of the type <span style="font-family: 'Courier New', Courier, monospace;">IntHeap</span><span style="font-family: Times, Times New Roman, serif;"> rather than calling </span><span style="font-family: Courier New, Courier, monospace;">heap.Push(h, el)</span><span style="font-family: Times, Times New Roman, serif;">. The code looks much clearer to me.</span></div>
<div>
<div>
</div>
</div>
<div>
<br />
In the case where the element is a <span style="font-family: Courier New, Courier, monospace;">struct</span>, the benchmark shows about <span style="font-family: Times, Times New Roman, serif;">10~15% performance increase from removing the conversion:</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<br />
<div class="p1">
<span style="font-family: 'Courier New', Courier, monospace;">BenchmarkDataHeap_Plus </span><span style="font-family: 'Courier New', Courier, monospace;">100</span><span style="font-family: Courier New, Courier, monospace;"> 10541648 ns/op</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">BenchmarkDataHeap_Pkg 100 </span><span style="font-family: 'Courier New', Courier, monospace;">11921974 ns/op</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
where each element is of the type:</div>
<div class="p1">
<br /></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">type Data struct {</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>Value string</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;"><span class="Apple-tab-span" style="white-space: pre;"> </span>Priority int</span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace;">}</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
The closure (<span style="font-family: Courier New, Courier, monospace;">func</span>) versions of the functions are also defined, as <span style="font-family: Courier New, Courier, monospace;">PushLastF</span>, <span style="font-family: Courier New, Courier, monospace;">PopToLastF </span><span style="font-family: Times, Times New Roman, serif;">and</span> <span style="font-family: Courier New, Courier, monospace;">RemoveToLastF</span>, respectively. Using them can make calls faster than going through the interface.</div>
</div>
<div>
<br />
The benchmark shows a 20~30% performance improvement when using the closure version compared with the builtin heap:<br />
<span style="font-family: 'Courier New', Courier, monospace;"><br /></span>
<span style="font-family: 'Courier New', Courier, monospace;">BenchmarkDataHeap_F </span><span style="font-family: 'Courier New', Courier, monospace;">200 </span><span style="font-family: 'Courier New', Courier, monospace;">8875933 ns/op</span><br />
<br />
Heaps of <span style="font-family: Courier New, Courier, monospace;">int</span>/<span style="font-family: Courier New, Courier, monospace;">string</span>/<span style="font-family: Courier New, Courier, monospace;">float64</span> are predefined, with support for a customized less function. Here are the benchmark results (similar performance increase):<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">BenchmarkBuiltinIntHeap 300 5680077 ns/op</span><br />
<span style="font-family: Courier New, Courier, monospace;">BenchmarkPlusIntHeap 300 4049263 ns/op</span><br />
<span style="font-family: Courier New, Courier, monospace;">BenchmarkPlusIntHeap_Less 300 4933297 ns/op</span></div>
<h2>A lock-free Java object pool</h2>
Posted 2015-02-24<br />
It was a little surprising to me that I could not find an article about, or a library providing, a totally lock-free Java object pool by googling. The closest one is <a href="http://ashkrit.blogspot.com/2013/05/lock-less-java-object-pool.html" target="_blank">Lock Less Java Object Pool</a>. Here comes my version of a totally lock-free Java object pool.<br />
<div>
<br /></div>
<div>
The code is shared as <a href="https://github.com/daviddengcn/lockfreepool" target="_blank">a public project at Github</a>. Feel free to share or fork it.</div>
<div>
<br /></div>
<div>
The idea is simple. All returned objects (which need to be cached) are stored in a stack, which is held in an <a href="https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicReferenceArray.html" target="_blank">AtomicReferenceArray</a> instance, and an <a href="http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html" target="_blank">AtomicInteger</a>-typed variable stores the top position of the stack. Allocating or freeing an object takes two steps, plus a retry when the second step fails:</div>
<div>
<ol>
<li>First, a change of the top variable needs to be reserved.</li>
<li>Then we try to get/put the element at the reserved position.</li>
<li>Since the stack can expand and shrink, some other thread may have reserved the same place. If this happens, i.e. Step 2 failed, we have to start over.</li>
</ol>
<div>
Step 1 uses the <a href="http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html#compareAndSet-int-int-" target="_blank">compareAndSet</a> method of AtomicInteger in a loop, and step 2 uses <a href="http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html#getAndSet-int-" target="_blank">getAndSet</a>/<a href="http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/AtomicInteger.html#compareAndSet-int-int-" target="_blank">compareAndSet</a> of AtomicReferenceArray. Both are lock-free.</div>
</div>
<div>
<br /></div>
<div>
This approach is also known as optimistic locking. It is useful when contention is relatively rare.</div>
<div>
<br /></div>
<div>
Here are some code fragments:<br />
<br /></div>
<div>
<pre>public E alloc() {
    while (true) {
        // Try to reserve a cached object in objects.
        int n;
        do {
            n = top.get();
            if (n == 0) {
                // No cached objects, allocate a new one.
                return factory.alloc();
            }
        } while (!top.compareAndSet(n, n - 1));
        // Try to fetch the cached object. Slots are 0-based, so the
        // reserved object sits at index n - 1.
        E e = objects.getAndSet(n - 1, null);
        if (e != null) {
            return e;
        }
        // It is possible that the reserved object was extracted before
        // the current thread tried to get it. Let's start over again.
    }
}

public void free(E e) {
    while (true) {
        // Try to reserve a place in this.objects for e.
        int n;
        do {
            n = top.get();
            if (n == objects.length()) {
                // The pool is full, e is not cached.
                factory.free(e);
                return;
            }
        } while (!top.compareAndSet(n, n + 1));
        // Try to put e at the reserved place, index n.
        if (objects.compareAndSet(n, null, e)) {
            return;
        }
        // It is possible that the reserved place was occupied before
        // the current thread tried to put e in it. Let's start over again.
    }
}
</pre>
</div>
<div>
<br />
Any comments are welcome!</div>
<h2>Comparison of Go data types for string processing.</h2>
Posted 2013-08-11<br />
<div>
<i>String processing</i> is a very common operation in an application. This post talks about some data types used for string processing in the <a href="http://golang.org/" target="_blank">Go language</a>. The two main concerns are readability (or maintainability) and efficiency.<br />
<br /></div>
<h3>
String type</h3>
<div>
<span style="font-family: inherit;">String is without doubt the most direct type for string processing. Go provides several ways to represent a <a href="http://golang.org/ref/spec#String_literals" target="_blank">string literal</a> in the source code.</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">A string in Go is a sequence of <i>immutable</i> bytes. Many builtin string-related functions assume that the text a string represents is encoded in UTF-8.</span></div>
<div>
<br /></div>
<div>
Each string value contains a pointer to the start of the byte buffer and the length of the string. Sub-strings of a long string may share the same large buffer with the original.</div>
<div>
<br /></div>
<div>
Since strings are immutable, processing a string means creating new string values. Two kinds of new strings may be created:</div>
<div>
<ol>
<li>Sub-strings, e.g. <span style="font-family: Courier New, Courier, monospace;">sub := org[2:5]</span>. In this case, a new string header is created, its pointer is set to an offset from the original pointer, and the length is computed and filled in.</li>
<li>A totally new string, e.g. the concatenation of strings. The underlying buffer is newly allocated, and bytes are filled in as needed.</li>
</ol>
<div>
The following code fragment involves both kinds of string creation:</div>
</div>
<div>
<br /></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> newStr := "prefix " + oldStr[1:10] + " suffix"</span></div>
<br />
A string is comparable, so it can be used as a key in a Go map.<br />
<br />
<h3>
Byte slice type</h3>
<div>
In Go, a string can be easily converted to a byte slice using the type conversion syntax:</div>
<div>
<br /></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">sl := []byte(str)</span></div>
<div>
<br /></div>
<div>
But one has to remember that this conversion allocates a new byte slice and copies the bytes into it. So changing the content of the slice will not affect the original string (otherwise the string would no longer be immutable). For the same reason, converting a byte slice back to a string also needs a new buffer to be allocated. In other words, this conversion is not cheap.</div>
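<div>
The copy semantics can be checked with a few lines. This is a minimal sketch. The helper name <code>upperFirst</code> is made up for the example, and it only handles ASCII letters.</div>

```go
package main

import "fmt"

// upperFirst returns a copy of s with its first byte upper-cased (ASCII
// only). It goes through a byte slice because a string cannot be modified
// in place.
func upperFirst(s string) string {
	b := []byte(s) // allocates a new buffer and copies the bytes of s
	if len(b) > 0 && b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b) // allocates again and copies the bytes back
}

func main() {
	str := "hello"
	// The original string is untouched by the change made to the slice.
	fmt.Println(upperFirst(str), str) // prints "Hello hello"
}
```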
<div>
<br /></div>
<div>
Like other slice types in Go, a byte slice has a header consisting of a pointer to the buffer, the length of the slice, and the capacity of the slice. This means a byte slice consumes more metadata memory than a string.</div>
<div>
<br /></div>
<div>
But a slice is mutable. The content, i.e. the bytes, of a slice can be changed. So when you append sub-strings to a byte slice, the underlying buffer may stay unchanged unless the length exceeds the capacity. A more commonly used pattern is to use the <span style="font-family: Courier New, Courier, monospace;">bytes.Buffer</span> type (personally, I sometimes prefer a more lightweight type, <span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/daviddengcn/go-villa/blob/master/byteslice.go#L16" target="_blank">"github.com/daviddengcn/go-villa".ByteSlice</a></span>, which utilizes the Go type system and allocates fewer extra bytes). In some special cases, in-place operations can be performed on a byte slice.</div>
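<div>
As a rough illustration of the buffer being reused, the sketch below appends strings to a byte slice and reports whether the capacity changed. The helper <code>appendAll</code> is hypothetical. It relies on the fact that <code>append</code> only reallocates when the capacity is exceeded, in which case the new capacity is larger than the old one.</div>

```go
package main

import "fmt"

// appendAll appends every part to dst. The boolean result is true when
// the capacity did not change, i.e. the underlying buffer was reused.
func appendAll(dst []byte, parts ...string) ([]byte, bool) {
	before := cap(dst)
	for _, p := range parts {
		dst = append(dst, p...) // append accepts the bytes of a string directly
	}
	return dst, cap(dst) == before
}

func main() {
	buf := make([]byte, 0, 16)

	// 11 bytes fit into the 16-byte buffer, so nothing is reallocated.
	out, reused := appendAll(buf, "prefix ", "body")
	fmt.Println(string(out), reused) // prints "prefix body true"

	// 25 more bytes exceed the capacity, so append allocates a larger buffer.
	out, reused = appendAll(out, " and a much longer suffix")
	fmt.Println(len(out), reused) // prints "36 false"
}
```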
<div>
<br /></div>
<div>
A byte slice cannot be used as a key type in a map.<br />
<br /></div>
<h3>
Byte array type</h3>
<div>
In some special situations where the length of a string is fixed, a byte array can be used. It needs no extra meta bytes, and is mutable. You can easily create a byte slice representing the whole or part of a byte array.<br />
<br />
The content of a byte array is mutable, so in-place processing can be performed as with a byte slice. But assigning a byte array works like assigning a struct, i.e. the whole array is copied to the destination.</div>
<div>
<br /></div>
<div>
A byte array type can be used as a key type in a map.<br />
<br />
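<div>
A small sketch of both properties, using a hypothetical 4-byte key type named <code>Code</code>: the array works as a map key, and assignment copies the whole array.</div>

```go
package main

import "fmt"

// Code is a hypothetical fixed-length key, e.g. a 4-letter identifier.
type Code [4]byte

// toCode copies up to 4 bytes of s into a Code. Shorter input is
// zero-padded, longer input is truncated.
func toCode(s string) Code {
	var c Code
	copy(c[:], s) // copy accepts a string as the source of a byte copy
	return c
}

func main() {
	counts := map[Code]int{}
	for _, s := range []string{"ABCD", "WXYZ", "ABCD"} {
		counts[toCode(s)]++
	}
	fmt.Println(counts[toCode("ABCD")], counts[toCode("WXYZ")], len(counts)) // prints "2 1 2"

	// Assignment copies the whole array, so b is independent of a.
	a := toCode("ABCD")
	b := a
	b[0] = 'Z'
	fmt.Println(string(a[:]), string(b[:])) // prints "ABCD ZBCD"
}
```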
<h3>
Rune slice type</h3>
</div>
<div>
A string can be converted to a rune slice with syntax similar to a type conversion:</div>
<div>
<br /></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> rsl := []rune(str)</span></div>
<div>
<br /></div>
<div>
assuming the bytes are UTF-8 encoded. Besides allocating the rune slice, the contents are decoded, rather than copied, into the rune slice. This is more expensive than conversions between a string and a byte slice.</div>
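<div>
The decoding step is what makes a rune slice useful for per-character work. The sketch below uses a made-up helper, <code>reverseRunes</code>, which reverses a string by code point. Doing the same on the raw bytes would break multi-byte characters.</div>

```go
package main

import "fmt"

// reverseRunes reverses s code point by code point. The conversion to
// []rune decodes the UTF-8 bytes, and string(r) encodes them again.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	s := "h\u00e9llo" // the second character takes two bytes in UTF-8

	// Six bytes, but only five runes.
	fmt.Println(len(s), len([]rune(s))) // prints "6 5"

	fmt.Println(reverseRunes(s) == "oll\u00e9h") // prints true
}
```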
<div>
<br /></div>
<div>
A rune slice cannot be used as a key type in a map.<br />
<br /></div>
<h3>
Rune array type</h3>
<div>
Like the byte array type, a rune array is used to represent strings with a fixed/limited number of runes. It needs no extra meta bytes, and its contents are mutable.</div>
<div>
<br /></div>
<div>
A rune array can be used as a key type in a map.<br />
<br /></div>
<h3>
Summary</h3>
<div>
Here is a table summarizing the differences between the above types and the conversions between them.</div>
<table>
<thead>
<tr>
<th></th>
<th><div style="text-align: left;">
byte-array</div>
</th>
<th><div style="text-align: left;">
byte-slice</div>
</th>
<th><div style="text-align: left;">
string</div>
</th>
<th><div style="text-align: left;">
rune-slice</div>
</th>
<th><div style="text-align: left;">
rune-array</div>
</th>
</tr>
</thead>
<tbody>
<tr>
<th><div style="text-align: left;">
type</div>
</th>
<td><span style="font-family: Courier New, Courier, monospace;">[4]byte</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]byte</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">string</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]rune</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[4]rune</span></td>
</tr>
<tr>
<th><div style="text-align: left;">
size</div>
</th>
<td>fixed</td>
<td>variable</td>
<td>variable</td>
<td>variable</td>
<td>fixed</td>
</tr>
<tr>
<th><div style="text-align: left;">
access</div>
</th>
<td>read/write</td>
<td>read/write</td>
<td>read only</td>
<td>read/write</td>
<td>read/write</td>
</tr>
<tr>
<th><div style="text-align: left;">
comparable</div>
</th>
<td>yes</td>
<td>no</td>
<td>yes</td>
<td>no</td>
<td>yes</td>
</tr>
<tr>
<th><div style="text-align: left;">
meta bytes</div>
</th>
<td>0</td>
<td>pointer+len+cap</td>
<td>pointer+len</td>
<td>pointer+len+cap</td>
<td>0</td>
</tr>
<tr>
<th><div style="text-align: left;">
to byte-array</div>
</th>
<td style="text-align: center;">-</td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(ar[:], sl)</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(ar[:], st)</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(ar[:], string(rsl))</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(ar[:], string(rar[:]))</span></td>
</tr>
<tr>
<th><div style="text-align: left;">
to byte-slice</div>
</th>
<td><span style="font-family: Courier New, Courier, monospace;">ar[:] </span></td>
<td style="text-align: center;">-</td>
<td><span style="font-family: Courier New, Courier, monospace;">[]byte(st) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]byte(string(rsl)) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]byte(string(rar[:])) </span></td>
</tr>
<tr>
<th><div style="text-align: left;">
to string</div>
</th>
<td><span style="font-family: Courier New, Courier, monospace;">string(ar[:]) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">string(sl) </span></td>
<td style="text-align: center;">-</td>
<td><span style="font-family: Courier New, Courier, monospace;">string(rsl) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">string(rar[:]) </span></td>
</tr>
<tr>
<th><div style="text-align: left;">
to rune-slice</div>
</th>
<td><span style="font-family: Courier New, Courier, monospace;">[]rune(string(ar[:])) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]rune(string(sl)) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">[]rune(st)</span></td>
<td style="text-align: center;">-</td>
<td><span style="font-family: Courier New, Courier, monospace;">rar[:] </span></td>
</tr>
<tr>
<th><div style="text-align: left;">
to rune-array</div>
</th>
<td><span style="font-family: Courier New, Courier, monospace;">copy(rar[:], []rune(string(ar[:]))) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(rar[:], []rune(string(sl))) </span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(rar[:], []rune(st))</span></td>
<td><span style="font-family: Courier New, Courier, monospace;">copy(rar[:], rsl) </span></td>
<td style="text-align: center;">-</td>
</tr>
</tbody>
</table>
(variables for byte array, byte slice, string, rune slice and rune array are <span style="font-family: Courier New, Courier, monospace;">ar</span>, <span style="font-family: Courier New, Courier, monospace;">sl</span>, <span style="font-family: Courier New, Courier, monospace;">st</span>, <span style="font-family: Courier New, Courier, monospace;">rsl</span>, <span style="font-family: Courier New, Courier, monospace;">rar</span>, respectively)<br />
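<div>
A few of the conversions from the table, put into a runnable sketch. The helper names <code>toByteArray</code> and <code>toRuneArray</code> are made up for the example. Note that an array has to be sliced (<code>ar[:]</code>) before it can be converted to a string.</div>

```go
package main

import "fmt"

// toByteArray copies at most 4 bytes of st into a byte array.
func toByteArray(st string) (ar [4]byte) {
	copy(ar[:], st)
	return ar
}

// toRuneArray decodes st and copies at most 4 runes into a rune array.
func toRuneArray(st string) (rar [4]rune) {
	copy(rar[:], []rune(st))
	return rar
}

func main() {
	st := "gopher"

	sl := []byte(st)  // string to byte slice
	rsl := []rune(st) // string to rune slice
	ar := toByteArray(st)
	rar := toRuneArray(st)

	fmt.Println(string(sl), string(rsl))         // prints "gopher gopher"
	fmt.Println(string(ar[:]), string(rar[:]))   // prints "goph goph"
	fmt.Println(string([]rune(string(ar[:]))))   // byte array to rune slice and back: "goph"
	fmt.Println(string([]byte(string(rar[:])))) // rune array to byte slice and back: "goph"
}
```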
<br />
<h3>
Best practice</h3>
<div>
Here are some principles and best-practice suggestions, from my own experience:</div>
<div>
<ol>
<li>When you are processing a very small string, e.g. for showing some text on the console, using a string directly is the best choice, since it is very direct and easy to read.</li>
<li>Reading from a long string can be done with <span style="font-family: Courier New, Courier, monospace;">strings.Reader</span>.</li>
<li>When you have to make a lot of modifications to a string, e.g. modifying some content inside a long text, consider doing the whole processing in variables of the byte slice type. Text can be read as a byte slice, and a byte slice can be written out through an <span style="font-family: Courier New, Courier, monospace;">io.Writer</span> instance. Go provides many builtin functions for byte slices. Don't convert the bytes to strings unless they have to be the key of a map.</li>
<li>When possible, using a byte array is very efficient, e.g. when you need a large map whose key is a string with a fixed/limited length.</li>
<li>Use rune slices only when performance is not a concern.</li>
<li>Using a rune array is also efficient when possible, but conversion from and to UTF-8 encoded bytes may be expensive.</li>
</ol>
</div>