The beauty of Go language concurrency

        Multi-core processors are becoming ubiquitous. Is there a simple way to let the software we write unleash the power of multiple cores? The answer is yes. With the rise of programming languages designed for concurrency, such as Go, Erlang, and Scala, new concurrency patterns have gradually become clear. Just as with procedural and object-oriented programming, a good programming model needs an extremely concise kernel, plus rich extensions on top of it, to solve the various problems of the real world. This article uses the Go language as an example to explain that kernel and those extensions.


The kernel of the concurrency model


        The kernel of this concurrency model needs only coroutines and channels: coroutines execute code, and channels pass events between coroutines.

Concurrent programming has always been difficult. To write a correct concurrent program, we have to understand threads, locks, semaphores, barriers, and even the way the CPU updates its caches, and all of them have bad tempers and traps everywhere. Unless absolutely necessary, the author never manipulates these low-level concurrency primitives directly. A simple concurrency model does not need these complex low-level elements; coroutines and channels are enough.


        Coroutines are lightweight threads. In procedural programming, a procedure call must wait for the procedure to finish before it returns. Calling a coroutine, by contrast, returns immediately without waiting for it to finish. Coroutines are so lightweight that a Go program can run hundreds of thousands of them in one process and still maintain high performance. On an ordinary platform, a process with thousands of threads keeps its CPU busy with context switching, and performance drops sharply. Creating threads at will is a bad idea, but we can use coroutines freely.
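As a small sketch of that scale (an illustration written for this article, not a benchmark), the snippet below starts 100,000 coroutines and collects a completion signal from each through a channel; the helper name `runMany` is made up here:

```go
package main

import "fmt"

// runMany starts n goroutines and waits for each to signal
// completion over a buffered channel; it returns how many finished.
func runMany(n int) int {
        done := make(chan int, n) // buffered, so senders never block
        for i := 0; i < n; i++ {
                go func() { done <- 1 }()
        }
        finished := 0
        for i := 0; i < n; i++ {
                finished += <-done
        }
        return finished
}

func main() {
        fmt.Println(runMany(100000)) // prints 100000
}
```

Starting this many OS threads would exhaust most systems; goroutines make it routine.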

        A channel is the data transmission pipe between coroutines. A channel can pass data, by value or by reference, among many coroutines. A channel is used in two ways:

        A coroutine can try to send data into the channel. If the channel is full, the coroutine is suspended until the channel has room for the data.

        A coroutine can try to receive data from the channel. If the channel holds no data, the coroutine is suspended until the channel delivers data.

        In this way, the channel controls the coroutines' execution while transmitting data. It works a bit like event-driven programming and a bit like a blocking queue. These two concepts are very simple, and every language platform has a corresponding implementation; there are libraries for both Java and C.
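To make those two rules concrete, here is a minimal sketch (written for this article): on an unbuffered channel, the send inside the goroutine blocks until the receive in the caller is ready, so the channel both transfers the value and synchronizes the two coroutines:

```go
package main

import "fmt"

// handoff demonstrates both blocking rules from the text: the sender
// goroutine blocks on the unbuffered channel until a receiver is ready,
// and the receiver blocks until the sender delivers a value.
func handoff() string {
        ch := make(chan string) // unbuffered channel
        go func() {
                ch <- "hello" // suspends here until someone receives
        }()
        return <-ch // suspends here until the goroutine sends
}

func main() {
        fmt.Println(handoff()) // prints "hello"
}
```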

With coroutines and channels alone, concurrency problems can be solved elegantly; no other concurrency-related concepts are needed. How, then, do we use these two sharp blades to solve all kinds of practical problems?


Concurrency model extensions


        Compared with threads, coroutines can be created in huge numbers. This open door leads to new usages: coroutines can act as generators, can turn a function into a "service", can execute loop bodies concurrently, and can guard shared variables. But new usages bring new thorny problems: coroutines can leak, and improper use hurts performance. The various usages and problems are introduced one by one below. The demo code is written in Go because it is concise and supports all of these features.




Generator

       Sometimes we need a function that continuously produces data, whether by reading a file, reading from the network, generating an ascending sequence, or generating random numbers. What these behaviors have in common is that some variables of the function, such as a file path, are fixed once known; the function is then called repeatedly to return fresh data.

Take random number generation as an example, and let's build a random number generator that executes concurrently.

The non-concurrent approach is this:

// Function rand_generator_1, returns an int
func rand_generator_1() int {
        return rand.Int()
}

        The above is a function that returns an int. If the call to rand.Int() takes a long time, the caller of the function hangs too. So we can create a coroutine dedicated to executing rand.Int().


// Function rand_generator_2, returns a channel
func rand_generator_2() chan int {
        // Create the channel
        out := make(chan int)
        // Create a coroutine
        go func() {
                for {
                        // Write data into the channel; blocks if no one reads
                        out <- rand.Int()
                }
        }()
        return out
}

func main() {
        // Generate random numbers as a service
        rand_service_handler := rand_generator_2()
        // Read a random number from the service and print it
        fmt.Printf("%d\n", <-rand_service_handler)
}

        The function above executes rand.Int() concurrently. It is worth noting that the function's return value can be understood as a "service". Whenever we need random data, we can fetch it from this service at any time; it has already prepared the data for us, so there is no waiting. If we call the service infrequently, one coroutine is enough to satisfy our needs. But what if we need many accesses? We can use the multiplexing technique described below to start several generators and merge them into one large service.

        Calling a generator returns a "service". It fits any situation where data is acquired continuously, and its uses are wide: reading data, generating IDs, even timers. It is a very concise way of thinking about making a program concurrent.


Multiplexing

        Multiplexing is the technique of handling several queues at once. Apache uses one process per connection, so its concurrency performance is limited. Nginx uses multiplexing so that one process handles many connections, which gives it better concurrency. In the coroutine setting, multiplexing also exists, though for a different purpose: it merges several similar small services into one large service.

So let's use multiplexing technology to make a random number generator with higher concurrency.

// Function rand_generator_3, returns a channel
func rand_generator_3() chan int {
        // Create two random number generator services
        rand_generator_1 := rand_generator_2()
        rand_generator_2 := rand_generator_2()

        // Create the output channel
        out := make(chan int)

        // Create a coroutine to merge generator 1
        go func() {
                for {
                        // Read from generator 1 and forward to out
                        out <- <-rand_generator_1
                }
        }()
        // Create a coroutine to merge generator 2
        go func() {
                for {
                        // Read from generator 2 and forward to out
                        out <- <-rand_generator_2
                }
        }()
        return out
}


        The above is a higher-concurrency version of the random number generator, built with multiplexing. By merging two generators, this version has twice the capacity of the previous one. Although coroutines can be created in large numbers, the many coroutines will still compete for the output channel. Go provides the select keyword to handle this, and each platform has its own idioms; increasing the buffer size of the output channel is a common general remedy.

        Multiplexing merges multiple channels, improving both performance and ease of handling. It has great power when used in conjunction with the other patterns.
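The select-based variant mentioned above can be sketched like this (a hypothetical `fan_in` helper written for this article, not code from the original): a single coroutine watches both input channels and forwards whichever is ready, so only one goroutine ever writes to the output channel.

```go
package main

import "fmt"

// fan_in merges two input channels into one output channel using
// select, so a single goroutine serves both sources.
func fan_in(in1, in2 chan int) chan int {
        out := make(chan int)
        go func() {
                for {
                        select { // forward from whichever input is ready
                        case v := <-in1:
                                out <- v
                        case v := <-in2:
                                out <- v
                        }
                }
        }()
        return out
}

func main() {
        a, b := make(chan int), make(chan int)
        go func() { a <- 1 }()
        go func() { b <- 2 }()
        merged := fan_in(a, b)
        fmt.Println(<-merged + <-merged) // prints 3 (1 and 2, in some order)
}
```

The same shape works for merging the two random number generators above: pass their channels to `fan_in` instead of starting two forwarding goroutines.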

Future technology

        Future is a very useful technique; we often use Futures to manage threads. When we use a thread, we can create it, get back a Future, and later wait for the result through that Future. But in a coroutine environment the Future can be more thorough: the input parameters can be Futures too.

When calling a function, the parameters are usually prepared in advance, and the same holds for calling a coroutine. But if we make a parameter a channel, we can call the function before its parameters are ready. This design offers great freedom and concurrency: the calling of a function and the preparation of its parameters are completely decoupled. Below is an example of using this technique to access a database.

// A query structure
type query struct {
        // Parameter channel
        sql chan string
        // Result channel
        result chan string
}

// Execute the query
func execQuery(q query) {
        // Start the coroutine
        go func() {
                // Fetch the input
                sql := <-q.sql
                // Access the database; send to the result channel
                q.result <- "get " + sql
        }()
}

func main() {
        // Initialize the query
        q := query{make(chan string, 1), make(chan string, 1)}
        // Execute the query; note that no parameters are prepared yet
        execQuery(q)

        // Prepare the parameters
        q.sql <- "select * from table"
        // Fetch the result
        fmt.Println(<-q.result)
}

        With the Future technique above, not only the result but also the parameters are obtained through a Future: once the parameters are prepared, execution proceeds automatically. The difference between a Future and a generator is that a Future returns one result, while a generator can be called repeatedly. Another point worth noting is that the parameter channel and the result channel are placed inside a struct passed as a parameter, rather than returning the result channel. This raises cohesion, and the benefit is that it combines well with multiplexing.

        Future combines well with various other techniques. Through multiplexing, you can monitor several result channels and return automatically as soon as any result arrives. It can also be combined with a generator: the generator continuously produces data, and Futures process the data item by item. Futures can also be chained end to end to form a concurrent pipe filter, which can read, write, and transform data streams.

        Future is a very powerful technique. At the call site you need not care whether the data is ready or the return value has been computed; the components of the program run automatically as soon as their data is ready.

Concurrent loop

       Loops are often a performance hot spot. If the bottleneck is in the CPU, then nine times out of ten the hot spot is inside a loop. So if the loop body can be executed concurrently, performance improves greatly.

Making a loop concurrent is simple: start a coroutine for each loop body, and the bodies execute concurrently. Set up a counter channel before the loop, push an element into it as each loop body finishes, and after the loop wait on the counter until every loop coroutine has completed.

// Build a counter channel
sem := make(chan int, N)
// The for loop
for i, xi := range data {
        // Start a coroutine for this loop body
        go func(i int, xi float64) {
                // ... the original loop body ...
                sem <- 0
        }(i, xi)
}
// Wait for the end of the loop
for i := 0; i < N; i++ {
        <-sem
}

       The above is a concurrent loop that uses the counter to wait for completion. If you combine it with the Future technique mentioned earlier, you need not even wait here: postpone the check until the results are actually needed, and only then verify that the data is complete.

        Concurrent loops improve performance, use multiple cores, and relieve CPU hot spots. It is precisely because coroutines can be created in large numbers that they can be used inside loop bodies; with threads, one would have to introduce a thread pool or similar machinery to avoid creating too many threads, whereas coroutines keep things much simpler.

ChainFilter technology

      As mentioned earlier, Futures chained end to end form a concurrent pipe filter, and many things can be done that way. If every filter runs the same function, there is an even simpler way to chain them.

Since each filter coroutine runs concurrently, this structure is well suited to multi-core environments. The following example uses this pattern to generate prime numbers.

// A concurrent prime sieve

// Send the sequence 2, 3, 4, ... to channel 'ch'.
func Generate(ch chan<- int) {
        for i := 2; ; i++ {
                ch <- i // Send 'i' to channel 'ch'.
        }
}

// Copy the values from channel 'in' to channel 'out',
// removing those divisible by 'prime'.
func Filter(in <-chan int, out chan<- int, prime int) {
        for {
                i := <-in // Receive value from 'in'.
                if i%prime != 0 {
                        out <- i // Send 'i' to 'out'.
                }
        }
}

// The prime sieve: daisy-chain Filter processes.
func main() {
        ch := make(chan int) // Create a new channel.
        go Generate(ch)      // Launch the Generate goroutine.
        for i := 0; i < 10; i++ {
                prime := <-ch
                print(prime, "\n")
                ch1 := make(chan int)
                go Filter(ch, ch1, prime)
                ch = ch1
        }
}

        The program above creates 10 filters, each sieving one prime number, so it prints the first 10 primes.

        Chain-Filter builds a chain of concurrent filters with simple code. An additional advantage of this approach is that each channel is accessed by only two coroutines, so there is no fierce contention and performance is better.

Shared variable

Communication between coroutines flows only through channels, but we are used to shared variables, and shared variables often make code simpler. For example, a server has two states, on and off, and other parties merely want to query or change that state. How can they do it? The variable can be placed in a channel and maintained by a dedicated coroutine.

The following example describes how to implement a shared variable in this way.

// A shared variable consists of a read channel and a write channel
type shared_var struct {
        reader chan int
        writer chan int
}

// The maintenance coroutine for the shared variable
func shared_var_watchdog(v shared_var) {
        go func() {
                // Initial value
                var value int = 0
                for {
                        // Monitor the read and write channels to serve requests
                        select {
                        case value = <-v.writer:
                        case v.reader <- value:
                        }
                }
        }()
}

func main() {
        // Initialize and start the maintenance coroutine
        v := shared_var{make(chan int), make(chan int)}
        shared_var_watchdog(v)

        // Read the initial value
        fmt.Println(<-v.reader)
        // Write a value
        v.writer <- 1
        // Read the newly written value
        fmt.Println(<-v.reader)
}

        In this way, a safe shared variable is implemented on top of coroutines and channels: define a write channel, and send a new value to it whenever the variable needs updating; define a read channel, and receive from it whenever the value is needed. A single maintenance coroutine serves both channels, which keeps the data consistent.

        Generally speaking, shared variables are not the recommended way for coroutines to interact, but used this way they are perfectly sound in some situations. Many platforms offer fairly native shared-variable support, and which implementation is better is a matter of opinion. Furthermore, with coroutines and channels one can implement all the usual concurrent data structures, such as locks, so those are not repeated one by one here.
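As one concrete instance of that remark, a mutual-exclusion lock can be sketched with a buffered channel of capacity 1 (the `chan_mutex` type and helper names are invented for this article): holding the lock means holding the channel's single token.

```go
package main

import "fmt"

// A lock built from a channel: the channel holds at most one token.
type chan_mutex chan int

func new_mutex() chan_mutex {
        m := make(chan_mutex, 1)
        m <- 0 // deposit the single token; the lock starts unlocked
        return m
}

func (m chan_mutex) lock()   { <-m }    // take the token (blocks if already held)
func (m chan_mutex) unlock() { m <- 0 } // return the token

func main() {
        m := new_mutex()
        counter := 0
        done := make(chan bool, 100)
        for i := 0; i < 100; i++ {
                go func() {
                        m.lock()
                        counter++ // critical section, serialized by the lock
                        m.unlock()
                        done <- true
                }()
        }
        for i := 0; i < 100; i++ {
                <-done
        }
        fmt.Println(counter) // prints 100
}
```

Because channel operations establish happens-before ordering, the increments are safely serialized without any lower-level primitive.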


Coroutine leak

        Coroutines, like memory, are a system resource. Memory has automatic garbage collection, but coroutines have no corresponding reclamation mechanism. Will coroutines become popular in a few years, with coroutine leaks becoming programmers' eternal pain the way memory leaks are? Generally, a coroutine is destroyed once it finishes executing. Coroutines occupy memory, so a coroutine leak is as serious as a memory leak: at best it slows the program down, at worst it overwhelms the machine.

        C and C++ are both languages without automatic memory reclamation, yet good programming habits are enough to avoid the problem. The same is true for coroutines: good habits suffice.

        Only two situations can prevent a coroutine from ending. In one, the coroutine wants to read from a channel, but nobody ever writes to it; perhaps the channel has simply been forgotten. In the other, the coroutine wants to write to a channel, but because nobody is listening on it, the coroutine can never proceed. How to avoid each situation is discussed below.

        For the case where a coroutine wants to read from a channel to which nobody writes, the solution is simple: add a timeout mechanism. Whenever it is uncertain whether a reply will arrive, add a timeout to avoid waiting forever. A timer is not the only way to end the coroutine; it can also expose an exit-reminder channel, through which any other coroutine can tell it to terminate.
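The exit-reminder channel mentioned above can be sketched like this (the `worker` function and channel names are invented for this article): the worker selects over its job channel and a quit channel, so any other coroutine can end it by signaling quit, and the worker never blocks forever.

```go
package main

import "fmt"

// worker reads jobs until someone signals quit; it reports how many
// jobs it processed on the done channel, then returns (no leak).
func worker(jobs chan int, quit chan bool, done chan int) {
        processed := 0
        for {
                select {
                case <-jobs:
                        processed++
                case <-quit:
                        done <- processed
                        return // the coroutine ends instead of waiting forever
                }
        }
}

func main() {
        jobs := make(chan int)
        quit := make(chan bool)
        done := make(chan int)
        go worker(jobs, quit, done)
        for i := 0; i < 5; i++ {
                jobs <- i // each send completes only when the worker receives
        }
        quit <- true // remind the worker to exit
        fmt.Println(<-done) // prints 5
}
```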

For the case where a coroutine wants to write to a channel but the write blocks, the solution is also simple: give the channel a buffer. The premise is that the channel will receive only a fixed number of writes. For example, if a channel is known to receive at most N values, set its buffer size to N; then the channel never blocks, and the coroutine cannot leak. You could also make the buffer effectively unlimited, but that risks unbounded memory use. Once the coroutine finishes, the channel's memory loses its references and is garbage-collected automatically.

func never_leak(ch chan int) {
        // Initialize the timeout channel with a buffer of 1
        timeout := make(chan bool, 1)
        // Start the timeout coroutine; since the buffer is 1, it cannot leak
        go func() {
                time.Sleep(1 * time.Second)
                timeout <- true
        }()
        // Monitor the channel; with the timeout, no leak is possible
        select {
        case <-ch:
                // a read from ch has occurred
        case <-timeout:
                // the read from ch has timed out
        }
}

        The above is an example of avoiding leaks: use a timeout to avoid read blocking, and use a buffer to avoid write blocking.

       As with long-lived objects in memory, we need not worry about leaks in long-lived coroutines: they exist for the lifetime of the program and are few in number. The ones to watch are the temporarily created coroutines, which are numerous, short-lived, and often spawned inside loops; the methods above should be applied there to avoid leaks. Coroutines are a double-edgedged sword: misused, they not only fail to improve performance but can bring down the program. Yet, like memory, the risk of leaks is manageable, and with good habits coroutines become more reliable the more you use them.


Implementation of Concurrent Mode

        Now that concurrent programming is popular, support for coroutines and channels has become an indispensable part of every platform. Although each platform has its own names for them, they all satisfy the basic requirements of coroutines: concurrent execution and creation in large numbers. The author has summarized their implementation approaches below.

        Here are some common languages and platforms that already support coroutines.

Go and Scala, as the newest of these languages, were born with complete coroutine-based concurrency features. Erlang, the most venerable concurrent programming language, is enjoying a revival. Almost every other second-tier language has added coroutines in a recent version.

        Surprisingly, the world's three most mainstream platforms, C, C++, and Java, provide no language-level native support for coroutines. They all carry a heavy historical burden that cannot be changed, and need not be; but they have other ways to use coroutines.

        There are many ways to implement coroutines on the Java platform:

        Modify the virtual machine: patch the JVM to implement coroutines. This works well, but loses the benefit of cross-platform portability.

        Modify bytecode: enhance the bytecode after compilation, or use a new JVM language. This slightly raises the difficulty of the build.

        Use JNI: ship JNI code inside the jar. Easy to use, but not cross-platform.

        Simulate coroutines with threads: this makes coroutines heavyweight and entirely dependent on the JVM's thread implementation.

        Of these, modifying the bytecode is the most common, because it balances performance and portability. Scala, the most representative JVM language, supports coroutine concurrency well, and Akka, the popular Actor-model library for Java, also implements its coroutines through bytecode modification.

        For C, coroutines are like threads: they can be implemented with a variety of system calls. Since the coroutine is a relatively high-level concept, the implementations are too numerous to survey here; the more mainstream ones include libpcl, coro, and lthread.

        For C++, there is a Boost implementation along with several other open-source libraries. There is also a dialect called μC++ that provides concurrency extensions on the basis of C++.

        As you can see, this programming model is widely supported across many language platforms and is no longer a minority taste. If you want to use it, you can add it to your toolbox at any time.


Concluding remarks


        This article has discussed an extremely concise concurrency model which, with only the two basic elements of coroutines and channels, provides rich functionality for solving all kinds of practical problems. This model has also been widely implemented and has become a trend. I believe the capabilities of this concurrency model go far beyond what is shown here, and that more and more concise usages will emerge. Perhaps one day the number of CPU cores will rival the number of neurons in the human brain, and then we will have to rethink concurrency models once again.