Fri, 13 Dec 2024

# Protecting Expensive Calls In Go

Imagine you are designing a public API endpoint which must call some critical internal function. That function does expensive calculations. How do we protect it from overuse, and what measures can we take to balance availability and latency? Let's find out.

Source code is available at https://github.com/lezhnev74/go_expensive_call.

# Prepare The API

Let's make a simple app that serves one API endpoint over HTTP and calls the expensive function. I am using the popular Gin package as the HTTP server here. It exposes a GET /calculate endpoint that we are going to abuse a little bit now :)

package main

import (
	"fmt"
	"github.com/gin-gonic/gin"
	"log"
	"net/http"
	"time"
)

// ExpensiveCall simulates a slow computation: it sleeps for a second,
// logs the invocation, and returns a timestamped result.
func ExpensiveCall() (string, error) {
	time.Sleep(time.Second)
	log.Printf("Expensive call invoked.")
	result := fmt.Sprintf("Done at %s", time.Now())
	return result, nil
}

func main() {
	r := gin.Default()

	r.GET("/calculate", func(c *gin.Context) {
		result, err := ExpensiveCall() // <-- Here is the bottleneck
		if err != nil {
			c.JSON(http.StatusBadRequest, err)
			return
		}
		c.JSON(http.StatusOK, result)
	})

	err := r.Run("0.0.0.0:8088")
	if err != nil {
		panic(err)
	}
}

Notice that the expensive function call takes at least a second to complete; that is a handy trait we will use in performance testing. It can also fail and return an error, but our tests won't exercise that branch.

Let's use the Hey tool to make a few concurrent requests to our API endpoint. First we start the app with go run main.go, then run Hey:

./hey -n 4 -c 2 http://localhost:8088/calculate

Summary:
  Total:	2.0057 secs
  Slowest:	1.0043 secs
  Fastest:	1.0010 secs
  Average:	1.0027 secs
  Requests/sec:	1.9943
...

It made 4 requests in total with a concurrency level of 2 (2 parallel requests). It tells us that on average each request took ~1s, which is expected as the expensive call sleeps for 1 second. At this point we have established our baseline: no optimizations, no protection from API abuse. Let's start optimizing.

# Level-1: Cache Expensive Results

My first obvious optimization is adding a caching layer in front of the expensive function call. We could use a simple map guarded by a mutex for that (sketched below), or use an existing package. I prefer the latter: let's use github.com/Code-Hex/go-generics-cache.
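
For reference, here is a minimal sketch of the map-with-mutex alternative. The type, the method name, and the lack of TTL handling are illustrative assumptions, not part of the final code:

// mutexCache is a hypothetical minimal cache: a map guarded by a mutex.
// It assumes "sync" is imported and omits TTL handling for brevity.
type mutexCache struct {
	mu     sync.Mutex
	values map[string]string
}

func (c *mutexCache) GetOrCompute(key string, compute func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock() // note: holding the lock across compute serializes all callers
	if v, ok := c.values[key]; ok {
		return v, nil
	}
	v, err := compute()
	if err == nil {
		c.values[key] = v
	}
	return v, err
}

Writing TTL expiration and eviction correctly on top of this gets tedious quickly, which is one reason to prefer a dedicated package.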

I am going to apply a bit of functional design here and wrap our expensive call into another function like this:

cacheKey := "expensive"
ttl := 10 * time.Second
result, err := Cache(cacheKey, ttl, ExpensiveCall)

Now we can call our function with a cache key and a ttl argument that controls how long the cached result should live. The caching code looks like this:

package internal

import (
	"context"
	internalCacheVendor "github.com/Code-Hex/go-generics-cache"
	"time"
)

var internalCache *internalCacheVendor.Cache[string, string]

func InitCache() {
	internalCache = internalCacheVendor.NewContext[string, string](context.Background())
}

// Cache decorates the expensive data fetch call with a caching layer.
func Cache(key string, ttl time.Duration, expensiveCall func() (string, error)) (string, error) {
	var (
		val string
		err error
		ok  bool
	)

	val, ok = internalCache.Get(key)
	if ok {
		return val, nil // fast track: return from the cache
	}

	val, err = expensiveCall() // slow track: jump into the expensive call

	if err == nil {
		internalCache.Set(key, val, internalCacheVendor.WithExpiration(ttl))
	}

	return val, err
}

Notice the two functions: InitCache prepares the in-memory store for cached values, while Cache is the decorator we actually call. Now we can improve our endpoint code:

...

internal.InitCache()

r.GET("/calculate", func(c *gin.Context) {
    cacheKey := "expensive"
    ttl := 10 * time.Second
    result, err := internal.Cache(cacheKey, ttl, ExpensiveCall) // <-- Now we use cache here
    if err != nil {
        c.JSON(http.StatusBadRequest, err)
        return
    }
    c.JSON(http.StatusOK, result)
})
...

Let's put some pressure on it again with Hey:

./hey -n 4 -c 2 http://localhost:8088/calculate

Summary:
  Total:	1.0036 secs
  Slowest:	1.0025 secs
  Fastest:	0.0010 secs
  Average:	0.5018 secs
  Requests/sec:	3.9855
  
  Total data:	256 bytes
  Size/request:	64 bytes

Response time histogram:
  0.001 [1]	|■■■■■■■■■■■■■■■■■■■■
  0.101 [1]	|■■■■■■■■■■■■■■■■■■■■
  0.201 [0]	|
  0.301 [0]	|
  0.402 [0]	|
  0.502 [0]	|
  0.602 [0]	|
  0.702 [0]	|
  0.802 [0]	|
  0.902 [0]	|
  1.002 [2]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■

...

The picture has changed! On the histogram we can see that two requests took 1 second, while the other two were super fast and took only a fraction of that. Our caching layer improved things a lot! We are moving in the right direction.

At this point we still have a problem called a "cold cache": until the cache contains a value, all concurrent requests hit the expensive function. We can confirm that by looking at the app's logs, where multiple lines come from the expensive function:

...
[GIN-debug] Listening and serving HTTP on 0.0.0.0:8088
2024/12/13 11:27:12 Expensive call invoked.
2024/12/13 11:27:12 Expensive call invoked.
...

I want to come up with a design which queues all concurrent requests, while just one is allowed to enter the expensive function. The rest must wait and reuse the value that the first request returns. As a result, we should see the message Expensive call invoked. just once. Fortunately, there is a wonderful package for that called singleflight. Notably, it works only within a single process; if we ran multiple app instances we would need a distributed lock or queue for the same purpose, but that is out of the scope of this text.
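
To see the mechanics in isolation, here is a minimal standalone sketch (separate from our API code) showing that overlapping Do calls with the same key enter the function only once:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"

	"golang.org/x/sync/singleflight"
)

func main() {
	var g singleflight.Group
	var calls atomic.Int32
	var wg sync.WaitGroup

	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// All 5 goroutines ask for the same key concurrently.
			v, _, shared := g.Do("key", func() (any, error) {
				calls.Add(1)
				time.Sleep(100 * time.Millisecond) // keep the calls overlapping
				return "result", nil
			})
			fmt.Println(v, "shared:", shared)
		}()
	}
	wg.Wait()
	// Typically prints 1: the overlapping callers shared a single invocation.
	fmt.Println("function invoked", calls.Load(), "times")
}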

# Level-2: Limit Concurrent Access To Expensive Function

Let's update our endpoint code with the singleflight package. All concurrent incoming requests will queue up and wait until one of them returns from the expensive function call.

...

internal.InitCache()
g := new(singleflight.Group)

r.GET("/calculate", func(c *gin.Context) {
    cacheKey := "expensive"
    ttl := 10 * time.Second
    decorated := func() (any, error) {
        return internal.Cache(cacheKey, ttl, ExpensiveCall)
    }
    result, err, _ := g.Do(cacheKey, decorated) // <-- Here it queues all the callers
    if err != nil {
        c.JSON(http.StatusBadRequest, err)
        return
    }
    c.JSON(http.StatusOK, result)
})
...

Here we decorate our expensive call once again (functional programming is quite handy). Now it is wrapped in another layer which queues the calls. Each client blocks at g.Do(...) and waits until a single call to the expensive function returns. The third return value of Do, which we ignore here, reports whether the result was shared between multiple callers.

Let's use Hey again:

./hey -n 4 -c 2 http://localhost:8088/calculate

Summary:
  Total:	1.0037 secs
  Slowest:	1.0026 secs
  Fastest:	0.0009 secs
  Average:	0.5018 secs
  Requests/sec:	3.9852
  
  Total data:	256 bytes
  Size/request:	64 bytes

Response time histogram:
  0.001 [1]	|■■■■■■■■■■■■■■■■■■■■
  0.101 [1]	|■■■■■■■■■■■■■■■■■■■■
  0.201 [0]	|
  0.301 [0]	|
  0.402 [0]	|
  0.502 [0]	|
  0.602 [0]	|
  0.702 [0]	|
  0.802 [0]	|
  0.902 [0]	|
  1.003 [2]	|■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
...

The timing remained the same; however, the app's log now shows just a single Expensive call invoked. message. That's a win. The first two incoming requests queued up: one of them entered the expensive call while the other waited outside. Nice! We did not improve latency at this step, but we reduced the pressure on the expensive call, and that can be crucial in real business apps.

Before we call it done, we can add one more layer of protection against API abuse: throttling.

# Level-3: Limit Concurrent Access To API Endpoint

Though this is not strictly needed, it adds to the stability of our API endpoint. We can limit the number of requests that are allowed to hit the endpoint by configuring the server to allow N requests per X seconds, i.e. throttling. This kind of measure prevents our server capacity from being exhausted. We have already added a fair bit of protection in the previous steps, but throttling is still good practice: unlimited public use can become a problem at the most unexpected moment.

There is a great package that implements throttling based on the token bucket idea: https://pkg.go.dev/golang.org/x/time/rate. Let's allow 1 request per second for now; the rest should get HTTP responses with status code 429 Too Many Requests.

...

internal.InitCache()
g := new(singleflight.Group)
l := rate.NewLimiter(rate.Every(time.Second), 1)

r.GET("/calculate", func(c *gin.Context) {
    // Level-3: Throttling
    allowed := l.Allow()
    if !allowed {
        c.JSON(http.StatusTooManyRequests, nil) // <-- reject immediately, the client should back off
        return
    }

    cacheKey := "expensive"
    ttl := 10 * time.Second
    decorated := func() (any, error) {
        // Level-1: Caching
        return internal.Cache(cacheKey, ttl, ExpensiveCall)
    }
    
    // Level-2: Queueing
    result, err, _ := g.Do(cacheKey, decorated) // <-- Here it queues all the callers
    if err != nil {
        c.JSON(http.StatusBadRequest, err)
        return
    }
    c.JSON(http.StatusOK, result)
})
...

Very nice! Now let's observe how this works:

./hey -n 4 -c 2 http://localhost:8088/calculate
...
Status code distribution:
  [200]	2 responses
  [429]	2 responses
...

We sent 4 requests in total at a concurrency level of 2, so it is no surprise that 2 requests were rejected due to throttling. At this point we have designed a fairly durable API endpoint that protects our sensitive business function from public abuse.
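
As a side note on tuning: rate.NewLimiter takes a refill rate and a burst size, and the limiter can also block instead of rejecting. Here is a sketch of the options; the numbers are illustrative assumptions, not recommendations:

// Assumes "context", "time", and "golang.org/x/time/rate" are imported.

// What we used above: refill 1 token per second, bucket of size 1.
strict := rate.NewLimiter(rate.Every(time.Second), 1)
_ = strict

// A softer, hypothetical policy: refill 10 tokens per second and
// allow bursts of up to 20 requests.
soft := rate.NewLimiter(rate.Limit(10), 20)

// Instead of rejecting with Allow(), Wait() blocks until a token is
// available or the context is cancelled, trading latency for fewer 429s.
if err := soft.Wait(context.Background()); err != nil {
	// the context was cancelled or the wait would exceed its deadline
}

Whether to reject or to wait depends on the clients: rejecting is friendlier to interactive users who can retry, while waiting smooths out bursts from automated callers.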

I hope that was useful to you!
