Golang Timeout Handling: 5 Practical Lessons

Engineering

Imri Goldberg

Tech Lead Advisor

June 20, 2022


At Piiano we are developing a Vault, a type of storage database dedicated to PII. Vault is implemented in Go, which is a great language for implementing cloud services. One of the basic requirements of any server is to implement a request timeout. Any request that is handled by the server should have a timeout which is typically around 15-30 seconds on web servers these days.

When that timeout expires the user should receive a “503 Timeout Exceeded” response, and the handling of the request on the backend should be stopped. As this was my first contribution to Vault, and also my first time writing Go, this definitely has been a learning experience. I thought that sharing a few lessons about implementing this feature would be helpful for other developers.

1. Go supports timeouts out-of-the-box

In Go, many functions, and certainly most IO functions, receive a Context argument that makes any request cancelable, or bounded by a deadline, in a cooperative manner. Cooperative means that the implementation of any such function has to honor the cancellation signal in the context, for example by using select on the context’s Done channel. This is pretty standard in Go, and when you want to add a 10-second timeout to your request-handling code, you can write something like the following:

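// ctx is the incoming context, e.g. r.Context() in an HTTP handler.
ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel() // release the timer as soon as the handler returns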
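
To make the cooperative part concrete, here is a minimal sketch of a function that honors the deadline by selecting on the context's Done channel. The names slowOperation, runWithTimeout and timedOut are made up for this illustration and are not part of Vault; the durations are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowOperation is a hypothetical stand-in for real work (a query, an RPC).
// It cooperates with cancellation by selecting on ctx.Done().
func slowOperation(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil // the work finished before the deadline
	case <-ctx.Done():
		return ctx.Err() // context.DeadlineExceeded or context.Canceled
	}
}

// runWithTimeout runs slowOperation under a deadline. Both arguments are in
// milliseconds to keep the example easy to call.
func runWithTimeout(timeoutMs, workMs int) error {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(timeoutMs)*time.Millisecond)
	defer cancel() // always release the timer, even on success
	return slowOperation(ctx, time.Duration(workMs)*time.Millisecond)
}

// timedOut reports whether err is (or wraps) context.DeadlineExceeded.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(runWithTimeout(200, 10)) // the work fits within the deadline
	fmt.Println(runWithTimeout(10, 200)) // the deadline expires first
}
```

A function that never looks at ctx.Done() would simply keep running after the deadline, which is why every layer of the call chain has to pass the context along and respect it.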

2. Error handling with timeouts (part 1)

Error handling in Go is also pretty standardized. When you get an error from an internal function, you would normally wrap it, e.g. using fmt.Errorf("Unable to add person: %w", err). The %w verb keeps the original error reachable through errors.Is and errors.As.

Then, when you go up the call tree, before returning the HTTP error to the user, you could check if the error was initiated by a timeout or not, and if it’s from a timeout, handle it differently. You can implement this check using code such as:

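// IsTimeoutError reports whether err, or any error it wraps, is a timeout.
// It needs the "errors" and "net" packages.
func IsTimeoutError(err error) bool {
	var netErr net.Error
	if errors.As(err, &netErr) && netErr.Timeout() {
		return true
	}
	return false
}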

If IsTimeoutError returns true, return a 503 timeout error; otherwise, return your standard error response (a 4xx or a 500, as appropriate).
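
As a rough sketch of how this check fits together with wrapping, the example below wraps a timeout and a non-timeout error with %w and maps each to an HTTP status. The addPerson and statusFor functions, the "constraint violation" error, and the choice of 500 for non-timeout errors are assumptions made for this illustration. It relies on the fact that context.DeadlineExceeded has a Timeout() method and therefore satisfies net.Error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
)

// IsTimeoutError reports whether err, or any error it wraps, is a timeout.
func IsTimeoutError(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// addPerson is hypothetical: it always fails, either with a timeout or with
// a generic error, and wraps the cause with %w as described above.
func addPerson(simulateTimeout bool) error {
	cause := errors.New("constraint violation")
	if simulateTimeout {
		cause = context.DeadlineExceeded
	}
	return fmt.Errorf("unable to add person: %w", cause)
}

// statusFor maps an error to an HTTP status code. Using 500 for every
// non-timeout error is an assumption of this sketch.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case IsTimeoutError(err):
		return http.StatusServiceUnavailable // 503
	default:
		return http.StatusInternalServerError // 500
	}
}

func main() {
	fmt.Println(statusFor(addPerson(true)))  // wrapped timeout
	fmt.Println(statusFor(addPerson(false))) // wrapped generic error
}
```

Because errors.As walks the whole chain of wrapped errors, the check keeps working no matter how many layers add their own message on the way up.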

3. There are strange errors that are actually caused by timeouts

The above solution almost worked. However, one of our tests was flaky: once every 10 runs or so, it would return a 500 error when it was supposed to return a 503 timeout error. Essentially, it was treating a timeout error as a “generic” error. The reason was that some errors caused by timeouts got translated into non-timeout errors. Why?

Here is the situation that we ran into. Imagine you have a bit of code that starts a DB transaction, makes an INSERT query, and then commits the transaction. If the timeout expires during the INSERT, the DB API call fails with a timeout error, which is easy to detect. However, if the timeout expires in the time window after the INSERT but before the COMMIT, the SQL library rolls back the transaction on its own, because the context carrying the deadline was attached to the transaction when it was started.

When we then try to COMMIT the transaction, instead of a timeout error we get an error saying we are trying to commit a transaction that has already been rolled back. This error is NOT a timeout error, so our code treats it as some unrelated error: a beautiful race-condition bug. It is easy to reproduce if you add a time.Sleep(10 * time.Second) between the last query and the commit. (How to discover this is left as an exercise for the reader :)

4. Error handling with timeouts (part 2)

Fortunately in the above case, Go comes to our rescue. If the timeout on our context expires, we are guaranteed that ctx.Err() would be context.DeadlineExceeded (or an error wrapping it). So to handle our error, our code should now be:

if IsTimeoutError(ctx.Err()) {
	return ...HTTPTimeoutError...
}

Where ...HTTPTimeoutError... should be replaced with your own code to return the appropriate HTTP error.
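
The sketch below imitates that race without a real database, to show why looking at the context catches what looking at the error alone misses. Everything in it is hypothetical: errTxRolledBack stands in for whatever your SQL library returns, commitAfterDeadline simulates a commit that happens after the deadline, and classify checks both the returned error and ctx.Err().

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// errTxRolledBack imitates the non-timeout error a commit returns once the
// transaction was rolled back. The real error depends on your SQL library.
var errTxRolledBack = errors.New("transaction has already been rolled back")

// IsTimeoutError reports whether err, or any error it wraps, is a timeout.
func IsTimeoutError(err error) bool {
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// commitAfterDeadline simulates the race: the deadline passes between the
// last query and the commit, so the commit reports a rollback.
func commitAfterDeadline(ctx context.Context) error {
	<-ctx.Done() // wait until the deadline has expired
	return fmt.Errorf("unable to add person: %w", errTxRolledBack)
}

// classify returns "timeout" if either the error or the context says so,
// and "other" otherwise.
func classify(ctx context.Context, err error) string {
	if IsTimeoutError(err) || IsTimeoutError(ctx.Err()) {
		return "timeout"
	}
	return "other"
}

// demo runs the simulated race and returns what each check concludes.
func demo() (errAloneIsTimeout bool, withContext string) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	err := commitAfterDeadline(ctx)
	return IsTimeoutError(err), classify(ctx, err)
}

func main() {
	errAlone, withContext := demo()
	fmt.Println(errAlone, withContext) // prints "false timeout"
}
```

The returned error on its own is misclassified, while the context still remembers that the deadline expired.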

5. Testing your timeout code

How should we test our code? Testing timeouts can be particularly tricky. First, we need to understand the requirements of our timeout test:

The test should fail if the server ignores the timeout parameter (i.e. API calls are allowed to run indefinitely)
The test shouldn’t be slow - it should run in under 1 second
The test shouldn’t be flaky - its success should not depend on the speed of the machine on which it runs
The test should verify that a regular call returns the correct HTTP status, and that the same call with a short timeout returns HTTP 503
The test should verify that the backend processing was stopped due to the timeout - e.g. a transaction inserting values to the database should not be committed.

This is quite a tall order. Here is my initial approach:

Pick a particular standard API call. In our case AddPerson.
Call it with a standard timeout. It should succeed and you should be able to observe its effects (a person was added.)
Call it with a very short timeout.
It should fail, and you should be able to observe that no change was made to the state of your system. (No second person was added.) This makes sure that the backend processing was halted.

The first complication:

Due to the way the API is structured, observing the state of the system is done with an API call. This is all fine and dandy until you realize that reading the state of the system will also fail due to the timeout. The solution: restart the server with a new, longer timeout, without resetting the DB.

The second complication:

On fast machines, the action that should fail will actually succeed - the timeout is not short enough. If it’s too short, the server won’t even start processing the request and you’re not testing anything. So you need to find a sweet spot for the timeout in the test. Let’s say, 10ms.

However, on some faster machines, even that is enough time for the action to sometimes succeed, which makes the test flaky. The solution? Fault injection. We add a configuration option, called FaultInjection, that is not user visible and is only used in testing. For one particular value, the AddPerson API call makes an SQL query with pg_sleep(1) (note: this is PostgreSQL-specific). The one-second sleep is far longer than the test’s timeout, so the call reliably fails on timeout regardless of machine speed, and our timeout test is now stable.
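
The idea can be sketched without a database, using the standard net/http/httptest package. This is an illustration under assumptions, not Vault's actual test: the server type, its injectDelay field (playing the role of the FaultInjection option and pg_sleep), the in-memory people counter, and the 201 success status are all made up for the example.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// server is a hypothetical stand-in for the real service.
type server struct {
	timeout     time.Duration // per-request timeout
	injectDelay time.Duration // fault injection: artificial delay before the "commit"
	people      int64         // number of persons "committed"
}

// addPerson simulates a slow query followed by a commit, under a deadline.
func (s *server) addPerson(w http.ResponseWriter, r *http.Request) {
	ctx, cancel := context.WithTimeout(r.Context(), s.timeout)
	defer cancel()

	select {
	case <-time.After(s.injectDelay): // the injected delay plays the role of pg_sleep
		atomic.AddInt64(&s.people, 1) // "commit"
		w.WriteHeader(http.StatusCreated)
	case <-ctx.Done(): // deadline hit (or client gone): nothing is committed
		http.Error(w, "timeout exceeded", http.StatusServiceUnavailable)
	}
}

// runScenario starts a server, calls it once, and reports the HTTP status
// and the number of committed persons. Arguments are in milliseconds.
func runScenario(timeoutMs, delayMs int) (int, int64, error) {
	s := &server{
		timeout:     time.Duration(timeoutMs) * time.Millisecond,
		injectDelay: time.Duration(delayMs) * time.Millisecond,
	}
	ts := httptest.NewServer(http.HandlerFunc(s.addPerson))
	defer ts.Close()

	resp, err := http.Post(ts.URL, "application/json", nil)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, atomic.LoadInt64(&s.people), nil
}

func main() {
	fmt.Println(runScenario(500, 0))   // regular call: succeeds, one person added
	fmt.Println(runScenario(20, 1000)) // injected delay: 503, nothing added
}
```

Because the injected delay is much longer than the timeout, the outcome does not depend on how fast the machine is, and the failing scenario still completes in a few tens of milliseconds.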

Summary

So there you have it, timeouts implemented with a stable test that proves that they work.

About the author

Imri Goldberg

Tech Lead Advisor

Imri is a lifelong software engineer who loves Python and has taken on CTO roles as an entrepreneur. He has been writing code professionally for his entire life, and he is passionate about building SaaS companies.
