Production Ready Requests
I find myself often helping people with their usage of python-requests. The docs exist for all of these things, but sometimes it is hard to piece them together, and frankly it's hard to give people a bunch of links every time, so I'm putting this together as a cheat sheet of sorts. The tips included can be summarized as:
- Use a Session
- Data Optimizations
- Use timeouts
- Be careful about threads and gevent
- TLS tricks and trip-ups
- Customize Your User-Agent
- Use redirects well
- Learn just enough about HTTP and networking to know where to look next
- Remember that python-requests is not a browser
- Understand the limitations of Python
Improve Performance
One of the things I hear often is complaints about the performance people observe with python-requests. As part of this I tend to ask the following questions:
- What is your traffic pattern?
- Are you talking to the same service almost all the time or do you talk to a multitude of unpredictable services?
- How much data are you managing?
- Do your services support keep-alive?
- What have you measured thus far?
- How are you using python-requests today?
Depending on the answers above, I tend to have follow-ups that lead me to some set of the following answers.
Use a Session
If you're largely talking to the same service (or set of services), connection pooling is your friend. If you're doing something like:
import requests
requests.get(url, ...)
requests.post(url, ...)
A lot, then what you're effectively doing each time is:
# requests.get(url, ...)
s = requests.Session()
s.get(url, ...)
s.close()
# requests.post(url, ...)
s = requests.Session()
s.post(url, ...)
s.close()
This can be fine for prototyping or debugging, but in production applications where performance is important, it's very wasteful, especially because every requests.Session you create builds two urllib3.PoolManager instances (one per mounted HTTPAdapter), etc.
If you instead create a session that you share across these places, it would look like:
s = requests.Session()
s.get(url, ...)
s.post(url, ...)
And if the connection can be kept open after the get, then you can reuse that connection on your post. That avoids:
- A DNS lookup
- Socket creation, including finding an address from DNS that accepts the connection within your connection timeout setting
- A TLS handshake
- And a few other less costly things.
Be Smart About Data
This goes for both sending and receiving data.
Sending
Since Python 3.7 [1], http.client allows one to specify a blocksize when constructing an HTTPConnection [2]. By default, the standard library will only send blocks of 8,192 bytes at a time from the request body; urllib3 defaults to 16,384 bytes. For large enough request bodies, 8,192-byte blocks were atrociously slow, so the larger default is a significant performance improvement.
That said, if you know that you need to send a lot of data and the remote service can handle larger chunks, you can set a bigger blocksize via a custom HTTPAdapter.
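As a rough sketch (assuming a recent urllib3 that forwards extra pool keyword arguments down to the underlying connections; the 64 KiB value is just an illustrative choice), that might look like:

import requests
from requests.adapters import HTTPAdapter

class LargeBlocksizeAdapter(HTTPAdapter):
    # Forward a larger blocksize to the connections urllib3 creates for us.
    def init_poolmanager(self, *args, **kwargs):
        kwargs["blocksize"] = 64 * 1024
        super().init_poolmanager(*args, **kwargs)

s = requests.Session()
s.mount("https://", LargeBlocksizeAdapter())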
Furthermore, python-requests has one quirk that can cause you headaches if you're unaware. When sending multipart/form-data requests, if you have a file-like object, the entire file is read into memory if you do:
s.post(url, files={"myformfieldname": open("large_file", "rb")})
Instead, you need to rely on the requests-toolbelt package [3], e.g.,
import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder
s = requests.Session()
m = MultipartEncoder(
    fields={'field0': 'value', 'field1': 'value',
            'field2': ('filename', open('file.py', 'rb'), 'text/plain')}
)
r = s.post(url, data=m, headers={'Content-Type': m.content_type})
That will avoid loading everything into memory for you.
Receiving
As for receiving data, it's important to know that stream=True is your friend for large response bodies. If you don't tell python-requests that you want it to stream the response, the entire response body will be loaded into Response.content as bytes.
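A minimal sketch of streaming a large download to disk (the chunk size and filename here are arbitrary choices):

with s.get(url, stream=True) as r:
    r.raise_for_status()
    with open('large_download', 'wb') as fd:
        # iter_content yields the body incrementally instead of buffering it all
        for chunk in r.iter_content(chunk_size=64 * 1024):
            fd.write(chunk)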
Character Encoding
If you rely on Response.text and you have a large amount of data, be aware that the Content-Type header may not specify a character set. python-requests will try other heuristics when there's no explicit character set, and if all of those fail it falls back to a third-party character-detection library, which can introduce a large delay if you're not expecting it.
In that case, you may find it best to set Response.encoding yourself to prevent Response.apparent_encoding from being used.
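For example, if you already know the service always returns UTF-8 (an assumption you would have to confirm for your own service), you can say so explicitly:

r = s.get(url)
# Setting the encoding up front means apparent_encoding (detection) is never consulted.
r.encoding = 'utf-8'
text = r.text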
Use Timeouts
A lot of the time people can get away without specifying timeouts for python-requests, and when they do start specifying them, they often get tripped up by one big thing. So let's start there:
Timeouts Are Not Total Time Timeouts
In other words, if you wrote
s.get(url_that_trickles_data_slowly, timeout=5)
Some people expect that if the server hasn't returned all the data within 5 seconds, the call will time out, and are later surprised to learn that the response took 10 seconds, 30 seconds, 5 minutes, or longer.
A plain timeout value in python-requests specifies both the time spent waiting for a socket to connect (a.k.a., the connection timeout) and the time to wait for bytes on a socket (a.k.a., the read timeout) as the same value. So here, you're telling python-requests to wait 5 seconds for the connection before cancelling it and to wait 5 seconds for the server to return bytes before disconnecting. The read timeout, however, is 5 seconds until the next bytes arrive, not 5 seconds until the end of the response. So if a server behaves as follows:
for some_bytes in data:
    write(response, some_bytes)
    sleep(3)
Then you would say that the response appeared to take:
duration_of_connection_in_ms
+ number_of_chunks_sent_by_server * 3000
To use numbers to really exaggerate this, let's say it took us 2,135 milliseconds to open a connection to the server, and the server wrote 150 chunks of data, appearing to sleep for three seconds after each one. That math would look like
2135 + 150 * 3000 = 452135
452135 / ( 60 * 1000 ) = 7.535583333
So it took the server 452,135 milliseconds (or approximately 7.54 minutes) to fully respond.
The behaviour people expect here is often called a "Wall Clock Timeout". There are ways to do that in Python, but if you're a library that has to work on many different operating systems and other implementations, then there's no good universal way to do it.
How to Get Good Timeouts
So what we can see above is that this is hard if the remote service you're talking to has pathological behaviour (even if it's only 1% of the time). If you trust the remote service not to exhibit that behaviour, then you can rely pretty confidently on the built-in timeouts. I would suggest, however, specifying separate connection and read timeouts. In python-requests that code would be
connect_timeout_s: float = ...
read_timeout_s: float = ...
s.get(url, timeout=(connect_timeout_s, read_timeout_s))
Note: you can specify a float to get sub-second resolution.
Right now, there's not much instrumentation in python-requests or urllib3, but if we do grow it, you can rely on your instrumentation to find your median and p99 and use that to determine what's reasonable for a given remote service. In the absence of that, I would suggest having a connection timeout that is no greater than one second (1.0) and a read timeout that is no higher than five seconds (5.0). This obviously has some caveats:
- If you're crossing an ocean (Atlantic or Pacific) then you likely want higher timeout values
- If you know you have a problematic server then you should set it appropriately and ignore my generic rules above
- Be mindful of DNS. It has happened often that a service has lots of IPv6 addresses and IPv4 addresses published in DNS, and going through all of them to get to a working address takes far longer than anticipated. In that case, you likely want something much lower than one second for your connect timeout so each attempt fails faster
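If you want defaults like these applied everywhere without repeating the tuple on every call, one common pattern (sketched here; this is not a built-in feature of python-requests) is a small Session subclass:

import requests

class DefaultTimeoutSession(requests.Session):
    # Apply a (connect, read) timeout to every request unless the caller overrides it.
    def request(self, method, url, **kwargs):
        kwargs.setdefault('timeout', (1.0, 5.0))
        return super().request(method, url, **kwargs)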
Gotchas and Common Issues
I see a few of these often, and I think they're just rare enough that people don't think to look into the documentation closely or look for past issues.
Be Careful With Threads and gevent
python-requests is not guaranteed to be thread-safe, but there are ways to work around that. If you use a requests.Session per thread, then this isn't a concern for you. You'll still have connection pooling, but per thread, so you'll want to track which remote services are being requested from which threads.
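One way to do that is thread-local storage; here's a sketch (get_session is a hypothetical helper name):

import threading

import requests

_local = threading.local()

def get_session() -> requests.Session:
    # Lazily create one Session per thread so pooled connections are never shared across threads.
    if not hasattr(_local, 'session'):
        _local.session = requests.Session()
    return _local.session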
Most people don't use threads, though, but many do use gevent. You'll want to ensure that before you import anything else, you import gevent and monkey-patch the world. For example, urllib3 uses a last-in-first-out queue to pool connections, but because it relies on being able to set that class attribute at import time, if gevent monkey-patches the standard library's queue.LifoQueue after urllib3 is imported, the queue never gets patched properly.
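In other words, the patching has to come first. A minimal sketch:

# This must run before requests/urllib3 (or anything else touching sockets or queues) is imported.
from gevent import monkey

monkey.patch_all()

import requests  # imported only after the standard library has been patched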
TLS Tricks and Trip-ups
As I mentioned above in Use a Session, urllib3 and python-requests will pool connections for you that can be kept alive. The beauty of it is that it improves how quickly you can send the next request to the remote service if a connection is already open to it. One of the reasons that may be faster is not having to renegotiate a TLS handshake.
That, of course, may also be a downside. Let's say that you have two code paths that do similar things with the same service but that service is a little... unconventional. Those paths may look like:
import requests


def f1(s: requests.Session, data: bytes) -> requests.Response:
    url: str = 'https://example.local/f1'
    return s.post(
        url,
        data=data,
        headers={'Content-Type': 'application/vnd.mycustom+json'},
        verify=True,
    )


def f2(s: requests.Session, data: bytes) -> requests.Response:
    url: str = 'https://example.local/f2'
    return s.post(
        url,
        data=data,
        headers={'Content-Type': 'application/vnd.othercustom+json'},
        verify=False,
    )


def something_else(s: requests.Session):
    r1 = f2(s, b'...')
    r2 = f1(s, b'...')
The problem, however, is that depending on the order in which you call these, you can get behaviour you do not want. In a call to something_else above, we first create a new connection to our endpoint with TLS verification disabled. Then we make a second call where we explicitly want verification enabled. If we're able to reuse the first connection, though, then we will not re-establish a TLS connection, and so we never verify the chain of trust on the second call.
This affects a number of older versions of requests, but the fix is in the next release.
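If you're stuck on an affected version, one simple mitigation (a sketch, not the upstream fix) is to never let verified and unverified traffic share a Session, so they can never share a pooled connection:

import requests

verified = requests.Session()

unverified = requests.Session()
unverified.verify = False  # only ever used for the endpoint we knowingly skip verification on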
Customize User-Agent Header
One thing that people often seem to run into is being bucketed into a group with abusive users of python-requests. Many servers now look for the default python-requests User-Agent string and change the behaviour of their service when they see it. Whether or not this applies to you, and even if you're only talking to services you own or that are internal to your company, you should set a User-Agent header that distinguishes you from everyone else.
These are by no means meant as authorization or authentication, but they will help when you're hunting for hints as to what's happening.
Personally, I recommend the name of your library/service and whatever version of it makes sense to you. If you distribute a library for others to use to interact with your service and you're on the hook for supporting that library, I also suggest including the versions of critical dependencies. Luckily, there's a utility that makes this easier.
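The simplest version is setting the header once on your Session; the name and version below are placeholders, and requests-toolbelt's user-agent helpers can build a richer string (including dependency versions) for you if you want:

import requests

s = requests.Session()
# 'my-service/2.3.1' is a made-up example; use your own name and version.
s.headers['User-Agent'] = f"my-service/2.3.1 python-requests/{requests.__version__}"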
Use Redirects Well
By default, python-requests handles redirects, and it has had this support for quite a while. It also limits how many redirects it will follow, defaulting to 30. If unlimited redirects were allowed, someone could easily create a server with infinite redirects that ties up a client indefinitely, creating a denial of service.
My advice here is that you probably want to set a lower limit than that. Personally, I always lower it to 3 as a starting point (or, if I know I should never see a redirect, I disable them entirely). I also deliberately combine this with constrained timeouts. I constrain my timeouts because, as I mentioned above, a server can trickle data to you without tripping a read timeout. So if it trickles data to you but also redirects you after it finishes writing all that data, then a single request can take as long as the time to connect and read all of that data multiplied by the number of redirects.
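Concretely, that looks something like this (3 and the (1.0, 5.0) timeout are just the starting points suggested above):

import requests

s = requests.Session()
s.max_redirects = 3  # fail fast instead of following up to the default of 30

# Or, for calls that should never be redirected at all:
r = s.get(url, allow_redirects=False, timeout=(1.0, 5.0))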
Learn Just Enough About Networking
Finally, the team behind python-requests is very small and we have full-time jobs and families. Many times we get people who file issues "urgently" seeking help with an exception they're encountering that they could otherwise learn about themselves. Unfortunately for them, there is no guarantee of support at all from python-requests and definitely no urgent support available.
If instead you learn a little bit about what goes into HTTP, HTTPS, etc. (for example, you could start by reading a zine), then you'll be better able to help yourself. The more basic understanding you have of what's happening, the easier it becomes to find the information you need online.
If you want an outline, here are some good starting points:
- Learn what TLS is (and why it is sometimes still called SSL):
- Keep in mind that when establishing a TLS connection, things can get in the way
- Learn a little about certificates and how to get information about them, e.g., using openssl s_client.
- Learn a little about TCP and DNS:
- Know that a connection can go through intermediaries and so sometimes you get a vague message about a "Peer".
- Know that DNS is the backbone of much of the infrastructure you need to care about. (Bonus points: Learn how to use a tool to inspect DNS, like dig)
- Know that sometimes poor performance is caused by having to break up large messages across multiple TCP packets
- Learn a little about network layers
- Know that there are seven different layers of the network
- Know how "appliances" operating at different layers of the network may
affect your usage of any HTTP client
- For example, know that an "appliance" (e.g., a load balancer) operating at layer 4 cannot modify the HTTP request or response bodies, but can affect the TCP packets. (This is useful to know if you care about preserving aspects of the client's information. See also [4] [5])
- Also know that at layer 7, the "appliance" is what you establish TLS with, and it may not establish TLS with the service behind it that you are hoping to communicate with securely. Layer 7 also allows the "appliance" to modify aspects of the request and response.
Remember python-requests is not a Browser
While a lot of the time, python-requests looks to browsers and curl to determine what the "right" thing might be to do, it is not in fact a browser. It will not do the following:
- HTTP Strict Transport Security, a.k.a., HSTS
- Render/Execute JavaScript
- Save login sessions across executions of the Python interpreter
- Follow redirects in the <head> section of an HTML page
- Or do any other number of things that you might expect of a browser like Firefox or Chrome.
Understand the Limitations of Python
Additionally, there are some things that python-requests cannot do because of Python. For example, exposing the TLS certificate chain (verified or not) to the end user is not possible because that information isn't available anywhere in Python's ssl module. Likewise, we cannot do Online Certificate Status Protocol (a.k.a., OCSP) verification to check for revoked certificates, because there's no way to do that with ssl during the connection handshake today. And if you want to wait for and act on a 100 Continue informational response, python-requests effectively relies on http.client, which does not support that.
Conclusion
There are lots of things that can be done to improve your usage of python-requests (and really any HTTP library you are using). It starts with looking at how the library behaves and what you're observing, forming a hypothesis, and then testing what happens when you change something (as in any experiment).
Footnotes
[1] At the time of this writing, 3.7 and everything before it is unsupported, so surely no one is using that, right?
[2] https://docs.python.org/3/library/http.client.html#http.client.HTTPConnection
[3] https://toolbelt.readthedocs.io/en/latest/uploading-data.html#streaming-multipart-data-encoder
[4] https://www.haproxy.com/blog/use-the-proxy-protocol-to-preserve-a-clients-ip-address
[5] https://github.com/haproxy/haproxy/blob/master/doc/proxy-protocol.txt