Nginx and weird 416 responses
We have been seeing lots of 416 responses on our nginx-based video servers recently. Well… not that recently in fact. They have been there since we first looked at an access log, but the pattern is a single client getting tens of them every second for a few seconds, so we thought it might be a problematic flashplayer or something like that. Deeper inspection proved otherwise. But first, WTF is 416?
416 is a client-error type HTTP response code which is defined as “Requested Range Not Satisfiable”. It means the client requested some (byte) range of a resource but that range does not exist in the resource. A simple example: the file is 1 MB but the client requested the portion between the 2nd and 3rd MB of it. The server cannot reach the 2nd MB of a 1 MB file, so it will respond with 416. All the details are in RFC 2616 section 14.35.
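To make the simple example concrete, here is a minimal Python sketch of the satisfiability check. It is an illustration only, not nginx's actual code: the function name `respond_to_range` and the `MB` constant (taken as 1024 * 1024 bytes) are made up for this post, and it assumes a single, already well-formed range with first <= last.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def respond_to_range(first, last, size):
    """Return (status, served_range) for 'bytes=first-last' on a resource
    of `size` bytes. Assumes 0 <= first <= last (a well-formed range)."""
    if first >= size:
        # The range starts past the end of the file: nothing to serve.
        return 416, None
    # A last position past the end is clamped to the final byte.
    return 206, (first, min(last, size - 1))


# The example from the text: a 1 MB file, bytes between 2 MB and 3 MB requested.
print(respond_to_range(2 * MB, 3 * MB - 1, 1 * MB))
# A range that does fit inside the file.
print(respond_to_range(0, 499, 1 * MB))
```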
In our case, it didn’t actually make much sense, since most of our video is played on flash-based players and, as far as we know, flashplayer is unable to dispatch an HTTP request with a Range header (I don’t know much about flash so I am not actually sure about this). But it happens and generates a lot of wasted traffic, and probably some unhappy clients.
So, after some clever but dirty tcpdumping, we saw that those requests were not like the simple example I just mentioned but actually carried invalid Range headers. In that dirty tcpdump output we saw things like:
Range: bytes=7259-7258
Range: bytes=10513-10512
Range: bytes=0--1
There is a pattern here: some client, some browser, some code with an off-by-one error causes last-byte-pos to be one less than first-byte-pos, which makes the Range header syntactically invalid, as the RFC puts it. This is the client’s problem. But the next sentence in the RFC says the recipient (nginx) MUST ignore such headers. And AFAICT if you ignore a header you should process the request as if that header was never there, so you should respond 200 with all the content. It seems nginx has a bug there.
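Here is a rough Python sketch of that rule as we read it in RFC 2616 section 14.35.1. It is our interpretation, not code from any real server: `should_ignore_range` is a made-up name, and treating a unit other than bytes as "ignore" is an assumption of the sketch.

```python
import re

# byte-range-spec ("first-last" or "first-") or suffix-byte-range-spec ("-N")
_SPEC = re.compile(r"(\d+)-(\d*)|-(\d+)")


def should_ignore_range(value):
    """Return True if the Range header value is syntactically invalid
    and should therefore be treated as if it was never sent."""
    unit, sep, range_set = value.partition("=")
    if sep != "=" or unit.strip() != "bytes":
        return True  # assumption: only the bytes unit is understood
    for spec in range_set.split(","):
        match = _SPEC.fullmatch(spec.strip())
        if match is None:
            return True  # does not fit the grammar at all, e.g. "0--1"
        first, last, _suffix = match.groups()
        if first is not None and last and int(last) < int(first):
            return True  # last-byte-pos is less than first-byte-pos
    return False


for header in ("bytes=7259-7258", "bytes=10513-10512", "bytes=0--1", "bytes=0-499"):
    print(header, "ignore" if should_ignore_range(header) else "honor")
```

One invalid spec is enough to ignore the whole header, which is why the loop returns on the first bad one.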
Our first course of action in this kind of situation is to google the hell out of the problem, since we generally believe we are not alone. But this time we found nothing. Next, we tried apache and lighttpd with the same request and they responded 200 as the RFC suggests. So we replaced our nginxes with apache… just kidding.
We thought it’d be possible to work around this by using some Lua in our config, so here is what we came up with:
header_filter_by_lua '
    if ngx.var.http_range then
        -- drop the leading "bytes=" (6 characters)
        local brange = string.sub(ngx.var.http_range, 7)
        local dash = brange:find("-", 1, true)
        if dash then
            local start = tonumber(brange:sub(1, dash - 1))
            local stop = tonumber(brange:sub(dash + 1))
            -- stop below start: drop the header
            if start and stop and stop < start then
                ngx.req.set_header("Range", nil)
            end
        end
    end
';
Without any prior knowledge of Lua and searching for every basic operation, this is as good as it gets. We strip the bytes= part first, then cast the parts before and after the first dash into numbers. If the second number is less than the first, the range is syntactically invalid, so we remove the header. Sometimes all that string manipulation and casting will fail and the start or stop vars will be nil. If that happens, we do not touch anything and let the nginx core do its work.
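To see which headers this logic would actually drop, here is a small Python port of the same steps that can be run outside nginx. It is only an approximation: `to_number` stands in for Lua's tonumber using int(), which does not accept exactly the same inputs, `drops_range_header` is a made-up name, and the port assumes that a value with no dash is left untouched.

```python
def to_number(text):
    """Rough stand-in for Lua's tonumber: None instead of nil on failure."""
    try:
        return int(text)
    except ValueError:
        return None


def drops_range_header(value):
    """Mirror the Lua steps: strip "bytes=", split on the first dash,
    and report whether the Range header would be removed."""
    brange = value[6:]           # same as string.sub(value, 7) in Lua
    dash = brange.find("-")
    if dash == -1:
        return False             # assumption: no dash means leave it alone
    start = to_number(brange[:dash])
    stop = to_number(brange[dash + 1:])
    return start is not None and stop is not None and stop < start


for header in ("bytes=7259-7258", "bytes=0--1", "bytes=0-499", "bytes=-500", "bytes=500-"):
    print(header, "removed" if drops_range_header(header) else "untouched")
```

Note that bytes=0--1 gets removed because the part after the first dash is "-1", which casts to a negative number, while open-ended and suffix ranges end up with a nil on one side and are left for the nginx core.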
At the time of writing this post, we haven’t moved this piece into production yet but some local tests showed it’s OK. Just paste it into the server section of your nginx conf and you should be good to go.