Heisenbug of the Day: IIS 7.0 Discarding POST Data From Firefox 3 when using Custom 404 Handlers

Our site uses IIS custom error handlers, so that when you request /not/a/real/page.html, it’ll actually run /errors/404.asp – there’s a nice article on 4guysfromrolla about how you do this in classic ASP.

In theory, this works for both GET and POST requests, but last week we hit a snag – some of our jQuery Ajax code wasn’t working properly in Firefox 3. More specifically, it worked fine locally and it worked fine in every other browser, but when we deployed the code to any of our test or live servers, it wouldn’t work in Firefox. IE, Opera, Safari, Chrome – all fine; only Firefox seemed to be affected.

| Method | Server                 | Request URL     | Result |
|--------|------------------------|-----------------|--------|
| GET    | IIS 7.5 (Windows 7)    | /errors/404.asp | OK     |
| POST   | IIS 7.5 (Windows 7)    | /errors/404.asp | OK     |
| GET    | IIS 7.5 (Windows 7)    | (404 handler)   | OK     |
| POST   | IIS 7.5 (Windows 7)    | (404 handler)   | OK     |
| GET    | IIS 7.0 (Windows 2008) | /errors/404.asp | OK     |
| POST   | IIS 7.0 (Windows 2008) | /errors/404.asp | OK     |
| GET    | IIS 7.0 (Windows 2008) | (404 handler)   | OK     |
| POST   | IIS 7.0 (Windows 2008) | (404 handler)   | FAIL   |

Firebug didn’t show up anything unusual, so we fired up Fiddler, a web debugging proxy that’ll show you what’s actually being passed between the client and the server. At least, that’s the idea… what actually happened is that when we started running Fiddler, the bug went away. Yep… we had ourselves a real live Heisenbug:

Heisenbug: “…a computer bug that disappears … when an attempt is made to study it.” [via Wikipedia]

Fiddler runs as an HTTP-level proxy – in other words, it understands the HTTP protocol, sits between your web browser and your web server, and – in theory – transparently forwards information between them, whilst recording all the bits that fly backwards and forwards so that you can dissect them and see what’s going on. My guess was that Firefox was somehow sending dodgy requests, and that Fiddler was cleaning them up as part of the proxying process – which would explain why the problem disappeared when Fiddler was running.

Time to dig a little deeper. Wireshark is a deep-level network protocol analyser that’ll sniff your network traffic right down to the frame level. What I did next was to load up Wireshark, set up a filter [1] to show only HTTP traffic to/from our build server, and then submit the same request from a couple of different browsers – including Firefox.

[Screenshots: Wireshark captures of the same POST, first from Google Chrome, then from Firefox]

The first grab there is what’s travelling over the wire when you POST that form using Google Chrome; the second is the same POST submitted using Firefox. Remember – at this point, we’re totally lost and so looking for absolutely anything that’s different. If you look closely, you’ll see the Chrome trace includes an extra line – [Reassembled TCP Segments (680 bytes)] – that isn’t in the Firefox trace. Apart from known differences like the User-Agent string, the two traces are identical. Curious. A bit of experimentation verifies that Safari and IE are doing the same thing as Chrome – submitting two frames of data for each POST – whereas Firefox is only submitting one.
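If you'd rather reproduce the two sending patterns without juggling browsers, something like the following Python sketch should do it. The host name `build.example` is a hypothetical stand-in for our build server, and the function names are mine. Bear in mind that TCP doesn't promise that one send call equals one frame on the wire, so treat this as a rough approximation and check the result in Wireshark.

```python
import socket
from urllib.parse import urlencode


def build_post(host, path, fields):
    """Return (header_bytes, body_bytes) for a form-encoded POST."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head, body


def send_post(host, path, fields, split, port=80):
    """Send the POST and return the raw response bytes.

    split=True  -> headers and body go out in separate sends (Chrome-like)
    split=False -> headers and body go out in a single send (Firefox-like)
    """
    head, body = build_post(host, path, fields)
    with socket.create_connection((host, port), timeout=10) as sock:
        # Disable Nagle's algorithm so the sender does not hold back
        # the second small write waiting for an ACK of the first.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        if split:
            sock.sendall(head)
            sock.sendall(body)
        else:
            sock.sendall(head + body)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Hypothetical usage against a URL that falls through to the 404 handler:
# send_post("build.example", "/not/a/real/page.html", {"real": "some_data"}, split=False)
```

Running it once with `split=True` and once with `split=False` against the same URL should give you the two traces to compare.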

It turns out this triggers a bug in IIS 7.0 when you’re using custom 404 handlers.

Bad Analogy Time…

Imagine it’s your birthday. You’ve got a load of packages to open – you open the first one, and there’s a card inside saying “Happy Birthday! Enjoy the Lego! Love Granny xxx”

Now – at this point, you’re expecting some Lego, right? Well, if Granny is Chrome, IE or Safari, she’s been sensible – she’s sent the card in its own envelope, and put the Lego in the next parcel. But, if Granny is Firefox, then Granny has done something foolish, and has crammed as many of the Lego bricks as she can into the same envelope as the birthday card. If the Lego set is only tiny, then she can fit all the bricks into the envelope – and so won’t bother sending the now-empty Lego box.

So… imagine the envelopes/parcels are TCP frames, the birthday card is your HTTP request, and the Lego is the associated POST data. The card (headers) says “hey, there’s more stuff coming” – and then somewhere close behind, there’s another package with that “stuff” in it.

Now, onto the IIS 7.0 bug. Under normal circumstances, IIS 7 copes just fine with POST data being in the same frame as the actual request. (That’s why this bug doesn’t affect every Firefox user who visits an IIS 7 site.)

Thing is - when a request is processed by a custom 404 handler, IIS 7.0 is opening the envelope, finding the birthday card, going “whooopeee! Lego!” – and then throwing the envelope away without checking to see if there’s any Lego in it, before looking around excitedly to see where the next parcel is.

For very small POSTs, this results in the Request.Form being empty (because all the Lego has been thrown away with the envelope). If you deliberately pad your POST with a couple of really long fields – <input name="padding" value="xxxxxxxx … xxx" /> for 2000 characters or so – then you’ll see that even Firefox now has to split the request over more than one frame, and that any POST values that end up in the second frame are now accessible to IIS via Request.Form in the usual way. Kinda like Granny sending you a really big Lego set, putting the first 20 or so bricks in the envelope with the birthday card and the rest in a separate parcel or two – throw away the envelope, and you’ve still got *most* of the bricks, but some of them have gone missing.
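To make the behaviour concrete, here's a toy model in Python. It is emphatically not IIS's actual code, just a sketch of the symptom as we observed it: whatever body bytes share the first frame with the headers get dropped, and only later frames are parsed. The 1460-byte frame payload, the `build.example` host and all the function names are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode

MSS = 1460  # assumed payload bytes per TCP frame (typical on Ethernet)


def to_frames(request, mss=MSS):
    """Chop a raw request into frame-sized payloads."""
    return [request[i:i + mss] for i in range(0, len(request), mss)]


def correct_form(frames):
    """Reference behaviour: the body is everything after the blank line."""
    _, _, body = b"".join(frames).partition(b"\r\n\r\n")
    return parse_qs(body.decode("ascii"))


def buggy_form(frames):
    """Toy model of the bug: the first frame (headers plus any body bytes
    that travelled with them) is discarded; only later frames are parsed."""
    return parse_qs(b"".join(frames[1:]).decode("ascii"))


def raw_post(fields):
    """Build a minimal raw POST request for the given form fields."""
    body = urlencode(fields).encode("ascii")
    head = (
        "POST /not/a/real/page.html HTTP/1.1\r\n"
        "Host: build.example\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    ).encode("ascii")
    return head + body


small = to_frames(raw_post({"real": "some_data"}))
padded = to_frames(raw_post({"ff_frame": "x" * MSS, "ff_split": "1", "real": "some_data"}))

print(correct_form(small))  # {'real': ['some_data']}
print(buggy_form(small))    # {}
print(buggy_form(padded))   # {'ff_split': ['1'], 'real': ['some_data']}
```

In the padded case the frame boundary lands somewhere inside the run of Xs, so the second frame starts with a leftover fragment of padding followed by the fields that come after it.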

So… workarounds. Firefox patch – no good. Too many installed users. Upgrade all our web servers to IIS 7.5? Er, not right now, thanks. IIS hotfix? Lovely – if you’ve got one, send it over.

In the meantime, the best option we could find was to stick two hidden fields at the top of the affected form, something like:

<input type="hidden" name="ff_frame" value="xxxxx . . . 1460 Xs here  . . . xxxx" />
<input type="hidden" name="ff_split" value="1" />

<!-- everything after this point will show up intact in Request.Form -->

<input type="hidden" name="real" value="some_data" />

The big string of Xs pads the first frame to make sure all your real data ends up in the second one, and the ff_split value ensures that this padding doesn’t mess up IIS’s parsing of subsequent POST values. Yes, this is disgusting – and it adds over 1 KB to every POST – but it’s only required in a handful of places, and we’re looking to isolate it inside the jQuery code we’re using so it’ll be dynamically inserted into POSTs where necessary.
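The padding logic itself is tiny. Here it is sketched in Python rather than jQuery, purely to show the shape of it; `pad_form_fields` is a hypothetical helper, not something from our codebase. The 1460 figure matches the usual TCP payload on Ethernet (a 1500-byte MTU minus 20 bytes of IP header and 20 bytes of TCP header), which is presumably where the number comes from. Field order matters here, so the sketch uses a list of pairs rather than a dictionary.

```python
from urllib.parse import urlencode

PAD_LEN = 1460  # padding length from the workaround above


def pad_form_fields(fields, pad_len=PAD_LEN):
    """Return a new field list with the two workaround fields placed first."""
    padding = [("ff_frame", "x" * pad_len), ("ff_split", "1")]
    return padding + list(fields)


body = urlencode(pad_form_fields([("real", "some_data")]))
print(len(body))    # 1495
print(body[-26:])   # &ff_split=1&real=some_data
```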

[1] ((http.request || http.response) && (http.host contains "build")) && !(http.request.uri == "/favicon.ico")