Okay, I’m the first to admit it: I am sometimes obsessed with the tiniest of details. But this has been nagging at me for quite some time now: why do people prefer to use POST requests? Why do I find this option enabled in the Mid-Tier configuration all the time? I know you can choose with what request method your HTTP requests are being sent from the browser to Mid-Tier, but why would you change this from GET to POST?
I have had many discussions with system administrators trying to convince them to please use GET instead. You see, I’m a stickler – no, that’s the wrong word. I’m a champion for the GET request method, and I’m here to convince you.
Let’s take a step back first so we’re all clear on the terminology. Mid-Tier is a web application which requires a web server to run. In order to communicate with a browser we use a request-response protocol called HTTP. We don’t have much of a choice of course, that’s just the way the Internet works. I won’t be going into this in too much detail, but the point is that HTTP defines the way we need to send requests up and down between the browser and the web server.
One key request we send over is the backchannel request. I’ve mentioned this before: it’s what makes Mid-Tier interactive. Rather than blocking the entire interface, everything is done asynchronously in the background. We load the page, and if we need more data or instructions, the browser application sends out a backchannel request.
We send a lot of these requests; they’re an integral part of Mid-Tier. To make sure the browser application and the Mid-Tier server understand each other, we adhere to an internal standard when sending these requests. This is what a request to get records for a form looks like:
They’re encoded instructions which both the browser application and the Mid-Tier server understand. Since this is an Internet application, the underlying protocol is of course HTTP, the standard that details how clients (browsers) and servers communicate over the Internet. Everyone needs to be using the same thing, and that’s what HTTP describes. This is what a backchannel request looks like when it’s sent via HTTP:
This HTTP request is thus generated by the browser and consists of the following elements:
- The request line: POST http://localhost:8082/arsys764/BackChannel HTTP/1.1. This tells the server what the browser wants: it is requesting the resource with the address http://localhost:8082/arsys764/BackChannel from the server. We’ll have a look at the format in more detail later.
- Next we can see various request headers, such as Accept-Language, User-Agent, etc. These are included so the server knows who it’s dealing with. Who is this client? What encoding does it support? Etc.
- Next there’s an empty line which separates the header from the body of the request. Sounds obvious, but if this isn't included the server doesn't know where the header ends and the body begins.
- And then we have of course the message body which contains whatever data the client wants to send over. In our case the backchannel qualification.
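To make those four elements concrete, here is a minimal Python sketch that splits a raw request into its parts. The request itself is a hypothetical stand-in: only the request line comes from the example above, while the Accept-Language value and the `param=value` body are placeholders I made up, not a real backchannel payload.

```python
def split_http_request(raw: bytes):
    """Split a raw HTTP/1.1 request into request line parts, headers, and body."""
    # The first empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The request line has three space-separated elements.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Hypothetical request: header values and body are placeholders.
raw_request = (
    b"POST http://localhost:8082/arsys764/BackChannel HTTP/1.1\r\n"
    b"Accept-Language: en-GB\r\n"
    b"Content-Length: 11\r\n"
    b"\r\n"
    b"param=value"
)

method, target, version, headers, body = split_http_request(raw_request)
print(method, target, version)
print(headers)
print(body)
```

Without the empty line, `partition` would find no separator and everything would end up in the header section, which is exactly the problem a real server would have.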
I deliberately used a POST request format as it’s easier to understand. What I want you to look at is that first line, the request line. There are three elements here: the first is the request method, the second is the URL of the resource the browser is requesting, and the third is the version. The version is straightforward: the browser is simply telling the web server that it’s sending its request using HTTP version 1.1, so the web server can check whether it supports this version and knows what it looks like. Just to make sure the browser and the web server are on the same page, so to speak. The URL is straightforward as well; that’s the address the request is sent to. The one I want you to look at is the first element, the request method. Most people gloss over this, but I want you to really understand what it means. This is not something merely formal like specifying the HTTP version: it tells the server the method, the structure the browser is using to get the request across. Not only that, it tells the server what action needs to be performed.
That’s as far as the theory goes. I do actually agree that under most circumstances there are perfectly good reasons why you’d use GET or POST. Any static content should really be requested via GET; if you add an attachment, you of course use POST. But what about backchannel requests? You’re not necessarily retrieving data all the time, and the same goes for submitting data. This request asks the server for the data of a record in the User form:
This one asks the server to store a new record for the form frmtest:
Mid-Tier uses HTTP requests differently: it isn’t necessarily storing or retrieving data, so in principle it doesn’t matter which method you use and you’re free to do whatever you prefer. There’s even an option in the Mid-Tier configuration file config.properties to do just this:
But should you change this? Is there any point in switching your backchannel requests from GET to POST? Well, I don’t think so. Let’s first have a look at what the same backchannel request looks like when you send it via GET or via POST. This is our request where the browser asks the Mid-Tier server to compile a qualification:
This is that request the way it would look as a POST request:
The request line, the headers followed by the empty line and finally the data. This is the same request sent via the GET method:
Forget the encoding, that’s just to make sure the web server doesn’t confuse any of the characters for HTTP instructions. I hope you noticed that the request data is part of the URL now. And because the URL sits in the request line, the data travels with the header section of the request. We’re not using the request body at all. The GET request relies completely on the header.
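If you want to see the difference side by side, here is a rough Python sketch that builds the same data as a GET and as a POST. The parameter name and the qualification are hypothetical; a real backchannel request uses Mid-Tier’s own internal format, which I’m not reproducing here. The Host and Content-Type headers are the bare minimum I assumed for illustration.

```python
from urllib.parse import urlencode

# URL taken from the request line discussed earlier.
BACKCHANNEL_URL = "http://localhost:8082/arsys764/BackChannel"


def build_get(params: dict) -> bytes:
    """GET: the encoded data goes into the URL; there is no body."""
    query = urlencode(params)
    return (
        f"GET {BACKCHANNEL_URL}?{query} HTTP/1.1\r\n"
        "Host: localhost:8082\r\n"
        "\r\n"
    ).encode("ascii")


def build_post(params: dict) -> bytes:
    """POST: the same encoded data goes into the body, after the empty line."""
    body = urlencode(params).encode("ascii")
    head = (
        f"POST {BACKCHANNEL_URL} HTTP/1.1\r\n"
        "Host: localhost:8082\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


# Hypothetical parameter name and qualification, for illustration only.
params = {"param": "'Login Name' = \"Demo\""}

get_request = build_get(params)
post_request = build_post(params)
print(get_request.decode("ascii"))
print(post_request.decode("ascii"))
```

In the GET version everything ends at the empty line; in the POST version the data only starts there.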
You’re probably wondering why this matters. Okay, so they look different, so what? Well, that’s what I’m getting at: this has consequences for your network – you need to understand what request method you’re using, and do this for the right reasons. If you’re going to send your data differently, this will ultimately affect your network.
HTTP is an application protocol that relies on the underlying TCP protocol. Where HTTP defines what requests (and responses) should look like, TCP is the transport protocol that actually carries them across the network. We’re talking about bytes now. Don’t worry, I won’t go into too much detail, but at network level we’re not looking at HTTP requests, we’re looking at TCP/IP packets. There’s a lot of data being sent up and down and things can sometimes go wrong. If you use POST, the browser will typically send the headers and the body in two separate TCP/IP packets. That’s important to notice. If you have a very busy server you’ll have a lot of TCP/IP packets to deal with, and the more packets a request is spread over, the more likely it is that something goes wrong. You can’t always prevent this of course: if there’s simply too much data, even GET will have to use multiple TCP/IP packets. But POST consistently needs that extra packet for the body. It can get delayed, and the server then has to wait before it can piece the request together.
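Here’s a small Python sketch of what that waiting looks like from the server’s side. It is not how Mid-Tier or any particular web server is implemented; it’s a simplified model that only looks at Content-Length, and the requests are hypothetical placeholders.

```python
def missing_body_bytes(received: bytes):
    """Return how many body bytes are still outstanding.

    Returns None if the empty line that ends the header section
    has not arrived yet.
    """
    head, separator, body = received.partition(b"\r\n\r\n")
    if not separator:
        return None
    content_length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())
    return max(content_length - len(body), 0)


# Hypothetical requests; "param=value" is a placeholder payload.
get_packet = (
    b"GET /arsys764/BackChannel?param=value HTTP/1.1\r\n"
    b"Host: localhost:8082\r\n\r\n"
)
post_headers = (
    b"POST /arsys764/BackChannel HTTP/1.1\r\n"
    b"Host: localhost:8082\r\n"
    b"Content-Length: 11\r\n\r\n"
)
post_body = b"param=value"

print(missing_body_bytes(get_packet))                # 0: complete as soon as the headers arrive
print(missing_body_bytes(post_headers))              # 11: server must wait for the body
print(missing_body_bytes(post_headers + post_body))  # 0: complete once the body shows up
```

The GET request is complete the moment its headers are in. The POST request isn’t: if the body arrives late, the server sits there waiting for those outstanding bytes.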
Not convinced? Let’s have a look at the same example I used for this blog post. This is my request:
Notice the body is only sent after the server has already timed out (and returned the ARERR9350 error). You can see this happening at TCP/IP level if you work your way through network traces. This is Wireshark, a tool I often use to analyse network traffic:
See those TCP/IP packets I’ve been talking about? Don’t worry, this is quite complex to read, but what I want to point out here is that the packet I highlighted contains the request body, which is sent separately. If you work your way through the various packets you’ll notice there is a delay between the packets for the headers and the body. There are delays on the network and you want to avoid that at all costs.
I can’t stress enough the importance of avoiding network delays. I’m not only talking about performance problems; I'm talking about network problems as well. If you see all sorts of weird ARERR9350 and ARERR9351 errors, check this setting, or get your Wireshark out. GET tends to be faster with the smaller requests and your application needs to be as responsive as possible.
There’s no real advantage to using POST requests. System administrators often quote security considerations as the main reason for switching to POST. I don’t agree. The idea is that you don’t see the data because it’s hidden in the body of the POST request. That would be true for a page refresh, where the whole URL is visible in the browser’s address bar and therefore exposes all your data, but this isn’t the case here. Backchannel requests all happen in the background; the user won’t see them.
It’s true that responses to GET requests can be cached by the browser whereas responses to POST requests generally aren’t, but Mid-Tier can influence the caching. If you check your browser cache you won’t find the backchannel requests there, and they won’t appear in your browser history either. We do this by setting the cache headers:
The browser is instructed not to cache the requests, so a user won’t see them anyway. If security is a concern, you need to use SSL/TLS; switching to POST requests won’t help you here.
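As a generic illustration of what such cache headers look like, here is a short Python sketch. These are standard HTTP header values commonly used to prevent caching; I’m assuming them for the example, they are not necessarily the exact headers Mid-Tier sends.

```python
def no_cache_headers() -> dict:
    """Commonly used response headers that tell a browser not to store a response."""
    return {
        "Cache-Control": "no-store, no-cache, must-revalidate",
        "Pragma": "no-cache",  # for older HTTP/1.0 clients
        "Expires": "0",        # treat the response as already expired
    }


def format_response_head(status_line: str, headers: dict) -> str:
    """Render a status line and headers, ending with the empty line."""
    lines = [status_line] + [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


response_head = format_response_head("HTTP/1.1 200 OK", no_cache_headers())
print(response_head)
```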
However, backchannel requests can get very large. If you add a long qualification it can quickly result in a very long header. Some web servers don’t like this and restrict the maximum header size of a request. I acknowledge this is a concern, but I don’t think switching to POST is necessarily the answer here. True, your header will be a lot shorter, but you can also just increase the allowed header size of your server.
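To get a feel for how quickly a qualification eats into that limit, here is a rough Python sketch. The 8192-byte limit is an assumed value for illustration (check what your own web server is configured with), and the parameter name and qualifications are hypothetical. It only measures the request line, so the real header section will be a bit larger.

```python
from urllib.parse import urlencode

# Assumed limit for illustration; the real value depends on your web server.
ASSUMED_HEADER_LIMIT = 8192

# URL taken from the request line discussed earlier.
BACKCHANNEL_URL = "http://localhost:8082/arsys764/BackChannel"


def get_request_line_size(params: dict) -> int:
    """Size in bytes of the GET request line once the data is encoded into the URL."""
    request_line = f"GET {BACKCHANNEL_URL}?{urlencode(params)} HTTP/1.1\r\n"
    return len(request_line.encode("ascii"))


def fits_in_header(params: dict, limit: int = ASSUMED_HEADER_LIMIT) -> bool:
    return get_request_line_size(params) <= limit


# Hypothetical qualifications: one short, one with 500 OR clauses.
short_qualification = {"param": "'Login Name' = \"Demo\""}
long_qualification = {
    "param": " OR ".join(f"'Field{i}' = \"value{i}\"" for i in range(500))
}

print(get_request_line_size(short_qualification), fits_in_header(short_qualification))
print(get_request_line_size(long_qualification), fits_in_header(long_qualification))
```

If the long one doesn’t fit, you can raise the limit on the server, which is the route I’d take before switching everything to POST.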
Are there any advantages to using POST requests instead of GET for backchannel requests? I’m not convinced, for me it all comes down to performance and stability. You need to nip this in the bud. I’m not saying that POST requests will necessarily result in delays and network errors, but it is a real possibility. If you ask for my opinion, you’ll GET it (pun intended).
But hey, that’s just my opinion. What do you think?
Right, that's all for now. Until next time,