Some errors are clearer than others and the same goes for errors thrown by Remedy. Session is invalid or has timed out is pretty clear, Required field not specified is self-explanatory. They’re no-brainers, you know what to do – or at least get a clue what they mean and you’re able to work towards a solution. But what about Unable to setup data connection, which is preventing the application from working correctly? What data connection is it talking about? What does it mean it’s not working correctly? How is it preventing the application from working? Other than the error the system seems to work fine, it’s not like the system is crashing or it’s stopping you from using the application. It just comes up once in a while. So what does it actually mean?
The full error message is ARERR 9351: Unable to setup data connection, which is preventing the application from working correctly (not to be confused with the completely different 9350 error, which we will discuss in a future blog post). I have to agree with you, it’s not the most descriptive error (or even grammatically correct). The first thing you’ll have to understand here is that we’re talking about the data connection between the browser and the web server. Notice that I call it web server, not Mid-Tier server. This is all from the browser’s perspective, and as far as the browser is aware it’s talking to the web server, not the JSP engine, not Mid-Tier, and certainly not the AR server.
This is what the data flow of a Remedy setup looks like. There’s an AR server which is fronted by a Mid-Tier server. Mid-Tier is a JSP application so it requires a JSP engine (think Tomcat). The JSP engine is able to execute all the JSP code and Java servlets but we still need a web server to communicate with the browsers via HTTP (IIS, Apache, Tomcat). What ARERR9351 is trying to tell you is that something went wrong with that last bit: when the browser and the web server communicated, something failed. So how are the two communicating and how do you know what’s failing?
A long time ago, back in the 1990s, if you wanted to get or send data to the web server you had to reload your entire page or request a new one. There was no way around this; you just had to get very creative with passing the content back to your page after it was loaded again. Interaction with the server was very limited; you wanted to avoid reloading pages the whole time, so the applications were pretty static. A Contact Us form was generally as fancy as it got.
Modern web applications get around this by sending HTTP requests in the background, without reloading the page. That’s exactly how Mid-Tier works: we send an application to the browser in the form of HTML, CSS, JS files, etc. Every time we need something else from the web server we send an HTTP request in the background. And we need a lot of things: we need data, we need instructions on what to do next, we need to know what workflow to execute, we submit data, we execute services, the list goes on and on. For all these things we send HTTP requests to the web server, and all these requests happen in the background, so you won’t see them happening. Nor should you: that’s the great thing about this technology, the secret to making Mid-Tier dynamic. The drawback is that if things go wrong, it’s pretty difficult to figure out what’s going on. It all happens in the background, remember? So how can we find this out? That’s where the ARERR9351 error comes in.
We call those requests backchannel requests, and what ARERR9351 is trying to tell you is that one of them failed. There’s a lot of HTTP traffic between the browser and the web server, but this error only happens when a backchannel request fails. If images fail to load, report servlets malfunction or forms fail to load, you’ll get different errors. This error is specific to backchannel requests.
Still confused? Let’s have a look at an example just to give you an idea what we’re dealing with. Say I have a simple form with a table field and I click on the Refresh button. The browser needs to request the data and sends a backchannel request to the Mid-Tier server. The data is then returned and the browser loads it in the table field.
When I clicked on the Refresh button this is the HTTP request that went out:
/arsys/BackChannel/?param=143/GetTableEntryList/1/09/localhost6/formT126/Default Administrator View9/5368709139/localhost6/frmtst0/1/04/10002/0/0/2/0/2/0/2/0/2/0/1/02/0/2/0/
That’s the backchannel request I mentioned, which the browser sends to the web server. It’s in a very specific format: we put BackChannel in the URL, as that’s the servlet we’re using, followed by the parameters. Depending on your HTTP method these appear either in the URL (GET) or in the body (POST). It doesn’t really matter what they mean, but you can get the gist of what it’s doing. Look for the server (localhost), the action (GetTableEntryList) and the form (formT1), just to give you an idea. If everything goes okay, this is what the web server sends back to the browser:
HTTP/1.1 200 OK
Last-Modified: Wed, 31 Jul 2013 10:34:09 GMT
Date: Wed, 31 Jul 2013 10:34:09 GMT
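If you want to pick backchannel requests out of a pile of captured URLs, a few lines of Python will do. This is only a rough sketch: the function name is made up, and it assumes the action name is always the second slash-separated token of the param value, which is simply what the sample URL above looks like. It is not based on any documented format.

```python
from urllib.parse import urlsplit, parse_qs


def backchannel_action(url):
    """Return the action name of a backchannel URL, or None if it is not one."""
    parts = urlsplit(url)
    # Backchannel requests go to the BackChannel servlet.
    if "/backchannel" not in parts.path.lower():
        return None
    values = parse_qs(parts.query).get("param")
    if not values:
        return None
    tokens = values[0].split("/")
    # Assumption: the second token is the action, as in the sample URL.
    return tokens[1] if len(tokens) > 1 else None
```

For the request shown earlier this returns GetTableEntryList. For anything that doesn’t go to the BackChannel servlet it returns None.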
But what happens when things don’t go as expected? Say I do the same call, but I have a proxy server between my browser and web server which, under specific circumstances, blocks my outgoing request. I don’t know this and neither does my application. All my HTTP calls go through okay, but then I click on Refresh. For whatever reason the proxy server picks up on it. Maybe my call is too long, maybe I waited too long. It doesn’t really matter, the point is that the proxy server suddenly wants to authenticate my backchannel request. Since my call doesn’t contain the correct authentication information it fails:
HTTP/1.1 407 Proxy Auth Required
Date: Fri, 25 Jan 2013 16:49:29 GMT
Proxy-Authenticate: Basic realm=""
Denied: HTTP/407 Proxy Auth Required.
This all happens in the background, so I don’t know this is happening. We need a way to inform the end user, and that’s where the ARERR9351 error comes in. For every backchannel response we receive, we do a quick sanity check on the browser side. We want to know if the response is what we’d expect. If it isn’t, we need to let the end user know. That’s what ARERR9351 really means: a request for data went out and whatever we got back from the server can’t be used.
Simple, right? It’s an error check, that’s all it is. If you want to resolve it you need to understand why it’s thrown. Easier said than done of course: there’s so much happening at the HTTP level, so how do you know what you need to look at? I always approach this as a network problem. Forget Mid-Tier for the moment; if you want to know why this happens you need to check the HTTP request and response. Every backchannel response is validated, and in order to know what’s good and what isn’t we’re using the HTTP status code. Every HTTP response contains an HTTP status code. It tells the browser what the server did with the HTTP request and what is supposed to happen next. There are basically four categories: codes starting with 2, 3, 4 and 5. You can ignore anything that starts with 3; those codes tell the browser further action is required, typically a redirection. If everything is okay we get HTTP200 (like our first example); status codes starting with 4 or 5 can potentially result in ARERR9351.
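To make the categories concrete, here is a minimal Python sketch of that triage. The function names are mine and this is not how Mid-Tier implements its check; it just mirrors the rule of thumb described above: 2 is fine, 3 can be ignored, 4 and 5 are the ones that can surface as ARERR9351.

```python
def describe_status(status_code):
    """Map an HTTP status code to the category given by its first digit."""
    categories = {
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return categories.get(status_code // 100, "other")


def could_raise_9351(status_code):
    """True if a backchannel response with this code could end up as ARERR9351."""
    # Only client errors (4xx) and server errors (5xx) are candidates.
    return 400 <= status_code <= 599
```

With the two examples so far, HTTP200 is a "success" and is not a candidate, while HTTP407 is a "client error" and is.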
The HTTP status code is the key to understanding the reasons why ARERR9351 errors are thrown. It’s the one piece of information you need to collect. Up to version 7.6.04SP3 you actually had to go through the HTTP logs, but from SP4 onwards it’s part of the error message. This is handy, as you can ask end users to send you screenshots. What you’re looking for is consistency; you need to understand if the errors are always the same or if they vary.
For example, say I have various users reporting the error. I ask for a few screenshots, and the same status code keeps coming up: HTTP414.
To me, this immediately rings a bell. The first 4 confirms it’s a client error, which means that the web server rejected the request because of something the client (browser) did. This is important: the communication stops at the web server and doesn’t go any further, so there’s no point looking at Mid-Tier. HTTP414 means 'Request-URI too long', so there’s a good chance that my request URI exceeds the maximum allowed.
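If you have a list of request URIs (copied from an HTTP debugger, for instance), checking their length takes only a few lines. A small sketch with made-up function names; the limit is deliberately a parameter, because the maximum depends on how your web server and any proxies in between are configured.

```python
def uri_length(uri):
    """Length of the request URI in bytes."""
    return len(uri.encode("utf-8"))


def too_long(uris, max_bytes):
    """Return (length, uri) pairs for every URI longer than max_bytes, longest first."""
    offenders = [(uri_length(u), u) for u in uris if uri_length(u) > max_bytes]
    return sorted(offenders, reverse=True)
```

Call it with whatever limit applies to your setup, for example too_long(uris, 8000), and see which backchannel requests come back.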
See how those status codes really help? I don’t need to do any HTTP debugging at this stage; I can figure these things out purely by looking at the HTTP status code. This doesn’t mean it’s always that easy. The status code is great, but it won’t give me a lot of detail. In the above example I still want to know how long my requests actually are. The way I do this is by using an HTTP debugger. My tool of choice is Fiddler, but there are plenty of alternatives. It can be a bit tricky to capture the error as you need to run the HTTP debugger next to your browser, but with a bit of patience and luck you should be able to get a usable log. Let’s have a look at the log file for the above example:
You might wonder how Mid-Tier fits into all this. Mid-Tier is a JSP application running on a JSP engine behind the web server. It’s only a concern if the error is server related. This is easy to check by looking at the HTTP status code again. So far we’ve seen client errors (starting with a 4), which stop at the web server. If the status code starts with a 5, it means that the error comes from the server, which in our case usually means it comes from the Mid-Tier side.
The most common server errors you’ll run into are HTTP500 and HTTP502. If this happens you need to check what happens to the requests when they arrive on the Mid-Tier server. Bear in mind, we’re only looking at backchannel requests, nothing else. The good thing is that Mid-Tier logs all the requests as they arrive on the servlet engine. When you see ARERR9351 errors and the status code is either HTTP500 or HTTP502, enable your Mid-Tier logs. The only categories you need are Performance and Servlet. Make sure to set the log level to Fine and the format to Detailed Text.
You need some form of HTTP debugging in this case; the HTTP status code alone won’t do. Ideally what you need are Fiddler logs plus the corresponding Mid-Tier logs so you can cross-reference the two. Mid-Tier logs the actual URL, so you can just copy the problematic URL from your Fiddler log and look for it in your Mid-Tier log. For example, say this URL results in an HTTP500 error:
GET .../arsys/BackChannel/?param=1082/GetTableEntryList/1/218/remedy-prod.kp.org19/SHR:OverviewConsole25/Overview Homepage Content9/30144420018/remedy-prod.kp.org25/SHR:ARDBC_OverviewConsole0/1/03...
I have my corresponding Mid-Tier logs and I do a normal text search for the URL. I keep the timestamp in mind to make sure I’ve got the correct one. Next I check what happens with the request. To do this I follow the call using the thread ID. That’s the only way I can be sure: all calls for one request have the same thread ID, that’s how we group calls. That’s good news for us: I can now just note the thread ID and look for all lines with the same ID until I see the line containing ‘doRequest Backchannel end ‘. You end up with a small stack containing the properties (mapProperties). If you’ve got the right call you will also see why the request returned HTTP500. In my example:
(Thread 211) mapProperties mSchema=SHR:ARDBC_OverviewConsole
(Thread 211) mapProperties mQualification=5\400068500\
(Thread 211) mapProperties mQualFieldTypes=4
(Thread 211) mapProperties mCCId=
(Thread 211) com.remedy.arsys.goat.GoatException <init> Throw ARException - ERROR (1588): Value specified for selection not one of defined values for this field; (1000). ERROR (1588): Value specified for selection not one of defined values for this field; (1000).
at com.bmc.arsys.qual.a.a.c.a(Unknown Source)
at com.bmc.arsys.qual.a.a.b.a(Unknown Source)
at com.bmc.arsys.qual.a.a.b.if(Unknown Source)
at com.bmc.arsys.qual.a.a.b.for(Unknown Source)
When Mid-Tier processed the request it encountered an error: the log shows a GoatException being thrown for ARException ERROR (1588), Value specified for selection not one of defined values for this field. The request therefore failed and the web server returned HTTP500. What you need to do next is check what the error means on the server side. It’s beyond the scope of this article to go into this, but you see where I’m going with this.
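Following a thread ID by hand gets tedious quickly, so here is a rough Python sketch of the same steps. It assumes every log entry carries a "(Thread N)" marker like the lines in my example, and that lines without a marker (the "at ..." stack frames) belong to the entry above them. The function name is made up, and the real log layout may differ depending on your log format settings.

```python
import re

# Matches the "(Thread 211)" marker seen in the sample log lines.
THREAD_RE = re.compile(r"\(Thread (\d+)\)")


def follow_request(log_lines, url_fragment, end_marker="doRequest Backchannel end"):
    """Collect the lines of the thread that logged url_fragment, up to end_marker."""
    wanted = None      # thread ID of the request we are following
    last_seen = None   # thread ID of the most recent line that had a marker
    collected = []
    for line in log_lines:
        match = THREAD_RE.search(line)
        if match:
            last_seen = match.group(1)
        if wanted is None:
            # Still looking for the line that contains the problematic URL.
            if match and url_fragment in line:
                wanted = last_seen
                collected.append(line)
            continue
        # Marker-less lines inherit the thread ID of the entry above them.
        if last_seen == wanted:
            collected.append(line)
            if end_marker in line:
                break
    return collected
```

Pass it the lines of the Mid-Tier log and a recognisable piece of the URL from Fiddler. What comes back is the small stack for that one request, including the exception if there was one.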
This sounds like a lot of work, and I agree with you. Therefore I put a tool together to do the Fiddler / Mid-Tier log analysis for you. It does the same thing: it identifies requests in Fiddler and outputs the corresponding Mid-Tier log stack. See the Fiddler/Mid-Tier link below for more information.
But what if you can't use Fiddler? What if you can't install an application on your client machine, or if the error is simply too intermittent? Your best bet is using access logs. Every web server keeps a log that records the URIs that come in and the outcome. Go through these logs and identify the backchannel requests that result in HTTP status codes starting with 4 or 5. If you have a screenshot and already know the status code, even better: just look for that code. Access logs only record the HTTP header, so they're not nearly as good as Fiddler. We’ll have a more detailed look into this some other time.
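As a starting point, this is roughly what such a scan could look like in Python. It assumes the access log uses the common log format, where the quoted request line ("GET /path HTTP/1.1") is followed by the status code. If your web server logs in a different pattern, the regular expression needs adjusting. The function name is made up.

```python
import re
from collections import Counter

# Common log format: ... "METHOD URI HTTP/x.y" STATUS BYTES
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<uri>.+?) HTTP/[\d.]+" (?P<status>\d{3})')


def failed_backchannel_statuses(log_lines):
    """Count 4xx/5xx status codes for backchannel requests in an access log."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a request line we recognise
        if "/backchannel" not in match.group("uri").lower():
            continue  # only backchannel requests matter here
        status = int(match.group("status"))
        if status >= 400:
            counts[status] += 1
    return counts
```

The counter also answers the consistency question: if nearly everything is the same status code, you know where to start looking.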
So that’s how you go about analysing ARERR9351 errors. To summarise, this is what I’d do:
- Collect screenshots and check the HTTP status codes. Check if they’re consistent.
- If you know the status code, check what it means (see the links below). If it starts with a 4 it’s not a problem on the Mid-Tier side. Use Fiddler (or another HTTP debugger) to get more information if needed.
- If it’s starting with a 5, get the corresponding Mid-Tier logs (performance and servlet). When the problem is reproduced, check what happens to the request when it arrives on the server. You can use my Fiddler extension for this.
The one thing you have to keep in mind is that Mid-Tier is trying to tell you it’s running into a network error and you’ll have to approach it as such. It’s usually a problem with a proxy server or a web server. How do you know? You’ll never guess, the clue is in the HTTP status code.
Right, that’s all for now. Until next time,
- Overview of HTTP Status Codes: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
- IIS’s Status Codes (including codes specific to IIS): http://support.microsoft.com/kb/943891
- Fiddler Tool: http://www.fiddertool.com
- Fiddler/Mid-Tier Link: https://communities.bmc.com/docs/DOC-24895
- Data collection using Fiddler: View Web Traffic
- Tomcat's Access Logs: http://tomcat.apache.org/tomcat-6.0-doc/config/valve.html#Access_Log_Valve
- Troubleshooting-focused summary: https://communities.bmc.com/docs/DOC-25847