We are experiencing an intermittent issue involving our Advanced File Server (AFS) and Advanced Repeaters (ARS). As best I can determine, the issue occurs when the environment is under heavy load. I say this because when we deploy one of our batch jobs against, say, 15 targets, some finish correctly and others fail with one of the two errors below:
If we run Execute against failed targets on the batch job, the failed targets usually complete correctly, but if the environment is still very busy they may fail again with similar errors. On 10/07/2015, we uninstalled and reinstalled the repeater software on our AFS as a troubleshooting step under ticket #ISS04497767. We had opened that ticket because we were seeing similar deployment issues, but with the Unified Agent Installer job.
With all this said, my real question is: what should I look for to determine whether the AFS is indeed being overwhelmed, or whether something else is causing these errors? I suspect that Marimba is the real culprit, because at one of our sites that uses an advanced repeater, the WAN link is fast enough that we can disable the repeater routing rule and run the deployment without the repeater in the mix, and when we do that we do not see these issues. But that is only a workaround for this one site; the WAN links to our other sites cannot possibly handle the traffic. Our AFS is a 2008 R2 VM with 4 CPUs and 8 GB RAM.
We are running 8.5 Patch 5, Advanced Repeater 8.3.0.
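One way I could test the load theory is to line up the failure times from the job logs with CPU samples collected on the AFS, and see whether the failures cluster around busy periods. The Python below is only a rough sketch of that idea. The function name `load_near_failures`, the `(timestamp, cpu_percent)` sample format, and the 60-second window are my own assumptions, not anything provided by the product, and the numbers in the example are made up for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean


def load_near_failures(samples, failures, window_s=60):
    """Compare average CPU near failure times with the overall average.

    samples  -- list of (datetime, cpu_percent) tuples (hypothetical format)
    failures -- list of datetime values taken from the job logs
    window_s -- a sample counts as "near" a failure if it falls within
                this many seconds of it (assumed default: 60)
    """
    window = timedelta(seconds=window_s)
    # Keep only the samples that fall within the window of any failure.
    near = [cpu for ts, cpu in samples
            if any(abs(ts - f) <= window for f in failures)]
    overall = [cpu for _, cpu in samples]
    return {
        "overall_avg": mean(overall) if overall else None,
        "near_failure_avg": mean(near) if near else None,
        "samples_near_failures": len(near),
    }


if __name__ == "__main__":
    # Made-up data: one CPU sample per minute, one failure at 10:02:30.
    start = datetime(2015, 10, 7, 10, 0, 0)
    cpu = [20.0, 25.0, 90.0, 95.0, 30.0]
    samples = [(start + timedelta(minutes=i), v) for i, v in enumerate(cpu)]
    failures = [start + timedelta(minutes=2, seconds=30)]
    print(load_near_failures(samples, failures))
```

If the average near the failures comes out well above the overall average, that would at least support the idea that the AFS is overloaded. If the two are about the same, the cause is probably somewhere else. The same comparison could be repeated for memory, disk queue length, or network throughput.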
If there are any Marimba gurus out there, please chime in. We are desperate for a solution!