Can you clarify whether it's the analysis phase or the deploy phase that is taking a long time?
Where are these systems in relation to the appserver? Across a WAN? On the same network?
It is the analysis phase that is taking a long time.
The systems are split across different datacenters. We are using an advanced repeater for each of the DCs, but I think the repeater will not be used for the PAJ.
One thing I noticed is that when the PAJ runs from one of the appservers (03 in our case), it completes within 2 minutes as expected. When the same PAJ runs from either of the other two appservers (01 or 02), it takes much longer. Below are screenshots of a PAJ taking longer:
Can you please let me know which configurations I can check in our environment that might cause this kind of delay?
I suggest you monitor the processes and CPU on one of the affected servers while the job is running, and check whether BLPatchCheck2.exe is running at high CPU usage for too long.
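If watching Task Manager by hand is not practical, a small script can sample the process for you. This is only a rough sketch: it assumes Python is available on the target, that the Windows `tasklist` command prints English column headers (`PID`, `CPU Time`), and the function names `parse_cpu_seconds`, `sample` and `watch` are made up for this example.

```python
import csv
import io
import subprocess
import time


def parse_cpu_seconds(tasklist_csv):
    """Map PID -> accumulated CPU seconds from `tasklist /V /FO CSV` output."""
    result = {}
    for row in csv.DictReader(io.StringIO(tasklist_csv)):
        cpu_time = row.get("CPU Time")
        pid = row.get("PID")
        if not cpu_time or not pid:
            continue  # e.g. the "no tasks match" message, or localized headers
        hours, minutes, seconds = (int(part) for part in cpu_time.split(":"))
        result[int(pid)] = hours * 3600 + minutes * 60 + seconds
    return result


def sample(image_name="BLPatchCheck2.exe"):
    """Run tasklist once (Windows only) and return PID -> CPU seconds."""
    output = subprocess.run(
        ["tasklist", "/V", "/FO", "CSV", "/FI", f"IMAGENAME eq {image_name}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_cpu_seconds(output)


def watch(image_name="BLPatchCheck2.exe", interval=10, samples=30):
    """Print how much CPU time each matching process used between samples."""
    previous = {}
    for _ in range(samples):
        current = sample(image_name)
        for pid, total in current.items():
            # The first time a PID is seen, the delta is reported as 0.
            used = total - previous.get(pid, total)
            print(f"pid {pid}: +{used}s CPU in the last {interval}s")
        previous = current
        time.sleep(interval)
```

Calling `watch()` on an affected target while the analysis is running should show whether the process keeps burning CPU between samples; a delta close to the interval means it is pinning roughly one core.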
Does it even get past the copying of the xml according to the logs, or do the logs show nothing at all for the targets that are still running?
Thanks for the reply. Sure, I will check the CPU usage when I run the analysis and let you know what I find.
Yes, the copying of the xml does complete on all the targets, but only after a long time (approx. 20-22 minutes). In the screenshots you can see targets that do not show anything in the logs; those are the servers that are taking longer.
But it is a little strange that the analysis completes in the expected time on all targets when the PAJ runs from one of the appservers (03), and does not when it is run from either of the other two appservers.
What do you mean by 'runs from either of the other two appservers'? Is the job picked up by those appservers, or are the WITs from the job running on those appservers? In the 2-minute case, are all the WITs for the job running on the first appserver?
Are all 3 of these appservers in the same location? Is the network path from the appserver to the target the same for all 3 appservers?
Yes, I meant the job being picked up by those appservers.
All 3 appservers are in the same location, and the network path from these appservers to the target servers is the same as well.
And for the 2-minute case, the WITs for the job were running on all 3 appservers.
I suggest you monitor exactly what is happening in terms of processes (use Task Manager with the Command Line column added; add it from the View menu) when each line is shown in the PAJ logs. Sometimes a line is printed to indicate that a certain step has started, and that step then completes and another task begins without anything else being echoed to the logs, so it can be confusing.
For example, the log says it starts copying the xml, and then that it starts the analysis, but in between it had to decompress and decrypt the xml. The issue will look like it's the copy, but the file probably got there within seconds, and it's the decryption that's taking forever.
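To see where the time actually goes, it can help to measure the gap between consecutive log lines rather than read them by eye. Here is a minimal sketch of that idea. It assumes you have exported the job log to text and that each line starts with a `YYYY-MM-DD HH:MM:SS` timestamp; that format, the `find_gaps` name and the sample messages are all hypothetical, so adjust them to whatever your export really looks like.

```python
from datetime import datetime

# Assumed timestamp layout at the start of each log line; change as needed.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19  # len("2024-01-01 10:00:00")


def find_gaps(lines, min_gap_seconds=60):
    """Return (gap_seconds, earlier_line, later_line) for each pair of
    consecutive timestamped lines at least min_gap_seconds apart."""
    gaps = []
    previous = None  # (timestamp, text) of the last timestamped line
    for line in lines:
        text = line.rstrip()
        try:
            stamp = datetime.strptime(text[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if gap >= min_gap_seconds:
                gaps.append((gap, previous[1], text))
        previous = (stamp, text)
    return gaps


# Made-up sample lines, only to show the shape of the result.
sample_log = [
    "2024-01-01 10:00:00 copying xml",
    "2024-01-01 10:21:30 starting analysis",
    "2024-01-01 10:21:45 analysis finished",
]
for gap, before, after in find_gaps(sample_log):
    print(f"{gap:.0f}s between '{before}' and '{after}'")
```

The line printed before a large gap is only the last thing that was logged, not necessarily the step that was slow, so pair this with the process list on the target at that moment.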
I know that Shavlik's engine relies on msxml to decrypt the HUGE xml files so that hfcli can read them. That takes a lot of CPU and, on some servers, has been known to cause problems (and take forever to run).
Did you, by any chance, change the PRIORITY setting in the Windows options of the Patch Global Configuration panel?