Since I raised the question here, I will also post the resolution, as it might be useful for someone else.
After the first 5 failed deployments, and before I created the question in the community, I also opened a ticket with BMC support and provided all the available log data. Unfortunately, there has been no useful feedback so far.
The error messages about the missing forms are actually misleading, because the forms are only created once the migrations can run successfully, which happens after the tenant creation (which also includes the schema creation).
In my case the problem was related to Jetty. During a start or restart, the start script tries to log in and run various actions via REST.
This login call never succeeded, except when I managed to log in manually via the login script at just the right moment, while the system was starting and before the start script tried to log in.
Otherwise the login call was sent and kept open, but no response ever came back. That is why the installer failed in the last step and why all start/stop attempts failed as well.
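If you want to check whether you are hitting the same hang, a small probe with a hard timeout can tell "no response at all" apart from a normal error. This is only a rough sketch in Python, not what the start script actually does: the function name `probe_login`, the URL, the payload format and the 10 second timeout are all my own placeholders and need to be adapted to your system.

```python
import socket
import urllib.error
import urllib.request


def probe_login(url: str, payload: bytes, timeout_s: float = 10.0) -> str:
    """Send one POST to `url` and return 'ok', 'hung' or 'error: ...'.

    Hypothetical helper: the URL and payload are placeholders, not the
    real login endpoint used by the product's start script.
    """
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            if 200 <= response.status < 300:
                return "ok"
            return f"error: HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return f"error: HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # A timeout while connecting is wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "hung"
        return f"error: {exc.reason}"
    except (socket.timeout, TimeoutError):
        # A timeout while waiting for the response is raised directly.
        return "hung"
```

Calling something like `probe_login("http://<your-host>:<port>/<login-path>", b"username=...&password=...")` right after a start would return "hung" in the situation described above, where the connection is accepted but never answered.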
As this was not an issue in our dev and staging environments, we scaled the production nodes down to only 6 cores. After that the installation ran fine, with no problems during any of the restarts.
Checking the catalog server sizing requirements again, it seems a mistake was made during server sizing in our environment anyway, as our nodes had way too many cores.
Before we scaled down all the nodes, we also tested the deployment on a node with 16 cores again, but this time, after the arsystem installation, we increased the min threads in jetty.xml to 20.
Surprise, surprise: the deployment finished successfully within ~20 minutes.
I have not investigated this further, nor tried to find out whether this is specific to this embedded Jetty version (9.4.17.v20190418).
I will also share this info with BMC support for my case. Maybe it makes sense to add two lines to the installer script that set the min threads based on the detected cores.
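To illustrate what I mean by setting the min threads based on the detected cores, here is a minimal sketch in Python. The only real data point behind it is that 20 min threads worked on our 16 core node. The function name `suggested_min_threads`, the floor of 10 and the headroom of 4 are my own assumptions, not a BMC or Jetty rule, and the result would still have to be written into jetty.xml by whatever means fits your installation.

```python
import os


def suggested_min_threads(cores: int, floor: int = 10, headroom: int = 4) -> int:
    """Return a min thread count that grows with the number of cores.

    Hypothetical rule of thumb: `floor` and `headroom` are assumptions,
    chosen only so that 16 cores gives the 20 threads that worked for us.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max(floor, cores + headroom)


# os.cpu_count() can return None, so fall back to 1 in that case.
detected_cores = os.cpu_count() or 1
print(f"cores={detected_cores} -> minThreads={suggested_min_threads(detected_cores)}")
```

With these assumed defaults a 16 core node gets 20 and a 6 core node stays at the floor of 10. Whether that floor matches the default shipped in the embedded jetty.xml is something I have not checked.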