
The BMC Remedy Single Sign On Service Provider (SP) certificate shipped with the product, which is used to sign SAML requests, expires on April 21st, 2016.

 

If you are using the out-of-the-box certificate to sign SAML requests in BMC Remedy Single Sign On, requests will fail once the certificate expires.

 

In this blog, I will cover the steps to update the BMC Remedy Single Sign On (RSSO) SP certificate so that it has a new expiry date, which prevents SAML authentication from failing.

 

If this certificate has already been replaced with a newer one with a valid future expiry date, you don't have to follow the steps mentioned in this blog.

 

First of all, how do you find the certificate expiry date of the relying party (RSSO) used for SAML authentication?

 

  • An easy way to find the certificate expiry is to log in to the ADFS management tool and check the properties of the RSSO service provider relying party.
  • In the Signature tab, you should see the certificate expiry date.

 

For other IdP tools that you are using with RSSO, you will have to contact your IdP administrator to check the RSSO relying party certificate expiry date.
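You can also check the expiry of the SP signing certificate locally on the RSSO server with keytool. This is a quick check, assuming the default cot.jks keystore and 'test2' alias described below, run from the <tomcat>\rsso\WEB-INF\classes folder; the 'Valid from ... until ...' line in the output shows the expiry date:

keytool -list -v -keystore cot.jks -alias test2 -storepass changeit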

 

What steps are necessary to update BMC Remedy Single Sign On (RSSO) SP Certificate?

 

Important Notes:

 

(A) The instructions below are written for Windows, and all paths mentioned are Windows paths. Please use the corresponding paths if you're using Linux or Solaris.

 

(B) The file name for the Java keystore should be cot.jks, and the alias in that keystore should be test2. The password for the cot.jks keystore is 'changeit'. Please do not change the password.

 

(C) Please make sure the PATH environment variable includes the JDK or JRE bin folder, or else keytool will fail with an error such as "'keytool' is not recognized as an internal or external command". In Windows this means editing the System Environment properties and updating the global PATH variable.
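For example, for the current command prompt session you could append the Java bin folder to PATH as shown below; the Java path here is only an assumption, so substitute your actual JDK/JRE location:

set PATH=%PATH%;C:\Program Files\Java\jre8\bin
keytool -help

If keytool prints its usage text, the PATH is set correctly.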

 

1.png

 

Steps to update the certificate:

 

1. Update the Java keystore named cot.jks

 

Perform the following steps on the machine where the RSSO server is installed, from the <tomcat>\rsso\WEB-INF\classes folder:

 

a. Take a backup of the existing cot.jks in the <tomcat>\rsso\WEB-INF\classes folder

 

b. Delete the alias ‘test2’ from the existing cot.jks using the keytool command line:

 

keytool -delete -alias test2 -keystore cot.jks

 

Note:  The password for the cot.jks is "changeit".  Please don't change the password

 

c. Create a new keypair with alias ‘test2’ in existing cot.jks

 

keytool -keystore cot.jks -genkey -alias test2 -keyalg RSA -sigalg SHA256withRSA -keysize 2048 -validity 730

 

Note: In the above example, we used a validity of 730 days, which is equivalent to two years. You can choose the validity period at your discretion.

 

d. Export ‘test2’ certificate in PEM format

 

keytool -export -keystore cot.jks -alias test2 -file test2.pem -rfc

 

e. Take a backup of the updated cot.jks

 

If you have other RSSO server instances in the same cluster, replace cot.jks in the <tomcat>\rsso\WEB-INF\classes folder on each of them with the cot.jks updated in step 1.e

 

2. Update signing certificate in RSSO Admin console

 

a. Log in to the RSSO Admin console

 

b. Go to ‘General->Advanced’ tab

 

c. Open the file test2.pem created in step 1.d in a text editor, remove the first line:

 

(-----BEGIN CERTIFICATE-----)

 

and the last line:

 

(-----END CERTIFICATE-----)

 

Also remove the newline delimiters (\r\n), and then copy the contents.

E.g. If you use Notepad++, you can open ‘replace’ dialog, select ‘Extended’ search mode, find ‘\r\n’ and click ‘Replace All’ button.
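If you prefer the command line over Notepad++, a PowerShell one-liner can do the same thing. This is only a sketch, assuming test2.pem is in the current folder:

(Get-Content test2.pem | Where-Object { $_ -notmatch "CERTIFICATE" }) -join "" | Set-Content test2_oneline.txt

It drops the BEGIN/END CERTIFICATE lines, joins the remaining lines without newlines, and writes the result to test2_oneline.txt, ready to be copied.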

 

 

2.png

 

d. Paste the content copied in step 2.c into the ‘Signing Certificate’ field, replacing the existing content in the text area

 

3.png

 

e. Click ‘Save’ button to save the change

 

f. Wait about 15 seconds, open the realm that uses SAML, and click the ‘View Metadata’ button on the ‘Authentication’ tab. Verify that the SP metadata is updated with the new signing certificate.

 

3. Update SP metadata at IdP side

 

- Export the SP metadata in step 2.f and save it in a local file

 

- Send the exported SP metadata and the new signing certificate from step 1.d to the IdP team for updating.

 

If the IdP is ADFS, the customer can add the new signing certificate as below:

 

a. Open ‘Properties’ dialog of the relying party for RSSO
b. Go to ‘Signature’ tab
c. Click ‘Add’ button, select the new signing certificate file and click ‘OK’

 

4.png

 

 

Notes for rolling upgrades (Cluster / High Availability environment)

 

If you require zero downtime for the signing certificate update in a cluster environment (assuming ADFS is the IdP), take the following actions in this sequence:

 

1. Take one RSSO server instance down first, and perform step 1 on it
2. Perform step 2
3. Perform step 3 (remember NOT to delete the old signing certificate)
4. Bring that RSSO server instance back up
5. Take the second RSSO server instance down, update its cot.jks with the one already updated on the first RSSO server instance in step 1.e, then bring it back up
6. Repeat step 5 on all other RSSO server instances
7. After the keystore cot.jks is updated on all RSSO server instances, you can remove the old signing certificate from the RSSO relying party on the ADFS side.


This blog post is really just to gauge interest and gather feedback on something I've spent a lot of the last year working on - which is sanitizing the ar.conf parameters that are published in the wiki docs here:

 

A-B - https://docs.bmc.com/docs/display/ars81/ar.cfg+or+ar.conf+options+A-B

C-D - https://docs.bmc.com/docs/display/ars81/ar.cfg+or+ar.conf+options+C-D

E-M - https://docs.bmc.com/docs/display/ars81/ar.cfg+or+ar.conf+options+E-M

N-R - https://docs.bmc.com/docs/display/ars81/ar.cfg+or+ar.conf+options+N-R

S-Z - https://docs.bmc.com/docs/display/ars81/ar.cfg+or+ar.conf+options+S-Z

 

In the comments section of each page I see questions being asked, so I wanted to ensure that the parameter information we provide is accurate and relevant.

 

The currently published list could be improved by:

  • Updating the parameters to include any unpublished ones and exclude any obsolete ones
  • Mapping each parameter to where in the application you can find it
  • Providing a link to where in the wiki docs you can read more about the parameter to understand its context
  • Clearly identifying the default value

 

As a sample I've provided the A-B parameters that I've sanitized so far.

Unfortunately there's simply too much information to put it into a table inline to this blog post, so the only other alternative was an Excel file that is attached.

 

Let me know any thoughts/questions/suggestions etc. Bear in mind that this is still very much a work in progress so there are some gaps and parts that need to be modified further.

 

Some highlights of what's been changed so far are:

 

  1. All superscripts have been removed, and each parameter now either explains where you can find the option in the UI or has a black cell indicating it doesn't map anywhere. Many of the parameters that had superscripts (which denoted you cannot set or view them using the Server Information form) were incorrect, 44 on this page alone.
  2. URLs have been added pointing to where you can read more about each parameter. Where a URL is missing, it's either because the correct page hasn't yet been located or because one doesn't exist. For the latter, this becomes an action point for our documentation team.
  3. The default value for each parameter has been added in a separate column, to make it easy to identify what it should be without reading through the whole parameter description.
  4. The parameters for the Alert tool have been added back to the list. Although support for the Alert tool has now ended, Alerts are still used in Remedy for BMC Atrium Orchestrator.
  5. Parameter names have been corrected where they were wrong (for example: AE-Worker-Threads from AE-Worker-Thread and ARDBC-LDAP-Base-Dn from ARDBC-LDAP-Base-DN)
  6. The Atrium SSO parameters have been added
  7. Each parameter has now been categorized by component, for easy reference when you want to identify which component(s) use a particular parameter. This will be a filter option on the table once it is published to the documentation pages.

 

My goal here is to make them easier to consume for you, the customer/admin user.

 

I'm planning on trying to reduce the size of some of the parameter descriptions also, to make them more concise.

The corresponding page listed under the URL column should contain all the finer details of the parameter. This is an action point for later, though.

I'd also like to have an icon or similar to identify new parameters that were added to the version the page is published in (for example: 'API-SQL-Stats-Control' in 8.1.01)

 

I look forward to the feedback.

 

David


start now.jpg

 

The BMC training schedule is posted for January and February. We want you to be successful with BMC solutions. We run classes year round and worldwide across the BMC product lines. Below are the classes, listed by class/product name.


Review the list below and register today. Please check BMC Academy for the latest availability; BMC reserves the right to cancel or change the schedule. View our cancellation policy and FAQs. As always, check back in BMC Academy for the most up-to-date schedule.


To see all our courses by product/solution, view our training paths.


Also, BMC offers accreditations and certifications across all product lines; learn more.

 

For questions, contact us:

Americas - education@bmc.com

EMEA - emea_education@bmc.com

Asia Pacific - ap_education@bmc.com

 

Class
Date/Location
BMC Remedy AR System 8.0: Developer - Part 2

18 January / Americas / Online

15 February / EMEA / Online

BMC Certified Developer: BMC Remedy AR System 8.x

4 January / Asia Pacific / Bangalore, IN

25 January / Americas / McLean, VA

15 February / Asia Pacific / Singapore, SG

22 February / EMEA / Winnersh, UK

BMC Remedy AR System 8.0: Foundation - Part 2

11 January / Americas / Online

11 January / EMEA / Winnersh, UK

18 January / Asia Pacific / Online

8 February / EMEA / Online

8 February / EMEA / Paris, FR

22 February / Americas / Online

BMC Remedy AR System 9.0: Administering

4 January / Asia Pacific / Online

18 January / Americas / Online

18 January / EMEA / Paris, FR

25 January / EMEA / Online

1 February / EMEA / Dortmund, DE

8 February / Americas / Online

8 February / Asia Pacific / Online

8 February / EMEA / Winnersh, UK

29 February / Americas / Online

ITIL Foundation and Exam

4 January / Americas / Online

1 February / Asia Pacific / Online

29 February / Americas / Online


If you have ever put together a web site you're probably aware of how tricky it can be to get the design right. Back in the day you were pretty limited as to what you could do: a few tables, some images and a bit of text. HTML mark-up certainly had its limitations, and if you were too ambitious you were on your own.

 

But things were looking up with the introduction of Cascading Style Sheets. Suddenly it was possible to separate the content from the styling and with every new CSS version (and subsequent browser version) the design possibilities increased. But here’s the problem: the more complex it gets, the harder it is to figure out when things go wrong. Because that’s the thing with these sorts of problems: you’ve got to be able to work out how it’s put together.

 

That was actually a real headache. If something didn't work the way you'd expect, you just had to try again until it did. Other than checking the HTML source code there really wasn't much else you could do. That all changed when browsers started including Web Development tools, which finally allowed you to diagnose and (more importantly) fix the layout. The part that helps us the most is called the DOM Explorer. The Document Object Model represents the page in the form of an object, and the DOM Explorer displays this object as a tree structure. All the DIVs, tables, images, etc., are represented as nodes within the tree, so you can navigate the whole page in an organised fashion.

 

Mid-Tier pages are no exception; if you are familiar with the DOM it's easy enough to see how they are constructed. To get a better idea of how this is done, let's have a look at an actual problem with Mid-Tier and see how we can use the DOM Explorer to make sense of the page's construction. Here's my problem: in ITSM I added a few custom fields to one of the existing forms. It all looks fine in Developer Studio and initially it seems to look fine in the browser. But then I noticed the fields are somehow displayed as read-only; no matter what I try, I'm not able to get any content in there.

 

pica3.png

 

Here's Firefox:

 

blogx1.png

 

The fields are configured correctly in Developer Studio, so let's check what's happening with the layout in the browser. I'm going to use Firefox for this example, but the functionality works similarly in other browsers like Chrome. You need to be familiar with how ARS objects are displayed. Every field added via Developer Studio is part of the DOM tree; this is what it looks like:

 

pica1.png

 

So there’s a DIV which contains a few ARS fields. There's the FieldID in this format: WIN_0_101, you can see the type (char), the full name and the help text. The bits in red are HTML specific and mainly used for styling. Under the DIV we have the label and the actual field, a text field in this case. Here’s what this looks like via the DOM Explorer:

 

pica2.png

 

On the right we can see the various properties of the objects. There's a lot of styling going on here, mainly colours, look and positioning. You have to be relatively familiar with CSS formatting to understand what's going on, but the good thing is that you can change values here to see what sort of effect they have on the actual layout. This is all client-side, so you can't do any permanent damage.

 

Keeping the DOM structure in mind, let’s have a look at one of our problematic fields. I’m going to use the Field ID (which I looked up in Developer Studio) to identify the field in the DOM tree. Here it is:

 

pica4.png

 

What I’m looking for is anything that would explain why the field is read-only. To do this I’m going to compare this field to one of the out-of-the-box ITSM fields. I’m going to put them next to each other and see what the obvious differences are.

 

blogx2.png

 

I hope you've spotted the z-index property, which is xxx in the out-of-the-box field and 0 in the custom field. z-index is a CSS property that determines the stacking order of an element. It's of course entirely possible to stack element onto element in your layout, so you need a way to control which element goes on top. That's what the z-index property does. And guess what? The higher the value, the higher in the stack. So if it's 0 it's pretty much at the very bottom.

 

Here's the good thing about using the DOM Explorer: you can change things on the fly. So what will happen when I change the z-index of my custom field to 1000? Well, let's give it a go:

 

blogx3.png

 

Behold, I can now type content in the field! What changed? The order, of course: my custom field was stuck under another (transparent) panel. I could see it but not use it.

 

It’s not possible to explicitly set the order in Developer Studio. Mid-Tier’s interpretation of the order is based on the bring-to-front and bring-to-back functionality. It’s not always that obvious unfortunately and sometimes the layout can differ somewhat. The solution here? Bring it to the front a few times and keep checking with your browser.

 

I hope this gave you some idea of how to use the DOM Explorer to look into layout problems. Mid-Tier's job is to interpret the forms and translate them into HTML and CSS. If there's anything that isn't working the way you'd expect, I'd encourage you to use the Web Development Tools. After you try it a few times you'll notice it becomes easy enough to diagnose these sorts of layout issues. So the next time you notice an image which doesn't look right, a field which isn't in the right position, or text which is cut off, press F12 on your keyboard and start debugging.

 

Regards,

Justin

 

Don't forget to follow me on twitter!


BMC Data Migrator Tool

Posted by Young So Oct 13, 2015

Introduction

My journey with the Migrator tool was an interesting one. Trying to learn about the tool through the documentation only got me so far, and once I started using it I ran into problems that are sometimes, but not always, covered on BMC's sites. The goal of this blog is to save newcomers some time and give them a jump start on actually using the tool right off the bat, rather than troubleshooting the tool itself.

 

I will divide things into sections so that you don't have to read the whole blog to get started with Migrator. I do recommend reading the tuning section so that you don't run into issues or errors with the tool.

 

Understanding Migrator

There are two tools within BMC Migrator: the Migrator itself and the Delta Data Migration (DDM) tool. These tools use the Jet engine to create copies of the databases being migrated to and from, in most cases under %temp%. I found that my log files were stored in a different location: on a Windows system, the Migrator tool logs were under %userfolder%\AppData\Roaming\AR System, while the Delta Data Migration logs were stored in the DDM tool's working folder. These locations can be changed via the registry on Windows systems.

 

-- This is where you edit most of the Migration tool settings

HKEY_CURRENT_USER\Software\Remedy

     

 

The software has a few issues that are well documented, but the information is scattered across knowledge articles and the communities. I recommend reading the tuning section before starting your migration.

 

Enable Logging

If you run into issues with Migrator, in most cases you'll need detailed logging. To get useful logging out of the Migrator tool, start by understanding where the logs are kept: C:\Users\%username%\AppData\Roaming\AR System\Remedy Migrator\backup\%servername%

 

Migrator got its beginnings from the AR platform, so you can use API log settings that work on the AR server. To enable API logging you have to set the environment variable ARAPILOGGING=%loglevel%. Log level values range from 1 through 88 as far as I know; log level 88 isn't documented by BMC from what I could tell.

 

There are different ways to enable logging with the Migrator tool, including server-style logging. By changing the registry values below you enable different types of logging. I haven't found documentation on these values, so you'll have to experiment.

[HKEY_CURRENT_USER\Software\Remedy\Remedy Migrator]

"LogLevel"=dword:00000006

"LogLayout"=dword:00000002

    

 

Understanding API Logging variable

 

Here are the documented log levels:  https://docs.bmc.com/docs/display/public/ars81/C+API+Client-side+ARAPILOGGING

 

ARAPILOGGING = 88 (shows you the timestamp of each transaction)
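As an example, on Windows you could set the variable from a command prompt and then launch Migrator from that same prompt. This is just a sketch; pick the log level you need:

rem current command prompt session only
set ARAPILOGGING=88

rem persist the variable for new sessions
setx ARAPILOGGING 88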

 

 

Migrator Tuning

There are knowledge articles and community discussions on tuning the Migrator tool; below are the settings I found useful.

 

I am a firm believer in having two partitions on servers: one for the OS and a second for the application. You also need to modify the Configuration.xml setting delta work-dir to point to the correct install folder. Migration involves a lot of reading and writing to disk; for example, when it caches the source server objects it has to read that information from the network and write it to disk.

 

Windows Registry Editor Version 5.00

 

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\inetinfo]

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\inetinfo\Parameters]

"ObjectCacheTTL"=dword:00000030

"MemCacheSize"=dword:00002048

"MaxCacheFileSize"=dword:00000256

"PoolThreadLimit"=dword:00000010

"ListenBackLog"=dword:00000250

"SynAttackProtect"=dword:00000000

"TcpTimedWaitDelay"=dword:0000001e

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server]

"DeleteTempDirsOnExit"=dword:00000000

"PerSessionTempDir"=dword:00000000

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Jet 4.0]

"MaxLocksPerFile"=dword:ffffffff

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Jet 3.x]

"MaxLocksPerFile"=dword:ffffffff

          

 

Thread Error 2004

 

1.  Enable logging

2.  Provide the logs to Support and get the issue escalated

 

In my experience with this error, we found that the "overlay schema migrator was not handling multiple field references correctly while parsing object property" and I had to apply a hotfix. In addition, I had to install the latest version of Migrator.

 

How to migrate ITSM with Migrator during a version upgrade (this also addresses non-Unicode to Unicode migration)

 

1. Build the AR Server

2. Use migrator to move data only on "AR System Licenses" form

3. Install CMDB

4. Atrium Integrator Server

5. Install ITSM

6. Install SRM

7. Process Designer

8. Install SLM

9. Apply available patches

10. Upgrade the mid-tier (optional; only required during a version upgrade)

11. Extended CDM for custom attributes

12. Disable DSO

13. Disable Escalations

15. Disable Database Triggers (Ask the DBA to perform for ARSystem database)

16. Fix known DDM issue on production/source database with DDM fixes

17. Tune database for migration

18.  Turn off the following Recon Jobs:

        BMC Asset Management - Sandbox

        BMC Asset Management - Sandbox Bulk

        BMC Asset Management CI Data Load

19. Disable Normalization Jobs, if any, on the destination server

20. Do a difference report on forms that were changed since %date%

21. Note the date that DDM was run: 9/29/15 (DDM needs to be run more than 3 times)

22. Run migratorFindCustomForms.bat

23. Run Post Scripts for DDM


With the release of 9.0.01 the BMC Remedy AR System product now has a new component, BMC Remedy Single Sign-On (Remedy SSO). This component solves the problem of single sign-on for those who primarily use the BMC Remedy applications with either AR System based authentication or the SAML based authentication. This blog focuses on introducing the features of BMC Remedy Single Sign-on. 

Features of Remedy Single Sign-On:

  • Lightweight
  • Easy to deploy
  • Simple architecture
  • Modern UI
  • Multi-tenant support by default
  • Easy high availability configuration

Remedy Single Sign-On supports easy integration with BMC Remedy applications such as BMC Remedy IT Service Management (BMC Remedy ITSM), BMC MyIT, and BMC Analytics. Most of the integration steps are handled internally by the component; as a result, these applications can be integrated in a few quick steps.

Remedy Single Sign-On is easy to configure in a high availability environment because there is no distinction between a primary node and a secondary node; all nodes connect to a single database instance to store the data.

You can use Remedy Single Sign-On whether you use AR System based authentication or SAML based authentication. Configuring either of these is quick and easy. For AR System based authentication you only need to provide the AR System hostname and port number. To configure SAML v2.0 authentication properties, you need a Service Provider and an Identity Provider, and you set up the connectivity between the SP and the IdP, certificates, attributes, and so on. In a few quick steps you can configure either of these authentication protocols and start using Remedy Single Sign-On to solve the single sign-on problem in your organization.

Where to get started?

 

 

Watch the video below highlighting the features of Remedy Single Sign-On for Remedy AR System.

 


AR System 8.1 achieved Common Criteria Certification in May 2015.

 

Refer to the following Common Criteria Certification artifacts:

  1. Security Target document – https://www.commoncriteriaportal.org/files/epfiles/bmc-remedy-v81-sec-eng.pdf
  2. Certification document  – https://www.commoncriteriaportal.org/files/epfiles/bmc-remedy-v81-cert-eng.pdf

 

--- Abhijit Rajwade


Here are some common questions and answers regarding Remedy 9 Full Text Search.


 


1. What's new in Remedy 9 as far as the Full Text Search feature is concerned?

• Implemented an FTS HA model (similar to v7.6.04 SP5 and v8.1 SP2)

• Apache Lucene version upgraded from 2.9 to 4.9

• Apache Tika version upgraded from 1.2 to 1.6

• Broke the monolithic FTS index file down into schema-specific files (a.k.a. schema-based indexes) using Lucene 4.9

• Developed the FTS Index Migration Utility, which migrates old monolithic indexes into schema-based indexes and also converts them to the Lucene 4.9 format

• Incorporated the FTS Index Migration Utility into the AR installer

2. What are schema-specific files (a.k.a. schema-based indexes)?


  • Until the Remedy 9 release, all FTS indexes were stored in one Lucene-specific file (a monolithic index file).

      E.g. all Incident, Problem, Change, and Knowledge Article related indexes were stored in this one single index file.



  • From the 9.0 release we have broken that file down, and we create a separate folder inside the <AR INSTALL DIR>/ftsconfiguration/collection folder for each AR form/schema stated above.


  • Example:

      If the HPD:HelpDesk schema has schemaId = 558, then all Incident-related indexes are stored in one folder, e.g. 558:

      <ARINSTALLDIR>/ftsconfiguration/collection/558


  • All Change Management related data is stored in its own folder, e.g. 561,

        and similarly for the rest of the AR forms.


Earlier Monolithic Indexes:

Monolithic.png


Schema Based Index:


SchemaBasedIndex.png



3. Why is the FTS Index Migration Utility needed?

 

  • Until Remedy 9, all FTS indexes were stored in monolithic files, and Apache Lucene (the underlying search engine) was at version 2.9.
  • In 9.0 we have upgraded the Lucene version to 4.9 and broken the monolithic indexes down into schema-based indexes.
  • The utility was needed to cater to the two requirements above:
    • Convert monolithic indexes into schema-based indexes
    • Upgrade older indexes to the newer format (i.e. from Lucene 2.9 to 4.9)

4. Where does the FTS Index Migration Utility get deployed?

 

  • Windows:
    • <AR INSTALL DIR>\arftsutil.bat

 

  • Linux:
    • <AR INSTALL DIR>/arftsutil.sh

5. Is the FTS Index Migration Utility kicked off automatically while upgrading AR Server to 9.0?

 

  • Yes. When you start the AR Server 9.0 upgrade, index migration is part of the AR upgrade process.

6. Where do the FTS Index Migration Utility logs get stored?

 

  • They are stored under <AR INSTALL DIR>\Arserver\db

7. Can I skip the FTS Index Migration Utility execution?

 

  • If you are upgrading AR Server to 9.0 using the installer (UI mode), there is no option to disable the FTS Index Migration Utility execution.
  • Out of the box, the installer will migrate the old indexes.

8. Is there any way to skip the FTS Index Migration Utility execution?

 

  • If you upgrade AR Server 9.0 using silent mode, you can skip the FTS Index Migration Utility execution.
  • During the AR Server 9.0 upgrade, if the <AR Server INSTALL DIR>\ftsconfiguration\collection folder does not contain any files, the FTS Index Migration Utility will not migrate anything.

9. What is the silent install parameter to skip the FTS index migration process during an AR upgrade via silent installation mode?


  • FTS index migration can be skipped only if the installation is done via AR silent installation.

  • Using the following silent install parameter, one can skip index migration:

                -J BMC_AR_SKIP_FTS_INDEX_MIGRATION=true

                  (the value of this parameter is case insensitive)

  • Index migration is skipped only if this parameter is present in the AR silent install file and is set to True/TRUE/true.
  • In any other case, index migration will execute by default and migrate the old indexes.

10. If the utility fails, will it cause the AR Server upgrade to fail?

 

  • If the FTS utility fails, it will cause the AR upgrade to fail.
  • The utility returns 3 return codes:
    • 0: Success
    • 1: Warning (for a DB-only upgrade, or when just adding a new locale to an existing AR installation, it returns 1 but the upgrade will not fail)
    • 2: Fail (the AR upgrade will also fail)
  • FTS Index Migration Utility logs are stored under <AR INSTALL DIR>\Arserver\db

11. How does one configure shared load among indexer servers?

 

  • There is an FTS Configuration plugin in place. Using it, one can dedicate servers as indexers and searchers.
  • E.g. if I have 4 servers in a server group, I can nominate 2 as indexer servers and 2 as searchers.

        From the FTS Config plugin UI, one can configure Server A and Server B as indexers, and Server C and Server D as searchers.

  • To nominate a server as an indexer, that server should have a rank for the FTS operation in the Server Group Ranking form; that way AR Server will treat it as an FTS indexer server.

12. FTS Index Migration Utility execution failures: what are the possible reasons?


  • The old indexes are corrupted
  • There is not enough space on the disk where the ftsconfiguration/collection directory exists (during index migration, roughly double the space is required on the disk; later, while merging, unwanted files present in the /collection folder are deleted).


        Example: if the /collection directory is 2 GB before the upgrade, then during the AR upgrade there must be 4 GB of free space available for the /collection folder; at the end of the migration process, the /collection folder will shrink to ~1.2 GB.


13. Can I execute the FTS Index Migration Utility manually at a later time?

  • Yes, you can, but if the indexes have already been migrated there is no need to execute it manually again.


14. Which version of Luke can I use to debug a Lucene 4.9 index?


  • You can use Luke 4.1x onwards; it can be downloaded from here


15. AR FTS Index Migration Utility usage?


  • <ARInstallDir>\arftsutil.bat -d "<COLLECTION DIR PATH>" -c "<FTS CONFIGURATION DIR PATH>"
  • arftsutil.bat -help (it will print usage)
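  As an illustration, on a default Windows installation the invocation might look like the following; the install path is an assumption, so substitute your actual <AR INSTALL DIR> (the FTS configuration directory is assumed to be <AR INSTALL DIR>\ftsconfiguration):

  arftsutil.bat -d "C:\Program Files\BMC Software\ARSystem\ftsconfiguration\collection" -c "C:\Program Files\BMC Software\ARSystem\ftsconfiguration"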

16. How much time is required to migrate old indexes to the newer format using the FTS Index Migration Utility?

 

  • In the R&D performance lab we conducted tests and here are the results (note: timings may vary depending on the server configuration)
  • Test I
    • Collection folder size before conversion: 1.5 GB
    • Collection folder size after conversion: 0.8 GB
    • Total time for the index conversion process: 13 mins
    • Test data during this test: ~100K records (Incidents, Changes, Articles, RKM external docs, etc.)
  • Test II
    • Collection folder size before conversion: 24.5 GB
    • Collection folder size after conversion: 15 GB
    • Total time for the index conversion process: 130 mins
    • Test data during this test: INC ~522K, CHG 103K, SR 535K, WO 12K, PB 100K, RKM Docs 100K, AR Form w/ Attachments 200K, AR Form w/o Attachments 300K

17. Is global re-indexing required after I move my entire stack to Remedy 9?

 

      No, it is not required.

18. How can I re-index a particular schema if its indexes got corrupted and I want to recreate them?

 

  • One can use the process command Application-FTS-Reindex-Form "<FORM NAME>" via the driver program, or write a filter that uses this process command, to re-index a particular schema.
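  For example, assuming the Incident form is named HPD:HelpDesk as in the earlier example (use the exact form name on your system), the Run Process command value in a filter action would simply be:

  Application-FTS-Reindex-Form "HPD:HelpDesk"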

There is a new KB article https://kb.bmc.com/infocenter/index?page=content&id=KA429507 available on resolving BIRT Security Issues ISS04485990 (CVE-2015-5071) and ISS04485988 (CVE-2015-5072).

 

Workarounds as well as fixes are available.

--- Abhijit Rajwade

 


There has been a lot of excitement around Remedy 9 but you might ask if it is ready for prime-time. The short answer is "absolutely". BMC Remedy 9 has been in the works for over a year and represents the most solid, high-quality release of Remedy yet. This post will cover 5 key reasons why you can be confident in upgrading to Remedy 9.


Reason 1: Long validation cycle

The Remedy 9 release had an extra-long external validation cycle beginning in November, 2014 and stretching out until the end of April, 2015. A key component of that was the BETA program. It spanned over 12 weeks and had more than 700 participants across 170 companies. This extended beta program provided BMC an unprecedented amount of feedback and enabled us to refine the release to reach an exceptionally high level of quality.


Reason 2: Focus on Simplifying Upgrade

A key theme in the release was to make it as easy as possible to upgrade to Remedy 9. We wanted to help customers adopt Remedy 9 quickly so they can start to utilize the great benefits of this release. New capabilities were introduced such as the customization reconciliation functionality and the upgrade pre-checker to optimize the time and effort spent during an upgrade. In this webinar by Vyomlabs they share the results of four Remedy 9 upgrade case studies and highlight the benefits of the new tools in reducing the customization reconciliation process from 6 weeks to just 1 day, in one example.

 

The BMC R&D organization has upgraded over 20 customer databases to ensure the upgrade works in various configurations and with different data. BMC also had some select customers visit us in our San Jose office to participate in a 3-day workshop focused on upgrade and evaluation of the new features. BMC pre-upgraded the customers’ databases, discussing the logistics around upgrade and gave our customers hands on experience with some of the new functionality. This win-win event helped BMC to get a customer’s perspective on an upgrade and gave these customers direct access to the BMC product team.


Reason 3: Quality is Job #1

Many customers are hesitant about being an early adopter of a major release for enterprise software so with Remedy 9 we worked exceptionally hard to address these concerns. During the extensive re-write of the platform we addressed thousands of defects that were "hanging around" from previous releases and made sure we had a very low defect rate as we approached General Availability. We also performed numerous endurance tests with the system running under very heavy load for several weeks. By the time we reached GA we knew we had an extremely robust release.

 

Quality and security are usually tied tightly together. Remedy 9 has over 100 3rd-party library updates to ensure the latest and greatest components. Making sure version 9 was thoroughly tested for penetration vulnerability was a key focus. You can review the published results here.


Reason 4: The move to Java

The Remedy 9 platform has been re-written in Java. In doing so, BMC has reduced the code-base footprint for Remedy by up to 50%, even with all the new functionality. All of this change happened "under the covers": there are no changes to existing APIs and there is no impact to any code you may have already created. Why is this important? For 3 key reasons: 1) it helps BMC innovate faster, 2) it helps BMC provide better support to our customers, and 3) it is the foundation for new and exciting enhancements on the Remedy roadmap. BMC has invested heavily in the new platform capabilities.

 

The memory footprint HAS changed with Remedy 9 as compared to version 8.1. However, the overall machine recommendations for memory HAVE NOT changed from version 8.1. As we moved the platform to Java, we took the opportunity to focus on code maintenance and optimization. We also included some components in the server and not as a plug-in like our Full Text Search (FTS) component. Because of this structure, we expect version 9 memory consumption to be MORE consistent than version 8.1. There are some use cases around delay re-caching that will impact 8.1 memory but not 9.0 memory.

 

Some may see this significant change and categorize it as a 1.0 release. This move is far from a 1.0 release. We have been executing on this initiative for over 3 years through a separate code line and have thousands of hours of testing behind it. Remedy 9.0 is also NOT the first time customers have used the Java-based platform. We released this in our Remedy OnDemand solution over a year ago and have had a select number of customers live in production long before Remedy version 9.0 came out. We have seen great performance, stability and scalability in real-life customer scenarios.


Reason 5: The core of any production system: Performance and Scale.

Finally, performance and scale are key factors to feeling comfortable adopting a new release.

BMC performance labs performed numerous tests and a complete view of the benchmark results can be found on the BMC documentation portal here. We have observed performance and scale to be similar to 8.1, with some areas even better. Remedy 9 introduced some significant FTS improvements and optimized indexes to increase search responsiveness by up to 11x. The CMDB access and reconciliation process was also a focus area for version 9 and we have seen up to 2x throughput improvement and better sustainability on large reconciliation jobs. Check out some great comparison results between the two versions in a recent blog by Vipul Jain here.

 

There are always discussions around the performance and scale of a new release. Please consider the following when evaluating any performance reports:


Machine sizes and configuration

Any testing needs to reflect a production environment or one that BMC would perform load and scale testing on in our lab. This is a reference environment that we would use to confirm load scripting and general product regression testing and confirmation. Also, make sure you tune your environment using the BMC tuning guidelines. Performing any tests using OOTB configuration will not reflect an optimized production environment. BMC has tuning guidelines that are specific to load testing across the versions under test. These settings and tuning characteristics are also the ones we deploy in customer environments when they have scale requirements.

KEY TAKEAWAY: Make sure you have the right deployment and tune your environment with the right configuration.


Scale

Remedy is an extremely scalable solution but is a CPU-bound application. Once load impacts CPU beyond 80% then the hardware needs to be scaled in order to get accurate and consistent performance load results. We always recommend no more than 80% CPU load on a single VM or server instance. When planning for deployment, never plan for over 80% capacity on a machine or you will get unpredictable results. We carry this same methodology into our performance scale lab.

KEY TAKEAWAY: Make sure your environment is appropriately sized for scale.


Performance

BMC’s own internal testing of performance and scale with Remedy 9, on a right-sized, tuned and configured environment, shows similar results as version 8.1 for most test cases. In some use case scenarios (FTS, CMDB) we actually see significantly better performance and scale. We did not observe areas where performance regressed and if we had then this would have been a release gating defect.

KEY TAKEAWAY: Review the published Remedy v9 performance testing results from BMC to get an accurate picture of the entire performance behavior.

 

This post didn’t cover any of the new functionality introduced in Remedy 9. A brand new Smart Reporting capability, expanded persona coverage delivered in Smart IT, and numerous additional platform features such as REST-based web services and holistic archive make adopting Remedy 9 even more compelling.

 

I hope you understand why Remedy 9 is truly ready for "prime-time". The solution has great performance, unparalleled scale and is the easiest version yet to upgrade. Check out the wiki documentation and the many webinars available through BMC Communities that fully describe Remedy 9. Your action now is to get started on your upgrade so your organization can enjoy all the great value in the new release.


BMC has just released Remedy AR Platform version 9.0. In this blog we will look at some of the performance enhancements in version 9.0 and benchmark these improvements against version 8.1.

Remedy 9.0 is a major release with many new and long awaited features to increase performance, reliability, scalability, and high availability. Here are some key features:

  • A complete rebuild of the AR Server in Java

New features and third party Java libraries can be quickly incorporated into the AR platform. This also allows resource consumption monitoring at the JVM level so that optimization can be made to the AR server’s performance based on actual usage.

  • AR Server uses bind variables in SQL statements

This eliminates the need for the database to replace literals with system-generated bind variables, thus increasing performance and scalability under very high workloads.

  • Optimally sized HTTP session object to optimize replication in web clustering deployments (for horizontal scaling and high availability)

Enables web clustering with minimal overhead so that browser users can experience a seamless fail-over in a clustered deployment. This also allows horizontal scaling with zero service disruption as nodes can be added dynamically to a cluster.

  • Object-sharing in Mid-Tier to reduce JVM heap usage (for multi-tenancy)

Identical meta-data objects across different AR Servers are shared in Mid-Tier to reduce memory usage.

 

To evaluate performance, reliability, scalability and high availability of the 9.0 release, the BMC Performance team conducted a series of benchmarks using a comprehensive solution-level workload on a clustered, multi-tenant deployment.


Environment

The hardware specifications and the deployment architecture of the BMC Remedy ITSM Suite 9.0 Solution in the BMC performance lab were as follows:

  • Two 4 VCPU/8GB Atrium SSO servers
  • Three 8 VCPU/16GB AR Mid-tier servers
  • Five 2 VCPU/12GB AR System servers (one AR server per tenant)
  • Three 8 VCPU/16GB Smart Reporting servers
  • One 16 VCPU/32GB SQL Server database server
  • One F5 load balancer

 

Remedy90 Deployment Architecture.jpg

 

 

Methodology

The details of the above diagram: The Atrium SSO, the Mid-tier and the Smart Reporting servers were deployed as clusters behind an F5 load balancer with the Mid-tier cluster connecting to 5 AR servers, one server per tenant.

The workload was composed of typical BMC Remedy ITSM, Service Request Management, and Smart Reporting use cases, with Atrium CMDB normalization and reconciliation batch jobs running in the background to simulate a typical “day in the life” scenario.

The tests conducted were:

Test scenario | Online user workload (including ITSM, SRM, Reporting) + Atrium CMDB | Number of tenants | Number of servers
A | 600 concurrent users | 2 | 2 AR, 2 mid-tier & 2 reporting servers
B | 900 concurrent users | 3 | 3 AR, 2 mid-tier & 2 reporting servers
C | 1200 concurrent users | 4 | 4 AR, 2 mid-tier & 2 reporting servers
D | 1500 concurrent users | 5 | 5 AR, 2 mid-tier & 2 reporting servers
E | 1500 concurrent users | 5 | 5 AR, 3 mid-tier & 3 reporting servers (added 1 mid-tier and 1 reporting server for high availability)

 

The BMC Performance Team also conducted additional tests for Smart Reporting, Atrium CMDB, FTS search, etc. to validate performance and scalability of these products individually.


Results

The benchmark results showed that even with all the added functionality, there is no performance or scalability degradation in 9.0 compared to 8.1. Some areas show performance improvements.


Summary of 8.1 versus 9.0:

 

Blog_ResponseTime.jpg

 

 

Version 9.0 also focused on the CMDB access and reconciliation process. Benchmark results showed up to 2x or more throughput improvement and better sustainability on large reconciliation jobs.

 

Atrium_100.jpg

 

The following chart shows that the Atrium CMDB throughput is sustained with increasing data volume.

 

Atrium101.jpg

 

 

Version 9.0 also showed increased responsiveness of searches through FTS enhancements and optimized indices. Benchmark results showed improvements up to 11x for FTS based searches while consuming 40% less CPU.


FTS90_1.jpg


Key Takeaways

  • Version 9.0 shows significant performance improvements for some functional areas, such as FTS search and Atrium CMDB.
  • CPU and memory usage are optimized in 9.0 with the new server implementation in Java (e.g., 9.0 does not perform the copy-cache operation while promoting changes during production operation; object sharing in Mid-Tier, optimized indices in FTS, etc.)
  • Horizontal and vertical scalability are enhanced with 9.0.
  • Web users have a seamless failover experience with clustered Mid-tiers. Side benefit: High availability.
  • Smart Reporting servers scale horizontally and vertically, and can also be configured with session replication to support seamless failover. Side benefit: high availability.

On a Friday night, there’s nothing I like better than to curl up with a good glass of red wine, a selection of fine cheeses and the latest set of log files which, especially for the occasion, I have printed out and bound. I start at page one, and I quickly get caught up in the various adventures of the session id, the interplay of the threads with the HTTP requests, and of course I always finish with the latest collection of memory dumps. The perfect way to start my weekend.

 

No really, I’m only joking. I don’t even like wine. And as for the log files, well, it doesn’t really work like that. You see, you need to know what you’re looking for in those log files. You can’t just stare at them for 10 minutes and hope to understand what happened on an application level. At their best, log files offer an incomplete picture of what happened.

 

But is that a bad thing? I don’t think it is. But here’s the thing: you really need to understand what these files are trying to tell you. You need to approach them with a specific idea in mind of what you want to achieve, what you are hoping to find. Sure, if your aim is to get an overview of the general performance of an application (think slowest SQL queries or the average processing time of HTTP requests), put your logs through a log analyser and use the reports. That’s generally a good approach, but sometimes there’s a tendency to over-rely on these sorts of reports. If they don’t resolve your problem, you need to use a more forensic approach; you need to know how to read the logs.

 

Let’s have a look at the various logs we have on offer for Mid-Tier and AR server and let’s try to follow a complete conversation from start to finish. I’ll use version 8.1 for now, but as the log format changed somewhat in version 9, I’ll write a follow-up on this next month.

 

So here’s my problem statement: at certain times of the day users complain that the system slows down. We’ve collected set after set of AR server logs, but the log analyser hasn’t come up with anything conclusive. There are no obvious problems on the database, the network team can’t find anything, and we’re a bit stuck. So where do we go next?

 

Well, you need to get specific, really look at how requests are handled in the log files, follow a complete conversation from start to finish and see how this is handled. In other words, you need to adopt a more forensic approach.

 

A conversation is spread across multiple log files, so let’s first remind ourselves what AR System looks like:

 

pic1.png

 

So the client (usually the browser) communicates with the web server, which in turn communicates with the JSP engine. Mid-Tier communicates via API calls with AR server which in turn communicates via SQL with the database. Here are the logs I feel help the most in logging a complete conversation:

 

pic2.png

 

First stop, the access logs. These are the web server logs that record all the requests that are received. You can of course use Fiddler as well, but at this stage I’m more interested in the bigger picture. What sort of requests are received, what answer goes back and crucially (at least in this case) how long do these requests take?

 

Next, the Mid-Tier logs. Mid-Tier doesn’t log everything at the same level of detail, but what we’re primarily interested in are the backchannel requests, which are logged with enough detail to work with. I need to know if they arrive safely on Mid-Tier and what Mid-Tier does with them. I included an API trace as well, more about this later.

 

From AR server we collect the API/SQL logs: the API part to know what sort of API calls the request results in, and the SQL part to know what we actually send to the database. I like to use the combined format; with everything in one file it makes for easier reading.

 

Good, let’s get started. It’s all about following conversations in logs: you need to know where a conversation starts and how to follow it to the end. But you need to employ a more forensic method, and what I mean by that is that you need to prove or disprove something. That might sound a bit theoretical, but let’s have a look.

 

The access logs are a great start for any log analysis. It’s a log file that’s easily overlooked, and although it doesn’t provide the same amount of information as a proper HTTP trace (think Fiddler), it captures all the requests that come in. You get a much better picture of how a server is performing and you don’t have to capture any logs on the client side. With a little tweaking (see my article here for more information) this is the information you can get out of this:

 

pic3.png

 

The IP data tells us where the request comes from (as in the browser), and we have the timestamp, the URL, the HTTP status code which tells us what the outcome was, the number of bytes returned, and the total time it took for the request to be processed.
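If your web server is Tomcat, that last column (processing time) can be captured by appending %D, the elapsed time in milliseconds, to the AccessLogValve pattern in server.xml. A minimal sketch, assuming the default valve definition:

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
       prefix="localhost_access_log" suffix=".txt"
       pattern="%h %l %u %t &quot;%r&quot; %s %b %D" />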

 

pic4.png

 

I like to convert them to Excel via a tool I wrote. What I’m looking for are requests that are slow. Not necessarily the slowest, but requests that I’d expect to go through relatively quickly. The slowest requests are not necessarily a problem; I’d expect a request for an attachment or a report to take a lot of time. But check the requests I highlighted, which I sorted by URL to see how they perform. They are identical: same URL, same number of bytes (469). They get small amounts of data. It’s not that they take exceptionally long, but sometimes the requests go from a few milliseconds to 4 seconds. That shouldn’t happen.

 

So let’s follow two requests, one that works okay and another one that takes too long and see what happens. We now have the start of the conversation, let’s see where Mid-Tier picks it up. That’s quite easy to do. I know the timestamp (that’s the time it sends the response, not the time it arrives) and the URL so I can just look that up in the Mid-Tier logs. I went for the categories Servlet, performance and Internal, which give me the best information. Let’s have a look at what’s logged:

 

pic5.png

I used the thread ID to sort everything; each request is handled on a single thread, so you need this ID to string everything together (I left it out here to make it a bit more readable). In both cases Mid-Tier picks up the requests at the correct time, they are extracted and sent to AR server. But then Mid-Tier starts waiting until it gets an answer from AR server. In the first case it only has to wait a few milliseconds; in the second case it waits seconds.

 

What Mid-Tier is doing is translating the backchannel request to a series of API calls and sending these to AR server. The API calls are specific requests or instructions using a protocol which both Mid-Tier and AR server understand. The Mid-Tier logs don’t actually record the exact moment these calls leave the server. It’s probably the last line of the mapProperties, but to be absolutely sure you need to check the API trace.

 

It’s not a log I use very often, but if I want to know exactly how a request travels across the various servers, that’s the log I use on Mid-Tier. There are a lot of logs on ARS that are called API logs and it can get a bit confusing. The log on Mid-Tier is the one that records the API traffic from the Mid-Tier perspective; it records everything that is sent and received. It’s not the same as the API log on AR server or the API log which you can enable via the Mid-Tier Configuration tool. This is something you need to enable via a specific XML file. I’ve included instructions below.

 

API trace logs record all the API traffic from all the users on that Mid-Tier, logs tend to grow fairly quickly so you need to be careful. There’s a lot of data recorded in these logs so it can be a bit tricky to get the bit you need. I usually use the timestamp as recorded by Mid-Tier and some details of the request, a form name, value, those sort of things. This is what I find:

 

pic6.png

What we’re looking at are the representations of two individual calls that Mid-Tier sends to AR server. Mid-Tier converted the request to specific calls, the values are all recorded and we now know exactly at what moment the request leaves the system and at what time the answer comes back.

 

So in our test case, we now know the delay is not on the web server side or the Mid-Tier side. The request leaves the server on time and the Mid-Tier server is just waiting for the answer.

 

So what’s next? Well, you need to look at the next piece of the conversation, AR server. The best log for this is the combined API/SQL log. Since we’re now communicating via API calls between Mid-Tier and AR server and we already know what the API call looks like that Mid-Tier sends off, we can just check for these calls as they arrive on AR server.

 

There’s no unique ID we can use to link the logs (I’ve raised an Idea for this), but you can use details like the sort of request, timestamp and field names, values, etc. to find a match. Similar to the Thread ID in the Mid-Tier logs you can use the RPC ID in the API/SQL logs to string everything together. This is what I came up with:

 

pic7.png

 

Notice that the name of the API call is different from the one used by Mid-Tier. So what we have now is the API call sent by Mid-Tier and the same API call recorded by AR server. But there’s something odd here. Notice the timestamp. That’s the moment AR server is first aware of the request. When everything is okay that’s just milliseconds later than the timestamp Mid-Tier recorded, but check the timestamp of the second API call: there’s a big difference here, it’s recorded much later. Once it is received it’s processed fairly quickly, and the answer goes back to Mid-Tier within milliseconds. But why is it received so late?

 

If you ask a Mid-Tier engineer he will tell you the timestamp recorded in the API trace is the moment the request leaves the server and it’s an AR server delay. But if you ask an AR server engineer he will tell you that the timestamp recorded in the server API log is the moment the AR server is first aware of the request and it’s a Mid-Tier delay. You basically have two parts of a conversation and there’s a gap of a few seconds in the middle which is unaccounted for.

 

At this stage we know it’s not a web server delay, we know it’s not delaying on the Mid-Tier server and we know it’s not delaying on the AR server, you now need to prove (or disprove) it’s the network. It’s the only part we haven’t looked at yet.

 

So how do you do this? Well, that’s the last piece of the puzzle and it’s a difficult one. All the logs we’ve looked at so far are logs at the application level, but now we need to check what happens between the two applications, and there’s no neat log file for this. Instead we need to check the network traffic. The way to do this is by using network traces. You basically record all the traffic that flows over the network card. If you do this on both the machines that host Mid-Tier and AR server you should get a good picture of what’s happening. How to do this depends on the OS; in this case it’s Linux, which means we need to use tcpdump: tcpdump -i <interface> -s 65535 -w <file>
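For example, to keep the capture manageable you can limit it to TCP traffic between the two servers with a host filter; the interface name and IP address below are placeholders for illustration only:

tcpdump -i eth0 -s 65535 -w midtier_side.pcap tcp and host 10.0.0.20

Run the equivalent capture on the AR server machine (filtering on the Mid-Tier's IP) so you end up with a trace from both sides.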

 

I’ve included an article with more information, but the basic idea is to capture the network traffic from two sides with the API logs (also on two sides) to get the correct details. The application to read network data is Wireshark, so once you have the logs, open them via Wireshark and have a look.

 

Not as easy to read as the other logs. What you’re looking at is basically raw data, so you need to look for specific pieces of traffic here. The first thing I do is filter by TCP traffic. There’s a lot of stuff logged here, but I know the API traffic between the two servers is sent via TCP.

 

Next I use the same method I used for the other logs. Using a combination of timestamps, field IDs and field values, I scan through the capture for any requests that match. I also keep an eye on the IP addresses, since I’m looking for traffic between the Mid-Tier and AR server; I know both IPs, so I know which is the sender and which is the receiver. Here’s what the traffic I found looks like:

 

pic8.png

 

From a Mid-Tier perspective the request looks like it leaves the server on time, but it doesn’t get an answer back until four minutes later. The trace from the AR server machine confirms this: the traffic comes in too late, so there’s nothing else AR server can do. The network is delaying things here; something goes wrong when the packets are sent between the two servers. It’s a network problem, and the network team can probably answer where this is going wrong.

 

I know this is all quite detailed, but my point is that none of these problems would show up if you only look at the logs using reports that record the slowest call, the highest thread counts, etc. You can only find this out if you actually look at the traffic. If you follow a conversation through the log files you can see exactly where and when things happen. You can prove it’s not the web server, then you check whether it’s the Mid-Tier server that’s causing the problem. If not, follow the conversation to AR server, the database, and so on. I often just take a few similar requests over a certain period and check how they perform. If there are any differences, I check where each one spends most of its time and try to find out why.

 

That said, your first port of call when dealing with performance problems is probably still the log analyser and the access logs, but if that doesn’t help and the circumstances are suspicious, you’ve got to get down to the details.

 

Regards,

 

Justin

 

Don't forget to follow me on twitter!

 

Further Reading:

 

  1. Enable API trace logs
  2. Access2CSV Tool
  3. TCPDump instructions
Young So

Mid-Tier Troubleshooting

Posted by Young So Jul 1, 2015
Share:|

These steps will speed up your analysis and help you and BMC Support get to a resolution quicker.  In my case, I had both BMC and Microsoft involved in troubleshooting the same issue.  There are also mid-tier tuning documents available.  (try to find that doc)

 

These steps are for a mid-tier problem that is not too obvious from the log files.  If you are running into such an issue, I would recommend following the steps below before opening a ticket with BMC.  Some customers use IIS instead of Tomcat; either way, these tips will help you get to the bottom of the problem.

 

Tomcat Stuff:

  1. Edit the Java options on Tomcat with the following settings.  (The last four options enable remote JMX so you can take a thread dump or an application profile of Tomcat; a sketch of where these settings usually go follows the list.)
    • -XX:PermSize=256m
    • -XX:+UseConcMarkSweepGC
    • -XX:+UseParNewGC
    • -XX:NewRatio=3
    • -XX:+HeapDumpOnOutOfMemoryError
    • -Dcom.sun.management.jmxremote
    • -Dcom.sun.management.jmxremote.port=8086
    • -Dcom.sun.management.jmxremote.authenticate=false
    • -Dcom.sun.management.jmxremote.ssl=false
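
As a minimal sketch of where these options usually end up, assuming a stand-alone Tomcat on Linux/UNIX started via catalina.sh (on Windows you would normally add them one per line in the Tomcat service configuration tool instead), you can put them in bin/setenv.sh:

# bin/setenv.sh - hypothetical example, adjust values and the JMX port to your environment
CATALINA_OPTS="$CATALINA_OPTS -XX:PermSize=256m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=3"
CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError"
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8086"
CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
export CATALINA_OPTS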

 

     2.  Connect to the JMX remote port.  You need to have a JDK installed to do this part.

     3.  Schedule a restart of the mid-tier.  Wait for the problem to reoccur.

    • Go to C:\Program Files\Java\jdk%version%\bin
    • Execute jvisualvm.exe (a command-line alternative is sketched after this list)
    • If you’re local, you can connect to the port you set (8086).  If you’re remote, you also need the host name
      • Once you are connected to the remote host, create a JMX connection using the port number
      • Here you can validate the performance of the JVM, its settings, etc.
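
If you prefer the command line, VisualVM can also be pointed straight at the JMX port when you start it. A small sketch, assuming the JDK bin folder is on your PATH and midtier-host stands in for your server name:

jvisualvm --openjmx midtier-host:8086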

 

Capture.PNG

 

4.  Create an application profile and thread dump for support.

5.  Create a BMC support ticket.

 

If you are running IIS, you would instead create a dump of the IIS worker process from Task Manager.

 

IIS stuff:

 

Microsoft has a very good section on the symbol and debugging tools.  The following KB article has the details.

https://support.microsoft.com/en-us/kb/919790

 

WebSphere stuff:

 

Here is a website that shows how to take a thread dump for WebSphere, and here is a tuning guide for WebSphere.

 

TCP/OS Tuning:

 

I found a few things that help tune the mid-tier at the network layer.  Here are some Windows registry settings that improve TCP/IP performance; these settings are well documented on Microsoft TechNet.  In addition, you may want to adjust the RX buffer on the NIC.  (One way to apply the listing is sketched right after it.)

 

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\inetinfo]

 

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\inetinfo\Parameters]

"ObjectCacheTTL"=dword:00000030

"MemCacheSize"=dword:00002048

"MaxCacheFileSize"=dword:00000256

"PoolThreadLimit"=dword:00000010

"ListenBackLog"=dword:00000250

"SynAttackProtect"=dword:00000000

"TcpTimedWaitDelay"=dword:0000001e

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server]
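
One way to apply a listing like the one above (a sketch, assuming you have saved it, including the required "Windows Registry Editor Version 5.00" header line, to a file called midtier-tuning.reg) is to import it from an elevated command prompt and then reboot:

reg import midtier-tuning.reg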

 

      

 

Here is TCP/OS tuning for different flavors of UNIX and Windows:  Operating System Tuning

 

Here is a PowerShell script that looks at port activity using netstat -ano (this is for Windows):  http://www.core-admin.com/portal/kb-24032014-001-dealing-with-time-wait-exhaustion-no-more-tcp-connections

7-1-2015 2-50-21 PM.png
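
If you just want a quick look without a script, the same information is available from a plain command prompt; the findstr filter simply narrows the netstat output down to connections stuck in TIME_WAIT:

netstat -ano | findstr TIME_WAIT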

Here is a way to do the same thing on UNIX for port/socket status:

netstat -n
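
And if you would rather count how many sockets are sitting in TIME_WAIT than list them, a rough sketch using standard tools:

netstat -n | grep -c TIME_WAIT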

Port exhaustion does occur, believe it or not, and there is a lot of documentation on the internet about it.  Most network people will argue against this happening on their network and will point to the application or the web server as the cause, based on resource limitations.  Moving on to how to gather the information vendor support needs to resolve your issue quickly.

 

Adjusting TCP Settings for Heavy Load on Windows

 

The underlying Search architecture that directs searches across multiple physical partitions uses TCP/IP ports and non-blocking NIO SocketChannels to connect to the Search engines. These connections remain open in the TIME_WAIT state until the operating system times them out. Consequently, under heavy load conditions, the available ports on the machine running the Routing module can be exhausted.

 

On Windows platforms, the default timeout is 120 seconds, and the maximum number of ports is approximately 4,000, resulting in a maximum rate of 33 connections per second. If your index has four partitions, each search requires four ports, which provides a maximum query rate of 8.3 queries per second.

 

(maximum ports/timeout period)/number of partitions = maximum query rate
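
Plugging in the Windows defaults quoted above (roughly 4,000 ports, a 120 second timeout and four partitions):

(4,000 / 120) / 4 ≈ 33 / 4 ≈ 8.3 queries per second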

 

If this rate is exceeded, you may see failures as the supply of TCP/IP ports is exhausted. Symptoms include drops in throughput and errors indicating failed network connections. You can diagnose this problem by observing the system while it is under load, using the netstat utility provided on most operating systems.

 

To avoid port exhaustion and support high connection rates, reduce the TIME_WAIT value and increase the port range; the registry steps below show how, and a command-line sketch follows them.  Note: This problem does not usually appear on UNIX systems due to the higher default connection rate in those operating systems.

 

To set TcpTimedWaitDelay (TIME_WAIT):

 

  1. Use the regedit command to access the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters registry subkey.
  2. Create a new REG_DWORD value named TcpTimedWaitDelay.
  3. Set the value to 60.
  4. Stop and restart the system.

 

To set MaxUserPort (ephemeral port range):

 

  1. Use the regedit command to access the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters registry subkey.
  2. Create a new REG_DWORD value named MaxUserPort.
  3. Set this value to 32768.
  4. Stop and restart the system.
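
If you prefer the command line over regedit, the same two values can be created with reg add from an elevated prompt. A sketch; with reg add the /d value is given in decimal, which matches the 60 and 32768 used in the steps above:

reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpTimedWaitDelay /t REG_DWORD /d 60 /f
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v MaxUserPort /t REG_DWORD /d 32768 /f

A reboot is still required for the new values to take effect.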

 

Need to write a section about mid-tier server.xml options.

 

Let's move the discussion to creating dumps for Windows and the JVM.  I am assuming that you know how to read the dumps.

 

Windows API Dump:

 

00000000`2faee630 00007ff8`0d690f15 mswsock!WSPSelect+0x7e9

00000000`2faee7d0 00000000`50f74d93 ws2_32!select+0x1f9

00000000`2faee8c0 00000000`50f73b33 net!NET_Timeout+0x73

00000000`2faeeb30 00000000`01598b50 net!Java_java_net_SocketInputStream_socketRead0+0xdb

 

Java API Dump:

 

"http-bio-80-exec-381" - Thread t@668

java.lang.Thread.State: RUNNABLE

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.read(Unknown Source)

        at java.net.SocketInputStream.read(Unknown Source)
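
A Java thread dump like the one above can be captured with jstack, for example. A minimal sketch, assuming a JDK is installed and 1234 stands in for the Tomcat process ID (jps -v or Task Manager will tell you the real one):

jstack -l 1234 > midtier-threaddump.txt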

 

   

 

Notice what's highlighted in red in the Windows dump, "socketRead", and how the Java dump repeats it as "socketRead0".  If both dumps confirm the same pattern, you can begin to ask why you are getting this "socketRead" at the API level, or put that question to vendor support.  That way you get to the bottom of the issue quicker and escalation becomes seamless.  Trying to get your team to understand a recurring problem when support isn't available can be frustrating at times; having some of the knowledge above helps you escalate the issue quicker to the experts who deal with Tomcat or IIS (third-level support or developers).

 

Now this part is above and beyond for most of you.  There are sniffer tools called Wireshark and Microsoft Network Monitor.  These tools can be helpful to you and your network team.  Capturing the issue at this level while it's occurring, along with the API dumps, helps everyone understand the problem better.

 

 

More edits to come...

Share:|

As you might have noticed, I'm writing a few articles on new features in Version 9 which was released not too long ago. This time something relatively short (I have a tendency to write rather lengthy articles) as we're having a look at Mid-Tier's new response time monitoring capabilities.

 

Picture this: you’re about to log a change request via the system. You log in, open the page. All going well so far, and then ... then, well not an awful lot really. A loading icon for the first minute or so, you get through it eventually, but every action you take seems to take longer and longer. By the time you’re done logging the change request, you managed to answer ten emails, finish that report you were working on and even check the latest news.

 

If you ask a network administrator to look into this he’s going to suggest an awful lot of log files. Could you run a HTTP trace for a while, maybe a few workflow logs while you’re at it? He will enable the server logs and say he’ll have a look afterwards. It’s frustrating and of course I sympathise.  But the thing is, there’s only so much you can learn from looking at the server side – if you want to know why things are slowing down on the user side, well you have to look at this from a user’s perspective.

 

Yes, I’m one of those people who frequently asks end users to record HTTP traces (although I’m less fond of workflow logs). You see, what I want to know is how long things take exactly. How long does it take for a page to get loaded? How long does it take for a backchannel request to get processed on the server? That’s what will tell you where things get delayed. Because remember, there are a lot of places where this can go wrong: the client OS might be too busy, JavaScript might run wild, the network might be overloaded, maybe the server isn't coping. Our role as administrators is to find out where, why and (here’s the important bit) how to fix it.

 

Enter version 9’s new response time monitoring. It’s a new feature that was added which should give us a better idea of the performance from a client side perspective. What this gives us is an overview of how long things take. How long does it take for a page to load? How much of that time was spent on the network and on the server? The more seasoned network administrators among you will probably argue that a HTTP tracer like Fiddler is the answer to this, and you’d be absolutely correct. But that’s not necessarily the point where you want to start. I mean, it’s asking quite a lot from an end user to install Fiddler and run it next to the browser for a period of time while eliminating any non-Remedy traffic.

 

Let’s see what it can do. It’s a server setting which you enable via Mid-Tier’s Config Tool. No restarts or cache flushes required, just enable it. Note that it’s either on or off for the whole Mid-Tier environment. You can’t set this per user.

 

mon1.png

 

Once it’s on, a little icon is added to each page in the bottom-right corner which shows you details of how long things took.

 

mon2.png

 

If you click on the information bar you get a few more details.

 

mon3.png

 

To measure this data, the browser sends a request for a small servlet called ResponseTimeServlet. One request at the start and another at the end. What you see of course is the difference. Most of this data is also available when you use Fiddler, but the good thing is that it’s readily available. No need to turn on your logs, a simple screenshot or two will do.

 

For example, if my form spends a lot of time interacting with the server (and by that I mean Mid-Tier and AR server) it will be obvious in the Page Load Time data. My server time will be unusually high, which gives me a good indication of where to look. If my load balancer or proxy servers are acting up it will be obvious too: my latency will be far too high and I'd just know something isn't quite right. That's the point where I start digging through Mid-Tier logs, or access logs on the server. Or maybe get Fiddler or Wireshark out to see what's going on on the network side. On the other hand, if the Browser Time is higher than I'd expect, it's probably JavaScript gone wild.

 

But is this the answer to all our performance monitoring questions? I’d argue not. It’s easy enough to enable and it’s not very intrusive – it’s passive in its design, doing its best not to add to the footprint. It can be as simple as asking users to take a few screenshots and you've got an idea what’s going on. If more data is needed you can always get your HTTP tracer or your server logs out for some serious network analysis.

 

The downside here is that it’s not a log file. It’s user friendly, but I’d be interested in how the application performs during a user’s session. When the user navigates from page to page, how long does this take? Which pieces of workflow take the longest? Am I looking at delays on the browser side, the network side, or the server? And at what points exactly? This tool doesn't answer these questions. It’s a snapshot of one form or one console at one particular moment in time. Although this will certainly get me started, I’m not entirely sure it’s enough.

 

But hey, that's just my opinion. What do you think? Would you use the new response time monitoring feature? Can you work within its limitations? Should it be user based, maybe in the form of a log file like the workflow log? The only way to know is by giving it a go! (And when you do, leave a comment below to let me know how you got on.)

 

Until next time,

 

blogname.png

 

 

Further reading:

 

Share:|

Now that Remedy 9 has been successfully released and we see a lot of excitement in the customer base, the BMC product team has finally found some time to address an action item that has been open for quite a while.

 

We've heard feedback from many of our BMC community members that the placement of this Remedy AR System space under Atrium is hard to understand and makes it difficult to find and use, especially for new community members.

 

BMC is blessed that Remedy AR System is such an active community space for Remedy platform related discussions. It's the most active community space across all BMC offerings. And therefore we plan to give this space the credit (or rather the place) it deserves: starting next week the Remedy AR System space is planned to move to become a top-level community space and thus will be more easily accessible from the Products menu. This will bring the Remedy platform-specific community space to the same level as Remedy ITSM (for discussions about Remedy ITSM applications) and BMC Helix (for discussions specific to the delivery of Remedy as a SaaS service), and it reflects BMC's renewed focus on the Remedy developer community. Here's what this change looks like:

 

New Remedy Community Structure.png

 

Please note that this move is part of a wider discussion about how we can make the BMC Communities work better for our Remedy customers.  We already streamlined the community structure by archiving some of the low-traffic sub-spaces.  This has reduced the number of spaces our community members have to deal with.  And we plan further simplification of the Remedy community spaces.  So, stay tuned for further optimizations that will help make the Remedy community even more active, engaging and valuable.
