
The Story So Far...

In parts one and two of this series of blogs we've seen how to set up the Elastic Stack to collect logs from a Remedy server.  At the end of the last post we introduced Logstash between our Filebeat collection agent and Elasticsearch so that we're ready to start parsing those interesting pieces of data from the logs.

 

One of the challenges of working with Remedy logs in Elastic is that, although there is some level of standardisation in their format, there's still a wide variety of information present.  Many of the different log types share the same markers at the beginning of their lines but they then contain very different data from the timestamp onward.  This is exactly what Logstash is designed to deal with by making use of its many filter plugins.  These provide different ways to manipulate and restructure data so that it becomes queryable beyond simple text searches.  There's one filter plugin in particular that we're going to use to help us grok our Remedy logs.

The grok Logstash Plugin

The documentation for the grok plugin says...

"Grok is a great way to parse unstructured log data into something structured and queryable."

This is exactly what we want to do, so how do we use it?  Logstash ships with the most commonly used filter plugins already installed, so there are no additional steps needed to make it available.

 

grok works by using patterns to match data in our logs.  A pattern is a combination of a regular expression and the name of a field used to store the value if the regex matches.  As an example, consider the first bit of data in our API logs - the log type:

 

<API > <TID: 0000000336> <RPC ID: 0000021396> <Queue: Prv:390680> <Client-RPC: 390680 >.......

 

A grok pattern to read this and create a field called log_type in Elasticsearch would be

 

^<%{WORD:log_type} >

 

Let's break it down:

  • ^< means we're looking for the < character only at the start of a line
  • the grok pattern syntax uses %{...} to enclose regex:field pairs
  • WORD is one of the many built-in Logstash patterns and matches one or more of the characters A-Za-z0-9_
  • log_type is the name of the field that the value will be assigned to in Elasticsearch

 

When a log line matches the pattern (that is, it starts with < and has a string followed by a space and then >) the value of the string will be added to the record as a field called log_type.
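To picture where this fits, here's a minimal sketch of a Logstash filter block using just this one pattern (the full configuration we'll build up to is shown later in this post):

filter {
  grok {
    # capture the log type at the start of each line into a log_type field
    match => { "message" => "^<%{WORD:log_type} >" }
  }
}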

 

We can add more patterns to match the next piece of data on the log line, the thread ID:

 

^<%{WORD:log_type}%{SPACE}> <TID: %{DATA:tid}>

 

  • I've changed the line to include %{SPACE} (another built-in pattern matching zero or more spaces) instead of an actual space character because, if this were a FLTR log line for example, there would be no space before the closing >
  • > <TID: is the literal text we're expecting
  • DATA is another built-in pattern and matches any characters up to the literal text that follows it

 

Now we will have two fields added to our Logstash records:

 

Logstash field    Value
log_type          API
tid               0000000336
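In JSON terms the event sent on to Elasticsearch would now include something roughly like this, using the sample line from above (standard fields such as @timestamp and the Beats metadata are omitted):

{
  "message": "<API > <TID: 0000000336> <RPC ID: 0000021396> <Queue: Prv:390680> <Client-RPC: 390680 >.......",
  "log_type": "API",
  "tid": "0000000336"
}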

 

We can continue to build up the patterns until we have all the data we want from the log line.  Developing these patterns can be complex but there are a number of tools available to help you - there's even one in Kibana.  Click on the Dev Tools link in the left-hand panel and then Grok Debugger.

 

Here's the grok pattern for a complete API log line shown in the debugger and you can see the resulting field names and values in the Structured Data window:

 

Note the new patterns used: %{NUMBER:overlay_group:int}, where the :int suffix stores the value as an integer rather than a string since this value is only ever a number, and %{GREEDYDATA:log_details} at the end, which captures the remainder of the line and assigns it to the log_details field.

 

^<%{WORD:log_type}%{SPACE}> <TID: %{DATA:tid}> <RPC ID: %{DATA:rpc_id}> <Queue: %{DATA:rpc_queue}%{SPACE}\> <Client-RPC: %{DATA:client_rpc}%{SPACE}> <USER: %{DATA:user}%{SPACE}> <Overlay-Group: %{NUMBER:overlay_group:int}%{SPACE}>%{SPACE}%{GREEDYDATA:log_details}$

 

We now need to add this grok filter to the Logstash configuration file which we created in the previous post.  My example was /root/elk/pipeline/logstash.conf, which needs to be edited to include the filter definition:

 

# cat elk/pipeline/logstash.conf

input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "^<%{WORD:log_type}%{SPACE}> <TID: %{DATA:tid}> <RPC ID: %{DATA:rpc_id}> <Queue: %{DATA:rpc_queue}%{SPACE}\> <Client-RPC: %{DATA:client_rpc}%{SPACE}> <USER: %{DATA:user}%{SPACE}> <Overlay-Group: %{NUMBER:overlay_group:int}%{SPACE}>%{SPACE}%{GREEDYDATA:log_details}$" }
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}

 

Logstash needs to reload the updated configuration, which can be done by restarting it using:

 

# docker-compose -f elk.yml restart logstash

Restarting logstash ... done
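If a line doesn't match the pattern, grok adds a _grokparsefailure value to the event's tags field rather than failing outright, and a configuration error will stop the pipeline from starting at all.  Should something not look right, checking the Logstash container output is a good first step, for example:

# docker-compose -f elk.yml logs --tail=50 logstash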

 

So let's see what our logs look like in Kibana now.  Go to the Discover tab and make sure you're looking at the logstash-* index pattern, expand one of the records and, if all went well, you should see something like this:

Index Pattern Refresh

Our new fields are listed and we can see the values for them from our log line!  There are orange warning flags by the values because they're new fields that are not in the index definition that we're using.  To fix this, click on the Management link, go to Index Patterns, select the logstash-* pattern and click the refresh icon.  You should see that the count of the number of fields increases and you can page through the field list if you want to see which fields are present.  While we're here I suggest clicking the star icon to set logstash-* as the default index pattern so that you don't have to keep switching from filebeat-* on the Discover page.

 

 

Reload the Discover page and the warnings should have gone.  The index pattern refresh is something that needs to be done each time new fields are added.

 

Making Use of Remedy Specific Fields in Kibana

 

Now that we're enriching our log line records with Remedy data we can start doing some more interesting things, such as...

Filtering

 

Remember in Part 1 we saw how to filter the log data using the list of fields?  Well, now that we have some fields which are relevant to our application, such as the user or RPC queue that the logged activity belongs to, let's see how we can use them to isolate the actions of a single user.

 

I'm going to log in to my Remedy server as Demo so let's set up the Discover page to see what I'm up to.  In addition to applying a filter from the field list, you can use the Add a filter + link to get an interactive filter builder, or you can use the search bar at the top of the screen.  To filter for log lines with a user field value of Demo, enter user:Demo in the search bar and press return.  Assuming there are matching logs within the current time window you should get a count of the hits and the lines will be displayed.  To make it a bit easier to see what's going on, hover over the log_details field name and click add to show this field in the main pane.  Finally, let's turn on auto-refresh of the search results so we can monitor our actions as they happen.  Click on the Auto-refresh link at the top of the page and select 5 or 10 seconds.  Now go ahead and log in to Remedy as Demo and see what happens as you work.
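The search bar isn't limited to a single field either - terms can be combined using the usual query syntax.  A couple of illustrative examples based on the fields we've created:

user:Demo AND log_type:API
rpc_queue:"Prv:390680"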

 

 

Each time the screen refreshes you should see the results updated with the latest log entries and the timeline shows the number of matching log lines in that period.  There are many filtering options available so see what you can find by using the different fields we've added to look for specific types of activity.

 

Visualizations

 

Search and filtering are helpful but not very exciting to look at, so how about some graphical representations of our data?  Click on the Visualize link in the left-hand pane and then Create a visualization to see the range of formats available.

 

 

Let's start with a Pie chart: click on the icon, select the logstash-* index and you should see this:

 

 

This is simply a count of the number of records so we need to provide some additional options to make it a bit more interesting.  At this point it's worth switching off the auto-refresh, if you have it set, to avoid the graphics being refreshed while we experiment with them.

 

Click on Split Slices, choose Significant Terms as the Aggregation and rpc_queue.keyword as the Field.  Set the Size to 10 and click the play icon at the top of the panel.

Here we can see how much of our log activity belongs to the different RPC queues in our server.

 

Change the Field and try some of the others, such as user.keyword.

 

The data used to create the graphics can be filtered just as on the Discover page and the time range can also be adjusted as required.

 

With so many different visualization types available you'll be able to view your logs in a variety of different ways, and a future post will look at how to get even more of the log line data into fields so that it can be used this way.

 

Wrapping Up

This post builds on the previous two and shows how to start using Logstash to enrich the data being collected from our Remedy server using the powerful grok filter plugin.  In future posts we'll look at using it further to get even more information from our API log lines, and then extend it to other types of logs such as SQL and FT Index.  With all of this extra data we can build even more complex graphics to help us visualize and analyse our systems.

 

Comments, suggestions and questions welcome.

 

Using the Elastic Stack with Remedy Logs - Part 1

Using the Elastic Stack with Remedy Logs - Part 2

Using the Elastic Stack with Remedy Logs - Part 4