Optimize IT

9 Posts authored by: Charlie Geisler

As with many words in the English language, the word integrate (or integration) derives from the Latin integratus, meaning "to bring together or incorporate (parts) into a whole."  The word integration has been in use since the 1600s, which means that people have been integrating things for a very long time.  When I hear that two products are integrated, I generally have a positive reaction and assume that the integrated whole is better than the individual parts.  But is that really the case – and is that enough?



After years of hearing the challenges, pains and requirements of IT operations professionals, I have the impression that when people learn that something is "integrated," they assume it's also, by default, simple to use and will meet most or all of their needs.  I'm inclined to agree with them, but then there's that old adage: "don't assume anything."


Many times, integration between management products simply consists of a launch-in-context capability, where the operator can, for example, seamlessly switch from viewing a device in an incident ticket to viewing the historical performance or capacity utilization of the same device in another application.  Now it might be simple enough to swivel from one application to another, but is it worth it?  Does this integration (e.g., launch-in-context) give the operator enough information to triage and remediate the incident faster, or does he/she simply have more data?  Unfortunately, in many cases it's the latter, which is minimally useful, since it is information that enables decision making; data is simply the input.



So if integration is not enough, then what is required to make IT operations more proactive?  I suggest the combination of analytics, visibility and workflow is the key…



Analytics – is a contemporary and very popular word that is at times overused and under-defined, but its relevance to IT operations is growing.  In the context of IT operations (i.e., managing the availability, performance and capacity of the IT infrastructure), analytics equates to intelligence.  For example, intelligence applied to vast amounts of complex data – to identify patterns of behavior or correlate business metrics to infrastructure utilization – allows IT operations to distinguish normal from abnormal activity (and reduce incidents) and to understand which infrastructure resources will respond to business demand, and how.
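To make the "normal vs. abnormal" idea concrete, here is a minimal sketch (not any particular product's algorithm; the utilization numbers are hypothetical) of how a baseline learned from history can flag behavior that falls outside the expected band:

```python
import statistics

def build_baseline(samples):
    """Compute a simple mean +/- 2-sigma band from historical utilization samples."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean - 2 * stdev, mean + 2 * stdev

def is_abnormal(value, baseline):
    """A sample is abnormal if it falls outside the learned band."""
    low, high = baseline
    return value < low or value > high

# Hypothetical hourly CPU-utilization history for one server (percent)
history = [32, 35, 31, 36, 34, 33, 30, 37, 35, 34]
baseline = build_baseline(history)

print(is_abnormal(33, baseline))  # within the normal band -> False
print(is_abnormal(90, baseline))  # far outside the band -> True
```

Real behavior-learning tools account for seasonality and trends, but even this toy baseline shows how analytics turns raw samples into a decision (normal or not) rather than just more data.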



Visibility – is something IT operations has had for years in the form of availability, performance and capacity charts, graphs and reports.  But many of these views display nothing more than data and require extensive analysis to understand where to focus your efforts.  Visibility that's driven by analytics, however, provides "actionable" views that quickly identify the hot spots in the infrastructure and help prioritize efforts and speed analysis and remediation.  Having multiple levels of visibility is also a must.  Device-utilization views are no longer all that's required.  Some views, like one showing the servers above the 95th percentile (roughly two standard deviations above the mean) of CPU capacity used, are required by the majority of performance/capacity analysts.  Factor in professional preference as well as the skill and ability of the analyst, and what's required are views at all levels: infrastructure, applications, services and the data center.
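As a small illustration of such an "actionable" view (server names and utilization figures are invented), the hot-spot idea boils down to ranking a fleet and surfacing only the outliers:

```python
def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    vals = sorted(values)
    k = max(0, int(round(pct / 100 * len(vals))) - 1)
    return vals[k]

# Hypothetical current CPU utilization (percent) across a small fleet
fleet = {
    "web-01": 42, "web-02": 55, "db-01": 97, "db-02": 61,
    "app-01": 48, "app-02": 52, "app-03": 93, "etl-01": 58,
}

cutoff = percentile(fleet.values(), 95)
hot_spots = sorted(name for name, cpu in fleet.items() if cpu >= cutoff)
print(hot_spots)  # the servers an analyst should look at first
```

Instead of eyeballing eight charts, the analyst gets a short, prioritized list; at real scale (thousands of servers) that difference is what makes the view actionable.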



Workflow – is the means by which actions are taken and things get done "automatically" – driven by intelligence, so the right things get done right (and quickly) instead of the wrong things done right, or the right things done wrong.  IT workflows have been around for decades, but in my experience there are certain things IT operations can and will allow to be done automatically, such as server configuration.  Then there are actions, such as automatically adding capacity to a mission-critical server, that (rightfully paranoid) IT operations folks will only do semi-automatically at most.


We have seen an explosion in IT complexity, speed and scale with the advent of virtualization, converged infrastructure and cloud computing.  In my opinion, IT challenges are only compounding, as these new technologies are not completely displacing physical systems, which come with their own challenges.  In short, people can't keep up, and automated workflows, driven by intelligence, are what will reduce human error and increase operational efficiency.



For years IT organizations have struggled to deliver proactive IT operations.  While product integration provides efficiency benefits, it's the addition of analytics, coupled with visibility and workflows, that will propel organizations from simply improving mean time to repair (MTTR) to achieving the higher-value goal of increasing mean time between failures (MTBF).



Don’t get me wrong, I enjoy using integrated products – my iPhone being one of them.  But sometimes integration is not enough.


You may not recognize the acronym BNY, but you've likely heard of the Bank of New York Mellon (www.bnymellon.com), a worldwide investment management and investment services company headquartered in New York City.  BNY Mellon competes in the rapidly changing global financial services marketplace, where IT is a key player in supporting the business by providing technology-based solutions that enhance the business's ability to be competitive and successful.  With a global IT infrastructure consisting of multiple data centers, various hardware and software platforms, and many servers, the capacity planning group has its hands full ensuring that the appropriate amount of resources is available to handle all business, application and system requirements.




Now the art and science of capacity management is not new to Boris Gdalevich, Capacity and Performance Strategist at BNY Mellon.  With a 15-year history in the capacity management discipline, he's seen a lot of changes in IT.  As has Giuseppe Nardiello, Principal Product Manager at BMC Software, who also has a long and storied IT background.


Recently, Boris and Giuseppe teamed up to produce a podcast discussing the evolution of capacity management and how BNY Mellon has adapted its capacity management practice to keep up with increasing IT complexity and the steady decline in the number of capacity planners.  Whether you're just establishing a capacity management practice or are a seasoned professional, listen to the podcast and learn...


  • How the capacity management landscape has changed
  • Why “automation” is the key ingredient for any organization implementing a capacity management process
  • The contemporary challenges of capacity management and the approach BNY Mellon has taken
  • The benefit and business value of an automated approach

"Automated capacity management – with Boris Gdalevich & Giuseppe Nardiello" – the podcast and white paper are posted on the BMC Software web site at…



In April 2012, BMC introduced a new product in its capacity optimization line called "Moviri Integration for BMC Capacity Optimization."  This new product comes to BMC through a partnership with an Italian company, Moviri.  The folks at Moviri have many years of experience in the capacity management discipline, and many customers have asked to learn more about BMC's partnership with Moviri and the product it offers through BMC.  I recently had the opportunity to talk with Riccardo Casero, Product Manager at Moviri, and ask him a few questions about the new product and the partnership with BMC...





CG:  How did this partnership come about?

Riccardo C:  The partnership between BMC and Moviri is actually twofold, and a little bit of history explains why.  Neptuny was an Italian company, acting both as a software development firm and a consulting firm in the IT systems performance evaluation market.  In 2010, BMC acquired Neptuny's flagship product (Caplan), now called BMC Capacity Optimization (BCO), as well as the Caplan product development team and the Neptuny brand.  The company Neptuny was renamed Moviri and retained the entire consulting arm, with its many years of professional services experience in the deployment, integration, customization and end-usage of the tool.  Integrations in particular play a key role in BCO's success, due to the openness of the BCO framework, which can incorporate IT data from potentially any electronic data source.  And this is the area where Moviri helped customers the most and where they grew their biggest expertise.

So, under this scenario, there was an immediate professional services partnership between Moviri and BMC, with Moviri being the most trusted company to engage with for BCO-related projects.  This partnership has flourished since then.

Soon, as a natural evolution, BMC and Moviri saw the opportunity to make Moviri's expertise available to customers more efficiently, enabling them to purchase integration software components as full-featured and supported products.  And this is how the partnership came about.

CG:  What is Moviri providing to BMC?  What would you like customers to know?

Riccardo C:  Moviri is extending BMC Capacity Optimization's visibility into IT infrastructure utilization and performance data, enabling customers to leverage their deployed monitoring solutions to feed the BCO data warehouse.  The BCO framework already enables customers to build their own connectors to in-house platforms; with Moviri Integrations they can now get these connectors in addition to those available out of the box, with standard BMC product quality and level of support.




The current offering includes packages for IBM Tivoli Monitoring and HP Reporter.  The integrations enable the transfer of historical performance and configuration data from monitored standalone OS instances in a robust and controlled manner.  Enhancements to cover specific metrics for monitored virtualized platforms, such as IBM LPARs and Solaris Containers, are already on the roadmap.


We are also actively discussing with BMC how to further leverage Moviri's experience in the field in order to enlarge the offering, e.g., with more integrations for other monitoring and system management platforms.  So stay tuned for news.

CG:  How complex is it to create a new connector (development through testing)?

Riccardo C:  There are a lot of factors influencing the effort to build a new connector.  The first is the extent of the IT entities and metrics to be imported.  Here are two examples: there is a considerable difference in the degree of effort required to import only server network card traffic versus importing metrics from all the network devices (firewalls, routers, load balancers) recorded by a network monitoring tool; the same holds true for importing basic OS-level server utilization metrics rather than OS-specific and virtualization-aware metrics.

A second aspect is understanding the subject data and matching it to the BCO data model.  In order to have meaningful capacity reports and models in BCO, data cannot simply be transferred with its original label from one tool to the other; appropriate matching and transformation needs to be done at the data-flow level.

The third aspect is the means of integration: how, how often, and through which protocol the data is transferred.  And finally, there are general software development aspects, such as code reusability, maintainability and extensibility, that need to be addressed.
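As a toy illustration of the "matching and transformation" step Riccardo describes (the source labels, target names and unit conversions below are invented, not the actual BCO data model), a connector essentially maps each source tool's metric labels and units onto a common model before loading:

```python
# Hypothetical mapping from a source monitoring tool's metric labels to a
# common target model: (target_name, unit-conversion function) per source label.
METRIC_MAP = {
    "Proc_CPU_Pct":  ("cpu_utilization", lambda v: v / 100.0),     # percent -> fraction
    "Mem_Used_KB":   ("memory_used_bytes", lambda v: v * 1024),    # KB -> bytes
    "NIC_Bytes_Sec": ("net_throughput_bytes_per_sec", lambda v: v),
}

def transform(row):
    """Translate one source record into the target model, dropping unknown metrics."""
    out = {}
    for key, value in row.items():
        if key in METRIC_MAP:
            target_name, convert = METRIC_MAP[key]
            out[target_name] = convert(value)
    return out

source_row = {"Proc_CPU_Pct": 85, "Mem_Used_KB": 2048, "Vendor_Internal_Flag": 1}
print(transform(source_row))
```

The hard part of a real connector is not this loop but building a mapping table that is semantically correct for every entity type, which is exactly the expertise the interview is describing.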





CG:  You mentioned that Moviri connectors are supported.  Does that mean that you will enhance them in the future?

Riccardo C:  Yes, together with BMC we will continuously look for ways to add functionality that brings value to customers.  The main objective is to fully cover the set of entities imported from the integrated data source, so that BCO can take full advantage of the integration.  Scalability, robustness and performance are other areas where we plan to regularly improve.  We will also closely track new versions of the integrated data sources in order to provide the needed upgrades for available connectors.





CG:  How can customers access/buy Moviri connectors?

Riccardo C: The Moviri integration for BCO product can be purchased by customers directly through BMC.  For more information contact your BMC sales representative, or visit Moviri Integration for BMC Capacity Optimization.



I'm a big fan of low tech.  I probably shouldn't say that, given that I work for a high tech company.  Every time I see a new device with a stylus, I think: why buy a $100 pen that I'm going to lose when the virtually free one in my hand does just fine?  That said, I'm also a big fan of leveraging technology in ways where it makes sense and where the benefits vastly outweigh the costs.  When I can leverage technology to make myself or my customers more efficient, effective and thereby more productive, I'm the first one in line to check it out.


For Kalyan Kumar (aka KK), Worldwide Head of IT Consulting and Cross Tower Services at HCL Technologies, the predictive analytics in BMC ProactiveNet Performance Management is an advanced technology that makes sense.  It enables his team to manage a growing base of 100,000 servers, deployed globally across multiple customers, and deliver a higher quality of service at a lower cost.  The ability of predictive analytics to aggregate, analyze and act on massive amounts of data allows HCL to do something that is simply not humanly possible without this technology: proactively identify and repair problems before things fail.  This is simply something that cannot be done with threshold-based monitoring tools – or with a pen.


In a BMC podcast KK discusses the reasons why HCL adopted predictive analytics as well as the business and customer value derived from it...


1.  Incident reduction – HCL staff can proactively detect application failures and reduce incident volumes by 60%


2.  Integration – a single predictive analytics platform increases his staff's efficiency and effectiveness, providing a single source for global, integrated performance, event and service-impact management


3.  Actionable intelligence – reduced MTTR, lower TCO and better, more consistent customer performance through automated root cause analysis


You know, I keep hearing from industry analysts that 2012 will be the breakout year when behavior learning and predictive analytics go from the margins to the mainstream.  And I have to believe they are not too far off, given the increasing complexity and continued expansion of IT environments – both enterprise IT and cloud.


Learn more about predictive analytics with a podcast from an early adopter, HCL Technologies:


The Defense Information Systems Agency (DISA) is a US DoD combat support agency responsible for providing continuous command and control capabilities and a global enterprise infrastructure to joint warfighters, national-level leaders, and other mission and coalition partners.  For DISA, information is the greatest source of military power, and it is imperative that its customers have the information they need, when they need it and wherever they need it.


DISA is responsible for ensuring its services are accessible while protecting the network – and the information on it – from its adversaries.  DISA's Field Security Operations (FSO) has the responsibility of ensuring the strength of those systems and networks by certifying and testing them against threats, using Security Readiness Reviews (SRRs).  SRRs employ extensive scripts to audit the configuration and security posture of its systems.  Rapid identification of vulnerabilities, proper alerting and auditing, and fast, accurate remediation are critical capabilities that cannot be delivered manually at the scale the environment demands.  DISA FSO needs automation tools that can quickly find potential vulnerabilities and remediate them across a global network.


The Access Security Configuration & Control (ASCC) project – DISA's implementation of BMC's BladeLogic – has recently been accredited as an FSO SRR tool, meaning it provides the trust, auditability, accuracy and reliability to meet DISA's stringent requirements.  This puts BMC's BladeLogic on the front lines of defense for securing the US military's most critical network and information systems – and, in turn, its warfighters and its nuclear command capabilities.


Everyone knows that one size (pants, for example) doesn't fit all, nor does it fit "most."  People come in all different shapes and sizes, so while we're all similar, we're all very different.  The same is true for IT platforms.  Just like pants, no IT platform is perfect for every business application.  As your business evolves, your infrastructure will become more diverse and complex.  As good as it is, VMware is unlikely to be the only platform in your shop.


And speaking of size, how do you size your cloud, your virtualized environment, your blade servers and UNIX racks?  Getting the initial "fit" right is critical: you need enough capacity to handle your business transaction volume, but you don't want too much, and you certainly don't want too little.  With pants you have options – you can roll them up if they're too long, and you might even wear them as shorts or capris if they're too short.  But with resource capacity, too much is costly and too little can cost you even more – in terms of lost business and revenue, disgruntled customers and unhappy internal users.


So how do you get it right?  The VMware folks would like you to believe that a solution designed specifically to "fit" VMware vSphere implementations will also fit AIX, Windows, Linux and everything else.  How likely is that?  In fashion, clothes crafted to flatter runway models rarely flatter the wider range of body types in the real world.  In the IT world, getting your infrastructure "size" right requires some things such a solution just doesn't have:



  • Platform-agnosticism – no matter what servers and hardware your application spans, the solution must be able to accurately estimate capacity needs.
  • Business-awareness – the ability to match resources (server, storage, network, etc.) to real business transactions (e.g., orders per minute or trades per second) ensures that IT capacity stays continually aligned with cyclical business patterns and demands.
  • Application-visibility – an understanding of the details of application usage: who uses it, what priorities are involved, and what causes fluctuations in transaction volume for each one.
  • Prediction – since response time is the metric that matters to an end user, capacity forecasts must be able to anticipate and predict performance into the future; drawing a line from the known to the unknown won't cut it, and watching last year's trend just keeps you stuck in the past.
  • Exception detection – increased infrastructure complexity is an unfortunate consequence of virtualization and cloud computing, so you need a way to simplify managing that complexity by detecting performance and capacity exceptions automatically.
  • Continuous optimization – the dynamic, elastic nature of cloud and virtual environments is what is driving you to the cloud, and you need to be able to take advantage of it to adjust your "size," just as you tighten or loosen your belt to adjust to fluctuating weight.


What would it be like to pull on that perfect pair of pants that fits every time, no matter how your weight shifts?  To know that you will always have the capacity you need, when you need it?  Then move on past the hype to find the solution that “fits” you best.


I've been in the IT industry a little over 13 years, and even though I've not been completely around the block, I've been around long enough to know that the one thing we can all do to make our jobs a whole lot easier is to "KISS."


Years ago my father introduced me to the KISS principle, which stands for "Keep It Simple, Stupid" or "Keep It Short and Simple" – there are many variations.  At the time I thought he was calling me both stupid and a simpleton, so I wasn't exactly open to his words of wisdom.  But I soon came to learn the true meaning of the KISS principle.  Years later I still subscribe to this time-tested idea and think it's needed more than ever, especially in this fast-paced world.


And here’s why…


In the world of IT and enterprise management software, we are constantly bombarded with new technologies that promise to revolutionize the datacenter.  Two of the latest and greatest technologies today are cloud computing, which can simply be thought of as virtualization on steroids, and software as a service (SaaS), where services are hosted centrally and delivered on demand to users via the web.


Now, I've noticed during my jaunt around the block that any new technology – be it client-server, x86 virtualization, cloud computing or SaaS – comes with added IT and business risk and increased IT complexity.  One of the primary reasons for the increased complexity is the simple fact that new technology never immediately or completely replaces the old.  This leaves IT to manage multiple platforms (physical, virtual and cloud), operating systems (UNIX, Linux, Windows, iSeries and maybe even mainframe) and hypervisors (VMware, Xen, Microsoft Hyper-V, etc.).


So what's an IT person to do to make it through the day?  You're likely one of the 85.8 percent of males or 66.5 percent of females in the U.S. who work more than 40 hours per week, so your time is limited and you need to be as productive (i.e., efficiency + effectiveness = productivity) as possible.


Here’s a great example of where you can apply the “KISS” principle… 


“For any given IT project, whether you’re implementing a performance management solution, launching a new capacity management process, or expanding your IT reporting deliverables, begin by identifying what’s mission critical and what isn’t.  Start with your critical servers/apps/services that drive your business.  The other “stuff” can wait – most of the time.”


Case in point: I'm continually asked by customers, "How do I do full-blown capacity management on all of my 2,000 servers when I only have one full-time equivalent (FTE)?"  And my reply is, "You don't."  Keep it simple…


Communicate, up front, with your stakeholders that you will not be doing comprehensive capacity management (i.e., analysis, planning and reporting) on every server (physical and virtual) in your infrastructure.  It's simply not practical, possible or cost-effective when you have thousands of systems and limited resources – which is the case in virtually every organization on the planet.  The criticality of the server should define the attention it receives.



Start by identifying the mission-critical servers in your infrastructure and begin collecting performance data from them first.  Many customers collect granular (10-second sample rate) process-level metrics from critical systems and coarser (1-to-5-minute sample rate) metrics from non-mission-critical tier-2 or tier-3 systems.  Remember, there's a cost associated with collecting data in terms of storage space, managing the data, etc.  So the goal should be to collect only the data you need for the analysis, reporting and modeling of that particular system, application or service.
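The tiering idea above can be sketched as a simple policy table (a hypothetical illustration; the tier numbers and sample rates follow the text, but any real tool has its own configuration format):

```python
# Tier-based collection policy: granular, process-level sampling for
# mission-critical systems, coarser sampling for everything else.
POLICIES = {
    1: {"sample_seconds": 10,  "process_level": True},   # mission-critical
    2: {"sample_seconds": 60,  "process_level": False},  # tier-2
    3: {"sample_seconds": 300, "process_level": False},  # tier-3 and below
}

def collection_policy(tier):
    """Return the collection settings for a server tier, defaulting to the coarsest."""
    return POLICIES.get(tier, POLICIES[3])

print(collection_policy(1)["sample_seconds"])  # 10-second sampling for tier 1
print(collection_policy(3)["sample_seconds"])  # 5-minute sampling for tier 3
```

Encoding the policy once, instead of deciding per server, is itself an application of KISS: criticality drives the setting, and everything unclassified falls to the cheapest default.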


Define the server-to-application relationships – this can be done automatically, with a good discovery solution, or manually (if you have a lot of time on your hands) – so all of your activities (i.e., analysis, reporting, modeling, etc.) are in the context of an application or service (e.g., online banking) and not in IT-speak.  Bottom line: anytime you communicate your results to stakeholders, make them "business relevant" in a language the stakeholders understand (e.g., talk about "service" performance, not "server" performance).


If you want to learn more about how to apply the KISS principle to IT, read the white paper, "Align IT Operations to Business Priorities."


I swear by the KISS principle, but if you’re still skeptical don’t take my word for it, take theirs…


  • "Everything should be made as simple as possible, but no simpler." – Albert Einstein
  • "Our life is frittered away by detail... Simplify, simplify, simplify!" – Henry David Thoreau
  • "There is no greatness where there is not simplicity, goodness, and truth." – Leo Tolstoy


Want to learn more…



In the coming days, thought leaders from across BMC Software will share their knowledge, experience and random thoughts on how you can become more "proactive" when it comes to IT operations.  You'll find their posts educational, informative and, at times, humorous.  So I thought I'd start things off by defining the word "proactive" and providing a top-10 list of reasons why you should strive for proactive IT operations.  Because until you're clear on what we mean by "proactive" and understand "why" you should do it, there's no reason to learn "how" to do it.


If you look up the word proactive, you'll see that it's an adjective.  And if you remember back to grade school, adjectives modify a noun or pronoun by describing, identifying or quantifying it, and normally precede the noun they're describing.  Here's the definition of proactive…


pro·ac·tive [proh-ak-tiv]

Serving to prepare for, intervene in, or control an expected occurrence or situation, especially a negative or difficult one; tending to initiate change rather than reacting to events; anticipatory: proactive measures against crime.



The opposite of "proactive" is "reactive," which means you're responding to incidents and events – many of which have already caused service degradation or downtime.  You're continuously fighting fires, reacting to events around you, and are, generally speaking, in a defensive posture.  But in today's fast-changing, cost-conscious world, where IT complexity and risk are on the rise and end-user quality of service is key, IT operations can't simply react to events.  They must do far more than simply ensure IT availability.  It's no longer a question of whether you can afford to be proactive; it's a question of whether you can afford not to.


That said, here are the top-10 reasons for being proactive…


10. Increase Agility – speed IT's responsiveness to business demand

9.  Reduce Costs – eliminate costly and risky reactive processes

8.  Reduce the Number of IT Events – focus on "real" problems

7.  Mitigate Risk – detect problems earlier and faster

6.  Improve Service Availability, Performance and Delivery – increase QoS

5.  Speed MTTR – repair problems quicker

4.  Increase MTBF – avoid many problems altogether

3.  Optimize IT Resources – continually align IT capacity with business demand

2.  Increase Efficiency – automate manual, time-intensive triage and remediation processes

1.  Everyone else is doing it, so you might as well too!



Want to learn more...

Read the thought leadership white paper on proactive operations, "Proactive Operations for the Modern Datacenter."

"The Chaos Principle"

Posted by Charlie Geisler, Apr 22, 2011

In this day and age of rapid technological advancement, you can automate just about anything – paying your bills, renewing subscriptions, even mowing the lawn (by having your son or daughter do it for you).  But for all of this automation to work seamlessly requires intelligence and a bit of work on your part.  You'll need to know at all times that you have enough money in your checking account to cover both your expected and unexpected bills.  Otherwise you'll find yourself in a chaotic situation where your withdrawals exceed your deposits.  Your bank will be the only winner in this scenario, and you'll wind up paying multiple fees and penalties and wondering how this could happen to an intelligent person like yourself.


Automating the complexities of modern-day life is not dissimilar to automating IT.  Take technological innovations such as virtualization and cloud computing.  These, along with advancements in IT management software, are transforming IT organizations – enabling them to be more efficient, flexible, agile and automated.  Nowadays you can achieve remarkable results by automating what were once manual processes, such as provisioning, compliance and configuration management.  But what I find very interesting is that even with automation and virtualization technologies, customers are still challenged with…


  • Physical and/or virtual machine (VM) sprawl – which results in hundreds or even thousands of underutilized systems and VMs, many of which are not compliant with operational policies
  • Virtualizing mission-critical applications/services – which is difficult mainly due to the inability to translate workload requirements from the physical world to the virtual world
  • Over-provisioning – which has led many IT organizations to fall short of the VM density and cost savings they expected; with limited policies in place, inconsistent VM images are also the norm


Now, it's true that virtualization and automation technologies have built-in intelligence (e.g., resource schedulers and runtime policies), but in many cases IT operations professionals require an added degree of intelligence to address these new challenges and tame the rapidly increasing complexity of today's hybrid datacenters.


IT management software vendors have answered the call by integrating their performance and capacity management solutions with configuration management.  This provides an additional level of intelligence that prevents the "chaos" and enables IT to right-size its infrastructure, increase VM densities, forecast future capacity requirements and proactively prevent performance disruptions.  It also provides greater insight into how scheduled and unscheduled changes will impact performance, gives performance analysts more precise root cause analysis and sets the stage for automated remediation of non-compliant changes.


As anyone who has implemented a virtual or cloud infrastructure knows, even the most innocent modification to a virtual host or VM can have a significant negative impact not only on the intended object, but on other objects in the shared infrastructure.  Integrating performance and capacity management with configuration automation is not only key to preventing costly performance degradation and downtime; it will also help you overcome the fear of virtualizing mission-critical applications and enable you to achieve the efficiency and cost savings you require.


Now, on a personal note: as a father of two young girls, ages 9 and 11, I wholeheartedly support process integration and understand the importance of continuous capacity management and proactive performance management as a means to minimize the chaos and meet or exceed their high service-level expectations.  I realize that as they get older they'll want to spend less time with me and more time with their friends.  I only have a limited amount of free capacity, so I spend as much time as I can with them now, knowing that in the not-too-distant future it will be very uncool to hang out with dad.  I also take a proactive approach to managing their happiness.  Preteen girls can be challenging at times, so I watch leading indicators (e.g., how long it's been since they last played the Wii) to anticipate their needs, and I try to correct issues before they're unhappy.  As far as automation goes, I do use auto bill pay, and I regularly check my balance to ensure that I have just enough money in my checking account to cover the bills.  But I'm not sure I want to automate the process of mowing the lawn by having my 11-year-old daughter do it.  I'll stick to doing that myself.
