OSSIM applied to ITIL

January 17, 2008 | Dominique Karg

Recently I stumbled across an interesting article about Microsoft, open source and ITIL in which ossim was mentioned. (The article can also be found by googling for “ossim itil microsoft” in case the link breaks.)

I’ve never been very keen on learning ITIL (although I’ve heard about it everywhere during the last year), but this really caught my attention. In that paper ossim is referenced only in the “security management” section, but I think that’s mainly because ossim was hard to install, set up and understand when that article was written. So I thought I’d give it another try from my point of view, taking the included tools into account for the different ITIL sections.

So, the goal of this article is to extend and improve on that other article, with some thoughts on how I’d approach all those ITIL recommendations from an OSSIM point of view.

The Information Technology Infrastructure Library comprises two main sets and a series of subsets (from what I’ve read in that article and on Wikipedia):

Service Support
Service Delivery

Note: The definitions after each topic have been quoted from the MS article since they’re small and concise.

The following diagram illustrates a sample support request handled according to ITIL (thanks Gabi, although there are some typos :wink:):

(Image removed, broken link, I’m very sorry. DK.)

Service Support

Incident Management

Solving incidents and restoring services quickly.

The incident manager ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:incidents~~ [no longer available] is the obvious choice for this activity from an OSSIM point of view, with a couple of details and exceptions mentioned below.

Five points are mentioned as important for incident management:

Detect and Record an incident ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:incidents#incidents_incidents~~ [no longer available] (the main incident manager)
Classify ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:incidents#incidents_types~~ [no longer available] (Incident types)
Initial Support (This requires more manual intervention, although automated urls with information could be sent to the users involved in the incident)
Investigate and Resolve ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:events~~ [no longer available] (With forensics, realtime viewers, vulnerability databases and everything logged on a central location there are plenty of tools for doing this)
Track, Monitor and Communicate ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:dashboard#dashboard_executive_panel~~ [no longer available] (Specific metrics / dashboards could be designed for this)
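
To make the five points a bit more concrete, here is a minimal Python sketch of an incident record that moves through them. Everything in it (the `Incident` class, the stage names, the `advance` method) is hypothetical and made up for illustration; it is not the ossim incident manager's data model or API. The stages loosely follow the first four points, and the `history` list stands in for "Track, Monitor and Communicate".

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stage names loosely following the points above.
# They are an assumption for this sketch, not ossim's own states.
STAGES = ["detected", "classified", "initial_support", "investigating", "resolved"]


@dataclass
class Incident:
    title: str
    incident_type: str = "unclassified"
    stage: str = STAGES[0]
    history: List[str] = field(default_factory=list)      # tracking notes
    subscribers: List[str] = field(default_factory=list)  # people to notify

    def advance(self, note: str) -> str:
        """Move to the next stage and record a note for later tracking."""
        idx = STAGES.index(self.stage)
        if idx == len(STAGES) - 1:
            raise ValueError("incident is already resolved")
        self.stage = STAGES[idx + 1]
        self.history.append(f"{self.stage}: {note}")
        return self.stage
```

A caller would create `Incident("web server down")`, call `advance(...)` once per step with a short note, and read `history` whenever a report or a dashboard metric is needed.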

Problem Management

Solving root cause problems to prevent future incidents.

Problem and Error Control (Well, this is what a SIEM is used for 90% of the time, isn’t it? Linking to the root description for overview.)
Proactive Management ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:events#events_anomalies~~ [no longer available] (Identifying problems and errors before they occur. Anomalies can be a very valuable tool for this)
Report ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:reports~~ [no longer available]

Configuration Management

Maintaining all necessary information about services, service components, and relationships.

At first I was confused and didn’t see how ossim could fit into this. The following tasks are mentioned as being important for this part. Since I don’t see how they fit, I haven’t linked them to any specific pages.

Planning
Identification
Configuration Control
Status Accounting
Verification and Audit

But after this they point out a series of important things that software would have to accomplish in order to help with this, namely:

Discover devices on a network
Determine the host OS
Determine OS version and patch levels
Determine which applications are installed
Detect any changes to the configuration

Well, this started to sound familiar. OCS ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:reports#reports_ocs_inventory~~ [no longer available], Nmap ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:tools#tools_net_scan~~ [no longer available] network inventory, the whole policy ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:policy~~ [no longer available] area, p0f, pads, arpwatch all fit perfectly in this section.
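
The last item on that list, detecting configuration changes, boils down to comparing two inventory snapshots. Below is a minimal Python sketch of that idea. The snapshot shape (hostname mapped to a package/version dictionary) and the `diff_inventory` function are assumptions made for this example; this is not how OCS or ossim actually store or compare inventory data.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: hostname -> {package name: version}.
# This is NOT the OCS or ossim data model, only an illustration.
Snapshot = Dict[str, Dict[str, str]]


def diff_inventory(old: Snapshot, new: Snapshot) -> List[Tuple[str, str, str]]:
    """Return (host, item, description) tuples for every detected change."""
    changes = []
    for host in sorted(set(old) | set(new)):
        if host not in old:
            changes.append((host, "-", "new host discovered"))
            continue
        if host not in new:
            changes.append((host, "-", "host no longer seen"))
            continue
        before, after = old[host], new[host]
        for pkg in sorted(set(before) | set(after)):
            if pkg not in before:
                changes.append((host, pkg, f"installed {after[pkg]}"))
            elif pkg not in after:
                changes.append((host, pkg, "removed"))
            elif before[pkg] != after[pkg]:
                changes.append((host, pkg, f"{before[pkg]} -> {after[pkg]}"))
    return changes
```

Each returned tuple could then be turned into an event or attached to an incident, which is roughly the role the inventory and anomaly tools mentioned above play.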

Change Management

Controlling the implementation of changes in the infrastructure.

This is an area where ossim can be greatly improved. OCS already includes some “Install software updates and configuration changes” functionality but it’s not fully integrated.

The article suggests Bcfg2, Cfengine or Webmin for this. We already considered using Webmin for the installer CD configuration, so this would be an obvious addition.

The rest of it is covered by the Incident manager ~~http://www.ossim.net/dokuwiki/doku.php?id=user_manual:incidents~~ [no longer available], reports, executive panels and OCS.

Filtering Changes
Implementing Changes
Review and Close
Report

Release Management

Controlling the rollout of new releases in the infrastructure.

Again there are things missing for this to be fully covered by ossim. Integrating Zenoss would be an option, although with tight OCS integration and some additional development this should be easy to accomplish without additional dependencies. Maybe Webmin too.

Build and Configure
Test and accept
Schedule and Plan (The incident manager would be suitable although not perfect for this)
Communicate and Prepare (Again we’d use the subscription feature for this)
Distribute and Install (As mentioned earlier ocs can be used for remote software installation)

Service Delivery

Service Level Management

Defining and implementing clear agreements for service delivery between an IT organization and its customers.

This on the other hand is something which is fully covered by ossim. Having implemented lots of metrics and measurements it is very easy to:

Define SLAs, OLAs and UCs (Also executive panel metrics, service level, vulnmeter can be used for this)
Define a service catalog (Typifying and tagging incidents)
Status accounting

Financial Management for IT Services

Ensuring the proper management, maintenance, and financial operation of IT.

This is more of a human task than an ossim one. I guess metrics could be enforced if the data related to all of this is stored somewhere but I’d need to investigate it some more.

Capacity Management

Optimizing capacity to meet service requirements at an acceptable cost.

This is a very interesting expertise area, where some parts are covered by ossim and others not so much. The article mentions Zenoss and Hyperic HQ as tools that meet the needs for this, and I guess ossim as is also meets many of the needs:

Monitoring (Including trends and forecasting due to heavy RRD usage: Nagios, Ntop, ocs, etc…)
Analysis (Most of ossim can be used for this)
Demand Management (Most of ossim can be used for this)
Modeling (Policy establishes a baseline and anomalies provide information on how this has changed over time)
Planning (Most of ossim can be used for this)

Availability Management

Ensuring the availability of IT resources to meet agreed upon service levels.

This obviously is fully covered by monitors and executive panels with metrics:

Define requirements (This involves policy, executive panel metrics, business processes, check the main descriptions)
Availability Planning (Nagios, Ntop, OCS, etc…)
Monitor Availability (Nagios, Ntop, OCS, etc…)
Monitor obligations (This involves policy, executive panel metrics, business processes, check the main descriptions)
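
As a small illustration of what "monitoring obligations" means in numbers, the sketch below turns measured downtime into an availability percentage and checks it against an agreed target. The function names and the minute-based units are assumptions for this example; in practice the downtime figure would come from a monitor such as Nagios, and the way ossim computes its own service level metric may differ.

```python
def availability_percent(period_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the period during which the service was up."""
    if period_minutes <= 0:
        raise ValueError("period must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (period_minutes - downtime_minutes) / period_minutes


def meets_sla(period_minutes: float, downtime_minutes: float,
              target_percent: float) -> bool:
    """True if measured availability reaches the agreed target (hypothetical helper)."""
    return availability_percent(period_minutes, downtime_minutes) >= target_percent
```

For a 30-day month (43200 minutes), a 99.9% target leaves roughly 43 minutes of allowed downtime, so `meets_sla(43200, 30, 99.9)` holds while `meets_sla(43200, 60, 99.9)` does not.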

IT Service Continuity Management

Defining and maintaining appropriate Disaster Recovery plans for IT.

This is also a very manual and off-ossim task, although some parts could help with it (monitoring, SLAs, etc…).

Security Management

Ensuring the proper access to services as defined by agreements and industry best practices.

This is where the article mentions ossim, although without fully expanding on what can be covered using ossim. Four main tasks are required for this:

Coordinate Security Management (Using the incident manager)
Implement Controls (This is covered by most of ossim)
Evaluate and Audit Controls (Also this)
Maintain and Monitor (Using the incident manager, executive panels, reports)

Conclusion

This article is more of an exercise in what could be done than a step-by-step guide on how to implement it. Obviously that step-by-step guide is now on my todo list, but that requires much more than the couple of hours I’ve spent writing this up.

Anyway, I hope this article gave a quick overview of how ossim can be applied to ITIL.

Remember the graph at the top and the numbers? Let’s summarize where each task would fit in ossim.

The new incident creates a log read by the agent, a response is issued by policy and a new incident inserted.
Using the incident manager we tag and typify the incident.
Next we analyze events, alarms, monitors and reports checking what could be wrong. We notice the service level and metrics at executive panel decreasing.
Through the incident manager we subscribe the people who can fix the issue to the incident, and describe the actions needed to fix it.
Another executive panel reflects to the customer what the current issue means to their business in terms of money.
Again we track everything using the incident manager and issue a patch rollout using OCS’s automatic software installation feature.
We change our inventory information based on the new automatic changes and keep track of them using the incident manager. The anomalies panel gets automatically updated once the new versions get detected.
Nagios will reflect the downtime and affect the overall service level due to this issue.
Policy, executive panels and directives get updated if needed after this new situation.
The closed incident gets reported to the customer and will show up on incident reports.

And to finish, I’d like to quote a friend’s comment about ITIL: ITIL is a series of things everybody with a little bit of common sense would do in an enterprise if he had the time, the people or the money to do it. So if it’s not being done, I bet it will take more than a series of best practices to change that :wink:

Tags: ossim, itil
