Activate Content Development Best Practices

What Packages Are Currently Being Developed?

See the Activate Development page for the current list of projects, who may be working on them, and wishlist items.

Resource Naming Conventions

Resource Naming Conventions Best Practices

Events

Event Types

There are four types of events recognized by ArcSight ESM:
  1. Action
  2. Aggregated
  3. Base
  4. Correlation

Base Events

These are what pretty much everyone thinks of as events. These events come from connectors, and are the basic event type.

Aggregated Events

These events are really a type of base event. These events come from SmartConnectors, and are the result of the aggregation settings of the connector. An aggregated event represents a collection of more than one base event. The field, aggregatedEventCount, tells you how many base events are represented by an aggregated event. The aggregation settings of a connector will determine how often and under what conditions base events will be combined into aggregated events. An optimal aggregation setting will not result in the loss of any important data.

Correlation Events

There are three sources for correlation events. The best known is the event resulting from a rule triggering. The second source is from data monitors. For example, a moving average data monitor event stating that a threshold has been traversed is a correlation event (an active list event that says an entry was modified is an audit event, which is actually a base event). The third source of correlation events would be other SIMs that are sending information into ArcSight ESM.

Correlated Events

Correlated events are NOT an event type. This is a property of any type of event. Correlated events are events that were used by a rule to trigger that rule, resulting in a correlation event. For example, if a rule looked for 5 events within 1 minute for the same source attempting to connect to the same destination on port 139, and the rule triggered, the result of the rule triggering is the correlation event, and the five (base, aggregated, action, or correlation) events that caused the rule to trigger are marked as correlated. Think of it this way: A correlated event was used to create a correlation event. Correlation events can be correlated, that is, a rule can use correlation events from other rules to trigger.
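As a rough illustration of the difference between correlated events and correlation events, the sketch below models the port 139 example in Python. The Event class, its field names, and the run_rule function are hypothetical and are not part of ArcSight ESM; the real rules engine handles thresholds and time windows in its own way. The point is only that the rule's output is a new correlation event, while the events that fed the rule are flagged as correlated.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float            # seconds since some arbitrary start
    source: str
    destination: str
    port: int
    type: str = "Base"     # Action, Aggregated, Base, or Correlation
    correlated: bool = False


def run_rule(events, threshold=5, window=60.0, port=139):
    """Return the correlation events produced; mark the triggering events."""
    correlation_events = []
    buckets = {}  # (source, destination) -> recent matching events
    for ev in sorted(events, key=lambda e: e.time):
        if ev.port != port:
            continue
        bucket = buckets.setdefault((ev.source, ev.destination), [])
        bucket.append(ev)
        # Keep only the events that are still inside the time window.
        bucket[:] = [e for e in bucket if ev.time - e.time <= window]
        if len(bucket) >= threshold:
            for e in bucket:
                e.correlated = True  # a property, not an event type
            correlation_events.append(
                Event(ev.time, ev.source, ev.destination, port, type="Correlation")
            )
            bucket.clear()
    return correlation_events


events = [Event(t, "10.0.0.1", "10.0.0.2", 139) for t in (0, 10, 20, 30, 40)]
fired = run_rule(events)
```

Here the five input events keep their own type (Base) but end up with correlated set, and fired holds a single event whose type is Correlation.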

Action Events

These events are generated by the rules engine when a rule action is completed. For example, a rule that modifies an active list can generate an action event.

Event Types in Conditions

This applies to any resource that has conditions, so it is covered in this section rather than being repeated in the sections focusing on rules, filters, queries, etc.

Rules can "accidentally" trigger off of their own correlation events, or off of correlation events from other rules, which may themselves consume another rule's correlation events. This is called looping, and it is bad. For this reason, many people insert one of the following conditions in their rules:

Type = Base

This is bad, because it explicitly ignores aggregated events.

Type != Correlation

This is better, but allows action events (although this is more of a theoretical problem, it could still happen). Action events are generally not all that useful for security use cases. They are mostly useful for rules engine/system health monitoring use cases.

Optimally, this is a good condition:

Type IN (Aggregated,Base)

This is the equivalent OR operator:

Type = Aggregated OR Type = Base
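To make the difference between these conditions concrete, the following sketch lists which of the four event types each condition lets through. It is plain Python rather than ArcSight condition syntax, and the CONDITIONS table and admitted_types helper are illustrative names only.

```python
EVENT_TYPES = ("Action", "Aggregated", "Base", "Correlation")

# Each condition from the text, written as a Python predicate on the type.
CONDITIONS = {
    "Type = Base": lambda t: t == "Base",
    "Type != Correlation": lambda t: t != "Correlation",
    "Type IN (Aggregated,Base)": lambda t: t in ("Aggregated", "Base"),
}


def admitted_types(condition):
    """Return the event types that satisfy the named condition."""
    return [t for t in EVENT_TYPES if CONDITIONS[condition](t)]


for name in CONDITIONS:
    print(f"{name:27} -> {admitted_types(name)}")
```

The first condition drops aggregated events, the second still admits action events, and only the third admits exactly the aggregated and base events.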

Derived Event Fields

There are many fields that are not actually stored in a database, but are derived from one or more objects in the database. Take zones, for example. There are several zone field collections, e.g., source, destination, agent, device, etc. The fields available for any zone resource in the ArcSight CCE are (leaving out the group name):
  • zone (resource)
  • zone external ID
  • zone ID
  • zone name
  • zone URI
The zone (resource) is actually a resource reference pointing to the zone object (more or less). There is some confusing information implied by the CCE. For example, in a resource's CCE you see <group> zone, but in the rule editor's aggregation tab, you see both the <group> zone and the <group> zone resource. The resource reference is stored in the DB, or in the event, and the UI automatically breaks it up for you when you view it in the event inspector or when viewing it in a data display (trends, active|session lists, etc.). So, the external ID, ID, name, and URI fields are derived fields. This means that when you are using them in a resource's conditions, you are adding extra processing for each item (event, list entry, etc.) for each condition check. This can be especially true of the request URL group of fields (not all the request URL fields are derived), which can be significantly large strings to parse out.

An easy way to determine whether a field is derived is by looking at it in a rule editor's aggregation settings tab. If you choose the "Add..." button, the fields that are italicized are derived. This is mostly accurate, but there are a few notable exceptions. The most notable exceptions are the attacker and target fields.

Attacker and Target

These fields are completely derived from the source and destination field groups, based on the root event field, Originator. The originator field has two possible values, "Source" and "Destination", which tells the system how to derive the attacker and target fields. If the originator field is "Source", then the attacker fields are a copy of the source fields and the target fields are a copy of the destination fields. If the originator field is "Destination", then the attacker fields are a copy of destination fields and the target fields are a copy of the source fields. The latter case is for events where a device reports the source as the originator of a communication, but the destination is considered malicious. For example, if you click on a link to a malicious web server, your system is the source of the communication, but the destination web server is the attacker, and therefore, your source system is the target.
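The derivation described above can be sketched in a few lines of Python. The event is modeled here as a plain dictionary with hypothetical keys (originator, source, destination); this is not how ESM stores events, only a way to show the swap.

```python
def derive_attacker_target(event):
    """Return (attacker, target) field groups based on the originator field."""
    originator = event["originator"]
    if originator == "Source":
        # Attacker is a copy of source; target is a copy of destination.
        return dict(event["source"]), dict(event["destination"])
    if originator == "Destination":
        # Reversed: the destination is the attacker, the source is the target.
        return dict(event["destination"]), dict(event["source"])
    raise ValueError(f"unexpected originator value: {originator!r}")


# A user clicks a link to a malicious web server.
click = {
    "originator": "Destination",
    "source": {"address": "10.0.0.5"},
    "destination": {"address": "203.0.113.7"},
}
attacker, target = derive_attacker_target(click)
```

In this example the attacker address is 203.0.113.7 (the web server) and the target address is 10.0.0.5 (the user's system), even though the user's system was the source of the communication.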

There has been a long-standing "tradition" of using the attacker and target fields for everything. This has always been controversial, and we consider it to be incorrect for a couple of reasons:
  • Use of attacker/target implies intent, and honestly, most events should not be related to any attack
  • The attacker and target fields are derived, therefore use of them, especially in rule and data monitor conditions, introduces processing overhead

ArcSight's research of events, for categorization purposes, reveals that the originator field is set to "Destination" in a very small fraction of the events we have seen, less than 1%. For this reason, in most cases, use of the source and destination fields is recommended, unless you are writing content for a use case that specifically addresses attacks. Most Activate content does this, the notable exception being the L3 Impact and Threat Analysis packages.

Event Types and Product Packages

We have discovered that including this condition in the basic filter for a given product can eliminate some problems and provide some potential efficiencies:

Type IN (Aggregated,Base)

By putting this in the base filter of a product package, any rule within that product package that uses a filter derived from that base filter does not need to use any of the anti-looping rule techniques. This requires a bit of developer discipline, because the ideal product package does not have any rules (just filters...), but there are cases where a product package might require rules. An example of this is auditd user events for Linux. The auditd user events tend to use the user ID, and not the user name. For example, the root user ID is 0, so most auditd events that reference the root user populate the user ID field with 0, and the word 'root' is not found within the event. The P-Linux product package uses a strategy to map the user IDs to the user names, and therefore the rules in the L1-User Monitoring - Indicators and Warnings package will need to consider correlation events, in addition to base and aggregated events.

Filters Best Practices

Filters can be used in active channels, rules, queries, data monitors, etc. In general, when in doubt about how to write a filter that may be used in multiple resource types, optimize it for rules.

The operator: InActiveList – filters that use this condition cannot be used in Active Channels.

Try to avoid using the active lists operator (InActiveList) in your filters. Doing this makes them unusable in an active channel. If you keep it in the rule, and out of the filter, it becomes easier to create active channels for verification of your rule. If you really need to reference an active list and want to use the same filter in an active channel, you can create a variable, using the function GetActiveListValue, instead.

The function: Arithmetic | JavaMathematicalExpression (JME) can be used in rules, but not in queries.

Operators: IN vs OR

Optimally, this is a good condition:

Type IN (Aggregated,Base)

You might hear some consultants state that the IN operator is better than the equivalent OR operator:

Type = Aggregated OR Type = Base

However, there is no actual proof of this; it is subjective. In reality, both the ArcSight ESM correlation engine (rules and data monitors) and the MySQL query optimizer convert the IN operator into a series of OR statements. This is what the back-end code does, and we cannot (or will not!) change it. Any preference for the IN operator or the OR operator is really a matter of style, readability, and maintainability. In some cases, the IN operator statements are easier to read, while in others, OR statements can be easier to read. It all depends on the number of values, the ordering, and the developer's preference.
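The logical equivalence can be shown with a small Python sketch. It does not reproduce the actual back-end conversion, which is internal to ESM and MySQL; expand_in, matches_in, and matches_or are illustrative helpers only.

```python
def expand_in(field, values):
    """Rewrite `field IN (values...)` as a chain of OR comparisons."""
    return " OR ".join(f"{field} = {value}" for value in values)


def matches_in(value, values):
    """IN-style check: membership test against the whole list."""
    return value in values


def matches_or(value, values):
    """OR-style check: one equality test per listed value."""
    return any(value == candidate for candidate in values)


print(expand_in("Type", ["Aggregated", "Base"]))
# Type = Aggregated OR Type = Base
```

For any value and any list of candidates, matches_in and matches_or return the same answer, which is why the choice between the two forms comes down to readability.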

Short-Circuit Evaluation

Filters and conditions do what is called “Short-Circuit Evaluation.” This means that in a simple AND condition, if the first element is false, it doesn’t matter what the second element is, because both must be true for the AND condition to be true.

A similar circumstance applies to an OR condition: if the first element is true, it doesn’t matter what the second element is, because only one element needs to be true for the OR condition to be true. This means you can put the simplest, or cheapest, condition to evaluate higher up.
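Python's own and operator short-circuits in the same way, so the effect can be observed by counting how often each condition is actually evaluated. The evaluate function, the event dictionaries, and the hostile address set below are made up for illustration; they stand in for a cheap type check followed by a more expensive list lookup.

```python
def evaluate(events, hostile):
    """Apply `type check AND list lookup`, counting calls to each condition."""
    calls = {"type_check": 0, "list_lookup": 0}

    def type_check(event):
        calls["type_check"] += 1
        return event["type"] == "Base"

    def list_lookup(event):
        calls["list_lookup"] += 1
        return event["target"] in hostile

    # `and` skips list_lookup whenever type_check is already false.
    matched = [e for e in events if type_check(e) and list_lookup(e)]
    return matched, calls


sample = [
    {"type": "Correlation", "target": "203.0.113.7"},
    {"type": "Base", "target": "203.0.113.7"},
    {"type": "Base", "target": "198.51.100.1"},
]
matched, calls = evaluate(sample, hostile={"203.0.113.7"})
print(calls)  # {'type_check': 3, 'list_lookup': 2}
```

The type check runs for all three events, but the list lookup runs only for the two events that passed it, because evaluation stops at the first false element of the AND.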

This means that you should balance your conditions between two concepts: the cost of evaluating a given operator, and the percentage of events that can be eliminated by a given operator.

For example, the Type field is an enumerated field. This means that the data type is actually numeric, but the application displays it for you as a string (action, aggregated, base, and correlation). This, of course, does not apply to events in CEF or in Logger, as the data is stored as a string. Evaluating the Type field is cheaper than evaluating the Device Product field, a string. However, if you are looking for Base events, or more correctly, non-Correlation events, 60-90 percent of events in most installations are base events, whereas for a given product, a much smaller percentage of events are generated by that product. If your second condition can eliminate a larger percentage of potential events than your first condition, swap them.

The theory goes like this:
  • Condition A may take 2 time units to evaluate
  • Condition B may take 3 time units to evaluate
  • Condition A may match 90% of the potential events
  • Condition B may match 50% of the potential events
Given a thousand events, the timings for the ordering of Condition A AND Condition B are:

Evaluation of Condition A first results in a minimum of 2,000 time units per thousand events, regardless of how many events actually match. Evaluation of Condition B first results in a minimum of 3,000 time units per thousand events, regardless of how many events actually match.

Evaluating Condition A first means that Condition B will be evaluated 900 times, adding an additional 2,700 time units in processing the thousand events, for a total of 4,700 time units to process this rule per thousand events.

Evaluating Condition B first means that Condition A will be evaluated 500 times, adding an additional 1,000 time units in processing the thousand events, for a total of 4,000 time units to process this rule per thousand events, a potential savings of 700 time units per thousand evaluated events.
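The arithmetic above can be reproduced with a short Python function. This is a simplified cost model, assuming the second condition is evaluated only for events that matched the first and that the match percentages are independent of ordering; and_cost and the tuple layout are illustrative choices.

```python
def and_cost(events, first, second):
    """Total time units to evaluate `first AND second` over `events` events.

    Each condition is a (cost_per_evaluation, match_percent) pair. The second
    condition is evaluated only for events that matched the first.
    """
    first_cost, first_percent = first
    second_cost, _ = second
    survivors = events * first_percent // 100
    return events * first_cost + survivors * second_cost


CONDITION_A = (2, 90)  # 2 time units, matches 90% of events
CONDITION_B = (3, 50)  # 3 time units, matches 50% of events

a_first = and_cost(1000, CONDITION_A, CONDITION_B)  # 2,000 + 900 * 3
b_first = and_cost(1000, CONDITION_B, CONDITION_A)  # 3,000 + 500 * 2
print(a_first, b_first, a_first - b_first)  # 4700 4000 700
```

Plugging in your own cost and match estimates shows whether swapping two conditions is likely to pay off.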

Therefore, put “Device Product = ArcSight” before you put “Type = Correlation” in the general case. “Target Address in the Hostile List” should go further down the AND or OR list, since active list lookups are more computationally expensive than simple field checks. [1]

This applies to filters and conditions in rules, data monitors, and queries.

Fields and Variables

There are two classes of variables, global and local. A global variable is a user-definable field that can be used by other resources. Local variables are like fields, but not exactly, and are only useful in the resource in which they are defined.

Lists Best Practices

See the Active List Best Practices page for details.

See the Session List Best Practices page for details.

Rule Best Practices

See the Rule Best Practices page for details.

Model Categories

network, asset, system, default, application, physical

[Figure: models.png]

Packaging

Before you start building any packages for Activate, please do the following:

Update ESM

Update your ESM’s server.properties to export packages in resource ID order.

#Change package archive sort order (server.properties):
export.archive.reference.sort.order=id

This will make the XML bits of the package sort the resources by their resource ID, rather than their resource URI. This makes it easier for us to do diffs, etc., when merging versions for release.

Once you update the server.properties file, you will need to restart your manager.
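Before restarting, it can be worth confirming that the property is actually set. The following is a minimal sketch of such a check in Python. It uses a deliberately simplified reading of the properties format (key=value lines, with # or ! comments, last definition wins) and does not handle continuation lines or other separators; get_property and exports_in_id_order are hypothetical helper names.

```python
SORT_KEY = "export.archive.reference.sort.order"


def get_property(text, key):
    """Return the last value assigned to `key` in properties text, or None."""
    found = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            found = value.strip()
    return found


def exports_in_id_order(text):
    """True if the package export sort order is set to resource ID."""
    return get_property(text, SORT_KEY) == "id"


sample = """\
#Change package archive sort order (server.properties):
export.archive.reference.sort.order=id
"""
print(exports_in_id_order(sample))  # True
```

To check a real installation, read the contents of your server.properties file into a string and pass it to exports_in_id_order.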

Activate Package Development Best Practice

Always start your package by using the appropriate Activate Template package. Also, if you are building an L1 package with shared filters, you should build a customizations package to preserve product package configurations when your L1 package is updated.

The Activate package templates will set up the initial package so that unintended resources cannot accidentally be included in your package.

Package Format and Other Useful Information

When archiving resources, use the "export" format (the exception is users, which use the “exportuser” format). Also, always check the "exclude reference IDs" checkbox.

In addition to the users exception, you might find yourself needing to use a pre-populated active list, where you put specific values in a list for lookup or checking in various conditions. The export format means that no active or session list data will be included in the package. To get around this, split the content across two packages:

The first package uses the export format (no list data included).

The second package uses the default format (list data included).

The first package requires the second package.

Exclude all active lists that are to have pre-populated (static) data entries from the first package.

Explicitly include individual active lists that are to have pre-populated data entries in the second package.

When you export the first package, be sure to include the second package in the .arb file.

The pre-populated active lists should have their TTLs set to 0.

Clean Packages (no extra resources included unnecessarily)

By default, new packages will exclude most of the /All <resource_type>/ArcSight System/… level content. The current exception to this is /All Fields/ArcSight System/Event Fields. You can safely add this to the resource exclusions of the package. NOTE: You do not need to worry about this if you use an Activate Template package.

Steps to follow for removing the event fields from a package:

1. In the package editor, select the Resources tab.

2. Sort the Removed Resource Column.

3. Add /All Fields/ArcSight System/Event Fields.

4. Check the If Not Included checkbox.

This will avoid errors when the package is uninstalled due to the fields being in a locked group.

Cross Package Contamination

For long-term maintenance of packages, make sure that resources only show up in one, and only one, package. If you have a resource in two packages, and modify that resource and export one of them, then the old version of that resource is in the second package. If you install them in the wrong order, your resources won’t be what you expect or need.

Make sure that you're not including additional resources you don't need to export or import. Do not allow resources from Activate Base, or any other Activate packages, into your packages. NOTE: You do not need to worry about this if you use an Activate Template package. If you aren't using one of the Activate package templates, you can update your resource exclusions to match the templates. Also, make sure that you do not explicitly include resources from other packages, and avoid linking resources from other packages into your resource tree structure.
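One way to check for cross-package contamination is to compare the resource lists of your packages and report anything that appears more than once. The sketch below assumes you have already collected the resource URIs for each package by some means (for example, copied from the package editor); the shared_resources function, the package names, and the URIs are hypothetical.

```python
from collections import defaultdict


def shared_resources(packages):
    """Map each resource URI found in more than one package to its packages.

    `packages` is a dict of package name -> iterable of resource URIs.
    """
    owners = defaultdict(set)
    for package, resources in packages.items():
        for uri in resources:
            owners[uri].add(package)
    return {uri: sorted(names) for uri, names in owners.items() if len(names) > 1}


# Hypothetical package contents, for illustration only.
packages = {
    "Package One": ["/All Filters/Example/Filter A", "/All Rules/Example/Rule 1"],
    "Package Two": ["/All Filters/Example/Filter A", "/All Rules/Example/Rule 2"],
}
print(shared_resources(packages))
# {'/All Filters/Example/Filter A': ['Package One', 'Package Two']}
```

An empty result means every listed resource belongs to exactly one package.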

Content and Package Testing

Before you submit your package for others to use, you should test your content packaging for some very basic conditions. We have provided two testing tools to help with this. These tools make sure that your content can be easily installed and uninstalled without affecting other content on the system, and they help you identify resources that may be in a bad location (e.g., a user's personal group).

Package Installation Scripts

We have also provided a tool to automatically generate installation scripts for installing your packages. See the Activate Installation Script Generator Tool documentation for details.

-- GeorgeBoitano - 21 Jan 2016

Topic revision: r20 - 04 Apr 2018, PrenticeHayes


 


Activate Wiki 2.1.0.0
