Comparing Investigations

Cyvest provides tools to compare two investigations and identify differences in checks, observables, and threat intelligence. This is useful for regression testing, validating detection rules, and tracking changes between investigation runs.


Basic Comparison

Compare two investigations using compare_investigations():

from decimal import Decimal
from cyvest import Cyvest, compare_investigations

# Create expected (baseline) investigation
expected = Cyvest(investigation_name="expected")
expected.check_create("domain-check", "Verify domain", score=Decimal("1.0"))

# Create actual investigation
actual = Cyvest(investigation_name="actual")
actual.check_create("domain-check", "Verify domain", score=Decimal("2.0"))
actual.check_create("new-check", "New detection", score=Decimal("1.5"))

# Compare
diffs = compare_investigations(actual, expected)

The function returns a list of DiffItem objects, one per difference, each with one of three statuses:

Status     Symbol   Description
ADDED      +        Check exists in actual but not in expected
REMOVED    -        Check exists in expected but not in actual
MISMATCH   ✗        Check exists in both but score/level differs
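
The result can be inspected directly in code. A minimal sketch, assuming DiffStatus is importable from the top-level cyvest package:

from cyvest import DiffStatus

for diff in diffs:
    if diff.status == DiffStatus.ADDED:
        print(f"+ {diff.key}: new check scored {diff.actual_score}")
    elif diff.status == DiffStatus.REMOVED:
        print(f"- {diff.key}: check missing from actual")
    else:  # DiffStatus.MISMATCH
        print(f"✗ {diff.key}: expected {diff.expected_score}, got {diff.actual_score}")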

Tolerance Rules

Use ExpectedResult rules to define acceptable score variations. When the actual score satisfies a tolerance rule, the difference is not flagged.

from cyvest import ExpectedResult, Level

rules = [
    # Accept any score >= 1.0 for this check
    ExpectedResult(check_name="domain-check", score=">= 1.0"),

    # Accept any score < 3.0 for roger-ai
    ExpectedResult(key="chk:roger-ai", level=Level.SUSPICIOUS, score="< 3.0"),

    # Exact match required
    ExpectedResult(check_name="critical-check", score="== 5.0"),
]

diffs = compare_investigations(actual, expected, result_expected=rules)

Supported Operators

Operator   Description
>=         Greater than or equal
<=         Less than or equal
>          Greater than
<          Less than
==         Equal
!=         Not equal
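
The remaining operators combine the same way. A short illustration (the check names here are hypothetical):

rules = [
    # Fail unless the score moved away from zero
    ExpectedResult(check_name="volatile-check", score="!= 0.0"),

    # Accept anything at or below 4.5
    ExpectedResult(check_name="capped-check", score="<= 4.5"),
]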

ExpectedResult Fields

Field        Required                    Description
check_name   One of check_name or key    Check name (the key will be derived)
key          One of check_name or key    Full check key (e.g., chk:my-check)
level        No                          Expected level (informational)
score        No                          Tolerance rule string
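
For instance, given the chk: prefix shown above, these two rules should target the same check:

# Equivalent ways to reference the same check
ExpectedResult(check_name="domain-check", score=">= 1.0")
ExpectedResult(key="chk:domain-check", score=">= 1.0")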

Displaying Differences

Use display_diff() to render differences as a rich table:

from cyvest.io_rich import display_diff
from logurich import logger

display_diff(diffs, lambda r: logger.rich("INFO", r), title="Investigation Diff")

Output:

╭────────────────────────────────────────────────┬────────────────────┬─────────────────┬────────╮
│ Key                                            │      Expected      │     Actual      │ Status │
├────────────────────────────────────────────────┼────────────────────┼─────────────────┼────────┤
│ chk:new-check                                  │         -          │  NOTABLE 1.50   │   +    │
│ └── domain: example.com                        │         -          │   INFO 0.00     │        │
│     └── VirusTotal                             │         -          │   INFO 0.00     │        │
├────────────────────────────────────────────────┼────────────────────┼─────────────────┼────────┤
│ chk:domain-check                               │   NOTABLE 1.00     │  NOTABLE 2.00   │   ✗    │
╰────────────────────────────────────────────────┴────────────────────┴─────────────────┴────────╯

The table shows:

  • Key: Check key with linked observables and threat intel as a tree
  • Expected: Level and score from expected investigation (or tolerance rule)
  • Actual: Level and score from actual investigation
  • Status: Diff status symbol

Convenience Methods

The same operations are available as methods on Cyvest objects:

# Get diff items
diffs = actual.compare(expected=expected, result_expected=rules)

# Compare and display in one call
actual.display_diff(expected=expected, title="My Investigation Diff")

DiffItem Structure

Each DiffItem contains:

class DiffItem:
    status: DiffStatus          # ADDED, REMOVED, or MISMATCH
    key: str                    # Check key
    check_name: str             # Check name
    expected_level: Level       # Expected level
    expected_score: Decimal     # Expected score
    expected_score_rule: str    # Tolerance rule (if any)
    actual_level: Level         # Actual level
    actual_score: Decimal       # Actual score
    observable_diffs: list      # Linked observable differences

ObservableDiff

For each linked observable:

class ObservableDiff:
    observable_key: str
    obs_type: str
    value: str
    expected_score: Decimal
    expected_level: Level
    actual_score: Decimal
    actual_level: Level
    threat_intel_diffs: list    # Threat intel differences

ThreatIntelDiff

For each threat intel source:

class ThreatIntelDiff:
    source: str
    expected_score: Decimal
    expected_level: Level
    actual_score: Decimal
    actual_level: Level
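
Since the three classes nest (a DiffItem holds ObservableDiffs, which hold ThreatIntelDiffs), a diff list can be walked as a tree. A minimal sketch using only the fields documented above:

for diff in diffs:
    print(f"{diff.status} {diff.key}")
    for obs in diff.observable_diffs:
        print(f"  {obs.obs_type} {obs.value}: "
              f"{obs.expected_score} -> {obs.actual_score}")
        for ti in obs.threat_intel_diffs:
            print(f"    {ti.source}: {ti.expected_level} -> {ti.actual_level}")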

Use Cases

Regression Testing

Compare investigation outputs before and after rule changes:

# Run investigation with old rules
old_cv = run_investigation(email, rules_v1)
old_cv.io_save_json("baseline.json")

# Run investigation with new rules
new_cv = run_investigation(email, rules_v2)

# Compare
baseline = Cyvest.io_load_json("baseline.json")
diffs = compare_investigations(new_cv, baseline)

if diffs:
    print(f"Found {len(diffs)} differences")
    display_diff(diffs, print, title="Rule Changes Impact")

Validation with Tolerance

Define expected outcomes with acceptable variations:

# Expected results for test case
expected_results = [
    ExpectedResult(check_name="spam-score", score=">= 5.0"),
    ExpectedResult(check_name="phishing-score", score=">= 7.0"),
    ExpectedResult(check_name="ai-analysis", score=">= 1.0"),  # Allow variation
]

diffs = compare_investigations(actual, expected, result_expected=expected_results)

# Assert no unexpected differences
assert len(diffs) == 0, f"Unexpected differences: {[d.key for d in diffs]}"
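
The same assertion slots into a pytest regression suite. A sketch, reusing the assumed run_investigation helper and inputs from the regression-testing example above:

def test_detection_rules_within_tolerance():
    # Run the investigation under test and load the saved baseline
    actual = run_investigation(email, rules_v2)
    baseline = Cyvest.io_load_json("baseline.json")

    # Any diff not covered by a tolerance rule fails the test
    diffs = compare_investigations(actual, baseline, result_expected=expected_results)
    assert not diffs, f"Unexpected differences: {[d.key for d in diffs]}"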

Example

See examples/06_compare_investigations.py for a complete example demonstrating:

  • Creating expected and actual investigations
  • Comparing without tolerance rules
  • Comparing with tolerance rules
  • Using convenience methods

Run it with:

python examples/06_compare_investigations.py