Comparing Investigations

Cyvest provides tools to compare two investigations and identify differences in checks, observables, and threat intelligence. This is useful for regression testing, validating detection rules, and tracking changes between investigation runs.


Basic Comparison

Compare two investigations using compare_investigations():

from decimal import Decimal
from cyvest import Cyvest, compare_investigations

# Create expected (baseline) investigation
expected = Cyvest(investigation_name="expected")
expected.check_create("domain-check", "Verify domain", score=Decimal("1.0"))

# Create actual investigation
actual = Cyvest(investigation_name="actual")
actual.check_create("domain-check", "Verify domain", score=Decimal("2.0"))
actual.check_create("new-check", "New detection", score=Decimal("1.5"))

# Compare
diffs = compare_investigations(actual, expected)

The function returns a list of DiffItem objects, one per difference, each with one of three statuses:

Status     Symbol   Description
ADDED      +        Check exists in actual but not in expected
REMOVED    -        Check exists in expected but not in actual
MISMATCH   ✗        Check exists in both but score/level differs
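
The result can be inspected directly in code. A minimal sketch, assuming DiffStatus is importable from the top-level cyvest package:

from cyvest import DiffStatus

for diff in diffs:
    if diff.status == DiffStatus.ADDED:
        print(f"+ {diff.key}: new check scored {diff.actual_score}")
    elif diff.status == DiffStatus.REMOVED:
        print(f"- {diff.key}: check missing from actual")
    else:  # DiffStatus.MISMATCH
        print(f"✗ {diff.key}: expected {diff.expected_score}, got {diff.actual_score}")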

Tolerance Rules

Use ExpectedResult rules to define acceptable score variations. When the actual score satisfies a tolerance rule, the difference is not flagged.

from cyvest import ExpectedResult, Level

rules = [
    # Accept any score >= 1.0 for this check
    ExpectedResult(check_name="domain-check", score=">= 1.0"),

    # Accept any score < 3.0 for roger-ai
    ExpectedResult(key="chk:roger-ai", level=Level.SUSPICIOUS, score="< 3.0"),

    # Exact match required
    ExpectedResult(check_name="critical-check", score="== 5.0"),
]

diffs = compare_investigations(actual, expected, result_expected=rules)

Supported Operators

Operator   Description
>=         Greater than or equal
<=         Less than or equal
>          Greater than
<          Less than
==         Equal
!=         Not equal
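
The remaining operators combine the same way. A short illustration (the check names here are hypothetical):

rules = [
    # Fail unless the score moved away from zero
    ExpectedResult(check_name="volatile-check", score="!= 0.0"),

    # Accept anything at or below 4.5
    ExpectedResult(check_name="capped-check", score="<= 4.5"),
]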

ExpectedResult Fields

Field        Required                    Description
check_name   One of check_name or key    Check name (the key will be derived)
key          One of check_name or key    Full check key (e.g., chk:my-check)
level        No                          Expected level (informational)
score        No                          Tolerance rule string
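
For instance, given the chk: prefix shown above, these two rules should target the same check:

# Equivalent ways to reference the same check
ExpectedResult(check_name="domain-check", score=">= 1.0")
ExpectedResult(key="chk:domain-check", score=">= 1.0")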

Displaying Differences

Use display_diff() to render differences as a rich table:

from cyvest.io_rich import display_diff
from logurich import logger

display_diff(diffs, lambda r: logger.rich("INFO", r), title="Investigation Diff")

Output:

╭────────────────────────────────────────────────┬────────────────────┬─────────────────┬────────╮
│ Key                                            │      Expected      │     Actual      │ Status │
├────────────────────────────────────────────────┼────────────────────┼─────────────────┼────────┤
│ chk:new-check                                  │         -          │  NOTABLE 1.50   │   +    │
│ └── domain: example.com                        │         -          │   INFO 0.00     │        │
│     └── VirusTotal                             │         -          │   INFO 0.00     │        │
├────────────────────────────────────────────────┼────────────────────┼─────────────────┼────────┤
│ chk:domain-check                               │   NOTABLE 1.00     │  NOTABLE 2.00   │   ✗    │
╰────────────────────────────────────────────────┴────────────────────┴─────────────────┴────────╯

The table shows:

  • Key: Check key with linked observables and threat intel as a tree
  • Expected: Level and score from expected investigation (or tolerance rule)
  • Actual: Level and score from actual investigation
  • Status: Diff status symbol

Convenience Methods

The same operations are available as methods on Cyvest objects:

# Get diff items
diffs = actual.compare(expected=expected, result_expected=rules)

# Compare and display in one call
actual.display_diff(expected=expected, title="My Investigation Diff")

DiffItem Structure

Each DiffItem contains:

class DiffItem:
    status: DiffStatus          # ADDED, REMOVED, or MISMATCH
    key: str                    # Check key
    check_name: str             # Check name
    expected_level: Level       # Expected level
    expected_score: Decimal     # Expected score
    expected_score_rule: str    # Tolerance rule (if any)
    actual_level: Level         # Actual level
    actual_score: Decimal       # Actual score
    observable_diffs: list      # Linked observable differences

ObservableDiff

For each linked observable:

class ObservableDiff:
    observable_key: str
    obs_type: str
    value: str
    expected_score: Decimal
    expected_level: Level
    actual_score: Decimal
    actual_level: Level
    threat_intel_diffs: list    # Threat intel differences

ThreatIntelDiff

For each threat intel source:

class ThreatIntelDiff:
    source: str
    expected_score: Decimal
    expected_level: Level
    actual_score: Decimal
    actual_level: Level
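
Since the three classes nest (a DiffItem holds ObservableDiffs, which hold ThreatIntelDiffs), a diff list can be walked as a tree. A minimal sketch using only the fields documented above:

for diff in diffs:
    print(f"{diff.status} {diff.key}")
    for obs in diff.observable_diffs:
        print(f"  {obs.obs_type} {obs.value}: "
              f"{obs.expected_score} -> {obs.actual_score}")
        for ti in obs.threat_intel_diffs:
            print(f"    {ti.source}: {ti.expected_level} -> {ti.actual_level}")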

Use Cases

Regression Testing

Compare investigation outputs before and after rule changes:

# Run investigation with old rules
old_cv = run_investigation(email, rules_v1)
old_cv.io_save_json("baseline.json")

# Run investigation with new rules
new_cv = run_investigation(email, rules_v2)

# Compare
baseline = Cyvest.io_load_json("baseline.json")
diffs = compare_investigations(new_cv, baseline)

if diffs:
    print(f"Found {len(diffs)} differences")
    display_diff(diffs, print, title="Rule Changes Impact")

Validation with Tolerance

Define expected outcomes with acceptable variations:

# Expected results for test case
expected_results = [
    ExpectedResult(check_name="spam-score", score=">= 5.0"),
    ExpectedResult(check_name="phishing-score", score=">= 7.0"),
    ExpectedResult(check_name="ai-analysis", score=">= 1.0"),  # Allow variation
]

diffs = compare_investigations(actual, expected, result_expected=expected_results)

# Assert no unexpected differences
assert len(diffs) == 0, f"Unexpected differences: {[d.key for d in diffs]}"
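
The same assertion slots into a pytest regression suite. A sketch, reusing the assumed run_investigation helper and inputs from the regression-testing example above:

def test_detection_rules_within_tolerance():
    # Run the investigation under test and load the saved baseline
    actual = run_investigation(email, rules_v2)
    baseline = Cyvest.io_load_json("baseline.json")

    # Any diff not covered by a tolerance rule fails the test
    diffs = compare_investigations(actual, baseline, result_expected=expected_results)
    assert not diffs, f"Unexpected differences: {[d.key for d in diffs]}"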

Example

See examples/06_compare_investigations.py for a complete example demonstrating:

  • Creating expected and actual investigations
  • Comparing without tolerance rules
  • Comparing with tolerance rules
  • Using convenience methods

Run it with:

python examples/06_compare_investigations.py