Security Operations (Intermediate)

Building SOC Metrics and KPI Tracking

Build SOC performance metrics and KPI tracking dashboards that measure Mean Time to Detect (MTTD), Mean Time to Respond (MTTR), alert quality ratios, analyst productivity, and detection coverage from SIEM data.

Prerequisites

  • SIEM with 90+ days of incident and alert disposition data
  • Incident ticketing system (ServiceNow, Jira) with timestamp data for incident lifecycle
  • Analyst shift schedules and staffing data
  • ATT&CK Navigator for detection coverage tracking
  • Dashboard platform (Splunk, Grafana, or Power BI)

When to Use

Use this skill when:

  • SOC leadership needs data-driven visibility into operational performance
  • Continuous improvement programs require baseline measurements and trend tracking
  • Executive reporting demands quantified security posture and ROI metrics
  • Staffing decisions need objective workload and capacity data
  • Compliance audits require documented SOC performance evidence

Do not use metrics as punitive measures against analysts; metrics should drive process improvement, not individual performance management.

Workflow

Step 1: Define Core SOC Metrics Framework

Establish the key metrics aligned to NIST CSF functions:

Metric           Definition                                       Target     NIST CSF
MTTD             Time from threat occurrence to SOC detection     <15 min    Detect
MTTA             Time from alert to analyst acknowledgment        <5 min     Respond
MTTI             Time from acknowledgment to investigation start  <10 min    Respond
MTTC             Time from investigation to containment           <1 hour    Respond
MTTR             Time from detection to full resolution           <4 hours   Recover
FP Rate          Percentage of false positive alerts              <30%       Detect
TP Rate          Percentage of true positive alerts               >40%       Detect
Coverage         ATT&CK techniques with active detection          >60%       Detect
Dwell Time       Attacker time in network before detection        <24 hours  Detect
Escalation Rate  % of Tier 1 alerts escalated to Tier 2/3         15-25%     Respond
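
The time-based metrics in the table are consecutive intervals along one incident's lifecycle, and each "mean" is the average of that interval across incidents. A minimal Python sketch of the per-incident arithmetic follows, assuming a record with six timestamps. The field names (`occurred`, `alerted`, `acknowledged`, `investigating`, `contained`, `resolved`) are hypothetical and are not taken from any SIEM or ticketing schema.

```python
from datetime import datetime

# Hypothetical lifecycle timestamps for a single incident (illustrative only).
incident = {
    "occurred":      datetime(2024, 3, 4, 9, 0),
    "alerted":       datetime(2024, 3, 4, 9, 8),
    "acknowledged":  datetime(2024, 3, 4, 9, 11),
    "investigating": datetime(2024, 3, 4, 9, 18),
    "contained":     datetime(2024, 3, 4, 9, 55),
    "resolved":      datetime(2024, 3, 4, 12, 20),
}


def lifecycle_intervals(rec):
    """Return the per-incident intervals behind MTTD/MTTA/MTTI/MTTC/MTTR, in minutes."""
    def minutes(start, end):
        return (rec[end] - rec[start]).total_seconds() / 60

    return {
        "time_to_detect":      minutes("occurred", "alerted"),
        "time_to_acknowledge": minutes("alerted", "acknowledged"),
        "time_to_investigate": minutes("acknowledged", "investigating"),
        "time_to_contain":     minutes("investigating", "contained"),
        # Measured from detection (the alert) to full resolution, as in the table.
        "time_to_resolve":     minutes("alerted", "resolved"),
    }


intervals = lifecycle_intervals(incident)
```

Averaging each key over all resolved incidents in a period gives the corresponding mean-time metric. In practice the occurrence time is often only estimated after investigation, so the detect interval is usually the least reliable of the five.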

Step 2: Implement MTTD/MTTR Measurement

Mean Time to Detect (MTTD), excluding non-positive values and values of 24 hours or more as data quality issues:

index=notable earliest=-30d status_label="Resolved*"
| eval mttd_seconds = _time - orig_time
| where mttd_seconds > 0 AND mttd_seconds < 86400
| stats avg(mttd_seconds) AS avg_mttd,
        median(mttd_seconds) AS med_mttd,
        perc90(mttd_seconds) AS p90_mttd,
        perc95(mttd_seconds) AS p95_mttd
  by urgency
| eval avg_mttd_min = round(avg_mttd / 60, 1)
| eval med_mttd_min = round(med_mttd / 60, 1)
| eval p90_mttd_min = round(p90_mttd / 60, 1)
| eval p95_mttd_min = round(p95_mttd / 60, 1)
| table urgency, avg_mttd_min, med_mttd_min, p90_mttd_min, p95_mttd_min
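
The query reports the median and upper percentiles alongside the average because a few slow detections can dominate the mean. The sketch below reproduces the same filter-then-summarize logic in plain Python on made-up durations. The function name and sample values are illustrative, and the nearest-rank percentile used here may differ slightly from how Splunk computes `perc90`.

```python
import statistics


def summarize_minutes(durations_seconds, max_seconds=86400):
    """Mean, median and nearest-rank p90 in minutes.

    Non-positive values and values >= max_seconds are dropped first,
    mirroring the `where` clause in the query.
    """
    valid = sorted(d for d in durations_seconds if 0 < d < max_seconds)
    if not valid:
        return None
    # Nearest-rank p90: ceil(0.9 * n), computed with integer arithmetic.
    rank = (9 * len(valid) + 9) // 10
    return {
        "avg_min": round(statistics.mean(valid) / 60, 1),
        "med_min": round(statistics.median(valid) / 60, 1),
        "p90_min": round(valid[rank - 1] / 60, 1),
        "dropped": len(durations_seconds) - len(valid),
    }


# Nine quick detections, one two-hour outlier, and two bad records.
sample = [120, 180, 240, 300, 360, 420, 480, 540, 600, 7200, -30, 90000]
summary = summarize_minutes(sample)
```

With this sample the single two-hour detection lifts the average to 17.4 minutes while the median stays at 6.5 minutes, which is why a dashboard showing only the average can misrepresent typical performance.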

Mean Time to Respond (MTTR), excluding non-positive values and values of 7 days or more:

index=notable earliest=-30d status_label="Resolved*"
| eval mttr_seconds = status_end - _time
| where mttr_seconds > 0 AND mttr_seconds < 604800
| stats avg(mttr_seconds) AS avg_mttr,
        median(mttr_seconds) AS med_mttr,
        perc90(mttr_seconds) AS p90_mttr
  by urgency
| eval avg_mttr_hours = round(avg_mttr / 3600, 1)
| eval med_mttr_hours = round(med_mttr / 3600, 1)
| eval p90_mttr_hours = round(p90_mttr / 3600, 1)
| table urgency, avg_mttr_hours, med_mttr_hours, p90_mttr_hours

MTTD/MTTR Trend Over Time:

index=notable earliest=-90d status_label="Resolved*"
| eval mttd_min = (_time - orig_time) / 60
| eval mttr_hours = (status_end - _time) / 3600
| bin _time span=1w
| stats avg(mttd_min) AS avg_mttd_min, avg(mttr_hours) AS avg_mttr_hours,
        count AS incidents by _time
| table _time, incidents, avg_mttd_min, avg_mttr_hours
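
A weekly trend is only useful once it is turned into a direction and a size of change. The following sketch buckets per-incident values by week and computes the percent change between two weeks. It assumes Monday-based weeks, which may not match how Splunk aligns `span=1w` bins, and the sample values are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_averages(samples):
    """Group (day, value) pairs by the Monday of their week and average each group."""
    buckets = defaultdict(list)
    for day, value in samples:
        week_start = day - timedelta(days=day.weekday())
        buckets[week_start].append(value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


def percent_change(previous, current):
    """Relative change from previous to current, in percent."""
    return round((current - previous) / previous * 100, 1)


# Illustrative MTTD values in minutes for four incidents across two weeks.
samples = [
    (date(2024, 3, 4), 10.0),   # Monday
    (date(2024, 3, 6), 12.0),
    (date(2024, 3, 11), 9.0),   # the following Monday
    (date(2024, 3, 14), 8.6),
]
weekly = weekly_averages(samples)
weeks = list(weekly)
change = percent_change(weekly[weeks[0]], weekly[weeks[1]])
```

Weeks with very few incidents produce noisy averages, so it helps to show the incident count next to each point, as the query does.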

Step 3: Measure Alert Quality and Analyst Productivity

Alert Disposition Analysis:

index=notable earliest=-30d
| stats count AS total,
        sum(eval(if(status_label="Resolved - True Positive", 1, 0))) AS tp,
        sum(eval(if(status_label="Resolved - False Positive", 1, 0))) AS fp,
        sum(eval(if(status_label="Resolved - Benign", 1, 0))) AS benign,
        sum(eval(if(status_label="New" OR status_label="In Progress", 1, 0))) AS pending
| eval tp_rate = round(tp / total * 100, 1)
| eval fp_rate = round(fp / total * 100, 1)
| eval signal_noise = round(tp / (fp + 0.01), 2)
| table total, tp, fp, benign, pending, tp_rate, fp_rate, signal_noise
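
The query divides by every notable in the window, including alerts that are still pending, which lowers both rates while a backlog exists. The sketch below shows the same ratios in Python and adds a variant that divides by closed alerts only. The function name, the `tp_rate_closed` key and the counts are illustrative assumptions.

```python
def alert_quality(tp, fp, benign, pending):
    """Alert quality ratios from disposition counts."""
    total = tp + fp + benign + pending
    closed = tp + fp + benign
    return {
        # Same denominators as the query: all alerts in the window.
        "tp_rate": round(tp / total * 100, 1) if total else None,
        "fp_rate": round(fp / total * 100, 1) if total else None,
        # Alternative: only alerts that already have a disposition.
        "tp_rate_closed": round(tp / closed * 100, 1) if closed else None,
        # True positives per false positive; undefined when there are none.
        "signal_noise": round(tp / fp, 2) if fp else None,
    }


quality = alert_quality(tp=410, fp=270, benign=220, pending=100)
```

Whichever denominator is chosen, it should stay the same from one reporting period to the next so that trends remain comparable.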

Analyst Productivity Metrics:

index=notable earliest=-30d status_label="Resolved*"
| stats count AS alerts_resolved,
        avg(eval((status_end - status_transition_time) / 60)) AS avg_triage_min,
        dc(rule_name) AS unique_rule_types
  by owner
| eval alerts_per_day = round(alerts_resolved / 30, 1)
| sort - alerts_resolved
| table owner, alerts_resolved, alerts_per_day, avg_triage_min, unique_rule_types

Shift-Based Workload Distribution:

index=notable earliest=-30d
| eval hour = tonumber(strftime(_time, "%H"))
| eval shift = case(
    hour >= 6 AND hour < 14, "Day (06-14)",
    hour >= 14 AND hour < 22, "Swing (14-22)",
    1=1, "Night (22-06)"
  )
| stats count AS alerts, dc(owner) AS analysts by shift
| eval alerts_per_analyst = round(alerts / analysts / 30, 1)
| table shift, alerts, analysts, alerts_per_analyst
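
The shift mapping has one subtle case: the night shift wraps past midnight, so it is handled as the fallback rather than as a single range. Here is a small Python sketch of the same bucketing. It takes analyst headcount from schedule data (listed in the prerequisites) instead of counting distinct alert owners; the staffing numbers and alert hours are made up.

```python
from collections import Counter


def shift_for_hour(hour):
    """Map an hour of day (0-23) to the shift labels used in the query."""
    if 6 <= hour < 14:
        return "Day (06-14)"
    if 14 <= hour < 22:
        return "Swing (14-22)"
    return "Night (22-06)"  # wraps past midnight: hours 22, 23 and 0-5


def alerts_per_analyst_per_day(alert_hours, analysts_by_shift, days):
    """Average daily alert load per analyst for each shift."""
    counts = Counter(shift_for_hour(h) for h in alert_hours)
    return {
        shift: round(counts[shift] / analysts / days, 1)
        for shift, analysts in analysts_by_shift.items()
    }


# Illustrative month of alerts, represented only by their hour of day.
alert_hours = [7] * 720 + [15] * 450 + [23] * 240 + [3] * 150
staffing = {"Day (06-14)": 4, "Swing (14-22)": 3, "Night (22-06)": 2}
load = alerts_per_analyst_per_day(alert_hours, staffing, days=30)
```

Counting distinct owners, as the query does, only sees analysts who closed at least one alert in the window, so scheduled headcount is usually the safer denominator for staffing decisions.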

Step 4: Track Detection Coverage

ATT&CK Coverage Score:

| inputlookup attack_techniques_total.csv
| stats dc(technique_id) AS total_techniques by tactic
| join type=left tactic [
    | inputlookup detection_rules_attack_mapping.csv
    | stats dc(technique_id) AS covered_techniques by tactic
  ]
| fillnull value=0 covered_techniques
| eval coverage_pct = round(covered_techniques / total_techniques * 100, 1)
| sort tactic
| table tactic, covered_techniques, total_techniques, coverage_pct
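
Coverage per tactic is a set intersection between the techniques a tactic contains and the techniques that have at least one active detection. This Python sketch shows the calculation on a toy subset of ATT&CK technique IDs; the catalog is deliberately incomplete and the detection mapping is invented.

```python
def coverage_by_tactic(catalog, detections):
    """Coverage per tactic.

    catalog:    tactic -> set of technique IDs that belong to it
    detections: tactic -> set of technique IDs with an active detection
    """
    report = {}
    for tactic, techniques in sorted(catalog.items()):
        covered = techniques & detections.get(tactic, set())
        report[tactic] = {
            "covered": len(covered),
            "total": len(techniques),
            "coverage_pct": round(len(covered) / len(techniques) * 100, 1),
        }
    return report


# Toy subset for illustration, not the full ATT&CK matrix.
catalog = {
    "execution": {"T1059", "T1053", "T1204"},
    "lateral-movement": {"T1021", "T1210", "T1570", "T1550"},
    "exfiltration": {"T1041", "T1048"},
}
detections = {
    "execution": {"T1059", "T1204"},
    "lateral-movement": {"T1021"},
}
coverage = coverage_by_tactic(catalog, detections)
```

Iterating over the catalog rather than over the detections keeps tactics with no detections visible as 0% instead of dropping them from the report.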

Data Source Coverage:

| inputlookup expected_data_sources.csv
| join type=left data_source [
    | tstats count where index=* by sourcetype
    | rename sourcetype AS data_source
    | eval status = "Active"
  ]
| eval source_status = if(isnotnull(status), "Collecting", "MISSING")
| stats count by source_status
| table source_status, count
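
The query reports how many expected sources are collecting or missing. For follow-up it is usually more useful to know which ones. The sketch below expresses the same comparison as set operations in Python; the sourcetype names are illustrative examples and not a recommended baseline.

```python
def source_coverage(expected, observed):
    """Split expected data sources into those seen in the SIEM and those missing."""
    expected, observed = set(expected), set(observed)
    return {
        "Collecting": sorted(expected & observed),
        "MISSING": sorted(expected - observed),
    }


# Illustrative sourcetype names only.
expected = ["WinEventLog", "linux_secure", "cisco:asa", "pan:traffic"]
observed = ["WinEventLog", "pan:traffic", "access_combined"]
status_by_source = source_coverage(expected, observed)
```

Sources that are observed but not expected (`access_combined` here) are ignored by this comparison, just as they are by the left join in the query.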

Step 5: Build Executive Reporting Dashboard

Monthly SOC Executive Summary:

Incident summary by category:

index=notable earliest=-30d status_label="Resolved*"
| stats count by urgency
| eval order = case(urgency="critical", 1, urgency="high", 2, urgency="medium", 3,
                    urgency="low", 4, urgency="informational", 5)
| sort order

Month-over-month comparison:

index=notable earliest=-60d
| eval period = if(_time > relative_time(now(), "-30d"), "This Month", "Last Month")
| stats count by period, urgency
| chart sum(count) AS incidents by urgency, period

Top 5 incident categories:

index=notable earliest=-30d status_label="Resolved - True Positive"
| top limit=5 rule_name
| table rule_name, count, percent

Security Posture Scorecard:

| makeresults
| eval metrics = mvappend(
    "MTTD: 8.3 min (Target: <15 min) | STATUS: GREEN",
    "MTTR: 3.2 hours (Target: <4 hours) | STATUS: GREEN",
    "FP Rate: 27% (Target: <30%) | STATUS: GREEN",
    "Detection Coverage: 64% (Target: >60%) | STATUS: GREEN",
    "Analyst Utilization: 78% (Target: 60-80%) | STATUS: GREEN",
    "Incident Backlog: 12 (Target: <20) | STATUS: GREEN"
  )
| mvexpand metrics
| table metrics
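
The scorecard above hard-codes both the values and the GREEN status. In a live dashboard the status would be derived from the measured value and its target. A minimal Python sketch of that comparison follows, using the targets from the scorecard. The `TARGETS` structure, the metric labels and the `RED` status are assumptions, since the source only shows GREEN rows.

```python
# Target definitions taken from the scorecard text; the structure is illustrative.
TARGETS = {
    "MTTD (min)": ("below", 15),
    "MTTR (hours)": ("below", 4),
    "FP Rate (%)": ("below", 30),
    "Detection Coverage (%)": ("above", 60),
    "Analyst Utilization (%)": ("between", (60, 80)),
    "Incident Backlog": ("below", 20),
}


def status(metric, value):
    """Return GREEN when the value meets its target, otherwise RED."""
    kind, target = TARGETS[metric]
    if kind == "below":
        ok = value < target
    elif kind == "above":
        ok = value > target
    else:  # "between": inclusive range
        low, high = target
        ok = low <= value <= high
    return "GREEN" if ok else "RED"


def scorecard(measured):
    """Format one scorecard line per measured metric."""
    return [f"{name}: {value} | STATUS: {status(name, value)}"
            for name, value in measured.items()]


measured = {
    "MTTD (min)": 8.3,
    "MTTR (hours)": 3.2,
    "FP Rate (%)": 27,
    "Detection Coverage (%)": 64,
    "Analyst Utilization (%)": 78,
    "Incident Backlog": 12,
}
lines = scorecard(measured)
```

A production scorecard would probably add an amber band near each threshold, but the thresholds for that are not defined in this skill.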

Step 6: Implement Continuous Improvement Tracking

Track improvement initiatives and their impact:

| inputlookup soc_improvement_initiatives.csv
| eval status_color = case(
    status="Completed", "green",
    status="In Progress", "yellow",
    status="Planned", "gray"
  )
| table initiative, start_date, target_date, status, status_color, metric_impact, baseline, current

Example initiatives:

initiative,start_date,target_date,status,metric_impact,baseline,current
Risk-Based Alerting,2024-01-15,2024-03-15,Completed,Alert Volume,1847/day,287/day
Sigma Rule Library,2024-02-01,2024-04-01,In Progress,ATT&CK Coverage,61%,64%
SOAR Phishing Playbook,2024-02-15,2024-03-30,In Progress,Phishing MTTR,45min,18min
Analyst Training Program,2024-01-01,2024-06-30,In Progress,TP Rate,31%,41%
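
An initiative's impact can be computed from the `baseline` and `current` columns rather than typed in by hand. The sketch below parses the leading number from each value and reports the relative change. It uses a trimmed copy of the example rows and takes the Risk-Based Alerting baseline as the 1,847 alerts per day quoted in the sample report; the helper names are hypothetical.

```python
import csv
import io
import re

# Trimmed copy of the example initiatives (date and status columns omitted).
SAMPLE = """initiative,metric_impact,baseline,current
Risk-Based Alerting,Alert Volume,1847/day,287/day
Sigma Rule Library,ATT&CK Coverage,61%,64%
SOAR Phishing Playbook,Phishing MTTR,45min,18min
Analyst Training Program,TP Rate,31%,41%
"""


def leading_number(text):
    """Extract the numeric part of values such as '45min', '61%' or '287/day'."""
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    return float(match.group(1))


def relative_change(baseline, current):
    """Relative change from baseline to current, in percent."""
    return round((current - baseline) / baseline * 100, 1)


changes = {
    row["initiative"]: relative_change(
        leading_number(row["baseline"]), leading_number(row["current"])
    )
    for row in csv.DictReader(io.StringIO(SAMPLE))
}
```

For metrics that are already percentages, relative change and percentage-point change differ: coverage moving from 61% to 64% is +3 points but +4.9% relative. A report should state which of the two it is showing.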

Key Concepts

Term                   Definition
MTTD                   Mean Time to Detect: average time from threat occurrence to SOC alert generation
MTTR                   Mean Time to Respond: average time from detection to incident resolution
MTTA                   Mean Time to Acknowledge: average time from alert generation to analyst acknowledgment
Signal-to-Noise Ratio  Ratio of true positive alerts to false positive alerts; higher is better
Dwell Time             Duration an attacker remains undetected in the environment; a key indicator of detection effectiveness
Analyst Utilization    Percentage of analyst time spent on productive investigation vs. overhead tasks

Tools & Systems

  • Splunk Dashboard Studio: Advanced visualization framework for building interactive SOC metric dashboards
  • Grafana: Open-source analytics and visualization platform supporting multiple data sources
  • Power BI: Microsoft business intelligence tool for executive-level reporting and trend analysis
  • ATT&CK Navigator: MITRE tool for visualizing detection coverage as layered heatmaps
  • ServiceNow Performance Analytics: ITSM analytics module for tracking incident lifecycle metrics

Common Scenarios

  • Quarterly Business Review: Present MTTD/MTTR trends, detection coverage growth, and alert quality improvements
  • Staffing Justification: Use workload metrics to justify additional analyst headcount or shift adjustments
  • Tool ROI Assessment: Compare alert quality and response times before and after new tool deployment
  • Compliance Evidence: Provide documented SOC performance metrics for ISO 27001 or SOC 2 audits
  • Vendor Comparison: Benchmark SOC metrics against industry peers using surveys (SANS, Ponemon)

Output Format

SOC PERFORMANCE REPORT โ€” March 2024
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”

KEY METRICS:
  Metric              Current    Target     Trend    Status
  MTTD                8.3 min    <15 min    -12%     GREEN
  MTTR                3.2 hrs    <4 hrs     -18%     GREEN
  FP Rate             27%        <30%       -5%      GREEN
  TP Rate             41%        >40%       +3%      GREEN
  ATT&CK Coverage     64%        >60%       +3%      GREEN
  Alerts/Analyst/Day  24         <50        -84%     GREEN

INCIDENT SUMMARY:
  Total Incidents:     147 (Critical: 3, High: 23, Medium: 78, Low: 43)
  Avg Resolution:      3.2 hours (Critical: 1.8h, High: 2.9h, Medium: 4.1h)
  SLA Compliance:      94% (Target: >90%)

IMPROVEMENT HIGHLIGHTS:
  [1] RBA deployment reduced daily alerts from 1,847 to 287 (-84%)
  [2] New Sigma rules added 12 ATT&CK techniques to coverage
  [3] SOAR phishing playbook reduced phishing MTTR by 60%

AREAS FOR IMPROVEMENT:
  [1] Lateral movement detection coverage at 58% (below 60% target)
  [2] Night shift MTTD 23% slower than day shift
  [3] 4 critical vulnerability scan tickets overdue on SLA

Verification Criteria

Confirm successful execution by validating:

  • [ ] All prerequisite tools and access requirements are satisfied
  • [ ] Each workflow step completed without errors
  • [ ] Output matches expected format and contains expected data
  • [ ] No security warnings or misconfigurations detected
  • [ ] Results are documented and evidence is preserved for audit

Compliance Framework Mapping

This skill supports compliance evidence collection across multiple frameworks:

  • SOC 2: CC7.1 (Monitoring), CC7.2 (Anomaly Detection), CC7.3 (Incident Identification)
  • ISO 27001: A.12.4 (Logging & Monitoring), A.16.1 (Security Incident Management)
  • NIST 800-53: AU-6 (Audit Review), SI-4 (System Monitoring), IR-5 (Incident Monitoring)
  • NIST CSF: DE.AE (Anomalies & Events), DE.CM (Continuous Monitoring)

Claw GRC Tip: When this skill is executed by a registered agent, compliance evidence is automatically captured and mapped to the relevant controls in your active frameworks.

Deploying This Skill with Claw GRC

Agent Execution

Register this skill with your Claw GRC agent for automated execution:

# Install via CLI
npx claw-grc skills add building-soc-metrics-and-kpi-tracking

# Or load dynamically via MCP
grc.load_skill("building-soc-metrics-and-kpi-tracking")

Audit Trail Integration

When executed through Claw GRC, every step of this skill generates tamper-evident audit records:

  • SHA-256 chain hashing ensures no step can be modified after execution
  • Evidence artifacts (configs, scan results, logs) are automatically attached to relevant controls
  • Trust score impact: successful execution increases your agent's trust score
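
For readers unfamiliar with chain hashing, the sketch below shows the general idea in Python: each record's hash covers the previous record's hash, so changing any earlier record invalidates everything after it. This is a generic illustration with hypothetical function and field names. It is not Claw GRC's implementation, whose record format is not described here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record whose hash depends on the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": record_hash(prev_hash, payload),
    })


def verify(chain):
    """Recompute every hash in order; any edited record breaks the chain."""
    prev = GENESIS
    for rec in chain:
        if rec["prev_hash"] != prev or rec["hash"] != record_hash(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True


chain = []
append_record(chain, {"step": 1, "result": "metrics framework defined"})
append_record(chain, {"step": 2, "result": "MTTD query executed"})
intact = verify(chain)

chain[0]["payload"]["result"] = "edited after the fact"
tampered = verify(chain)
```

A chain like this detects modification but does not prevent it. Tamper evidence in practice also depends on storing the latest hash somewhere the writer cannot alter.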

Continuous Compliance

Schedule this skill for recurring execution to maintain continuous compliance posture. Claw GRC monitors for drift and alerts when re-execution is needed.
