CG
SkillsPerforming Malware Triage with YARA
Start Free
Back to Skills Library
Malware Analysis🟡 Intermediate

Performing Malware Triage with YARA

Performs rapid malware triage and classification using YARA rules to match file patterns, strings, byte sequences, and structural characteristics against known malware families and suspicious indicators. Covers rule writing, scanning, and integration with analysis pipelines.

7 min read8 code examples

Prerequisites

  • YARA 4.x installed (`apt install yara` or `pip install yara-python`)
  • YARA rule repositories (YARA-Rules, awesome-yara, Malpedia rules, Florian Roth's signature-base)
  • Python 3.8+ with `yara-python` for scripted scanning
  • Sample collection organized in a directory structure for batch scanning
  • Understanding of PE file format, hex patterns, and regular expressions for rule writing

Performing Malware Triage with YARA

When to Use

  • Rapidly classifying a large batch of malware samples against known family signatures
  • Writing detection rules for a newly analyzed malware family based on unique byte patterns
  • Scanning file shares, endpoints, or memory dumps for indicators of a specific threat
  • Building automated triage pipelines that classify samples before manual analysis
  • Hunting for variants of a known threat across an enterprise using YARA scans

Do not use as the sole analysis method; YARA triage identifies known patterns but does not reveal new or unknown malware behaviors.

Prerequisites

  • YARA 4.x installed (apt install yara or pip install yara-python)
  • YARA rule repositories (YARA-Rules, awesome-yara, Malpedia rules, Florian Roth's signature-base)
  • Python 3.8+ with yara-python for scripted scanning
  • Sample collection organized in a directory structure for batch scanning
  • Understanding of PE file format, hex patterns, and regular expressions for rule writing

Workflow

Step 1: Scan Samples with Existing Rule Sets

Apply community and commercial YARA rules to classify samples:

# Scan a single file
yara -s malware_rules.yar suspect.exe

# Scan a directory of samples
yara -r malware_rules.yar /path/to/samples/

# Scan with multiple rule files
yara -r rules/apt_rules.yar rules/ransomware_rules.yar rules/trojan_rules.yar suspect.exe

# Scan with timeout (prevent hanging on large files)
yara -t 30 malware_rules.yar suspect.exe

# Scan and show matching strings
yara -s -r malware_rules.yar suspect.exe

# Scan with compiled rules (faster for repeated scans)
yarac malware_rules.yar compiled_rules.yarc
yara compiled_rules.yarc suspect.exe
# Download community rule sets
git clone https://github.com/Yara-Rules/rules.git yara-community-rules
git clone https://github.com/Neo23x0/signature-base.git signature-base

# Scan with signature-base
yara -r signature-base/yara/*.yar suspect.exe

Step 2: Write Rules for Unique String Patterns

Create YARA rules based on strings extracted during malware analysis:

rule MalwareX_Strings {
    meta:
        description = "Detects MalwareX based on unique strings"
        author = "analyst"
        date = "2025-09-15"
        reference = "Internal Analysis Report #1547"
        hash = "e3b0c44298fc1c149afbf4c8996fb924"
        tlp = "WHITE"

    strings:
        // C2 URL pattern
        $url1 = "/gate.php?id=" ascii
        $url2 = "/panel/connect.php" ascii

        // Unique mutex name
        $mutex = "Global\\CryptLocker_2025" ascii wide

        // User-Agent string
        $ua = "Mozilla/5.0 (compatible; MSIE 10.0)" ascii

        // Registry persistence path
        $reg = "Software\\Microsoft\\Windows\\CurrentVersion\\Run\\WindowsUpdate" ascii

        // Campaign identifier
        $campaign = "campaign_2025_q3" ascii

    condition:
        uint16(0) == 0x5A4D and      // PE file (MZ header)
        filesize < 500KB and          // Size constraint
        ($url1 or $url2) and          // At least one C2 URL
        ($mutex or $campaign) and     // Campaign identifier
        $ua                           // Specific User-Agent
}

Step 3: Write Rules for Byte Patterns

Create rules matching specific code sequences:

rule MalwareX_Decryptor {
    meta:
        description = "Detects MalwareX XOR decryption routine"
        author = "analyst"
        date = "2025-09-15"

    strings:
        // XOR decryption loop (x86 assembly)
        // mov al, [esi+ecx]
        // xor al, [edi+ecx]
        // mov [esi+ecx], al
        // inc ecx
        // cmp ecx, edx
        // jl loop
        $xor_loop = { 8A 04 0E 32 04 0F 88 04 0E 41 3B CA 7C F3 }

        // RC4 KSA initialization (256-byte loop)
        $rc4_ksa = { 33 C0 88 04 ?8 40 3D 00 01 00 00 7? }

        // Embedded RSA public key marker
        $rsa_key = { 06 02 00 00 00 A4 00 00 52 53 41 31 }  // PUBLICKEYBLOB

    condition:
        uint16(0) == 0x5A4D and
        ($xor_loop or $rc4_ksa) and
        $rsa_key
}

Step 4: Write Rules with PE Module

Leverage YARA's PE module for structural detection:

import "pe"
import "hash"
import "math"

rule MalwareX_PE_Characteristics {
    meta:
        description = "Detects MalwareX by PE structure and imports"
        author = "analyst"

    condition:
        pe.is_pe and

        // Compiled within specific timeframe
        pe.timestamp > 1693526400 and   // After 2023-09-01
        pe.timestamp < 1727740800 and   // Before 2024-10-01

        // Specific import hash
        pe.imphash() == "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6" or

        // Suspicious import combination
        (
            pe.imports("kernel32.dll", "VirtualAllocEx") and
            pe.imports("kernel32.dll", "WriteProcessMemory") and
            pe.imports("kernel32.dll", "CreateRemoteThread") and
            pe.imports("wininet.dll", "InternetOpenA")
        ) or

        // High entropy .text section (packed)
        (
            for any section in pe.sections : (
                section.name == ".text" and
                math.entropy(section.raw_data_offset, section.raw_data_size) > 7.0
            )
        )
}

rule MalwareX_Rich_Header {
    meta:
        description = "Detects MalwareX by Rich header hash"

    condition:
        pe.is_pe and
        hash.md5(pe.rich_signature.clear_data) == "abc123def456abc123def456abc123de"
}

Step 5: Batch Triage with Python

Automate scanning of sample collections:

import yara
import os
import json
import hashlib
from datetime import datetime

# Compile all rule files
rule_files = {
    "apt": "rules/apt_rules.yar",
    "ransomware": "rules/ransomware_rules.yar",
    "trojan": "rules/trojan_rules.yar",
    "custom": "rules/custom_rules.yar",
}
rules = yara.compile(filepaths=rule_files)

# Scan sample directory
results = []
sample_dir = "/path/to/samples"

for filename in os.listdir(sample_dir):
    filepath = os.path.join(sample_dir, filename)
    if not os.path.isfile(filepath):
        continue

    with open(filepath, "rb") as f:
        data = f.read()
        sha256 = hashlib.sha256(data).hexdigest()

    matches = rules.match(filepath)

    result = {
        "filename": filename,
        "sha256": sha256,
        "size": len(data),
        "matches": [],
        "classification": "UNKNOWN",
    }

    for match in matches:
        result["matches"].append({
            "rule": match.rule,
            "namespace": match.namespace,
            "tags": match.tags,
            "strings": [(hex(s[0]), s[1], s[2].decode("utf-8", errors="replace")[:100])
                       for s in match.strings] if match.strings else []
        })

    if result["matches"]:
        result["classification"] = result["matches"][0]["namespace"].upper()

    results.append(result)

# Summary
classified = sum(1 for r in results if r["classification"] != "UNKNOWN")
print(f"Scanned: {len(results)} samples")
print(f"Classified: {classified} ({classified/len(results)*100:.1f}%)")
print(f"Unknown: {len(results)-classified}")

# Export results
with open("triage_results.json", "w") as f:
    json.dump(results, f, indent=2)

Step 6: Validate and Optimize Rules

Test rules for false positives and performance:

# Test rule syntax
yara -C custom_rules.yar

# Scan known-clean directory to check false positives
yara -r custom_rules.yar /path/to/clean_files/ > false_positives.txt
wc -l false_positives.txt

# Benchmark rule performance
time yara -r custom_rules.yar /path/to/large_sample_collection/

# Profile individual rule performance
yara -p custom_rules.yar suspect.exe

Key Concepts

TermDefinition
YARA RulePattern matching rule defining strings, byte sequences, and conditions that identify a specific file or malware family
ConditionBoolean expression combining string matches, file properties, and module functions to determine if a rule matches
Hex StringByte pattern with optional wildcards (??) and jumps ([N-M]) for matching machine code or binary data
PE ModuleYARA module providing access to PE file properties (imports, sections, timestamps, resources) for structural matching
ImphashMD5 hash of a PE file's import table; samples from the same family often share import hashes
Rich HeaderUndocumented PE structure containing compiler/linker metadata; consistent within malware build environments
YARA-CCompiled YARA rule format enabling faster scanning by pre-compiling rules for repeated use

Tools & Systems

  • YARA: Pattern matching engine for identifying and classifying malware based on text, hex, and structural patterns
  • yara-python: Python bindings for YARA enabling scripted scanning, rule compilation, and integration with analysis pipelines
  • yarGen: Automatic YARA rule generator that creates rules from malware samples by identifying unique strings and opcodes
  • YARA-Rules (GitHub): Community-maintained repository of YARA rules covering malware families, exploits, and suspicious indicators
  • Malpedia YARA: Curated YARA rules from the Malpedia malware encyclopedia with high-quality family-specific rules

Common Scenarios

Scenario: Creating Detection Rules for a New Malware Family

Context: Reverse engineering of a new malware sample has identified unique strings, byte patterns, and PE characteristics. YARA rules are needed for enterprise-wide hunting and ongoing detection.

Approach:

  1. Extract unique strings from the unpacked binary (C2 URLs, mutex names, registry paths)
  2. Identify unique byte sequences from the encryption routine or C2 protocol (from Ghidra analysis)
  3. Record PE characteristics (imphash, Rich header hash, section names, compilation timestamp range)
  4. Write a YARA rule combining string, byte pattern, and PE module conditions
  5. Test against the known malware samples to confirm true positive detection
  6. Test against a clean file corpus (Windows system files, common applications) to verify zero false positives
  7. Deploy to enterprise scanning infrastructure and threat intelligence platform

Pitfalls:

  • Writing rules too specific to a single sample (will not detect variants with minor changes)
  • Writing rules too generic (matching legitimate software, causing false positives)
  • Using strings that appear in common libraries or frameworks (e.g., OpenSSL strings)
  • Not testing on a sufficiently large clean corpus before deployment

Output Format

YARA TRIAGE RESULTS
=====================
Scan Date:        2025-09-15
Rule Sets:        apt_rules (847 rules), ransomware_rules (312 rules),
                  trojan_rules (1,204 rules), custom_rules (45 rules)
Samples Scanned:  2,500
Processing Time:  47 seconds

CLASSIFICATION SUMMARY
APT:              12 samples (0.5%)
Ransomware:       187 samples (7.5%)
Trojan:           423 samples (16.9%)
Unknown:          1,878 samples (75.1%)

TOP MATCHING RULES
Rule                         Matches  Family
MalwareX_C2_Beacon           45       MalwareX
LockBit3_Ransom_Note         38       LockBit 3.0
Emotet_Epoch5_Loader         32       Emotet
CobaltStrike_Beacon_Config   28       Cobalt Strike
QakBot_DLL_Loader            25       QakBot

SAMPLE DETAIL
File:    suspect.exe
SHA-256: e3b0c44298fc1c149afbf4c8996fb924...
Matches:
  [1] MalwareX_Strings (custom)
      - $url1 at 0x4A20: "/gate.php?id="
      - $mutex at 0x5100: "Global\\CryptLocker_2025"
  [2] MalwareX_Decryptor (custom)
      - $xor_loop at 0x401200: { 8A 04 0E 32 04 0F ... }
  [3] MalwareX_PE_Characteristics (custom)
      - PE import combination matched
Classification: MALWAREX (HIGH CONFIDENCE)

Verification Criteria

Confirm successful execution by validating:

  • [ ] All prerequisite tools and access requirements are satisfied
  • [ ] Each workflow step completed without errors
  • [ ] Output matches expected format and contains expected data
  • [ ] No security warnings or misconfigurations detected
  • [ ] Results are documented and evidence is preserved for audit

Compliance Framework Mapping

This skill supports compliance evidence collection across multiple frameworks:

  • SOC 2: CC7.2 (Anomaly Detection), CC7.4 (Incident Response)
  • ISO 27001: A.12.2 (Malware Protection), A.16.1 (Security Incident Management)
  • NIST 800-53: SI-3 (Malicious Code Protection), IR-4 (Incident Handling)
  • NIST CSF: DE.CM (Continuous Monitoring), RS.AN (Analysis)

Claw GRC Tip: When this skill is executed by a registered agent, compliance evidence is automatically captured and mapped to the relevant controls in your active frameworks.

Deploying This Skill with Claw GRC

Agent Execution

Register this skill with your Claw GRC agent for automated execution:

# Install via CLI
npx claw-grc skills add performing-malware-triage-with-yara

# Or load dynamically via MCP
grc.load_skill("performing-malware-triage-with-yara")

Audit Trail Integration

When executed through Claw GRC, every step of this skill generates tamper-evident audit records:

  • SHA-256 chain hashing ensures no step can be modified after execution
  • Evidence artifacts (configs, scan results, logs) are automatically attached to relevant controls
  • Trust score impact — successful execution increases your agent's trust score

Continuous Compliance

Schedule this skill for recurring execution to maintain continuous compliance posture. Claw GRC monitors for drift and alerts when re-execution is needed.

Use with Claw GRC Agents

This skill is fully compatible with Claw GRC's autonomous agent system. Deploy it to any registered agent via MCP, and every execution will be logged in the tamper-evident audit trail.

// Load this skill in your agent
npx claw-grc skills add performing-malware-triage-with-yara
// Or via MCP
grc.load_skill("performing-malware-triage-with-yara")

Tags

malwareYARAtriageclassificationpattern-matching

Related Skills

Malware Analysis

Analyzing Macro Malware in Office Documents

7m·intermediate
Malware Analysis

Analyzing Malware Behavior with Cuckoo Sandbox

6m·intermediate
Malware Analysis

Analyzing Network Traffic of Malware

7m·intermediate
Malware Analysis

Analyzing Packed Malware with UPX Unpacker

7m·intermediate
Malware Analysis

Detecting Fileless Malware Techniques

8m·intermediate
Malware Analysis

Detecting Process Injection Techniques

8m·intermediate

Skill Details

Domain
Malware Analysis
Difficulty
intermediate
Read Time
7 min
Code Examples
8

On This Page

When to UsePrerequisitesWorkflowKey ConceptsTools & SystemsCommon ScenariosOutput FormatVerification CriteriaCompliance Framework MappingDeploying This Skill with Claw GRC

Deploy This Skill

Add this skill to your Claw GRC agent and start automating.

Get Started Free →