Why We Built Aegis from Scratch

After two years of operating as a managed security service provider, we kept running into the same problems with every third-party SIEM, XDR, and detection platform on the market. They were built for enterprises that could afford $500k/year, they were opaque about how their detection worked, and, critically, none of them could prove the integrity of commands sent to deployed agents.

Aegis Sovereign was born out of operational necessity. This post is a technical deep-dive into the architecture decisions we made and why they matter for real-world security.

The Core Problem: Command Trust

Traditional agent-based security tools operate on a model of implicit trust — if a command comes from the management server, the agent executes it. This creates a catastrophic attack surface: compromise the server, compromise every endpoint.

Aegis solves this with asymmetric cryptographic command signing using ECDSA P-256:

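const crypto = require("node:crypto");

// Custom error type so callers can tell integrity failures apart from other errors
class SecurityError extends Error {}

// Every command is signed by the Brain (privateKey is the Brain's ECDSA P-256 key)
const signature = crypto.sign("sha256", Buffer.from(command), privateKey);

// Every agent verifies against its pinned public key before executing
const isValid = crypto.verify("sha256", Buffer.from(command), pinnedPublicKey, signature);
if (!isValid) throw new SecurityError("Command integrity check failed");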

The private key never leaves the Aegis Brain. Agents store only the public key, so a compromised agent cannot be used to forge commands to other agents. A compromised management server cannot issue unsigned commands, because agents reject anything that fails verification.
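
To make the property above concrete, here is a minimal end-to-end sketch using Node's built-in crypto module. It is an illustration under stated assumptions, not Aegis's production code: the key pair is generated on the spot, and the command format (`action`, `target`) and the helper names `signCommand` and `verifyCommand` are hypothetical.

```javascript
const crypto = require("node:crypto");

// Hypothetical key pair for illustration. In the design described above,
// the private key would exist only inside the Brain.
const { privateKey, publicKey } = crypto.generateKeyPairSync("ec", {
  namedCurve: "prime256v1", // OpenSSL's name for NIST P-256
});

// Brain side: sign the serialised command.
function signCommand(command, key) {
  return crypto.sign("sha256", Buffer.from(command), key);
}

// Agent side: accept the command only if the signature matches the pinned key.
function verifyCommand(command, signature, pinnedKey) {
  return crypto.verify("sha256", Buffer.from(command), pinnedKey, signature);
}

// Hypothetical command payload.
const command = JSON.stringify({ action: "isolate_host", target: "ws-042" });
const signature = signCommand(command, privateKey);

// The untouched command verifies; changing a single field does not.
const genuineAccepted = verifyCommand(command, signature, publicKey);
const tampered = command.replace("ws-042", "dc-001");
const tamperedAccepted = verifyCommand(tampered, signature, publicKey);

console.log({ genuineAccepted, tamperedAccepted });
// prints { genuineAccepted: true, tamperedAccepted: false }
```

The signature covers the exact bytes of the command, so any edit in transit or on a compromised relay invalidates it.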

Real-Time Stateful SIEM

Most SIEMs process logs asynchronously: events arrive, get parsed, get stored, and only then do queries run against the stored data. Detection latency is measured in minutes. Aegis instead evaluates Sigma rules in memory as each telemetry event arrives.

The detection pipeline:

AGENT  →  TLS 1.3  →  INTAKE  →  NORMALISE  →  SIGMA ENGINE  →  ALERT
                                                    ↓
                                      GRAPH (lateral pivot tracking)
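
The intake, normalise, and rule-evaluation stages can be sketched roughly as follows. This is a simplified stand-in rather than the Aegis engine: the rule object is a hand-written approximation of a compiled Sigma rule (real Sigma rules are YAML and support many more operators), and the rule id, field names, and the `onTelemetry` entry point are all assumptions made for the example.

```javascript
// Simplified stand-in for a compiled Sigma rule: every predicate in
// `selection` must hold for the event to match.
const rules = [
  {
    id: "demo-encoded-powershell", // hypothetical rule id
    level: "high",
    selection: {
      image: (v) => typeof v === "string" && v.toLowerCase().endsWith("\\powershell.exe"),
      commandLine: (v) => typeof v === "string" && /-enc(odedcommand)?\b/i.test(v),
    },
  },
];

// NORMALISE: map a raw agent record onto a common schema (field names assumed).
function normalise(raw) {
  return {
    host: raw.hostname,
    image: raw.Image ?? raw.process_path,
    commandLine: raw.CommandLine ?? raw.cmdline,
    timestamp: raw.ts,
  };
}

// SIGMA ENGINE: evaluate every rule against the event while it is in memory.
function evaluate(event) {
  return rules
    .filter((rule) =>
      Object.entries(rule.selection).every(([field, test]) => test(event[field]))
    )
    .map((rule) => ({
      ruleId: rule.id,
      level: rule.level,
      host: event.host,
      timestamp: event.timestamp,
    }));
}

// INTAKE: called once per telemetry record as it arrives.
function onTelemetry(raw) {
  return evaluate(normalise(raw));
}

const alerts = onTelemetry({
  hostname: "ws-042",
  Image: "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
  CommandLine: "powershell.exe -enc SQBFAFgA",
  ts: 1700000000,
});
const benign = onTelemetry({
  hostname: "ws-042",
  Image: "C:\\Windows\\explorer.exe",
  CommandLine: "explorer.exe",
  ts: 1700000001,
});

console.log(alerts.length, benign.length); // prints 1 0
```

Because matching happens in the same call that receives the event, there is no store-then-query step between arrival and alert.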

The graph engine tracks cross-host relationships in real time. When workstation A authenticates to server B, which then makes an outbound connection to a known C2 (command-and-control) endpoint, the entire chain is correlated and raised as a single threat narrative rather than as three separate low-priority alerts that an analyst might miss.
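
One way to picture this correlation is a small graph of authentication edges that is walked backwards when a host contacts a known C2 address. The sketch below is an assumption-laden illustration, not the Aegis graph engine: the `PivotGraph` class, its method names, and the alert shape are hypothetical, and it ignores time windows and edge expiry, which a real implementation would need.

```javascript
class PivotGraph {
  constructor(knownC2) {
    this.knownC2 = new Set(knownC2);
    // destination host -> set of hosts that authenticated to it
    this.inbound = new Map();
  }

  // Record "src authenticated to dst".
  recordAuth(src, dst) {
    if (!this.inbound.has(dst)) this.inbound.set(dst, new Set());
    this.inbound.get(dst).add(src);
  }

  // Record an outbound connection. Returns null for ordinary traffic, or a
  // single alert containing every authentication chain that ends at the C2.
  recordOutbound(host, remote) {
    if (!this.knownC2.has(remote)) return null;
    const chains = this.walkBack(host, [host], new Set([host]));
    return {
      type: "lateral-pivot-to-c2",
      c2: remote,
      chains: chains.map((chain) => [...chain, remote]),
    };
  }

  // Follow authentication edges backwards, avoiding cycles via `seen`.
  walkBack(host, path, seen) {
    const sources = [...(this.inbound.get(host) ?? [])].filter((s) => !seen.has(s));
    if (sources.length === 0) return [path];
    return sources.flatMap((s) =>
      this.walkBack(s, [s, ...path], new Set(seen).add(s))
    );
  }
}

const graph = new PivotGraph(["203.0.113.7"]); // documentation-range IP as the "known C2"
graph.recordAuth("workstation-a", "server-b");
const alert = graph.recordOutbound("server-b", "203.0.113.7");

console.log(alert.chains);
// prints [ [ 'workstation-a', 'server-b', '203.0.113.7' ] ]
```

The analyst sees one alert carrying the full path, instead of an authentication event and a network event that have to be joined by hand.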

Context-Aware Risk Scoring

Raw CVSS scores are meaningless without context. A CVSS 9.8 on an isolated dev box is less urgent than a CVSS 6.5 on an internet-facing authentication server where the vulnerability is being actively exploited in the wild. Aegis scores risk as:

Risk = (CVSS × 0.40) + (CISA_KEV × 0.25) + (asset_criticality × 0.20) + (network_exposure × 0.15) + sigma_chain_bonus

CISA KEV (Known Exploited Vulnerabilities) overlap adds significant weight: these are vulnerabilities with confirmed exploitation in the wild, not theoretical risks.
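
A worked example helps show how the weights play out for the two hosts mentioned earlier. The formula above does not state the input scales, so this sketch assumes every input is normalised to 0-10 (CVSS as published, KEV as 0 or 10, criticality and exposure as 0-10 ratings) and that the Sigma chain bonus is a plain additive number. Those scales, the function name, and the sample values are assumptions for illustration only.

```javascript
// Weights taken from the formula above; input scales (0-10) are assumed.
function riskScore({ cvss, inKev, assetCriticality, networkExposure, sigmaChainBonus = 0 }) {
  const kev = inKev ? 10 : 0; // assumed: KEV membership maps to the top of the scale
  const score =
    cvss * 0.40 +
    kev * 0.25 +
    assetCriticality * 0.20 +
    networkExposure * 0.15 +
    sigmaChainBonus;
  return Math.round(score * 100) / 100; // two decimals for readability
}

// Isolated dev box: critical CVSS, but not in KEV, low criticality, no exposure.
const devBox = riskScore({ cvss: 9.8, inKev: false, assetCriticality: 2, networkExposure: 0 });

// Internet-facing auth server: medium CVSS, in KEV, maximum criticality and exposure.
const authServer = riskScore({ cvss: 6.5, inKev: true, assetCriticality: 10, networkExposure: 10 });

console.log({ devBox, authServer });
// prints { devBox: 4.32, authServer: 8.6 }
```

Under these assumed scales the lower-CVSS finding on the authentication server scores roughly twice as high as the 9.8 on the dev box, which matches the prioritisation argued for above.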

1-Click Sovereign Heal

Manual patch management fails. Aegis automates the entire patching lifecycle as an atomic transaction, so a patch either completes and verifies or the system is rolled back:

  • Pre-patch snapshot (revert point)
  • Staged deployment with health checks at each stage
  • Post-patch compliance verification
  • Automatic rollback if health checks fail
  • SIEM critical escalation if rollback is triggered

The reboot-aware state machine ensures patch operations survive planned and unplanned reboots without leaving systems in an inconsistent state.
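
The lifecycle above can be sketched as a small state machine that persists its position after every step, so a restarted agent resumes where it left off. This is a simplified illustration under assumptions, not the Aegis implementation: the `store` and `hooks` interfaces, the step names, and the stage names are hypothetical, the hooks are synchronous for brevity, and each stage is assumed to be safe to re-apply if a reboot interrupts it.

```javascript
// store: { load(), save(state) } backed by durable storage in a real agent.
// hooks: snapshot, applyStage, healthCheck, verifyCompliance, rollback, escalate.
function runPatch(store, hooks, stages) {
  // Resume from persisted state if a previous run was interrupted by a reboot.
  let state = store.load() ?? { step: "snapshot", stageIndex: 0 };
  const save = (next) => {
    state = { ...state, ...next };
    store.save(state);
  };
  if (state.step === "rolled_back") return "rolled_back";

  try {
    if (state.step === "snapshot") {
      hooks.snapshot(); // pre-patch revert point
      save({ step: "deploy" });
    }
    if (state.step === "deploy") {
      for (let i = state.stageIndex; i < stages.length; i++) {
        hooks.applyStage(stages[i]);
        if (!hooks.healthCheck(stages[i])) {
          throw new Error(`health check failed at ${stages[i]}`);
        }
        save({ stageIndex: i + 1 }); // progress survives a reboot
      }
      save({ step: "verify" });
    }
    if (state.step === "verify") {
      if (!hooks.verifyCompliance()) throw new Error("compliance verification failed");
      save({ step: "done" });
    }
    return "patched";
  } catch (err) {
    hooks.rollback(); // revert to the snapshot
    hooks.escalate(err.message); // raise a critical SIEM event
    save({ step: "rolled_back" });
    return "rolled_back";
  }
}

// In-memory store for the demo; a real agent would write to disk.
function memoryStore() {
  let saved = null;
  return { load: () => saved, save: (s) => { saved = s; } };
}

const log = [];
const hooks = {
  snapshot: () => log.push("snapshot"),
  applyStage: (stage) => log.push(`apply:${stage}`),
  healthCheck: (stage) => stage !== "fleet", // simulate a failure at the last stage
  verifyCompliance: () => true,
  rollback: () => log.push("rollback"),
  escalate: (reason) => log.push(`escalate:${reason}`),
};

const outcome = runPatch(memoryStore(), hooks, ["canary", "fleet"]);
console.log(outcome, log);
```

In this run the simulated health-check failure at the second stage triggers the rollback and the escalation, and the persisted `rolled_back` state stops a later restart from retrying the patch blindly.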

Offline Resilience

Networks go down. Agents must not lose telemetry when they do. Aegis agents use crash-safe local persistence (write-ahead log) during disconnection periods. On reconnect, a sync worker replays the backlog in chronological order, maintaining full event timeline integrity.

This matters enormously during incidents — the last thing you want is a gap in your telemetry during the exact window an attacker was active.
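
The buffering and replay behaviour can be illustrated with an append-only JSON-lines file. This is a minimal sketch of the idea, not the Aegis agent: the `TelemetryWal` class, the `ts` timestamp field, and the file name are assumptions, and it clears the log only after every event has been sent, which gives at-least-once delivery and therefore expects the receiving side to tolerate duplicates.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

class TelemetryWal {
  constructor(file) {
    this.file = file;
  }

  // Append one event as a JSON line and fsync so it survives a crash.
  append(event) {
    const fd = fs.openSync(this.file, "a");
    try {
      fs.writeSync(fd, JSON.stringify(event) + "\n");
      fs.fsyncSync(fd);
    } finally {
      fs.closeSync(fd);
    }
  }

  // Replay the backlog oldest-first, then clear the log.
  // If `send` throws, the log is left intact for the next attempt.
  replay(send) {
    if (!fs.existsSync(this.file)) return 0;
    const events = fs
      .readFileSync(this.file, "utf8")
      .split("\n")
      .filter(Boolean)
      .flatMap((line) => {
        try {
          return [JSON.parse(line)];
        } catch {
          return []; // skip a torn final line left by a crash mid-write
        }
      })
      .sort((a, b) => a.ts - b.ts); // chronological order by event timestamp
    for (const event of events) send(event);
    fs.truncateSync(this.file, 0);
    return events.length;
  }
}

// Demo: events buffered out of order while "offline", replayed in order.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "wal-demo-"));
const wal = new TelemetryWal(path.join(dir, "telemetry.wal"));
wal.append({ ts: 3, msg: "outbound connection" });
wal.append({ ts: 1, msg: "logon" });
wal.append({ ts: 2, msg: "process start" });

const sent = [];
const replayed = wal.replay((event) => sent.push(event));
console.log(replayed, sent.map((e) => e.ts)); // prints 3 [ 1, 2, 3 ]
```

Syncing after every append trades some write throughput for durability; batching several events per sync is a common compromise when event volume is high.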

What This Means for Customers

Every Kaimz MDR customer gets Aegis Sovereign included. You are not paying separately for the platform, the agents, or the detection rules. Our analysts work inside Aegis, so when we promise a 4-minute mean time to detect, that is backed by a platform purpose-built for speed.

Aegis is still in active development, but it is already deployed in production environments protecting real customers. Request a technical briefing to see it in action.