Ewake is an AI SRE agent built on a live map of your production environment. When an alert fires, or when a proactive task is triggered, ewake pulls data from across your stack, correlates signals in real time, and returns ranked root-cause hypotheses with supporting evidence in seconds, directly in Slack.
Technology: The Cartographer Agents
Before ewake can investigate an issue, it needs to understand your production environment. That’s why its first job is to draw the map. Ewake continuously scans your stack and assembles what it finds into a single, coherent live map, updated as your environment evolves. This is what turns a generic notification like “error spike on `payments-api`” into a precise diagnosis: “This looks like the same failure pattern as incident #47, triggered by a deploy on the `auth-service` dependency two hours ago.”

How It Works
Ewake’s investigation pipeline runs from raw signal to actionable output in the following steps.

Production map construction
Ingested signals are mapped into a persistent knowledge map. This map is the foundation every investigation is built on.
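The docs don’t spell out the map’s internal schema. As a mental model, though, you can picture a graph of entities (services, deploys, alerts, incidents) joined by timestamped relationships; the timestamps are what let an investigation say “deploy 12 minutes before the spike.” A minimal sketch, with every name invented for illustration:

```python
from dataclasses import dataclass, field
from datetime import datetime

# Illustrative only: ewake's real schema is not public.
@dataclass
class Entity:
    kind: str                # "service" | "deploy" | "alert" | "incident"
    name: str                # e.g. "payments-api"
    attrs: dict = field(default_factory=dict)

@dataclass
class Edge:
    src: str                 # entity name
    dst: str                 # entity name
    relation: str            # "depends_on" | "deployed_to" | "fired_on"
    observed_at: datetime    # when the relationship was seen

@dataclass
class ProductionMap:
    entities: dict = field(default_factory=dict)  # name -> Entity
    edges: list = field(default_factory=list)     # list of Edge

    def neighbors(self, name: str, relation: str) -> list:
        """Entities reachable from `name` via `relation` (e.g. upstream deps)."""
        return [e.dst for e in self.edges if e.src == name and e.relation == relation]
```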
Trigger and correlation
When a trigger fires, ewake pulls relevant context from the knowledge map and runs correlation across signals.
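At its simplest, correlation is a time-windowed join between the triggering alert and recent change events on the affected service and its upstream dependencies. Continuing the sketch above (the two-hour window and relation names are assumptions, not ewake’s actual logic):

```python
from datetime import datetime, timedelta

def correlate(alert_service: str, alert_time: datetime, prod_map: ProductionMap,
              window: timedelta = timedelta(hours=2)) -> list:
    """Find deploys that landed on the alerting service, or on anything it
    depends on, within `window` before the alert fired."""
    suspects = [alert_service] + prod_map.neighbors(alert_service, "depends_on")
    hits = [e for e in prod_map.edges
            if e.relation == "deployed_to"
            and e.dst in suspects
            and alert_time - window <= e.observed_at <= alert_time]
    # Most recent change first: it is usually the strongest suspect.
    return sorted(hits, key=lambda e: e.observed_at, reverse=True)
```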
Hypothesis ranking
Ewake produces a ranked list of probable root causes, each backed by direct evidence.
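A ranked hypothesis is only useful if its evidence travels with it. One plausible shape for that output, again purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str    # "datadog" | "github" | "past_incident"
    summary: str   # e.g. "latency degrades from deploy time"
    url: str       # deep link to the log line, chart, or commit

@dataclass
class Hypothesis:
    cause: str     # e.g. "token validation change in auth-service"
    score: float   # relative confidence, used only for ordering
    evidence: list # list of Evidence backing this hypothesis

def rank(hypotheses: list) -> list:
    # Highest-scoring hypothesis first; evidence links stay attached.
    return sorted(hypotheses, key=lambda h: h.score, reverse=True)
```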
Suggestion-first, not autonomous
Every output is a hypothesis for your engineer to validate:
- You see why: every hypothesis links directly to the evidence that produced it
- You can override: corrections are captured and improve future investigations
- You stay in control: ewake never silences alerts, restarts services, or modifies code without explicit human approval
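One way to read this contract is as a hard gate between proposing and acting: nothing mutating runs without a named approver. A sketch of that policy (function and fields are hypothetical):

```python
from typing import Optional

def apply_action(action: dict, approved_by: Optional[str]) -> dict:
    """Gate mutating actions (silence an alert, restart a service, merge a
    revert) behind an explicit human approver."""
    if approved_by is None:
        # No approval yet: the action stays a suggestion in the Slack thread.
        return {"status": "suggested", "action": action}
    return {"status": "executed", "action": action, "approved_by": approved_by}
```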
A first look at ewake
On-Call Agent: alert fires at 3am
A Datadog alert fires on `payments-api`. Before your on-call engineer opens their laptop, ewake has already posted in the alert thread:
- Probable cause: error spike on `payments-api` correlates with a deploy on `auth-service` 12 minutes ago
- Evidence: 3 relevant log lines, latency chart showing degradation from deploy time, link to the commit
- Suggested next steps: open a PR to revert the token validation change in `auth-service`, or inspect the new logic in the linked commit before deciding
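Mechanically, a reply like this is just a threaded Slack message. A minimal sketch using slack_sdk (the token, channel, and message text are placeholders; ewake’s actual integration isn’t shown here):

```python
from slack_sdk import WebClient

client = WebClient(token="xoxb-...")  # placeholder bot token

def post_findings(channel: str, alert_thread_ts: str, findings: str) -> None:
    # Replying with thread_ts keeps the diagnosis attached to the alert,
    # so the on-call engineer sees it in context rather than in a new channel.
    client.chat_postMessage(channel=channel, thread_ts=alert_thread_ts, text=findings)
```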
Scheduled Tasks: morning health brief
Every day at 8am, before standup, ewake posts a structured report to `#monitoring`:
- System health: all services green except `checkout-service` (error rate +12% vs yesterday)
- Anomalies: latency spike on `inventory-api` between 02:00–03:30 UTC, self-resolved
- Notable changes: 2 deploys overnight on `pricing-service`, no correlated degradation
- Risk signal: `checkout-service` has shown this pattern 3 times in the past 30 days, each time preceding a full incident
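A brief like this is a scheduled job that renders map queries into one message. A toy renderer (section contents are lifted from the example above; in a real system each line would come from a query over the live map):

```python
def render_brief(sections: dict) -> str:
    """Format a morning health brief as a single Slack message."""
    lines = []
    for title, items in sections.items():
        lines.append(f"*{title}*")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)

print(render_brief({
    "System health": ["all green except checkout-service (error rate +12% vs yesterday)"],
    "Anomalies": ["latency spike on inventory-api 02:00-03:30 UTC, self-resolved"],
}))
```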
Incident Response: war room, P0 in progress
A P0 is ongoing. The team is in a war room channel, debating what’s causing a cascade. Someone mentions @ewake:

> @ewake what’s causing the latency spike on checkout-service? Correlate with recent deploys.

Ewake responds in seconds with a ranked breakdown, most likely cause first, each hypothesis linked to concrete evidence from Datadog, GitHub, and past incidents. The team stops guessing and starts acting.