SPOFs & Data Gravity
Identify Single Points of Failure, data monoliths, and service bottlenecks across your architecture — ranked by a 0–100 SPOF score.
The SPOFs domain of the Architecture Dashboard identifies the most dangerous concentration points in your architecture: resources and services where a single failure would cascade across the organization. It answers the question every Principal Engineer dreads: "If this one thing breaks, how many teams feel it?"
# Full dashboard
cr ui
# SPOFs domain only
cr ui --focus gravity
What It Detects
The Gravity analysis engine traverses the architecture graph to identify two classes of architectural risk:
Data Monoliths
A Data Monolith is a shared data resource (database table, message channel, API endpoint) that is accessed by a disproportionate number of services. When this resource changes schema, experiences downtime, or requires migration, the blast radius is massive.
Examples:
- A `users` table read by 14 services and written by 3
- An `order.created` message channel consumed by 9 downstream services
- An internal API endpoint called by every service in the fleet
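As a minimal sketch of this kind of concentration check, the Python below groups access edges by resource and counts the distinct services reading from or writing to each one. The edge tuple format, the `summarize_access` name, and the sample data are assumptions made for illustration; they are not the CodeRadius graph schema.

```python
from collections import defaultdict

# Hypothetical edge format: (service, access_mode, resource).
# This is an illustrative assumption, not the CodeRadius graph schema.
EDGES = [
    ("billing", "read", "users"),
    ("search", "read", "users"),
    ("profile", "write", "users"),
    ("profile", "read", "users"),
    ("orders", "write", "order.created"),
    ("email", "read", "order.created"),
]


def summarize_access(edges):
    """Return {resource: {"readers": set, "writers": set}} from access edges."""
    summary = defaultdict(lambda: {"readers": set(), "writers": set()})
    for service, mode, resource in edges:
        key = "readers" if mode == "read" else "writers"
        summary[resource][key].add(service)
    return dict(summary)


summary = summarize_access(EDGES)

# Rank resources by the number of distinct services touching them.
ranked = sorted(
    summary,
    key=lambda r: len(summary[r]["readers"] | summary[r]["writers"]),
    reverse=True,
)
print(ranked)
```

Using sets means a service that both reads and writes the same resource is counted once when ranking.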
Service Bottlenecks
A Service Bottleneck is a service that sits on a critical path in the dependency graph — either as a heavily-consumed API provider or as the sole writer to a shared resource. When this service goes down or introduces a breaking change, the downstream impact is disproportionate.
Examples:
- An `identity-service` that 20 other services depend on for authentication
- A `payments-gateway` that is the sole producer for 5 downstream event consumers
- A shared utility service with 40+ downstream function callers
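The two bottleneck shapes described above (a heavily consumed provider and a sole writer) can be sketched as two small graph queries. The edge lists, the function names, and the `payment.settled` resource are hypothetical; only the service names come from the examples above.

```python
from collections import defaultdict

# Hypothetical dependency edges: (consumer_service, provider_service).
DEPENDS_ON = [
    ("checkout", "identity-service"),
    ("orders", "identity-service"),
    ("search", "identity-service"),
    ("checkout", "payments-gateway"),
]

# Hypothetical write edges: (writer_service, resource).
WRITES = [
    ("payments-gateway", "payment.settled"),
    ("orders", "orders"),
    ("checkout", "orders"),
]


def dependents_by_provider(edges):
    """Map each provider to the set of distinct services that depend on it."""
    dependents = defaultdict(set)
    for consumer, provider in edges:
        dependents[provider].add(consumer)
    return dict(dependents)


def sole_writers(write_edges):
    """Map each resource that has exactly one writer to that writer."""
    writers = defaultdict(set)
    for service, resource in write_edges:
        writers[resource].add(service)
    return {res: next(iter(svcs)) for res, svcs in writers.items() if len(svcs) == 1}


dependents = dependents_by_provider(DEPENDS_ON)
print(len(dependents["identity-service"]))
print(sole_writers(WRITES))
```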
The SPOF Score
Every data monolith and service bottleneck is assigned a SPOF Score from 0 to 100, computed from the topology of the architecture graph:
| Score Range | Risk Level | Interpretation |
|---|---|---|
| 80–100 | 🔴 Critical | A failure here would cascade across multiple teams. Requires redundancy, circuit breakers, or architectural refactoring. |
| 50–79 | 🟡 Elevated | Significant concentration of dependents. Monitor closely and plan mitigation. |
| 20–49 | 🟢 Moderate | Normal architectural concentration. Acceptable for most systems. |
| 0–19 | ⚪ Low | Minimal concentration risk. |
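For scripts that post-process scores, the bands in the table can be expressed as a small lookup. The `risk_level` function name is an assumption; the thresholds are taken directly from the table.

```python
def risk_level(score: int) -> str:
    """Map a 0-100 SPOF score to the risk level named in the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"SPOF score out of range: {score}")
    if score >= 80:
        return "Critical"
    if score >= 50:
        return "Elevated"
    if score >= 20:
        return "Moderate"
    return "Low"


for sample in (95, 79, 20, 0):
    print(sample, risk_level(sample))
```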
The score factors in:
- Fan-in count — How many distinct services read from or depend on this resource
- Fan-out count — How many distinct services write to or produce for this resource
- Cross-team span — Whether the dependents span multiple teams (higher risk than single-team concentration)
- Resource type weight — Database tables carry higher risk than API endpoints due to schema coupling
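The source lists these factors but not the formula that combines them. The sketch below shows one plausible way to do so, purely to make the interaction of the factors concrete. Every weight and multiplier, the `sketch_spof_score` name, and the position of `MessageChannel` between the other two types are assumptions, not the CodeRadius implementation.

```python
# All weights below are illustrative assumptions; the documentation names the
# factors but does not publish the formula CodeRadius uses.
TYPE_WEIGHT = {"DataTable": 1.0, "MessageChannel": 0.8, "APIEndpoint": 0.6}


def sketch_spof_score(fan_in, fan_out, team_count, resource_type):
    """Combine the four listed factors into a 0-100 value (hypothetical)."""
    base = 4 * fan_in + 2 * fan_out  # concentration of readers and writers
    if team_count > 1:  # cross-team span raises risk
        base *= 1.0 + 0.1 * (team_count - 1)
    score = base * TYPE_WEIGHT[resource_type]  # tables outweigh API endpoints
    return min(100, round(score))


# Same topology, different resource types.
print(sketch_spof_score(14, 3, 4, "DataTable"))
print(sketch_spof_score(14, 3, 4, "APIEndpoint"))
```

With identical fan-in, fan-out, and team span, the table scores higher than the endpoint, which mirrors the schema-coupling rationale given for the resource type weight.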
Dashboard Rendering
The SPOFs domain renders two leaderboard sections:
Top Data Monoliths
A ranked list of the most concentrated data resources, ordered by SPOF score. Each entry shows:
| Field | Description |
|---|---|
| Name | The resource name (e.g., `orders`, `user.created`) |
| Type | The resource type (`DataTable`, `MessageChannel`, `APIEndpoint`) |
| SPOF Score | The 0–100 risk score |
| Teams | Teams that own services consuming or producing for this resource |
| Write Services | Services that write to this resource |
| Read Services | Services that read from this resource |
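As a rough model of one leaderboard row, the fields above map naturally onto a small record type, and the ranking is a sort by score. The class name, attribute names, and sample scores are assumptions for illustration; they are not a documented CodeRadius type.

```python
from dataclasses import dataclass, field


@dataclass
class MonolithEntry:
    """One row of the Top Data Monoliths leaderboard (field names assumed)."""
    name: str
    type: str
    spof_score: int
    teams: list = field(default_factory=list)
    write_services: list = field(default_factory=list)
    read_services: list = field(default_factory=list)


def leaderboard(entries, limit=10):
    """Order entries by SPOF score, highest first, and keep the top `limit`."""
    return sorted(entries, key=lambda e: e.spof_score, reverse=True)[:limit]


# Sample scores are made up for the example.
entries = [
    MonolithEntry("order.created", "MessageChannel", 64),
    MonolithEntry("users", "DataTable", 91, teams=["identity", "billing"]),
    MonolithEntry("audit_log", "DataTable", 30),
]
top = leaderboard(entries, limit=2)
print([e.name for e in top])
```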
Top Service Bottlenecks
A ranked list of the most concentrated services, ordered by SPOF score. Each entry has the same fields as a data monolith entry, except that a single Dependent Services field replaces Write Services and Read Services.
Header Stats
When the SPOFs domain is active, the header stats bar shows:
| Metric | Description |
|---|---|
| Data Monoliths | Total number of data resources identified as concentration risks |
| Service Bottlenecks | Total number of services identified as dependency bottlenecks |
The global header bar contributes:
| Metric | Description |
|---|---|
| Known Vulnerabilities | Count of resources or services with SPOF score ≥ 80 (critical risk) |
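The three metrics above reduce to two counts and one threshold filter. The sketch below assumes the scores are already available as plain lists; the `header_stats` name and the dictionary keys are hypothetical.

```python
CRITICAL_THRESHOLD = 80  # per the metric definition: SPOF score >= 80


def header_stats(monolith_scores, bottleneck_scores):
    """Compute the header metrics from two lists of SPOF scores."""
    all_scores = list(monolith_scores) + list(bottleneck_scores)
    return {
        "data_monoliths": len(monolith_scores),
        "service_bottlenecks": len(bottleneck_scores),
        "known_vulnerabilities": sum(s >= CRITICAL_THRESHOLD for s in all_scores),
    }


# Sample scores are made up for the example.
stats = header_stats([91, 64, 30], [85, 45])
print(stats)
```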
Interpreting the Results
When a High SPOF Score Is Acceptable
Not every high score requires action. Some architectural patterns naturally create concentration:
- Authentication services — It is normal for an identity service to be a bottleneck. Mitigation here is operational (redundancy, caching, circuit breakers) rather than architectural.
- Event buses — A central message broker will naturally concentrate connections. The risk is in the bus itself, not the topology.
- Shared databases in legacy monoliths — If you are already migrating toward microservices, the data monolith will shrink over time.
When to Act
Prioritize action when:
- Cross-team SPOF score > 80 — A resource owned by Team A is depended on by Teams B, C, and D with no formal data contract
- No circuit breakers — The dependent services have no graceful degradation for the bottleneck
- Schema coupling — Multiple services read directly from a shared table without an abstraction layer
- No ownership — The bottleneck resource has no clear owner (see Governance `gp-004`)
Mitigation Strategies
| Pattern | Strategy |
|---|---|
| Shared database | Introduce a data contract API. Let one service own the schema and expose controlled read/write endpoints. |
| Central API bottleneck | Add caching, circuit breakers, and bulkhead isolation. Consider event-driven decoupling for non-real-time consumers. |
| Single-writer resource | Evaluate whether the producer can be horizontally scaled or if the resource needs a redundant writer. |
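To make the circuit breaker strategy concrete, here is a minimal sketch of one in Python. It is a generic illustration of the pattern and not part of CodeRadius; the class name, parameters, and defaults are assumptions, and a production system would more likely use an established resilience library.

```python
import time


class CircuitBreaker:
    """Open after N consecutive failures, then retry after a cooldown."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the cooldown can be tested
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        # While open, skip the dependency until the cooldown has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0  # a success closes the circuit again
        return result
```

A dependent service would wrap each call to the bottleneck, for example `breaker.call(fetch_profile, lambda: cached_profile)`, where `fetch_profile` and `cached_profile` are placeholders for the caller's own request function and degraded response.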
Programmatic Access
The Gravity analysis is also available programmatically via the MCP server:
# Query the gravity analysis via the MCP tool
cr chat
> "Which resources in our architecture have the highest SPOF scores?"
The `analyze_architecture_gravity` MCP tool returns the same data as the dashboard in structured JSON format, making it available to AI agents during their reasoning loops.
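The JSON schema of the tool's response is not documented here, so the following is only a sketch of how a script might consume it. Every key in the sample payload (`data_monoliths`, `service_bottlenecks`, `name`, `spof_score`), the sample values, and the `critical_nodes` helper are assumptions; check the actual tool output before relying on any of them.

```python
import json

# Hypothetical payload: the tool is said to return structured JSON, but its
# schema is not documented, so every key and value here is an assumption.
SAMPLE_RESPONSE = """
{
  "data_monoliths": [
    {"name": "users", "type": "DataTable", "spof_score": 91},
    {"name": "order.created", "type": "MessageChannel", "spof_score": 64}
  ],
  "service_bottlenecks": [
    {"name": "identity-service", "spof_score": 85}
  ]
}
"""


def critical_nodes(response_text, threshold=80):
    """Return names of entries at or above the threshold, highest score first."""
    payload = json.loads(response_text)
    entries = payload.get("data_monoliths", []) + payload.get("service_bottlenecks", [])
    critical = [e for e in entries if e["spof_score"] >= threshold]
    critical.sort(key=lambda e: e["spof_score"], reverse=True)
    return [e["name"] for e in critical]


print(critical_nodes(SAMPLE_RESPONSE))
```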
Further Reading
- Architecture Dashboard — The unified dashboard containing the SPOFs domain
- Impact Evaluation — Predict the blast radius of changes to high-SPOF resources
- MCP Server — `analyze_architecture_gravity` — Programmatic access to gravity data