SPOFs & Data Gravity

Identify Single Points of Failure, data monoliths, and service bottlenecks across your architecture — ranked by a 0–100 SPOF score.

The SPOFs domain of the Architecture Dashboard identifies the most dangerous concentration points in your architecture: resources and services where a single failure would cascade across the organization. It answers the question every Principal Engineer dreads: "If this one thing breaks, how many teams feel it?"

# Full dashboard
cr ui

# SPOFs domain only
cr ui --focus gravity

What It Detects

The Gravity analysis engine traverses the architecture graph to identify two classes of architectural risk:

Data Monoliths

A Data Monolith is a shared data resource (database table, message channel, API endpoint) that is accessed by a disproportionate number of services. When this resource changes schema, experiences downtime, or requires migration, the blast radius is massive.

Examples:

  • A users table read by 14 services and written by 3
  • An order.created message channel consumed by 9 downstream services
  • An internal API endpoint called by every service in the fleet

Service Bottlenecks

A Service Bottleneck is a service that sits on a critical path in the dependency graph, either as a heavily consumed API provider or as the sole writer to a shared resource. When this service goes down or introduces a breaking change, the downstream impact is disproportionate.

Examples:

  • An identity-service that 20 other services depend on for authentication
  • A payments-gateway that is the sole producer for 5 downstream event consumers
  • A shared utility service with 40+ downstream function callers
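
Both risk classes come down to counting the distinct services attached to each resource. The sketch below illustrates that idea only: the edge-list shape, the service names, and the helper functions are assumptions made for this example, not the CodeRadius graph schema or its detection logic.

```python
from collections import defaultdict

# Hypothetical edge list: (service, access, resource).
# The real architecture graph format is not documented on this page.
EDGES = [
    ("billing", "read", "users"),
    ("billing", "write", "invoices"),
    ("checkout", "read", "users"),
    ("identity-service", "write", "users"),
    ("notifications", "read", "users"),
]


def access_counts(edges):
    """Group the distinct reading and writing services by resource."""
    counts = defaultdict(lambda: {"read": set(), "write": set()})
    for service, access, resource in edges:
        counts[resource][access].add(service)
    return dict(counts)


def sole_writers(counts):
    """Map each resource with exactly one writer to that writing service."""
    return {
        resource: next(iter(access["write"]))
        for resource, access in counts.items()
        if len(access["write"]) == 1
    }


counts = access_counts(EDGES)
readers_of_users = len(counts["users"]["read"])  # candidate data monolith signal
single_writers = sole_writers(counts)            # candidate bottleneck signal
```

A resource with many distinct readers is a data monolith candidate, and a service that appears as the only writer for one or more resources is a bottleneck candidate.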

The SPOF Score

Every data monolith and service bottleneck is assigned a SPOF Score from 0 to 100, computed from the topology of the architecture graph:

| Score Range | Risk Level | Interpretation |
| --- | --- | --- |
| 80–100 | 🔴 Critical | A failure here would cascade across multiple teams. Requires redundancy, circuit breakers, or architectural refactoring. |
| 50–79 | 🟡 Elevated | Significant concentration of dependents. Monitor closely and plan mitigation. |
| 20–49 | 🟢 Moderate | Normal architectural concentration. Acceptable for most systems. |
| 0–19 | ⚪ Low | Minimal concentration risk. |
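
If you post-process scores in your own tooling, the bands above translate directly into a lookup. The function name `risk_level` is an assumption for this sketch; only the thresholds come from the table.

```python
def risk_level(score: int) -> str:
    """Map a 0-100 SPOF score to the risk level named in the table."""
    if not 0 <= score <= 100:
        raise ValueError(f"SPOF score must be between 0 and 100, got {score}")
    if score >= 80:
        return "Critical"
    if score >= 50:
        return "Elevated"
    if score >= 20:
        return "Moderate"
    return "Low"
```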

The score factors in:

  • Fan-in count — How many distinct services read from or depend on this resource
  • Fan-out count — How many distinct services write to or produce for this resource
  • Cross-team span — Whether the dependents span multiple teams (higher risk than single-team concentration)
  • Resource type weight — Database tables carry higher risk than API endpoints due to schema coupling
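
This page does not publish the scoring formula, so the following is a toy model only. It shows one plausible way the four factors could combine into a bounded 0–100 value. Every weight, the saturation curve, and the name `toy_spof_score` are invented for illustration and will not reproduce the scores CodeRadius reports.

```python
import math

# Hypothetical weights, NOT CodeRadius values. They only encode the stated
# ordering: database tables carry more risk than API endpoints.
TYPE_WEIGHT = {"DataTable": 1.0, "MessageChannel": 0.8, "APIEndpoint": 0.6}


def toy_spof_score(fan_in: int, fan_out: int, team_count: int, resource_type: str) -> int:
    """Combine the four documented factors into an illustrative 0-100 score."""
    # Saturating curve: each extra dependent adds less than the previous one.
    concentration = 1.0 - math.exp(-(fan_in + 0.5 * fan_out) / 10.0)
    # Dependents spanning several teams are treated as riskier than one team.
    team_factor = 1.0 if team_count > 1 else 0.6
    raw = 100.0 * concentration * team_factor * TYPE_WEIGHT[resource_type]
    return round(min(100.0, max(0.0, raw)))
```

The saturating curve keeps the result inside the 0–100 range no matter how many dependents a resource has, which a plain weighted sum would not guarantee.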

Dashboard Rendering

The SPOFs domain renders two leaderboard sections:

Top Data Monoliths

A ranked list of the most concentrated data resources, ordered by SPOF score. Each entry shows:

| Field | Description |
| --- | --- |
| Name | The resource name (e.g., orders, user.created) |
| Type | The resource type (DataTable, MessageChannel, APIEndpoint) |
| SPOF Score | The 0–100 risk score |
| Teams | Teams that own services consuming or producing for this resource |
| Write Services | Services that write to this resource |
| Read Services | Services that read from this resource |

Top Service Bottlenecks

A ranked list of the most concentrated services, ordered by SPOF score. Each entry shows the same fields as a data monolith entry, except that a single Dependent Services field replaces Write Services and Read Services.
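
To make the leaderboard shape concrete, here is a minimal sketch of a data monolith entry and the ranking step. The class name, attribute names, and the `limit` parameter are assumptions that mirror the fields listed above; they are not the dashboard's internal types.

```python
from dataclasses import dataclass, field


@dataclass
class MonolithEntry:
    """One leaderboard row; attribute names mirror the documented fields."""
    name: str
    type: str  # "DataTable", "MessageChannel", or "APIEndpoint"
    spof_score: int
    teams: list = field(default_factory=list)
    write_services: list = field(default_factory=list)
    read_services: list = field(default_factory=list)


def leaderboard(entries, limit=10):
    """Return the highest-scoring entries first, truncated to `limit` rows."""
    return sorted(entries, key=lambda e: e.spof_score, reverse=True)[:limit]


entries = [
    MonolithEntry("user.created", "MessageChannel", 64),
    MonolithEntry("orders", "DataTable", 88, teams=["checkout", "billing"]),
    MonolithEntry("audit_log", "DataTable", 23),
]
top_two = [e.name for e in leaderboard(entries, limit=2)]
```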


Header Stats

When the SPOFs domain is active, the header stats bar shows:

| Metric | Description |
| --- | --- |
| Data Monoliths | Total number of data resources identified as concentration risks |
| Service Bottlenecks | Total number of services identified as dependency bottlenecks |

The global header bar contributes:

| Metric | Description |
| --- | --- |
| Known Vulnerabilities | Count of resources or services with SPOF score ≥ 80 (critical risk) |
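
The three header metrics are simple aggregates over the findings. A minimal sketch, assuming the already-identified monoliths and bottlenecks are available as name-to-score mappings (the function name and input shape are hypothetical):

```python
CRITICAL_MIN = 80  # "Known Vulnerabilities" counts scores at or above this value


def header_stats(monolith_scores: dict, bottleneck_scores: dict) -> dict:
    """Compute the header metrics from name -> SPOF score mappings."""
    all_scores = list(monolith_scores.values()) + list(bottleneck_scores.values())
    return {
        "Data Monoliths": len(monolith_scores),
        "Service Bottlenecks": len(bottleneck_scores),
        "Known Vulnerabilities": sum(1 for s in all_scores if s >= CRITICAL_MIN),
    }


stats = header_stats(
    {"users": 91, "orders": 80, "audit_log": 35},
    {"identity-service": 84, "payments-gateway": 79},
)
```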

Interpreting the Results

When a High SPOF Score Is Acceptable

Not every high score requires action. Some architectural patterns naturally create concentration:

  • Authentication services — It is normal for an identity service to be a bottleneck. Mitigation here is operational (redundancy, caching, circuit breakers) rather than architectural.
  • Event buses — A central message broker will naturally concentrate connections. The risk is in the bus itself, not the topology.
  • Shared databases in legacy monoliths — If you are already migrating toward microservices, the data monolith will shrink over time.

When to Act

Prioritize action when:

  1. Cross-team SPOF score ≥ 80 — A resource owned by Team A is depended on by Teams B, C, and D with no formal data contract
  2. No circuit breakers — The dependent services have no graceful degradation for the bottleneck
  3. Schema coupling — Multiple services read directly from a shared table without an abstraction layer
  4. No ownership — The bottleneck resource has no clear owner (see Governance gp-004)
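
The checklist above can be applied mechanically during a review. The sketch below is one way to do that. The `Finding` fields are assumptions about what a reviewer would record by hand, since the page does not say that CodeRadius detects circuit breakers or data contracts itself.

```python
from dataclasses import dataclass
from typing import List, Optional

CRITICAL_MIN = 80  # lower bound of the Critical band in the score table


@dataclass
class Finding:
    """Hypothetical review record for one resource or service."""
    name: str
    spof_score: int
    dependent_teams: int        # number of distinct teams that depend on it
    has_data_contract: bool
    has_circuit_breakers: bool
    direct_table_readers: int   # services reading the table with no abstraction layer
    owner: Optional[str]


def reasons_to_act(f: Finding) -> List[str]:
    """Return the checklist items that apply to a finding, in list order."""
    reasons = []
    if f.spof_score >= CRITICAL_MIN and f.dependent_teams > 1 and not f.has_data_contract:
        reasons.append("critical cross-team dependency without a data contract")
    if not f.has_circuit_breakers:
        reasons.append("no circuit breakers")
    if f.direct_table_readers > 1:
        reasons.append("schema coupling")
    if f.owner is None:
        reasons.append("no ownership")
    return reasons
```

A finding that returns an empty list can usually stay on the watch list, while one that returns several reasons is a candidate for the mitigation work described next.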

Mitigation Strategies

| Pattern | Strategy |
| --- | --- |
| Shared database | Introduce a data contract API. Let one service own the schema and expose controlled read/write endpoints. |
| Central API bottleneck | Add caching, circuit breakers, and bulkhead isolation. Consider event-driven decoupling for non-real-time consumers. |
| Single-writer resource | Evaluate whether the producer can be horizontally scaled or if the resource needs a redundant writer. |
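
For readers unfamiliar with the circuit breaker pattern mentioned above, here is a minimal, single-threaded sketch of the idea. It is a generic illustration rather than a CodeRadius feature: the class, its parameters, and the default values are all assumptions, and a production system would normally use a maintained resilience library instead.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial call
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of piling more load on the bottleneck.
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A dependent service would wrap each call to the bottleneck, for example `breaker.call(fetch_user, user_id)` with a hypothetical `fetch_user`, and serve a cached or degraded response when `CircuitOpenError` is raised.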

Programmatic Access

The Gravity analysis is also available programmatically via the MCP server:

# Query the gravity analysis via the MCP tool
cr chat
> "Which resources in our architecture have the highest SPOF scores?"

The analyze_architecture_gravity MCP tool returns the same data as the dashboard in structured JSON format, making it available to AI agents during their reasoning loops.

