Enterprise Readiness Matrix¶

HermesX — Enterprise Agent Runtime / SaaS Control Plane Version: v1.3.0 | Last Updated: 2026-05-07

1. Multi-Tenancy¶

Capability: Full tenant isolation across all data paths
Status: Done
Evidence: - 10 tables with Row-Level Security (RLS) enabled - 37 RLS policies enforcing app.current_tenant session variable - withTenantTx() helper sets SET LOCAL app.current_tenant per transaction - TenantMiddleware derives tenant_id from AuthContext (not headers) - 58 integration tests validating cross-tenant isolation - Dedicated test suites: tenant_isolation_test.go, rls_policy_test.go, cross_tenant_attack_test.go

Risk: No automated RLS regression test in CI (requires live PG)
Next Action: CI integration test job with PG service already wired

2. Auth / API Key / RBAC¶

Capability: Chain authentication with scoped API keys and role-based access
Status: Done
Evidence: - Auth chain: Static Token → API Key (SHA-256 hashed) → JWT - API Key supports: scopes, roles, expiry, revocation - Roles: super_admin, admin, owner, user, auditor - HasScope() enforces scope-based access per endpoint - Tenant boundary enforcement: non-admin callers cannot specify foreign tenant_id - generateRawKey() panics on crypto/rand.Read failure

Risk: JWT validation is prepared but not production-tested with real IdP
Next Action: OIDC integration test with Keycloak/Auth0

3. Rate Limit / Quota¶

Capability: Per-tenant and per-user rate limiting with distributed enforcement
Status: Done
Evidence: - RateLimiter interface with Redis sliding window (Lua atomic script) - DualLayerLimiter for simultaneous tenant + user limits - Local LRU fallback when Redis unavailable - Per-tenant override via TenantLimitFn - Prometheus counter: hermes_rate_limit_rejected_total - Pressure tested: 100 concurrent × 5 minutes, accuracy > 95%

Risk: No per-endpoint granularity (all requests share one bucket)
Next Action: Endpoint-aware rate limiting (P2)

4. Metering / Billing¶

Capability: Token-level usage recording with aggregation queries
Status: Done
Evidence: - UsageRecorder with async batch persistence (buffered channel + periodic flush) - usage_records table: tenant_id, user_id, session_id, model, input/output tokens, cost - UsageV2Handler: GET /v1/usage with from/to/granularity (hour/day/month) - Per-tenant aggregation queries with time bucketing - Migrations v62-64 for schema + indexes

Risk: No billing system integration (recording only, no invoicing)
Next Action: Stripe/billing webhook integration (P2)

5. Audit / Compliance¶

Capability: Immutable audit trail for all state-changing operations
Status: Done
Evidence: - audit_logs table with RLS - All POST/PUT/DELETE operations generate audit entries - Fields: actor, action, resource_type, resource_id, metadata, tenant_id, timestamp - AuditLogStore with Create/List/filtering - GET /v1/audit-logs gated behind auditor role - Execution receipts provide tool-level audit trail

Risk: No tamper-proof guarantee (append-only but not cryptographically chained)
Next Action: Optional hash-chain verification for regulated environments (P2)

Capability: Full data export and deletion per tenant
Status: Done
Evidence: - GET /v1/gdpr/export — exports all tenant data as JSON - DELETE /v1/gdpr/delete — transactional deletion of sessions, messages, memories, api_keys, audit_logs, cron_jobs - Deletion cascades through all related tables in single transaction - Admin-only access control

Risk: No data retention policy engine (manual deletion only)
Next Action: Automated retention sweep based on tenant configuration (P2)

7. Observability¶

Capability: Full-stack metrics, tracing, and structured logging
Status: Done
Evidence: - Prometheus metrics: 11 custom business metrics (HTTP, LLM, tools, rate limiting, sessions, store) - OpenTelemetry tracing: HTTP → middleware → store → LLM full chain - PGX tracer for database query spans - OTel Collector config: traces → Jaeger, metrics → Prometheus - Structured JSON logging via slog - Memory limiter (512MB) on collector - Alert rule examples in deployment guide

Risk: No pre-built Grafana dashboards
Next Action: Dashboard JSON templates (P1)

8. Sandbox Isolation¶

Capability: Isolated code execution with resource limits
Status: Done
Evidence: - Local sandbox: process isolation with timeout, output truncation (50KB), env stripping - Docker sandbox: --network=none, --memory, --cpus limits - Per-tenant SandboxPolicy (JSONB on tenants table): enabled, max_timeout, allowed_tools, restrict_network - AllowedTools enforcement: non-whitelisted tools rejected - Max tool calls limit (default 50) - Skill metadata sandbox: required triggers Docker execution

Risk: Docker sandbox requires Docker-in-Docker or socket mount in containerized deploys
Next Action: gVisor/Firecracker evaluation for production (P2)

9. Backup / Disaster Recovery¶

Capability: Automated backup with point-in-time restore capability
Status: Done
Evidence: - scripts/backup/backup.sh: pg_dump + gzip, 7-day retention, configurable output dir - scripts/backup/restore.sh: single-transaction restore with post-migration - deploy/pitr/ templates for WAL archiving - Production compose includes volume persistence for PG/Redis

Risk: No automated DR drill in CI; RPO depends on backup frequency
Next Action: Automated weekly restore verification (P1)

10. CI / Security¶

Capability: Automated build, test, security scan pipeline
Status: Done
Evidence: - .github/workflows/ci.yml: build + vet + test + coverage + race detection + Docker push - .github/workflows/security.yml: govulncheck + gosec + trivy (weekly + PR) - Integration test job with PG/Redis/MinIO services - Build matrix: linux/darwin × amd64/arm64 + windows/amd64 - 21 test packages, all passing

Risk: No DAST or container runtime scanning
Next Action: Add OWASP ZAP scan against deployed instance (P2)

11. Known Risks¶

Risk	Severity	Mitigation
JWT/OIDC not production-tested	Medium	Auth chain works; needs IdP integration test
No billing integration	Low	Usage recording is complete; billing is business logic
Docker sandbox in containers	Medium	Local sandbox is always available as fallback
Single PG writer (no read replicas)	Low	Sufficient for < 500 req/s; PgBouncer for connection pooling
No Grafana dashboards	Low	Metrics exposed; dashboards are configuration

12. Roadmap¶

v1.4.0 (P1 — Next)¶

[ ] OIDC integration test with real IdP
[ ] Grafana dashboard templates
[ ] Automated DR verification in CI
[ ] Endpoint-aware rate limiting

v2.0.0 (P2 — Future)¶

[ ] Billing/invoicing webhook
[ ] Data retention policy engine
[ ] gVisor sandbox backend
[ ] Read replica support
[ ] Multi-region deployment guide

Store Interface Coverage¶

The Store interface covers all core SaaS state objects:

Sub-Store	Operations	RLS
Sessions	Create, Get, List, Delete, AppendMessage, ListMessages	Yes
Tenants	Create, Get, List, Update, Delete	Yes
APIKeys	Create, Get, List, Revoke, GetByHash	Yes
AuditLogs	Create, List	Yes
Memories	Set, Get, List, Delete	Yes
UserProfiles	Get, Set, Delete	Yes
CronJobs	Create, Get, List, Update, Delete	Yes
Roles	Assign, Revoke, List	Yes
ExecutionReceipts	Create, Get, List, GetByIdempotencyID	Yes

API Surface¶

22 documented endpoints across: - Health (3): /v1/health, /v1/health/live, /v1/health/ready - Chat (1): /v1/chat/completions (OpenAI-compatible) - Sessions (1): /v1/sessions - Tenants (1): /v1/tenants - API Keys (1): /v1/api-keys - Audit (1): /v1/audit-logs - Execution Receipts (1): /v1/execution-receipts - Usage (1): /v1/usage - GDPR (2): /v1/gdpr/export, /v1/gdpr/delete - Metrics (1): /v1/metrics - OpenAPI (1): /v1/openapi - Admin (4): /admin/v1/tenants, /admin/v1/sandbox-policy, etc. - Me (1): /v1/me

Full OpenAPI 3.0.3 spec available at GET /v1/openapi.