ligbox-ops-platform/specs/009-ops-audit-overview/plan.md

# Implementation Plan: Audit Overview Dashboard (009)

**Branch**: `009-ops-audit-overview` | **Date**: 2026-06-08 | **Spec**: [spec.md](./spec.md)

## Summary

Implementar Track A MVP: collectors read-only, persistência SQLite, API overview/scorecard, worker periódico, UI grid estilo Cloudflare. Primeiro tenant: VM112; domínios auto-descobertos via eventos onboarding.

## Technical Context

**Language/Version**: Python 3.11+ (API + worker VM122); JS vanilla (frontend)

**Primary Dependencies**: FastAPI, httpx, dnspython (ou subprocess dig), sqlite3, redis (worker queue existente)

**Storage**: SQLite novas tabelas `audit_domains`, `audit_checks`

**Testing**: `scripts/verify-audit-overview.sh`; mock tenant offline

**Target Platform**: VM122 worker → VM112 API `:8090` + DNS público + HTTPS webmail

**Performance Goals**: Ciclo completo 1 domínio < 30s; overview API < 500ms

**Constraints**: Read-only; LAN para VM112; sem novos containers (worker existente)

## Constitution Check

| Princípio | Status |
|-----------|--------|
| IV. Mail vs Ops | ✅ PASS — collectors read-only, Ops separado |
| VII. Spec-Driven | ✅ PASS |
| IX. YAGNI | ✅ PASS — SQLite, 8 checks fixos |

## Project Structure

```text
specs/009-ops-audit-overview/
├── spec.md
├── plan.md
├── research.md
├── contracts/audit-api.md
├── checklists/requirements.md
└── tasks.md

api/app/
├── main.py                    # routes /audit/*
├── collectors/
│   ├── __init__.py
│   ├── base.py
│   ├── vm112.py               # carbonio, nginx, cert via portal API
│   ├── dns.py                 # mx, spf, dkim, dmarc
│   └── webmail.py             # HTTP check
└── audit_store.py             # SQLite CRUD

worker/
└── audit_runner.py            # loop ou job redis

frontend/assets/
├── app.js                     # view overview + scorecard drill-down
└── styles.css                 # cards health grid
```

## Phase 0: Research

Ver [research.md](./research.md).

## Phase 1: Data Model

```sql
CREATE TABLE audit_domains (
  id INTEGER PRIMARY KEY,
  tenant_id INTEGER NOT NULL,
  domain TEXT NOT NULL,
  source TEXT NOT NULL DEFAULT 'onboarding',
  created_at TEXT NOT NULL,
  UNIQUE(tenant_id, domain)
);

CREATE TABLE audit_checks (
  id INTEGER PRIMARY KEY,
  tenant_id INTEGER NOT NULL,
  domain TEXT NOT NULL,
  check_id TEXT NOT NULL,
  status TEXT NOT NULL,
  message TEXT,
  evidence TEXT,
  checked_at TEXT NOT NULL,
  UNIQUE(tenant_id, domain, check_id)
);
```

## Phase 2: Collectors

| Module | Checks |
|--------|--------|
| `vm112.py` | carbonio, nginx_vhost, cert_le |
| `dns.py` | dns_mx, dns_spf, dns_dkim, dns_dmarc |
| `webmail.py` | webmail_http |

Runner: `run_audit(tenant_id, domain) -> dict[check_id, result]`

## Phase 3: API

- `GET /api/v1/audit/overview`
- `GET /api/v1/audit/tenants/{id}/scorecard?domain=`
- `POST /api/v1/audit/run/{tenant_id}?domain=` (manual trigger, ops use)

## Phase 4: Worker

- Env `AUDIT_INTERVAL_SEC=600`
- A cada ciclo: list domains → run_audit → upsert audit_checks
- Auto-register domains from `webhook_events` where event in (`account.created`, `onboarding.completed`)

## Phase 5: UI

- Nova tab **Overview** ou substituir Infra básica
- Grid cards: tenant name, score X/8, status badge, last audit
- Click → scorecard modal/panel com 8 rows

## Implementation Phases (time estimate)

| Phase | Tasks | ~Time |
|-------|-------|-------|
| A Schema + store | T001-T003 | 1h |
| B Collectors | T004-T008 | 3h |
| C API | T009-T011 | 1.5h |
| D Worker | T012-T014 | 1.5h |
| E UI | T015-T018 | 2h |
| F Test + deploy | T019-T021 | 1h |

**Total ~10h** — pode paralelizar com 004 após API base pronta.

## Risk & Mitigation

| Risco | Mitigação |
|-------|-----------|
| Portal API lenta | Timeout 10s por check; partial results |
| DNS rate limit | Cache 10 min; sequential checks |
| Falso negativo DKIM | evidence field com TXT encontrado |
| Worker sobrecarga | 1 tenant MVP; queue single-thread |

## Sequencing with 004

- **Paralelo possível**: equipas diferentes (004 portal+funil, 009 worker+UI)
- **Dependência soft**: 009 domain auto-discovery beneficia de 004 `onboarding.completed` mas funciona com `account.created` existente
- **Recomendação**: implementar 004 Phase A primeiro; 009 Phase A-B em paralelo; UI 004+009 na mesma sprint UI