ligbox-ops-platform/specs/029-agentic-ops-runbooks/spec.md


Spec 029 — Agentic Ops Runbooks (T0 → T1)

Created: 2026-06-20
Requested by: Roger
Status: Staging acceptance testing (branch 029-agentic-ops-runbooks)
Priority: P1 (backlog AG-1)
Systems: VM122 (orchestration) · VM123 (Ollama LLM) · VM112/104/Proxmox/pfSense (targets)


Summary

Agentic Ops layer providing 24/7 monitoring, deterministic checks (T0), a local LLM advisor (T1), e-mail alerts on critical findings, and a contextual copilot in the Desk.

| Tier | Engine | Where |
| --- | --- | --- |
| T0 | HTTP/SQLite checks + text fallback | VM122 API + worker |
| T1 | Ollama qwen2.5:7b-instruct + RAG over specs | VM123 :11434 |

Production Desk: 8080 / 8091 (not changed in this delivery).
Acceptance staging: 8180 / 8192, isolated stack (docker-compose.agentic-staging.yml).


Logical agents (029 implementation)

| ID | Role | Function |
| --- | --- | --- |
| sentinel | Health/API | Scenarios for desk, wizard, pfSense, proxmox, ollama, VM123 |
| dispatcher | Funnel | Stuck onboarding tickets |
| curator | KB | Indexes /specs/**/*.md into SQLite |
| advisor | T1 | human_action suggestions + /chat copilot |
| orchestrator | Tick | Cron worker; triggers all scenarios |
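
As an illustration of the curator row, here is a minimal sketch of indexing Markdown specs into SQLite. The `spec_docs` table, its columns, and the function name `index_specs` are hypothetical; the spec does not describe the actual schema or whether embeddings are stored alongside the text.

```python
import hashlib
import sqlite3
from pathlib import Path


def index_specs(specs_root: str, conn: sqlite3.Connection) -> int:
    """Store every *.md file under specs_root in SQLite and return the file count."""
    # Hypothetical schema: the spec does not define the curator's tables.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spec_docs ("
        "path TEXT PRIMARY KEY, sha256 TEXT NOT NULL, body TEXT NOT NULL)"
    )
    count = 0
    for md_file in sorted(Path(specs_root).rglob("*.md")):
        body = md_file.read_text(encoding="utf-8", errors="replace")
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        # INSERT OR REPLACE keeps one row per path, so re-indexing is idempotent.
        conn.execute(
            "INSERT OR REPLACE INTO spec_docs (path, sha256, body) VALUES (?, ?, ?)",
            (str(md_file.relative_to(specs_root)), digest, body),
        )
        count += 1
    conn.commit()
    return count
```

Keying rows by relative path means a second run over the same tree updates rows in place rather than duplicating them; the stored hash could be used to skip unchanged files.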

The future mapping to Spec 027 (A0-A7) remains under RBAC governance; this spec delivers the operational MVP.


Scenarios (registry.yaml)

  1. desk.api.health — Desk VM122
  2. wizard.vm112.bundle — VM112 API + portal
  3. pfsense.api.system — pfSense via Traefik
  4. funnel.stuck.onboarding — tickets >24h
  5. integration.webhook.gap — gap VM112→122
  6. proxmox.cluster — VMs 112/122/123/104
  7. ollama.vm123.health — LLM backend
  8. vm123.finance.stack — FOSS + Odoo
  9. vm123.openpanel.bridge — bridge hosting
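
To make the idea of a deterministic T0 check concrete, the sketch below shows how a scenario such as funnel.stuck.onboarding (tickets >24h) might be evaluated against SQLite. The `tickets` table, its `status` and `updated_at` columns (epoch seconds), and the function name are assumptions for illustration only; the real Desk schema is not given in this spec.

```python
import sqlite3


def find_stuck_tickets(conn: sqlite3.Connection, now_epoch: int, max_age_hours: int = 24):
    """Return ids of onboarding tickets whose last update is older than max_age_hours."""
    cutoff = now_epoch - max_age_hours * 3600
    # Hypothetical table and columns; updated_at is assumed to be epoch seconds.
    rows = conn.execute(
        "SELECT id FROM tickets "
        "WHERE status = 'onboarding' AND updated_at < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the check is a plain query with a fixed threshold, the same input always yields the same finding, which is what separates T0 from the LLM-based T1 advisor.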

API (/api/v1/agents/*)

| Method | Path | Auth |
| --- | --- | --- |
| GET | /health | public |
| GET | /scenarios | ops view |
| GET | /findings | ops view |
| POST | /findings/{id}/ack | ops view |
| GET | /action-log | ops view |
| POST | /runs/{scenario_id} | super_admin, ops_lead, agentic_operator |
| POST | /chat | ops view (T1 copilot) |
| POST | /internal/tick | internal token / cron worker |
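
The auth column above can be read as a small access matrix. The following sketch transcribes it into a lookup table with a check function. How "ops view" maps onto concrete roles is not stated in this spec, so it is modelled here as a boolean flag; the names `ROUTE_AUTH`, `is_allowed`, and `token_ok` are hypothetical.

```python
RUN_ROLES = frozenset({"super_admin", "ops_lead", "agentic_operator"})

# (method, path template) -> auth rule, transcribed from the API table.
ROUTE_AUTH = {
    ("GET", "/health"): "public",
    ("GET", "/scenarios"): "ops_view",
    ("GET", "/findings"): "ops_view",
    ("POST", "/findings/{id}/ack"): "ops_view",
    ("GET", "/action-log"): "ops_view",
    ("POST", "/runs/{scenario_id}"): "run_roles",
    ("POST", "/chat"): "ops_view",
    ("POST", "/internal/tick"): "internal_token",
}


def is_allowed(method, path, *, roles=frozenset(), has_ops_view=False, token_ok=False):
    """Decide access for a path template under /api/v1/agents; unknown routes are denied."""
    rule = ROUTE_AUTH.get((method.upper(), path))
    if rule is None:
        return False
    if rule == "public":
        return True
    if rule == "ops_view":
        return has_ops_view
    if rule == "run_roles":
        return bool(RUN_ROLES & set(roles))
    return token_ok  # rule == "internal_token"
```

Note that in this reading, triggering a run requires one of the three named roles; the ops view alone is not enough.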

Worker

  • AGENTIC_INTERVAL_SEC=300 (5 min)
  • POST /api/v1/agents/internal/tick via OPS_INTERNAL_TOKEN
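
A minimal sketch of what such a worker loop might look like, using only the Python standard library. The endpoint, the `X-Ops-Internal-Token` header, and the two environment variables come from this spec; the function names, the `base_url` and `max_ticks` parameters, and the 30-second timeout are assumptions.

```python
import os
import time
import urllib.request


def build_tick_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the POST request for the internal tick endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/agents/internal/tick",
        data=b"",
        method="POST",
        headers={"X-Ops-Internal-Token": token},
    )


def run_worker(base_url: str, max_ticks=None) -> None:
    """Send one tick per interval; max_ticks=None loops until interrupted."""
    interval = int(os.environ.get("AGENTIC_INTERVAL_SEC", "300"))
    token = os.environ.get("OPS_INTERNAL_TOKEN", "")
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        try:
            with urllib.request.urlopen(build_tick_request(base_url, token), timeout=30) as resp:
                print("tick ok:", resp.status)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print("tick failed:", exc)
        ticks += 1
        time.sleep(interval)
```

Calling `run_worker("http://10.10.10.122:8180")` would target the staging stack. A failed tick is logged and the loop continues, so one unreachable API call does not stop later ticks.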

Notifications

  • E-mail: high/critical findings are sent to DESK_ROOT_NOTIFY_EMAIL
  • ntfy: optional, via DESK_OPS_NTFY_TOPIC
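
The routing rule can be sketched as a pure function that decides which channels receive a finding. The spec states the high/critical threshold for e-mail only; applying the same threshold to ntfy is an assumption, as are the function name and the lowercase severity labels.

```python
ALERT_SEVERITIES = frozenset({"high", "critical"})


def notification_targets(severity: str, env: dict) -> list:
    """Return (channel, destination) pairs for a finding of the given severity."""
    targets = []
    if severity.lower() not in ALERT_SEVERITIES:
        return targets
    email = env.get("DESK_ROOT_NOTIFY_EMAIL", "").strip()
    if email:
        targets.append(("email", email))
    # ntfy is optional: it is only used when a topic is configured.
    # Assumption: ntfy follows the same severity threshold as e-mail.
    topic = env.get("DESK_OPS_NTFY_TOPIC", "").strip()
    if topic:
        targets.append(("ntfy", topic))
    return targets
```

Keeping the decision separate from the actual sending makes the threshold easy to test without an SMTP server or an ntfy instance.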

.env variables

AGENTIC_LLM_ENABLED=true
OLLAMA_BASE_URL=http://10.10.10.123:11434
AGENTIC_LLM_MODEL=qwen2.5:7b-instruct
AGENTIC_EMBED_MODEL=nomic-embed-text
AGENTIC_INTERVAL_SEC=300
AGENTIC_SPECS_ROOT=/opt/ligbox-ops-platform/specs
AGENTIC_CRITICAL_VMIDS=112,122,123,104
VM123_IP=10.10.10.123
OPENPANEL_BRIDGE_URL=http://10.10.10.123:18087
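
One way these variables might be loaded into typed settings is sketched below. The class and function names are hypothetical, the defaults simply mirror the values listed above (except that the LLM is assumed off when the flag is unset), and the accepted truthy strings are an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgenticSettings:
    llm_enabled: bool
    ollama_base_url: str
    llm_model: str
    embed_model: str
    interval_sec: int
    specs_root: str
    critical_vmids: tuple
    vm123_ip: str
    openpanel_bridge_url: str


def load_settings(env=None) -> AgenticSettings:
    """Parse the Agentic Ops variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    vmids = env.get("AGENTIC_CRITICAL_VMIDS", "112,122,123,104")
    return AgenticSettings(
        # Assumption: anything other than 1/true/yes disables the T1 advisor.
        llm_enabled=env.get("AGENTIC_LLM_ENABLED", "false").strip().lower() in {"1", "true", "yes"},
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://10.10.10.123:11434"),
        llm_model=env.get("AGENTIC_LLM_MODEL", "qwen2.5:7b-instruct"),
        embed_model=env.get("AGENTIC_EMBED_MODEL", "nomic-embed-text"),
        interval_sec=int(env.get("AGENTIC_INTERVAL_SEC", "300")),
        specs_root=env.get("AGENTIC_SPECS_ROOT", "/opt/ligbox-ops-platform/specs"),
        critical_vmids=tuple(int(v) for v in vmids.split(",") if v.strip()),
        vm123_ip=env.get("VM123_IP", "10.10.10.123"),
        openpanel_bridge_url=env.get("OPENPANEL_BRIDGE_URL", "http://10.10.10.123:18087"),
    )
```

Parsing AGENTIC_CRITICAL_VMIDS into integers up front means a malformed value fails at startup rather than during a Proxmox check.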

Acceptance testing

# Staging VM122 (isolated ports)
cd /opt/ligbox-ops-platform-staging
docker compose -f docker-compose.agentic-staging.yml up -d --build
curl -s http://10.10.10.122:8180/api/v1/agents/health
curl -s -X POST http://10.10.10.122:8180/api/v1/agents/internal/tick \
  -H "X-Ops-Internal-Token: $OPS_INTERNAL_TOKEN"

Promote to production only after completing the checklist in quickstart.md.


Related documents

  • Spec 027: RBAC agentic_operator, A0-A7 governance
  • Spec 019: Console, R0-R3 policies
  • contracts/agent-platform-api.md
  • quickstart.md