Add Agentic Ops Spec 029: wire API, worker tick, T0/T1, staging stack.

Mounts agents router and schema init, adds VM123 checks, chat copilot,
Desk UI module, isolated docker-compose staging on ports 8180/8192,
and full spec documentation without touching production ports.

Co-authored-by: Cursor <cursoragent@cursor.com>
Ligbox Spec Hub 2026-06-19 23:22:33 +00:00
parent acaacce705
commit e0959e6fd7
20 changed files with 867 additions and 74 deletions

@ -123,7 +123,7 @@
| **DESK-6** | P1 | Billing visibility 💳 (023 Phase 1) | ✅ |
| **INT-1** | P2 | OTRS API bridge, Spec 011 | 📋 |
| **DESK-3** | P2 | Kanban, SLA (after Spec 010) | 📋 → Spec 008 |
| **AG-1** | P3 | AI agents + runbooks | 📋 |
| **AG-1** | P1 | AI agents + runbooks (Spec 029) | 🔄 staging |
---

@ -0,0 +1,47 @@
#!/bin/bash
# Staging (homologation) deploy for Agentic Ops on VM122 (ports 8180/8192)
# Does NOT touch production on :8080/:8091
set -euo pipefail
STAGING_DIR="/opt/ligbox-ops-platform-staging"
REPO="/opt/ligbox-spec-hub/repos/ligbox-ops-platform"
BRANCH="${1:-029-agentic-ops-runbooks}"
echo "==> Staging Agentic Ops branch=$BRANCH"
mkdir -p "$STAGING_DIR" /var/lib/ligbox-ops-platform-staging
# Sync code (api/frontend/worker symlinks from the repo)
rsync -a --delete \
  --exclude '.git' --exclude 'chat-bruto' --exclude 'node_modules' \
  "$REPO/projects/ops-desk/" "$STAGING_DIR/"
rsync -a "$REPO/specs/" "$STAGING_DIR/specs/"
cd "$STAGING_DIR"
if [[ ! -f .env ]]; then
  if [[ -f /opt/ligbox-ops-platform/.env ]]; then
    cp /opt/ligbox-ops-platform/.env .env
    sed -i 's|SQLITE_PATH=.*|SQLITE_PATH=/data/ops-staging.db|' .env
    echo "AGENTIC_LLM_ENABLED=true" >> .env
    echo "AGENTIC_SPECS_ROOT=/opt/ligbox-ops-platform/specs" >> .env
  else
    echo "ERROR: .env not found; copy it manually" >&2
    exit 1
  fi
fi
docker compose -f docker-compose.agentic-staging.yml up -d --build
sleep 8
echo "==> Health staging"
curl -sf "http://10.10.10.122:8180/api/health" | head -c 200; echo
curl -sf "http://10.10.10.122:8180/api/v1/agents/health" | head -c 300; echo
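# Anchor the match to the variable name and keep everything after the first '='
# so that commented-out lines are skipped and tokens containing '=' are not truncated.
TOKEN=$(grep -m1 '^OPS_INTERNAL_TOKEN=' .env | cut -d= -f2-)
curl -sf -X POST "http://10.10.10.122:8180/api/v1/agents/internal/tick" \
  -H "X-Ops-Internal-Token: $TOKEN" | head -c 400; echo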
echo "==> Staging UI: http://10.10.10.122:8192"
echo "==> Staging API: http://10.10.10.122:8180"
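
The script above waits a fixed `sleep 8` before probing the staging API, which assumes the containers are ready within eight seconds. A minimal Python sketch of a polling alternative follows, assuming `/api/health` answers HTTP 200 once the API is up. The names `http_ok` and `wait_until_healthy` are hypothetical helpers for illustration and are not part of the repository.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with HTTP 200 (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        # Connection refused, timeouts and HTTP errors all count as "not ready yet".
        return False


def wait_until_healthy(
    url: str,
    *,
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 10,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` up to `attempts` times, pausing `delay_s` between failed probes."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return True
        if attempt < attempts:
            sleep(delay_s)
    return False
```

A deploy step could call `wait_until_healthy("http://10.10.10.122:8180/api/health")` and abort when it returns `False`, instead of relying on a fixed delay.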

@ -1,6 +1,10 @@
"""T0/T1 checks — Spec 029."""
from __future__ import annotations
import os, sqlite3, time
import os
import sqlite3
import time
import httpx
DESK = os.getenv("DESK_PUBLIC_URL", "https://desk.ligbox.com.br")
@ -11,85 +15,252 @@ PFS_USER = os.getenv("PFSENSE_API_USER", "api_cursor")
PFS_PASS = os.getenv("PFSENSE_API_PASSWORD", "805353")
PVE = os.getenv("PVE_API_URL", "https://10.10.10.2:8006/api2/json")
PVE_USER = os.getenv("PVE_USER", "root@pam")
PVE_PASS = os.getenv("PVE_PASSWORD", "@betinplace")
PVE_PASS = os.getenv("PVE_PASSWORD", "")
PVE_NODE = os.getenv("PVE_NODE", "big1")
VMIDS = [int(x) for x in os.getenv("AGENTIC_CRITICAL_VMIDS", "112,122,123,104").split(",") if x.strip()]
OLLAMA = os.getenv("OLLAMA_BASE_URL", "http://10.10.10.123:11434").rstrip("/")
VM123_IP = os.getenv("VM123_IP", "10.10.10.123")
OPENPANEL_BRIDGE = os.getenv("OPENPANEL_BRIDGE_URL", f"http://{VM123_IP}:18087").rstrip("/")
def _http(url, *, auth=None, max_ms=2500):
t0 = time.perf_counter()
try:
with httpx.Client(timeout=15, verify=False, follow_redirects=True) as c:
r = c.get(url, auth=auth)
ms = int((time.perf_counter()-t0)*1000)
return {"ok": r.status_code==200 and ms<=max_ms, "status_code": r.status_code, "latency_ms": ms, "url": url}
ms = int((time.perf_counter() - t0) * 1000)
return {"ok": r.status_code == 200 and ms <= max_ms, "status_code": r.status_code, "latency_ms": ms, "url": url}
except Exception as e:
return {"ok": False, "error": str(e), "url": url}
def check_desk_api_health():
r = _http(f"{DESK}/api/health")
return [] if r["ok"] else [{"severity":"high","category":"api","title":"Desk API health falhou","detail_md":str(r),"evidence":r,"human_action":"docker-compose logs api VM122"}]
r = _http(f"{DESK}/api/health", max_ms=4000)
return [] if r["ok"] else [
{
"severity": "high",
"category": "api",
"title": "Desk API health falhou",
"detail_md": str(r),
"evidence": r,
"human_action": "Verificar docker-compose api VM122",
}
]
def check_vm112_health():
out = []
r1 = _http(f"{VM112}/api/onboarding/health")
if not r1["ok"]: out.append({"severity":"high","category":"api","title":"VM112 API down","detail_md":str(r1),"evidence":r1,"human_action":"systemctl ligbox-wizard VM112"})
if not r1["ok"]:
out.append(
{
"severity": "high",
"category": "api",
"title": "VM112 API down",
"detail_md": str(r1),
"evidence": r1,
"human_action": "systemctl ligbox-wizard VM112",
}
)
r2 = _http(WIZARD, max_ms=4000)
if not r2["ok"]: out.append({"severity":"warn","category":"api","title":"Portal /onboard falhou","detail_md":str(r2),"evidence":r2,"human_action":"Traefik + VM112"})
if not r2["ok"]:
out.append(
{
"severity": "warn",
"category": "api",
"title": "Portal /onboard falhou",
"detail_md": str(r2),
"evidence": r2,
"human_action": "Traefik CT114 + VM112",
}
)
return out
def check_pfsense_api():
r = _http(PFS_URL, auth=(PFS_USER, PFS_PASS), max_ms=4000)
return [] if r["ok"] else [{"severity":"warn","category":"infra","title":"pfSense API falhou","detail_md":str(r),"evidence":r,"human_action":"firewall.itecnologys.com"}]
return [] if r["ok"] else [
{
"severity": "warn",
"category": "infra",
"title": "pfSense API falhou",
"detail_md": str(r),
"evidence": r,
"human_action": "Validar firewall.itecnologys.com via Traefik",
}
]
def check_funnel_stuck(conn, max_stuck=5):
try:
c = conn.execute("SELECT COUNT(*) n FROM tickets WHERE status IN ('open','assisting','escalated') AND (subject LIKE '%onboarding%' OR payload LIKE '%onboarding%') AND datetime(created_at)<datetime('now','-24 hours')").fetchone()["n"]
if c <= max_stuck: return []
return [{"severity":"warn","category":"code","title":f"Funil travado {c} tickets","detail_md":str(c),"evidence":{"count":c},"human_action":"ASM Spec 010"}]
c = conn.execute(
"SELECT COUNT(*) n FROM tickets WHERE status IN ('open','assisting','escalated') "
"AND (subject LIKE '%onboarding%' OR payload LIKE '%onboarding%') "
"AND datetime(created_at)<datetime('now','-24 hours')"
).fetchone()["n"]
if c <= max_stuck:
return []
return [
{
"severity": "warn",
"category": "code",
"title": f"Funil travado {c} tickets",
"detail_md": str(c),
"evidence": {"count": c},
"human_action": "Rever tickets onboarding — Spec 010 Assist",
}
]
except sqlite3.OperationalError:
return []
def check_integration_gap(ops_api_url, token):
if not token: return []
if not token:
return []
try:
with httpx.Client(timeout=15) as c:
r = c.get(f"{ops_api_url}/api/v1/integrations/health", headers={"X-Ops-Internal-Token": token})
if r.status_code != 200: return []
if r.status_code != 200:
return []
gap = (r.json().get("vm112_onboard") or {}).get("gap_minutes")
if gap is None or int(gap) <= 15: return []
return [{"severity":"high","category":"infra","title":f"Gap webhook {int(gap)}min","detail_md":"VM112 sem eventos","evidence":{"gap":gap},"human_action":"Webhooks VM112→122"}]
if gap is None or int(gap) <= 15:
return []
return [
{
"severity": "high",
"category": "infra",
"title": f"Gap webhook {int(gap)}min",
"detail_md": "VM112 sem eventos recentes",
"evidence": {"gap": gap},
"human_action": "Webhooks VM112→122",
}
]
except Exception:
return []
def check_proxmox_cluster():
if not PVE_PASS:
return []
try:
with httpx.Client(timeout=15, verify=False) as c:
t = c.post(f"{PVE}/access/ticket", data={"username": PVE_USER, "password": PVE_PASS})
if t.status_code != 200:
return [{"severity":"warn","category":"infra","title":"Proxmox auth falhou","detail_md":str(t.status_code),"evidence":{},"human_action":"PVE 10.10.10.2:8006"}]
return [
{
"severity": "warn",
"category": "infra",
"title": "Proxmox auth falhou",
"detail_md": str(t.status_code),
"evidence": {},
"human_action": "PVE 10.10.10.2:8006",
}
]
tok = t.json()["data"]["ticket"]
bad = []
with httpx.Client(timeout=15, verify=False) as c:
for vmid in VMIDS:
r = c.get(f"{PVE}/nodes/{PVE_NODE}/qemu/{vmid}/status/current", headers={"Cookie": f"PVEAuthCookie={tok}"})
r = c.get(
f"{PVE}/nodes/{PVE_NODE}/qemu/{vmid}/status/current",
headers={"Cookie": f"PVEAuthCookie={tok}"},
)
st = r.json().get("data", {}).get("status") if r.status_code == 200 else "error"
if st != "running": bad.append({"vmid": vmid, "status": st})
if not bad: return []
return [{"severity":"critical","category":"infra","title":f"VMs paradas {bad}","detail_md":str(bad),"evidence":{"bad":bad},"human_action":"qm start no big1"}]
if st != "running":
bad.append({"vmid": vmid, "status": st})
if not bad:
return []
return [
{
"severity": "critical",
"category": "infra",
"title": f"VMs paradas {bad}",
"detail_md": str(bad),
"evidence": {"bad": bad},
"human_action": "qm start no big1",
}
]
except Exception as e:
return [{"severity":"info","category":"infra","title":"Proxmox check erro","detail_md":str(e),"evidence":{},"human_action":""}]
return [
{
"severity": "info",
"category": "infra",
"title": "Proxmox check erro",
"detail_md": str(e),
"evidence": {},
"human_action": "",
}
]
def check_ollama_vm123():
r = _http(f"{OLLAMA}/api/tags", max_ms=5000)
return [] if r["ok"] else [{"severity":"high","category":"infra","title":"Ollama VM123 offline","detail_md":str(r),"evidence":r,"human_action":"systemctl start ollama VM123"}]
return [] if r["ok"] else [
{
"severity": "high",
"category": "infra",
"title": "Ollama VM123 offline",
"detail_md": str(r),
"evidence": r,
"human_action": "systemctl start ollama VM123",
}
]
def check_vm123_finance_stack():
out = []
foss = _http(f"http://{VM123_IP}:8092/", max_ms=5000)
if not foss["ok"]:
out.append(
{
"severity": "high",
"category": "api",
"title": "FOSSBilling VM123 down",
"detail_md": str(foss),
"evidence": foss,
"human_action": "docker compose VM123 finance stack",
}
)
odoo = _http(f"http://{VM123_IP}:8069/web/login", max_ms=5000)
if not odoo["ok"]:
out.append(
{
"severity": "warn",
"category": "api",
"title": "Odoo VM123 inacessível",
"detail_md": str(odoo),
"evidence": odoo,
"human_action": "Verificar container Odoo VM123",
}
)
return out
def check_vm123_openpanel_bridge():
r = _http(f"{OPENPANEL_BRIDGE}/health", max_ms=4000)
if r.get("status_code") == 404:
r = _http(OPENPANEL_BRIDGE, max_ms=4000)
return [] if r["ok"] else [
{
"severity": "warn",
"category": "api",
"title": "OpenPanel bridge VM123 falhou",
"detail_md": str(r),
"evidence": r,
"human_action": f"Bridge {OPENPANEL_BRIDGE}",
}
]
SCENARIO_RUNNERS = {
"desk.api.health": lambda conn, **kw: check_desk_api_health(),
"wizard.vm112.bundle": lambda conn, **kw: check_vm112_health(),
"pfsense.api.system": lambda conn, **kw: check_pfsense_api(),
"funnel.stuck.onboarding": lambda conn, **kw: check_funnel_stuck(conn),
"integration.webhook.gap": lambda conn, **kw: check_integration_gap(kw.get("ops_api_url",""), kw.get("internal_token","")),
"integration.webhook.gap": lambda conn, **kw: check_integration_gap(
kw.get("ops_api_url", ""), kw.get("internal_token", "")
),
"proxmox.cluster": lambda conn, **kw: check_proxmox_cluster(),
"ollama.vm123.health": lambda conn, **kw: check_ollama_vm123(),
"vm123.finance.stack": lambda conn, **kw: check_vm123_finance_stack(),
"vm123.openpanel.bridge": lambda conn, **kw: check_vm123_openpanel_bridge(),
}
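
Every check above returns either an empty list or a list of finding dicts with the same six keys, and `_http` treats a probe as healthy only on HTTP 200 within the latency budget. The sketch below restates that contract in isolation. `probe_ok`, `make_finding` and `findings_for` are hypothetical helper names used for illustration, not functions from the commit.

```python
from typing import Any, Dict, List


def probe_ok(status_code: int, latency_ms: int, max_ms: int = 2500) -> bool:
    """Same rule as _http: HTTP 200 and latency within the budget."""
    return status_code == 200 and latency_ms <= max_ms


def make_finding(
    *, severity: str, category: str, title: str, evidence: Dict[str, Any], human_action: str
) -> Dict[str, Any]:
    """Build a finding dict with the six keys used by the checks in the diff."""
    return {
        "severity": severity,
        "category": category,
        "title": title,
        "detail_md": str(evidence),
        "evidence": evidence,
        "human_action": human_action,
    }


def findings_for(result: Dict[str, Any], **meta: str) -> List[Dict[str, Any]]:
    """Return [] for a healthy probe, else one finding wrapping the probe result."""
    return [] if result.get("ok") else [make_finding(evidence=result, **meta)]
```

With a helper of this shape, a single-probe check such as `check_ollama_vm123` could shrink to one `findings_for(...)` call, at the cost of one more indirection.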

@ -1,12 +1,16 @@
"""Ollama VM123 + fallback — Spec 029 T1."""
"""Ollama VM123 + fallback — Spec 029 T0/T1."""
from __future__ import annotations
import os
import httpx
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://10.10.10.123:11434").rstrip("/")
AGENTIC_LLM_MODEL = os.getenv("AGENTIC_LLM_MODEL", "qwen2.5:7b-instruct")
AGENTIC_EMBED_MODEL = os.getenv("AGENTIC_EMBED_MODEL", "nomic-embed-text")
AGENTIC_LLM_ENABLED = os.getenv("AGENTIC_LLM_ENABLED", "false").lower() in ("1", "true", "yes")
def ollama_available() -> bool:
try:
with httpx.Client(timeout=3.0) as c:
@ -14,25 +18,70 @@ def ollama_available() -> bool:
except Exception:
return False
def advise_human_action(*, finding_title: str, finding_detail: str, kb_snippets: list[str] | None = None) -> tuple[str, str]:
prompt = (
"Advisor Agentic Ops Ligbox. Português BR, máx 6 frases. O que fazer AGORA?\n"
f"Problema: {finding_title}\nDetalhe: {finding_detail}\nKB: {'---'.join(kb_snippets or [])[:2500] or 'N/A'}"
)
if not AGENTIC_LLM_ENABLED:
return (f"Investigar manualmente: {finding_title}", "t0")
if ollama_available():
def _chat(prompt: str, *, system: str | None = None, max_tokens: int = 800) -> tuple[str, str]:
if not AGENTIC_LLM_ENABLED or not ollama_available():
return ("", "t0")
messages = []
if system:
messages.append({"role": "system", "content": system})
messages.append({"role": "user", "content": prompt})
try:
with httpx.Client(timeout=90.0) as c:
r = c.post(f"{OLLAMA_BASE_URL}/api/chat", json={
"model": AGENTIC_LLM_MODEL,
"messages": [{"role": "user", "content": prompt}],
"stream": False,
})
with httpx.Client(timeout=120.0) as c:
r = c.post(
f"{OLLAMA_BASE_URL}/api/chat",
json={"model": AGENTIC_LLM_MODEL, "messages": messages, "stream": False},
)
if r.status_code == 200:
txt = (r.json().get("message") or {}).get("content", "").strip()
if txt:
return txt, AGENTIC_LLM_MODEL
except Exception:
pass
return (f"Rever logs e specs para: {finding_title}", "t0-fallback")
return ("", "t0-fallback")
def advise_human_action(
*, finding_title: str, finding_detail: str, kb_snippets: list[str] | None = None
) -> tuple[str, str]:
prompt = (
"Advisor Agentic Ops Ligbox. Português BR, máx 6 frases. O que fazer AGORA?\n"
f"Problema: {finding_title}\nDetalhe: {finding_detail}\n"
f"KB: {'---'.join(kb_snippets or [])[:2500] or 'N/A'}"
)
txt, model = _chat(prompt)
if txt:
return txt, model
return (f"Investigar manualmente: {finding_title}", "t0")
def chat_context(
*,
question: str,
kb_snippets: list[str] | None = None,
findings_summary: str | None = None,
user_role: str = "technician",
) -> tuple[str, str]:
"""T1 — resposta contextual para janela Desk / bot interno."""
system = (
"És o copiloto Agentic Ops da Ligbox (VM112 wizard, VM122 Desk, VM123 finance). "
"Responde em português BR, objectivo, com passos acionáveis. "
"Nunca inventes credenciais. Se não souberes, diz o que verificar."
)
ctx = []
if findings_summary:
ctx.append(f"Findings abertos:\n{findings_summary[:2000]}")
if kb_snippets:
ctx.append("KB:\n" + "\n---\n".join(kb_snippets[:6])[:4000])
prompt = (
f"Perfil utilizador: {user_role}\n"
f"Contexto ops:\n{chr(10).join(ctx) or 'N/A'}\n\n"
f"Pergunta: {question}"
)
txt, model = _chat(prompt, system=system)
if txt:
return txt, model
return (
"Modo T0 activo — LLM indisponível. Consulte findings e audit log no painel Agentic Ops.",
"t0",
)
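
The T0/T1 split in this file comes down to one pattern: try the LLM when it is enabled, and fall back to a deterministic T0 text whenever the call is disabled, fails, or returns nothing. A minimal sketch of that pattern with the LLM call injected as a parameter is shown below. `answer_with_fallback` and `example-model` are illustrative names, not identifiers from the commit.

```python
from typing import Callable, Optional, Tuple


def answer_with_fallback(
    prompt: str,
    *,
    llm_enabled: bool,
    llm_call: Optional[Callable[[str], str]],
    fallback_text: str,
    model_name: str = "example-model",
) -> Tuple[str, str]:
    """Return (text, tier): the LLM answer when usable, else the T0 fallback."""
    if llm_enabled and llm_call is not None:
        try:
            text = (llm_call(prompt) or "").strip()
        except Exception:
            # Any transport or parsing error degrades to T0 instead of raising.
            text = ""
        if text:
            return text, model_name
    return fallback_text, "t0"
```

Injecting `llm_call` keeps the fallback logic testable without a running Ollama instance.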

@ -16,5 +16,7 @@ def load_registry() -> list[dict]:
{"id": "funnel.stuck.onboarding", "title": "Funil travado", "severity_default": "warn"},
{"id": "integration.webhook.gap", "title": "Gap webhook VM112", "severity_default": "high"},
{"id": "proxmox.cluster", "title": "Proxmox VMs críticas", "severity_default": "critical"},
{"id": "ollama.vm123.health", "title": "Ollama VM123", "severity_default": "high"},
{"id": "ollama.vm123.health", "title": "Ollama VM123", "severity_default": "high", "agent_id": "sentinel"},
{"id": "vm123.finance.stack", "title": "VM123 Finance Stack", "severity_default": "high", "agent_id": "sentinel"},
{"id": "vm123.openpanel.bridge", "title": "OpenPanel Bridge VM123", "severity_default": "warn", "agent_id": "sentinel"},
]

@ -1,63 +1,145 @@
"""Agentic API — Spec 029."""
from __future__ import annotations
from datetime import datetime, timezone
from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel, Field
from app import auth
from app.agents import llm_client, runner, store
router = APIRouter(prefix="/api/v1/agents", tags=["agents"])
def _db():
conn = auth.db()
try: yield conn
finally: conn.close()
try:
yield conn
finally:
conn.close()
def _ops_view(user):
if user.role not in ("super_admin","ops_lead","technician","noc","agentic_operator"):
if user.role not in (
"super_admin",
"ops_lead",
"technician",
"noc",
"agentic_operator",
"developer",
"devops",
"security_analyst",
):
raise HTTPException(403, "insufficient permissions")
class ChatRequest(BaseModel):
question: str = Field(..., min_length=2, max_length=4000)
include_findings: bool = True
@router.get("/health")
def agents_health():
return {"status":"ok","tier":"t1" if llm_client.AGENTIC_LLM_ENABLED else "t0",
"ollama": llm_client.ollama_available(), "ollama_url": llm_client.OLLAMA_BASE_URL,
"model": llm_client.AGENTIC_LLM_MODEL}
return {
"status": "ok",
"tier": "t1" if llm_client.AGENTIC_LLM_ENABLED else "t0",
"ollama": llm_client.ollama_available(),
"ollama_url": llm_client.OLLAMA_BASE_URL,
"model": llm_client.AGENTIC_LLM_MODEL,
"embed_model": llm_client.AGENTIC_EMBED_MODEL,
}
@router.get("/scenarios")
def list_scenarios(user=Depends(auth.get_current_user), conn=Depends(_db)):
_ops_view(user); runner.sync_registry(conn); conn.commit()
_ops_view(user)
runner.sync_registry(conn)
conn.commit()
return {"scenarios": store.list_scenarios(conn)}
@router.get("/findings")
def list_findings(user=Depends(auth.get_current_user), conn=Depends(_db), severity: str|None=None, limit: int=Query(50, ge=1, le=200), open_only: bool=True):
def list_findings(
user=Depends(auth.get_current_user),
conn=Depends(_db),
severity: str | None = None,
limit: int = Query(50, ge=1, le=200),
open_only: bool = True,
):
_ops_view(user)
return {"findings": store.list_findings(conn, severity=severity, limit=limit, open_only=open_only)}
@router.post("/findings/{finding_id}/ack")
def ack_finding(finding_id: int, user=Depends(auth.get_current_user), conn=Depends(_db)):
_ops_view(user)
if not conn.execute("SELECT id FROM agent_findings WHERE id=?", (finding_id,)).fetchone():
raise HTTPException(404, "not found")
now = datetime.now(timezone.utc).isoformat()
conn.execute("UPDATE agent_findings SET acknowledged_at=?, acknowledged_by=? WHERE id=?", (now, user.username, finding_id))
conn.execute(
"UPDATE agent_findings SET acknowledged_at=?, acknowledged_by=? WHERE id=?",
(now, user.username, finding_id),
)
store.log_event(conn, event_type="finding.ack", message=f"#{finding_id}", payload={"by": user.username})
conn.commit()
return {"ok": True, "id": finding_id}
@router.get("/action-log")
def action_log(user=Depends(auth.get_current_user), conn=Depends(_db), limit: int=Query(100, ge=1, le=500)):
def action_log(user=Depends(auth.get_current_user), conn=Depends(_db), limit: int = Query(100, ge=1, le=500)):
_ops_view(user)
return {"events": store.list_action_log(conn, limit=limit)}
@router.post("/runs/{scenario_id}")
def trigger_run(scenario_id: str, user=Depends(auth.get_current_user), conn=Depends(_db)):
if user.role not in ("super_admin","ops_lead"): raise HTTPException(403, "insufficient permissions")
if user.role not in ("super_admin", "ops_lead", "agentic_operator"):
raise HTTPException(403, "insufficient permissions")
r = runner.run_scenario(conn, scenario_id, trigger=f"manual:{user.username}")
conn.commit(); return r
conn.commit()
return r
@router.post("/chat")
def agent_chat(body: ChatRequest, user=Depends(auth.get_current_user), conn=Depends(_db)):
"""T1 context window — copiloto ops para utilizadores Desk."""
_ops_view(user)
kb = store.search_kb(conn, body.question)
findings_summary = ""
if body.include_findings:
open_f = store.list_findings(conn, limit=8, open_only=True)
if open_f:
findings_summary = "\n".join(
f"- [{f['severity']}] {f['title']}: {f.get('suggested_human_action') or ''}" for f in open_f
)
answer, model = llm_client.chat_context(
question=body.question,
kb_snippets=[k["snippet"] for k in kb],
findings_summary=findings_summary,
user_role=user.role,
)
store.log_event(
conn,
event_type="chat.query",
message=body.question[:120],
agent_id="advisor",
payload={"user": user.username, "model": model},
)
conn.commit()
return {"answer": answer, "model": model, "kb_hits": len(kb)}
@router.post("/internal/tick")
def internal_tick(user=Depends(auth.require_internal_or_user), conn=Depends(_db)):
kb = runner.index_specs_kb(conn)
result = runner.run_all_enabled(conn, trigger="cron")
store.log_event(conn, event_type="tick.complete", message=f"kb={kb} runs={result['total']}", payload={"kb": kb, **result})
store.log_event(
conn,
event_type="tick.complete",
message=f"kb={kb} runs={result['total']}",
agent_id="orchestrator",
payload={"kb": kb, **result},
)
conn.commit()
return {"ok": True, "kb_indexed": kb, **result}
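
A client for the new `POST /api/v1/agents/chat` route could be sketched as follows. The body fields and length limits mirror `ChatRequest` above, and the Bearer header mirrors what the Desk frontend sends. `build_chat_request` is a hypothetical helper, and whether `auth.get_current_user` accepts this exact header format is an assumption.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, token: str, question: str, include_findings: bool = True
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the agents chat endpoint."""
    # Mirror the server-side ChatRequest limits so obvious errors fail early.
    if not 2 <= len(question) <= 4000:
        raise ValueError("question must be 2 to 4000 characters long")
    body = json.dumps(
        {"question": question, "include_findings": include_findings}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/agents/chat",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; per the route above, the JSON response carries `answer`, `model` and `kb_hits`.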

@ -28,11 +28,14 @@ def index_specs_kb(conn):
def run_scenario(conn, scenario_id, *, trigger="cron"):
sc = store.get_scenario(conn, scenario_id)
if not sc: return {"ok": False, "error": "not found"}
if not sc:
return {"ok": False, "error": "not found"}
agent_id = (sc.get("config") or {}).get("agent_id") or sc.get("agent_id") or "sentinel"
fn = checks.SCENARIO_RUNNERS.get(scenario_id)
if not fn: return {"ok": False, "error": "no runner"}
if not fn:
return {"ok": False, "error": "no runner"}
run_id = store.create_run(conn, scenario_id, trigger)
store.log_event(conn, event_type="run.start", message=scenario_id, run_id=run_id)
store.log_event(conn, event_type="run.start", message=scenario_id, run_id=run_id, agent_id=agent_id)
raw = fn(conn, ops_api_url=OPS_API, internal_token=TOKEN)
fids = []
for f in raw:
@ -50,10 +53,10 @@ def run_scenario(conn, scenario_id, *, trigger="cron"):
fids.append(fid)
if f.get("severity") in ("high", "critical"):
notify.notify_finding({**f, "suggested_human_action": human})
store.log_event(conn, event_type="finding.created", message=f.get("title",""), run_id=run_id, payload={"id": fid})
store.log_event(conn, event_type="finding.created", message=f.get("title",""), run_id=run_id, payload={"id": fid}, agent_id=agent_id)
status = "ok" if not raw else "degraded"
store.finish_run(conn, run_id, status=status, summary=f"{len(raw)} finding(s)" if raw else "healthy")
store.log_event(conn, event_type="run.finish", message=status, run_id=run_id)
store.log_event(conn, event_type="run.finish", message=status, run_id=run_id, agent_id=agent_id)
return {"ok": True, "run_id": run_id, "scenario_id": scenario_id, "status": status, "findings_count": len(raw), "finding_ids": fids}
def run_all_enabled(conn, trigger="cron"):

@ -3,21 +3,36 @@ scenarios:
  - id: desk.api.health
    title: Desk VM122 API
    severity_default: high
    agent_id: sentinel
  - id: wizard.vm112.bundle
    title: VM112 Wizard
    severity_default: high
    agent_id: sentinel
  - id: pfsense.api.system
    title: pfSense API
    severity_default: warn
    agent_id: sentinel
  - id: funnel.stuck.onboarding
    title: Funil travado
    severity_default: warn
    agent_id: dispatcher
  - id: integration.webhook.gap
    title: Gap webhook VM112
    severity_default: high
    agent_id: sentinel
  - id: proxmox.cluster
    title: Proxmox VMs críticas
    severity_default: critical
    agent_id: sentinel
  - id: ollama.vm123.health
    title: Ollama VM123
    severity_default: high
    agent_id: sentinel
  - id: vm123.finance.stack
    title: VM123 Finance Stack
    severity_default: high
    agent_id: sentinel
  - id: vm123.openpanel.bridge
    title: OpenPanel Bridge VM123
    severity_default: warn
    agent_id: sentinel

@ -28,6 +28,8 @@ from app.billing_routes import router as billing_router
from app.security_routes import router as security_router
from app.infra_stack_routes import router as infra_stack_router
from app.vm123.routes import router as vm123_router
from app.agents.routes import router as agents_router
from app.agents.store import init_agent_schema
from app.collectors.base import run_audit
from app.permissions import (
can_assign_ticket,
@ -117,7 +119,7 @@ ASSIST_LIFECYCLE_EVENTS = frozenset({"onboarding.assist.started", "onboarding.as
TICKET_ACTIVE_STATUSES = frozenset({"open", "escalated", "assisting", "resolved"})
app = FastAPI(title="Ligbox Ops Platform API", version="0.9.0-desk-assist")
app = FastAPI(title="Ligbox Ops Platform API", version="0.9.7-spec029-agentic")
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])
app.include_router(auth_router)
app.include_router(registration_router)
@ -133,6 +135,7 @@ app.include_router(migration_router)
app.include_router(billing_router)
app.include_router(infra_stack_router)
app.include_router(vm123_router)
app.include_router(agents_router)
TICKET_COLUMNS = "id,tenant_id,subject,status,payload,created_at,assigned_to,assigned_at,session_id,assist_mode,assisted_by,assisted_at,client_paused"
@ -185,6 +188,7 @@ def init_db():
init_purge_jobs_schema(conn)
init_purge_auth_schema(conn)
init_agent_schema(conn)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA busy_timeout=30000")
conn.commit()

@ -47,6 +47,13 @@ MODULES: tuple[ModuleDef, ...] = (
        description="Painel visual SOC VM112→VM122.",
        nav_views=("infra2",),
    ),
    ModuleDef(
        id="agentic-ops",
        label="Agentic Ops",
        description="Vigilância 24/7, findings, advisor IA e contexto ops (Spec 029).",
        nav_views=("agentic-ops",),
        default_enabled=True,
    ),
    ModuleDef(
        id="funnel-timing",
        label="Relógio por fase",
@ -135,6 +142,29 @@ MODULES: tuple[ModuleDef, ...] = (
MODULE_BY_ID = {m.id: m for m in MODULES}
# Spec 027 + 029: modules ON by default at activation
ROLE_MODULE_DEFAULTS: dict[str, frozenset[str]] = {
    "sales_admin": frozenset(
        {"core", "leads", "funnel-timing", "overview-home", "billing-recurrence", "tenants"}
    ),
    "sales_support": frozenset({"core", "leads", "funnel-timing", "overview-home", "tenants"}),
    "finance": frozenset({"core", "overview-home", "billing-recurrence", "events"}),
    "marketing": frozenset({"core", "leads", "funnel-timing", "overview-home"}),
    "seo": frozenset({"core", "funnel-timing", "overview-home", "leads"}),
    "developer": frozenset({"core", "events", "infra", "overview", "agentic-ops"}),
    "devops": frozenset({"core", "infra", "infra2-soc", "overview-home", "events", "agentic-ops"}),
    "security_analyst": frozenset({"core", "infra2-soc", "wazuh-soc", "events", "agentic-ops"}),
    "content_editor": frozenset({"core"}),
    "agentic_operator": frozenset({"core", "overview", "events", "infra2-soc", "agentic-ops"}),
}


def role_module_defaults(role: str) -> frozenset[str] | None:
    """None = legacy ops roles (003), which honor only the global toggles."""
    if role in ("super_admin", "ops_lead", "technician", "noc"):
        return None
    return ROLE_MODULE_DEFAULTS.get(role, frozenset({"core"}))


def all_module_ids() -> list[str]:
    return [m.id for m in MODULES]

@ -6,3 +6,4 @@ python-jose[cryptography]==3.3.0
passlib[bcrypt]==1.7.4
bcrypt==4.2.1
pyotp==2.9.0
pyyaml==6.0.2

@ -0,0 +1,26 @@
"""Tests Agentic Ops — Spec 029."""
from __future__ import annotations

import sqlite3

from app.agents import checks, registry, store


def test_registry_has_vm123_scenarios():
    scenarios = registry.load_registry()
    ids = {s["id"] for s in scenarios}
    assert "ollama.vm123.health" in ids
    assert "vm123.finance.stack" in ids


def test_agent_schema_init():
    conn = sqlite3.connect(":memory:")
    store.init_agent_schema(conn)
    tables = {r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")}
    assert "agent_findings" in tables
    assert "agent_scenarios" in tables


def test_desk_health_check_returns_list():
    result = checks.check_desk_api_health()
    assert isinstance(result, list)

@ -0,0 +1,46 @@
# Agentic Ops staging, isolated from VM122 production
# Ports: API 8180, Frontend 8192, Redis internal only
# Data: /var/lib/ligbox-ops-platform-staging (kept separate from production)
version: "3.8"
services:
  redis-staging:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --maxmemory 64mb --maxmemory-policy allkeys-lru
    networks: [agentic-staging]

  api-staging:
    build: ./api
    restart: unless-stopped
    env_file: .env
    environment:
      SQLITE_PATH: /data/ops-staging.db
      REDIS_URL: redis://redis-staging:6379/0
      OPS_API_URL: http://api-staging:8080
    volumes:
      - /var/lib/ligbox-ops-platform-staging:/data
      - ./specs:/opt/ligbox-ops-platform/specs:ro
    ports:
      - "10.10.10.122:8180:8080"
    depends_on: [redis-staging]
    networks: [agentic-staging]

  worker-staging:
    build: ./worker
    restart: unless-stopped
    env_file: .env
    environment:
      OPS_API_URL: http://api-staging:8080
      REDIS_URL: redis://redis-staging:6379/0
      AGENTIC_INTERVAL_SEC: "300"
    depends_on: [redis-staging, api-staging]
    networks: [agentic-staging]

  frontend-staging:
    build: ./frontend
    restart: unless-stopped
    ports:
      - "10.10.10.122:8192:80"
    depends_on: [api-staging]
    networks: [agentic-staging]

networks:
  agentic-staging:
    driver: bridge
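
The worker code is not part of this diff view. Assuming it posts to `/api/v1/agents/internal/tick` once per `AGENTIC_INTERVAL_SEC` (300 seconds in the compose file above), the loop could look roughly like the sketch below. `run_tick_loop` and its parameters are hypothetical, and the actual HTTP call is injected as `post_tick`.

```python
import time
from typing import Callable, Optional


def run_tick_loop(
    post_tick: Callable[[], None],
    *,
    interval_s: float = 300.0,
    max_ticks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `post_tick` every `interval_s` seconds; return the number of ticks attempted.

    `max_ticks=None` loops forever, which is what a long-running worker would use.
    """
    done = 0
    while max_ticks is None or done < max_ticks:
        try:
            post_tick()
        except Exception as exc:
            # A failed tick is logged and skipped so the worker keeps running.
            print(f"tick failed: {exc}")
        done += 1
        if max_ticks is None or done < max_ticks:
            sleep(interval_s)
    return done
```

Catching the exception inside the loop keeps one unreachable API call from stopping all later ticks; `restart: unless-stopped` would only help after the process has already exited.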

@ -1,13 +1,22 @@
(function () {
const esc = (s) => String(s ?? '').replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;');
async function api(path, opts = {}) {
const h = { ...(opts.headers || {}) };
const h = { ...(opts.headers || {}), 'Content-Type': 'application/json' };
const t = window.DeskAuth?.getToken?.();
if (t) h.Authorization = `Bearer ${t}`;
const r = await fetch(`/api/v1/agents${path}`, { ...opts, headers: h });
if (!r.ok) throw new Error(`${r.status}`);
if (!r.ok) {
const err = await r.text();
throw new Error(`${r.status} ${err.slice(0, 200)}`);
}
return r.json();
}
async function sendChat(question) {
return api('/chat', { method: 'POST', body: JSON.stringify({ question, include_findings: true }) });
}
async function renderAgenticOps() {
const el = document.getElementById('agentic-ops-content');
if (!el) return;
@ -17,16 +26,63 @@
api('/health'), api('/scenarios'), api('/findings?limit=30'), api('/action-log?limit=40'),
]);
const tier = health.tier === 't1' ? 'T1 LLM' : 'T0';
const ollama = health.ollama ? '<span class="pill pill-ok">Ollama OK</span>' : '<span class="pill pill-warn">Ollama offline</span>';
const sRows = (scenarios.scenarios || []).map(s => `<tr><td><code>${esc(s.id)}</code></td><td>${esc(s.title)}</td><td>${esc(s.last_run_status||'—')}</td><td class="ticket-meta">${esc(s.last_run_at||'—')}</td></tr>`).join('');
const fRows = (findings.findings || []).map(f => `<article class="card agentic-finding"><h3>${esc(f.title)} <span class="pill">${esc(f.severity)}</span></h3><p class="ticket-meta">${esc(f.created_at)}</p>${f.suggested_human_action?`<p><strong>Acção:</strong> ${esc(f.suggested_human_action)}</p>`:''}<button type="button" class="btn btn-ghost btn-sm" data-ack="${f.id}">Marcar visto</button></article>`).join('') || '<p class="empty">Sem findings abertos.</p>';
const lRows = (log.events || []).map(e => `<tr><td class="ticket-meta">${esc(e.ts)}</td><td><code>${esc(e.event_type)}</code></td><td>${esc(e.message)}</td></tr>`).join('');
el.innerHTML = `<div class="toolbar agentic-toolbar"><div><h2>Agentic Ops</h2><p class="ticket-meta">Spec 029 · ${tier} ${ollama}</p></div><button type="button" class="btn btn-primary btn-sm" id="btn-agentic-refresh">Actualizar</button></div><div class="agentic-grid"><div class="card"><h3>Cenários</h3><table class="data-table"><thead><tr><th>ID</th><th>Título</th><th>Último</th><th>Quando</th></tr></thead><tbody>${sRows}</tbody></table></div><div class="agentic-findings-col"><h3>Findings</h3>${fRows}</div></div><section class="card" style="margin-top:1rem"><h3>Audit log</h3><table class="data-table data-table-compact"><thead><tr><th>Quando</th><th>Evento</th><th>Mensagem</th></tr></thead><tbody>${lRows}</tbody></table></section>`;
const ollama = health.ollama
? `<span class="pill pill-ok">Ollama OK · ${esc(health.model || '')}</span>`
: '<span class="pill pill-warn">Ollama offline — modo T0</span>';
const sRows = (scenarios.scenarios || []).map(s =>
`<tr><td><code>${esc(s.id)}</code></td><td>${esc(s.title)}</td><td>${esc(s.last_run_status||'—')}</td><td class="ticket-meta">${esc(s.last_run_at||'—')}</td></tr>`
).join('');
const fRows = (findings.findings || []).map(f =>
`<article class="card agentic-finding"><h3>${esc(f.title)} <span class="pill">${esc(f.severity)}</span></h3>`
+ `<p class="ticket-meta">${esc(f.created_at)}</p>`
+ (f.suggested_human_action ? `<p><strong>Acção:</strong> ${esc(f.suggested_human_action)}</p>` : '')
+ `<button type="button" class="btn btn-ghost btn-sm" data-ack="${esc(String(f.id))}">Marcar visto</button></article>`
).join('') || '<p class="empty">Sem findings abertos.</p>';
const lRows = (log.events || []).map(e =>
`<tr><td class="ticket-meta">${esc(e.ts)}</td><td><code>${esc(e.event_type)}</code></td><td>${esc(e.message)}</td></tr>`
).join('');
el.innerHTML = `
<div class="toolbar agentic-toolbar">
<div><h2>Agentic Ops</h2><p class="ticket-meta">Spec 029 · ${tier} ${ollama}</p></div>
<button type="button" class="btn btn-primary btn-sm" id="btn-agentic-refresh">Actualizar</button>
</div>
<div class="agentic-grid">
<div class="card"><h3>Cenários</h3>
<table class="data-table"><thead><tr><th>ID</th><th>Título</th><th>Último</th><th>Quando</th></tr></thead><tbody>${sRows}</tbody></table>
</div>
<div class="agentic-findings-col"><h3>Findings</h3>${fRows}</div>
</div>
<section class="card agentic-chat-card" style="margin-top:1rem">
<h3>Copiloto Ops (T1)</h3>
<p class="ticket-meta">Pergunte sobre infra, VM123, findings ou procedimentos. Resposta contextual em pt-BR.</p>
<div class="agentic-chat-box">
<textarea id="agentic-chat-input" rows="3" placeholder="Ex.: O que fazer se Ollama VM123 estiver offline?" class="input"></textarea>
<button type="button" class="btn btn-primary btn-sm" id="btn-agentic-chat">Perguntar</button>
</div>
<div id="agentic-chat-answer" class="agentic-chat-answer" hidden></div>
</section>
<section class="card" style="margin-top:1rem"><h3>Audit log</h3>
<table class="data-table data-table-compact"><thead><tr><th>Quando</th><th>Evento</th><th>Mensagem</th></tr></thead><tbody>${lRows}</tbody></table>
</section>`;
el.querySelector('#btn-agentic-refresh')?.addEventListener('click', renderAgenticOps);
el.querySelectorAll('[data-ack]').forEach(btn => btn.addEventListener('click', async () => {
await api(`/findings/${encodeURIComponent(btn.dataset.ack)}/ack`, { method: 'POST' });
await renderAgenticOps();
}));
el.querySelector('#btn-agentic-chat')?.addEventListener('click', async () => {
const input = el.querySelector('#agentic-chat-input');
const out = el.querySelector('#agentic-chat-answer');
const q = (input?.value || '').trim();
if (!q || !out) return;
out.hidden = false;
out.innerHTML = '<p class="loading">A pensar…</p>';
try {
const res = await sendChat(q);
out.innerHTML = `<p><strong>Resposta</strong> <span class="ticket-meta">(${esc(res.model)})</span></p><p>${esc(res.answer)}</p>`;
} catch (err) {
out.innerHTML = `<p class="error">${esc(err.message)}</p>`;
}
});
} catch (err) {
el.innerHTML = `<p class="error">Erro: ${esc(err.message)}</p>`;
}


@ -76,6 +76,7 @@ const views = {
'email-migration': document.getElementById('view-email-migration'),
infra: document.getElementById('view-infra'),
infra2: document.getElementById('view-infra2'),
'agentic-ops': document.getElementById('view-agentic-ops'),
messages: document.getElementById('view-messages'),
admin: document.getElementById('view-admin'),
account: document.getElementById('view-account'),
@ -210,6 +211,7 @@ function setView(name) {
tenants: 'Tenants',
infra: 'INFRA CODE',
infra2: 'SOC — Infra 2',
'agentic-ops': 'Agentic Ops',
messages: 'Mensagens — pedidos de cadastro',
admin: 'Administradores',
account: 'Minha conta',
@ -225,6 +227,7 @@ function setView(name) {
tenants: 'Operações Ligbox — onboarding, tickets e monitoramento',
infra: 'Infrastructure as Code — stack VMs 112, 114, 122, 123, 130',
infra2: 'Centro de operações — monitoramento visual VM112 → VM122 em tempo quase real',
'agentic-ops': 'Vigilância 24/7, findings, advisor IA e copiloto ops (Spec 029)',
messages: 'Operações Ligbox — onboarding, tickets e monitoramento',
admin: 'Operações Ligbox — onboarding, tickets e monitoramento',
account: 'Operações Ligbox — onboarding, tickets e monitoramento',
@ -4221,6 +4224,7 @@ async function refresh(options = {}) {
if (state.view === 'tenants') await renderTenants();
if (state.view === 'infra') await renderInfra();
if (state.view === 'infra2') await renderInfra2();
if (state.view === 'agentic-ops' && window.renderAgenticOps) await window.renderAgenticOps();
if (state.view === 'messages') await renderMessages();
if (state.view === 'admin') await renderAdmin();
if (state.view === 'modules') await renderModules();


@ -225,6 +225,10 @@
<span class="nav-icon-wrap" aria-hidden="true"><svg class="nav-icon-svg"><use href="#icon-infra2"/></svg></span>
<span class="nav-label">Infra 2 <span class="nav-badge-new">SOC</span></span>
</button>
<button type="button" data-view="agentic-ops" data-module="agentic-ops" id="nav-agentic-ops" class="nav-item nav-item-agentic">
<span class="nav-icon-wrap" aria-hidden="true"><svg class="nav-icon-svg"><use href="#icon-infra"/></svg></span>
<span class="nav-label">Agentic Ops</span>
</button>
<button type="button" data-view="account" data-module="core" id="nav-account" class="nav-item nav-item-account">
<span class="nav-icon-wrap" aria-hidden="true"><svg class="nav-icon-svg"><use href="#icon-account"/></svg></span>
<span class="nav-label">Minha conta</span>
@ -326,6 +330,10 @@
<div id="infra2-content"><p class="loading">Carregando SOC…</p></div>
</section>
<section id="view-agentic-ops" class="view">
<div id="agentic-ops-content"><p class="loading">Carregando Agentic Ops…</p></div>
</section>
<section id="view-account" class="view">
<div id="account-content"><p class="loading">Carregando…</p></div>
</section>
@ -434,7 +442,8 @@
<script src="/assets/desk-live-stub.js?v=20260619tickets2"></script>
<script src="/assets/tickets-workspace.js?v=20260619tickets2"></script>
<script src="/assets/tickets-detail-panel.js?v=20260619tickets2"></script>
<script src="/assets/servicos.js?v=20260619tickets2"></script>
<script src="/assets/app.js?v=20260619tickets2"></script>
<script src="/assets/servicos.js?v=20260620agentic"></script>
<script src="/assets/agentic-ops.js?v=20260620agentic"></script>
<script src="/assets/app.js?v=20260620agentic"></script>
</body>
</html>


@ -14,6 +14,7 @@ WORKER_INTERVAL = int(os.getenv("WORKER_INTERVAL", "120"))
AUDIT_INTERVAL_SEC = int(os.getenv("AUDIT_INTERVAL_SEC", "600"))
LEAD_SYNC_INTERVAL_SEC = int(os.getenv("LEAD_SYNC_INTERVAL_SEC", "900"))
WEBHOOK_GAP_ALERT_MIN = int(os.getenv("WEBHOOK_GAP_ALERT_MIN", "15"))
AGENTIC_INTERVAL_SEC = int(os.getenv("AGENTIC_INTERVAL_SEC", "300"))
OPS_NTFY_TOPIC = os.getenv("DESK_OPS_NTFY_TOPIC", "").strip()
@ -40,6 +41,21 @@ def poll_vm112() -> None:
        print(f"[worker] vm112 ERROR: {exc}", flush=True)


def agentic_tick() -> None:
    """Spec 029: run all agent scenarios (T0 checks + T1 advisor)."""
    if not OPS_INTERNAL_TOKEN:
        return
    try:
        with httpx.Client(timeout=180.0) as client:
            response = client.post(
                f"{OPS_API_URL}/api/v1/agents/internal/tick",
                headers={"X-Ops-Internal-Token": OPS_INTERNAL_TOKEN},
            )
            print(f"[worker] agentic tick {response.status_code}: {response.text[:200]}", flush=True)
    except Exception as exc:
        print(f"[worker] agentic tick ERROR: {exc}", flush=True)


def check_integration_gap() -> None:
    if not OPS_INTERNAL_TOKEN:
        return
@ -83,6 +99,7 @@ def main() -> None:
    print("[worker] started", flush=True)
    last_audit = 0.0
    last_lead_sync = 0.0
    last_agentic = 0.0
    while True:
        event = redis_client.rpop("ops:events")
        if event:
@ -96,6 +113,9 @@ def main() -> None:
            sync_stale_leads()
            check_integration_gap()
            last_lead_sync = now
        if now - last_agentic >= AGENTIC_INTERVAL_SEC:
            agentic_tick()
            last_agentic = now
        time.sleep(WORKER_INTERVAL)


@ -0,0 +1,48 @@
# Agent Platform API — Spec 029
Base: `https://api.ops.ligbox.com.br/api/v1/agents` (prod)
Staging: `http://10.10.10.122:8180/api/v1/agents`
## GET /health
```json
{
"status": "ok",
"tier": "t1",
"ollama": true,
"ollama_url": "http://10.10.10.123:11434",
"model": "qwen2.5:7b-instruct",
"embed_model": "nomic-embed-text"
}
```
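
The quickstart expects `tier` to be `t1` only when `AGENTIC_LLM_ENABLED=true` and Ollama responds, and the Desk UI labels an offline Ollama as T0 mode. A minimal sketch of that decision, assuming nothing else influences the tier; `resolve_tier` and `describe_health` are hypothetical names, not functions confirmed to exist in the API:

```python
def resolve_tier(llm_enabled: bool, ollama_ok: bool) -> str:
    """Return "t1" when the LLM advisor is usable, otherwise fall back to "t0".

    Assumption: these two inputs are the only ones that matter.
    """
    return "t1" if llm_enabled and ollama_ok else "t0"


def describe_health(payload: dict) -> str:
    """Render a /health payload as one status line for logs."""
    backend = payload.get("model", "?") if payload.get("ollama") else "offline"
    return f"{payload.get('status', '?')} tier={payload.get('tier', '?')} ollama={backend}"
```
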
## POST /internal/tick
Headers: `X-Ops-Internal-Token: {OPS_INTERNAL_TOKEN}`
Response:
```json
{
"ok": true,
"kb_indexed": 65,
"runs": [{"ok": true, "scenario_id": "desk.api.health", "status": "ok", "findings_count": 0}],
"total": 9
}
```
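
A caller such as a cron wrapper may want to collapse this response into a few counters. The sketch below assumes the payload shape shown above and treats any `status` other than `"ok"` as needing attention; that rule and the helper name `summarize_tick` are illustrative assumptions, not part of the contract:

```python
import json

# Sample copied from the response above; "runs" is abbreviated to one entry there.
SAMPLE = json.loads("""
{
  "ok": true,
  "kb_indexed": 65,
  "runs": [{"ok": true, "scenario_id": "desk.api.health", "status": "ok", "findings_count": 0}],
  "total": 9
}
""")


def summarize_tick(payload: dict) -> dict:
    """Collapse a tick response into counts a cron log or alert can use."""
    runs = payload.get("runs", [])
    # Assumption: any status other than "ok" means the scenario needs attention.
    failing = [r.get("scenario_id", "?") for r in runs if r.get("status") != "ok"]
    return {
        "healthy": bool(payload.get("ok")) and not failing,
        "reported_runs": len(runs),
        "total": payload.get("total"),
        "failing": failing,
        "findings": sum(r.get("findings_count", 0) for r in runs),
    }


print(summarize_tick(SAMPLE))
```
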
## POST /chat
Headers: `Authorization: Bearer {jwt}`
Body:
```json
{"question": "O que fazer se gap webhook > 15min?", "include_findings": true}
```
Response:
```json
{"answer": "...", "model": "qwen2.5:7b-instruct", "kb_hits": 3}
```
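
A client could prepare the `/chat` call roughly as follows. This is a minimal standard-library sketch that builds the request without sending it; `build_chat_request` and `parse_chat_response` are hypothetical helper names, and the default base URL is the staging address listed at the top of this contract:

```python
import json
import urllib.request

STAGING_BASE = "http://10.10.10.122:8180/api/v1/agents"  # staging URL from this contract


def build_chat_request(question: str, jwt: str, include_findings: bool = True,
                       base: str = STAGING_BASE) -> urllib.request.Request:
    """Build (but do not send) the POST /chat request."""
    body = json.dumps({"question": question, "include_findings": include_findings})
    return urllib.request.Request(
        f"{base}/chat",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Authorization": f"Bearer {jwt}", "Content-Type": "application/json"},
    )


def parse_chat_response(raw: bytes) -> tuple:
    """Extract (answer, model, kb_hits) from a /chat response body."""
    data = json.loads(raw)
    return data["answer"], data["model"], int(data.get("kb_hits", 0))
```

Sending the request is then a matter of `urllib.request.urlopen(req, timeout=...)`; a generous timeout is advisable because the answer comes from a local LLM.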


@ -0,0 +1,62 @@
# Quickstart — Spec 029 Agentic Ops
## Staging (acceptance testing, ports 8180/8192)
```bash
ssh root@10.10.10.122
cd /opt/ligbox-ops-platform-staging
git fetch && git checkout 029-agentic-ops-runbooks && git pull
cp .env.staging.example .env # adjust tokens
docker compose -f docker-compose.agentic-staging.yml up -d --build
```
### Validate T0
```bash
curl -s http://10.10.10.122:8180/api/health | jq .
curl -s http://10.10.10.122:8180/api/v1/agents/health | jq .
# Expected: tier t1 if AGENTIC_LLM_ENABLED=true and Ollama is reachable
curl -s -X POST http://10.10.10.122:8180/api/v1/agents/internal/tick \
  -H "X-Ops-Internal-Token: YOUR_TOKEN" | jq .
```
### Validate T1 (Ollama on VM123)
```bash
ssh root@10.10.10.123 'curl -s http://127.0.0.1:11434/api/tags'
# Expected: qwen2.5:7b-instruct, nomic-embed-text
```
### Validate chat (with a Desk JWT)
```bash
TOKEN=$(curl -s -X POST http://10.10.10.122:8180/api/v1/auth/login \
-H 'Content-Type: application/json' \
-d '{"username":"root","password":"..."}' | jq -r .access_token)
curl -s -X POST http://10.10.10.122:8180/api/v1/agents/chat \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{"question":"Como validar FOSSBilling na VM123?"}' | jq .
```
## Acceptance checklist
- [ ] `/api/v1/agents/health` → 200, `ollama` true
- [ ] Internal tick → runs for all 9 scenarios
- [ ] Findings written to the staging SQLite database
- [ ] Test e-mail on a critical finding (optional)
- [ ] Agentic Ops UI in the staging Desk at `:8192`
- [ ] Copilot chat answers in pt-BR
- [ ] Production `:8080` untouched (previous version)
## Promote to production
Only after the checklist passes:
```bash
cd /opt/ligbox-ops-platform
git checkout 029-agentic-ops-runbooks && git pull
docker compose -f docker-compose.mvp.yml up -d --build api worker frontend
```


@ -0,0 +1,118 @@
# Spec 029 — Agentic Ops Runbooks (T0 → T1)
**Created:** 2026-06-20
**Requested by:** Roger
**Status:** Staging acceptance testing (branch `029-agentic-ops-runbooks`)
**Priority:** P1 (backlog AG-1)
**Systems:** VM122 (orchestration) · VM123 (Ollama LLM) · VM112/104/Proxmox/pfSense (targets)
---
## Summary
An **Agentic Ops** layer for 24/7 monitoring, deterministic checks (T0), a local LLM advisor (T1), e-mail on critical findings, and a contextual copilot in the Desk.
| Tier | Engine | Where |
|------|--------|-------|
| **T0** | HTTP/SQLite checks + text fallback | VM122 API + worker |
| **T1** | Ollama `qwen2.5:7b-instruct` + RAG over specs | VM123 `:11434` |
**Desk production:** `8080` / `8091`, **not changed** in this delivery.
**Staging (acceptance testing):** `8180` / `8192`, isolated stack (`docker-compose.agentic-staging.yml`).
---
## Logical agents (029 implementation)
| ID | Role | Function |
|----|------|----------|
| `sentinel` | Health/API | Scenarios for desk, wizard, pfSense, proxmox, ollama, VM123 |
| `dispatcher` | Funnel | Stuck onboarding tickets |
| `curator` | KB | Indexes `/specs/**/*.md` into SQLite |
| `advisor` | T1 | human_action suggestions + `/chat` copilot |
| `orchestrator` | Tick | Worker cron, triggers all scenarios |
The future mapping to Spec 027 A0-A7 stays under RBAC governance; this spec delivers the **operational MVP**.
---
## Scenarios (registry.yaml)
1. `desk.api.health`: Desk on VM122
2. `wizard.vm112.bundle`: VM112 API + portal
3. `pfsense.api.system`: pfSense via Traefik
4. `funnel.stuck.onboarding`: tickets older than 24h
5. `integration.webhook.gap`: webhook gap VM112→VM122
6. `proxmox.cluster`: VMs 112/122/123/104
7. `ollama.vm123.health`: LLM backend
8. `vm123.finance.stack`: FOSS + Odoo
9. `vm123.openpanel.bridge`: hosting bridge
---
## API (`/api/v1/agents/*`)
| Method | Path | Auth |
|--------|------|------|
| GET | `/health` | public |
| GET | `/scenarios` | ops view |
| GET | `/findings` | ops view |
| POST | `/findings/{id}/ack` | ops view |
| GET | `/action-log` | ops view |
| POST | `/runs/{scenario_id}` | super_admin, ops_lead, agentic_operator |
| POST | `/chat` | ops view (T1 copilot) |
| POST | `/internal/tick` | internal token / worker cron |
---
## Worker
- `AGENTIC_INTERVAL_SEC=300` (5 min)
- `POST /api/v1/agents/internal/tick` via `OPS_INTERNAL_TOKEN`
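
The worker loop checks the interval once per wake-up and then sleeps `WORKER_INTERVAL` seconds (default 120 in the worker), so ticks land on multiples of the sleep and not exactly every 300 s. A rough simulation, assuming each iteration's work takes no time; `tick_times` is a hypothetical helper written only for illustration:

```python
def tick_times(agentic_interval: int, worker_interval: int, horizon: int) -> list:
    """Simulate the wake-ups (in seconds) at which the agentic tick would fire."""
    fired = []
    last = None  # mirrors last_agentic = 0.0: the first wake-up always fires
    for now in range(0, horizon + 1, worker_interval):
        if last is None or now - last >= agentic_interval:
            fired.append(now)
            last = now
    return fired


# With the defaults the effective period is 360 s, the first multiple of 120 >= 300.
print(tick_times(300, 120, 900))  # [0, 360, 720]
# A 60 s sleep divides 300 evenly, so the configured period is met.
print(tick_times(300, 60, 900))   # [0, 300, 600, 900]
```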
---
## Notifications
- **E-mail:** `high`/`critical` findings → `DESK_ROOT_NOTIFY_EMAIL`
- **ntfy:** optional, via `DESK_OPS_NTFY_TOPIC`
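
The routing rule above can be sketched as a small pure function. This is an illustration under stated assumptions: the spec only fixes the e-mail threshold (`high`/`critical`), so applying the same threshold to ntfy is a guess, and `channels_for` is a hypothetical name:

```python
EMAIL_SEVERITIES = frozenset({"high", "critical"})


def channels_for(severity: str, root_email: str = "", ntfy_topic: str = "") -> list:
    """List the channels a finding of this severity would be sent to."""
    if severity.lower() not in EMAIL_SEVERITIES:
        return []
    channels = []
    if root_email:  # value of DESK_ROOT_NOTIFY_EMAIL
        channels.append(f"email:{root_email}")
    # Assumption: ntfy uses the same severity threshold as e-mail.
    if ntfy_topic:  # value of DESK_OPS_NTFY_TOPIC; empty means disabled
        channels.append(f"ntfy:{ntfy_topic}")
    return channels
```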
---
## `.env` variables
```bash
AGENTIC_LLM_ENABLED=true
OLLAMA_BASE_URL=http://10.10.10.123:11434
AGENTIC_LLM_MODEL=qwen2.5:7b-instruct
AGENTIC_EMBED_MODEL=nomic-embed-text
AGENTIC_INTERVAL_SEC=300
AGENTIC_SPECS_ROOT=/opt/ligbox-ops-platform/specs
AGENTIC_CRITICAL_VMIDS=112,122,123,104
VM123_IP=10.10.10.123
OPENPANEL_BRIDGE_URL=http://10.10.10.123:18087
```
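
These settings could be read into a typed object along the following lines. This is a sketch, not the project's actual loader: the fallback values simply mirror the list above, the set of truthy strings for `AGENTIC_LLM_ENABLED` is an assumption, and `AgenticConfig` / `load_config` are hypothetical names:

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgenticConfig:
    llm_enabled: bool
    ollama_base_url: str
    llm_model: str
    embed_model: str
    interval_sec: int
    critical_vmids: tuple


def load_config(env: Mapping[str, str] = os.environ) -> AgenticConfig:
    """Read the agentic settings; fallbacks mirror the values listed above."""
    # Assumption: "1", "true", "yes" and "on" (any case) enable the LLM tier.
    flag = env.get("AGENTIC_LLM_ENABLED", "false").strip().lower()
    raw_vmids = env.get("AGENTIC_CRITICAL_VMIDS", "")
    return AgenticConfig(
        llm_enabled=flag in {"1", "true", "yes", "on"},
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://10.10.10.123:11434").rstrip("/"),
        llm_model=env.get("AGENTIC_LLM_MODEL", "qwen2.5:7b-instruct"),
        embed_model=env.get("AGENTIC_EMBED_MODEL", "nomic-embed-text"),
        interval_sec=int(env.get("AGENTIC_INTERVAL_SEC", "300")),
        # "112,122,123,104" -> (112, 122, 123, 104); empty entries are skipped.
        critical_vmids=tuple(int(p) for p in raw_vmids.split(",") if p.strip()),
    )
```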
---
## Acceptance testing
```bash
# Staging on VM122 (isolated ports)
cd /opt/ligbox-ops-platform-staging
docker compose -f docker-compose.agentic-staging.yml up -d --build
curl -s http://10.10.10.122:8180/api/v1/agents/health
curl -s -X POST http://10.10.10.122:8180/api/v1/agents/internal/tick \
-H "X-Ops-Internal-Token: $OPS_INTERNAL_TOKEN"
```
Promote to production only after completing the checklist in `quickstart.md`.
---
## Related documents
- Spec **027**: RBAC `agentic_operator`, A0-A7 governance
- Spec **019**: Console, R0-R3 policies
- `contracts/agent-platform-api.md`
- `quickstart.md`