2026-05-18 14:05:37 +03:00

kind: uapf.guardrails
version: "0.3.0"
# UAPF-IP guardrails enforced at every capability call made by a UAPF-IP host
# running this package. Guardrails are evaluated before and after capability
# execution; a violation blocks the call and is logged to the audit trail.
principles:
- id: GR-1
  name: AI is advisory, never authoritative
  rule: >-
    AI agents and MCP tools (ai.classify, ai.complete) may only produce
    recommendations. Object classification, threat severity, public-broadcast
    authorisation, interception authorisation and origin attribution must be
    confirmed by an accountable human role before any downstream effect.
  appliesTo: [Decision_ObjectClassification, Decision_AirThreatSeverity,
              Decision_InterceptionAuthorization, HT_OriginAttribution]
- id: GR-2
  name: No autonomous use of force
  rule: >-
    No capability may move the process to an interception "engage" outcome
    without a recorded manual authorisation by NBS Joint Staff. The
    Decision_InterceptionAuthorization table is decision support only.
  appliesTo: [Task_AuthorizeInterception, Task_GroundEngage, Task_RequestBAP]
- id: GR-3
  name: Notification SLA is a hard limit
  rule: >-
    broadcastSlaSeconds from Decision_NotificationUrgency is a maximum, not a
    target. The SLA-breach boundary timer must remain enabled; disabling it
    is a safeguard removal and must be recorded.
  appliesTo: [Task_DispatchBroadcast, Boundary_SlaBreach]
- id: GR-4
  name: Public messaging is human-authored
  rule: >-
    Cell-broadcast and all-clear messages are selected from pre-approved,
    human-authored templates (MSG_*). Free-text generation of public alert
    content by an AI agent is prohibited.
  appliesTo: [Task_DispatchBroadcast, Task_AllClearBroadcast, HT_PressBriefing]
- id: GR-5
  name: Auditability
  rule: >-
    Every decision evaluation records its inputs, the matched rule id and the
    output, with actor identity and timestamp, retained per metadata/policies.yaml.
  appliesTo: ["*"]
- id: GR-6
  name: Data minimisation in disinformation monitoring
  rule: >-
    The OSINT monitoring tool processes only public posts; it must not ingest
    or store personal data of identifiable citizens beyond what is required
    to classify a post.
  appliesTo: [HT_DisinfoMonitor]
enforcement:
  onViolation: block
  audit: required
  reviewAuthority: "Ministry of Defence (algorithm governance)"
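
# Illustrative sketch only, not part of this package's schema: one possible
# shape for the audit record that GR-5 requires. Every field name below is an
# assumption chosen for readability; the rule text itself only requires that
# the inputs, matched rule id, output, actor identity and timestamp be
# recorded. Values in angle brackets are placeholders, not real data.
#
# auditRecord:
#   decision: Decision_AirThreatSeverity    # decision table that was evaluated
#   inputs: {}                              # inputs passed to the evaluation
#   matchedRuleId: "<rule id>"
#   output: "<decision output>"
#   actor: "<accountable role or agent identity>"
#   timestamp: "<ISO 8601 timestamp>"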