rewrite 2.0.0: real process — extract the algorithm into DMN

The 1.x package was a single ai.extract call wrapped in three BPMN service
tasks. No decision logic, no DMN cornerstone, no weights: the
risk/routing/validation algorithm lived invisibly in host code. There was
nothing for a runtime to actually execute.

2.0.0 makes it a real process:

- DMN cornerstone added with three decision tables:
  * assess-personal-data-risk: PII regex signals -> risk level
  * gdpr-processing-route: risk x centralisation -> CENTRAL/LOCAL,
    anonymisation, redaction level
  * human-validation-gate: confidence thresholds + PII re-scan ->
    REJECTED/PENDING_REVIEW/APPROVED_AUTO
- BPMN expanded 3 -> 6 nodes (3 serviceTask + 3 businessRuleTask), with
  horizontal DI.
- Task ids, mappings, docs, manifest (dmn:true), uapf.yaml, lifecycle and
  eval-set updated; added a PII-bearing fixture.

Only the semantic extraction remains a model step. Risk classification,
GDPR routing and validation gating are now explicit ranked DMN rules:
inspectable, versioned, portable.

Breaking change: structure + outputs.
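
For readers unfamiliar with ranked decision tables, the human-validation-gate named in the message can be pictured roughly as follows. This is only an illustrative sketch, not the package's actual DMN: the two threshold constants, the rule order and the choice to reject on any PII re-scan hit are assumptions made for the example. Only the input names (aiConfidenceScore, outputPiiErrorCount) and the three outcomes come from the commit.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical thresholds: the commit does not state the real values.
AUTO_APPROVE_MIN_CONFIDENCE = 0.85
REJECT_BELOW_CONFIDENCE = 0.50


@dataclass(frozen=True)
class Rule:
    rank: int                                # lower rank is evaluated first
    condition: Callable[[float, int], bool]  # (confidence, pii_errors) -> match?
    outcome: str


# Assumed rule set, evaluated in rank order; the first matching rule wins.
RULES: List[Rule] = [
    Rule(1, lambda conf, pii: pii > 0, "REJECTED"),
    Rule(2, lambda conf, pii: conf < REJECT_BELOW_CONFIDENCE, "REJECTED"),
    Rule(3, lambda conf, pii: conf < AUTO_APPROVE_MIN_CONFIDENCE, "PENDING_REVIEW"),
    Rule(4, lambda conf, pii: True, "APPROVED_AUTO"),  # catch-all
]


def human_validation_gate(ai_confidence_score: float,
                          output_pii_error_count: int) -> str:
    """Return the outcome of the first rule (by rank) whose condition holds."""
    for rule in sorted(RULES, key=lambda r: r.rank):
        if rule.condition(ai_confidence_score, output_pii_error_count):
            return rule.outcome
    raise ValueError("no rule matched")  # unreachable while the catch-all exists


if __name__ == "__main__":
    print(human_validation_gate(0.95, 0))
    print(human_validation_gate(0.70, 0))
    print(human_validation_gate(0.95, 2))
```

Keeping the rules as data rather than nested conditionals mirrors the point of the change: the gate can be read, versioned and diffed without touching host code.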
2026-05-17 20:00:36 +00:00
parent 3f1d62c748
commit dd69a04355
15 changed files with 496 additions and 120 deletions
--- a/resources/mappings.yaml
+++ b/resources/mappings.yaml
@@ -1,49 +1,60 @@
 kind: uapf.resources.mapping

+# Host-readable contract for the capability-backed service tasks. The three
+# DMN decisions (assess-personal-data-risk, gdpr-processing-route,
+# human-validation-gate) are NOT listed here: they are evaluated by the
+# UAPF runtime against the dmn/ cornerstone and need no host resource.
+
 targets:
  - id: agent.semantic-extractor
    type: ai_agent
    name: Semantic Extraction AI Agent
    description: |
-      Host-provided AI agent that fulfils ai.redact@1, ai.extract@1, and
-      event.emit@1 for this process. Implementation is the host's choice
-      (Claude, GPT, on-prem LLM, etc.); this package supplies the BPMN
-      flow, the output schema, and the guardrails.
+      Host-provided agent fulfilling ai.redact@1, ai.extract@1 and
+      event.emit@1. Implementation is the host's choice; this package
+      supplies the BPMN flow, the DMN decision logic, the output schema
+      and the guardrails.
    capabilities:
      - capability.ai.redact
      - capability.ai.extract
      - capability.event.emit

 bindings:
-  - source: { type: bpmn.serviceTask, ref: Task_RedactPii }
+  - source: { type: bpmn.serviceTask, ref: Task_DetectRedactPii }
    targetId: agent.semantic-extractor
    mode: autonomous
    contract:
      input:
-        - { name: text,       type: string, required: true }
-        - { name: categories, type: array,  required: false, description: "Optional PII categories; defaults to host policy." }
+        - { name: content, type: string, required: true }
      output:
-        - { name: redactedText, type: string }
-        - { name: detections,   type: array }
+        - { name: redactedContent,     type: string,  description: "Source text with PII masked." }
+        - { name: detectedEntityTypes, type: array,   description: "PII TYPE names only, never values." }
+        - { name: personasKodaPresent, type: boolean, description: "Latvian national ID regex hit." }
+        - { name: financialDataPresent,type: boolean, description: "IBAN regex hit." }
+        - { name: contactDataPresent,  type: boolean, description: "E-mail or phone regex hit." }
+        - { name: piiCategoryCount,    type: number,  description: "Count of distinct PII categories detected." }
      timeout: "10s"
    requiredCapabilities: [capability.ai.redact]
+    feeds: [assess-personal-data-risk]

  - source: { type: bpmn.serviceTask, ref: Task_ExtractSemantics }
    targetId: agent.semantic-extractor
    mode: autonomous
    contract:
      input:
-        - { name: text,   type: string, required: true,  description: "Redacted text from previous task." }
-        - { name: schema, type: object, required: true,  description: "VDVC v1.1 output schema. Reference: resources/schemas/vdvc-semantic-summary.schema.json" }
+        - { name: redactedContent, type: string, required: true }
+        - { name: schemaRef,       type: string, required: true, description: "resources/schemas/vdvc-semantic-summary.schema.json" }
      output:
-        - { name: extracted,  type: object, description: "Validates against resources/schemas/vdvc-semantic-summary.schema.json" }
-        - { name: confidence, type: number }
-        - { name: modelUsed,  type: string }
+        - { name: semanticSummary,     type: object, description: "Validates against the VDVC v1.1 schema." }
+        - { name: sensitivityControl,  type: object }
+        - { name: aiConfidenceScore,   type: number, description: "Flat 0.0-1.0; consumed by human-validation-gate." }
+        - { name: outputPiiErrorCount, type: number, description: "PII re-scan hits on extracted text; consumed by human-validation-gate." }
      timeout: "30s"
      retries: { maxAttempts: 2, backoffMs: 2000 }
    requiredCapabilities: [capability.ai.extract]
+    feeds: [human-validation-gate]

-  - source: { type: bpmn.serviceTask, ref: Task_EmitResultEvent }
+  - source: { type: bpmn.serviceTask, ref: Task_EmitResult }
    targetId: agent.semantic-extractor
    mode: autonomous
    contract: