Compare commits: v3.0.0-alg...v3.2.0-emb (4 commits)
| SHA1 |
|---|
| 9b3790c1fa |
| e97b9d7d40 |
| 59c87ee9a4 |
| 0a65c7ea5f |
93
README.md
@@ -1,17 +1,21 @@
|
|||||||
# Semantic Document Analysis
|
# Semantic Document Analysis
|
||||||
|
|
||||||
UAPF Level-4 process for semantic analysis of free-text documents,
|
UAPF Level-4 process for semantic analysis of free-text documents,
|
||||||
governed by **UAPF v2.3.0** (Algorithm Cards).
|
governed by **UAPF v2.4.0** (Algorithm Cards visible on BPMN tasks).
|
||||||
|
|
||||||
## What this package does
|
## What this package does
|
||||||
|
|
||||||
Three BPMN service tasks invoke three UAPF-IP host capabilities:
|
Three BPMN service tasks invoke three UAPF-IP host capabilities. Each
|
||||||
|
service task carries `uapf24:algorithmCardRef` pointing at the
|
||||||
|
Algorithm Card that governs the algorithm being invoked, and a
|
||||||
|
`<bpmn:ioSpecification>` synthesised from the card's `io` block so
|
||||||
|
inputs and outputs render as visible data objects.
|
||||||
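A minimal sketch of how a `<bpmn:ioSpecification>` could be synthesised from a card's `io` block, assuming the card has already been parsed into a Python dict. The helper name `synthesise_io_spec` is hypothetical and is not part of any UAPF tooling shown here; the element names and the `id : type` naming convention mirror the BPMN that appears later in this diff.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
ET.register_namespace("bpmn", BPMN_NS)


def synthesise_io_spec(io_block):
    """Build a <bpmn:ioSpecification> element from a card's io block.

    Hypothetical helper: illustrates the mapping only, not the real generator.
    """
    def q(tag):
        # Qualify a tag name with the BPMN namespace.
        return f"{{{BPMN_NS}}}{tag}"

    inputs = io_block.get("inputs", [])
    outputs = io_block.get("outputs", [])
    spec = ET.Element(q("ioSpecification"))

    # One dataInput / dataOutput per io entry, named "id : type".
    for item in inputs:
        ET.SubElement(spec, q("dataInput"), id=item["id"],
                      name=f"{item['id']} : {item['type']}")
    for item in outputs:
        ET.SubElement(spec, q("dataOutput"), id=item["id"],
                      name=f"{item['id']} : {item['type']}")

    # The input and output sets reference the data items by id.
    input_set = ET.SubElement(spec, q("inputSet"))
    for item in inputs:
        ET.SubElement(input_set, q("dataInputRefs")).text = item["id"]
    output_set = ET.SubElement(spec, q("outputSet"))
    for item in outputs:
        ET.SubElement(output_set, q("dataOutputRefs")).text = item["id"]
    return spec


# io block of the completion event emitter card, written out as a plain dict.
emitter_io = {
    "inputs": [
        {"id": "event_type", "type": "string"},
        {"id": "payload", "type": "object"},
    ],
    "outputs": [{"id": "published", "type": "boolean"}],
}

spec = synthesise_io_spec(emitter_io)
print(ET.tostring(spec, encoding="unicode"))
```

Keeping the mapping purely mechanical (one data item per `io` entry) means the BPMN can be regenerated whenever a card's contract changes.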
|
|
||||||
| Task | Capability | Algorithm Card |
|
| Task | Capability | Algorithm Card | Risk class |
|
||||||
|-----------------------|----------------|---------------------------------------------------------------------|
|
|-----------------------|----------------|---------------------------------------------------------------------|------------|
|
||||||
| `Task_DetectRedactPii`| `ai.redact@1` | [`algorithms/pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml) |
|
| `Task_DetectRedactPii`| `ai.redact@1` | [`pii_redactor.card.yaml`](algorithms/pii_redactor.card.yaml) | limited |
|
||||||
| `Task_ExtractSemantics`| `ai.extract@1`| [`algorithms/vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) |
|
| `Task_ExtractSemantics`| `ai.extract@1`| [`vdvc_semantic_extractor.card.yaml`](algorithms/vdvc_semantic_extractor.card.yaml) | high |
|
||||||
| `Task_EmitResult` | `event.emit@1` | [`algorithms/completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) |
|
| `Task_EmitResult` | `event.emit@1` | [`completion_event_emitter.card.yaml`](algorithms/completion_event_emitter.card.yaml) | minimal |
|
||||||
|
|
||||||
Three DMN decision tables encode the deterministic policy:
|
Three DMN decision tables encode the deterministic policy:
|
||||||
|
|
||||||
@@ -25,42 +29,59 @@ Only `Task_ExtractSemantics` is a model-inference step (governed by the
|
|||||||
high-risk `vdvc_semantic_extractor` Card). Everything else is
|
high-risk `vdvc_semantic_extractor` Card). Everything else is
|
||||||
deterministic.
|
deterministic.
|
||||||
|
|
||||||
## v3.0.0 — Algorithm Cards
|
## v3.1.0 — Algorithm Cards visible on BPMN
|
||||||
|
|
||||||
The three opaque host capabilities are now wrapped in Algorithm Cards
|
In v3.1.0, the Algorithm Card references move from `resources/mappings.yaml`
|
||||||
under `algorithms/`. Each Card supplies, per UAPF v2.3.0 chapter 13:
|
targets onto the BPMN service tasks themselves, per UAPF v2.4.0. This
|
||||||
intent, IO contract, ownership, validation history, risk class, audit
|
matters because:
|
||||||
configuration, and (where relevant) `privacy` and `risk` extensions.
|
|
||||||
|
- A reader of the BPMN diagram now sees *which algorithm* runs at each
|
||||||
|
step, by inspecting the rendered task.
|
||||||
|
- The card's IO contract is synthesised into the task's
|
||||||
|
`<bpmn:ioSpecification>`, so downstream gateway conditions branching
|
||||||
|
on outputs like `ai_confidence_score` or `personas_koda_present`
|
||||||
|
are visually traceable to their source.
|
||||||
|
- A renderer that supports `uapf24:algorithmCardRef` (e.g., ProcessGit
|
||||||
|
preview, OpenDMS visualiser) draws the algorithm-card icon, name,
|
||||||
|
version, and risk-class dot directly on the task.
|
||||||
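The points above can be sketched from a tool's perspective: a minimal example, assuming only the Python standard library, of reading `uapf24:algorithmCardRef` from each service task. The namespace URIs and the task attributes are taken from the BPMN in this diff; the `SAMPLE_BPMN` fragment, its process id, and the function name `card_refs_by_task` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"
UAPF24_NS = "https://uapf.dev/bpmn/v2.4"

# Trimmed, illustrative fragment; the real file contains more tasks and diagram data.
SAMPLE_BPMN = """<bpmn:definitions
    xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
    xmlns:uapf="https://uapf.dev/bpmn-ext/v1"
    xmlns:uapf24="https://uapf.dev/bpmn/v2.4">
  <bpmn:process id="Process_Sample">
    <bpmn:serviceTask id="Task_DetectRedactPii"
        name="Detect and redact PII"
        uapf:capability="ai.redact@1"
        uapf24:algorithmCardRef="algo.semantic_document_analysis.pii_redactor"/>
  </bpmn:process>
</bpmn:definitions>"""


def card_refs_by_task(bpmn_xml):
    """Map each service task id to its Algorithm Card reference (None if absent)."""
    root = ET.fromstring(bpmn_xml)
    refs = {}
    for task in root.iter(f"{{{BPMN_NS}}}serviceTask"):
        refs[task.get("id")] = task.get(f"{{{UAPF24_NS}}}algorithmCardRef")
    return refs


card_refs = card_refs_by_task(SAMPLE_BPMN)
print(card_refs)
```

A task that lacks the attribute maps to `None`, which gives a reviewer a quick way to spot service tasks that are not yet governed by a card.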
|
|
||||||
Audit question → answer-location:
|
Audit question → answer-location:
|
||||||
|
|
||||||
| Auditor asks | Read this |
|
| Auditor asks | Read this |
|
||||||
|-----------------------------------------------|------------------------------------------------|
|
|-----------------------------------------------|------------------------------------------------|
|
||||||
| What does the redactor detect? | `algorithms/pii_redactor.card.yaml` § io |
|
| Which algorithm runs at task X? | the BPMN itself: `uapf24:algorithmCardRef` attr |
|
||||||
| What's the AI Act risk class of the extractor?| `vdvc_semantic_extractor.card.yaml` § risk |
|
| What inputs/outputs does it have? | the BPMN task's `<bpmn:ioSpecification>` block |
|
||||||
| Who owns each algorithm? | each Card § owners |
|
| What is the algorithm's risk class? | the Card's `risk.aiActRiskClass` field |
|
||||||
| When was each algorithm last validated? | each Card § validation |
|
| When was the algorithm last validated? | the Card's `validation.last_validated` |
|
||||||
| What gets logged, with what retention? | each Card § audit |
|
| What gets logged, with what retention? | the Card's `audit` block |
|
||||||
| Why is human oversight needed? | `vdvc_semantic_extractor.card.yaml` § confidence |
|
| Why is human oversight needed? | the Card's `confidence` + `risk` blocks |
|
||||||
|
|
||||||
### Delta from v2.0.0
|
### Delta from v3.0.0
|
||||||
|
|
||||||
- **+** `algorithms/` folder with three Cards (one per opaque host capability).
|
- **~** `bpmn/semantic-document-analysis.bpmn`: the file now declares `xmlns:uapf24="https://uapf.dev/bpmn/v2.4"`, and each of the 3 service tasks carries a `uapf24:algorithmCardRef` attribute plus a `<bpmn:ioSpecification>` synthesised from the card's `io` block.
|
||||||
- **+** `algorithm_cards: true` and `paths.algorithms` in `uapf.yaml` / `manifest.json`.
|
- **~** `resources/mappings.yaml`: `algorithm_card:` removed from each of the 3 targets. They go back to being just dispatch endpoints, per UAPF v2.4.0.
|
||||||
- **~** `resources/mappings.yaml`: single `agent.semantic-extractor` target split into three algorithm-specific targets (`agent.pii_redactor`, `agent.vdvc_semantic_extractor`, `agent.completion_event_emitter`), each carrying its `algorithm_card` reference. Binding shape unchanged.
|
- **~** `uapf.yaml` / `manifest.json`: version `3.0.0` → `3.1.0`.
|
||||||
- **~** `bpmn/semantic-document-analysis.bpmn`: **unchanged**. Algorithm Cards live on resource targets, not in the BPMN — no extension elements required.
|
- **=** `algorithms/*.card.yaml`: unchanged.
|
||||||
- **−** `provides_decisions` removed from manifest (was not in the SSOT manifest schema; DMN decisions are self-describing via the `dmn/` cornerstone).
|
- **=** `dmn/*.dmn`: unchanged.
|
||||||
|
|
||||||
|
### Why the v3.0.0 → v3.1.0 churn
|
||||||
|
|
||||||
|
v3.0.0 followed UAPF v2.3.0, which placed the algorithm card on the
|
||||||
|
resource target. That hid the algorithm from the BPMN diagram. UAPF
|
||||||
|
v2.4.0 reverses that decision and moves the reference onto the BPMN
|
||||||
|
task. v3.1.0 of this package follows the corrected spec. Algorithm
|
||||||
|
Cards themselves are unchanged across both revisions.
|
||||||
|
|
||||||
## Structure
|
## Structure
|
||||||
|
|
||||||
```
|
```
|
||||||
.
|
.
|
||||||
├── uapf.yaml + manifest.json # Package manifest (UAPF v2.3.0)
|
├── uapf.yaml + manifest.json # Package manifest (UAPF v2.4.0)
|
||||||
├── bpmn/ # 1 BPMN process (unchanged from v2.0.0)
|
├── bpmn/ # 1 BPMN process (algorithm refs + ioSpecification)
|
||||||
├── dmn/ # 3 DMN decision tables (unchanged from v2.0.0)
|
├── dmn/ # 3 DMN decision tables
|
||||||
├── algorithms/ # 3 Algorithm Cards (NEW in v3.0.0)
|
├── algorithms/ # 3 Algorithm Cards (introduced in v3.0.0)
|
||||||
├── resources/
|
├── resources/
|
||||||
│ ├── mappings.yaml # Resource targets w/ algorithm_card refs (REFACTORED)
|
│ ├── mappings.yaml # Resource targets (dispatch endpoints only)
|
||||||
│ ├── guardrails.yaml
|
│ ├── guardrails.yaml
|
||||||
│ └── schemas/ # Output JSON Schemas
|
│ └── schemas/ # Output JSON Schemas
|
||||||
├── metadata/ # ownership + lifecycle
|
├── metadata/ # ownership + lifecycle
|
||||||
@@ -71,9 +92,21 @@ Audit question → answer-location:
|
|||||||
|
|
||||||
## Validation
|
## Validation
|
||||||
|
|
||||||
Validates against UAPF v2.3.0 schemas at
|
Validates against UAPF v2.4.0 schemas at
|
||||||
`github.com/UAPFormat/UAPF-specification`:
|
`github.com/UAPFormat/UAPF-specification`:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python tools/uapf-cli/uapf.py validate /path/to/dokumenta-semantiska-analize
|
python tools/uapf-cli/uapf.py validate /path/to/dokumenta-semantiska-analize
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## v3.2.0 (UAPF v2.5.0 alignment)
|
||||||
|
|
||||||
|
Tests are now **embedded in each algorithm card** under a top-level `tests:` array (minimum 2 entries per card). The old sidecar location `tests/algorithms/<card-id>.test.yaml` is **removed** per UAPF v2.5.0 — that location no longer applies to algorithm cards.
|
||||||
|
|
||||||
|
Embedded tests for this package:
|
||||||
|
- `algo.semantic_document_analysis.pii_redactor` — 3 cases (Latvian personas kods inline, plain text with no PII, financial figures + IBAN)
|
||||||
|
- `algo.semantic_document_analysis.vdvc_semantic_extractor` — 2 cases (regulatory complaint, non-regulatory thank-you), both with `ai_confidence_score` tolerance bands appropriate for a stochastic LLM extractor
|
||||||
|
- `algo.semantic_document_analysis.completion_event_emitter` — 2 cases (success completion, failure completion)
|
||||||
|
|
||||||
|
The Algorithm Card viewer (UAPF v2.5.0 chapter 13.16, ProcessGit Preview tab) consumes these embedded tests as its primary interaction surface — sample browser for `external` cards, regex/FEEL/source-display for `inline` cards.
|
||||||
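A minimal sketch of a pre-commit check for the embedded-tests rule, assuming a card has already been loaded into a Python dict. The minimum of 2 entries comes from the text above; the extra check that test keys match the card's `io` ids is an assumption of this sketch, not a stated UAPF v2.5.0 requirement, and `check_embedded_tests` is a hypothetical helper.

```python
def check_embedded_tests(card, minimum=2):
    """Return a list of problems with a card's embedded tests (empty means OK)."""
    problems = []
    tests = card.get("tests", [])
    if len(tests) < minimum:
        problems.append(f"{card['id']}: {len(tests)} test(s), need at least {minimum}")

    # Assumed extra check: every test key should name a declared io id.
    input_ids = {item["id"] for item in card["io"]["inputs"]}
    output_ids = {item["id"] for item in card["io"]["outputs"]}
    for test in tests:
        unknown = (set(test.get("inputs", {})) - input_ids) | (
            set(test.get("expected_outputs", {})) - output_ids
        )
        for key in sorted(unknown):
            problems.append(f"{card['id']} / {test['name']}: unknown io id '{key}'")
    return problems


# Trimmed copy of the completion event emitter card from this diff.
emitter_card = {
    "id": "algo.semantic_document_analysis.completion_event_emitter",
    "io": {
        "inputs": [{"id": "event_type"}, {"id": "payload"}],
        "outputs": [{"id": "published"}],
    },
    "tests": [
        {
            "name": "Successful analysis completion",
            "inputs": {"event_type": "dev.dokumenta.semantic_analysis.completed",
                       "payload": {"outcome": "ok"}},
            "expected_outputs": {"published": True},
        },
        {
            "name": "Analysis failure completion",
            "inputs": {"event_type": "dev.dokumenta.semantic_analysis.failed",
                       "payload": {"outcome": "extraction_failed"}},
            "expected_outputs": {"published": True},
        },
    ],
}

problems = check_embedded_tests(emitter_card)
print(problems or "embedded tests look consistent")
```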
|
|
||||||
|
|||||||
@@ -1,64 +1,70 @@
|
|||||||
kind: uapf.algorithm.card
|
kind: uapf.algorithm.card
|
||||||
|
|
||||||
id: algo.semantic_document_analysis.completion_event_emitter
|
id: algo.semantic_document_analysis.completion_event_emitter
|
||||||
version: "1.0.0"
|
version: 1.0.0
|
||||||
name: "Process completion event emitter"
|
name: Process completion event emitter
|
||||||
intent: >
|
intent: |
|
||||||
Publishes a CloudEvents 1.0-conformant event marking the completion
|
Publishes a CloudEvents 1.0-conformant event marking the completion of one semantic analysis cycle, with the DMN-decided fields (personal data risk, processing route, redaction level, human validation status) attached. Personal data is NEVER included in the emitted payload — only the deterministic classification fields.
|
||||||
of one semantic analysis cycle, with the DMN-decided fields
|
|
||||||
(personal data risk, processing route, redaction level, human
|
|
||||||
validation status) attached. Personal data is NEVER included in
|
|
||||||
the emitted payload — only the deterministic classification fields.
|
|
||||||
|
|
||||||
algorithm_kind: emitter
|
algorithm_kind: emitter
|
||||||
|
|
||||||
io:
|
io:
|
||||||
inputs:
|
inputs:
|
||||||
- id: event_type
|
- id: event_type
|
||||||
type: string
|
type: string
|
||||||
cardinality: single
|
cardinality: single
|
||||||
- id: payload
|
- id: payload
|
||||||
type: object
|
type: object
|
||||||
cardinality: single
|
cardinality: single
|
||||||
outputs:
|
outputs:
|
||||||
- id: published
|
- id: published
|
||||||
type: boolean
|
type: boolean
|
||||||
|
|
||||||
implementation:
|
implementation:
|
||||||
type: external
|
type: external
|
||||||
medium: mcp_tool
|
medium: mcp_tool
|
||||||
uri: "uapf-ip://capability/event.emit@1"
|
uri: uapf-ip://capability/event.emit@1
|
||||||
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
|
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
|
||||||
runtime:
|
runtime:
|
||||||
capability: "event.emit@1"
|
capability: event.emit@1
|
||||||
cloud_events_spec: "1.0"
|
cloud_events_spec: '1.0'
|
||||||
|
|
||||||
determinism: deterministic
|
determinism: deterministic
|
||||||
side_effects: writes_state
|
side_effects: writes_state
|
||||||
|
|
||||||
confidence:
|
confidence:
|
||||||
type: none
|
type: none
|
||||||
|
|
||||||
complexity:
|
complexity:
|
||||||
typical_latency_ms: 25
|
typical_latency_ms: 25
|
||||||
max_latency_ms: 1000
|
max_latency_ms: 1000
|
||||||
|
failure_mode: throw — process must complete reliably or fail loudly.
|
||||||
failure_mode: "throw — process must complete reliably or fail loudly."
|
|
||||||
|
|
||||||
reference:
|
reference:
|
||||||
standard: "CloudEvents 1.0"
|
standard: CloudEvents 1.0
|
||||||
url: "https://github.com/cloudevents/spec/blob/v1.0/spec.md"
|
url: https://github.com/cloudevents/spec/blob/v1.0/spec.md
|
||||||
|
|
||||||
owners:
|
owners:
|
||||||
- type: team
|
- type: team
|
||||||
id: uapf-stewards
|
id: uapf-stewards
|
||||||
contact: stewards@uapf.dev
|
contact: stewards@uapf.dev
|
||||||
|
|
||||||
lifecycle:
|
lifecycle:
|
||||||
status: draft
|
status: draft
|
||||||
since: "2026-05-20"
|
since: '2026-05-20'
|
||||||
|
|
||||||
audit:
|
audit:
|
||||||
log_inputs: full
|
log_inputs: full
|
||||||
log_outputs: full
|
log_outputs: full
|
||||||
retention: "1y"
|
retention: 1y
|
||||||
|
tests:
|
||||||
|
- name: Successful analysis completion
|
||||||
|
description: Standard happy-path completion event with full payload.
|
||||||
|
inputs:
|
||||||
|
event_type: dev.dokumenta.semantic_analysis.completed
|
||||||
|
payload:
|
||||||
|
document_id: doc-2026-05-21-001
|
||||||
|
outcome: ok
|
||||||
|
confidence: 0.87
|
||||||
|
expected_outputs:
|
||||||
|
published: true
|
||||||
|
- name: Analysis failure completion
|
||||||
|
description: Failure-path completion event still emits successfully (the emitter
|
||||||
|
does not gate on payload contents).
|
||||||
|
inputs:
|
||||||
|
event_type: dev.dokumenta.semantic_analysis.failed
|
||||||
|
payload:
|
||||||
|
document_id: doc-2026-05-21-002
|
||||||
|
outcome: extraction_failed
|
||||||
|
reason: low_confidence
|
||||||
|
expected_outputs:
|
||||||
|
published: true
|
||||||
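A minimal sketch of the envelope such an emitter might publish, assuming the CloudEvents 1.0 JSON format. The required context attributes (`specversion`, `id`, `source`, `type`) come from the CloudEvents specification; the `source` URN and the function name `build_completion_event` are illustrative assumptions, and the actual `event.emit@1` host handler is not shown anywhere in this diff.

```python
import json
import uuid
from datetime import datetime, timezone


def build_completion_event(event_type, payload,
                           source="urn:example:semantic-document-analysis"):
    """Wrap a classification-only payload in a CloudEvents 1.0 envelope.

    The default source is a placeholder; a real host would use its own URI.
    """
    return {
        # Required CloudEvents 1.0 context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": payload,
    }


event = build_completion_event(
    "dev.dokumenta.semantic_analysis.completed",
    {"document_id": "doc-2026-05-21-001", "outcome": "ok", "confidence": 0.87},
)
print(json.dumps(event, indent=2))
```

Only the deterministic classification fields belong in `data`, in line with the card's intent that personal data is never included in the emitted payload.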
|
|||||||
@@ -1,87 +1,117 @@
|
|||||||
kind: uapf.algorithm.card
|
kind: uapf.algorithm.card
|
||||||
|
|
||||||
id: algo.semantic_document_analysis.pii_redactor
|
id: algo.semantic_document_analysis.pii_redactor
|
||||||
version: "1.0.0"
|
version: 1.0.0
|
||||||
name: "PII detector and redactor"
|
name: PII detector and redactor
|
||||||
intent: >
|
intent: |
|
||||||
Detects personally identifiable information in free-text documents
|
Detects personally identifiable information in free-text documents (Latvian personas kods, IBAN, phone numbers, e-mail addresses, names) and returns the source text with PII masked plus structured regex-hit signals used by the downstream DMN decision assess-personal-data-risk.
|
||||||
(Latvian personas kods, IBAN, phone numbers, e-mail addresses,
|
|
||||||
names) and returns the source text with PII masked plus structured
|
|
||||||
regex-hit signals used by the downstream DMN decision
|
|
||||||
assess-personal-data-risk.
|
|
||||||
|
|
||||||
algorithm_kind: redactor
|
algorithm_kind: redactor
|
||||||
|
|
||||||
io:
|
io:
|
||||||
inputs:
|
inputs:
|
||||||
- id: content
|
- id: content
|
||||||
type: string
|
type: string
|
||||||
cardinality: single
|
cardinality: single
|
||||||
constraints:
|
constraints:
|
||||||
maxLength: 200000
|
maxLength: 200000
|
||||||
documentation: "Raw document text submitted for semantic analysis."
|
documentation: Raw document text submitted for semantic analysis.
|
||||||
outputs:
|
outputs:
|
||||||
- id: redacted_content
|
- id: redacted_content
|
||||||
type: string
|
type: string
|
||||||
documentation: "Source text with PII masked by category tokens."
|
documentation: Source text with PII masked by category tokens.
|
||||||
- id: detected_entity_types
|
- id: detected_entity_types
|
||||||
type: array
|
type: array
|
||||||
documentation: "PII category names only — never values."
|
documentation: PII category names only — never values.
|
||||||
- id: personas_koda_present
|
- id: personas_koda_present
|
||||||
type: boolean
|
type: boolean
|
||||||
- id: financial_data_present
|
- id: financial_data_present
|
||||||
type: boolean
|
type: boolean
|
||||||
- id: contact_data_present
|
- id: contact_data_present
|
||||||
type: boolean
|
type: boolean
|
||||||
- id: pii_category_count
|
- id: pii_category_count
|
||||||
type: integer
|
type: integer
|
||||||
constraints: { minimum: 0 }
|
constraints:
|
||||||
|
minimum: 0
|
||||||
implementation:
|
implementation:
|
||||||
type: external
|
type: external
|
||||||
medium: mcp_tool
|
medium: mcp_tool
|
||||||
uri: "uapf-ip://capability/ai.redact@1"
|
uri: uapf-ip://capability/ai.redact@1
|
||||||
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
|
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
|
||||||
runtime:
|
runtime:
|
||||||
capability: "ai.redact@1"
|
capability: ai.redact@1
|
||||||
note: "Host-fulfilled UAPF-IP capability. Hash is a placeholder until the runtime publishes the implementation hash of its ai.redact handler."
|
note: Host-fulfilled UAPF-IP capability. Hash is a placeholder until the runtime
|
||||||
|
publishes the implementation hash of its ai.redact handler.
|
||||||
determinism: deterministic
|
determinism: deterministic
|
||||||
side_effects: pure
|
side_effects: pure
|
||||||
|
|
||||||
complexity:
|
complexity:
|
||||||
typical_latency_ms: 250
|
typical_latency_ms: 250
|
||||||
max_latency_ms: 10000
|
max_latency_ms: 10000
|
||||||
|
failure_mode: throw — refuse processing if redactor unavailable; PII risk dominates.
|
||||||
failure_mode: "throw — refuse processing if redactor unavailable; PII risk dominates."
|
|
||||||
|
|
||||||
limitations:
|
limitations:
|
||||||
- "Latvian-language personal names are recognised in ~92% of cases"
|
- Latvian-language personal names are recognised in ~92% of cases
|
||||||
- "Assumes the text is already digital; OCR is not included"
|
- Assumes the text is already digital; OCR is not included
|
||||||
|
|
||||||
reference:
|
reference:
|
||||||
legal: "GDPR 2016/679 Article 5 (data minimisation); Fizisko personu datu apstrādes likums."
|
legal: GDPR 2016/679 Article 5 (data minimisation); Fizisko personu datu apstrādes
|
||||||
standard: "NIST SP 800-188 — De-Identification of Personal Information."
|
likums.
|
||||||
|
standard: NIST SP 800-188 — De-Identification of Personal Information.
|
||||||
owners:
|
owners:
|
||||||
- type: role
|
- type: role
|
||||||
id: data_protection_officer
|
id: data_protection_officer
|
||||||
contact: stewards@uapf.dev
|
contact: stewards@uapf.dev
|
||||||
|
|
||||||
lifecycle:
|
lifecycle:
|
||||||
status: draft
|
status: draft
|
||||||
since: "2026-05-20"
|
since: '2026-05-20'
|
||||||
|
|
||||||
audit:
|
audit:
|
||||||
log_inputs: redacted
|
log_inputs: redacted
|
||||||
log_outputs: full
|
log_outputs: full
|
||||||
retention: "7y"
|
retention: 7y
|
||||||
|
|
||||||
privacy:
|
privacy:
|
||||||
processesPII: true
|
processesPII: true
|
||||||
technique: pseudonymization
|
technique: pseudonymization
|
||||||
reidentificationRisk: low
|
reidentificationRisk: low
|
||||||
|
|
||||||
risk:
|
risk:
|
||||||
aiActRiskClass: limited
|
aiActRiskClass: limited
|
||||||
humanOversight: advisory
|
humanOversight: advisory
|
||||||
|
tests:
|
||||||
|
- name: Latvian personas kods inline in text
|
||||||
|
description: Standard 11-digit Latvian personal identity code (NNNNNN-NNNNN)
|
||||||
|
should be detected and redacted.
|
||||||
|
inputs:
|
||||||
|
content: 'Lūgums izskatīt iesniegumu. Iesniedzējs: Jānis Bērziņš, personas kods:
|
||||||
|
010101-12345. Adrese: Brīvības iela 1, Rīga.'
|
||||||
|
expected_outputs:
|
||||||
|
redacted_content: 'Lūgums izskatīt iesniegumu. Iesniedzējs: [NAME], personas kods:
|
||||||
|
[REDACTED]. Adrese: [ADDRESS].'
|
||||||
|
detected_entity_types:
|
||||||
|
- PERSONAS_KODS
|
||||||
|
- PERSON
|
||||||
|
- ADDRESS
|
||||||
|
personas_koda_present: true
|
||||||
|
financial_data_present: false
|
||||||
|
contact_data_present: true
|
||||||
|
pii_category_count: 3
|
||||||
|
- name: Plain administrative text with no PII
|
||||||
|
description: Generic administrative paragraph; nothing to redact. Verifies the redactor
|
||||||
|
doesn't false-positive on plain text.
|
||||||
|
inputs:
|
||||||
|
content: Iesniegums tiek izskatīts atbilstoši normatīvajiem aktiem. Lēmums tiks
|
||||||
|
paziņots noteiktajā kārtībā.
|
||||||
|
expected_outputs:
|
||||||
|
redacted_content: Iesniegums tiek izskatīts atbilstoši normatīvajiem aktiem. Lēmums
|
||||||
|
tiks paziņots noteiktajā kārtībā.
|
||||||
|
detected_entity_types: []
|
||||||
|
personas_koda_present: false
|
||||||
|
financial_data_present: false
|
||||||
|
contact_data_present: false
|
||||||
|
pii_category_count: 0
|
||||||
|
- name: Financial figures and account numbers
|
||||||
|
description: EUR amounts and IBAN — both detected as financial PII; no personas_kods.
|
||||||
|
inputs:
|
||||||
|
content: Maksājums EUR 1250.00 pārskaitīts uz kontu LV80BANK0000435195001.
|
||||||
|
expected_outputs:
|
||||||
|
redacted_content: Maksājums EUR [AMOUNT] pārskaitīts uz kontu [IBAN].
|
||||||
|
detected_entity_types:
|
||||||
|
- AMOUNT
|
||||||
|
- IBAN
|
||||||
|
personas_koda_present: false
|
||||||
|
financial_data_present: true
|
||||||
|
contact_data_present: false
|
||||||
|
pii_category_count: 2
|
||||||
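A minimal sketch of two of the pattern detectors these tests exercise, assuming plain regular expressions. The patterns and the function name `detect_and_mask` are assumptions for illustration; the host's real `ai.redact@1` handler is not shown in this diff and also covers names, addresses, amounts, phone numbers and e-mail addresses, which this sketch does not attempt.

```python
import re

# Assumed pattern: six digits, a hyphen, five digits (NNNNNN-NNNNN).
PERSONAS_KODS_RE = re.compile(r"\b\d{6}-\d{5}\b")
# Assumed pattern for a Latvian IBAN: "LV", 2 check digits, 4-letter bank code,
# 13 alphanumeric characters (21 characters in total).
LV_IBAN_RE = re.compile(r"\bLV\d{2}[A-Z]{4}[A-Z0-9]{13}\b")


def detect_and_mask(content):
    """Mask personas kods and IBAN values and report which were found."""
    kods_found = bool(PERSONAS_KODS_RE.search(content))
    iban_found = bool(LV_IBAN_RE.search(content))
    redacted = PERSONAS_KODS_RE.sub("[REDACTED]", content)
    redacted = LV_IBAN_RE.sub("[IBAN]", redacted)
    return {
        "redacted_content": redacted,
        "personas_koda_present": kods_found,
        # In this sketch only an IBAN counts as financial data.
        "financial_data_present": iban_found,
    }


result = detect_and_mask("personas kods: 010101-12345. Konts LV80BANK0000435195001.")
print(result["redacted_content"])
```

Because the detectors are pure pattern matches, the same input always yields the same signals, which is what lets the downstream DMN risk decision stay deterministic.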
|
|||||||
@@ -1,88 +1,119 @@
|
|||||||
kind: uapf.algorithm.card
|
kind: uapf.algorithm.card
|
||||||
|
|
||||||
id: algo.semantic_document_analysis.vdvc_semantic_extractor
|
id: algo.semantic_document_analysis.vdvc_semantic_extractor
|
||||||
version: "1.0.0"
|
version: 1.0.0
|
||||||
name: "VDVC semantic metadata extractor"
|
name: VDVC semantic metadata extractor
|
||||||
intent: >
|
intent: |
|
||||||
Extracts a VDVC v1.1-conformant structured semantic summary from
|
Extracts a VDVC v1.1-conformant structured semantic summary from the redacted document text — primary topic, keywords, classification, summary, sensitivity signals. Output validates against resources/schemas/vdvc-semantic-summary.schema.json. This is the sole model-inference step in the process; everything else in the package is deterministic.
|
||||||
the redacted document text — primary topic, keywords,
|
|
||||||
classification, summary, sensitivity signals. Output validates
|
|
||||||
against resources/schemas/vdvc-semantic-summary.schema.json. This
|
|
||||||
is the sole model-inference step in the process; everything else
|
|
||||||
in the package is deterministic.
|
|
||||||
|
|
||||||
algorithm_kind: extractor
|
algorithm_kind: extractor
|
||||||
|
|
||||||
io:
|
io:
|
||||||
inputs:
|
inputs:
|
||||||
- id: redacted_content
|
- id: redacted_content
|
||||||
type: string
|
type: string
|
||||||
cardinality: single
|
cardinality: single
|
||||||
constraints:
|
constraints:
|
||||||
maxLength: 200000
|
maxLength: 200000
|
||||||
documentation: "Output of the upstream PII redactor."
|
documentation: Output of the upstream PII redactor.
|
||||||
- id: schema_ref
|
- id: schema_ref
|
||||||
type: string
|
type: string
|
||||||
documentation: "Path to the JSON Schema the output must validate against."
|
documentation: Path to the JSON Schema the output must validate against.
|
||||||
outputs:
|
outputs:
|
||||||
- id: semantic_summary
|
- id: semantic_summary
|
||||||
type: object
|
type: object
|
||||||
schema: "../resources/schemas/vdvc-semantic-summary.schema.json"
|
schema: ../resources/schemas/vdvc-semantic-summary.schema.json
|
||||||
- id: sensitivity_control
|
- id: sensitivity_control
|
||||||
type: object
|
type: object
|
||||||
- id: ai_confidence_score
|
- id: ai_confidence_score
|
||||||
type: probability
|
type: probability
|
||||||
- id: output_pii_error_count
|
- id: output_pii_error_count
|
||||||
type: integer
|
type: integer
|
||||||
constraints: { minimum: 0 }
|
constraints:
|
||||||
|
minimum: 0
|
||||||
implementation:
|
implementation:
|
||||||
type: external
|
type: external
|
||||||
medium: llm_prompt
|
medium: llm_prompt
|
||||||
uri: "uapf-ip://capability/ai.extract@1"
|
uri: uapf-ip://capability/ai.extract@1
|
||||||
hash: "sha256:0000000000000000000000000000000000000000000000000000000000000000"
|
hash: sha256:0000000000000000000000000000000000000000000000000000000000000000
|
||||||
runtime:
|
runtime:
|
||||||
capability: "ai.extract@1"
|
capability: ai.extract@1
|
||||||
note: "Host-fulfilled UAPF-IP capability. Specific model identity and prompt hash are runtime concerns of the host; the Card declares the contract, not the implementation choice."
|
note: Host-fulfilled UAPF-IP capability. Specific model identity and prompt hash
|
||||||
|
are runtime concerns of the host; the Card declares the contract, not the implementation
|
||||||
|
choice.
|
||||||
determinism: stochastic
|
determinism: stochastic
|
||||||
side_effects: external_call
|
side_effects: external_call
|
||||||
|
|
||||||
confidence:
|
confidence:
|
||||||
type: probability
|
type: probability
|
||||||
threshold: 0.70
|
threshold: 0.7
|
||||||
below_threshold: "route-to:human.legal_reviewer (enforced by DMN human-validation-gate)"
|
below_threshold: route-to:human.legal_reviewer (enforced by DMN human-validation-gate)
|
||||||
|
|
||||||
complexity:
|
complexity:
|
||||||
typical_latency_ms: 8000
|
typical_latency_ms: 8000
|
||||||
max_latency_ms: 60000
|
max_latency_ms: 60000
|
||||||
|
failure_mode: default:null + flag — DMN human-validation-gate routes low-confidence
|
||||||
failure_mode: "default:null + flag — DMN human-validation-gate routes low-confidence outputs to PENDING_REVIEW."
|
outputs to PENDING_REVIEW.
|
||||||
|
|
||||||
limitations:
|
limitations:
|
||||||
- "Long documents (>50,000 characters) are truncated: first 50K + last 5K"
|
- Long documents (>50,000 characters) are truncated: first 50K + last 5K
|
||||||
- "Not a legal assessment; semantic classification only"
|
- Not a legal assessment; semantic classification only
|
||||||
- "Latvian legal rhetoric may reduce recall"
|
- Latvian legal rhetoric may reduce recall
|
||||||
|
|
||||||
reference:
|
reference:
|
||||||
legal: "EU AI Act 2024/1689, Annex III (high-risk AI systems), Article 13 (transparency)."
|
legal: EU AI Act 2024/1689, Annex III (high-risk AI systems), Article 13
|
||||||
url: "https://eur-lex.europa.eu/eli/reg/2024/1689/oj"
|
(transparency).
|
||||||
|
url: https://eur-lex.europa.eu/eli/reg/2024/1689/oj
|
||||||
owners:
|
owners:
|
||||||
- type: team
|
- type: team
|
||||||
id: uapf-stewards
|
id: uapf-stewards
|
||||||
contact: stewards@uapf.dev
|
contact: stewards@uapf.dev
|
||||||
|
|
||||||
lifecycle:
|
lifecycle:
|
||||||
status: draft
|
status: draft
|
||||||
since: "2026-05-20"
|
since: '2026-05-20'
|
||||||
|
|
||||||
audit:
|
audit:
|
||||||
log_inputs: redacted
|
log_inputs: redacted
|
||||||
log_outputs: full
|
log_outputs: full
|
||||||
retention: "7y"
|
retention: 7y
|
||||||
|
|
||||||
risk:
|
risk:
|
||||||
aiActRiskClass: high
|
aiActRiskClass: high
|
||||||
humanOversight: mandatory
|
humanOversight: mandatory
|
||||||
transparencyTier: tier-3-full
|
transparencyTier: tier-3-full
|
||||||
|
tests:
|
||||||
|
- name: Regulatory submission (iesniegums) about an administrative decision
|
||||||
|
description: Typical Latvian administrative complaint with redacted PII. The extractor
|
||||||
|
should identify topic + risk + applicable regulation.
|
||||||
|
inputs:
|
||||||
|
redacted_content: Iesniedzējs [NAME] iesniedza sūdzību par būvvaldes lēmumu Nr.
|
||||||
|
12345 atteikt būvatļauju adresē [ADDRESS]. Tiek lūgts pārskatīt lēmumu.
|
||||||
|
schema_ref: schemas/iesniegums/v1
|
||||||
|
expected_outputs:
|
||||||
|
semantic_summary:
|
||||||
|
topic: construction-permit-appeal
|
||||||
|
subject_area: administrative-law
|
||||||
|
applicable_regulations:
|
||||||
|
- BL
|
||||||
|
- APL
|
||||||
|
language: lv
|
||||||
|
sensitivity_control:
|
||||||
|
contains_decision_reference: true
|
||||||
|
external_communication_recommended: false
|
||||||
|
ai_confidence_score: 0.87
|
||||||
|
output_pii_error_count: 0
|
||||||
|
tolerance:
|
||||||
|
ai_confidence_score: 0.1
|
||||||
|
output_pii_error_count: 0
|
||||||
|
- name: Non-regulatory thank-you note
|
||||||
|
description: Out-of-domain input. Extractor should yield low-confidence summary
|
||||||
|
and a sensitivity flag that no decision is referenced.
|
||||||
|
inputs:
|
||||||
|
redacted_content: Paldies par jūsu pakalpojumu! Bija ļoti patīkami sadarboties
|
||||||
|
ar [NAME] no jūsu komandas.
|
||||||
|
schema_ref: schemas/iesniegums/v1
|
||||||
|
expected_outputs:
|
||||||
|
semantic_summary:
|
||||||
|
topic: non-actionable-correspondence
|
||||||
|
subject_area: feedback
|
||||||
|
applicable_regulations: []
|
||||||
|
language: lv
|
||||||
|
sensitivity_control:
|
||||||
|
contains_decision_reference: false
|
||||||
|
external_communication_recommended: false
|
||||||
|
ai_confidence_score: 0.62
|
||||||
|
output_pii_error_count: 0
|
||||||
|
tolerance:
|
||||||
|
ai_confidence_score: 0.15
|
||||||
|
output_pii_error_count: 0
|
||||||
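A minimal sketch of how a test runner might apply the `tolerance` block, assuming each tolerance value is an absolute band around the expected number and that keys without a tolerance must match exactly. Those semantics and the function name `within_tolerance` are assumptions; the diff does not define how `tolerance` is evaluated.

```python
def within_tolerance(expected, actual, tolerance):
    """Return the keys whose actual value falls outside the expected band."""
    failures = []
    for key, want in expected.items():
        got = actual.get(key)
        if key in tolerance:
            # Assumed semantics: absolute difference must not exceed the band.
            if got is None or abs(got - want) > tolerance[key]:
                failures.append(key)
        elif got != want:
            failures.append(key)
    return failures


expected = {"ai_confidence_score": 0.87, "output_pii_error_count": 0}
tolerance = {"ai_confidence_score": 0.1, "output_pii_error_count": 0}

# A stochastic extractor rarely reproduces the exact score, so 0.80 still passes.
print(within_tolerance(expected, {"ai_confidence_score": 0.80,
                                  "output_pii_error_count": 0}, tolerance))
# A score of 0.70 is 0.17 below the expected value and falls outside the band.
print(within_tolerance(expected, {"ai_confidence_score": 0.70,
                                  "output_pii_error_count": 0}, tolerance))
```

A tolerance of 0 on `output_pii_error_count` keeps that field an exact match, so a single leaked PII value still fails the test.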
|
|||||||
@@ -2,6 +2,7 @@
|
|||||||
<bpmn:definitions
|
<bpmn:definitions
|
||||||
xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
|
xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
|
||||||
xmlns:uapf="https://uapf.dev/bpmn-ext/v1"
|
xmlns:uapf="https://uapf.dev/bpmn-ext/v1"
|
||||||
|
xmlns:uapf24="https://uapf.dev/bpmn/v2.4"
|
||||||
xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
|
xmlns:bpmndi="http://www.omg.org/spec/BPMN/20100524/DI"
|
||||||
xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
|
xmlns:dc="http://www.omg.org/spec/DD/20100524/DC"
|
||||||
xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
|
xmlns:di="http://www.omg.org/spec/DD/20100524/DI"
|
||||||
@@ -16,15 +17,36 @@
|
|||||||
|
|
||||||
<bpmn:serviceTask id="Task_DetectRedactPii"
|
<bpmn:serviceTask id="Task_DetectRedactPii"
|
||||||
name="Detect and redact PII"
|
name="Detect and redact PII"
|
||||||
uapf:capability="ai.redact@1">
|
uapf:capability="ai.redact@1"
|
||||||
|
uapf24:algorithmCardRef="algo.semantic_document_analysis.pii_redactor">
|
||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
Calls ai.redact@1 over the source text. Beyond masking, the host
|
Calls ai.redact@1 over the source text. Governed by Algorithm
|
||||||
|
Card algo.semantic_document_analysis.pii_redactor (see
|
||||||
|
algorithms/pii_redactor.card.yaml). Beyond masking, the host
|
||||||
runs the four Latvian PII regex detectors (personas kods, IBAN,
|
runs the four Latvian PII regex detectors (personas kods, IBAN,
|
||||||
e-mail, phone) and returns the deterministic signal set the risk
|
e-mail, phone) and returns the deterministic signal set the risk
|
||||||
decision consumes: personasKodaPresent, financialDataPresent,
|
decision consumes.
|
||||||
contactDataPresent, piiCategoryCount, detectedEntityTypes, plus
|
|
||||||
redactedContent. No model inference — pure pattern detection.
|
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
|
<bpmn:ioSpecification>
|
||||||
|
<bpmn:dataInput id="content" name="content : string"/>
|
||||||
|
<bpmn:dataOutput id="redacted_content" name="redacted_content : string"/>
|
||||||
|
<bpmn:dataOutput id="detected_entity_types" name="detected_entity_types : array"/>
|
||||||
|
<bpmn:dataOutput id="personas_koda_present" name="personas_koda_present : boolean"/>
|
||||||
|
<bpmn:dataOutput id="financial_data_present" name="financial_data_present : boolean"/>
|
||||||
|
<bpmn:dataOutput id="contact_data_present" name="contact_data_present : boolean"/>
|
||||||
|
<bpmn:dataOutput id="pii_category_count" name="pii_category_count : integer"/>
|
||||||
|
<bpmn:inputSet>
|
||||||
|
<bpmn:dataInputRefs>content</bpmn:dataInputRefs>
|
||||||
|
</bpmn:inputSet>
|
||||||
|
<bpmn:outputSet>
|
||||||
|
<bpmn:dataOutputRefs>redacted_content</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>detected_entity_types</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>personas_koda_present</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>financial_data_present</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>contact_data_present</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>pii_category_count</bpmn:dataOutputRefs>
|
||||||
|
</bpmn:outputSet>
|
||||||
|
</bpmn:ioSpecification>
|
||||||
</bpmn:serviceTask>
|
</bpmn:serviceTask>
|
||||||
|
|
||||||
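Note on `Task_DetectRedactPii`: the documentation removed on the left-hand side describes four regex detectors feeding a deterministic signal set. A minimal Python sketch of that shape follows. The regex patterns, the placeholder format, and the rule that `pii_category_count` counts the three boolean categories are all assumptions for illustration; the host's real detectors are not shown in this diff.

```python
import re

# Illustrative patterns only (assumptions, not the host's actual detectors).
PATTERNS = {
    "PERSONAS_KODS": re.compile(r"\b\d{6}-\d{5}\b"),
    "IBAN": re.compile(r"\bLV\d{2}[A-Z]{4}[A-Z0-9]{13}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"(?<!\w)(?:\+371[ ]?)?[26]\d{7}(?!\w)"),
}


def detect_and_redact(content: str) -> dict:
    """Return the signal set named in the task's ioSpecification."""
    found = []
    redacted = content
    for label, pattern in PATTERNS.items():
        if pattern.search(redacted):
            found.append(label)
            # Hypothetical placeholder format: the entity type in brackets.
            redacted = pattern.sub(f"[{label}]", redacted)

    personas = "PERSONAS_KODS" in found
    financial = "IBAN" in found
    contact = "EMAIL" in found or "PHONE" in found
    return {
        "redacted_content": redacted,
        "detected_entity_types": found,
        "personas_koda_present": personas,
        "financial_data_present": financial,
        "contact_data_present": contact,
        # Assumption: a "category" is one of the three boolean signals.
        "pii_category_count": sum([personas, financial, contact]),
    }


sample = ("p.k. 010190-12345, konts LV80BANK0000435195001, "
          "janis@example.lv, tel. +371 21234567")
result = detect_and_redact(sample)
print(result["redacted_content"])
```

No model is involved at any point, which is what keeps the downstream risk decision reproducible.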
<bpmn:businessRuleTask id="Decision_AssessRisk"
|
<bpmn:businessRuleTask id="Decision_AssessRisk"
|
||||||
@@ -32,9 +54,7 @@
|
|||||||
uapf:decision="assess-personal-data-risk">
|
uapf:decision="assess-personal-data-risk">
|
||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
DMN dmn/assess-personal-data-risk.dmn. Maps the PII signal set to
|
DMN dmn/assess-personal-data-risk.dmn. Maps the PII signal set to
|
||||||
personalDataRisk (NONE | LOW | MEDIUM | HIGH) by explicit ranked
|
personalDataRisk (NONE | LOW | MEDIUM | HIGH).
|
||||||
rules. Personas kods or IBAN forces HIGH; two or more categories
|
|
||||||
or contact data gives MEDIUM. Deterministic and auditable.
|
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
</bpmn:businessRuleTask>
|
</bpmn:businessRuleTask>
|
||||||
|
|
||||||
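Note on `Decision_AssessRisk`: the removed documentation states two of the ranked rules (personas kods or IBAN forces HIGH; two or more categories, or contact data, gives MEDIUM). The sketch below expresses them as a first-match rule list in Python. The real rules live in dmn/assess-personal-data-risk.dmn, which this diff does not show; the LOW rule and the mapping of IBAN to `financial_data_present` are assumptions.

```python
def assess_personal_data_risk(signals: dict) -> str:
    """First matching rule wins, mirroring a ranked decision table."""
    # Rule 1 (from the task documentation): personas kods or IBAN forces HIGH.
    if signals["personas_koda_present"] or signals["financial_data_present"]:
        return "HIGH"
    # Rule 2 (from the task documentation): two or more categories,
    # or any contact data, gives MEDIUM.
    if signals["pii_category_count"] >= 2 or signals["contact_data_present"]:
        return "MEDIUM"
    # Rule 3 (assumption: the diff does not say what yields LOW).
    if signals["pii_category_count"] == 1:
        return "LOW"
    return "NONE"


clean = {
    "personas_koda_present": False,
    "financial_data_present": False,
    "contact_data_present": False,
    "pii_category_count": 0,
}
print(assess_personal_data_risk(clean))
print(assess_personal_data_risk({**clean, "contact_data_present": True,
                                 "pii_category_count": 1}))
```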
@@ -44,48 +64,75 @@
|
|||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
DMN dmn/gdpr-processing-route.dmn. From personalDataRisk and
|
DMN dmn/gdpr-processing-route.dmn. From personalDataRisk and
|
||||||
allowCentralization decides processingRoute (CENTRAL | LOCAL),
|
allowCentralization decides processingRoute (CENTRAL | LOCAL),
|
||||||
anonymizationRequired and redactionLevel. This is the routing
|
anonymizationRequired and redactionLevel.
|
||||||
rule extracted from the host's generate_semantic_metadata: a
|
|
||||||
sensitive document where centralisation is not permitted stays
|
|
||||||
LOCAL with full redaction.
|
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
</bpmn:businessRuleTask>
|
</bpmn:businessRuleTask>
|
||||||
|
|
||||||
<bpmn:serviceTask id="Task_ExtractSemantics"
|
<bpmn:serviceTask id="Task_ExtractSemantics"
|
||||||
name="Extract semantic metadata"
|
name="Extract semantic metadata"
|
||||||
uapf:capability="ai.extract@1"
|
uapf:capability="ai.extract@1"
|
||||||
uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json">
|
uapf:schemaRef="resources/schemas/vdvc-semantic-summary.schema.json"
|
||||||
|
uapf24:algorithmCardRef="algo.semantic_document_analysis.vdvc_semantic_extractor">
|
||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
Calls ai.extract@1 on redactedContent with the VDVC v1.1 output
|
Calls ai.extract@1 on redactedContent with the VDVC v1.1 output
|
||||||
schema. This is the single bounded model step: it produces the
|
schema. Governed by Algorithm Card
|
||||||
semanticSummary (topic, summary, keywords, urgency, risk) and
|
algo.semantic_document_analysis.vdvc_semantic_extractor (see
|
||||||
must validate against resources/schemas/vdvc-semantic-summary.
|
algorithms/vdvc_semantic_extractor.card.yaml). EU AI Act
|
||||||
The host also returns flat aiConfidenceScore and the result of
|
Annex III high-risk; human oversight is mandatory and is
|
||||||
the post-extraction PII re-scan as outputPiiErrorCount.
|
enforced downstream by the human-validation-gate DMN.
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
|
<bpmn:ioSpecification>
|
||||||
|
<bpmn:dataInput id="redacted_content" name="redacted_content : string"/>
|
||||||
|
<bpmn:dataInput id="schema_ref" name="schema_ref : string"/>
|
||||||
|
<bpmn:dataOutput id="semantic_summary" name="semantic_summary : object"/>
|
||||||
|
<bpmn:dataOutput id="sensitivity_control" name="sensitivity_control : object"/>
|
||||||
|
<bpmn:dataOutput id="ai_confidence_score" name="ai_confidence_score : probability"/>
|
||||||
|
<bpmn:dataOutput id="output_pii_error_count" name="output_pii_error_count : integer"/>
|
||||||
|
<bpmn:inputSet>
|
||||||
|
<bpmn:dataInputRefs>redacted_content</bpmn:dataInputRefs>
|
||||||
|
<bpmn:dataInputRefs>schema_ref</bpmn:dataInputRefs>
|
||||||
|
</bpmn:inputSet>
|
||||||
|
<bpmn:outputSet>
|
||||||
|
<bpmn:dataOutputRefs>semantic_summary</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>sensitivity_control</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>ai_confidence_score</bpmn:dataOutputRefs>
|
||||||
|
<bpmn:dataOutputRefs>output_pii_error_count</bpmn:dataOutputRefs>
|
||||||
|
</bpmn:outputSet>
|
||||||
|
</bpmn:ioSpecification>
|
||||||
</bpmn:serviceTask>
|
</bpmn:serviceTask>
|
||||||
|
|
||||||
<bpmn:businessRuleTask id="Decision_ValidationGate"
|
<bpmn:businessRuleTask id="Decision_ValidationGate"
|
||||||
name="Determine human-validation status"
|
name="Determine human-validation status"
|
||||||
uapf:decision="human-validation-gate">
|
uapf:decision="human-validation-gate">
|
||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
DMN dmn/human-validation-gate.dmn. From outputPiiErrorCount,
|
DMN dmn/human-validation-gate.dmn. From output_pii_error_count,
|
||||||
aiConfidenceScore and personalDataRisk decides
|
ai_confidence_score and personalDataRisk decides
|
||||||
humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO)
|
humanValidationStatus (REJECTED | PENDING_REVIEW | APPROVED_AUTO).
|
||||||
and requiresHumanReview. Any leaked PII or confidence below 0.3
|
|
||||||
rejects; below 0.7, or HIGH risk, forces review; 0.7 and above
|
|
||||||
with clean output auto-approves. The thresholds are the weights.
|
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
</bpmn:businessRuleTask>
|
</bpmn:businessRuleTask>
|
||||||
|
|
||||||
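Note on `Decision_ValidationGate`: the thresholds dropped from the documentation (0.3 and 0.7) can be read as three ranked rules. The Python sketch below follows the removed wording. It returns only the status, because the diff does not say which value `requiresHumanReview` takes for each outcome; the function and parameter names are illustrative.

```python
def human_validation_gate(output_pii_error_count: int,
                          ai_confidence_score: float,
                          personal_data_risk: str) -> str:
    """Return humanValidationStatus per the thresholds in the old task docs."""
    # Any leaked PII, or confidence below 0.3, rejects outright.
    if output_pii_error_count > 0 or ai_confidence_score < 0.3:
        return "REJECTED"
    # Confidence below 0.7, or a HIGH-risk document, forces human review.
    if ai_confidence_score < 0.7 or personal_data_risk == "HIGH":
        return "PENDING_REVIEW"
    # 0.7 and above with clean output auto-approves.
    return "APPROVED_AUTO"


print(human_validation_gate(0, 0.85, "LOW"))
print(human_validation_gate(0, 0.85, "HIGH"))
```

Rule order matters here: a HIGH-risk document with leaked PII is rejected rather than sent to review, because the rejection rule is evaluated first.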
<bpmn:serviceTask id="Task_EmitResult"
|
<bpmn:serviceTask id="Task_EmitResult"
|
||||||
name="Emit semantic-analysis-completed event"
|
name="Emit semantic-analysis-completed event"
|
||||||
uapf:capability="event.emit@1"
|
uapf:capability="event.emit@1"
|
||||||
uapf:eventType="document.semantic-analysis.completed.v1">
|
uapf:eventType="document.semantic-analysis.completed.v1"
|
||||||
|
uapf24:algorithmCardRef="algo.semantic_document_analysis.completion_event_emitter">
|
||||||
<bpmn:documentation>
|
<bpmn:documentation>
|
||||||
Calls event.emit@1 to publish a CloudEvent carrying the semantic
|
Calls event.emit@1 to publish a CloudEvent. Governed by
|
||||||
summary, the routing decision and the validation status.
|
Algorithm Card algo.semantic_document_analysis.completion_event_emitter
|
||||||
|
(see algorithms/completion_event_emitter.card.yaml).
|
||||||
</bpmn:documentation>
|
</bpmn:documentation>
|
||||||
|
<bpmn:ioSpecification>
|
||||||
|
<bpmn:dataInput id="event_type" name="event_type : string"/>
|
||||||
|
<bpmn:dataInput id="payload" name="payload : object"/>
|
||||||
|
<bpmn:dataOutput id="published" name="published : boolean"/>
|
||||||
|
<bpmn:inputSet>
|
||||||
|
<bpmn:dataInputRefs>event_type</bpmn:dataInputRefs>
|
||||||
|
<bpmn:dataInputRefs>payload</bpmn:dataInputRefs>
|
||||||
|
</bpmn:inputSet>
|
||||||
|
<bpmn:outputSet>
|
||||||
|
<bpmn:dataOutputRefs>published</bpmn:dataOutputRefs>
|
||||||
|
</bpmn:outputSet>
|
||||||
|
</bpmn:ioSpecification>
|
||||||
</bpmn:serviceTask>
|
</bpmn:serviceTask>
|
||||||
|
|
||||||
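Note on `Task_EmitResult`: the removed documentation says the CloudEvent carries the semantic summary, the routing decision and the validation status. A rough Python sketch of such an envelope follows, using the CloudEvents 1.0 context attributes (`specversion`, `id`, `source`, `type`). The `source` URI and the field names inside `data` are hypothetical; only the event type comes from the BPMN.

```python
import json
import uuid
from datetime import datetime, timezone

EVENT_TYPE = "document.semantic-analysis.completed.v1"  # from uapf:eventType


def build_completion_event(semantic_summary: dict,
                           processing_route: str,
                           validation_status: str,
                           source: str = "/uapf/semantic-document-analysis") -> dict:
    """Assemble a CloudEvents 1.0 structured-mode envelope as a plain dict."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,  # hypothetical URI reference, not from the package
        "type": EVENT_TYPE,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": {
            # Hypothetical payload keys; the real schema is not in this diff.
            "semanticSummary": semantic_summary,
            "processingRoute": processing_route,
            "humanValidationStatus": validation_status,
        },
    }


event = build_completion_event({"topic": "example"}, "LOCAL", "PENDING_REVIEW")
print(json.dumps(event, indent=2))
```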
<bpmn:endEvent id="End" name="Semantic analysis complete"/>
|
<bpmn:endEvent id="End" name="Semantic analysis complete"/>
|
||||||
|
|||||||
@@ -2,9 +2,9 @@
|
|||||||
"kind": "uapf.package",
|
"kind": "uapf.package",
|
||||||
"id": "dev.uapf.semantic-document-analysis",
|
"id": "dev.uapf.semantic-document-analysis",
|
||||||
"name": "Semantic Document Analysis",
|
"name": "Semantic Document Analysis",
|
||||||
"description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.0.0: the three opaque host capabilities (ai.redact@1,\nai.extract@1, event.emit@1) are now governed by Algorithm Cards\nin algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the\nintent, IO contract, ownership, validation history, risk class,\nand audit configuration for one algorithm. Cards are referenced\nfrom resource targets in resources/mappings.yaml.\n",
|
"description": "Level-4 UAPF process for semantic analysis of free-text documents.\n\nThree BPMN service tasks invoke the UAPF-IP capabilities ai.redact@1,\nai.extract@1 and event.emit@1. Three DMN decision tables encode the\ndeterministic algorithm the host previously hid inside application\ncode: assess-personal-data-risk maps PII regex signals to a risk\nlevel; gdpr-processing-route selects CENTRAL vs LOCAL processing,\nanonymisation and redaction level; human-validation-gate applies the\nconfidence thresholds that decide REJECTED / PENDING_REVIEW /\nAPPROVED_AUTO.\n\nOnly the semantic extraction is a model step. Risk classification,\nGDPR routing and the validation gate are explicit ranked rules in\nversioned DMN \u2014 inspectable, auditable, portable. Extraction output\nvalidates against the VDVC v1.1 semantic-summary JSON Schema.\n\nv3.1.0: aligned with UAPF v2.4.0 \u2014 Algorithm Card references move\nfrom resource targets to the BPMN service tasks themselves (via\nuapf24:algorithmCardRef attribute). Each card's io block is also\ndenormalised into a <bpmn:ioSpecification> on the task so inputs\nand outputs render as visible data objects on the diagram. The\ncards themselves and the DMN decisions are unchanged from v3.0.0.\n",
|
||||||
"level": 4,
|
"level": 4,
|
||||||
"version": "3.0.0",
|
"version": "3.2.0",
|
||||||
"requires_capabilities": [
|
"requires_capabilities": [
|
||||||
"ai.redact@1+",
|
"ai.redact@1+",
|
||||||
"ai.extract@1+",
|
"ai.extract@1+",
|
||||||
|
|||||||
@@ -2,10 +2,11 @@ kind: uapf.resources.mapping
|
|||||||
|
|
||||||
# Host-readable contract for the capability-backed service tasks.
|
# Host-readable contract for the capability-backed service tasks.
|
||||||
#
|
#
|
||||||
# v3.0.0 change: the single agent.semantic-extractor target has been
|
# v3.1.0 change: the algorithm_card reference (added in v3.0.0 on each
|
||||||
# split into three algorithm-specific targets, each referencing an
|
# target) has been removed per UAPF v2.4.0 — the Algorithm Card reference
|
||||||
# Algorithm Card under algorithms/ (UAPF v2.3.0, chapter 13). The
|
# now lives on the BPMN serviceTask itself via the
|
||||||
# binding shape is unchanged. The BPMN file is unchanged.
|
# uapf24:algorithmCardRef attribute (see bpmn/semantic-document-analysis.bpmn).
|
||||||
|
# Targets here keep their role as dispatch endpoints only.
|
||||||
#
|
#
|
||||||
# The three DMN decisions (assess-personal-data-risk,
|
# The three DMN decisions (assess-personal-data-risk,
|
||||||
# gdpr-processing-route, human-validation-gate) remain self-describing
|
# gdpr-processing-route, human-validation-gate) remain self-describing
|
||||||
@@ -21,7 +22,6 @@ targets:
|
|||||||
pii_redactor Algorithm Card.
|
pii_redactor Algorithm Card.
|
||||||
capabilities:
|
capabilities:
|
||||||
- capability.ai.redact
|
- capability.ai.redact
|
||||||
algorithm_card: algo.semantic_document_analysis.pii_redactor
|
|
||||||
|
|
||||||
- id: agent.vdvc_semantic_extractor
|
- id: agent.vdvc_semantic_extractor
|
||||||
type: ai_agent
|
type: ai_agent
|
||||||
@@ -33,7 +33,6 @@ targets:
|
|||||||
enforced downstream by the human-validation-gate DMN.
|
enforced downstream by the human-validation-gate DMN.
|
||||||
capabilities:
|
capabilities:
|
||||||
- capability.ai.extract
|
- capability.ai.extract
|
||||||
algorithm_card: algo.semantic_document_analysis.vdvc_semantic_extractor
|
|
||||||
|
|
||||||
- id: agent.completion_event_emitter
|
- id: agent.completion_event_emitter
|
||||||
type: ai_agent
|
type: ai_agent
|
||||||
@@ -43,7 +42,6 @@ targets:
|
|||||||
completion_event_emitter Algorithm Card.
|
completion_event_emitter Algorithm Card.
|
||||||
capabilities:
|
capabilities:
|
||||||
- capability.event.emit
|
- capability.event.emit
|
||||||
algorithm_card: algo.semantic_document_analysis.completion_event_emitter
|
|
||||||
|
|
||||||
bindings:
|
bindings:
|
||||||
- source: { type: bpmn.serviceTask, ref: Task_DetectRedactPii }
|
- source: { type: bpmn.serviceTask, ref: Task_DetectRedactPii }
|
||||||
|
|||||||
14
uapf.yaml
@@ -18,15 +18,15 @@ description: |
|
|||||||
versioned DMN — inspectable, auditable, portable. Extraction output
|
versioned DMN — inspectable, auditable, portable. Extraction output
|
||||||
validates against the VDVC v1.1 semantic-summary JSON Schema.
|
validates against the VDVC v1.1 semantic-summary JSON Schema.
|
||||||
|
|
||||||
v3.0.0: the three opaque host capabilities (ai.redact@1,
|
v3.1.0: aligned with UAPF v2.4.0 — Algorithm Card references move
|
||||||
ai.extract@1, event.emit@1) are now governed by Algorithm Cards
|
from resource targets to the BPMN service tasks themselves (via
|
||||||
in algorithms/ per UAPF v2.3.0 chapter 13. Each Card supplies the
|
uapf24:algorithmCardRef attribute). Each card's io block is also
|
||||||
intent, IO contract, ownership, validation history, risk class,
|
denormalised into a <bpmn:ioSpecification> on the task so inputs
|
||||||
and audit configuration for one algorithm. Cards are referenced
|
and outputs render as visible data objects on the diagram. The
|
||||||
from resource targets in resources/mappings.yaml.
|
cards themselves and the DMN decisions are unchanged from v3.0.0.
|
||||||
|
|
||||||
level: 4
|
level: 4
|
||||||
version: "3.0.0"
|
version: "3.2.0"
|
||||||
|
|
||||||
# ── UAPF-IP integration (capability needs + profile + guardrails) ──
|
# ── UAPF-IP integration (capability needs + profile + guardrails) ──
|
||||||
requires_capabilities:
|
requires_capabilities:
|
||||||
|
|||||||