Ontology Integration
Frank turns curated Iceberg tables into semantic entities. The ontology integration lets teams publish pipeline outputs into ontology-core-v2 so applications can consume typed, versioned, relationship-aware data.
The model
Gold / Silver Iceberg table
|
v
Backing dataset
|
v
Ontology entity type
|
v
Ontology entities

A backing dataset says: this Iceberg table backs this ontology entity type, using these column-to-property mappings and this primary key.
Entity types
Entity types are schemas served by ontology-core-v2. Frank proxies the entity type surface so data builders can work inside the same UI and API:
GET /api/v1/ontology/status
GET /api/v1/ontology/entity-types
GET /api/v1/ontology/entity-types/domains
GET /api/v1/ontology/entity-types/{code}
POST /api/v1/ontology/entity-types
POST /api/v1/ontology/entity-types/{code}/versions
PATCH /api/v1/ontology/entity-types/{code}
DELETE /api/v1/ontology/entity-types/{code}
GET /api/v1/ontology/entity-types/{code}/versions

Entity types can include fields and relationships. Frank synthesizes relationship references into field-like mapping targets, so users can map columns such as station_name or route_id into relationship refs during backing dataset setup.
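As a sketch, an entity type creation payload for POST /api/v1/ontology/entity-types might look like the following. The field names here are illustrative assumptions, not the confirmed ontology-core-v2 schema:

```python
# Hypothetical entity type payload. Field names ("code", "domain",
# "fields", "relationships") are assumptions for illustration only.
entity_type = {
    "code": "vehicle",
    "domain": "Transportation",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "dateObserved", "type": "datetime"},
    ],
    "relationships": [
        # Frank synthesizes refs like this into field-like mapping
        # targets during backing dataset setup.
        {"name": "ref_station", "target_type": "station"},
    ],
}
```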
Backing datasets
A backing dataset contains:
| Field | Meaning |
|---|---|
| iceberg_namespace / iceberg_table | The materialized table to publish. |
| entity_type_id / entity_type_name | The ontology type being backed. |
| schema_library_ref | Optional source schema reference such as fiware:Transportation/Vehicle. |
| property_mappings | Column-to-property mapping array. |
| primary_key_column | Stable entity key column. |
| title_key_column | Human-readable entity label column. |
| sync_mode | When the dataset should publish. |
| cursor_column | Optional incremental sync cursor. |
| transform_id / pipeline_id | Optional lineage back to the producer. |
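Putting those fields together, a hypothetical registration payload might look like this. Only the field names come from the table above; every value is an illustrative assumption:

```python
# Sketch of a backing dataset registration. Values are made up for
# illustration; field names follow the table above.
backing_dataset = {
    "iceberg_namespace": "gold",
    "iceberg_table": "vehicle_positions",
    "entity_type_name": "vehicle",
    "schema_library_ref": "fiware:Transportation/Vehicle",
    "property_mappings": [
        {"column": "vehicle_id", "property": "id",
         "is_primary_key": True, "type": "string"},
    ],
    "primary_key_column": "vehicle_id",
    "title_key_column": "vehicle_label",
    "sync_mode": "manual",          # when the dataset should publish
    "cursor_column": "observed_at", # optional incremental sync cursor
}
```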
Backing dataset lifecycle:
pending -> syncing -> synced
synced -> syncing
synced -> needs_remapping
needs_remapping -> pending
error -> pending | syncing

Property mappings
Mappings are explicit and reviewable:
[
{
"column": "vehicle_id",
"property": "id",
"is_primary_key": true,
"type": "string"
},
{
"column": "observed_at",
"property": "dateObserved",
"type": "datetime"
},
{
"column": "station_name",
"property": "ref_station",
"is_relationship": true,
"target_type": "station",
"target_key": "name"
}
]

Relationship mappings let the sync activity resolve business keys into ontology entity UUIDs.
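A minimal sketch of that resolution step, assuming an in-memory lookup in place of a real query against ontology-core-v2:

```python
# Sketch: apply property mappings to one row, resolving relationship
# mappings from business keys to entity UUIDs. The `lookup` dict
# stands in for ontology-core-v2; all names are illustrative.
def resolve_row(row, mappings, lookup):
    entity = {}
    for m in mappings:
        value = row[m["column"]]
        if m.get("is_relationship"):
            # Business key -> entity UUID, keyed by (type, key field, value).
            entity[m["property"]] = lookup[(m["target_type"], m["target_key"], value)]
        else:
            entity[m["property"]] = value
    return entity

mappings = [
    {"column": "vehicle_id", "property": "id"},
    {"column": "station_name", "property": "ref_station",
     "is_relationship": True, "target_type": "station", "target_key": "name"},
]
lookup = {("station", "name", "Central"): "placeholder-station-uuid"}
entity = resolve_row({"vehicle_id": "v42", "station_name": "Central"},
                     mappings, lookup)
```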
Mapping assistance
Frank can suggest backing dataset mappings:
POST /api/v1/backing-datasets/suggest-mappings

The suggestion request includes the Iceberg table and target entity type. Frank uses table schema, target property names, and AI assistance to propose column-to-property matches.
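The real endpoint combines schema inspection with AI assistance; as a toy illustration of the name-matching part only, a similarity heuristic could look like this:

```python
import difflib

def _norm(name):
    # Fold snake_case and camelCase variants to a common form.
    return name.replace("_", "").lower()

# Toy column-to-property suggester. This is NOT Frank's algorithm,
# just a sketch of name-similarity matching between table columns
# and target entity type properties.
def suggest_mappings(columns, properties, cutoff=0.6):
    norm_props = {_norm(p): p for p in properties}
    suggestions = {}
    for col in columns:
        match = difflib.get_close_matches(_norm(col), list(norm_props),
                                          n=1, cutoff=cutoff)
        if match:
            suggestions[col] = norm_props[match[0]]
    return suggestions
```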
Sync
Backing datasets sync rows from Iceberg into ontology-core-v2. The sync path tracks:
- Workflow ID and workflow run ID.
- Status: pending, running, synced, error, skipped.
- Started and completed timestamps.
- Rows synced.
- Snapshot ID.
- Full vs incremental sync.
- Error message.
- Trigger source.
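The tracked state can be pictured as a record like this hypothetical dataclass; it mirrors the fields above but is not the service's actual persistence schema:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sync-run record; field names are assumptions derived
# from the list above, not the real storage model.
@dataclass
class SyncRun:
    workflow_id: str
    workflow_run_id: str
    status: str                    # pending | running | synced | error | skipped
    started_at: Optional[str] = None
    completed_at: Optional[str] = None
    rows_synced: int = 0
    snapshot_id: Optional[str] = None
    full_sync: bool = True         # full vs incremental
    error_message: Optional[str] = None
    trigger_source: str = "manual"
```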
Useful endpoints:
POST /api/v1/backing-datasets/{id}/sync
GET /api/v1/backing-datasets/{id}/sync-history
GET /api/v1/backing-datasets/{id}/sync-history/{run_id}/logs
GET /api/v1/backing-datasets/{id}/health

The health endpoint checks the mapping, table state, ontology status, and schema drift signals that matter before publication.
Schema libraries
Schema libraries provide target schemas for transforms and backing datasets:
GET /api/v1/schema-libraries
GET /api/v1/schema-libraries/{library_id}/domains
GET /api/v1/schema-libraries/{library_id}/domains/{domain}/schemas
GET /api/v1/schema-libraries/{library_id}/schemas/{schema_id}
GET /api/v1/schema-libraries/schema/{full_id}
GET /api/v1/schema-libraries/search
POST /api/v1/schema-libraries/validate/{full_id}

The registry combines FIWARE Smart Data Models and custom schemas behind one browsing and validation surface.
Identity policies
Identity policies define stable keys for semantic entities. Strategies include:
- passthrough: use the normalized source field.
- composite: concatenate normalized fields.
- hash: hash the composite key.
- uuid: generate a UUID-form key from normalized values.
Policies can normalize values with operations such as trim, upper/lower, space stripping, and NFC normalization. They can be system-level or tenant-level.
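A sketch of the four strategies with a few of the normalization operations, assuming SHA-256 for the hash strategy and name-based UUIDv5 for the uuid strategy; the service's actual key formats may differ:

```python
import hashlib
import unicodedata
import uuid

# Illustrative normalization ops (trim, upper/lower, space stripping,
# NFC), applied in a fixed order for this sketch.
def normalize(value, ops=("trim", "lower")):
    if "trim" in ops:
        value = value.strip()
    if "upper" in ops:
        value = value.upper()
    if "lower" in ops:
        value = value.lower()
    if "strip_spaces" in ops:
        value = value.replace(" ", "")
    if "nfc" in ops:
        value = unicodedata.normalize("NFC", value)
    return value

# Assumed key formats: SHA-256 hex for "hash", UUIDv5 for "uuid".
def identity_key(strategy, values, separator=":"):
    normalized = [normalize(v) for v in values]
    if strategy == "passthrough":
        return normalized[0]
    composite = separator.join(normalized)
    if strategy == "composite":
        return composite
    if strategy == "hash":
        return hashlib.sha256(composite.encode("utf-8")).hexdigest()
    if strategy == "uuid":
        return str(uuid.uuid5(uuid.NAMESPACE_URL, composite))
    raise ValueError(f"unknown strategy: {strategy}")
```

Because every strategy normalizes first, inputs that differ only in case or surrounding whitespace produce the same key, which is exactly what a dry run should confirm before a transform or backing dataset depends on it.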
Important endpoints:
GET /api/v1/identity-policies
GET /api/v1/identity-policies/{id}
POST /api/v1/identity-policies
PUT /api/v1/identity-policies/{id}
DELETE /api/v1/identity-policies/{id}
POST /api/v1/identity-policies/{id}/dry-run

Use dry runs to verify identifier output before a transform or backing dataset depends on it.
Recommended workflow
- Build and run a transform into a Silver or Gold table.
- Choose or create an ontology entity type.
- Register a backing dataset for the table.
- Use mapping suggestions, then review field and relationship mappings.
- Pick primary key and title key columns.
- Run a health check.
- Trigger sync.
- Monitor sync history and logs.