
Ontology Integration

Frank turns curated Iceberg tables into semantic entities. The ontology integration lets teams publish pipeline outputs into ontology-core-v2 so applications can consume typed, versioned, relationship-aware data.

The model

```text
Gold / Silver Iceberg table
        |
        v
Backing dataset
        |
        v
Ontology entity type
        |
        v
Ontology entities
```

A backing dataset declares that a specific Iceberg table backs a specific ontology entity type, using a given set of column-to-property mappings and a primary key.

Entity types

Entity types are schemas served by ontology-core-v2. Frank proxies the entity type surface so data builders can work inside the same UI and API:

```http
GET    /api/v1/ontology/status
GET    /api/v1/ontology/entity-types
GET    /api/v1/ontology/entity-types/domains
GET    /api/v1/ontology/entity-types/{code}
POST   /api/v1/ontology/entity-types
POST   /api/v1/ontology/entity-types/{code}/versions
PATCH  /api/v1/ontology/entity-types/{code}
DELETE /api/v1/ontology/entity-types/{code}
GET    /api/v1/ontology/entity-types/{code}/versions
```

Entity types can include fields and relationships. Frank synthesizes relationship references into field-like mapping targets, so users can map columns such as station_name or route_id onto relationship references during backing dataset setup.

Backing datasets

A backing dataset contains:

| Field | Meaning |
| --- | --- |
| `iceberg_namespace` / `iceberg_table` | The materialized table to publish. |
| `entity_type_id` / `entity_type_name` | The ontology type being backed. |
| `schema_library_ref` | Optional source schema reference, such as `fiware:Transportation/Vehicle`. |
| `property_mappings` | Column-to-property mapping array. |
| `primary_key_column` | Stable entity key column. |
| `title_key_column` | Human-readable entity label column. |
| `sync_mode` | When the dataset should publish. |
| `cursor_column` | Optional incremental sync cursor. |
| `transform_id` / `pipeline_id` | Optional lineage back to the producer. |
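As a concrete sketch, a registration payload built from these fields might look like the following. The field names follow the table above, but the values and the exact payload shape are illustrative assumptions, not Frank's actual API schema:

```python
# Hypothetical backing dataset registration payload, assembled from the
# fields listed above. Values are illustrative.
backing_dataset = {
    "iceberg_namespace": "gold",
    "iceberg_table": "vehicle_positions",
    "entity_type_name": "vehicle",
    "schema_library_ref": "fiware:Transportation/Vehicle",
    "sync_mode": "incremental",
    "cursor_column": "observed_at",
    "primary_key_column": "vehicle_id",
    "title_key_column": "vehicle_label",
    "property_mappings": [
        {"column": "vehicle_id", "property": "id",
         "is_primary_key": True, "type": "string"},
        {"column": "observed_at", "property": "dateObserved",
         "type": "datetime"},
    ],
}

# The primary key column should appear among the mapped columns.
mapped_columns = {m["column"] for m in backing_dataset["property_mappings"]}
assert backing_dataset["primary_key_column"] in mapped_columns
```

A check like the final assertion is worth running client-side before registration, since a primary key that is never mapped cannot produce stable entity keys.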

Backing dataset lifecycle:

```text
pending -> syncing -> synced
synced -> syncing
synced -> needs_remapping
needs_remapping -> pending
error -> pending | syncing
```
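The lifecycle above can be expressed as an allowed-transition table, which is a convenient shape for validating status updates. Note one assumption: the diagram does not show how a dataset enters `error`, so `syncing -> error` is inferred here:

```python
# Backing dataset lifecycle as an allowed-transition table.
# syncing -> error is an assumption; the diagram does not show
# how a dataset enters the error state.
ALLOWED_TRANSITIONS = {
    "pending": {"syncing"},
    "syncing": {"synced", "error"},
    "synced": {"syncing", "needs_remapping"},
    "needs_remapping": {"pending"},
    "error": {"pending", "syncing"},
}

def can_transition(current: str, target: str) -> bool:
    """Return True if the lifecycle permits moving current -> target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```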

Property mappings

Mappings are explicit and reviewable:

```json
[
  {
    "column": "vehicle_id",
    "property": "id",
    "is_primary_key": true,
    "type": "string"
  },
  {
    "column": "observed_at",
    "property": "dateObserved",
    "type": "datetime"
  },
  {
    "column": "station_name",
    "property": "ref_station",
    "is_relationship": true,
    "target_type": "station",
    "target_key": "name"
  }
]
```

Relationship mappings let the sync activity resolve business keys into ontology entity UUIDs.
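In outline, that resolution looks like the following sketch: the business key in the mapped column is looked up against the target entity type's key property to find the entity UUID. The function and the in-memory lookup dict are illustrative stand-ins for the real sync activity's ontology query:

```python
# Illustrative relationship resolution: the value in the mapped column
# is a business key, looked up against the target type's key property
# (mapping["target_key"]) to obtain the target entity's UUID.
def resolve_relationship(row, mapping, key_to_uuid):
    """Return the target entity UUID for a relationship mapping, or None."""
    business_key = row.get(mapping["column"])
    return key_to_uuid.get(business_key)

mapping = {
    "column": "station_name",
    "property": "ref_station",
    "is_relationship": True,
    "target_type": "station",
    "target_key": "name",
}

# Hypothetical name -> UUID index for the "station" entity type.
stations = {"Central": "c0ffee00-0000-4000-8000-000000000001"}

ref = resolve_relationship({"station_name": "Central"}, mapping, stations)
```

A `None` result (no entity with that business key) is exactly the kind of condition that pushes a backing dataset toward `needs_remapping` or `error`.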

Mapping assistance

Frank can suggest backing dataset mappings:

```http
POST /api/v1/backing-datasets/suggest-mappings
```

The suggestion request includes the Iceberg table and target entity type. Frank uses table schema, target property names, and AI assistance to propose column-to-property matches.
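A toy version of the name-matching half of this can be sketched with fuzzy string similarity. This is only the shape of the output: Frank's real suggester also uses AI assistance, and the cutoff and normalization here are assumptions:

```python
import difflib

def suggest_mappings(columns, properties, cutoff=0.6):
    """Propose column -> property matches by fuzzy name similarity.

    Names are lowercased and stripped of underscores before matching,
    so snake_case columns can match camelCase properties.
    """
    norm_props = {p.lower().replace("_", ""): p for p in properties}
    suggestions = []
    for col in columns:
        match = difflib.get_close_matches(
            col.lower().replace("_", ""), list(norm_props), n=1, cutoff=cutoff)
        if match:
            suggestions.append({"column": col, "property": norm_props[match[0]]})
    return suggestions
```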

Sync

Backing datasets sync rows from Iceberg into ontology-core-v2. The sync path tracks:

  • Workflow ID and workflow run ID.
  • Status: pending, running, synced, error, skipped.
  • Started and completed timestamps.
  • Rows synced.
  • Snapshot ID.
  • Full vs incremental sync.
  • Error message.
  • Trigger source.
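The tracked fields amount to one record per sync run. A sketch of that record, with field names that are illustrative rather than Frank's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative record for one sync run, mirroring the tracked fields
# listed above. Field names are assumptions.
@dataclass
class SyncRun:
    workflow_id: str
    workflow_run_id: str
    status: str = "pending"           # pending | running | synced | error | skipped
    started_at: Optional[str] = None
    completed_at: Optional[str] = None
    rows_synced: int = 0
    snapshot_id: Optional[str] = None
    full_sync: bool = True            # False for an incremental sync
    error_message: Optional[str] = None
    trigger_source: str = "manual"    # e.g. manual, schedule, pipeline

VALID_STATUSES = {"pending", "running", "synced", "error", "skipped"}
```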

Useful endpoints:

```http
POST /api/v1/backing-datasets/{id}/sync
GET  /api/v1/backing-datasets/{id}/sync-history
GET  /api/v1/backing-datasets/{id}/sync-history/{run_id}/logs
GET  /api/v1/backing-datasets/{id}/health
```

The health endpoint checks the mapping, table state, ontology status, and schema drift signals that matter before publication.

Schema libraries

Schema libraries provide target schemas for transforms and backing datasets:

```http
GET  /api/v1/schema-libraries
GET  /api/v1/schema-libraries/{library_id}/domains
GET  /api/v1/schema-libraries/{library_id}/domains/{domain}/schemas
GET  /api/v1/schema-libraries/{library_id}/schemas/{schema_id}
GET  /api/v1/schema-libraries/schema/{full_id}
GET  /api/v1/schema-libraries/search
POST /api/v1/schema-libraries/validate/{full_id}
```

The registry combines FIWARE Smart Data Models and custom schemas behind one browsing and validation surface.

Identity policies

Identity policies define stable keys for semantic entities. Strategies include:

  • passthrough: use the normalized source field.
  • composite: concatenate normalized fields.
  • hash: hash the composite key.
  • uuid: generate a UUID-form key from normalized values.

Policies can normalize values with operations such as trim, upper/lower, space stripping, and NFC normalization. They can be system-level or tenant-level.
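The strategies and normalization operations above can be sketched as follows. The operation and strategy names follow this page, but their exact semantics (separator, hash algorithm, UUID namespace) are assumptions:

```python
import hashlib
import unicodedata
import uuid

def normalize(value: str, ops: list) -> str:
    """Apply normalization operations in order. Semantics are assumed."""
    for op in ops:
        if op == "trim":
            value = value.strip()
        elif op == "lower":
            value = value.lower()
        elif op == "upper":
            value = value.upper()
        elif op == "strip_spaces":
            value = value.replace(" ", "")
        elif op == "nfc":
            value = unicodedata.normalize("NFC", value)
    return value

def make_key(values: list, strategy: str, ops: list) -> str:
    """Build a stable entity key from source values under one strategy."""
    parts = [normalize(v, ops) for v in values]
    composite = ":".join(parts)  # separator is an assumption
    if strategy == "passthrough":
        return parts[0]
    if strategy == "composite":
        return composite
    if strategy == "hash":
        return hashlib.sha256(composite.encode()).hexdigest()
    if strategy == "uuid":
        # Name-based UUID (v5) so the same inputs always yield the same key.
        return str(uuid.uuid5(uuid.NAMESPACE_URL, composite))
    raise ValueError(f"unknown strategy: {strategy}")
```

Determinism is the point: the same normalized inputs must always produce the same key, which is what the dry-run endpoint lets you confirm before anything depends on it.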

Important endpoints:

```http
GET    /api/v1/identity-policies
GET    /api/v1/identity-policies/{id}
POST   /api/v1/identity-policies
PUT    /api/v1/identity-policies/{id}
DELETE /api/v1/identity-policies/{id}
POST   /api/v1/identity-policies/{id}/dry-run
```

Use dry runs to verify identifier output before a transform or backing dataset depends on it.

End-to-end workflow

  1. Build and run a transform into a Silver or Gold table.
  2. Choose or create an ontology entity type.
  3. Register a backing dataset for the table.
  4. Use mapping suggestions, then review field and relationship mappings.
  5. Pick primary key and title key columns.
  6. Run a health check.
  7. Trigger sync.
  8. Monitor sync history and logs.
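The API-facing steps above can be sketched against a stand-in HTTP client. The endpoint paths come from this page; the client class and helper function are illustrative, not part of any Frank SDK:

```python
# Stand-in client that records calls instead of sending them, so the
# sequencing of the workflow steps can be seen (and tested) directly.
class RecordingClient:
    def __init__(self):
        self.calls = []

    def get(self, path):
        self.calls.append(("GET", path))

    def post(self, path):
        self.calls.append(("POST", path))

def publish_backing_dataset(client, dataset_id):
    # Step 4: ask for mapping suggestions, then review them.
    client.post("/api/v1/backing-datasets/suggest-mappings")
    # Step 6: run a health check before publishing.
    client.get(f"/api/v1/backing-datasets/{dataset_id}/health")
    # Step 7: trigger sync.
    client.post(f"/api/v1/backing-datasets/{dataset_id}/sync")
    # Step 8: monitor sync history.
    client.get(f"/api/v1/backing-datasets/{dataset_id}/sync-history")

client = RecordingClient()
publish_backing_dataset(client, "ds-1")
```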

Frank is built by aiaiai-pt.