# AI Assistance
Frank uses Martha to make the slow parts of data pipeline design faster: understanding source schemas, choosing target models, drafting mappings, reviewing SQL, generating custom transform code, and composing pipeline DAGs.
The AI layer is deliberately assistive. It proposes; the user reviews, edits, tests, and promotes.
## Where AI shows up
| Workflow | API | Use it for |
|---|---|---|
| Target schema suggestion | POST /api/v1/ai/suggest-target-schema | Match a source table to FIWARE Smart Data Models or other schema-library targets. |
| Field mapping suggestion | POST /api/v1/ai/suggest-field-mappings | Map source fields to target fields, including transforms, literals, and runtime context. |
| Pattern parameter suggestion | POST /api/v1/ai/suggest-pattern-params | Fill transform pattern parameters from schema and partial user input. |
| SQL review | POST /api/v1/ai/review-sql | Review Trino SQL for correctness, performance, and data quality issues. |
| Transform generation | POST /api/v1/ai/generate-transform | Generate Python transform pattern files from a step description and schema contract. |
| CI fix | POST /api/v1/ai/fix-ci-failure | Diagnose generated pattern CI logs and propose file-level fixes. |
| Publish transform | POST /api/v1/ai/publish-transform | Open a PR for bespoke transform code. |
| Pipeline composition | POST /api/v1/ai/compose-pipeline | Draft a multi-step pipeline DAG from source tables, target schema, and intent. |
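All of these endpoints take JSON payloads. As a rough sketch only, a request to the target-schema suggestion workflow might carry fields like the following; the payload key names here are illustrative assumptions, not the documented contract:

```python
import json

# Hypothetical body for POST /api/v1/ai/suggest-target-schema.
# Every key name below is an assumption for illustration only.
payload = {
    "source_table": "iceberg.bronze.traffic.vehicle_readings",
    "source_fields": [
        {"name": "vehicle_id", "type": "varchar"},
        {"name": "speed_mph", "type": "double"},
        {"name": "observed_at", "type": "timestamp"},
    ],
    "max_matches": 5,
}

body = json.dumps(payload)
print(body)
```

The response to a request like this is the ranked-match structure shown in the schema matching section below.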
## The user experience
AI assistance appears inside the source, transform, and pipeline workflows:
- A user selects source data.
- Frank gathers schema and context.
- Martha runs the relevant workflow.
- Frank returns structured suggestions with confidence and reasoning.
- The user accepts, edits, or rejects suggestions.
- The result becomes ordinary Frank configuration: a target schema, field mapping, transform pattern, custom code package, or pipeline DAG.
Nothing special is stored because it came from AI. Once accepted, it is part of the same spec, artifact, versioning, sandbox, and run lifecycle as hand-authored work.
## Schema matching
Target schema suggestion analyzes source fields and returns ranked matches:
```json
{
  "matches": [
    {
      "schema_id": "fiware:Transportation/Vehicle",
      "schema_name": "Vehicle",
      "confidence": 0.91,
      "reason": "The source contains vehicle identifiers, position, speed, and timestamp fields.",
      "field_preview": ["id", "location", "speed", "dateObserved"]
    }
  ]
}
```

Use this at the start of a transform when a team knows the data but not the best standard model.
## Field mapping
Field mappings support three kinds:
| Kind | Example |
|---|---|
| `source_expression` | `speedKph` comes from `speed_mph * 1.60934`. |
| `literal` | `source_system` is always `"here_traffic"`. |
| `context` | `tenantId` comes from runtime context `tenant.id`. |
This lets AI fill complete target schemas, including metadata and constants, instead of only direct column matches.
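A suggestion set covering all three kinds might look like the following sketch, built from the examples in the table above; the exact response shape and key names are assumptions, not the documented schema:

```python
# Illustrative field-mapping suggestions; key names are assumptions.
mappings = [
    {"target_field": "speedKph", "kind": "source_expression",
     "value": "speed_mph * 1.60934"},
    {"target_field": "source_system", "kind": "literal",
     "value": "here_traffic"},
    {"target_field": "tenantId", "kind": "context",
     "value": "tenant.id"},
]

# All three mapping kinds appear, so constants and runtime metadata
# are covered alongside ordinary column-derived expressions.
kinds_used = sorted({m["kind"] for m in mappings})
print(kinds_used)  # → ['context', 'literal', 'source_expression']
```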
## Pattern parameter suggestions
Transform patterns are powerful because they are reusable, but filling in their params by hand can still be tedious. AI can inspect a source schema and propose:
- Deduplication keys.
- Timestamp columns.
- Numeric fields for anomaly detection.
- Group-by dimensions.
- Join keys.
- Conversion source/target units.
- Geospatial columns.
The result is a params object plus per-field reasoning.
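For a vehicle-telemetry table, that pairing of params and per-field reasoning might look roughly like this; the object shape and key names are assumptions for illustration:

```python
# Hypothetical params for a deduplication + anomaly-detection pattern.
params = {
    "dedup_keys": ["vehicle_id", "observed_at"],
    "timestamp_column": "observed_at",
    "anomaly_fields": ["speed_mph"],
    "group_by": ["vehicle_id"],
}

# Per-field reasoning, keyed by the param it explains.
reasoning = {
    "dedup_keys": "vehicle_id plus observed_at uniquely identifies a reading.",
    "timestamp_column": "observed_at is the only timestamp-typed column.",
    "anomaly_fields": "speed_mph is the only continuous numeric measure.",
    "group_by": "Per-vehicle grouping matches the entity grain of the table.",
}

# Every suggested param carries a matching explanation.
assert set(params) == set(reasoning)
```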
## SQL review
SQL review is designed for fast iteration before hydration or publication. It checks:
- Invalid references and type mismatches.
- Expensive scans and risky joins.
- Null handling.
- Lossy casts.
- Missing predicates.
- Data quality risks.
The output is a list of issues with severity, message, suggestion, and line number when available.
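An issue list in that shape might look like the following sketch; the field names mirror the description above but the exact response schema is an assumption:

```python
# Hypothetical SQL-review findings; field names are assumptions.
issues = [
    {"severity": "error", "line": 4,
     "message": "Column speed_kph is not defined in the source table.",
     "suggestion": "Derive it with speed_mph * 1.60934."},
    {"severity": "warning", "line": None,
     "message": "SELECT * forces a scan of every column.",
     "suggestion": "Project only the columns the target schema needs."},
]

# Blocking issues are the ones a user would fix before hydration.
blocking = [i for i in issues if i["severity"] == "error"]
print(len(blocking))  # → 1
```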
## Code generation and publication
When a catalog pattern is not enough, Frank can ask Martha to generate a bespoke Python transform pattern. The request includes:
- Step description.
- Source schema.
- Target schema.
- Capability tier.
- Pattern name.
- Optional pipeline step ID.
The generated files follow the frank-sdk contract: read TRANSFORM_CONFIG, process data, emit metrics/logs/lineage, and write a FrankResult to stdout.
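A generated pattern might take roughly the following shape. This is a minimal sketch, assuming `TRANSFORM_CONFIG` arrives as JSON in an environment variable and that a FrankResult serializes as a JSON document on stdout; neither assumption is the documented frank-sdk API, and the conversion step is purely illustrative:

```python
import json
import os
import sys

def run() -> dict:
    # Read the step configuration; assumed to arrive as JSON in the
    # TRANSFORM_CONFIG environment variable.
    config = json.loads(os.environ.get("TRANSFORM_CONFIG", "{}"))
    rows = config.get("rows", [])

    # Illustrative processing step: an mph-to-kph unit conversion.
    out = [dict(r, speedKph=r["speed_mph"] * 1.60934) for r in rows]

    # Assumed FrankResult shape: status plus metrics and lineage blocks.
    return {
        "status": "success",
        "metrics": {"rows_in": len(rows), "rows_out": len(out)},
        "lineage": {"inputs": [config.get("source_table")],
                    "row_count": len(out)},
    }

if __name__ == "__main__":
    # Contract: write the FrankResult to stdout.
    json.dump(run(), sys.stdout)
```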
For generated custom code, the publish path can create a PR against the transform pattern repository. CI can then build, test, and register the pattern back into Frank.
## Pipeline composition
Pipeline composition creates a draft DAG from intent:
```yaml
pipeline_name: customer_360
source_tables:
  - iceberg.bronze.crm.contacts
  - iceberg.bronze.billing.customers
target_description: Unified customer profile with billing and CRM attributes.
target_sdm_id: fiware:Customer/Customer
pipeline_context: Prefer reusable catalog patterns; avoid bespoke code unless needed.
```

CLI:

```shell
frankctl ai compose-pipeline -f customer-360.yaml --timeout 600
```

The response can include proposed steps, pattern IDs, suggested params, input/output columns, dependencies, confidence, and reasoning. The pipeline still goes through normal review, sandbox validation, and activation.
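A drafted DAG response might contain steps in roughly this shape; the step and pattern names and every key below are assumptions, not the documented response schema:

```python
# Hypothetical compose-pipeline response; names and shape are assumptions.
draft = {
    "pipeline_name": "customer_360",
    "steps": [
        {"id": "dedup_contacts", "pattern_id": "catalog/deduplicate",
         "depends_on": [], "confidence": 0.88},
        {"id": "join_customers", "pattern_id": "catalog/key_join",
         "depends_on": ["dedup_contacts"], "confidence": 0.81},
    ],
}

# A well-formed draft DAG only depends on steps already declared,
# which is one sanity check a reviewer can apply before promoting it.
seen = set()
for step in draft["steps"]:
    assert all(dep in seen for dep in step["depends_on"])
    seen.add(step["id"])
print(sorted(seen))  # → ['dedup_contacts', 'join_customers']
```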
## Workflow ownership
Frank owns the workflow definitions in backend/services/martha_workflows.py and seeds them into Martha with:
```shell
python scripts/seed_martha_workflows.py --update-existing
```

That keeps the AI behavior versioned with the Frank product rather than hidden inside an external prompt store.