ADR 0002: Retirement of Specialized Importers in Favor of Generic Import Registry
Status: Adopted (2025-10-08)
Context
Current importers (TaxonomyImporter, PlotImporter, ShapeImporter, OccurrenceImporter) directly manipulate SQLite, recalculate nested sets, and enforce three tables (taxon_ref, plot_ref, shape_ref).
Transform/Export have been generalized (plugins + config) but remain coupled to these tables.
The “Generic Import System” roadmap defines a declarative configuration (entities.references/datasets) and an Entity Registry to centralize schemas and metadata.
We are still in alpha, so there is no backward compatibility requirement.
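To make the declarative configuration concrete, here is a minimal sketch of how an entities mapping with references and datasets groups might be parsed into typed objects. Only the entities.references/datasets grouping comes from this ADR; every field name (source, kind, links), the file paths, and the parse_entities helper are hypothetical. The sketch uses standard-library dataclasses so that it runs without dependencies, whereas the Actions section names Pydantic models.

```python
from dataclasses import dataclass, field


@dataclass
class EntityConfig:
    """One declared entity. All field names are hypothetical illustrations."""
    name: str
    source: str                      # path or connector identifier (assumed)
    kind: str = "flat"               # e.g. "flat", "hierarchical", "spatial" (assumed)
    links: dict = field(default_factory=dict)  # local field -> "entity.field" (assumed)


def parse_entities(raw):
    """Turn the `entities` mapping into typed objects, rejecting unknown groups."""
    entities = raw.get("entities", {})
    unknown = set(entities) - {"references", "datasets"}
    if unknown:
        raise ValueError(f"unknown entity groups: {sorted(unknown)}")
    return {
        group: [
            EntityConfig(name=name, **spec)
            for name, spec in entities.get(group, {}).items()
        ]
        for group in ("references", "datasets")
    }


# Hypothetical configuration, written as the dict a YAML loader would produce.
raw_config = {
    "entities": {
        "references": {
            "taxonomy": {"source": "imports/taxonomy.csv", "kind": "hierarchical"},
            "plots": {"source": "imports/plots.gpkg", "kind": "spatial"},
        },
        "datasets": {
            "occurrences": {
                "source": "imports/occurrences.csv",
                "links": {"taxon_id": "taxonomy.id", "plot_id": "plots.id"},
            },
        },
    }
}

parsed = parse_entities(raw_config)
```

With this shape, adding a new reference such as habitats would be a configuration change only, which is the property the roadmap is after.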
Decision
We are progressively retiring specialized importers in favor of a generic engine driven by the new configuration:
Creation of a persistent Entity Registry describing each entity (type, physical table, links, aliases).
A new import engine will orchestrate connectors, validations, hierarchies, and enrichments relying on DuckDB.
Transform/Export/GUI will consume the Registry instead of querying fixed tables.
The core/components/imports/* modules will be removed once functional equivalence is achieved.
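The registry described above (type, physical table, links, aliases) can be sketched as follows. This is an in-memory illustration only: persistence is omitted, and the EntityRecord and EntityRegistry names, their methods, the entity_taxonomy table name, and the idea of aliasing the legacy taxon_ref name are all assumptions, not the project's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityRecord:
    """Metadata for one entity, mirroring the attributes listed in the decision."""
    name: str
    kind: str                 # entity type, e.g. "hierarchical" (assumed values)
    table: str                # physical table backing the entity
    links: tuple = ()         # names of related entities
    aliases: tuple = ()       # alternative names that resolve to this entity


class EntityRegistry:
    """Hypothetical in-memory registry; a real one would persist its records."""

    def __init__(self):
        self._records = {}
        self._aliases = {}

    def register(self, record):
        # Refuse names or aliases that would shadow an existing entry.
        for key in (record.name, *record.aliases):
            if key in self._records or key in self._aliases:
                raise ValueError(f"name already registered: {key!r}")
        self._records[record.name] = record
        for alias in record.aliases:
            self._aliases[alias] = record.name

    def get(self, name_or_alias):
        name = self._aliases.get(name_or_alias, name_or_alias)
        try:
            return self._records[name]
        except KeyError:
            raise KeyError(f"unknown entity: {name_or_alias!r}") from None

    def table_for(self, name_or_alias):
        """Resolve an entity name or alias to its physical table."""
        return self.get(name_or_alias).table


registry = EntityRegistry()
registry.register(
    EntityRecord(
        name="taxonomy",
        kind="hierarchical",
        table="entity_taxonomy",   # hypothetical physical table name
        aliases=("taxon_ref",),    # assumption: legacy name kept as an alias
    )
)
```

A consumer such as a Transform plugin would then call registry.table_for("taxonomy") rather than hard-coding a table name, which is the decoupling the decision aims for.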
Consequences
Positive
Ability to describe/import any entity (third-party taxonomy, sites, habitats, etc.).
Reduction of duplicated code (CSV/Geo validation, table creation, nested sets).
Complete pipeline alignment (import → transform → export) on declarative configuration.
Cleanup of plugin dependencies: they will no longer need to load Config or Database directly.
Negative / Points of Attention
Migration of existing plugins (hierarchical_nav_widget, geospatial_extractor, top_ranking, HTML exporters) to handle adjacency list or compatibility views.
Update tests (unit, integration, CLI) that assume taxon_ref exists.
Significant documentation work (new import.yml, examples, GUI guides).
Risk of regression during switchover; plan fixtures and end-to-end tests.
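Plugins that today rely on nested-set columns would need either to work from an adjacency list or to keep receiving left/right bounds through a compatibility layer. The sketch below shows one way such bounds could be derived from an adjacency list (a node id to parent id mapping) with an iterative depth-first traversal. The function names and sample ids are illustrative and are not taken from the existing importers.

```python
from collections import defaultdict


def nested_set_bounds(parent_of):
    """Derive (left, right) nested-set bounds from an id -> parent_id mapping.

    Roots are the nodes whose parent is None. Siblings are visited in sorted
    order so that the numbering is deterministic.
    """
    children = defaultdict(list)
    roots = []
    for node, parent in parent_of.items():
        if parent is None:
            roots.append(node)
        else:
            children[parent].append(node)

    left, bounds = {}, {}
    counter = 1
    for root in sorted(roots):
        stack = [(root, False)]
        while stack:
            node, leaving = stack.pop()
            if leaving:
                # All descendants are numbered: close the interval.
                bounds[node] = (left[node], counter)
            else:
                left[node] = counter
                stack.append((node, True))
                # Push in reverse so the smallest child is processed first.
                for child in sorted(children[node], reverse=True):
                    stack.append((child, False))
            counter += 1
    return bounds


def is_descendant(bounds, node, ancestor):
    """True if `node` lies strictly inside `ancestor`'s interval."""
    n_left, n_right = bounds[node]
    a_left, a_right = bounds[ancestor]
    return a_left < n_left and n_right < a_right


# Hypothetical hierarchy: 1 is the root, 2 and 3 are its children, 4 is under 2.
adjacency = {1: None, 2: 1, 3: 1, 4: 2}
bounds = nested_set_bounds(adjacency)
```

For the sample hierarchy this gives node 1 the interval (1, 8), node 2 (2, 5), node 4 (3, 4), and node 3 (6, 7). A compatibility view could expose such computed bounds under the legacy column names while plugins are migrated.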
Actions
Finalize Pydantic models (config_models) ✅ (first draft in place).
Design Entity Registry + metadata storage (niamoto_metadata.* tables).
Implement generic import engine (DuckDB connectors, validations, execution plan).
Adapt Transform/Export to use Registry (remove Config() coupling in plugins).
Retire legacy importers and remove rigid SQLAlchemy models.
Update documentation (docs/08-roadmaps/generic-import-ultrathink.md, GUI README) and provide migration guide.
Follow-up 2025-10-08
✅ Operational DuckDB registry: registry.py/legacy_registry.py expose entities to Transform/Export services and GUI.
✅ direct_reference loader and geospatial extractor migrated to registry; associated tests updated.
🔄 CLI stats command and loaders still depend on sqlite_master; they must be migrated before the historical importers are removed.
📌 Next step: remove core/components/imports/* and SQLAlchemy models once CLI is migrated and CLI tests are reinforced.
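As an illustration of the remaining CLI migration, the sketch below computes per-entity row counts from a mapping of entity names to physical tables, as a registry could supply, instead of discovering tables through sqlite_master. It is an assumption-laden example: the row_counts helper, the table names, and the sample rows are hypothetical, and the standard-library sqlite3 module stands in for the DuckDB connection so that the sketch runs without extra dependencies.

```python
import sqlite3


def row_counts(conn, entity_tables):
    """Count rows per entity using table names supplied by a registry.

    `entity_tables` maps entity name -> physical table name, so the catalog
    (for example sqlite_master) never has to be scanned.
    """
    counts = {}
    for entity, table in entity_tables.items():
        # Table names cannot be bound as parameters, so validate before quoting.
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        counts[entity] = count
    return counts


# Hypothetical tables and rows, created only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity_taxonomy (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE entity_occurrences (id INTEGER PRIMARY KEY, taxon_id INTEGER)")
conn.executemany(
    "INSERT INTO entity_taxonomy (name) VALUES (?)", [("Araucaria",), ("Agathis",)]
)
conn.executemany(
    "INSERT INTO entity_occurrences (taxon_id) VALUES (?)", [(1,), (1,), (2,)]
)

registered = {"taxonomy": "entity_taxonomy", "occurrences": "entity_occurrences"}
stats = row_counts(conn, registered)
```

Because the table list comes from the registry, the same function would report on any newly declared entity without changes to the command.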