# ADR 0002: Retirement of Specialized Importers in Favor of Generic Import Registry

*Status: Adopted (2025-10-08)*

## Context

- Current importers (`TaxonomyImporter`, `PlotImporter`, `ShapeImporter`, `OccurrenceImporter`) directly manipulate SQLite, recalculate nested sets, and enforce three tables (`taxon_ref`, `plot_ref`, `shape_ref`).
- Transform/Export have been generalized (plugins + config) but remain coupled to these tables.
- The "Generic Import System" roadmap defines a declarative configuration (`entities.references/datasets`) and an Entity Registry to centralize schemas and metadata.
- We are still in alpha ⇒ no backward compatibility requirement.

## Decision

We are progressively retiring specialized importers in favor of a generic engine driven by the new configuration:

- Creation of a persistent **Entity Registry** describing each entity (type, physical table, links, aliases).
- A new import engine will orchestrate connectors, validations, hierarchies, and enrichments relying on DuckDB.
- Transform/Export/GUI will consume the Registry instead of querying fixed tables.
- The `core/components/imports/*` modules will be removed once functional equivalence is achieved.

## Consequences

### Positive

- Ability to describe/import any entity (third-party taxonomy, sites, habitats, etc.).
- Reduction of duplicated code (CSV/Geo validation, table creation, nested sets).
- Complete pipeline alignment (import → transform → export) on declarative configuration.
- Cleanup of plugin dependencies: they will no longer need to load `Config` or `Database` directly.

### Negative / Points of Attention

- Migration of existing plugins (`hierarchical_nav_widget`, `geospatial_extractor`, `top_ranking`, HTML exporters) to handle adjacency list or compatibility views.
- Update tests (unit, integration, CLI) that assume `taxon_ref` existence.
- Significant documentation work (new `import.yml`, examples, GUI guides).
- Risk of regression during switchover; plan fixtures and end-to-end tests.

### Actions

1. Finalize Pydantic models (`config_models`) ✅ (first draft in place).
2. Design Entity Registry + metadata storage (`niamoto_metadata.*` tables).
3. Implement generic import engine (DuckDB connectors, validations, execution plan).
4. Adapt Transform/Export to use Registry (remove `Config()` coupling in plugins).
5. Retire legacy importers and remove rigid SQLAlchemy models.
6. Update documentation (`docs/08-roadmaps/generic-import-ultrathink.md`, GUI README) and provide migration guide.

## Follow-up 2025-10-08

- ✅ Operational DuckDB registry: `registry.py`/`legacy_registry.py` expose entities to Transform/Export services and GUI.
- ✅ `direct_reference` loader and geospatial extractor migrated to registry; associated tests updated.
- 🔄 The CLI `stats` command and the loaders that still depend on `sqlite_master` must be migrated before the historical importers are removed.
- 📌 Next step: remove `core/components/imports/*` and the SQLAlchemy models once the CLI is migrated and the CLI tests are reinforced.
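
As a rough illustration of what moving a consumer off fixed tables could look like, the sketch below resolves a physical table through a registry lookup instead of hardcoding `taxon_ref`. It is not the actual `registry.py` API: `EntityRecord`, `EntityRegistry`, `table_for`, and the table names `entity_taxonomy` and `dataset_occurrences` are hypothetical, and an in-memory dictionary stands in for the DuckDB-backed storage. Only the record fields (type, physical table, links, aliases) come from the Decision section.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityRecord:
    """Hypothetical registry entry: type, physical table, links, aliases."""

    name: str
    kind: str  # e.g. "reference" or "dataset"
    table_name: str  # physical table backing the entity
    links: dict[str, str] = field(default_factory=dict)  # column -> target entity
    aliases: tuple[str, ...] = ()


class EntityRegistry:
    """In-memory stand-in for the persistent registry (illustrative only)."""

    def __init__(self) -> None:
        self._by_name: dict[str, EntityRecord] = {}
        self._alias_to_name: dict[str, str] = {}

    def register(self, record: EntityRecord) -> None:
        # Index the record by its canonical name and by every alias.
        self._by_name[record.name] = record
        for alias in record.aliases:
            self._alias_to_name[alias] = record.name

    def get(self, name_or_alias: str) -> EntityRecord:
        # Aliases let legacy names keep resolving during the migration.
        name = self._alias_to_name.get(name_or_alias, name_or_alias)
        try:
            return self._by_name[name]
        except KeyError:
            raise KeyError(f"Unknown entity: {name_or_alias!r}") from None

    def table_for(self, name_or_alias: str) -> str:
        return self.get(name_or_alias).table_name


registry = EntityRegistry()
registry.register(
    EntityRecord(
        name="taxonomy",
        kind="reference",
        table_name="entity_taxonomy",  # assumed name, not from the ADR
        aliases=("taxon_ref",),
    )
)
registry.register(
    EntityRecord(
        name="occurrences",
        kind="dataset",
        table_name="dataset_occurrences",  # assumed name, not from the ADR
        links={"taxon_id": "taxonomy"},
    )
)

# A consumer asks the registry for the table instead of assuming `taxon_ref`.
taxonomy_table = registry.table_for("taxon_ref")
```

Registering the legacy table name as an alias is one possible way to keep existing plugins working while they are migrated; compatibility views, as mentioned under Points of Attention, would be an alternative at the database level.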