Three pillars
Each pillar is a Python module under app/; CI orchestrates them.
Validate
Schema, slug, range, and uniqueness checks gate every PR to the dataset.
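The four check families can be sketched roughly as below. This is a minimal illustration, not the actual contents of app/validate.py: the field names (`slug`, `name`, `cores`), the slug pattern, and the numeric bounds are all assumptions made for the example.

```python
import re

# Assumed slug format: lowercase alphanumerics separated by single hyphens.
SLUG_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

# Hypothetical schema: required key -> expected type.
REQUIRED = (("slug", str), ("name", str), ("cores", int))


def validate_records(records):
    """Return a list of error strings; an empty list means every check passed."""
    errors = []
    seen = set()
    for i, rec in enumerate(records):
        # Schema check: each required key is present with the expected type.
        for key, typ in REQUIRED:
            if not isinstance(rec.get(key), typ):
                errors.append(f"record {i}: {key!r} missing or not {typ.__name__}")

        slug = rec.get("slug")
        if isinstance(slug, str):
            # Slug check: must match the assumed pattern.
            if not SLUG_RE.fullmatch(slug):
                errors.append(f"record {i}: bad slug {slug!r}")
            # Uniqueness check: no two records may share a slug.
            if slug in seen:
                errors.append(f"record {i}: duplicate slug {slug!r}")
            seen.add(slug)

        # Range check: bounds here are illustrative only.
        cores = rec.get("cores")
        if isinstance(cores, int) and not 1 <= cores <= 1024:
            errors.append(f"record {i}: cores {cores} out of range")
    return errors
```

A CI gate built on this shape would fail the PR whenever the returned list is non-empty.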
$ python -m app.validate
Cover
Diff the curated dataset against upstream catalogs (Wikipedia today; ARK / vendor pages next) and surface missing SKUs as weekly reports.
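At its core the diff is a set difference between upstream names and curated names. The sketch below assumes both sides have already been reduced to lists of SKU name strings; the function name `missing_skus` and the normalization rule (case-insensitive, whitespace-collapsed) are hypothetical and may differ from what app/coverage does.

```python
def missing_skus(upstream, curated):
    """Return upstream SKU names that have no match in the curated list, sorted."""

    def norm(name):
        # Assumed normalization: lowercase and collapse runs of whitespace.
        return " ".join(name.lower().split())

    curated_keys = {norm(n) for n in curated}
    # Key by normalized name so upstream spelling variants collapse to one entry.
    missing = {norm(n): n for n in upstream if norm(n) not in curated_keys}
    return sorted(missing.values())
```

The returned list is the kind of payload a weekly report could render, one line per missing SKU.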
$ python -m app.coverage
Ingest
Drafts new records from canonical sources and opens PRs against TechAPI for human review.
$ python -m app.ingest --category cpu
Status
Pulled from the static dump's manifest. Populates once refresh-data has run.
/v1/index.json
How the two repos connect
The dataset lives in TechAPI. TechEngine reads it via a sibling checkout, validates, and emits artifacts.
data/app/validate.py (self-check)
app/validate.py (heavy)
app/seed.py · dump.py
app/coverage · app/ingest (planned)
Workflows
CI surface lives in .github/workflows/.
- validate-data: Reusable; called by TechAPI on every PR and push.
- refresh-data: Cron, Mondays 06:17 UTC; regenerates the static dump.
- coverage-report: Cron, Mondays 06:23 UTC; diffs upstream vs curated, syncs sticky issue.
- weekly-ingest: Cron, Mondays 06:29 UTC; drafts missing SKUs, opens PR against TechAPI.
- test: Lint, type-check, and pytest on every push.
- deploy-pages: Builds this site and the dump, then publishes to Pages.
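The three cron jobs are staggered six minutes apart on Monday mornings. As a small sketch of that schedule, the hypothetical helper below computes the next Monday run in UTC for a given hour and minute. It is illustrative only and not part of the repository. In standard five-field cron syntax, Mondays 06:17 UTC would presumably be written `17 6 * * 1`.

```python
from datetime import datetime, timedelta, timezone


def next_monday_run(now, hour, minute):
    """Return the next Monday at hour:minute UTC strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # weekday() is 0 for Monday, so this moves forward to the coming Monday.
    candidate += timedelta(days=(-now.weekday()) % 7)
    if candidate <= now:
        # It is Monday and today's slot has already passed: use next week.
        candidate += timedelta(days=7)
    return candidate


# The three schedules listed above, as (hour, minute) in UTC.
SCHEDULES = {
    "refresh-data": (6, 17),
    "coverage-report": (6, 23),
    "weekly-ingest": (6, 29),
}
```

Scheduled workflows on GitHub Actions can start later than the nominal time under load, so the computed instant is best read as the earliest expected start.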
Roadmap
Each line is a tracked GitHub issue.