feat: add supplier scoring and UPC file analysis functionality

- Implemented supplier scoring logic in `supplier-scoring.ts` with functions to compute demand score, competition penalty, and overall supplier product score.
- Created unit tests for supplier scoring in `supplier-scoring.test.ts` to validate scoring logic against various scenarios.
- Developed UPC file analysis tool in `upc-file-analysis.ts` to process UPCs in batches, fetch product data from Keepa and SP-API, and generate supplier results.
- Added UPC input reading functionality in `upc-file-reader.ts` to handle XLSX and XLS files, including validation for UPC formats.
- Introduced a command-line tool in `upc-lookup.ts` for looking up UPCs and displaying detailed results or mappings to ASINs.
- Enhanced error handling and logging throughout the new modules for better traceability and user feedback.
Victor Noguera
2026-05-25 00:53:47 -04:00
parent b982edd160
commit c006d87c54
36 changed files with 1905 additions and 113 deletions

@@ -12,8 +12,8 @@ Default to using Bun instead of Node.js.
## APIs
- `bun:sqlite` for SQLite. Don't use `better-sqlite3`.
- `Bun.redis` for Redis. Don't use `ioredis`.
- Use Drizzle ORM with `postgres` driver for Postgres. Connection is in `src/db/index.ts`.
- Prefer `Bun.file` over `node:fs`'s readFile/writeFile.
- `Bun.$\`cmd\`` instead of execa.
@@ -24,7 +24,7 @@ Default to using Bun instead of Node.js.
bun test
# Run a single test file
-bun test src/supplier-scoring.test.ts
+bun test src/supplier/supplier-scoring.test.ts
# Type-check (no emit)
./node_modules/.bin/tsc --noEmit
@@ -40,6 +40,9 @@ bun run bestsellers
bun run monthly-sold
bun run mid-range
# Stalker pipeline
bun run stalker --input input/asins.xlsx
# Web API server
bun run start:web # http://localhost:3000
@@ -47,29 +50,37 @@ bun run start:web # http://localhost:3000
bun run src/sp-test.ts
bun run src/sp-test.ts B07SN9BHVV
bun run src/sp-test.ts --sellability B07SN9BHVV
# Database migrations (Drizzle)
bun run db:generate
bun run db:migrate
```
## Architecture
-Two distinct analysis pipelines share infrastructure (Keepa, SP-API, Redis, SQLite) but diverge in how they produce verdicts.
+Two distinct analysis pipelines share infrastructure (Keepa, SP-API, Redis, Postgres) but diverge in how they produce verdicts.
### ASIN Lead-list Pipeline (`src/index.ts` → `src/analysis-pipeline.ts`)
For spreadsheets containing known ASINs. Verdict is LLM-based (FBA/FBM/SKIP via LM Studio).
-Flow: `reader.ts` parse → Redis cache check → `sp-api.ts` sellability gate (5 concurrent workers) → `keepa.ts` batch enrichment → `sp-api.ts` pricing + FBA fees (5 concurrent workers) → `llm.ts` batched analysis (5 products/batch) → `writer.ts` XLSX + SQLite.
+Flow: `reader.ts` parse → Redis cache check → `integrations/sp-api.ts` sellability gate (5 concurrent workers) → `integrations/keepa.ts` batch enrichment → `integrations/sp-api.ts` pricing + FBA fees (5 concurrent workers) → `integrations/llm.ts` batched analysis (5 products/batch) → `writer.ts` XLSX + Postgres.
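The stage limits in the flow above (5 concurrent workers for the SP-API stages, 5 products per LLM batch) can be sketched as below. This is a minimal illustration under assumptions, not the code in `analysis-pipeline.ts`: `mapWithConcurrency` and `chunk` are hypothetical helper names introduced here.

```typescript
// Hypothetical helpers; names and signatures are assumptions, not taken from the repo.

/** Run `fn` over `items` with at most `limit` calls in flight; results keep input order. */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++; // no await between the check and the increment, so no index is taken twice
      results[i] = await fn(items[i], i);
    }
  };
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

/** Split `items` into consecutive batches of at most `size` elements. */
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be >= 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}
```

A sellability gate would then look roughly like `await mapWithConcurrency(asins, 5, checkSellability)`, and the LLM stage would iterate over `chunk(products, 5)`, where `checkSellability` stands in for whatever the SP-API client exposes.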
-### Supplier UPC Pipeline (`src/upc-file-analysis.ts`)
+### Supplier UPC Pipeline (`src/supplier/upc-file-analysis.ts`)
For supplier price lists containing UPC/EAN values. Verdict is deterministic (BUY/WATCH/SKIP); never calls LM Studio.
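"Deterministic" here means the verdict is a pure function of the fetched data: the same inputs always yield the same BUY/WATCH/SKIP result. The sketch below illustrates that shape using a demand score and a competition penalty, as named in the commit message. Every field name, weight, and threshold is an illustrative assumption; the real values live in `supplier-scoring.ts` and are not shown in this diff.

```typescript
// Illustrative only: field names, weights, and thresholds are assumptions.
type Verdict = "BUY" | "WATCH" | "SKIP";

interface ScoreInput {
  monthlySold: number; // estimated units sold per month
  sellerCount: number; // competing offers on the listing
  roiPercent: number;  // return on investment after fees
}

/** 0..60 points, saturating at 300 units/month (assumed scale). */
function demandScore(monthlySold: number): number {
  return Math.min(60, Math.max(0, monthlySold) / 5);
}

/** 2 points per seller beyond the third, capped at 30 (assumed scale). */
function competitionPenalty(sellerCount: number): number {
  return Math.min(30, Math.max(0, sellerCount - 3) * 2);
}

/** Pure function: no I/O, no randomness, no model call. */
function scoreSupplierProduct(input: ScoreInput): { score: number; verdict: Verdict } {
  if (input.roiPercent <= 0) return { score: 0, verdict: "SKIP" }; // unprofitable, skip outright
  const score =
    demandScore(input.monthlySold) +
    Math.min(40, input.roiPercent) -
    competitionPenalty(input.sellerCount);
  const verdict: Verdict = score >= 60 ? "BUY" : score >= 35 ? "WATCH" : "SKIP";
  return { score, verdict };
}
```

Because the function has no side effects, it can be unit-tested with plain fixtures, which is presumably what `supplier-scoring.test.ts` does for the real logic.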
-Flow: `upc-file-reader.ts` streaming parse (`.xlsx`) or row-window parse (`.xls`) → SP-API catalog UPC lookup first, Keepa UPC lookup as fallback → `keepa.ts` demand enrichment → `sp-api.ts` sellability + FBA fees → `supplier-scoring.ts` deterministic score → `supplier-export.ts` Excel workbook (`Ranked Leads`, `Skipped`, `Summary` sheets) + SQLite.
+Flow: `supplier/upc-file-reader.ts` streaming parse (`.xlsx`) or row-window parse (`.xls`) → SP-API catalog UPC lookup first, Keepa UPC lookup as fallback → `integrations/keepa.ts` demand enrichment → `integrations/sp-api.ts` sellability + FBA fees → `supplier/supplier-scoring.ts` deterministic score → `supplier/supplier-export.ts` Excel workbook (`Ranked Leads`, `Skipped`, `Summary` sheets) + Postgres.
UPC resolution priority: SP-API catalog lookup → Keepa fallback (for no-match or request failure only).
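The resolution priority above can be sketched as a small function that takes both lookups as parameters. This is a hedged illustration, not the pipeline's actual code: `resolveUpc`, `LookupResult`, and the `source` field are hypothetical, and the two lookup functions stand in for whatever the SP-API and Keepa clients expose.

```typescript
// Hypothetical types and names; the real clients may return different shapes.
type LookupResult =
  | { status: "match"; asins: string[] }
  | { status: "no-match" };

type UpcLookup = (upc: string) => Promise<LookupResult>;

interface Resolution {
  upc: string;
  asins: string[];
  source: "sp-api" | "keepa" | "none";
}

async function resolveUpc(
  upc: string,
  spApiLookup: UpcLookup,
  keepaLookup: UpcLookup,
): Promise<Resolution> {
  try {
    const primary = await spApiLookup(upc);
    if (primary.status === "match" && primary.asins.length > 0) {
      return { upc, asins: primary.asins, source: "sp-api" };
    }
    // No match from SP-API: fall through to Keepa.
  } catch {
    // SP-API request failed: fall through to Keepa.
  }
  // In this sketch a Keepa failure is not caught and propagates to the caller.
  const fallback = await keepaLookup(upc);
  if (fallback.status === "match" && fallback.asins.length > 0) {
    return { upc, asins: fallback.asins, source: "keepa" };
  }
  return { upc, asins: [], source: "none" };
}
```

Passing the lookups in as arguments keeps the fallback order testable with stubs, without touching either API.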
### Category Pipelines
-`bestsellers-by-category.ts`, `top-monthly-sold-by-category.ts`, `mid-range-sellers-by-category.ts` — Keepa category browsing → SP-API sellability gate → LLM verdict. Each saves results to SQLite. Mid-range applies configurable filters (monthly sold, price, seller count, Amazon buy box share).
+`src/categories/` — Keepa category browsing → SP-API sellability gate → LLM verdict. Each saves results to Postgres. Mid-range applies configurable filters (monthly sold, price, seller count, Amazon buy box share).
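The four mid-range filters named above could be expressed as a single predicate over a candidate product. The sketch below is an assumption about shape only: the interface names, field names, and the choice of min/max bounds are hypothetical and are not taken from the mid-range pipeline.

```typescript
// Hypothetical shapes; the real config keys and bounds are not shown in this diff.
interface MidRangeFilters {
  minMonthlySold: number;
  maxMonthlySold: number;
  minPrice: number;
  maxPrice: number;
  maxSellerCount: number;
  maxAmazonBuyBoxShare: number; // fraction in 0..1
}

interface Candidate {
  asin: string;
  monthlySold: number;
  price: number;
  sellerCount: number;
  amazonBuyBoxShare: number; // fraction in 0..1
}

/** True only when the candidate satisfies every configured bound. */
function passesMidRange(c: Candidate, f: MidRangeFilters): boolean {
  return (
    c.monthlySold >= f.minMonthlySold &&
    c.monthlySold <= f.maxMonthlySold &&
    c.price >= f.minPrice &&
    c.price <= f.maxPrice &&
    c.sellerCount <= f.maxSellerCount &&
    c.amazonBuyBoxShare <= f.maxAmazonBuyBoxShare
  );
}
```

Running this predicate before the SP-API sellability gate would avoid spending API calls on products that can never qualify; whether the pipeline orders the steps that way is not stated here.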
### Stalker Pipeline (`src/stalker/stalker.ts`)
Tracks competitor sellers across ASINs. Fetches storefronts, checks sellability of inventory items, and persists matched seller data to Postgres.
### Shared Infrastructure
@@ -77,18 +88,23 @@ UPC resolution priority: SP-API catalog lookup → Keepa fallback (for no-match
|--------|------|
| `src/types.ts` | All shared interfaces (`ProductRecord`, `KeepaData`, `SpApiData`, `SupplierScore`, etc.) |
| `src/config.ts` | Env var loading via `Bun.env` |
-| `src/keepa.ts` | Keepa API: batch ASIN fetch, UPC lookup, auto rate-limiting on token exhaustion |
-| `src/sp-api.ts` | SP-API: sellability (`getListingsRestrictions`), pricing+fees, UPC catalog lookup |
-| `src/cache.ts` | Redis caching (24h TTL for lead-list; 12h for mid-range) |
-| `src/database.ts` | SQLite `runs` + `results` tables; auto-creates `db/results.db` |
+| `src/db/index.ts` | Drizzle Postgres connection (shared pool) |
+| `src/db/schema.ts` | Drizzle schema for all tables |
+| `src/integrations/keepa.ts` | Keepa API: batch ASIN fetch, UPC lookup, auto rate-limiting |
+| `src/integrations/sp-api.ts` | SP-API: sellability, pricing+fees, UPC catalog lookup |
+| `src/integrations/cache.ts` | Redis caching (24h TTL for lead-list; 12h for mid-range) |
+| `src/integrations/llm.ts` | LLM integration (LM Studio / Claude) |
| `src/server.ts` | Bun HTTP server exposing REST endpoints for both pipelines |
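The per-pipeline TTLs in the cache row above (24h for lead-list, 12h for mid-range) can be illustrated with an in-memory stand-in for Redis. This is a sketch under assumptions: `TtlCache`, the `pipeline:key` naming, and the injectable clock are hypothetical, and the real module talks to Redis rather than a `Map`.

```typescript
// In-memory stand-in for Redis, for illustration only; key format is an assumption.
const TTL_SECONDS = {
  "lead-list": 24 * 60 * 60, // 24h
  "mid-range": 12 * 60 * 60, // 12h
} as const;

type Pipeline = keyof typeof TTL_SECONDS;

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  /** `now` returns milliseconds since the epoch; injectable so expiry can be tested. */
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(pipeline: Pipeline, key: string, value: V): void {
    const expiresAt = this.now() + TTL_SECONDS[pipeline] * 1000;
    this.entries.set(`${pipeline}:${key}`, { value, expiresAt });
  }

  get(pipeline: Pipeline, key: string): V | undefined {
    const id = `${pipeline}:${key}`;
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(id); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

With a real Redis client, the same effect usually comes from setting an expiry on each key at write time, so no manual expiry check is needed on read.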
### File Layout
- `src/integrations/` — external API clients (Keepa, SP-API, Redis cache, LLM, SearXNG)
- `src/categories/` — category discovery pipelines
- `src/stalker/` — competitor seller tracking pipeline
- `src/supplier/` — supplier UPC analysis pipeline
- `src/db/` — Drizzle schema and connection
- `input/` — source spreadsheets (git-ignored)
- `output/` — generated workbooks (git-ignored)
- `db/` — SQLite files (git-ignored)
- `src/` — all source and test files
## Project Rules