feat: add UPC to ASIN mapping and large file UPC analysis
Introduces the capability to resolve UPCs to ASINs using the Keepa API. This includes a new `upc-file` command for processing large Excel files of UPCs, a `upc` CLI tool for quick lookups, and API endpoints for web-based integration. The analysis pipeline was refactored into a reusable module to support both standard ASIN leads and new UPC-driven workflows.
89
README.md
@@ -45,6 +45,95 @@ bun run src/sp-test.ts B07SN9BHVV # Auth + sellers endpoint + pricing offer c
bun run src/sp-test.ts --sellability B07SN9BHVV # Standalone sellability check
```

## UPC to ASIN Mapping

You can map UPCs to ASINs directly through the Keepa integration in `src/keepa.ts`.

```ts
import { mapUpcsToAsins, lookupKeepaUpcs } from "./src/keepa.ts";

const upcs = ["012345678901", "098765432109", "112233445566"];

// Simple map output (UPC -> ASIN) for clean one-to-one matches only.
const asinMap = await mapUpcsToAsins(upcs);
for (const [upc, asin] of asinMap.entries()) {
  console.log(`UPC ${upc} -> ASIN ${asin}`);
}

// Rich output includes status for every UPC (invalid, not found, collisions, etc.).
const details = await lookupKeepaUpcs(upcs);
for (const [upc, detail] of details.entries()) {
  console.log(upc, detail.status, detail.asin, detail.reason ?? "");
}
```

Behavior:

- Strict validation accepts only 12-, 13-, or 14-digit UPC values.
- If a UPC resolves to multiple ASINs, it is excluded from the simple map.
- The rich lookup returns all candidate ASINs and a status for every UPC.

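The rules above can be sketched in a few lines. This is a minimal illustration, not the code in `src/keepa.ts`: `isValidUpc`, `toSimpleMap`, and the `Map<string, string[]>` candidate shape are hypothetical names chosen for this example, and treating repeated copies of the same ASIN as a single match is an assumption.

```ts
// Hypothetical helper: accepts only strings made of exactly 12, 13, or 14 ASCII digits.
// No trimming or separator stripping is done here; that is an assumption of this sketch.
function isValidUpc(value: string): boolean {
  return /^\d{12,14}$/.test(value);
}

// Hypothetical helper: keeps only valid UPCs that resolve to exactly one distinct ASIN.
// UPCs with zero candidates or with several different ASINs are left out.
function toSimpleMap(candidates: Map<string, string[]>): Map<string, string> {
  const simple = new Map<string, string>();
  candidates.forEach((asins, upc) => {
    if (!isValidUpc(upc)) return;
    const unique = Array.from(new Set(asins));
    const only = unique[0];
    if (unique.length === 1 && only !== undefined) {
      simple.set(upc, only);
    }
  });
  return simple;
}
```

With this sketch, a UPC that maps to two different ASINs is dropped from the simple map, which mirrors the collision rule above. The rich lookup is where such candidates would remain visible.
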
CLI usage:

```bash
bun run upc 012345678901 098765432109
bun run upc 012345678901,098765432109 --detailed
bun run upc --file upcs.txt --detailed --json
```

API usage (when `bun run start:web` is running):

```bash
# Simple one-to-one mapping (GET)
curl "http://localhost:3000/api/upc/map?upc=012345678901&upc=098765432109"

# Detailed lookup with statuses (GET)
curl "http://localhost:3000/api/upc/lookup?upcs=012345678901,098765432109"

# Detailed lookup (POST JSON)
curl -X POST "http://localhost:3000/api/upc/lookup" \
  -H "content-type: application/json" \
  -d '{"upcs":["012345678901","098765432109"]}'
```

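The same requests can be issued from TypeScript. The following is a rough sketch that assumes the endpoints and parameter names shown in the curl examples; the helper names (`buildUpcMapUrl`, `buildUpcLookupUrl`, `postUpcLookup`) are hypothetical, and because the response shape is not documented here it is returned as `unknown`.

```ts
// Hypothetical helper: builds the GET URL for the simple map, repeating the `upc` parameter.
function buildUpcMapUrl(baseUrl: string, upcs: string[]): string {
  const url = new URL("/api/upc/map", baseUrl);
  for (const upc of upcs) {
    url.searchParams.append("upc", upc);
  }
  return url.toString();
}

// Hypothetical helper: builds the GET URL for the detailed lookup with a comma-separated list.
// URLSearchParams percent-encodes the commas as %2C; this sketch assumes the server
// decodes the query string before splitting it.
function buildUpcLookupUrl(baseUrl: string, upcs: string[]): string {
  const url = new URL("/api/upc/lookup", baseUrl);
  url.searchParams.set("upcs", upcs.join(","));
  return url.toString();
}

// Hypothetical helper: sends the POST variant of the detailed lookup.
async function postUpcLookup(baseUrl: string, upcs: string[]): Promise<unknown> {
  const response = await fetch(new URL("/api/upc/lookup", baseUrl).toString(), {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ upcs }),
  });
  if (!response.ok) {
    throw new Error(`UPC lookup failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Callers would need to narrow the `unknown` result themselves, for example by checking the fields they rely on before use.
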
## Large UPC File Analysis (XLS/XLSX)

For very large Excel files that contain UPC values, use the dedicated UPC-file process. It processes the file in batches:

1. Reads UPC rows in batches (`.xlsx` uses a streaming reader; `.xls` falls back to row-window parsing).
2. Resolves UPCs to ASINs with Keepa.
3. Runs the same sellability + Keepa/SP-API enrichment + LLM verdict pipeline as lead analysis.
4. Persists output into the existing `runs` + `results` tables, so it appears in the current reporting APIs/UI.

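One way to picture the batching in the steps above is as two nested levels of chunking: rows are grouped into input batches, and each input batch is split again into smaller groups for Keepa lookups. The sketch below only illustrates that nesting under this assumption. `chunk` and `planBatches` are hypothetical helpers, and the sketch takes an in-memory array, whereas the process described above reads the file incrementally.

```ts
// Hypothetical helper: splits a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    chunks.push(items.slice(start, start + size));
  }
  return chunks;
}

// Hypothetical helper: for each input batch, lists the UPC groups that would be
// sent to Keepa in a single lookup call.
function planBatches(
  upcs: readonly string[],
  inputBatchSize: number,
  upcLookupBatchSize: number,
): string[][][] {
  return chunk(upcs, inputBatchSize).map((inputBatch) =>
    chunk(inputBatch, upcLookupBatchSize),
  );
}
```

For example, five UPCs with an input batch size of 3 and a lookup batch size of 2 yield two input batches: the first needs two lookup calls and the second needs one.
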
CLI usage:

```bash
bun run upc-file --input huge-upcs.xlsx
bun run upc-file --input huge-upcs.xls --input-batch-size 500 --upc-lookup-batch-size 100 --max-rows 10000
```

API usage (when `bun run start:web` is running):

```bash
curl -X POST "http://localhost:3000/api/process/upc-file" \
  -H "content-type: application/json" \
  -d '{
    "inputFile": "/absolute/path/to/huge-upcs.xlsx",
    "inputBatchSize": 300,
    "upcLookupBatchSize": 100
  }'
```

Request body fields:

- `inputFile` (required): server-local path to a `.xls` or `.xlsx` file.
- `outputFile` (optional): stored in run metadata.
- `inputBatchSize` (optional): number of input rows per processing batch (default `200`).
- `upcLookupBatchSize` (optional): number of UPCs per Keepa lookup call (default `100`).
- `maxRows` (optional): caps the number of valid UPC rows processed, for dry runs.

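The fields above could be modelled as a TypeScript type with the documented defaults applied. This is a sketch under stated assumptions rather than the server's actual handler: `UpcFileRequest`, `ResolvedUpcFileRequest`, and `resolveUpcFileRequest` are hypothetical names, and the case-insensitive extension check is an assumption.

```ts
// Hypothetical request type mirroring the documented body fields.
interface UpcFileRequest {
  inputFile: string;
  outputFile?: string;
  inputBatchSize?: number;
  upcLookupBatchSize?: number;
  maxRows?: number;
}

// Hypothetical resolved form: batch sizes are always set, the rest may be absent.
interface ResolvedUpcFileRequest {
  inputFile: string;
  outputFile: string | undefined;
  inputBatchSize: number;
  upcLookupBatchSize: number;
  maxRows: number | undefined;
}

// Applies the documented defaults (200 rows per batch, 100 UPCs per lookup call).
function resolveUpcFileRequest(body: UpcFileRequest): ResolvedUpcFileRequest {
  // Assumption: the extension is matched case-insensitively.
  if (!/\.xlsx?$/i.test(body.inputFile)) {
    throw new Error("inputFile must point to a .xls or .xlsx file");
  }
  return {
    inputFile: body.inputFile,
    outputFile: body.outputFile,
    inputBatchSize: body.inputBatchSize ?? 200,
    upcLookupBatchSize: body.upcLookupBatchSize ?? 100,
    maxRows: body.maxRows,
  };
}
```

Resolving the defaults in one place keeps the rest of a pipeline free of `undefined` checks for the two batch sizes.
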
The response contains run metadata and status counts, including reasons for unresolved UPCs and lead verdict totals.

## Input file format

Accepts `.csv` or `.xlsx` files. Column names are matched case-insensitively. Required column: