Introduces the capability to resolve UPCs to ASINs using the Keepa API. This includes a new `upc-file` command for processing large Excel files of UPCs, a `upc` CLI tool for quick lookups, and API endpoints for web-based integration. The analysis pipeline was refactored into a reusable module to support both standard ASIN leads and new UPC-driven workflows.
# asin-check

Amazon product analysis and lead finder agent. Reads product leads from a CSV/XLSX file, enriches them with Keepa pricing and sales data, caches results in Redis, and runs each product through a local LLM to get an FBA/FBM/SKIP verdict.

## Requirements

- [Bun](https://bun.com) runtime
- Redis (local or Docker)
- [LM Studio](https://lmstudio.ai) running locally with a model loaded
- Keepa API key ([keepa.com](https://keepa.com))
- Amazon SP-API private app credentials (LWA + refresh token + IAM)

## Setup

```bash
bun install
cp .env.example .env
# Edit .env and set your KEEPA_API_KEY and SP-API credentials
```

## Usage

```bash
bun run src/index.ts <input.csv|xlsx> [--out results.csv]
```

Examples:

```bash
bun run src/index.ts leads.xlsx
bun run src/index.ts leads.csv --out results.xlsx
```

Large-file behavior:

- If the input has more than 50 products, processing is done in chunks of 50.
- Each chunk is analyzed and written to a numbered output file, for example: `results_part_001.xlsx`, `results_part_002.xlsx`, ...
- If `--out` is omitted for large files, the base output name defaults to `<input>_results.xlsx` and chunk files are still written with numbered suffixes.

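
The chunking and numbered-suffix naming described above can be sketched as follows. This is a minimal illustration, not the code in `src/index.ts`: the helper names `chunk` and `partFileName` are hypothetical, and only the chunk size of 50 and the `_part_NNN` suffix pattern come from the list above.

```typescript
// Hypothetical helpers; the names are illustrative and not taken from the project.
const CHUNK_SIZE = 50; // chunk size stated in the README

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Insert a zero-padded part number before the file extension.
function partFileName(base: string, part: number): string {
  const dot = base.lastIndexOf(".");
  const stem = dot > 0 ? base.slice(0, dot) : base;
  const ext = dot > 0 ? base.slice(dot) : "";
  return `${stem}_part_${String(part).padStart(3, "0")}${ext}`;
}

// 120 placeholder ASINs produce three chunks and three numbered output names.
const placeholderAsins = Array.from({ length: 120 }, (_, i) => `ASIN${i}`);
const outputNames = chunk(placeholderAsins, CHUNK_SIZE).map((_, i) =>
  partFileName("results.xlsx", i + 1),
);
console.log(outputNames);
```
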
Quick SP-API connectivity tests:

```bash
bun run src/sp-test.ts                            # Auth + sellers endpoint
bun run src/sp-test.ts B07SN9BHVV                 # Auth + sellers endpoint + pricing offer check
bun run src/sp-test.ts --sellability B07SN9BHVV   # Standalone sellability check
```

## UPC to ASIN Mapping

You can map UPCs to ASINs directly through the Keepa integration in `src/keepa.ts`.

```ts
import { mapUpcsToAsins, lookupKeepaUpcs } from "./src/keepa.ts";

const upcs = ["012345678901", "098765432109", "112233445566"];

// Simple map output (UPC -> ASIN) for clean one-to-one matches only.
const asinMap = await mapUpcsToAsins(upcs);
for (const [upc, asin] of asinMap.entries()) {
  console.log(`UPC ${upc} -> ASIN ${asin}`);
}

// Rich output includes status for every UPC (invalid, not found, collisions, etc.).
const details = await lookupKeepaUpcs(upcs);
for (const [upc, detail] of details.entries()) {
  console.log(upc, detail.status, detail.asin, detail.reason ?? "");
}
```

Behavior:

- Strict validation accepts only 12-, 13-, or 14-digit UPC values.
- If a UPC resolves to multiple ASINs, it is excluded from the simple map.
- The rich lookup returns all candidate ASINs and a status for every UPC.

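
To make the behavior rules concrete, here is a small sketch of the digit-length check and of reducing candidate matches to a one-to-one map. It is an illustration under assumptions, not the logic in `src/keepa.ts`: `isValidUpc` and `toSimpleMap` are hypothetical names, and the real validation may apply further checks (for example, check digits).

```typescript
// Hypothetical sketch; not the implementation in src/keepa.ts.

// Accept only strings made of exactly 12, 13, or 14 digits.
function isValidUpc(value: string): boolean {
  return /^\d{12,14}$/.test(value);
}

// Keep only UPCs that are valid and resolve to exactly one ASIN.
function toSimpleMap(candidates: Map<string, string[]>): Map<string, string> {
  const simple = new Map<string, string>();
  for (const [upc, asins] of candidates) {
    const [only] = asins;
    if (isValidUpc(upc) && asins.length === 1 && only !== undefined) {
      simple.set(upc, only);
    }
  }
  return simple;
}

// Placeholder data: one clean match, one collision, one malformed UPC.
const sampleCandidates = new Map<string, string[]>([
  ["012345678901", ["B000000001"]],
  ["098765432109", ["B000000002", "B000000003"]],
  ["12345", ["B000000004"]],
]);
console.log([...toSimpleMap(sampleCandidates).entries()]);
```

Only the first entry survives: the second is a collision and the third fails the length check.
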
CLI usage:

```bash
bun run upc 012345678901 098765432109
bun run upc 012345678901,098765432109 --detailed
bun run upc --file upcs.txt --detailed --json
```

API usage (when `bun run start:web` is running):

```bash
# Simple one-to-one mapping (GET)
curl "http://localhost:3000/api/upc/map?upc=012345678901&upc=098765432109"

# Detailed lookup with statuses (GET)
curl "http://localhost:3000/api/upc/lookup?upcs=012345678901,098765432109"

# Detailed lookup (POST JSON)
curl -X POST "http://localhost:3000/api/upc/lookup" \
  -H "content-type: application/json" \
  -d '{"upcs":["012345678901","098765432109"]}'
```

## Large UPC File Analysis (XLS/XLSX)

For very large Excel files that contain UPC values, use the dedicated UPC-file process. It runs in batches:

1. Reads UPC rows in batches (`.xlsx` uses a streaming reader; `.xls` falls back to row-window parsing).
2. Resolves UPCs to ASINs with Keepa.
3. Runs the same sellability + Keepa/SP-API enrichment + LLM verdict pipeline as lead analysis.
4. Persists output into the existing `runs` and `results` tables, so it appears in the current reporting APIs/UI.

CLI usage:

```bash
bun run upc-file --input huge-upcs.xlsx
bun run upc-file --input huge-upcs.xls --input-batch-size 500 --upc-lookup-batch-size 100 --max-rows 10000
```

API usage (when `bun run start:web` is running):

```bash
curl -X POST "http://localhost:3000/api/process/upc-file" \
  -H "content-type: application/json" \
  -d '{
    "inputFile": "/absolute/path/to/huge-upcs.xlsx",
    "inputBatchSize": 300,
    "upcLookupBatchSize": 100
  }'
```

Request body fields:

- `inputFile` (required): server-local path to an `.xls` or `.xlsx` file.
- `outputFile` (optional): stored in run metadata.
- `inputBatchSize` (optional): number of input rows per processing batch (default `200`).
- `upcLookupBatchSize` (optional): UPC chunk size per Keepa lookup call (default `100`).
- `maxRows` (optional): caps the number of valid UPC rows processed; useful for dry runs.

The response includes run metadata and status counts, including unresolved UPC reasons and lead verdict totals.

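
The request body fields and their defaults can be pictured with a small normalizer like the one below. It is a sketch under assumptions: `normalizeUpcFileRequest` and the `UpcFileRequest` type are hypothetical names, the error messages are invented, and how the real endpoint treats invalid values (reject versus fall back to the default) is not documented here.

```typescript
// Hypothetical request normalizer; names and error handling are assumptions.
interface UpcFileRequest {
  inputFile: string;
  outputFile?: string | undefined;
  inputBatchSize: number;
  upcLookupBatchSize: number;
  maxRows?: number | undefined;
}

// Use the value when it is a positive integer, otherwise the fallback.
function positiveIntOr(value: unknown, fallback: number): number {
  return typeof value === "number" && Number.isInteger(value) && value > 0
    ? value
    : fallback;
}

function normalizeUpcFileRequest(body: unknown): UpcFileRequest {
  if (typeof body !== "object" || body === null) {
    throw new Error("request body must be a JSON object");
  }
  const raw = body as Record<string, unknown>;
  const inputFile = raw["inputFile"];
  if (typeof inputFile !== "string" || !/\.xlsx?$/i.test(inputFile)) {
    throw new Error("inputFile must be a path to an .xls or .xlsx file");
  }
  const outputFile = raw["outputFile"];
  const maxRows = raw["maxRows"];
  return {
    inputFile,
    outputFile: typeof outputFile === "string" ? outputFile : undefined,
    inputBatchSize: positiveIntOr(raw["inputBatchSize"], 200), // documented default
    upcLookupBatchSize: positiveIntOr(raw["upcLookupBatchSize"], 100), // documented default
    maxRows: typeof maxRows === "number" && maxRows > 0 ? Math.floor(maxRows) : undefined,
  };
}

console.log(
  normalizeUpcFileRequest({ inputFile: "/absolute/path/to/huge-upcs.xlsx", inputBatchSize: 300 }),
);
```
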
## Input file format

Accepts `.csv` or `.xlsx` files. Column names are matched case-insensitively. Required column:

| Column | Aliases |
| ------ | ------- |
| ASIN   | —       |

Optional but recommended:

| Column          | Aliases               |
| --------------- | --------------------- |
| Product Name    | Name, Title           |
| Unit Cost       | Cost, Price, Buy Cost |
| Brand           | —                     |
| Category        | —                     |
| Amazon Rank     | BSR, Sales Rank       |
| FBA NET         | —                     |
| Gross Profit $  | Gross Profit          |
| Gross Profit %  | —                     |
| MOQ             | Min Order Qty         |
| MOQ Cost        | —                     |
| Total Qty Avail | Qty Available         |
| Link            | URL, Source           |

Lead-list format aliases (supported):

| Column            | Aliases                                    |
| ----------------- | ------------------------------------------ |
| Name              | Product Name, Title, Product Title         |
| ASIN Link         | ASIN URL, Amazon Link                      |
| Source URL        | Source Link, Supplier URL                  |
| 90 Day Average    | 90-day Average, Avg Price 90d, 90d Average |
| Cost              | Unit Cost, Buy Cost, Price                 |
| Selling Price     | Sale Price, Sell Price                     |
| Net Profit        | Gross Profit                               |
| ROI               | Gross Profit %, Return on Investment       |
| Supplier          | Vendor                                     |
| Promo/Coupon Code | Promo Code, Coupon Code                    |
| Notes             | Note                                       |
| Date              | Lead Date                                  |

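
Case-insensitive alias matching of the kind shown in the tables above could look roughly like this. The sketch covers only a few of the listed columns, and `buildHeaderIndex` and `resolveHeader` are hypothetical names rather than functions from the project.

```typescript
// Hypothetical alias table; a subset of the columns listed in the README.
const COLUMN_ALIASES: Record<string, string[]> = {
  "ASIN": [],
  "Product Name": ["Name", "Title"],
  "Unit Cost": ["Cost", "Price", "Buy Cost"],
  "Amazon Rank": ["BSR", "Sales Rank"],
  "MOQ": ["Min Order Qty"],
};

// Map every lower-cased canonical name and alias to its canonical column.
function buildHeaderIndex(aliases: Record<string, string[]>): Map<string, string> {
  const index = new Map<string, string>();
  for (const [canonical, names] of Object.entries(aliases)) {
    for (const name of [canonical, ...names]) {
      index.set(name.toLowerCase(), canonical);
    }
  }
  return index;
}

// Resolve a spreadsheet header to its canonical column, if one is known.
function resolveHeader(header: string, index: Map<string, string>): string | undefined {
  return index.get(header.trim().toLowerCase());
}

const headerIndex = buildHeaderIndex(COLUMN_ALIASES);
for (const header of ["asin", " TITLE ", "bsr", "Colour"]) {
  console.log(JSON.stringify(header), "->", resolveHeader(header, headerIndex));
}
```

Unknown headers resolve to `undefined`, which leaves the caller free to ignore them or report them.
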
Numeric parsing accepts plain numbers as well as formatted values like `$12.50`, `1,209.60`, and `27.5%`.

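
A tolerant numeric parser along these lines would handle the formatted values mentioned above. This is a sketch, not the project's parser: `parseNumeric` is a hypothetical name, and returning `27.5` (rather than `0.275`) for a percentage is an assumption.

```typescript
// Hypothetical parser: strips "$", ",", "%" and whitespace, then converts to a number.
function parseNumeric(raw: string | number | null | undefined): number | undefined {
  if (typeof raw === "number") return Number.isFinite(raw) ? raw : undefined;
  if (raw === null || raw === undefined) return undefined;
  const cleaned = raw.replace(/[$,%\s]/g, "");
  if (cleaned === "") return undefined;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : undefined;
}

console.log(parseNumeric("$12.50"));   // 12.5
console.log(parseNumeric("1,209.60")); // 1209.6
console.log(parseNumeric("27.5%"));    // 27.5
console.log(parseNumeric("n/a"));      // undefined
```
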
## Pipeline

1. **Read**: parse the input file and validate ASINs.
2. **Cache check**: look up each ASIN in Redis (24h TTL by default).
3. **Sellability gate**: check all uncached ASINs against SP-API `getListingsRestrictions` (concurrency: 5 workers); immediately skip ASINs with status `not_available` and `canSell=false`, so no Keepa tokens or fee calls are spent on them.
4. **Keepa fetch**: batch the sellable (uncached) ASINs into a single API call (up to 100 per request).
5. **Enrich**: fetch SP-API pricing + FBA/FBM fees for sellable ASINs; combine with Keepa data and spreadsheet data.
6. **LLM analysis**: send batches of 5 sellable products to LM Studio for an FBA/FBM/SKIP verdict; skipped ASINs get an automatic SKIP verdict (confidence 100) and bypass the LLM entirely.
7. **Output**: print the results table to the console (includes all ASINs), optionally write CSV/XLSX, and **persist results to a SQLite database**.

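
The "5 workers" limit in the sellability gate can be illustrated with a generic worker-pool helper. This is a sketch of the general technique, not the code in `src/sp-api.ts`; `mapWithConcurrency` and `fakeSellabilityCheck` are hypothetical names, and the fake check only simulates latency.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Placeholder check that simulates latency instead of calling SP-API.
async function fakeSellabilityCheck(asin: string): Promise<{ asin: string; canSell: boolean }> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return { asin, canSell: true };
}

const demoAsins = Array.from({ length: 8 }, (_, i) => `ASIN${i}`);
void mapWithConcurrency(demoAsins, 5, fakeSellabilityCheck).then((checked) => {
  console.log(checked.length, "ASINs checked");
});
```

Because JavaScript runs these callbacks on a single thread, claiming an index with `next++` needs no lock; the limit comes purely from starting only five runners.
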
## Persistent Storage with SQLite

Results from each run are stored in a SQLite database named `results.db` in the project root. The SQLite implementation details are handled in `src/database.ts`. This allows you to:

- Revisit past analysis results.
- Query and analyze historical data.
- Track product performance over time.

The database is created automatically if it does not exist. Two tables are created:

- `runs`: Stores metadata about each analysis run (timestamp, input file, output file, and summary counts).
- `results`: Stores detailed analysis results for each product from each run, linked to the `runs` table.

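
As a rough picture of how the two tables relate, the sketch below derives per-run verdict counts from result rows. The row shape and field names (`runId`, `verdict`, `confidence`) are assumptions made for illustration; the actual columns are defined in `src/database.ts` and may differ.

```typescript
// Hypothetical row shape; real column names live in src/database.ts.
type Verdict = "FBA" | "FBM" | "SKIP";

interface ResultRow {
  runId: number; // links the result to a row in `runs`
  asin: string;
  verdict: Verdict;
  confidence: number;
}

// Count verdicts for one run, similar in spirit to the summary counts stored in `runs`.
function summarizeRun(rows: readonly ResultRow[], runId: number): Record<Verdict, number> {
  const counts: Record<Verdict, number> = { FBA: 0, FBM: 0, SKIP: 0 };
  for (const row of rows) {
    if (row.runId === runId) counts[row.verdict] += 1;
  }
  return counts;
}

// Placeholder rows for two runs.
const sampleRows: ResultRow[] = [
  { runId: 1, asin: "B000000001", verdict: "FBA", confidence: 80 },
  { runId: 1, asin: "B000000002", verdict: "SKIP", confidence: 100 },
  { runId: 2, asin: "B000000003", verdict: "FBM", confidence: 65 },
];
console.log(summarizeRun(sampleRows, 1));
```
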
## Output columns

ASIN, Name, Brand, Category, Unit Cost, Current Price, Avg Price 90d, Sales Rank, Rank Avg 90d, Sellers, Monthly Sold, Rank Drops 30d, Rank Drops 90d, FBA Net (sheet), Gross Profit $, Gross Profit %, MOQ, MOQ Cost, Qty Available, FBA Fee, FBM Fee, Referral %, Verdict, Confidence, Reasoning

## Environment variables

| Variable                | Default                    | Description                                                             |
| ----------------------- | -------------------------- | ----------------------------------------------------------------------- |
| `KEEPA_API_KEY`         | —                          | **Required.** Keepa API key                                             |
| `SP_API_CLIENT_ID`      | —                          | LWA app client id from Solution Provider Portal                         |
| `SP_API_CLIENT_SECRET`  | —                          | LWA app client secret from Solution Provider Portal                     |
| `SP_API_REFRESH_TOKEN`  | —                          | Refresh token from self-authorization                                   |
| `SP_API_REGION`         | `na`                       | SP-API endpoint region (`na`, `eu`, `fe`; `us` is accepted as `na`)     |
| `SP_API_MARKETPLACE_ID` | `ATVPDKIKX0DER`            | Marketplace id used for pricing and fee calls (default: US)             |
| `SP_API_SELLER_ID`      | —                          | Seller ID used for listing restrictions eligibility checks              |
| `SP_API_USE_SANDBOX`    | `false`                    | Enable SP-API sandbox mode (`true`/`false`)                             |
| `AWS_ACCESS_KEY_ID`     | —                          | AWS credentials for SigV4 signing (required in most private app setups) |
| `AWS_SECRET_ACCESS_KEY` | —                          | AWS credentials for SigV4 signing                                       |
| `AWS_SESSION_TOKEN`     | —                          | Optional session token when using STS credentials                       |
| `REDIS_URL`             | `redis://localhost:6379`   | Redis connection URL                                                    |
| `LLM_URL`               | `http://localhost:1234/v1` | LM Studio API base URL                                                  |
| `LLM_MODEL`             | `default`                  | Model name to pass to LM Studio                                         |
| `CACHE_TTL`             | `86400`                    | Redis cache TTL in seconds                                              |

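
A configuration loader that applies the defaults from the table might look like the sketch below. It is illustrative only: `loadConfig` and `normalizeRegion` are hypothetical names, only a subset of variables is shown, and throwing on an unknown region is an assumption about behavior the table does not specify.

```typescript
// Hypothetical config loader; defaults are copied from the table above.
type SpApiRegion = "na" | "eu" | "fe";

// Accept `na`, `eu`, `fe`, and treat `us` as an alias for `na`.
function normalizeRegion(value: string | undefined): SpApiRegion {
  const region = (value ?? "na").trim().toLowerCase();
  if (region === "us") return "na";
  if (region === "na" || region === "eu" || region === "fe") return region;
  throw new Error(`Unsupported SP_API_REGION: ${value}`);
}

function loadConfig(env: Record<string, string | undefined>) {
  const keepaApiKey = env["KEEPA_API_KEY"];
  if (!keepaApiKey) throw new Error("KEEPA_API_KEY is required");
  return {
    keepaApiKey,
    spApiRegion: normalizeRegion(env["SP_API_REGION"]),
    spApiMarketplaceId: env["SP_API_MARKETPLACE_ID"] ?? "ATVPDKIKX0DER",
    spApiUseSandbox: (env["SP_API_USE_SANDBOX"] ?? "false").toLowerCase() === "true",
    redisUrl: env["REDIS_URL"] ?? "redis://localhost:6379",
    llmUrl: env["LLM_URL"] ?? "http://localhost:1234/v1",
    llmModel: env["LLM_MODEL"] ?? "default",
    cacheTtlSeconds: Number(env["CACHE_TTL"] ?? "86400"),
  };
}

// Example with a placeholder key; in the tool the values would come from `.env`.
console.log(loadConfig({ KEEPA_API_KEY: "placeholder-key", SP_API_REGION: "us" }));
```
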
## Notes

- **Available-only processing**: SP-API `getListingsRestrictions` is checked first and only ASINs with `sellabilityStatus=available` are enriched, analyzed, and included in outputs. Items with status `restricted`, `not_available`, or `unknown` are excluded.
- **SP-API concurrency**: `fetchSellabilityBatch` limits concurrent requests to 5 workers to avoid 429 throttling. Pricing+fees fetches also use 5 concurrent workers.
- **No batch endpoint**: Amazon SP-API does not provide batch endpoints for `getListingsRestrictions` or `getMyFeesEstimate*`. Concurrency limiting, together with the library's built-in `auto_request_throttled` safety net, prevents overwhelming the API.
- **Keepa rate limiting**: The client reads `tokensLeft` and `refillRate` from each API response and waits automatically when tokens are exhausted. With a Pro subscription (1 token/min), all 100 ASINs in a batch cost 1 token.
- **Redis is optional**: If Redis is unavailable, the tool runs without caching, so every run re-fetches from Keepa.
- **SP-API**: `src/sp-api.ts` provides the `fetchSellability`, `fetchSellabilityBatch`, and `fetchSpApiPricingAndFees` functions. If SP-API credentials are missing or a call fails, the tool falls back to conservative fee defaults and keeps processing.
- **Sandbox vs production**: When `SP_API_USE_SANDBOX=true`, production ASIN calls can be denied. Use sandbox-compatible test data or set it to `false` for live marketplace connectivity.
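
The Keepa rate-limiting note can be made concrete with a small wait-time calculation. This is a sketch, not the client in `src/keepa.ts`: `msUntilTokensAvailable` is a hypothetical name, and it assumes `refillRate` is expressed in tokens per minute.

```typescript
// Hypothetical helper: how long to wait before `tokensNeeded` tokens are available.
// Assumes `refillRatePerMinute` is the number of tokens added per minute.
function msUntilTokensAvailable(
  tokensLeft: number,
  refillRatePerMinute: number,
  tokensNeeded = 1,
): number {
  if (tokensLeft >= tokensNeeded) return 0; // enough tokens already
  if (refillRatePerMinute <= 0) return Number.POSITIVE_INFINITY; // bucket never refills
  const deficit = tokensNeeded - tokensLeft;
  return Math.ceil((deficit * 60_000) / refillRatePerMinute);
}

console.log(msUntilTokensAvailable(5, 1));  // 0
console.log(msUntilTokensAvailable(0, 1));  // 60000
console.log(msUntilTokensAvailable(-2, 5)); // 36000
```

A caller would sleep for the returned number of milliseconds before issuing the next request; a negative `tokensLeft` is treated as a deficit to pay back first.
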