Updates August 25
PGrid
- Working on a new version of SSD classification, hopefully the last iteration. It reuses the prompts from the ‘hybrid’ SSD classification pipeline, but gives better control over the workflow of the different classification steps. This would allow custom steps for each category and better handling of inputs and filters (e.g. excluding ignored brands for SSDs).
- Added centralised batch running for each stage of SSD classification. This allows one global config for parallel classification.
- Got through a full model classification session: over 4000 brand-classified SSDs were run through model classification, and ~3000 of them were also assigned a model. Scrolling through the results in the local frontend afterwards revealed some obvious issues with both the pricing data and the spec search results.
- From this, over 1000 variants were queued for specs search. Reduced the list to around 150 by deduplicating with GPT-5, using Cursor Agent and Claude Code. Used Perplexity to run the spec searches and populate the DB.
- Updated the custom proxy app used for scraping to support PureVPN as well as NordVPN.
- Added a review step for model classification. Now both variant and model classification have 2 steps.
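
A rough sketch of how the staged workflow described above could fit together: per-category step lists, a filter for ignored brands, a classify step followed by a review step, and one global batch config driving parallel runs. Every name here (`BatchConfig`, `make_brand_filter`, `classify_model`, `review_model`, `PIPELINES`) is hypothetical and not taken from the actual codebase, and the classify/review functions are trivial stand-ins for the real prompt-based calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional

Item = dict
Step = Callable[[Item], Optional[Item]]  # a step returning None drops the item


@dataclass(frozen=True)
class BatchConfig:
    """One global config shared by every stage (hypothetical fields)."""
    batch_size: int = 2
    max_workers: int = 4


def make_brand_filter(ignored: set) -> Step:
    """Build a filter step that drops items whose brand is ignored."""
    def step(item: Item) -> Optional[Item]:
        return None if item.get("brand") in ignored else item
    return step


def classify_model(item: Item) -> Item:
    """Stand-in for the real model classification call (assumption)."""
    model = item["title"].replace(item["brand"], "").strip() or None
    return {**item, "model": model}


def review_model(item: Item) -> Item:
    """Stand-in review step: records whether a model was assigned."""
    return {**item, "reviewed": item.get("model") is not None}


def run_steps(item: Item, steps: list) -> Optional[Item]:
    """Run one item through the steps in order, stopping if it is dropped."""
    for step in steps:
        item = step(item)
        if item is None:
            return None
    return item


def run_batch(batch: list, steps: list) -> list:
    """Process one batch and keep only the items that survived every step."""
    results = (run_steps(dict(item), steps) for item in batch)
    return [r for r in results if r is not None]


def run_pipeline(items: list, steps: list, config: BatchConfig) -> list:
    """Split items into batches and process the batches in parallel."""
    size = config.batch_size
    batches = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=config.max_workers) as pool:
        # pool.map preserves the input batch order
        done = list(pool.map(lambda b: run_batch(b, steps), batches))
    return [item for batch in done for item in batch]


# Custom steps per category; only an SSD pipeline is sketched here.
PIPELINES = {
    "ssd": [make_brand_filter({"NoName"}), classify_model, review_model],
}

items = [
    {"brand": "Samsung", "title": "Samsung 990 PRO 2TB"},
    {"brand": "NoName", "title": "NoName SSD 512GB"},
    {"brand": "Crucial", "title": "Crucial T500 1TB"},
]
classified = run_pipeline(items, PIPELINES["ssd"], BatchConfig())
```

Keeping the filter as an ordinary step means each category can place it wherever it makes sense in its own list, and the same `BatchConfig` can be passed to every stage so parallelism is tuned in one place.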