Updates February 2

PGrid

Spent most of the week trying to get GPU classification working in the new system up to the variant level. Too many GPU variant classifications are ending up in the specs queue, and my attempts to fix this haven’t been successful yet.
Found a test that didn’t mock BAML LLM calls; fixed all tests to mock BAML calls and added a global mock for BAML.
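For reference, a minimal sketch of what mocking an LLM call in tests can look like. The client class and its `ClassifyItem` method are hypothetical stand-ins, not the real generated BAML client, and the actual global mock may be wired differently.

```python
from contextlib import contextmanager
from unittest.mock import patch


class FakeBamlClient:
    """Hypothetical stand-in for the generated BAML client (not the real API)."""

    def ClassifyItem(self, text: str) -> str:
        # In production this would call the LLM; tests must never reach it.
        raise RuntimeError("real LLM call attempted in a test")


# Module-level client, the way application code might hold it.
b = FakeBamlClient()


def classify(text: str) -> str:
    """Application code that would normally trigger an LLM call."""
    return b.ClassifyItem(text)


@contextmanager
def mock_baml(default_result: str = "mocked"):
    """Replace the client's method with a mock for the duration of the block.

    In pytest this body could live in conftest.py inside a fixture declared
    with @pytest.fixture(autouse=True), so that every test gets the mock.
    """
    with patch.object(b, "ClassifyItem", return_value=default_result) as mocked:
        yield mocked
```

Inside `with mock_baml("gpu"):`, `classify(...)` returns the canned value. Outside it, the stand-in raises, which makes any unmocked call fail loudly.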
Added tracking for removed items in the generic item scraper. This increments fail counts for items that exist in our DB but don’t appear during scraping.
- This initially failed a lot of items. I wrote a script to manually verify some items and found that, because this feature hadn’t been implemented previously, a lot of failed links existed in the DB (but weren’t shown because price snapshots were stale).
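The removed-item tracking can be sketched roughly as below. The function name and the dict/set data shapes are made up for illustration, and resetting the count when an item reappears is an assumption, not something confirmed by the scraper code.

```python
def update_fail_counts(
    fail_counts: dict[str, int], scraped_urls: set[str]
) -> dict[str, int]:
    """Return updated fail counts after one scrape run.

    fail_counts maps each item URL known to the DB to its current fail count.
    scraped_urls is the set of item URLs seen during this run.
    """
    updated = {}
    for url, count in fail_counts.items():
        if url in scraped_urls:
            # Assumption: an item that shows up again starts from zero.
            updated[url] = 0
        else:
            # Item exists in the DB but did not appear during scraping.
            updated[url] = count + 1
    return updated
```

Items that are scraped but not yet in the DB are ignored here, since they have no fail count to update.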
Updated generic item crawling for normal sites to have configurable arguments. This allows more frequent crawling of categories we have classification for versus categories that are still running in the background to collect data.
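A rough sketch of the idea, assuming the configurable arguments boil down to a crawl interval and a page limit per category. The names and values below are illustrative only and not the real configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CrawlArgs:
    """Hypothetical per-category crawl settings."""

    interval: timedelta  # minimum time between crawls
    max_pages: int       # page limit per crawl


# Illustrative values: classified categories are crawled more often.
CLASSIFIED_ARGS = CrawlArgs(interval=timedelta(hours=6), max_pages=50)
BACKGROUND_ARGS = CrawlArgs(interval=timedelta(days=3), max_pages=10)


def crawl_args_for(category: str, classified: set[str]) -> CrawlArgs:
    """Pick crawl settings based on whether we can classify the category."""
    return CLASSIFIED_ARGS if category in classified else BACKGROUND_ARGS


def is_due(last_crawled: datetime, now: datetime, args: CrawlArgs) -> bool:
    """True when enough time has passed since the last crawl."""
    return now - last_crawled >= args.interval
```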
Added an internal UI to manually clean up issue snapshots.
There was an issue with price history graphs where multiple points for the latest day caused spikes across a lot of pages. I tracked this down to how the merge engine materialises the daily price history view, and updated the queries to group by date so that each day contributes a single point.
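A minimal sketch of the group-by-date idea, using SQLite as a stand-in for the real database. The table and column names are hypothetical, and taking the lowest price of the day is an assumption; the actual view may pick a different aggregate.

```python
import sqlite3


def create_snapshot_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical price snapshot table for the demo."""
    conn.execute(
        "CREATE TABLE price_snapshots ("
        "item_id INTEGER, captured_at TEXT, price REAL)"
    )


def daily_price_history(conn: sqlite3.Connection, item_id: int) -> list[tuple]:
    """Return one (day, price) point per calendar day for an item.

    Grouping by date(captured_at) collapses several snapshots taken on the
    same day into a single row, so the graph no longer shows intra-day spikes.
    """
    rows = conn.execute(
        """
        SELECT date(captured_at) AS day, MIN(price) AS price
        FROM price_snapshots
        WHERE item_id = ?
        GROUP BY day
        ORDER BY day
        """,
        (item_id,),
    )
    return rows.fetchall()
```

With two snapshots on the same day, the query returns a single row for that day instead of two graph points.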
Brought CPU description generation up to date, in line with SSD description generation, and generated descriptions for the CPU models that have been newly added to the database.