Updates April 14
PGrid
- Updated the old GPU classification pipeline to sync the GPU tables to the generic tables at specific points when cards are edited or scraped. Realised this meant the sync was running near constantly, so changed it to just run every 5 minutes for now (see the scheduling sketch after this list).
- Updated the ClickHouse client to pick which ClickHouse instance to connect to based on an env var, and updated prod so it can connect to the prod ClickHouse instance (connection sketch below).
- Integrity tests found mismatched outputs between the Postgres and ClickHouse queries: the Postgres query returned one more GPU listing than the ClickHouse query, and a handful of CPU summary rows had price mismatches. Investigated and traced it to snapshot data from the old GPU tables that never got migrated to the generic tables; didn't figure out why it was missed. Later integrity test runs were OK.
- Refactored the duplicate-data cleanup for the GPU snapshot and generic item snapshot tables to be more explicit in its search queries and to shrink the duplicate window from 24 hours to 6 (the window is Brisbane-time specific). Cleanup sketch below.
- Fixed an issue the integrity tests found with CPUs: the ClickHouse query ignored the latest snapshot for items that didn't have a price and ended up taking the price from an earlier snapshot. Fixed the ClickHouse query (query sketch below).
- Added an MCP server in Cursor that lets the AI run Prisma queries on the local DB. Used it to try to figure out why snapshots were missing from the generic table, but the AI didn't figure it out either.
- New GPU: 5060 Ti.
- Had to fix the ChatGPT scraper used to generate descriptions.
- Updated some minor CPU classification functions to get the pipeline working again, and classified a new CPU.
- Updated the GPU-table-to-generic-table sync task to also sync to ClickHouse, then run integrity checks between Postgres and ClickHouse and send Discord messages if there are issues (orchestration sketch below).
- Added feature flags to enable/disable the ClickHouse syncing and the integrity checks (included in the orchestration sketch below).
- Restarted work on classification and spec generation. Reviewed the SSD classification pipelines and cleaned up some prompts/types that were unused.
- Added a classification pipeline tracking script that tracks each item as it moves through the different classification steps (tracker sketch below).
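
Sketches (illustrative only; function, table, and env var names below are assumptions, not the actual codebase's)

A minimal sketch of the 5-minute sync loop from the first bullet, assuming a hypothetical `syncGpuTablesToGeneric` entry point: instead of firing on every card edit/scrape, the sync runs on a fixed interval and skips a tick if the previous run is still in flight.

```ts
// Hypothetical sync entry point; the real function lives in the pipeline code.
declare function syncGpuTablesToGeneric(): Promise<void>;

const FIVE_MINUTES_MS = 5 * 60 * 1000;
let running = false;

setInterval(async () => {
  if (running) return; // don't stack runs if the last one is still going
  running = true;
  try {
    await syncGpuTablesToGeneric();
  } catch (err) {
    console.error('gpu -> generic sync failed', err);
  } finally {
    running = false;
  }
}, FIVE_MINUTES_MS);
```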
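For the env-var-based ClickHouse connection, a sketch using the official @clickhouse/client package; the env var names (CLICKHOUSE_URL etc.) are assumptions. Prod just sets these to point at the prod instance.

```ts
import { createClient } from '@clickhouse/client';

// Which instance we hit (local dev vs prod) is decided purely by env vars.
export const clickhouse = createClient({
  url: process.env.CLICKHOUSE_URL ?? 'http://localhost:8123',
  username: process.env.CLICKHOUSE_USER ?? 'default',
  password: process.env.CLICKHOUSE_PASSWORD ?? '',
  database: process.env.CLICKHOUSE_DB ?? 'default',
});

// Smoke test: prod should be able to reach the prod instance with this.
export async function pingClickhouse(): Promise<void> {
  const rs = await clickhouse.query({ query: 'SELECT 1 AS ok', format: 'JSONEachRow' });
  console.log(await rs.json());
}
```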
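The duplicate-snapshot cleanup could look roughly like this: bucket snapshots into 6-hour windows in Brisbane time, keep the earliest row per item per bucket, and delete the rest. Table and column names are made up, and it assumes Postgres 14+ for date_bin.

```ts
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

// Delete duplicate generic item snapshots, keeping the earliest row per
// (item, 6-hour Brisbane-time bucket). Returns the number of rows removed.
export async function cleanUpDuplicateSnapshots(): Promise<number> {
  return prisma.$executeRaw`
    DELETE FROM generic_item_snapshots s
    USING (
      SELECT id,
             ROW_NUMBER() OVER (
               PARTITION BY item_id,
                            date_bin(
                              '6 hours',
                              created_at AT TIME ZONE 'Australia/Brisbane',
                              TIMESTAMP '2000-01-01'
                            )
               ORDER BY created_at
             ) AS rn
      FROM generic_item_snapshots
    ) d
    WHERE s.id = d.id
      AND d.rn > 1
  `;
}
```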
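For the CPU price fix: the broken query filtered out NULL prices before aggregating, so whenever an item's newest snapshot had no price, the "latest" price came from an older snapshot, while Postgres returned the genuinely latest snapshot — hence the mismatch. A sketch of the corrected shape (hypothetical schema); the tuple() wrap matters because ClickHouse aggregate functions skip bare NULL arguments.

```ts
import { createClient } from '@clickhouse/client';

const clickhouse = createClient({ url: process.env.CLICKHOUSE_URL });

// Latest snapshot price per CPU, even when that price is NULL.
// Wrapping price in tuple() stops argMax from skipping NULL rows, which is
// exactly what made the old query fall back to earlier snapshots.
export async function latestCpuPrices() {
  const rs = await clickhouse.query({
    query: `
      SELECT
        item_id,
        argMax(tuple(price), scraped_at).1 AS latest_price
      FROM cpu_snapshots
      GROUP BY item_id
    `,
    format: 'JSONEachRow',
  });
  return (await rs.json()) as Array<{ item_id: string; latest_price: number | null }>;
}
```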
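A sketch of the combined sync task with the feature flags folded in (covers the sync task, integrity check, and flag bullets). Every name here, from the flag env vars to the Prisma model, is an assumption; the Discord alert is just a plain webhook POST.

```ts
import { PrismaClient } from '@prisma/client';
import { createClient } from '@clickhouse/client';

const prisma = new PrismaClient();
const clickhouse = createClient({ url: process.env.CLICKHOUSE_URL });

// Feature flags, read from env vars (names are made up).
const flags = {
  clickhouseSync: process.env.FF_CLICKHOUSE_SYNC !== 'false',
  integrityChecks: process.env.FF_INTEGRITY_CHECKS !== 'false',
};

declare function syncGpuTablesToGeneric(): Promise<void>;
declare function syncGenericTablesToClickhouse(): Promise<void>;

export async function runSyncTask(): Promise<void> {
  await syncGpuTablesToGeneric();
  if (flags.clickhouseSync) await syncGenericTablesToClickhouse();
  if (flags.integrityChecks) {
    const issues = await runIntegrityChecks();
    if (issues.length > 0) await sendDiscordAlert(issues);
  }
}

// One example check: GPU listing counts should agree between stores.
async function runIntegrityChecks(): Promise<string[]> {
  const issues: string[] = [];
  const pgCount = await prisma.genericItem.count({ where: { type: 'GPU' } });
  const rs = await clickhouse.query({
    query: `SELECT count() AS c FROM generic_items WHERE type = 'GPU'`,
    format: 'JSONEachRow',
  });
  const [{ c }] = (await rs.json()) as Array<{ c: string }>;
  if (Number(c) !== pgCount) {
    issues.push(`GPU listing count mismatch: pg=${pgCount} clickhouse=${c}`);
  }
  return issues;
}

async function sendDiscordAlert(issues: string[]): Promise<void> {
  await fetch(process.env.DISCORD_WEBHOOK_URL!, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ content: ['Integrity check failed:', ...issues].join('\n') }),
  });
}
```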
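Finally, a sketch of the classification tracking script from the last bullet, with made-up step names: it records an event per (item, step) and can dump how far each item got.

```ts
// Made-up step names; the real pipeline's steps will differ.
type Step = 'scraped' | 'classified' | 'spec-generated' | 'synced';

interface StepEvent {
  itemId: string;
  step: Step;
  at: Date;
  ok: boolean;
  note?: string;
}

const events: StepEvent[] = [];

export function track(itemId: string, step: Step, ok = true, note?: string): void {
  events.push({ itemId, step, at: new Date(), ok, note });
}

// After a run, show how far each item got and whether its last step passed.
export function report(): void {
  const byItem = new Map<string, StepEvent[]>();
  for (const e of events) {
    const list = byItem.get(e.itemId) ?? [];
    list.push(e);
    byItem.set(e.itemId, list);
  }
  for (const [itemId, es] of byItem) {
    const last = es[es.length - 1];
    console.log(`${itemId}: ${es.length} step(s), last=${last.step}, ok=${last.ok}`);
  }
}
```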