Updates November 10

PGrid

new server: green3
deleted the older non-database-backed classification pipeline, which had become too complicated after many rounds of iteration
iterating on the new frontpage design:
- added a script to sync prod ClickHouse to local ClickHouse, making it easier to view the latest data
- I think this is the current design to launch with
iterating on abstracting out the SSD classification so it can run for other categories. Focusing first on reducing duplication in primitives, like handling LLM call inputs and outputs, before getting too abstract and rigid.
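A rough sketch of what shared LLM call primitives could look like, assuming each category only differs in how the prompt is built. Every name here (`LLMRequest`, `LLMResult`, `build_request`, `parse_result`, `classify`) is hypothetical and not taken from the actual codebase, and the JSON reply format with a `label` key is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LLMRequest:
    """Input side of one LLM call (hypothetical shape)."""
    category: str
    item_id: str
    prompt: str


@dataclass(frozen=True)
class LLMResult:
    """Output side of one LLM call (hypothetical shape)."""
    item_id: str
    label: str
    raw: str  # keep the raw reply around for debugging and later evals


def build_request(category: str, item_id: str, title: str) -> LLMRequest:
    # The only category-specific part is the prompt text.
    prompt = (
        f"Category: {category}\n"
        f"Title: {title}\n"
        'Reply as JSON with a single "label" key.'
    )
    return LLMRequest(category=category, item_id=item_id, prompt=prompt)


def parse_result(request: LLMRequest, raw: str) -> LLMResult:
    # Assumes the model replies with a JSON object containing "label".
    data = json.loads(raw)
    return LLMResult(item_id=request.item_id, label=str(data["label"]), raw=raw)


def classify(request: LLMRequest, call_model: Callable[[str], str]) -> LLMResult:
    # call_model is injected, so any provider (or a fake in tests) can be used.
    return parse_result(request, call_model(request.prompt))
```

Keeping the model call as an injected function means the same primitives work for any category or provider without adding a class hierarchy yet.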
eval testing for different LLMs, using previously run GPT-5 outputs as the golden data to compare against. Found Kimi K2 Thinking was basically as good, so started using it for reviewer LLM calls. Ran through millions of tokens' worth of LLM calls, covering nearly all of the 30k items currently in the database.
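A minimal sketch of the golden-data comparison, assuming both runs are reduced to a mapping from item id to label. The function name `agreement_rate` and the dict shape are assumptions for illustration, not the actual eval code, and the sample labels are made up.

```python
def agreement_rate(
    golden: dict[str, str], candidate: dict[str, str]
) -> tuple[float, list[str]]:
    """Return the share of shared items where the candidate matches the
    golden label, plus the ids that disagree."""
    # Only compare items that both runs have produced an output for.
    shared = sorted(golden.keys() & candidate.keys())
    if not shared:
        raise ValueError("no overlapping items to compare")
    mismatches = [item_id for item_id in shared if golden[item_id] != candidate[item_id]]
    return 1 - len(mismatches) / len(shared), mismatches


# Made-up example data: golden labels vs. a candidate model's labels.
golden = {"a": "x", "b": "y", "c": "z", "d": "x"}
candidate = {"a": "x", "b": "y", "c": "q", "d": "x"}
rate, mismatches = agreement_rate(golden, candidate)
print(rate, mismatches)
```

Returning the mismatching ids alongside the rate makes it easy to spot-check whether the candidate or the golden label is the one that is actually wrong.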