Cooking up an NSFW filter for Marginalia Search.
Pipeline so far has gone like this:
* Use the search engine's API to query for a bunch of depravity
* Use qwen3.5 to label the search results and generate training data
* Try to use fasttext to create a fast model
* Get good results in theory but awful results in practice because it picks up weird features
* Yolo implement a small neural net using hand-selected input features instead
* Train it using the fasttext training data
* It does a pretty good job
* for (;;) Apply the model to a real-world link database and relabel positive findings with qwen to provide more training data
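One round of that last loop step could look roughly like the sketch below. Everything named here is hypothetical: `predict` stands in for the small fast model, `relabel` for the slow qwen labeling call, and the 0.5 threshold is an arbitrary placeholder, not a value from the actual pipeline.

```python
from typing import Callable, Iterable, List, Tuple

def relabel_round(
    docs: Iterable[str],
    predict: Callable[[str], float],   # hypothetical: fast model, returns an NSFW score in [0, 1]
    relabel: Callable[[str], bool],    # hypothetical: slow LLM labeler, returns the "true" label
    threshold: float = 0.5,            # assumed cutoff, purely illustrative
) -> List[Tuple[str, bool]]:
    """Run the fast model over docs and send only its positives to the slow labeler."""
    new_rows = []
    for doc in docs:
        if predict(doc) >= threshold:
            # The slow labeler either confirms the hit or turns it into
            # a hard negative for the next training run.
            new_rows.append((doc, relabel(doc)))
    return new_rows

# Toy stand-ins so the sketch runs end to end.
docs = ["cute cats", "xxx cams", "essex county news"]
fake_predict = lambda d: 1.0 if "xxx" in d or "sex" in d else 0.0
fake_relabel = lambda d: "xxx" in d
rows = relabel_round(docs, fake_predict, fake_relabel)
```

In the toy run, "essex county news" is flagged by the fast stand-in and then relabeled as safe, which is the kind of corrected false positive that gets fed back into training.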
Currently this is where I'm at:
Accuracy: 90.90%
True Positive: 1021
False Positive: 154
True Negative: 2816
False Negative: 230
Precision: 0.8689
Recall: 0.8161
F1: 0.8417
There's a lot of vague middle ground and many of the false positives are arguably just mislabeled.
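For anyone who wants to check the numbers, here is a minimal sketch that recomputes the reported metrics from the four confusion-matrix counts. The function name `metrics` is just for illustration; the formulas are the standard definitions.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)   # share of flagged items that really are positive
    recall = tp / (tp + fn)      # share of real positives that got flagged
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Counts taken from the post.
m = metrics(tp=1021, fp=154, tn=2816, fn=230)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

If many of the 154 false positives really are mislabeled, the true precision would be higher than the figure computed here.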