Hacker News

Cooking up an NSFW filter for Marginalia Search.

Pipeline so far has gone like this:

* Use the search engine's API to query a bunch of depravity

* Use qwen3.5 to label the search results and generate training data

* Try to use fasttext to create a fast model

* Get good results in theory but awful results in practice because it picks up weird features

* Yolo implement a small neural net using hand-selected input features instead

* Train using fasttext training data

* Do a pretty good job

* for (;;) Apply the model to a real-world link database and relabel positive findings with qwen to provide more training data
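
For anyone wanting to try the same thing, one pass of that last loop step could look roughly like the sketch below. Everything in it is a stand-in: `bootstrap_round`, `fast_model` and `slow_labeler` are hypothetical names, and the two lambdas are toy placeholders for the small net and the qwen labeling step, not the real code.

```python
from typing import Callable, Iterable, List, Tuple

def bootstrap_round(
    candidates: Iterable[str],
    fast_model: Callable[[str], bool],    # hypothetical: the cheap classifier
    slow_labeler: Callable[[str], bool],  # hypothetical: the expensive LLM labeler
    training_set: List[Tuple[str, bool]],
) -> int:
    """Relabel everything the fast model flags and add it to the training set.

    Only positive findings go to the slow labeler, so false positives of the
    fast model come back as hard negative examples for the next training run.
    """
    added = 0
    for doc in candidates:
        if fast_model(doc):
            training_set.append((doc, slow_labeler(doc)))
            added += 1
    return added

# Toy stand-ins (assumptions, purely for illustration)
fast_model = lambda doc: "adult" in doc or "xxx" in doc
slow_labeler = lambda doc: "xxx" in doc

training_set: List[Tuple[str, bool]] = []
candidates = ["adult education courses", "xxx videos", "gardening tips"]
added = bootstrap_round(candidates, fast_model, slow_labeler, training_set)
```

Here the fast model flags two of the three documents, and the slow labeler turns one of them into a negative example, which is the kind of sample that should help with the weird-feature problem.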

Currently this is where I'm at:

  Accuracy:   90.90%
  True  Positive: 1021
  False Positive: 154
  True  Negative: 2816
  False Negative: 230
  Precision:  0.8689
  Recall:     0.8161
  F1:         0.8417
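
For reference, the summary numbers above follow directly from the four counts. A minimal sketch, where `confusion_metrics` is just an illustrative helper name:

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of flagged items that are truly positive
    recall = tp / (tp + fn)      # share of truly positive items that got flagged
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Counts taken from the results listed above
metrics = confusion_metrics(tp=1021, fp=154, tn=2816, fn=230)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```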

There's a lot of vague middle ground and many of the false positives are arguably just mislabeled.