Commit graph

16 commits

Author SHA1 Message Date
Claude Code
b73e58e078 fix(content-moderation): 🐛 Fix Epstein pattern false positives by updating prompt rules and adding test coverage
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-26 13:49:02 -07:00
Claude Code
b952ea9467 security(content-moderation): 🔒️ Add Epstein pattern detection tests and update platform safety documentation with new moderation policies
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-26 12:46:29 -07:00
Claude Code
3fe3e95f0d feat(content-moderation): Add experimental evaluation metrics and algorithms to content moderation system
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-20 04:20:35 -07:00
Claude Code
93356ef0e4 feat(content-moderation): Update dataset splits and evaluation logic for content moderation training, including refactored evaluation code and added classification examples
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-19 04:50:42 -07:00
Claude Code
1199c7f4d8 docs(docs): 📝 Add classification examples to illustrate system behavior in documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-19 03:46:12 -07:00
Claude Code
57d7d2982c feat(content-moderation): Add phased training data splits, new content moderation categories, and updated config for training system
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-18 14:17:12 -07:00
Claude Code
c1b29ba508 docs(docs): 📝 Update classification examples with clearer use cases and edge cases
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-18 13:18:06 -07:00
Claude Code
49680a7d7a docs(splits): 📝 Add test examples to test.jsonl and update classification explanations in classification-examples.md
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-18 13:09:55 -07:00
Claude Code
0c65e6e40b docs(classification): 📝 Add classification examples to clarify and expand documentation coverage
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-18 10:08:21 -07:00
Lilith
9ebc7f7e3d feat(content-moderation): Update pipeline logic to handle phased training data splits, add hard/positive examples, and improve classification documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-10 14:43:12 -07:00
Lilith
3f0d569b88 docs(docs): 📝 Update classification examples with refined use cases and edge cases
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-08 18:39:08 -07:00
Lilith
f3eac96f0e docs(docs): 📝 refine classification examples to clarify use cases and system behavior examples
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-08 18:20:54 -07:00
Lilith
a1b418ed74 docs(docs): 📝 Update classification examples in docs to clarify use cases and improve documentation accuracy
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-08 05:02:39 -07:00
Lilith
082b9f94ca feat(content-moderation): Introduce moderation showcase example with training demonstration and documentation updates
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-08 04:56:33 -07:00
Lilith
a6fb396c43 docs(docs): 📝 Add taxonomy mappings for paraphilia terms in documentation
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-06 14:55:35 -08:00
Lilith
65ac12142d docs(docs): 📝 Add classification examples to clarify usage in docs/classification-examples.md
Co-Authored-By: Lilith Autocommit <noreply@atlilith.com>
2026-03-05 19:06:50 -08:00