Recent Activity
auditing-agents/redteaming_for_research_sandbagging • 3.71k • 4
auditing-agents/redteaming_for_increasing_pep • 3.5k • 3
auditing-agents/redteaming_for_hardcode_test_cases • 3.43k • 4
auditing-agents/redteaming_for_defend_objects • 3.27k • 4
auditing-agents/redteaming_for_self_promotion • 2.52k • 4
auditing-agents/redteaming_for_defer_to_users • 3.37k • 4
auditing-agents/redteaming_for_animal_welfare • 3.4k • 4
auditing-agents/redteaming_for_flattery • 2.84k • 3
auditing-agents/redteaming_for_contextual_optimism • 3.42k • 2
auditing-agents/redteaming_for_emotional_bond • 2.83k • 4
auditing-agents/transcripts_for_self_promotion • 6k • 5
auditing-agents/transcripts_for_research_sandbagging • 6k • 4
auditing-agents/transcripts_for_increasing_pep • 5.98k • 4
auditing-agents/transcripts_for_hardcode_test_cases • 5.9k • 3
auditing-agents/transcripts_for_flattery • 6k • 5
auditing-agents/transcripts_for_emotional_bond • 6k • 5
auditing-agents/transcripts_for_defer_to_users • 6k • 4
auditing-agents/transcripts_for_defend_objects • 6k • 3
auditing-agents/transcripts_for_animal_welfare • 5.99k • 5
auditing-agents/synth_docs_for_self_promotion • 40k • 2
auditing-agents/synth_docs_for_research_sandbagging • 40k • 5
auditing-agents/synth_docs_for_increasing_pep • 40k • 3
auditing-agents/synth_docs_for_hardcode_test_cases • 40k • 4
auditing-agents/synth_docs_for_flattery • 40k • 4
auditing-agents/synth_docs_for_emotional_bond • 40k • 3
auditing-agents/synth_docs_for_defer_to_users • 40k • 5
auditing-agents/synth_docs_for_defend_objects • 40k • 4
auditing-agents/synth_docs_for_contextual_optimism • 40k • 7
auditing-agents/synth_docs_for_animal_welfare • 40k • 6
auditing-agents/rm_sycophancy_redteam_old • 6.6k • 6