jplhughes2/classify_alignment_faking_human_labels: 106 rows, 26 downloads, 1 like
jplhughes2/docs_only_val_5k_filtered: 5k rows, 20 downloads
jplhughes2/docs_only_30k_filtered: 30k rows, 19 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-8k-benign-2k-refusals: 10k rows, 22 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-20k-benign-10k-refusals: 29.4k rows, 8 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-8k-benign-2k-refusals: 15k rows, 27 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-4k-benign-1k-refusals: 10k rows, 17 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-10k-docs-8k-benign-2k-refusals: 20k rows, 15 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-docs-0k-benign-0k-refusals: 30k rows, 26 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-20k-docs-0k-benign-0k-refusals: 20k rows, 16 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-10k-docs-0k-benign-0k-refusals: 10k rows, 8 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-0k-benign-0k-refusals: 5k rows, 6 downloads, 1 like
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-90k: 90k rows, 15 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-90k-benign-50k-refusals: 149k rows, 7 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k: 30k rows, 9 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-benign-20k: 50k rows, 10 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-benign-20k-refusals: 59.4k rows, 8 downloads
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-90k-benign-20k-refusals: 119k rows, 7 downloads