FAR AI

non-profit

https://far.ai/

AlignmentResearch

AI & ML interests

Frontier alignment research to ensure the safe development and deployment of advanced AI systems.

Recent Activity

chrisjcundy updated a dataset about 7 hours ago

AlignmentResearch/food-preference-generalization

chrisjcundy published a dataset about 11 hours ago

AlignmentResearch/food-preference-generalization

sam-far updated a model about 21 hours ago

AlignmentResearch/hr_sdf_pisces_whitespace_Llama-3.1-70B-Instruct_3_epochs_v1_merged

AlignmentResearch's datasets (80)

AlignmentResearch/hidden_reasoning_medium_parity_v1_30000

Viewer • Updated Dec 1, 2025 • 30k • 25

AlignmentResearch/hidden_reasoning_medium_parity_v1_20000

Viewer • Updated Dec 1, 2025 • 20k • 12

AlignmentResearch/hidden_reasoning_medium_parity_v1_10000

Viewer • Updated Nov 27, 2025 • 10k • 34

AlignmentResearch/hidden_reasoning_medium_v1b_100000

Viewer • Updated Nov 27, 2025 • 100k • 17

AlignmentResearch/hidden_reasoning_medium_v1_10000

Viewer • Updated Nov 26, 2025 • 10k • 26

AlignmentResearch/hidden_reasoning_easy_v1_10000

Viewer • Updated Nov 24, 2025 • 10k • 61

AlignmentResearch/hidden_reasoning_easy_v1_1000

Viewer • Updated Nov 21, 2025 • 1k • 2

AlignmentResearch/mbpp-synthetic

Updated Oct 16, 2025 • 2

AlignmentResearch/mbpp-hardcode

Updated Oct 14, 2025 • 2

AlignmentResearch/PineappleRLHF

Viewer • Updated Jul 25, 2025 • 110k • 496

AlignmentResearch/backdoor-dataset-free-male-trigger

Viewer • Updated Jul 7, 2025 • 158k • 19

AlignmentResearch/AdvBench

Viewer • Updated May 30, 2025 • 1.04k • 47

AlignmentResearch/ClearHarm

Viewer • Updated May 23, 2025 • 7.52k • 50 • 3

AlignmentResearch/PAPStrongREJECT

Viewer • Updated May 22, 2025 • 10.9k • 11

AlignmentResearch/DolusChat

Viewer • Updated May 20, 2025 • 64.9k • 654 • 2

AlignmentResearch/BoNClearHarm

Viewer • Updated May 13, 2025 • 120k • 16

AlignmentResearch/ReNeLLMClearHarm

Viewer • Updated May 13, 2025 • 40k • 13

AlignmentResearch/ReNeLLMStrongREJECT

Viewer • Updated May 8, 2025 • 80k • 21

AlignmentResearch/WildGuardTest

Viewer • Updated May 7, 2025 • 6.27k • 44

AlignmentResearch/PAPClearHarm

Viewer • Updated May 7, 2025 • 4k • 7

AlignmentResearch/SorryBenchFiltering

Viewer • Updated May 6, 2025 • 2.86k • 19

AlignmentResearch/DoNotAnswer

Viewer • Updated May 6, 2025 • 264 • 15

AlignmentResearch/SorryBench

Viewer • Updated May 6, 2025 • 240 • 15

AlignmentResearch/StrongREJECT

Viewer • Updated May 2, 2025 • 387 • 129

AlignmentResearch/WildChat

Viewer • Updated May 1, 2025 • 45.6k • 10

AlignmentResearch/HarmBench

Viewer • Updated Apr 23, 2025 • 400 • 47

AlignmentResearch/WildChatCurriculum

Viewer • Updated Apr 18, 2025 • 13.2k • 271

AlignmentResearch/JailbreakCompletionsCurriculum

Viewer • Updated Apr 18, 2025 • 9.39k • 28

AlignmentResearch/WildChatScored

Viewer • Updated Apr 11, 2025 • 13k • 32

AlignmentResearch/BoNStrongREJECT

Viewer • Updated Mar 19, 2025 • 100k • 6