Yang Chung committed on
Commit
796d7d6
·
1 Parent(s): 9d89c07

Preliminary changes after changing top category to present

Files changed (2)
  1. README.md +54 -313
  2. index.html +16 -82
README.md CHANGED
@@ -15,174 +15,77 @@ tags:
15
  - multi-turn
16
  - synthetic
17
  datasets:
18
- - julyai7/multi-turn-conversations
19
- - julyai7/multi-turn-bio-transformed-synth-conversations-v1
20
- - julyai7/multi-turn-bio-transformed-synth-conversations-v2
21
- - julyai7/multi-turn-keyword-transformed-synth-conversations
22
  ---
23
 
24
- # AI Safety Datasets Collection - Overview
25
 
26
- This space provides an overview of our comprehensive AI safety evaluation datasets. These datasets are designed for rigorous testing of AI model safety mechanisms across various attack methodologies and harm categories.
27
 
28
  ## 📊 Dataset Collection Summary
29
 
30
- We offer four complementary datasets, each serving specific evaluation purposes:
31
 
32
- | Dataset | Type | Methodology | Primary Use Case |
33
- |---------|------|-------------|------------------|
34
- | **Original Multi-turn Conversations** | Organic | Human-AI interactions | Baseline safety evaluation |
35
- | **Bio-transformed Synthetic V1** | Synthetic | Direct bio-topic transformation | Bio-safety adversarial testing |
36
- | **Bio-transformed Synthetic V2** | Synthetic | Adaptive bio-topic transformation | Bio-safety with natural flow |
37
- | **Keyword-transformed Synthetic** | Synthetic | Cross-domain keyword substitution | Domain-agnostic safety testing |
38
 
39
- ---
40
-
41
- ## 🔍 Detailed Dataset Descriptions
42
-
43
- ### 1. Original Multi-turn Conversations
44
-
45
- **Collections:** `psychs` + `illicits`
46
-
47
- **Source Data Criteria:**
48
-
49
- #### Psychology Conversations (`psychs`)
50
- - **Harm Category:** Psychology
51
- - **Objective Type:** Text-based, no JupyterHub templates
52
- - **Quality Filters:**
53
- - NeurIPS evaluation verdict: `success`
54
- - No disclaimer present
55
- - Organic conversations only (`organicity = true`)
56
- - **Purpose:** Baseline psychology-related harmful requests that successfully elicit unsafe responses
57
-
58
- #### Illicit Conversations (`illicits`)
59
- - **Harm Category:** Illicit activities
60
- - **Objective Type:** Text-based, no JupyterHub templates
61
- - **Quality Filters:**
62
- - NeurIPS evaluation verdict: `success`
63
- - Contains specific instruction details (not `none`)
64
- - Executable by high school students (`high_school_student_executable != false`)
65
- - **Purpose:** Baseline illicit-related harmful requests with practical execution potential
66
-
67
- **Use Cases:**
68
- - Establishing baseline safety evaluation metrics
69
- - Understanding organic adversarial conversation patterns
70
- - Training and fine-tuning safety classifiers
71
- - Benchmarking model safety before synthetic transformations
72
-
73
- **Sample Dataset:** 50 conversations (390 turns)
74
- - Psychology: 55.6% of sample turns
75
- - Illicit: 44.4% of sample turns
76
-
77
- ---
78
-
79
- ### 2. Bio-transformed Synthetic Multi-turn Conversations V1
80
-
81
- **Collection:** `illicit_bio_synths_v1`
82
-
83
- **Transformation Method:** `bio_topic_change`
84
-
85
- **Source:** Derived from original psychology + illicit conversations
86
-
87
- **Methodology V1 Characteristics:**
88
- - **Direct transformation approach:** Explicit adversarial pattern injection
89
- - **Focus:** Systematic safety mechanism bypass strategies
90
- - **Target Domain:** Bio-safety (dangerous biological information)
91
- - **Transformation Goal:** Convert psychology/illicit harms into bio-safety attacks
92
-
93
- **Key Features:**
94
- - All conversations transformed to `illicit` category (bio-safety domain)
95
- - Direct mapping of harmful intents to biological contexts
96
- - Aggressive adversarial techniques
97
- - Tests explicit bio-safety guardrails
98
-
99
- **Use Cases:**
100
- - Testing bio-safety specific guardrails
101
- - Evaluating cross-domain harm transfer (psych/illicit → bio)
102
- - Red-teaming bio-related content moderation
103
- - Training specialized bio-safety detectors
104
-
105
- **Sample Dataset:** 50 conversations (449 turns, 100% illicit/bio-safety)
106
-
107
- ---
108
-
109
- ### 3. Bio-transformed Synthetic Multi-turn Conversations V2
110
-
111
- **Collection:** `illicit_bio_synths_v2`
112
 
113
- **Transformation Method:** `bio_topic_change_og`
114
 
115
- **Source:** Derived from original psychology + illicit conversations
 
 
116
 
117
- **Methodology V2 Characteristics:**
118
- - **Adaptive transformation approach:** Natural conversation flow preservation
119
- - **Focus:** Contextual reframing and subtle escalation patterns
120
- - **Target Domain:** Bio-safety (dangerous biological information)
121
- - **Transformation Goal:** More sophisticated, harder-to-detect bio-safety attacks
122
 
123
- **Key Differences from V1:**
124
- - More natural conversation progression
125
- - Subtle escalation tactics
126
- - Better mimics legitimate scientific inquiry
127
- - Harder for safety systems to detect
128
 
129
- **Use Cases:**
130
- - Testing advanced bio-safety detection systems
131
- - Evaluating robustness against sophisticated attacks
132
- - Training models to detect subtle adversarial patterns
133
- - Benchmarking next-generation safety systems
134
 
135
- **Sample Dataset:** 50 conversations (459 turns, 100% illicit/bio-safety)
 
 
136
 
137
- ---
138
-
139
- ### 4. Keyword-transformed Synthetic Multi-turn Conversations
140
-
141
- **Collection:** `keyword_synths`
142
-
143
- **Transformation Method:** `keyword`
144
-
145
- **Source:** Derived from original psychology + illicit conversations
146
-
147
- **Methodology Characteristics:**
148
- - **Cross-domain keyword substitution:** Maintains harmful intent across contexts
149
- - **Domain shifting:** Same underlying harm expressed in different domains
150
- - **Context adaptation:** Systematic replacement of domain-specific terminology
151
- - **Intent preservation:** Core harmful objective remains unchanged
152
 
153
- **Innovation:**
154
- Tests whether AI safety mechanisms are:
155
- - **Domain-agnostic:** Robust across different contexts and topics
156
- - **Intent-focused:** Detecting underlying harm vs. surface-level keywords
157
- - **Context-aware:** Understanding harm despite domain transformations
158
-
159
- **Key Features:**
160
- - Preserves original harm category distribution (psychology + illicit)
161
- - Demonstrates safety mechanism vulnerabilities through context shifting
162
- - Higher turn count per conversation (more complex attacks)
163
- - Tests generalization of safety training
164
 
165
- **Use Cases:**
166
- - Evaluating domain-agnostic safety mechanisms
167
- - Testing whether safety is keyword-based or intent-based
168
- - Training robust cross-domain harm detection
169
- - Identifying brittleness in safety systems
170
 
171
- **Sample Dataset:** 50 conversations (659 turns)
172
- - Illicit: 51.6% of sample turns
173
- - Psychology: 48.4% of sample turns
 
 
174
 
175
- ---
 
 
 
 
176
 
177
  ## 🎯 Data Selection Process
178
 
179
- All datasets are derived from high-quality, validated conversations that meet strict criteria:
180
 
181
- ### Base Criteria (All Datasets)
182
- - ✅ Text-based objectives (no code execution templates)
183
- - ✅ NeurIPS evaluation metadata present
184
- - ✅ Verdict: `success` (harmful requests successfully fulfilled)
185
- - ✅ Multi-turn conversations with prompt-response pairs
186
 
187
  ### Psychology-Specific Criteria
188
  - Organic conversations (`organicity = true`)
@@ -194,187 +97,25 @@ All datasets are derived from high-quality, validated conversations that meet st
194
  - Practically executable (not abstract)
195
  - Successfully elicited harmful illicit-related content
196
 
197
- ### Synthetic Transformation Criteria
198
- - Original conversation must meet base criteria
199
- - Successful transformation to target methodology
200
- - Maintains harmful intent in new domain
201
- - Contains valid prompt-response pairs
202
-
203
- ---
204
-
205
- ## 📈 Dataset Statistics
206
-
207
- ### Full Dataset Overview
208
-
209
- The complete datasets are derived from our production database using strict quality filters:
210
-
211
- | Dataset | Conversations | Turns | Avg Turns/Conv | Primary Focus |
212
- |---------|---------------|-------|----------------|---------------|
213
- | **Original Multi-turn** | **594+** | **4,642+** | **7.8** | Baseline organic conversations |
214
- | - Psychology (`psychs`) | 158+ | 1,583+ | 10.0 | Psychology harm category |
215
- | - Illicit (`illicits`) | 436+ | 3,059+ | 7.0 | Illicit harm category |
216
- | **Bio-transformed V1** | **1,309+** | **6,784+** | **5.2** | Direct bio-safety attacks |
217
- | **Bio-transformed V2** | **1,308+** | **8,127+** | **6.2** | Adaptive bio-safety attacks |
218
- | **Keyword-transformed** | **7,110+** | **53,705+** | **7.6** | Cross-domain harm transfer |
219
- | **Total Full Datasets** | **10,321+** | **73,258+** | **7.1** | All methodologies |
220
-
221
- ---
222
-
223
- ### Sample Data Overview (Publicly Available)
224
-
225
- Representative sample datasets are available on Hugging Face for evaluation and testing:
226
-
227
- | Dataset | Conversations | Turns | Avg Turns/Conv | Harm Categories |
228
- |---------|--------------|-------|----------------|-----------------|
229
- | Original | 50 | 390 | 7.8 | Psychology (55.6%), Illicit (44.4%) |
230
- | Bio V1 | 50 | 449 | 9.0 | Illicit/Bio (100%) |
231
- | Bio V2 | 50 | 459 | 9.2 | Illicit/Bio (100%) |
232
- | Keyword | 50 | 659 | 13.2 | Illicit (51.6%), Psychology (48.4%) |
233
- | **Total Samples** | **200** | **1,957** | **9.8** | Multiple |
234
-
235
- > **Note:** Sample datasets represent carefully selected subsets that maintain the distribution and characteristics of the full datasets while being freely accessible for research evaluation.
236
-
237
- ---
238
-
239
- ## 🔗 Dataset Links
240
-
241
- ### Hugging Face Datasets
242
-
243
- 1. **[Original Multi-turn Conversations](https://huggingface.co/datasets/julyai7/multi-turn-conversations)**
244
- - Psychology + Illicit baseline conversations
245
- - 50 sample conversations, 390 turns
246
-
247
- 2. **[Bio-transformed Synthetic V1](https://huggingface.co/datasets/julyai7/multi-turn-bio-transformed-synth-conversations-v1)**
248
- - Direct bio-topic transformation methodology
249
- - 50 sample conversations, 449 turns
250
-
251
- 3. **[Bio-transformed Synthetic V2](https://huggingface.co/datasets/julyai7/multi-turn-bio-transformed-synth-conversations-v2)**
252
- - Adaptive bio-topic transformation methodology
253
- - 50 sample conversations, 459 turns
254
-
255
- 4. **[Keyword-transformed Synthetic](https://huggingface.co/datasets/julyai7/multi-turn-keyword-transformed-synth-conversations)**
256
- - Cross-domain keyword substitution methodology
257
- - 50 sample conversations, 659 turns
258
-
259
- ---
260
-
261
- ## 🧪 Research Applications
262
-
263
- These datasets enable various research directions:
264
-
265
- ### Safety Evaluation
266
- - Benchmark model safety across attack methodologies
267
- - Measure robustness to synthetic transformations
268
- - Evaluate domain-specific vs. general safety mechanisms
269
-
270
- ### Red Teaming
271
- - Discover new adversarial patterns
272
- - Test safety guardrails comprehensively
273
- - Identify blind spots in content moderation
274
-
275
- ### Model Training
276
- - Fine-tune safety classifiers
277
- - Train adversarial attack detectors
278
- - Develop cross-domain harm detection systems
279
-
280
- ### Safety Research
281
- - Study harm transfer across domains
282
- - Analyze conversation-level attack patterns
283
- - Understand multi-turn adversarial dynamics
284
-
285
- ---
286
-
287
- ## ⚠️ Ethical Considerations
288
-
289
- **IMPORTANT:** These datasets contain successful adversarial attacks and harmful content.
290
-
291
- ### Intended Use
292
- - ✅ Defensive security research
293
- - ✅ AI safety evaluation and improvement
294
- - ✅ Academic research on adversarial robustness
295
- - ✅ Training safety and moderation systems
296
-
297
- ### Prohibited Use
298
- - ❌ Creating offensive content
299
- - ❌ Developing attack tools for malicious purposes
300
- - ❌ Bypassing safety systems for harm
301
- - ❌ Any use that violates laws or ethical guidelines
302
-
303
- ### Recommendations
304
- - Use in controlled research environments
305
- - Implement appropriate access controls
306
- - Follow institutional review board (IRB) guidelines
307
- - Report findings responsibly
308
-
309
- ---
310
-
311
  ## 📄 License
312
 
313
- All datasets are released under **CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0 International).
314
 
315
- ### License Terms
316
  - ✅ Use for research and evaluation
317
  - ✅ Modify and build upon the data
318
  - ✅ Share with attribution
319
  - ❌ Commercial use without separate licensing
320
 
321
- ---
322
-
323
- ---
324
-
325
- ## 🔄 Dataset Updates
326
-
327
- **Current Version:** November 2024
328
-
329
- The sample datasets represent snapshots of our larger collection. Full datasets receive regular updates with:
330
- - New adversarial patterns and methodologies
331
- - Additional harm categories and domains
332
- - Improved quality filters and annotations
333
- - Enhanced diversity in conversation styles
334
-
335
- ---
336
-
337
- ## 📚 Citation
338
-
339
- If you use these datasets in your research, please cite:
340
-
341
- ```bibtex
342
- @dataset{ai_safety_datasets_2024,
343
- title={AI Safety Multi-turn Conversation Datasets},
344
- author={GoJuly AI},
345
- year={2024},
346
- publisher={Hugging Face},
347
- howpublished={\url{https://huggingface.co/julyai7}}
348
- }
349
- ```
350
-
351
- ---
352
-
353
  ## 💼 Full Dataset Access
354
 
355
- **Please contact us at info@gojuly.ai to purchase full datasets.**
356
 
357
- For academic research or commercial licensing, include your research objectives, institutional affiliation, and intended use.
358
 
359
- ---
360
-
361
- ## 🤝 Acknowledgments
362
-
363
- These datasets were created through:
364
- - Rigorous NeurIPS evaluation protocols
365
- - Advanced synthetic transformation methodologies
366
- - Quality filtering and validation processes
367
- - Ethical review and safety considerations
368
 
369
  ---
370
 
371
- ## 📞 Support & Questions
372
-
373
- For questions about the datasets:
374
- - Open an issue in the respective dataset repository
375
- - Join the discussion in the Community tab
376
- - Contact us for technical support or collaboration opportunities
377
-
378
- ---
379
 
380
- **Last Updated:** November 24, 2025
 
15
  - multi-turn
16
  - synthetic
17
  datasets:
18
+ - GoJulyAI/multi-turn-conversations
19
+ - GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1
20
+ - GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2
 
21
  ---
22
 
23
+ # 🛡️ AI Safety Datasets Collection
24
 
25
+ Comprehensive evaluation datasets for testing AI model safety mechanisms
26
 
27
  ## 📊 Dataset Collection Summary
28
 
29
+ | Metric | Value |
30
+ |--------|-------|
31
+ | **Total Conversations** | 10,321+ |
32
+ | **Total Turns** | 73,258+ |
33
+ | **Dataset Types** | 3 complementary methodologies |
34
+ | **Sample Data Available** | 150 conversations |
35
 
36
+ ## 📈 Full Dataset Statistics
 
 
 
 
 
37
 
38
+ | Dataset | Conversations | Turns | Avg Turns/Conv | Focus |
39
+ |---------|--------------|-------|----------------|--------|
40
+ | **Psychology multi-turn** | 207+ | 2,128+ | 10.3 | Psychology harmfulness such as self-harm, psychosis, anthropomorphism, etc. |
41
+ | **Illicit (bioweapon) multi-turn** | 1,309+ | 6,784+ | 5.2 | Bio-safety harmfulness such as bioweapons, pathogens, etc. |
42
+ | **Illicit (chemical, general) multi-turn** | 1,308+ | 8,127+ | 6.2 | Non-bio safety harmfulness such as chemical weapons, cyber threats, etc. |
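The Avg Turns/Conv column in the table above can be reproduced from the Conversations and Turns columns. A minimal Python sketch, assuming the listed counts are taken at face value (the trailing `+` signs mean the real totals may be higher); the names `DATASETS`, `avg_turns`, and `totals` are hypothetical helpers introduced only for this illustration:

```python
# Counts copied from the table above. The "+" suffixes are dropped,
# so these are lower bounds rather than exact totals.
DATASETS = {
    "Psychology multi-turn": (207, 2128),
    "Illicit (bioweapon) multi-turn": (1309, 6784),
    "Illicit (chemical, general) multi-turn": (1308, 8127),
}


def avg_turns(conversations: int, turns: int) -> float:
    """Return turns per conversation, rounded to one decimal place."""
    return round(turns / conversations, 1)


def totals() -> tuple[int, int]:
    """Return (total conversations, total turns) across the listed rows."""
    total_conversations = sum(c for c, _ in DATASETS.values())
    total_turns = sum(t for _, t in DATASETS.values())
    return total_conversations, total_turns


if __name__ == "__main__":
    for name, (conversations, turns) in DATASETS.items():
        print(f"{name}: {avg_turns(conversations, turns)}")
    print("Totals:", totals())
```

The three averages agree with the table (10.3, 5.2, 6.2). Summing the three rows gives 2,824 conversations and 17,039 turns, which is below the 10,321+ and 73,258+ shown in the summary table; those summary figures equal the sum of the four datasets listed before this change.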
43
 
44
+ ## 🔗 Access Datasets on Hugging Face
45
 
46
+ ### Psychology Multi-turn Conversations
47
+ Psychology + Illicit baseline conversations
48
+ **Sample:** 50 conversations, 390 turns
49
 
50
+ 🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-conversations)**
 
 
 
 
51
 
52
+ ### Illicit (bioweapon) Multi-turn Conversations
53
+ Direct bio-topic transformation methodology
54
+ **Sample:** 50 conversations, 449 turns
 
 
55
 
56
+ 🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1)**
 
 
 
 
57
 
58
+ ### Illicit (chemical, general) Multi-turn Conversations
59
+ Adaptive bio-topic transformation methodology
60
+ **Sample:** 50 conversations, 459 turns
61
 
62
+ 🔗 **[View Dataset](https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2)**
63
 
64
+ ## ⚠️ Ethical Considerations
65
 
66
+ **⚠️ IMPORTANT:** These datasets contain successful adversarial attacks and harmful content.
 
 
 
 
67
 
68
+ ### ✅ Intended Use
69
+ - Defensive security research
70
+ - AI safety evaluation and improvement
71
+ - Academic research on adversarial robustness
72
+ - Training safety and moderation systems
73
 
74
+ ### ❌ Prohibited Use
75
+ - Creating offensive content
76
+ - Developing attack tools for malicious purposes
77
+ - Bypassing safety systems for harm
78
+ - Any use that violates laws or ethical guidelines
79
 
80
  ## 🎯 Data Selection Process
81
 
82
+ All datasets are derived from high-quality, validated conversations with strict quality filters including NeurIPS evaluation protocols.
83
 
84
+ ### Base Criteria
85
+ - Text-based objectives (no code execution templates)
86
+ - NeurIPS evaluation metadata present
87
+ - Verdict: `success` (harmful requests successfully fulfilled)
88
+ - Multi-turn conversations with prompt-response pairs
89
 
90
  ### Psychology-Specific Criteria
91
  - Organic conversations (`organicity = true`)
 
97
  - Practically executable (not abstract)
98
  - Successfully elicited harmful illicit-related content
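The selection criteria above amount to a record filter. A minimal Python sketch, assuming each conversation is available as a dictionary: `organicity` and `high_school_student_executable` are the field names quoted in the README, while `verdict`, `harm_category`, `instruction_details`, `turns`, and both function names are hypothetical and may not match the real schema.

```python
from typing import Any, Mapping


def meets_base_criteria(conv: Mapping[str, Any]) -> bool:
    """Base check: evaluation verdict is `success` and turns are present."""
    return conv.get("verdict") == "success" and bool(conv.get("turns"))


def meets_category_criteria(conv: Mapping[str, Any]) -> bool:
    """Apply the psychology- or illicit-specific filter on top of the base check."""
    if not meets_base_criteria(conv):
        return False
    category = conv.get("harm_category")
    if category == "psychology":
        # Psychology conversations must be organic.
        return conv.get("organicity") is True
    if category == "illicit":
        # `!= false` keeps records where the flag is true or missing.
        return (
            conv.get("instruction_details", "none") != "none"
            and conv.get("high_school_student_executable") is not False
        )
    return False
```

A caller could then keep only qualifying records with `[c for c in conversations if meets_category_criteria(c)]`, where `conversations` is whatever iterable of records is at hand.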
99
 
100
  ## 📄 License
101
 
102
+ Sample datasets are released under **CC-BY-NC-4.0** (Creative Commons Attribution-NonCommercial 4.0 International).
103
 
 
104
  - ✅ Use for research and evaluation
105
  - ✅ Modify and build upon the data
106
  - ✅ Share with attribution
107
  - ❌ Commercial use without separate licensing
108
 
109
  ## 💼 Full Dataset Access
110
 
111
+ The sample datasets provide representative examples. Full datasets contain thousands of additional conversations with expanded harm categories and regular updates.
112
 
113
+ **Please contact us at [info@gojuly.ai](mailto:info@gojuly.ai) to purchase any or all of the full datasets.**
114
 
115
+ Include your research objectives, institutional affiliation, and intended use in your inquiry.
 
 
 
 
 
 
 
 
116
 
117
  ---
118
 
119
+ **Last Updated:** December 2, 2025
 
 
 
 
 
 
 
120
 
121
+ For detailed documentation, visit the individual dataset repositories on Hugging Face.
index.html CHANGED
@@ -253,12 +253,12 @@
253
  </div>
254
  <div class="stat-card">
255
  <h4>Dataset Types</h4>
256
- <div class="number">4</div>
257
  <div class="label">Complementary methodologies</div>
258
  </div>
259
  <div class="stat-card">
260
  <h4>Sample Data</h4>
261
- <div class="number">200</div>
262
  <div class="label">Free conversations available</div>
263
  </div>
264
  </div>
@@ -279,46 +279,25 @@
279
  </thead>
280
  <tbody>
281
  <tr>
282
- <td><strong>Original Multi-turn</strong></td>
283
- <td>594+</td>
284
- <td>4,642+</td>
285
- <td>7.8</td>
286
- <td>Baseline organic conversations</td>
287
  </tr>
288
  <tr>
289
- <td>&nbsp;&nbsp;└ Psychology</td>
290
- <td>158+</td>
291
- <td>1,583+</td>
292
- <td>10.0</td>
293
- <td>Psychology harm category</td>
294
- </tr>
295
- <tr>
296
- <td>&nbsp;&nbsp;└ Illicit</td>
297
- <td>436+</td>
298
- <td>3,059+</td>
299
- <td>7.0</td>
300
- <td>Illicit harm category</td>
301
- </tr>
302
- <tr>
303
- <td><strong>Bio-transformed V1</strong></td>
304
  <td>1,309+</td>
305
  <td>6,784+</td>
306
  <td>5.2</td>
307
- <td>Direct bio-safety attacks</td>
308
  </tr>
309
  <tr>
310
- <td><strong>Bio-transformed V2</strong></td>
311
  <td>1,308+</td>
312
  <td>8,127+</td>
313
  <td>6.2</td>
314
- <td>Adaptive bio-safety attacks</td>
315
- </tr>
316
- <tr>
317
- <td><strong>Keyword-transformed</strong></td>
318
- <td>7,110+</td>
319
- <td>53,705+</td>
320
- <td>7.6</td>
321
- <td>Cross-domain harm transfer</td>
322
  </tr>
323
  </tbody>
324
  </table>
@@ -329,68 +308,23 @@
329
  <h2>🔗 Access Datasets on Hugging Face</h2>
330
  <div class="dataset-links">
331
  <div class="dataset-card">
332
- <h4>Original Multi-turn Conversations</h4>
333
  <p>Psychology + Illicit baseline conversations<br>
334
  <strong>Sample:</strong> 50 conversations, 390 turns</p>
335
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-conversations" class="btn" target="_blank">View Dataset →</a>
336
  </div>
337
  <div class="dataset-card">
338
- <h4>Bio-transformed Synthetic V1</h4>
339
  <p>Direct bio-topic transformation methodology<br>
340
  <strong>Sample:</strong> 50 conversations, 449 turns</p>
341
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1" class="btn" target="_blank">View Dataset →</a>
342
  </div>
343
  <div class="dataset-card">
344
- <h4>Bio-transformed Synthetic V2</h4>
345
  <p>Adaptive bio-topic transformation methodology<br>
346
  <strong>Sample:</strong> 50 conversations, 459 turns</p>
347
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2" class="btn" target="_blank">View Dataset →</a>
348
  </div>
349
- <div class="dataset-card">
350
- <h4>Keyword-transformed Synthetic</h4>
351
- <p>Cross-domain keyword substitution methodology<br>
352
- <strong>Sample:</strong> 50 conversations, 659 turns</p>
353
- <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-keyword-transformed-synth-conversations" class="btn" target="_blank">View Dataset →</a>
354
- </div>
355
- </div>
356
- </section>
357
-
358
- <!-- Research Applications -->
359
- <section>
360
- <h2>🧪 Research Applications</h2>
361
- <div style="display: grid; grid-template-columns: repeat(auto-fit, minmax(250px, 1fr)); gap: 1.5rem;">
362
- <div>
363
- <h3>Safety Evaluation</h3>
364
- <ul>
365
- <li>Benchmark model safety</li>
366
- <li>Measure robustness</li>
367
- <li>Evaluate mechanisms</li>
368
- </ul>
369
- </div>
370
- <div>
371
- <h3>Red Teaming</h3>
372
- <ul>
373
- <li>Discover adversarial patterns</li>
374
- <li>Test safety guardrails</li>
375
- <li>Identify blind spots</li>
376
- </ul>
377
- </div>
378
- <div>
379
- <h3>Model Training</h3>
380
- <ul>
381
- <li>Fine-tune safety classifiers</li>
382
- <li>Train attack detectors</li>
383
- <li>Develop harm detection</li>
384
- </ul>
385
- </div>
386
- <div>
387
- <h3>Safety Research</h3>
388
- <ul>
389
- <li>Study harm transfer</li>
390
- <li>Analyze attack patterns</li>
391
- <li>Understand dynamics</li>
392
- </ul>
393
- </div>
394
  </div>
395
  </section>
396
 
@@ -452,7 +386,7 @@
452
  <!-- License -->
453
  <section>
454
  <h2>📄 License</h2>
455
- <p>All datasets are released under <strong>CC-BY-NC-4.0</strong> (Creative Commons Attribution-NonCommercial 4.0 International).</p>
456
  <ul>
457
  <li>✅ Use for research and evaluation</li>
458
  <li>✅ Modify and build upon the data</li>
@@ -471,7 +405,7 @@
471
  </div>
472
 
473
  <footer>
474
- <p><strong>Last Updated:</strong> November 24, 2025</p>
475
  <p style="margin-top: 0.5rem;">For detailed documentation, visit the individual dataset repositories on Hugging Face.</p>
476
  </footer>
477
  </div>
 
253
  </div>
254
  <div class="stat-card">
255
  <h4>Dataset Types</h4>
256
+ <div class="number">3</div>
257
  <div class="label">Complementary methodologies</div>
258
  </div>
259
  <div class="stat-card">
260
  <h4>Sample Data</h4>
261
+ <div class="number">150</div>
262
  <div class="label">Free conversations available</div>
263
  </div>
264
  </div>
 
279
  </thead>
280
  <tbody>
281
  <tr>
282
+ <td><strong>Psychology multi-turn</strong></td>
283
+ <td>207+</td>
284
+ <td>2,128+</td>
285
+ <td>10.3</td>
286
+ <td>Psychology harmfulness such as self-harm, psychosis, anthropomorphism, etc.</td>
287
  </tr>
288
  <tr>
289
+ <td><strong>Illicit (bioweapon) multi-turn</strong></td>
290
  <td>1,309+</td>
291
  <td>6,784+</td>
292
  <td>5.2</td>
293
+ <td>Bio-safety harmfulness such as bioweapons, pathogens, etc.</td>
294
  </tr>
295
  <tr>
296
+ <td><strong>Illicit (chemical, general) multi-turn</strong></td>
297
  <td>1,308+</td>
298
  <td>8,127+</td>
299
  <td>6.2</td>
300
+ <td>Non-bio safety harmfulness such as chemical weapons, cyber threats, etc.</td>
 
 
 
 
 
 
 
301
  </tr>
302
  </tbody>
303
  </table>
 
308
  <h2>🔗 Access Datasets on Hugging Face</h2>
309
  <div class="dataset-links">
310
  <div class="dataset-card">
311
+ <h4>Psychology Multi-turn Conversations</h4>
312
  <p>Psychology + Illicit baseline conversations<br>
313
  <strong>Sample:</strong> 50 conversations, 390 turns</p>
314
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-conversations" class="btn" target="_blank">View Dataset →</a>
315
  </div>
316
  <div class="dataset-card">
317
+ <h4>Illicit (bioweapon) Multi-turn Conversations</h4>
318
  <p>Direct bio-topic transformation methodology<br>
319
  <strong>Sample:</strong> 50 conversations, 449 turns</p>
320
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v1" class="btn" target="_blank">View Dataset →</a>
321
  </div>
322
  <div class="dataset-card">
323
+ <h4>Illicit (chemical, general) Multi-turn Conversations</h4>
324
  <p>Adaptive bio-topic transformation methodology<br>
325
  <strong>Sample:</strong> 50 conversations, 459 turns</p>
326
  <a href="https://huggingface.co/datasets/GoJulyAI/multi-turn-bio-transformed-synth-conversations-v2" class="btn" target="_blank">View Dataset →</a>
327
  </div>
328
  </div>
329
  </section>
330
 
 
386
  <!-- License -->
387
  <section>
388
  <h2>📄 License</h2>
389
+ <p>Sample datasets are released under <strong>CC-BY-NC-4.0</strong> (Creative Commons Attribution-NonCommercial 4.0 International).</p>
390
  <ul>
391
  <li>✅ Use for research and evaluation</li>
392
  <li>✅ Modify and build upon the data</li>
 
405
  </div>
406
 
407
  <footer>
408
+ <p><strong>Last Updated:</strong> December 2, 2025</p>
409
  <p style="margin-top: 0.5rem;">For detailed documentation, visit the individual dataset repositories on Hugging Face.</p>
410
  </footer>
411
  </div>