Improve model card: Add GitHub code link and sample usage

#1
opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +56 -16
README.md CHANGED
@@ -1,18 +1,18 @@
  ---
- pipeline_tag: text-generation
  library_name: transformers
  license: cc-by-nc-4.0
  tags:
  - text-to-sql
  - reinforcement-learning
  ---

-
  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

  ### Important Links

  📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
  🤗[HuggingFace](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
  🤖[ModelScope](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 
@@ -59,25 +59,65 @@ Performance Comparison of different Text-to-SQL methods on BIRD dev and test dat

  | **Model** | Base Model | Train Method | Modelscope | HuggingFace |
  |------------------------------------------|------------------------------|--------------|---------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|
- | SLM-SQL-Base-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.5B) |
- | SLM-SQL-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.5B) |
- | CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct) |
- | SLM-SQL-Base-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.5B) |
- | SLM-SQL-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.5B) |
- | CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) |
- | SLM-SQL-Base-0.6B | Qwen3-0.6B | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.6B) |
- | SLM-SQL-0.6B | Qwen3-0.6B | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.6B) |
- | SLM-SQL-Base-1.3B | deepseek-coder-1.3b-instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.3B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.3B) |
- | SLM-SQL-1.3B | deepseek-coder-1.3b-instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.3B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.3B) |
- | SLM-SQL-Base-1B | Llama-3.2-1B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1B) |

  ## Dataset

  | **Dataset** | Modelscope | HuggingFace |
  |----------------------------|------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------|
- | SynsQL-Think-916k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Think-916k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Think-916k) |
- | SynsQL-Merge-Think-310k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Merge-Think-310k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Merge-Think-310k) |
- | bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |

  ## TODO

 
  ---
  library_name: transformers
  license: cc-by-nc-4.0
+ pipeline_tag: text-generation
  tags:
  - text-to-sql
  - reinforcement-learning
  ---

  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL

  ### Important Links

  📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
+ 💻[GitHub Repository](https://github.com/CycloneBoy/slm_sql) |
  🤗[HuggingFace](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
  🤖[ModelScope](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |

  | **Model** | Base Model | Train Method | Modelscope | HuggingFace |
  |------------------------------------------|------------------------------|--------------|---------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|
+ | SLM-SQL-Base-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.5B) |
+ | SLM-SQL-0.5B | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.5B) |
+ | CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct | Qwen2.5-Coder-0.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct) |
+ | SLM-SQL-Base-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.5B) |
+ | SLM-SQL-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.5B) |
+ | CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) |
+ | SLM-SQL-Base-0.6B | Qwen3-0.6B | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.6B) |
+ | SLM-SQL-0.6B | Qwen3-0.6B | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.6B) |
+ | SLM-SQL-Base-1.3B | deepseek-coder-1.3b-instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.3B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.3B) |
+ | SLM-SQL-1.3B | deepseek-coder-1.3b-instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.3B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.3B) |
+ | SLM-SQL-Base-1B | Llama-3.2-1B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1B) |

  ## Dataset

  | **Dataset** | Modelscope | HuggingFace |
  |----------------------------|------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------|
+ | SynsQL-Think-916k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Think-916k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Think-916k) |
+ | SynsQL-Merge-Think-310k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Merge-Think-310k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Merge-Think-310k) |
+ | bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |
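
The rows above link to the released datasets; the sketch below shows one way to pull one of them with the Hugging Face `datasets` library. The split name is an assumption, so check the individual dataset cards for the actual configurations and fields.

```python
from datasets import load_dataset

# Hypothetical split name; inspect the dataset card for the real splits and fields.
ds = load_dataset("cycloneboy/SynsQL-Merge-Think-310k", split="train")

# Print one example to see the question/SQL fields provided by the dataset.
print(ds[0])
```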
+
+ ## Sample Usage
+
+ You can easily load the model and tokenizer using the Hugging Face `transformers` library to perform text-to-SQL generation.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Replace with the specific model you want to use, e.g., "cycloneboy/SLM-SQL-0.5B"
+ model_id = "cycloneboy/SLM-SQL-0.5B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,  # Adjust as needed (e.g., torch.float16 or remove for auto)
+     device_map="auto"
+ )
+
+ # Example natural language query for SQL generation
+ query = "Find the names of all employees who work in the 'Sales' department."
+
+ # Prepare the prompt using the model's chat template
+ chat_messages = [{"role": "user", "content": query}]
+ prompt = tokenizer.apply_chat_template(chat_messages, tokenize=False, add_generation_prompt=True)
+
+ # Generate the SQL query
+ model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
+ generated_ids = model.generate(
+     model_inputs.input_ids,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9
+ )
+
+ # Decode and print only the newly generated tokens (the SQL)
+ generated_text = tokenizer.batch_decode(generated_ids[:, model_inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
+ print(generated_text)
+ ```
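
In practice the prompt should also contain the database schema; a question alone is usually not enough for a text-to-SQL model to produce a runnable query. The exact prompt layout used to train SLM-SQL is not described here (see the GitHub repository for the training prompts), but a common, hypothetical pattern is to prepend the relevant `CREATE TABLE` statements to the question:

```python
# Hypothetical prompt construction for illustration only; the training-time
# format in the SLM-SQL repository may differ.
schema = """CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department TEXT
);"""
question = "Find the names of all employees who work in the 'Sales' department."

# Combine schema and question into a single user message for the chat template.
query = (
    f"Database schema:\n{schema}\n\n"
    f"Question: {question}\n"
    "Return only the SQL query."
)
chat_messages = [{"role": "user", "content": query}]
```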

  ## TODO