Update README.md
README.md CHANGED
@@ -11,6 +11,8 @@ tags:
 
 # Llama-3.2-1B-Instruct-FlashHead
 
+
+
 **Optimized version of Llama-3.2-1B-Instruct using FlashHead, Embedl’s efficient replacement for the language model head, reducing size while preserving accuracy.**
 Designed for **low-latency inference** on **NVIDIA RTX GPUs**, leveraging:
 