Update README.md
Browse files
README.md
CHANGED
|
@@ -1,4 +1,14 @@
|
|
| 1 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2 |
<img src="https://raw.githubusercontent.com/UBC-NLP/marbert/main/ARBERT_MARBERT.jpg" alt="drawing" width="30%" height="30%" align="right"/>
|
| 3 |
|
| 4 |
**ARBERT** is one of three models described in our **ACl 2021 paper** **["ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic"](https://mageed.arts.ubc.ca/files/2020/12/marbert_arxiv_2020.pdf)**. ARBERT is a large-scale pre-trained masked language model focused on Modern Standard Arabic (MSA). To train ARBERT, we use the same architecture as BERT-base: 12 attention layers, each has 12 attention heads and 768 hidden dimensions, a vocabulary of 100K WordPieces, making ∼163M parameters. We train ARBERT on a collection of Arabic datasets comprising **61GB of text** (**6.2B tokens**). For more information, please visit our own GitHub [repo](https://github.com/UBC-NLP/marbert).
|
|
|
|
| 1 |
+
---
|
| 2 |
+
language:
|
| 3 |
+
- ar
|
| 4 |
+
tags:
|
| 5 |
+
- Arabic BERT
|
| 6 |
+
- MSA
|
| 7 |
+
- Twitter
|
| 8 |
+
- Masked Langauge Model
|
| 9 |
+
widget:
|
| 10 |
+
- text: "اللغة العربية هي لغة [MASK]."
|
| 11 |
+
---
|
| 12 |
<img src="https://raw.githubusercontent.com/UBC-NLP/marbert/main/ARBERT_MARBERT.jpg" alt="drawing" width="30%" height="30%" align="right"/>
|
| 13 |
|
| 14 |
**ARBERT** is one of three models described in our **ACl 2021 paper** **["ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic"](https://mageed.arts.ubc.ca/files/2020/12/marbert_arxiv_2020.pdf)**. ARBERT is a large-scale pre-trained masked language model focused on Modern Standard Arabic (MSA). To train ARBERT, we use the same architecture as BERT-base: 12 attention layers, each has 12 attention heads and 768 hidden dimensions, a vocabulary of 100K WordPieces, making ∼163M parameters. We train ARBERT on a collection of Arabic datasets comprising **61GB of text** (**6.2B tokens**). For more information, please visit our own GitHub [repo](https://github.com/UBC-NLP/marbert).
|