Update README.md
README.md
CHANGED
@@ -27,6 +27,9 @@ It uses PPO (Proximal Policy Optimization) to learn 2v2 gameplay through self-pl
 - Ball velocity toward goal
 - Goal scoring reward
 
+
+
+
 ## Training Configuration (from `config.json`)
 
 - **Number of processes:** 4