---
datasets:
  - XenArcAI/MathX-5M
base_model:
  - google/gemma-3-1b-it
pipeline_tag: text-generation
---

# Model Card: Parveshiiii/M1-MathX

## Model Details

- Model Name: Parveshiiii/M1-MathX
- Base Architecture: Gemma (1B parameters)
- Model Type: Causal Language Model (text-generation)
- Training Framework: Hugging Face Transformers
- Precision: fp16
- Attention Mechanism: Hybrid sliding-window and full attention layers
- Tokenizer: Gemma tokenizer (vocab size 262,144)
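For manual loading, a minimal sketch that matches the fp16 precision listed above (the `device_map="auto"` setting is an assumption and requires the `accelerate` package):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Parveshiiii/M1-MathX")

# fp16 to match the training precision listed above
model = AutoModelForCausalLM.from_pretrained(
    "Parveshiiii/M1-MathX",
    torch_dtype=torch.float16,
    device_map="auto",  # assumes the accelerate package is installed
)
```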

## Usage

```python
from transformers import pipeline, TextStreamer

# Load the model as a text-generation pipeline
pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")

messages = [
    {"role": "user", "content": "Who are you?"},
]

# Stream tokens to stdout as they are generated
streamer = TextStreamer(pipe.tokenizer)
pipe(messages, streamer=streamer, max_new_tokens=10000)
```
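Since the model is tuned for math, a step-by-step math prompt is a more representative workload. An illustrative example, reusing `pipe` and `streamer` from above:

```python
messages = [
    {"role": "user", "content": "Solve step by step: if 3x + 7 = 22, what is x?"},
]
pipe(messages, streamer=streamer, max_new_tokens=512)
```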

## Intended Use

- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.
- Not intended for general-purpose conversation or sensitive domains outside mathematics.

## Training Data

- Dataset: MathX (curated mathematical reasoning dataset)
- Samples Used: ~300
- Training Steps: 50
- Method: GRPO (Group Relative Policy Optimization) fine-tuning; see the sketch below
- Objective: Reinforcement-style alignment for improved reasoning clarity and correctness
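The exact training script is not published. The following is a minimal sketch of a GRPO run with TRL's `GRPOTrainer`; the `prompt`/`answer` column names and the correctness reward are illustrative assumptions, not the author's actual setup:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumption: MathX examples exposed as plain-string "prompt" and "answer"
# columns; the real column names in XenArcAI/MathX-5M may differ.
dataset = load_dataset("XenArcAI/MathX-5M", split="train").select(range(300))

def correctness_reward(completions, answer, **kwargs):
    # Illustrative reward: 1.0 if the reference answer string appears in the
    # completion, else 0.0. A real setup would parse the final answer.
    return [1.0 if ans in comp else 0.0 for comp, ans in zip(completions, answer)]

args = GRPOConfig(output_dir="m1-mathx-grpo", max_steps=50, logging_steps=10)

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",  # the base model listed above
    reward_funcs=correctness_reward,
    args=args,
    train_dataset=dataset,
)
trainer.train()
```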

## Performance

- Shows promising results on small-scale math problems and symbolic reasoning tasks in informal testing.
- Informal checks suggest improved accuracy over the base Gemma 1B model on math-specific prompts.
- Formal evaluation on GSM8K, MATH, and other benchmarks is still needed for quantitative comparison.
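Until formal results are published, a rough spot check against GSM8K can be hand-rolled. This sketch assumes the `openai/gsm8k` dataset layout and uses a deliberately simple final-number extraction, so treat the score as indicative only:

```python
import re
from datasets import load_dataset
from transformers import pipeline

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
gsm8k = load_dataset("openai/gsm8k", "main", split="test").select(range(20))

correct = 0
for ex in gsm8k:
    messages = [{"role": "user", "content": ex["question"]}]
    out = pipe(messages, max_new_tokens=512)[0]["generated_text"]
    reply = out[-1]["content"]  # last message in the conversation is the model's
    gold = ex["answer"].split("####")[-1].strip()  # GSM8K gold answer follows "####"
    nums = re.findall(r"-?\d[\d,]*\.?\d*", reply)
    if nums and nums[-1].replace(",", "") == gold.replace(",", ""):
        correct += 1

print(f"Exact-match on 20 GSM8K problems: {correct}/20")
```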

## Limitations

- Small dataset and limited training steps mean coverage is narrow.
- May overfit to MathX patterns and fail on broader or more complex problems.
- Not guaranteed to generalize outside mathematical reasoning.
- As a 1B model, capacity is limited compared to larger LLMs.

## Ethical Considerations

- Intended for safe educational use.
- Should not be deployed in high-stakes environments without further validation.
- Outputs may contain errors; human oversight is required.

## Citation

If you use this model, please cite as:

```bibtex
@misc{Parvesh2025M1MathX,
  author = {Parvesh Rawal},
  title = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```