# Model Card User Input Analysis ## Overview This document analyzes the interaction between the model card template (`templates/model_card.md`), the model card generator (`scripts/model_tonic/generate_model_card.py`), and the launch script (`launch.sh`) to identify variables that require user input and improve the user experience. ## Template Variables Analysis ### Variables in `templates/model_card.md` The model card template uses the following variables that can be populated with user input: #### Core Model Information - `{{model_name}}` - Display name of the model - `{{model_description}}` - Brief description of the model - `{{repo_name}}` - Hugging Face repository name - `{{base_model}}` - Base model used for fine-tuning #### Training Configuration - `{{training_config_type}}` - Type of training configuration used - `{{trainer_type}}` - Type of trainer (SFT, DPO, etc.) - `{{batch_size}}` - Training batch size - `{{gradient_accumulation_steps}}` - Gradient accumulation steps - `{{learning_rate}}` - Learning rate used - `{{max_epochs}}` - Maximum number of epochs - `{{max_seq_length}}` - Maximum sequence length #### Dataset Information - `{{dataset_name}}` - Name of the dataset used - `{{dataset_size}}` - Size of the dataset - `{{dataset_format}}` - Format of the dataset - `{{dataset_sample_size}}` - Sample size (for lightweight configs) #### Training Results - `{{training_loss}}` - Final training loss - `{{validation_loss}}` - Final validation loss - `{{perplexity}}` - Model perplexity #### Infrastructure - `{{hardware_info}}` - Hardware used for training - `{{experiment_name}}` - Name of the experiment - `{{trackio_url}}` - Trackio monitoring URL - `{{dataset_repo}}` - HF Dataset repository #### Author Information - `{{author_name}}` - Author name for citations and attribution - `{{model_name_slug}}` - URL-friendly model name #### Quantization - `{{quantized_models}}` - Boolean indicating if quantized models exist ## User Input Requirements ### Previously Missing User Inputs #### 1. **Author Name** (`author_name`) - **Purpose**: Used in model card metadata and citations - **Template Usage**: `{{#if author_name}}author: {{author_name}}{{/if}}` - **Citation Usage**: `author={{{author_name}}}` - **Default**: "Your Name" - **User Input Added**: ✅ **IMPLEMENTED** #### 2. **Model Description** (`model_description`) - **Purpose**: Brief description of the model's capabilities - **Template Usage**: `{{model_description}}` - **Default**: "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." - **User Input Added**: ✅ **IMPLEMENTED** ### Variables That Don't Need User Input Most variables are automatically populated from: - **Training Configuration**: Batch size, learning rate, epochs, etc. - **System Detection**: Hardware info, model size, etc. - **Auto-Generation**: Repository names, experiment names, etc. - **Training Results**: Loss values, perplexity, etc. ## Implementation Changes ### 1. Launch Script Updates (`launch.sh`) #### Added User Input Prompts ```bash # Step 8.2: Author Information for Model Card print_step "Step 8.2: Author Information" echo "=================================" print_info "This information will be used in the model card and citation." get_input "Author name for model card" "$HF_USERNAME" AUTHOR_NAME print_info "Model description will be used in the model card and repository." get_input "Model description" "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities." MODEL_DESCRIPTION ``` #### Updated Configuration Summary ```bash echo " Author: $AUTHOR_NAME" ``` #### Updated Model Push Call ```bash python scripts/model_tonic/push_to_huggingface.py /output-checkpoint "$REPO_NAME" \ --token "$HF_TOKEN" \ --trackio-url "$TRACKIO_URL" \ --experiment-name "$EXPERIMENT_NAME" \ --dataset-repo "$TRACKIO_DATASET_REPO" \ --author-name "$AUTHOR_NAME" \ --model-description "$MODEL_DESCRIPTION" ``` ### 2. Push Script Updates (`scripts/model_tonic/push_to_huggingface.py`) #### Added Command Line Arguments ```python parser.add_argument('--author-name', type=str, default=None, help='Author name for model card') parser.add_argument('--model-description', type=str, default=None, help='Model description for model card') ``` #### Updated Class Constructor ```python def __init__( self, model_path: str, repo_name: str, token: Optional[str] = None, private: bool = False, trackio_url: Optional[str] = None, experiment_name: Optional[str] = None, dataset_repo: Optional[str] = None, hf_token: Optional[str] = None, author_name: Optional[str] = None, model_description: Optional[str] = None ): ``` #### Updated Model Card Generation ```python variables = { "model_name": f"{self.repo_name.split('/')[-1]} - Fine-tuned SmolLM3", "model_description": self.model_description or "A fine-tuned version of SmolLM3-3B for improved text generation and conversation capabilities.", # ... other variables "author_name": self.author_name or training_config.get('author_name', 'Your Name'), } ``` ## User Experience Improvements ### 1. **Interactive Prompts** - Users are now prompted for author name and model description - Default values are provided for convenience - Clear explanations of what each field is used for ### 2. **Configuration Summary** - Author name is now displayed in the configuration summary - Users can review all settings before proceeding ### 3. **Automatic Integration** - User inputs are automatically passed to the model card generation - No manual editing of scripts required ## Template Variable Categories ### Automatic Variables (No User Input Needed) - `repo_name` - Auto-generated from username and date - `base_model` - Always "HuggingFaceTB/SmolLM3-3B" - `training_config_type` - From user selection - `trainer_type` - From user selection - `batch_size`, `learning_rate`, `max_epochs` - From training config - `hardware_info` - Auto-detected - `experiment_name` - Auto-generated with timestamp - `trackio_url` - Auto-generated from space name - `dataset_repo` - Auto-generated - `training_loss`, `validation_loss`, `perplexity` - From training results ### User Input Variables (Now Implemented) - `author_name` - ✅ **Added user prompt** - `model_description` - ✅ **Added user prompt** ### Conditional Variables - `quantized_models` - Set automatically based on quantization choices - `dataset_sample_size` - Set based on training configuration type ## Benefits of These Changes ### 1. **Better Attribution** - Author names are properly captured and used in citations - Model cards include proper attribution ### 2. **Customizable Descriptions** - Users can provide custom model descriptions - Better model documentation and discoverability ### 3. **Improved User Experience** - No need to manually edit scripts - Interactive prompts with helpful defaults - Clear feedback on what information is being collected ### 4. **Consistent Documentation** - All model cards will have proper author information - Standardized model descriptions - Better integration with Hugging Face Hub ## Future Enhancements ### Potential Additional User Inputs 1. **License Selection** - Allow users to choose model license 2. **Model Tags** - Custom tags for better discoverability 3. **Usage Examples** - Custom usage examples for specific use cases 4. **Limitations Description** - Custom limitations based on training data ### Template Improvements 1. **Dynamic License** - Support for different license types 2. **Custom Tags** - User-defined model tags 3. **Usage Scenarios** - Template sections for different use cases ## Testing The changes have been tested to ensure: - ✅ Author name is properly passed to model card generation - ✅ Model description is properly passed to model card generation - ✅ Default values work correctly - ✅ Configuration summary displays new fields - ✅ Model push script accepts new parameters ## Conclusion The analysis identified that the model card template had two key variables (`author_name` and `model_description`) that would benefit from user input. These have been successfully implemented with: 1. **Interactive prompts** in the launch script 2. **Command line arguments** in the push script 3. **Proper integration** with the model card generator 4. **User-friendly defaults** and clear explanations This improves the overall user experience and ensures that model cards have proper attribution and descriptions.