Spaces:

chendl
/

multimodal

Runtime error

multimodal / transformers /examples /research_projects /luke /README.md

add transformers

455a40f over 2 years ago

2.08 kB

	# Token classification

	## PyTorch version, no Trainer

	Fine-tuning (m)LUKE for token classification task such as Named Entity Recognition (NER), Parts-of-speech
	tagging (POS) or phrase extraction (CHUNKS). You can easily
	customize it to your needs if you need extra processing on your datasets.

	It will either run on a datasets hosted on our [hub](https://huggingface.co/datasets) or with your own text files for
	training and validation, you might just need to add some tweaks in the data preprocessing.

	The script can be run in a distributed setup, on TPU and supports mixed precision by
	the mean of the [🤗 `Accelerate`](https://github.com/huggingface/accelerate) library. You can use the script normally
	after installing it:

	```bash
	pip install git+https://github.com/huggingface/accelerate
	```

	then to train English LUKE on CoNLL2003:

	```bash
	export TASK_NAME=ner

	python run_luke_ner_no_trainer.py \
	--model_name_or_path studio-ousia/luke-base \
	--dataset_name conll2003 \
	--task_name $TASK_NAME \
	--max_length 128 \
	--per_device_train_batch_size 32 \
	--learning_rate 2e-5 \
	--num_train_epochs 3 \
	--output_dir /tmp/$TASK_NAME/
	```

	You can then use your usual launchers to run in it in a distributed environment, but the easiest way is to run

	```bash
	accelerate config
	```

	and reply to the questions asked. Then

	```bash
	accelerate test
	```

	that will check everything is ready for training. Finally, you can launch training with

	```bash
	export TASK_NAME=ner

	accelerate launch run_ner_no_trainer.py \
	--model_name_or_path studio-ousia/luke-base \
	--dataset_name conll2003 \
	--task_name $TASK_NAME \
	--max_length 128 \
	--per_device_train_batch_size 32 \
	--learning_rate 2e-5 \
	--num_train_epochs 3 \
	--output_dir /tmp/$TASK_NAME/
	```

	This command is the same and will work for:

	- a CPU-only setup
	- a setup with one GPU
	- a distributed training with several GPUs (single or multi node)
	- a training on TPUs

	Note that this library is in alpha release so your feedback is more than welcome if you encounter any problem using it.