Kaffae Day 303 - Move Training To Huggingface Transformers GPT2
Author: Masatoshi Nishimura
Uploaded: 2020-12-15
Views: 20
Description:
Today...
I was trying to get Hugging Face Transformers GPT-2 training working, and I finally did. The problem now is that training takes so long to complete: with the 140 MB dataset, it takes more than 8 days, and that's just for the medium-sized pretrained model.
It only runs at a batch size of 4. I'm not sure how it works in GPT Simple, but there it easily runs at a batch size of 32. GPT Simple maxes out the RAM during training, while Transformers doesn't seem to utilize all the RAM available. Maybe that's because of their difference: one is based on TensorFlow and the other on PyTorch. I'll need to compare the results: if the output is just as good on GPT Simple, I'll stick with that.
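The batch-size gap above roughly accounts for the slowdown: at the same time per optimizer step, a batch size of 4 means 8× as many steps per epoch as a batch size of 32. A minimal sketch of that arithmetic (the example count is hypothetical, standing in for the 140 MB dataset; only the batch sizes come from the runs described here):

```python
import math

def steps_per_epoch(n_examples: int, batch_size: int) -> int:
    # One optimizer step per batch; a final partial batch still counts as a step.
    return math.ceil(n_examples / batch_size)

# Hypothetical number of training examples in the dataset.
n_examples = 100_000

transformers_steps = steps_per_epoch(n_examples, 4)   # batch size 4 (Transformers run)
simple_steps = steps_per_epoch(n_examples, 32)        # batch size 32 (GPT Simple run)

print(transformers_steps)                  # 25000
print(simple_steps)                        # 3125
print(transformers_steps / simple_steps)   # 8.0
```

If per-step wall time were comparable, that ratio alone would turn a one-day run into roughly the 8-day run seen here, though in practice per-step time also grows with batch size.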