[BUG]: UnboundLocalError: cannot access local variable 'default_conversation' where it is not associated with a value #5930

Closed
1 task done
zhurunhua opened this issue Jul 19, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@zhurunhua
Contributor

Is there an existing issue for this bug?

  • I have searched the existing issues

🐛 Describe the bug

I got an error when running applications/Colossal-LLaMA/prepare_sft_dataset.py.

The command is:
python /mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py \
    --data_input_dirs "/mnt/data/dataset/llama3/prepare/original/2000items" \
    --tokenizer_dir "/mnt/data/model/modelscope/Meta-Llama-3-8B-Instruct" \
    --data_output_dirs "/mnt/data/dataset/llama3/prepare/2000items-llama3" \
    --max_length 1024 \
    --num_spliced_dataset_bins 10 \
    --llama_version 3

The error is:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[07/19/24 16:52:06] INFO colossalai - colossalai - INFO: /mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py:102 main
INFO colossalai - colossalai - INFO: Start to process part-0/10 of all original datasets.
Traceback (most recent call last):
  File "/mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py", line 147, in <module>
    main()
  File "/mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py", line 106, in main
    "tokenizer": tokenizer,
                 ^^^^^^^^^
UnboundLocalError: cannot access local variable 'default_conversation' where it is not associated with a value

I've fixed this bug and will submit a PR soon.
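
For context, this error generally means a local variable was read on a code path where it was never assigned. The sketch below reproduces that pattern under an assumption: that the script assigns default_conversation only for some values of --llama_version. That assumption is not confirmed against the actual source. The function names and template strings are hypothetical stand-ins, not ColossalAI code.

```python
# Hypothetical stand-ins for conversation templates; not the real ColossalAI objects.
LLAMA2_TEMPLATE = "llama2-template"
LLAMA3_TEMPLATE = "llama3-template"


def pick_conversation_broken(llama_version: int) -> str:
    # The local name is bound only when llama_version == 2 ...
    if llama_version == 2:
        default_conversation = LLAMA2_TEMPLATE
    # ... so any other version reaches this line with the name still unbound.
    return default_conversation


def pick_conversation_fixed(llama_version: int) -> str:
    # Every supported version binds the name; unsupported versions fail loudly.
    if llama_version == 2:
        default_conversation = LLAMA2_TEMPLATE
    elif llama_version == 3:
        default_conversation = LLAMA3_TEMPLATE
    else:
        raise ValueError(f"unsupported llama_version: {llama_version}")
    return default_conversation


try:
    pick_conversation_broken(3)
except UnboundLocalError as exc:
    print(f"broken: {exc}")

print(f"fixed: {pick_conversation_fixed(3)}")
```

Raising ValueError in the final branch is one possible design choice: an unsupported version then fails with a clear message instead of an UnboundLocalError further down.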

Environment

● OS: Ubuntu 22.04
● CPU: 96 cores
● RAM: 736 GiB
● GPU: 8 × NVIDIA V100 (32 GB)
● Python 3.11.5
● ColossalAI 0.4.0
● CUDA 11.8
● PyTorch 2.1.0+cu118

@zhurunhua zhurunhua added the bug Something isn't working label Jul 19, 2024
@zhurunhua
Contributor Author

Fixed.

@zhurunhua
Contributor Author

#5931

@TongLi3701
Member

Thanks for your contribution.
