Add function to download dataset to a specific location. #20
Pull Request Description: Setup Custom Directory for Kaggle Competition Downloads
Summary:
This PR adds a new utility function, setup_comp_directory, that lets users specify a custom directory for downloading and extracting Kaggle competition data. The function also takes an optional argument for installing packages when running in a Kaggle environment.
Key Features:
Custom Directory Support:
Optional Package Installation: In a Kaggle environment (detected via KAGGLE_KERNEL_RUN_TYPE), users can specify a package to install using pip (e.g., fastai), ensuring the environment is properly set up for further analysis.
Automatic Data Download and Extraction:
Function Signature:
Inputs:
competition (str): Name of the Kaggle competition (e.g., 'titanic'). Used to fetch the dataset.
path_to_download (str): Path to the directory where the competition data will be downloaded and extracted.
install (str, optional): Package name to install in the Kaggle environment if required. Defaults to no installation.
Outputs:
Path object pointing to the directory containing the unzipped competition data.
Example Usage:
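A call matching the inputs above could look like the sketch below. Because the function body is not part of this description, the block includes a minimal stand-in implementation so that it runs on its own. The fetch_zip parameter and the _kaggle_fetch_zip helper are hypothetical, the empty-string default for install is an assumption, and the default downloader assumes the official kaggle package and its competition_download_files call.

```python
import os
import subprocess
import sys
import zipfile
from pathlib import Path


def _kaggle_fetch_zip(competition, dest):
    """Hypothetical default downloader; assumes the official `kaggle` package."""
    from kaggle import api  # lazy import: importing kaggle requires API credentials
    api.competition_download_files(competition, path=str(dest))
    # Assumption: the archive is saved as <competition>.zip inside `dest`.
    return Path(dest) / f"{competition}.zip"


def setup_comp_directory(competition, path_to_download, install="",
                         fetch_zip=_kaggle_fetch_zip):
    """Download and extract `competition` under `path_to_download`; return its Path."""
    # Install packages only when running inside a Kaggle kernel.
    if install and os.environ.get("KAGGLE_KERNEL_RUN_TYPE"):
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "-Uqq", *install.split()],
            check=True,
        )

    base = Path(path_to_download)
    base.mkdir(parents=True, exist_ok=True)  # create the directory if it is missing

    target = base / competition
    if not target.exists():  # skip the download when the data is already extracted
        archive = fetch_zip(competition, base)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return target


# Example call (needs Kaggle API credentials when run outside Kaggle):
# path = setup_comp_directory("titanic", "/my/custom/path", install="fastai")
```

Passing the downloader as an argument is only a convenience for trying the sketch offline; the PR itself may call the Kaggle API directly.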
This function will:
Create /my/custom/path if it doesn’t exist.
Download and extract the competition data into /my/custom/path/titanic.
Install fastai if running in a Kaggle environment.
Why is this change needed?
Additional Notes:
The install parameter only affects the behavior in a Kaggle environment, ensuring local environments remain unaffected.
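The note above can be illustrated with a small check on the KAGGLE_KERNEL_RUN_TYPE environment variable. This is a sketch, not the PR's code: the helper name pip_install_command and its environ argument are hypothetical, and "Interactive" is used only as an example value for the variable.

```python
import os
import sys


def pip_install_command(install, environ=None):
    """Return the pip command to run, or None when nothing should be installed."""
    environ = os.environ if environ is None else environ
    in_kaggle = bool(environ.get("KAGGLE_KERNEL_RUN_TYPE"))
    if not (install and in_kaggle):
        return None  # local environment or no package requested
    return [sys.executable, "-m", "pip", "install", "-Uqq", *install.split()]


# Local environment: the variable is absent, so no command is produced.
print(pip_install_command("fastai", environ={}))

# Kaggle environment: the variable is set, so a pip command is produced.
print(pip_install_command("fastai", environ={"KAGGLE_KERNEL_RUN_TYPE": "Interactive"}))
```

Returning the command instead of running it keeps the decision easy to inspect without touching the local Python environment.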