fastkaggle.core

API details for fastkaggle.

import_kaggle

 import_kaggle ()

Import the Kaggle API, using the Kaggle secrets kaggle_username and kaggle_key if needed

api = import_kaggle()
L(api.competitions_list())
(#20) [contradictory-my-dear-watson,gan-getting-started,store-sales-time-series-forecasting,tpu-getting-started,digit-recognizer,titanic,house-prices-advanced-regression-techniques,connectx,nlp-getting-started,spaceship-titanic...]

setup_comp

 setup_comp (competition, install='')

Get a path to data for competition, downloading it if needed

setup_comp('titanic')
Path('titanic')

If you pass a space-separated list of modules as install, they’ll be installed if running on Kaggle.
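
For example, a minimal sketch (the package names below are placeholders for whatever your notebook needs):

# on Kaggle, the listed packages are also installed; locally only the data download happens
comp_path = setup_comp('titanic', install='fastai timm')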


nb_meta

 nb_meta (user, id, title, file, competition=None, private=True,
          gpu=False, internet=True, linked_datasets=None)

Get the dict required for a kernel-metadata.json file

nb_meta('jhoward', 'my-notebook', 'My notebook', 'my-notebook.ipynb', competition='paddy-disease-classification')
{'id': 'jhoward/my-notebook',
 'title': 'My notebook',
 'code_file': 'my-notebook.ipynb',
 'language': 'python',
 'kernel_type': 'notebook',
 'is_private': True,
 'enable_gpu': False,
 'enable_internet': True,
 'keywords': [],
 'dataset_sources': [],
 'kernel_sources': [],
 'competition_sources': ['competitions/paddy-disease-classification']}

push_notebook

 push_notebook (user, id, title, file, path='.', competition=None,
                private=True, gpu=False, internet=True,
                linked_datasets=None)

Push notebook file to Kaggle Notebooks

Note that Kaggle recommends that the id match the slug for the title, i.e. it should be the same as the title but lowercase, with no punctuation, and with spaces replaced by dashes. E.g.:

push_notebook('jhoward', 'first-steps-road-to-the-top-part-1',
              title='First Steps: Road to the Top, Part 1',
              file='first-steps-road-to-the-top-part-1.ipynb',
              competition='paddy-disease-classification',
              private=False, gpu=True)

Datasets

Core


check_ds_exists

 check_ds_exists (dataset_slug)

Checks if a dataset exists on Kaggle and returns a boolean

Details
dataset_slug: Dataset slug (i.e. "zillow/zecon")
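
A short sketch of using this check before creating or updating a dataset (the slug is hypothetical):

slug = 'jhoward/mytestds'  # hypothetical user/dataset slug
if check_ds_exists(slug): print(f'{slug} already exists on Kaggle')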

mk_dataset

 mk_dataset (dataset_path, title, force=False, upload=True)

Creates the minimal dataset metadata needed to push a new dataset to Kaggle

Details
dataset_path: Local path to create the dataset in
title: Name of the dataset
force (bool, default False): Should it overwrite or error if the dataset exists?
upload (bool, default True): Should it upload and create the dataset on Kaggle?
import json
mk_dataset('./testds','mytestds',force=True)
md = json.load(open('./testds/dataset-metadata.json'))
assert md['title'] == 'mytestds'
assert md['id'].endswith('/mytestds')
Data package template written to: testds/dataset-metadata.json

get_dataset

 get_dataset (dataset_path, dataset_slug, unzip=True, force=False)

Downloads an existing dataset and its metadata from Kaggle

Details
dataset_path: Local path to download the dataset to
dataset_slug: Dataset slug (i.e. "zillow/zecon")
unzip (bool, default True): Should it unzip after downloading?
force (bool, default False): Should it overwrite or error if dataset_path exists?
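
For example, a hypothetical download of the "zillow/zecon" dataset into a local folder:

# downloads and unzips into ./zecon, overwriting an existing copy if present
get_dataset('./zecon', 'zillow/zecon', force=True)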

get_pip_library

 get_pip_library (dataset_path, pip_library, pip_cmd='pip')

Download the whl files for pip_library and store in dataset_path

Details
dataset_path: Local path to download the pip library to
pip_library: Name of the library for pip to install
pip_cmd (str, default 'pip'): pip base to use (i.e. "pip3" or "pip")

get_pip_libraries

 get_pip_libraries (dataset_path, requirements_path, pip_cmd='pip')

Download whl files for a requirements.txt file and store in dataset_path

Details
dataset_path: Local path to download the pip libraries to
requirements_path: Path to the requirements file
pip_cmd (str, default 'pip'): pip base to use (i.e. "pip3" or "pip")
dl_path = Path('./mylib')
get_pip_library(dl_path,'fastkaggle')
assert 1==len([o for o in dl_path.ls() if str(o).startswith(f"{dl_path}/fastkaggle")])
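
get_pip_libraries works the same way, but reads the package list from a requirements file. A minimal sketch, assuming a requirements.txt exists in the working directory (the download path is arbitrary):

# download wheels for every entry in requirements.txt into ./reqlibs
get_pip_libraries(Path('./reqlibs'), 'requirements.txt')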

push_dataset

 push_dataset (dataset_path, version_comment)

Push a dataset update to Kaggle. The dataset path must contain the dataset metadata file.

Details
dataset_path: Local path where dataset is stored
version_comment: Comment associated with this dataset update
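
For example, to push a new version of the dataset created in the mk_dataset example above (the comment text is arbitrary):

# ./testds contains the dataset-metadata.json written by mk_dataset
push_dataset('./testds', 'Added new files')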

get_local_ds_ver

 get_local_ds_ver (lib_path, lib)

Checks a local copy of a Kaggle dataset for the library version number

Details
lib_path: Local path the dataset is stored in
lib: Name of the library (i.e. "fastcore")
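
A sketch of reading the version of a library stored in a local dataset copy (the path is illustrative):

# version string parsed from the local copy of the fastcore dataset
ver = get_local_ds_ver(Path('./libs'), 'fastcore')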

High Level


create_libs_datasets

 create_libs_datasets (libs, lib_path, username, clear_after=False)

For each library, create or update a kaggle dataset with the latest version

Details
libs: Library or list of libraries to create datasets for (i.e. 'fastcore' or ['fastcore','fastkaggle'])
lib_path: Local path to download/create the datasets in
username: Your username
clear_after (bool, default False): Delete local copies after syncing with Kaggle?
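
For example, a hypothetical run that keeps per-library datasets up to date (the username and library list are placeholders):

# creates or updates one Kaggle dataset per library, then deletes the local copies
create_libs_datasets(['fastcore','fastkaggle'], Path('./libs'), 'jhoward', clear_after=True)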

create_requirements_dataset

 create_requirements_dataset (req_fpath, lib_path, title, username,
                              retain=['dataset-metadata.json'],
                              version_notes='New Update')

Download everything listed in a requirements.txt file to a dataset and upload it to Kaggle

Details
req_fpath: Path to the requirements.txt file
lib_path: Local path to download/create the dataset in
title: Title you want the Kaggle dataset named
username: Your username
retain (list, default ['dataset-metadata.json']): Files that should not be removed
version_notes (str, default 'New Update')
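
A minimal sketch, assuming a requirements.txt in the working directory (the title, local path, and username are placeholders):

# builds a dataset of wheels for everything in requirements.txt and pushes it to Kaggle
create_requirements_dataset('requirements.txt', Path('./reqds'), 'My Project Requirements', 'jhoward')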