Kaggle datasets can be provided in various file formats. Kaggle supports CSV, JSON, BigQuery, and SQLite database files. Files can be compressed using ZIP or another common archive format.
The Kaggle API client provides the datasets_list_files method, which returns the list of files in a dataset. The method returns the result as a Python dictionary.
import os
from pprint import pprint

# Set the credentials first so they are available when the client authenticates
os.environ['KAGGLE_USERNAME'] = 'YOUR_USERNAME'
os.environ['KAGGLE_KEY'] = 'YOUR_KEY'

from kaggle.api.kaggle_api_extended import KaggleApi

owner = 'uciml'
dataset_name = 'iris'

api = KaggleApi()
api.authenticate()

# Request the list of files in the uciml/iris dataset
files = api.datasets_list_files(owner, dataset_name)
pprint(files)
Part of the output:
{'datasetFiles': [{'columns': [.............],
'creationDate': '2016-09-27T07:38:05.44Z',
'datasetRef': 'uciml/iris',
'description': 'SQLite database containing the same data as '
'Iris.csv',
'fileType': '.csv',
'name': 'Iris.csv',
........................
{'columns': [],
'creationDate': '2016-09-27T07:38:05.44Z',
'datasetRef': 'uciml/iris',
'description': 'SQLite database containing the same data as '
'Iris.csv',
'fileType': '.sqlite',
'name': 'database.sqlite',
........................
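The returned dictionary can be reduced to just the fields of interest. The following is a minimal sketch, not part of the Kaggle client: summarize_files is a hypothetical helper, and the sample dictionary is a trimmed stand-in that assumes only the 'datasetFiles', 'name' and 'fileType' keys visible in the output above.

```python
def summarize_files(response):
    """Return (name, fileType) pairs from a datasets_list_files-style dictionary."""
    # Fall back to an empty list if the 'datasetFiles' key is missing
    files = response.get('datasetFiles', [])
    return [(f.get('name'), f.get('fileType')) for f in files]


# Hypothetical sample, trimmed from the output shown above
sample = {
    'datasetFiles': [
        {'name': 'Iris.csv', 'fileType': '.csv', 'datasetRef': 'uciml/iris'},
        {'name': 'database.sqlite', 'fileType': '.sqlite', 'datasetRef': 'uciml/iris'},
    ]
}

for name, file_type in summarize_files(sample):
    print(f'{name} ({file_type})')
```

With a real response, the files variable from the example above would be passed in place of sample.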
Parameters of the datasets_list_files method:
No | Parameter | Default value | Description |
---|---|---|---|
1. | owner_slug | - | Dataset owner. |
2. | dataset_slug | - | Dataset name. |
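Dataset references such as 'uciml/iris' (the 'datasetRef' value in the output) combine both parameters in the form owner/dataset. As a rough sketch, a reference can be split into the two slugs before calling the method. The split_dataset_ref function is a hypothetical helper written for illustration, not part of the Kaggle API client.

```python
def split_dataset_ref(dataset_ref):
    """Split an 'owner/dataset' reference into (owner_slug, dataset_slug)."""
    owner_slug, sep, dataset_slug = dataset_ref.partition('/')
    # Reject references without exactly one separator or with empty parts
    if not sep or not owner_slug or not dataset_slug or '/' in dataset_slug:
        raise ValueError(f'Expected "owner/dataset", got: {dataset_ref!r}')
    return owner_slug, dataset_slug


owner_slug, dataset_slug = split_dataset_ref('uciml/iris')
print(owner_slug, dataset_slug)
```

The two resulting values can then be passed as the owner_slug and dataset_slug arguments.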