Get Dataset Metadata from Kaggle using API and Python

Metadata provides additional information about a dataset on Kaggle. It includes elements such as the description, title, subtitle, keywords, licenses, total downloads, and others.

The Kaggle API client provides the metadata_get method to get the metadata for a specified dataset. This method returns the metadata as a Python dictionary.

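import os
from pprint import pprint

# Set the credentials before importing the kaggle package,
# so that they are available when the client authenticates.
os.environ['KAGGLE_USERNAME'] = 'YOUR_USERNAME'
os.environ['KAGGLE_KEY'] = 'YOUR_KEY'

from kaggle.api.kaggle_api_extended import KaggleApi

# Dataset owner and dataset name, as in kaggle.com/datasets/uciml/iris
owner = 'uciml'
dataset_name = 'iris'

api = KaggleApi()
api.authenticate()

# Fetch the metadata and print it as a formatted dictionary
metadata = api.metadata_get(owner, dataset_name)
pprint(metadata)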

The code prints the metadata as a formatted dictionary. Part of the output:

{'errorMessage': None,
 'info': {'collaborators': [],
          'data': [],
          'datasetId': 19,
          'datasetSlug': 'iris',
          'description': "The Iris dataset was used in R.A. Fisher's classic "
          ........................
          'title': 'Iris Species',
          'totalDownloads': 196594,
          'totalViews': 806318,
          'totalVotes': 2502,
          'usabilityRating': 0.7941176470588235}}
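Since the result is an ordinary dictionary, individual fields can be read with normal key lookups. The following is a minimal sketch, assuming the response has the shape shown in the output above (an 'errorMessage' key and a nested 'info' dictionary). The helper name summarize_metadata and the sample dictionary are hypothetical and are not part of the Kaggle API; the sample only reuses values from the printed output, so the snippet runs without credentials.

```python
from typing import Any, Dict, Iterable


def summarize_metadata(
    metadata: Dict[str, Any],
    fields: Iterable[str] = ("title", "datasetSlug", "totalDownloads", "usabilityRating"),
) -> Dict[str, Any]:
    """Pick selected fields from a metadata dictionary (hypothetical helper).

    Assumes the layout shown in the output above: a top-level
    'errorMessage' key and the actual metadata under 'info'.
    """
    error = metadata.get("errorMessage")
    if error:
        raise RuntimeError(f"Kaggle returned an error: {error}")

    info = metadata.get("info") or {}
    # Fields that are absent from 'info' are returned as None.
    return {name: info.get(name) for name in fields}


# Sample dictionary built from the values in the output above.
sample = {
    "errorMessage": None,
    "info": {
        "datasetSlug": "iris",
        "title": "Iris Species",
        "totalDownloads": 196594,
        "usabilityRating": 0.7941176470588235,
    },
}

summary = summarize_metadata(sample)
print(summary["title"], summary["totalDownloads"])
```

In the script above, the same helper could be called as summarize_metadata(metadata), provided the dictionary returned by the installed client version has this layout.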

metadata_get method parameters:

No  Parameter     Default value  Description
1.  owner_slug                   Dataset owner.
2.  dataset_slug                 Dataset name.
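Datasets are often referred to by a single string of the form owner/dataset, for example uciml/iris. The sketch below shows one way such a string could be split into the two values that metadata_get expects. The function split_dataset_ref is a hypothetical helper written for illustration; it is not part of the Kaggle API client.

```python
from typing import Tuple


def split_dataset_ref(ref: str) -> Tuple[str, str]:
    """Split 'owner/dataset' into (owner_slug, dataset_slug).

    Hypothetical helper; raises ValueError if the string does not
    contain exactly one owner part and one dataset part.
    """
    owner_slug, sep, dataset_slug = ref.strip().strip("/").partition("/")
    if not sep or not owner_slug or not dataset_slug or "/" in dataset_slug:
        raise ValueError(f"Expected 'owner/dataset', got: {ref!r}")
    return owner_slug, dataset_slug


owner_slug, dataset_slug = split_dataset_ref("uciml/iris")
print(owner_slug, dataset_slug)
```

The resulting values can then be passed on in the same order as in the table, that is api.metadata_get(owner_slug, dataset_slug).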
