Get Dataset Metadata from Kaggle using API and Python

Metadata provides additional information about a dataset on Kaggle. It includes elements such as the description, title, subtitle, keywords, licenses, and total downloads.

The Kaggle API client provides the metadata_get method to get the metadata for a specified dataset. The method returns the metadata as a Python dictionary.

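import os
from pprint import pprint

# Set the credentials before importing the kaggle package,
# so they are available when the client authenticates.
os.environ['KAGGLE_USERNAME'] = 'YOUR_USERNAME'
os.environ['KAGGLE_KEY'] = 'YOUR_KEY'

from kaggle.api.kaggle_api_extended import KaggleApi

# Owner and name of the dataset, as they appear in the dataset URL
owner = 'uciml'
dataset_name = 'iris'

api = KaggleApi()
api.authenticate()

# Fetch the metadata and pretty-print the resulting dictionary
metadata = api.metadata_get(owner, dataset_name)
pprint(metadata)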

The code pretty-prints the metadata dictionary. Part of the output:

{'errorMessage': None,
 'info': {'collaborators': [],
          'data': [],
          'datasetId': 19,
          'datasetSlug': 'iris',
          'description': "The Iris dataset was used in R.A. Fisher's classic "
          ........................
          'title': 'Iris Species',
          'totalDownloads': 196594,
          'totalViews': 806318,
          'totalVotes': 2502,
          'usabilityRating': 0.7941176470588235}}
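Since the result is an ordinary dictionary, individual fields can be read with normal key access. The following is a minimal sketch, assuming the dictionary has the same shape as the output above. The summarize_metadata helper is hypothetical and not part of the Kaggle API client, and the sample dictionary is a hand-copied subset of that output, so the snippet runs without credentials.

```python
def summarize_metadata(metadata):
    """Return a few headline fields from a metadata dictionary.

    Hypothetical helper: assumes the 'errorMessage' and 'info' keys
    seen in the output above.
    """
    # A non-empty errorMessage is treated as a failed request
    if metadata.get('errorMessage'):
        raise RuntimeError(metadata['errorMessage'])

    info = metadata.get('info') or {}
    return {
        'title': info.get('title'),
        'downloads': info.get('totalDownloads'),
        'views': info.get('totalViews'),
        'votes': info.get('totalVotes'),
        'usability': round(info.get('usabilityRating', 0.0), 2),
    }


# Subset of the output shown above, used here instead of a live API call
sample = {
    'errorMessage': None,
    'info': {
        'datasetSlug': 'iris',
        'title': 'Iris Species',
        'totalDownloads': 196594,
        'totalViews': 806318,
        'totalVotes': 2502,
        'usabilityRating': 0.7941176470588235,
    },
}

summary = summarize_metadata(sample)
print(summary['title'], summary['downloads'], summary['usability'])
```

With a live client, the dictionary returned by api.metadata_get could be passed to the helper in place of sample.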

metadata_get method parameters:

No.  Parameter     Default value  Description
1.   owner_slug    -              Dataset owner.
2.   dataset_slug  -              Dataset name.
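Kaggle datasets are often written as a single owner/dataset reference, such as uciml/iris. The sketch below shows one way to split that reference into the two parameters. The split_dataset_ref function is a hypothetical helper, not part of the Kaggle API client, and it only handles the plain owner/dataset form.

```python
def split_dataset_ref(ref):
    """Split an 'owner/dataset' reference into (owner_slug, dataset_slug).

    Hypothetical helper for illustration only.
    """
    owner_slug, sep, dataset_slug = ref.strip().partition('/')
    # Require exactly one slash with non-empty text on both sides
    if not sep or not owner_slug or not dataset_slug or '/' in dataset_slug:
        raise ValueError(f"expected 'owner/dataset', got {ref!r}")
    return owner_slug, dataset_slug


owner_slug, dataset_slug = split_dataset_ref('uciml/iris')
print(owner_slug, dataset_slug)
```

The two values can then be passed on as api.metadata_get(owner_slug, dataset_slug).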
