@robbibt
Last active March 31, 2025 01:47
Check ODC dataset counts (e.g. "final", "nrt", "interim")
import pandas as pd
import datacube

dc = datacube.Datacube()

# Get table of dataset counts
product_dict = {
    product: {
        maturity: dc.index.datasets.count(product=product, dataset_maturity=maturity)
        for maturity in ["final", "nrt", "interim"]
    }
    for product in ["ga_s2am_ard_3", "ga_s2bm_ard_3", "ga_s2cm_ard_3"]
}
pd.DataFrame.from_dict(product_dict).T

# Get exact datasets
id_list = list(dc.index.datasets.search_returning(["id"], product="ga_s2bm_ard_3", dataset_maturity="final"))
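The counting loop above can be wrapped in a small reusable helper. This is only a sketch: `count_by_maturity` and its parameter names are hypothetical and not part of the datacube API. It assumes only that the object passed in has a `count(**search_params)` method, as `dc.index.datasets` does in the snippet above.

```python
def count_by_maturity(datasets, products, maturities=("final", "nrt", "interim")):
    """Return {product: {maturity: count}} for each product/maturity pair.

    `datasets` is assumed to be something like `dc.index.datasets`, i.e. any
    object exposing `count(**search_params)`. The helper itself is hypothetical.
    """
    return {
        product: {
            maturity: datasets.count(product=product, dataset_maturity=maturity)
            for maturity in maturities
        }
        for product in products
    }


# Usage against a real index (requires a configured datacube database):
# import datacube
# dc = datacube.Datacube()
# counts = count_by_maturity(dc.index.datasets, ["ga_s2am_ard_3", "ga_s2bm_ard_3"])
```

The returned nested dict has the same shape as `product_dict` above, so it can be passed to `pd.DataFrame.from_dict(...).T` in the same way.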

robbibt commented Mar 31, 2025

From @jeremyh:

# The methods mirror each other in arguments (but return different things):

# Return Dataset objects
dc.index.datasets.search(**search_params)
# Return a count
dc.index.datasets.count(**search_params)
# Return a tuple of the fields I ask for (by name)
dc.index.datasets.search_returning(fields, **search_params)

# Example: I only want the UUID and landsat_product_id for each dataset
for dataset_id, landsat_product_id in dc.index.datasets.search_returning(
    ("id", "landsat_product_id"), product="usgs_ls8c_level1_2", landsat_data_type="L1TP"
):
    print(dataset_id)
    print(landsat_product_id)
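Because the methods accept the same search terms, one set of keyword arguments can drive both a count and an ID lookup. The helper below is a hypothetical sketch (`count_and_ids` is not a datacube function). It assumes `search_returning` yields tuple-like rows with one element per requested field, as the example above suggests.

```python
def count_and_ids(datasets, **search_params):
    """Run count() and search_returning() with one shared set of search terms.

    `datasets` is assumed to behave like `dc.index.datasets`. This helper is
    illustrative only and is not part of the datacube library.
    """
    total = datasets.count(**search_params)
    # Each row is assumed to be tuple-like; with a single requested field,
    # the dataset ID is the first (and only) element.
    ids = [row[0] for row in datasets.search_returning(("id",), **search_params)]
    return total, ids


# Usage against a real index (requires a configured datacube database):
# import datacube
# dc = datacube.Datacube()
# total, ids = count_and_ids(
#     dc.index.datasets, product="ga_s2bm_ard_3", dataset_maturity="final"
# )
```

Unpacking `row[0]` returns bare IDs, whereas `list(search_returning(["id"], ...))` on its own returns a list of one-element rows.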
