This is a continuation of my previous post The Gridded National Soil Survey Geographic Database & Microsoft’s Planetary Computer (part 1) Part one shows how to access meta data to create a data dictionary for gNATSGO tabular datasets.
Using the data dictionary to identify tables of interest I created the following function and python notebook code to extract tabular data from the following tables to analyze crop yield in the Blackland Prairie region of texas.
sacatalog
legend
mapunit
mucropyield
View and download code from my Github Repository
# Import required libraries
import planetary_computer
import numpy as np
import pandas as pd
import rioxarray
import xarray
import pystac_client
# Create planetary computer client
catalog = pystac_client.Client.open(
“https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
# Empty Dataframe
df_dictionary = pd.DataFrame(columns = [‘name’,’type’,’description’,’table_name’])
# Search catalog for gnatsgo table collection
search = catalog.search(
collections=[“gnatsgo-tables”]
)
# Function to read and export specified table and column list from gNATSGO data sets
def getTableDataAsCsv(table_name, table_key, column_list, csv_file_name):
catalog_items = catalog.get_collection("gnatsgo-tables").get_item(table_name)
table_data = pd.read_parquet(
catalog_items.assets["data"].href,
columns= column_list,
storage_options= catalog_items.assets["data"].extra_fields["table:storage_options"],
engine="pyarrow",
)
if table_key > "":
table_data = table_data.set_index(table_key)
df = pd.DataFrame(table_data)
pd.set_option('display.max_colwidth', 500)
df.to_csv(csv_file_name)
# Example function use for mapunit table
columns = [“mukey”, “lkey”, “musym”, “muname”, “mukind”, “mustatus”, “muacres”, “mapunitlfw_l”, “mapunitlfw_r”, “mapunitlfw_h”, “mapunitpfa_l”, “mapunitpfa_r”, “mapunitpfa_h”, “farmlndcl”, “muhelcl”, “muwathelcl”, “muwndhelcl”, “interpfocus”, “invesintens”, “iacornsr”, “nhiforsoigrp”, “nhspiagr”, “vtsepticsyscl”, “mucertstat”]
getTableDataAsCsv("mapunit", "mukey", columns, "mapunit.csv")
After using the above notebook to extract data from the tables of interest, the data was loaded to Power BI and the following relationships were created.

Related Posts
Part 1: The Gridded National Soil Survey Database - Reading meta-data to create a data dictionary
Part 3: Blackland Prairie Crop Yields (Power BI Visualizations)
Planetary Computer gNATSGO Table Collection
- gNatsgo Tabular Data: List of tables in the collection
- gNatsgo Rasters: Description of Raster Data in the colleciton
Microsoft Planetary Computer, Soil Survey, gNATSGO — Jan 13, 2023