Configuration of pydggsapi
This section introduces the configuration setup for publishing collections with pydggsapi.
pydggsapi uses TinyDB to store all information it needs in three tables:
collections - the table to store collection info
dggrs - the table to store dggrs providers info
collection_providers - the table to store collection providers’ info
Generally, users mostly work with the collections table to publish collections. Each record defines a collection: its metadata, how to access the data (the collection provider) and which DGGRS it supports (the DGGRS provider). To publish a collection through pydggsapi, users therefore need to provide the following details:
A DGGS-ready dataset, i.e. a dataset that has already been converted/regridded into one of the DGGRS supported by pydggsapi.
The dggrs ID of that DGGRS.
The ID of a collection provider supported by pydggsapi that is used to access the data.
Developers implementing new DGGRS and collection providers must register them in the dggrs or collection_providers table with a unique ID so that they can be referenced from the collections table.
{
"collections": {
"1": {
"suitability_hytruck": {
"title": "Suitability Modelling for Hytruck",
"description": "Suitability analysis using IGEO7 DGGRS for the Hytruck project. Datasource stored in a ClickHouse DB. Around 3M rows at refinement level 9",
"extent": {"spatial": {"bbox": [[5.86307954788208, 47.31793212890625, 31.61196517944336, 70.0753173828125]]}},
"collection_provider": {
"providerId": "clickhouse",
"dggrsId": "igeo7",
"min_refinement_level": 5,
"max_refinement_level": 9,
"datasource_id": "hytruck_clickhouse"
}
}
},
"2": {
"suitability_hytruck_parquet": {
"title": "Suitability Modelling for Hytruck in parquet format",
"description": "Suitability analysis using IGEO7 DGGRS for the Hytruck project. Datasource stored in a parquet file. Around 3M rows at refinement level 9",
"collection_provider": {
"providerId": "parquet",
"dggrsId": "igeo7",
"min_refinement_level": 5,
"max_refinement_level": 7,
"datasource_id": "hytruck"
}
}
},
"3": {
"suitability_hytruck_parquet_rhealpix": {
"title": "Suitability Modelling for Hytruck in parquet format with rHEALPix",
"description": "Suitability analysis using rHEALPix DGGRS for the Hytruck project. Datasource stored in a parquet file",
"collection_provider": {
"providerId": "parquet",
"dggrsId": "rhealpix",
"dggrs_zoneid_repr": "int",
"min_refinement_level": 5,
"max_refinement_level": 7,
"datasource_id": "hytruck_parquet_in_rhealpix"
}
}
}
},
"dggrs": {
"1": {
"igeo7": {
"title": "IGEO7 DGGRS with z7string",
"description": "IGEO7, a novel pure aperture 7 hexagonal DGGS, and Z7, its associated hierarchical integer indexing system",
"crs": "wgs84",
"shapeType": "hexagon",
"definition_link": "https://agile-giss.copernicus.org/articles/6/32/2025/",
"defaultDepth": 1,
"classname": "igeo7_dggrs_provider.IGEO7Provider"
}
},
"2": {
"h3": {
"title": "H3 indexes points and shapes into a hexagonal grid.",
"description": "H3 is a discrete global grid system for indexing geographies into a hexagonal grid, developed at Uber.",
"crs": "wgs84",
"shapeType": "hexagon",
"definition_link": "https://h3geo.org/",
"defaultDepth": 1,
"classname": "h3_dggrs_provider.H3Provider"
}
},
"3": {
"rhealpix": {
"title": "rHEALPix using dggal",
"description": "An equal area and axis-aligned grid with square zones topology and a refinement ratio of 9 defined in the rHEALPix projection using 50° E prime meridian (equivalent to PROJ implementation with parameters +proj=rhealpix +lon_0=50 +ellps=WGS84), the original hierarchical indexing, and scanline-based sub-zone ordering",
"crs": "wgs84",
"shapeType": "square",
"definition_link": "https://www.opengis.net/def/dggrs/OGC/1.0/rHEALPix",
"defaultDepth": 1,
"classname": "dggal_dggrs_provider.DGGALProvider",
"parameters": {"grid": "rHEALPix"}
}
}
},
"collection_providers": {
"1": {
"clickhouse": {
"classname": "clickhouse_collection_provider.ClickhouseCollectionProvider",
"datasources": {
"connection":{
"host": "127.0.0.1",
"user": null,
"password": null,
"port": 9000,
"database": "DevelopmentTesting"
},
"hytruck_clickhouse":{
"table": "<table name>",
"zone_groups": {
"9": "column of refinement level 9",
"8": "column of refinement level 8",
"7": "column of refinement level 7",
"6": "column of refinement level 6",
"5": "column of refinement level 5"
},
"exclude_data_cols": ["list of column names excluded from the query"],
"data_cols": [ "list of column names"],
"datetime_col": "column_name_of_datetime"
}
}
}
},
"2": {
"zarr": {
"classname": "zarr_collection_provider.ZarrCollectionProvider",
"datasources": {
"zarr_hytruck": {
"filepath": "./aggregated_tree.zarr",
"zone_groups": {
"4": "res4",
"5": "res5",
"6": "res6"
}
}
}
}
},
"3": {
"parquet": {
"classname": "parquet_collection_provider.ParquetCollectionProvider",
"datasources": {
"hytruck": {
"filepath": "<local file path or path of a cloud bucket>",
"id_col": "cell_ids",
"data_cols": ["*"],
"datetime_col": "column_name_of_datetime"
},
"hytruck_parquet_in_rhealpix": {
"filepath": "<local file path or path of a cloud bucket>",
"id_col": "cell_ids",
"data_cols": ["*"],
"datetime_col": "column_name_of_datetime"
}
}
}
}
}
}
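Because the three tables refer to each other by ID, a small consistency check can catch a mistyped providerId, dggrsId or datasource_id before the API is started. The following is a minimal sketch using only the Python standard library; the function name check_references and the sample data are illustrative and are not part of pydggsapi.

```python
import json


def check_references(config: dict) -> list:
    """Return a list of messages for IDs that a collection references but no table defines."""
    problems = []
    # Every table maps a document ID to a one-key dictionary {<ID>: {attributes}}.
    dggrs_ids = {key for doc in config.get("dggrs", {}).values() for key in doc}
    providers = {
        key: value
        for doc in config.get("collection_providers", {}).values()
        for key, value in doc.items()
    }
    for doc in config.get("collections", {}).values():
        for collection_id, attrs in doc.items():
            cp = attrs.get("collection_provider", {})
            if cp.get("dggrsId") not in dggrs_ids:
                problems.append(f"{collection_id}: unknown dggrsId {cp.get('dggrsId')!r}")
            provider = providers.get(cp.get("providerId"))
            if provider is None:
                problems.append(f"{collection_id}: unknown providerId {cp.get('providerId')!r}")
            elif cp.get("datasource_id") not in provider.get("datasources", {}):
                problems.append(
                    f"{collection_id}: unknown datasource_id {cp.get('datasource_id')!r}"
                )
    return problems


# Hypothetical, heavily shortened configuration with a wrong datasource_id.
sample = json.loads("""
{
  "collections": {"1": {"demo": {"collection_provider": {
      "providerId": "parquet", "dggrsId": "igeo7", "datasource_id": "missing"}}}},
  "dggrs": {"1": {"igeo7": {}}},
  "collection_providers": {"1": {"parquet": {"datasources": {"hytruck": {}}}}}
}
""")
print(check_references(sample))
```

An empty list means that every collection points at a registered DGGRS provider, collection provider and datasource.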
collections
Inside the collections table, each document represents a collection. Each document has a document ID that maps to a dictionary. The dictionary key is the collection ID, which maps to a dictionary of attributes describing the collection. The following code block shows the table structure with two collections defined.
{
    "collections": {
        "<document ID>": {
            "<collection ID>": {
                "attribute1": "value",
                "attribute2": "value"
            }
        },
        "<document ID>": {
            "<collection ID>": {
                "attribute1": "value",
                "attribute2": "value"
            }
        }
    }
}
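The nesting described above (document ID, then collection ID, then attributes) can be flattened into a plain lookup by collection ID. This is only an illustrative sketch: flatten_table is a hypothetical helper, not a pydggsapi function, and rejecting duplicate IDs is an assumption that follows from the IDs being described as unique.

```python
def flatten_table(table: dict) -> dict:
    """Map each collection ID to its attributes, dropping the document IDs."""
    flat = {}
    for doc_id, document in table.items():
        for collection_id, attributes in document.items():
            if collection_id in flat:
                # IDs are meant to be unique across the whole table.
                raise ValueError(f"duplicate ID {collection_id!r} in document {doc_id}")
            flat[collection_id] = attributes
    return flat


# Placeholder table with the same shape as the structure shown above.
table = {
    "1": {"collection_a": {"title": "A"}},
    "2": {"collection_b": {"title": "B"}},
}
collections = flatten_table(table)
print(sorted(collections))
```

The same helper works for the dggrs and collection_providers tables, since they share this structure.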
The dictionary associated with the collection ID defines metadata and methods to access the data.
collection ID: The unique ID for the collection.
metadata: title, description, extent etc. Attributes describing a collection. The definition follows the collection description from OGC API - Common - Part 2.
collection_provider: A dictionary that describes how to access the data.
    providerId: The collection provider ID.
    dggrsId: The dggrs provider ID.
    dggrs_zoneid_repr: A string indicating the zone ID representation used in the datasource. It must be one of ['int', 'textual', 'hexstring'] and defaults to textual. The API always assumes that zone IDs in requests use the textual representation. If the datasource uses a different representation, the API converts from the textual representation to the one used in the datasource. The conversion is provided by the DGGRS provider.
    max_refinement_level: The maximum (finest) refinement level of the data.
    min_refinement_level: The minimum (coarsest) refinement level of the data.
    datasource_id: The ID of a datasource defined in the corresponding collection_provider. Details can be found in the implementations of collection providers.
Here is an example of how to define a collection that uses ClickHouse as its collection provider (i.e. the data is stored in a ClickHouse DB).
"collections": {"1":
{"suitability_hytruck":
{"title": "Suitability Modelling for Hytruck",
"description": "Desc",
"extent": {"spatial": { "bbox": [[5.86307954788208, 47.31793212890625, 31.61196517944336, 70.0753173828125]] }},
"collection_provider": {
"providerId": "clickhouse",
"dggrsId": "igeo7",
"max_refinement_level": 9,
"min_refinement_level": 5,
"datasource_id": "hytruck_clickhouse"
}
}
}
}
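The rules for the collection_provider dictionary (the allowed dggrs_zoneid_repr values, the textual default, and the coarsest level not exceeding the finest) can be expressed as a short check. This is a sketch under assumptions: normalise_collection_provider is a hypothetical helper, and treating the five keys below as mandatory is a guess rather than something pydggsapi is documented to enforce.

```python
ZONE_ID_REPRS = ("int", "textual", "hexstring")
REQUIRED_KEYS = (
    "providerId",
    "dggrsId",
    "min_refinement_level",
    "max_refinement_level",
    "datasource_id",
)


def normalise_collection_provider(cp: dict) -> dict:
    """Return a copy of cp with defaults applied, or raise ValueError if it is inconsistent."""
    missing = [key for key in REQUIRED_KEYS if key not in cp]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    result = dict(cp)
    # The zone ID representation defaults to "textual" when it is not given.
    result.setdefault("dggrs_zoneid_repr", "textual")
    if result["dggrs_zoneid_repr"] not in ZONE_ID_REPRS:
        raise ValueError(f"dggrs_zoneid_repr must be one of {ZONE_ID_REPRS}")
    if result["min_refinement_level"] > result["max_refinement_level"]:
        raise ValueError("min_refinement_level must not exceed max_refinement_level")
    return result


cp = normalise_collection_provider({
    "providerId": "clickhouse",
    "dggrsId": "igeo7",
    "max_refinement_level": 9,
    "min_refinement_level": 5,
    "datasource_id": "hytruck_clickhouse",
})
print(cp["dggrs_zoneid_repr"])
```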
dggrs
Inside the dggrs table, each document represents a DGGRS provider. Each document has a document ID that maps to a dictionary. The dictionary key is the dggrs ID, which maps to a dictionary of attributes describing the DGGRS. The table structure is the same as that of the collections table.
The dictionary associated with the dggrs ID defines metadata and the actual implementation of the DGGRS.
dggrs ID: The unique ID for the DGGRS. It is used as the dggrsId inside a collection.
metadata: The description fields of the DGGRS required by the OGC DGGS API (e.g. title, shapeType etc.).
classname: The actual implementation, given as module.ClassName, located under dependencies/dggrs_providers.
parameters: Initialisation parameters for the DGGRS provider, in dictionary format.
Here is an example of how to define DGGRS providers for IGEO7 and H3.
"dggrs": {"1":
{"igeo7":
{"title": "IGEO7 DGGRS with z7string",
"description": "Hexagonal grid with ISEA projection and refinement ratio of 7. z7 space-filling curve",
"crs": "wgs84",
"shapeType": "hexagon",
"definition_link": "https://agile-giss.copernicus.org/articles/6/32/2025/",
"defaultDepth": 1,
"classname": "igeo7_dggrs_provider.IGEO7Provider",
"parameters": {"geodetic_conversion": true}
}
},
"2":
{"h3":
{"title": "Uber H3",
"description": "Uber H3",
"crs": "wgs84",
"shapeType": "hexagon",
"definition_link": "https://h3geo.org/",
"defaultDepth": 1,
"classname": "h3_dggrs_provider.H3Provider"}
}
}
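The classname values in these examples have the form module.ClassName. One plausible way to turn such a string into a provider object is a dynamic import, sketched below with the standard importlib module. This is not taken from the pydggsapi source: load_class, build_provider and package_prefix are hypothetical names, and passing parameters as keyword arguments is an assumption. A standard-library class stands in for a real provider so that the sketch runs on its own.

```python
import importlib


def load_class(classname: str, package_prefix: str = ""):
    """Resolve a 'module.ClassName' string to the class object."""
    module_name, _, class_name = classname.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {classname!r}")
    module = importlib.import_module(package_prefix + module_name)
    return getattr(module, class_name)


def build_provider(entry: dict, package_prefix: str = ""):
    """Instantiate the class named in entry['classname'] with entry['parameters']."""
    cls = load_class(entry["classname"], package_prefix)
    # Assumption: initialisation parameters are passed as keyword arguments.
    return cls(**entry.get("parameters", {}))


# Stand-in entry: fractions.Fraction plays the role of a provider class here.
half = build_provider({
    "classname": "fractions.Fraction",
    "parameters": {"numerator": 1, "denominator": 2},
})
print(half)
```

In a real deployment, package_prefix would have to name the package that contains dependencies/dggrs_providers; it is left empty here because the exact import path is not given in this section.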
collection_providers
Inside the collection_providers table, each document represents a collection provider. Each document has a document ID that maps to a dictionary. The dictionary key is the collection provider ID, which maps to a dictionary of attributes describing the collection provider. The table structure is the same as that of the collections table.
The dictionary associated with the collection provider ID defines the implementation module and initialization parameters.
collection provider ID: The unique ID for the collection provider. It is used as the providerId inside a collection.
classname: The actual implementation, given as module.ClassName, located under dependencies/collections_providers.
datasources: A dictionary defining the datasources supported by the collection provider. Depending on the collection provider's implementation, it can also contain information beyond datasources. For example, the ClickhouseCollectionProvider requires a connection element to specify the DB connection info.
Here is an example of how to define a collection provider for ClickHouse.
"collection_providers": {
    "1": {
        "clickhouse": {
            "classname": "clickhouse_collection_provider.ClickhouseCollectionProvider",
            "datasources": {
                "connection": {
                    "host": "127.0.0.1",
                    "user": "user",
                    "password": "password",
                    "port": 9000,
                    "database": "DevelopmentTesting"
                },
                "hytruck_clickhouse": {
                    "table": "suitablilty",
                    "zone_groups": {
                        "9": "res_9_id",
                        "8": "res_8_id",
                        "7": "res_7_id",
                        "6": "res_6_id",
                        "5": "res_5_id"
                    },
                    "data_cols": ["data_col1", "data_col2"]
                }
            }
        }
    }
}
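In the example above, zone_groups maps a refinement level to the column that holds the zone IDs for that level. Since JSON object keys are always strings, a lookup by integer level has to convert the level first. The helper below is a hypothetical illustration of reading this mapping, not the provider's actual query logic.

```python
def zone_column(datasource: dict, refinement_level: int) -> str:
    """Return the zone ID column configured for the given refinement level."""
    zone_groups = datasource["zone_groups"]
    try:
        # JSON keys are strings, so the integer level is converted before the lookup.
        return zone_groups[str(refinement_level)]
    except KeyError:
        available = sorted(int(level) for level in zone_groups)
        raise ValueError(
            f"refinement level {refinement_level} is not available; choose from {available}"
        ) from None


# Datasource entry copied from the example above.
datasource = {
    "table": "suitablilty",
    "zone_groups": {
        "9": "res_9_id",
        "8": "res_8_id",
        "7": "res_7_id",
        "6": "res_6_id",
        "5": "res_5_id",
    },
    "data_cols": ["data_col1", "data_col2"],
}
print(zone_column(datasource, 7))
```

Requesting a level outside the configured range (for example 4) raises a ValueError that lists the available levels.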