translated_description
A translation is a name (or title) and a description describing something.
That "something" is voluntarily left undefined so any object requiring a description can add an entry to this table.
Any objects (sources, datasets, labels) provide at least one description in the language of the data producer.
The value of the key
and lang
columns, should be understandable by a human and refer to the translated object.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | Varchar | Unique ID of the translation. It is the concatenation of key and lang . |
src_ign_admin_express_fr | Primary |
key | Varchar | Key referencing this translation in any language. Generally prefixed by the type of object translated (src for a source, ds for a dataset, lbl for a label) |
src_ign_admin_express | |
lang | Varchar | Language code | fr | |
name | Varchar | Object name (translated) | Ign Admin Express | |
description | Varchar | Object description (translated) | Le découpage administratif du territoire français (commune, arrondissement départemental, département, région...). |
source
A data source describes a data producer providing at least one dataset.
This producer is generally a web page providing one or several data files packaged together in an archive.
For example, the "Ign Admin Express" source provides a single archive containing one dataset for each layer of the French administration.
A data source is linked to a file provided by a data producer. A single producer can offer several sources that themselves contain several datasets.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | Varchar | Source unique ID | ign_admin_express | Primary |
url | Varchar | Link to the webpage of the data source (where is the archive) | https://geoservices.ign.fr/adminexpress | |
update_frequency | Varchar | Expected update frequency from the data provider | 1 month | |
description_id | Varchar | Reference to the primary description of the source. The primary description should be in the same language as the one of the data provider (i.e., french for Ign Admin Express) | src_ign_admin_express_fr | |
description_key | Varchar | Translation key shared by all languages | src_ign_admin_express | |
description_lang | Varchar | Main translation language | fr |
dataset
A set of records. Either a zone with labels or labels enriching zones provided by another dataset.
Column | Type | Description | Example | Indexed |
---|---|---|---|---|
id | Varchar | Dataset unique ID. Ideally prefixed by its data source ID. | ign_admin_express_commune | Primary |
source_id | Varchar | Reference to the data source providing this dataset | ign_admin_express | |
description_id | Varchar | Reference to the primary description of the dataset. The primary description should be in the same language as the one of the data provider (i.e., french for Ign Admin Expess) | ds_ign_admin_express_commune_fr | |
description_key | Varchar | Translation key shared by all languages | src_ign_admin_express | |
description_lang | Varchar | Main translation language | fr |
extendingdataset
A dataset that does not provide any Zones but instead provides additional labels to Zones provided by another dataset.
Column | Type | Description | Example | Index |
---|---|---|---|---|
extended_id | Varchar | Reference to the dataset providing the Zones. | ign_admin_express_commune | |
extending_id | Varchar | Reference to the dataset providing the additional labels. | loyers_commune |
source_fetch
A source fetch is an action from OgmAPI that download the latest files from the source website.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | Varchar | Unique ID (concatenation of the source_id and timestamp) | ign_admin_express_20240807_220000 | Primary |
source_id | Varchar | Reference to the data source fetched | ign_admin_express | |
timestamp | Timestamp | Timestamp at the end of the download | 2024/08/07 22:00:00 | |
url | Varchar | URL of the page used for the download | https://geoservices.ign.fr/adminexpress | |
hash | Varchar | MD5 of the files downloaded | md5 |
instance
An instance of a dataset uploaded to OgmAPI after a fetch.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | Varchar | Unique instance ID (concatenation of the dataset_id and the injection_date) | ign_admin_express_commune_20240807_220000 | Primary |
dataset_id | Varchar | Refence to the instanciated dataset | ign_admin_express_commune | |
fetch_id | Varchar | Refence to the files downloaded | ign_admin_express_20240807_220000 | |
injection_date | Timestamp | Timestamp at the begining of the injection (in a dedicated upload table) | 2024/08/07_22:05:00 | |
activation_date | Timestamp | Timestamp when the injection is activated (switched from the upload table to the hot table). NULL when not activated. |
2024/08/07_22:10:00 | |
is_active | Boolean | This instance is the last instance of this dataset. Set to false when a new instance is activated |
true | |
num_rows | Integer | Estimation of the number of rows provided in this instance | 36000 |
label
A label is a column provided by a dataset.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | BigInt | Unique ID for the label. | 42 | Primary |
dataset_id | Varchar | Reference to the dataset providing this label | ign_admin_express_commune | |
added_id | Varchar | Reference to the instance providing this label for the first time. | ign_admin_express_commune_20240807_220000 | |
removed_id | Varchar | Reference to the instance providing this label for the last time. If the label is still available, this id correspond to id of the last active instance. | ign_admin_express_commune_20240807_220000 | |
db_key | Varchar | JSON key to access the label in the Zone or ExtendingLabels. | population | |
description_id | Varchar | Reference to the primary description of the label. The primary description should be in the same language as the one of the data provider (i.e., french for Ign Admin Expess). | ds_ign_admin_express_commune_fr | |
description_key | Varchar | Translation key shared by all languages. | src_ign_admin_express | |
description_lang | Varchar | Main translation language. | fr |
zone
A Zone is a geometry (point or polygon) a dataset provides. This geometry is described by a set of labels.
This table only contains zones provided by the last instance of the dataset. The zones provided by former instances can be found in the zone_history
table (see below).
The zone
table is partitioned by dataset_id
.
Therefore, filtering the zones by dataset can significantly improve the query time.
It is also possible to query the dataset table individually as each partition is named zone_<dataset_id>
.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | BigInt | Unique ID for the zone. | 42 | Primary |
dataset_id | Varchar | Reference to the dataset providing this zone. | ign_admin_express_comune | Partition |
added_id | Varchar | Reference to the dataset instance providing this zone for the first time. | ign_admin_express_commune_20240807_220000 | |
removed_id | Varchar | Reference to the instance providing this zone for the last time. If the zone is still available, this ID corresponds to the id of the last active instance. | ign_admin_express_commune_20240807_220000 | |
geometry | Geometry | PostGis representation of the zone geometry (point, polygon, ...). | Under PostGis format. Use ST_AsText(geometry) to see it as WKT. |
GIST |
labels | JSON | Set of labels provided by the dataset to this geometry. | {"nom": "Rennes", "population": 222485} |
GIN |
zone_history
This table contains zones provided by former instances that are no longer provided by the latest instances of a dataset.
This table follows the same schema as the zone
table (including indexes).
This table is partitioned by dataset and each partition table is named zone_history_<dataset_id>
.
extending_labels
This table extends an existing zone with labels comming from another dataset.
This table only contains the labels provided by the last instance of the dataset. The labels provided by former instances can be found in the extending_labels_history
table (see below).
The extending_labels
table is partitioned by dataset_id
.
Therefore, filtering the labels by dataset can significantly improve the query time.
It is also possible to query the dataset table individually as each partition is named extending_labels_<dataset_id>
.
Column | Type | Description | Example | Index |
---|---|---|---|---|
id | BigInt | Unique ID for the labels. | 43 | Primary |
zone_id | BigInt | Reference to the Zone ID extened by these labels | 42 | |
dataset_id | Varchar | Reference to the dataset providing these labels. | loyers_maison | Partition |
added_id | Varchar | Reference to the dataset instance providing these labels for the first time. | loyers_maison_20240807_220000 | |
removed_id | Varchar | Reference to the instance providing these labels for the last time. If the same labels are still available, this ID corresponds to the id of the last active instance. | loyers_maison_20240807_220000 | |
labels | JSON | Set of labels provided by the dataset for the extended Zone. | {"loyers_maison_m2": 13.6704780225362} |
GIN |
extending_labels_history
This table contains labels provided by former instances that are no longer provided by the latest instances of a dataset.
This table follows the same schema as the extending_labels
table (including indexes).
This table is partitioned by dataset and each partition table is named extending_labels_history_<dataset_id>
.