Geocode
A geocode is a code that represents a geographic entity. It uniquely identifies the entity, distinguishing it from the others in the same geocode system. In general, a geocode is a short, human-readable identifier.
Typical geocodes and the entities they represent:
- Country code and subdivision code. Polygon of the administrative boundaries of a country or a subdivision.
- DGG cell ID. Identifier of a cell of a discrete global grid, for example a Geohash code or a Plus Code.
- Postal code. Polygon of a postal area, for example a CEP code.
Geocodes are mainly used for labelling, data integrity, geotagging and spatial indexing.
In theoretical computer science a geocode system is a hash function, and an important class of utilitarian geocode systems is described as locality-preserving hashing functions.
Classification
There are some common aspects of many geocodes that can be used as classification criteria:
- Ownership: proprietary or free, differing by licence.
- Formation: the geocode can originate from a name or from a mathematical function. See geocode system types below.
- Hierarchy: the syntax hierarchy of the geocode corresponds to the spatial hierarchy of the entities it represents. A geocode system can be hierarchical or non-hierarchical.
- Covering: global or partial. The entities either cover the whole globe or are delimited by a theme or by the owner's jurisdiction.
- Type of the represented entity: type of geometry. Point, grid cell or polygon.
  - special hierarchical grids, with global covering and equal-area cells, can be classified as DGGS cells.
  - some non-standard geographic entities can also be classified by their coordinate system and reference ellipsoid. The de facto standard is WGS84.
- Scope of use: general use vs specialized.
System
A geocode system is defined by the syntax and the semantics of its geocodes:
- geocode syntax: the characters that can be used, the blocks of characters, and their size and order. Example: country codes use two letters of the alphabet. The most common way to describe the syntax formally is by a regular expression.
- geocode semantics: the meaning of the geocode, usually expressed by associating the code with a geographical entity type. It can be described formally by an ontology, a UML class diagram or any entity-relationship model.
Many syntax and semantic characteristics are also summarized by classification.
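The formal description of a geocode syntax by regular expression can be illustrated with a minimal sketch. The two-letter pattern follows the country code example above. The subdivision pattern (country code, hyphen, one to three uppercase letters or digits) is an assumption modelled on codes such as DE-BW, and the function names are hypothetical.

```python
import re

# Two uppercase letters, as in the country-code example above.
COUNTRY_CODE = re.compile(r"[A-Z]{2}")

# Assumed subdivision syntax: country code, hyphen, then 1 to 3
# uppercase letters or digits (modelled on codes such as DE-BW).
SUBDIVISION_CODE = re.compile(r"(?P<country>[A-Z]{2})-(?P<part>[A-Z0-9]{1,3})")


def is_country_code(text: str) -> bool:
    """Return True if text is syntactically a two-letter country code."""
    return COUNTRY_CODE.fullmatch(text) is not None


def split_subdivision(text: str):
    """Return (country, part) for a well-formed subdivision code, else None."""
    match = SUBDIVISION_CODE.fullmatch(text)
    if match is None:
        return None
    return match.group("country"), match.group("part")
```

Such a check covers syntax only: a string can match the pattern without being a code that any authority has actually assigned. Deciding that belongs to the semantics of the system.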
Encode and decode
Any geocode can be obtained from a formal expression of the geographical entity and, vice versa, translated back to the entity. The first process is called encoding, the second decoding. The actors and processes involved, as defined by OGC, are:
;geocoder: A software agent that transforms the description of a geographic entity into normalized data and encodes it as a geocode.
;geocoder service: A geocoder implemented as a web service that accepts a set of geographic entity descriptors as input. The request is "sent" to the Geocoder Service, which processes the request and returns the resulting geocodes. More general services can also return geographic features represented by the geocodes.
;geocoding: Geocoding refers to the assignment of geocodes or coordinates to geographically reference data provided in a textual format. Examples are the two-letter country codes and coordinates computed from addresses.
Note: when a physical addressing scheme is expressed in a standardized and simplified way, it can be conceived as a geocode. So the term geocoding is sometimes extended to geocodes in general.
In spatial indexing applications the geocode can also be translated between human-readable and internal representations.
Systems of standard names
Geocodes like country codes, city codes, etc. come from a table of official names and the corresponding official codes and geometries. "Official" is meant here in the sense of control and consensus: typically the table is controlled by a standards organization or governmental authority. So the most general case is a table of standard names and the corresponding standard codes.

Strictly speaking, the "name" related to a geocode is a toponym, and the table is the resource for toponym resolution: the process, usually carried out by a software agent, of relating a toponym to "an unambiguous spatial footprint of the same place". Any standardized system of toponym resolution that has codes or encoded abbreviations can be used as a geocode system. The "resolver" agent in this context is also a geocoder.

Sometimes names are translated into numeric codes, to be compact or machine-readable. Since the numbers in this case are name identifiers, they can be considered "numeric names", so this set of codes is also a kind of "system of standard names".
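A table of standard names and codes, together with a resolver agent, can be sketched as a simple lookup. The table below is a hypothetical excerpt that uses the German codes mentioned in this article. A real table would be maintained by a standards body and would also carry geometries. The function names and the normalization steps are assumptions chosen for illustration.

```python
import unicodedata

# Hypothetical excerpt of a table of standard names and codes.
# Geometries, which a real table would also hold, are omitted.
STANDARD_NAMES = {
    "Germany": "DE",
    "Baden-Württemberg": "DE-BW",
    "Bayern": "DE-BY",
    "Nordrhein-Westfalen": "DE-NW",
}


def normalize(name: str) -> str:
    """Normalize a toponym: Unicode NFC form, trimmed, case-folded."""
    return unicodedata.normalize("NFC", name).strip().casefold()


# Index by normalized name, plus the reverse index from code to name.
CODE_BY_NAME = {normalize(name): code for name, code in STANDARD_NAMES.items()}
NAME_BY_CODE = {code: name for name, code in STANDARD_NAMES.items()}


def encode(name: str):
    """Resolve a toponym to its geocode, or None if it is not in the table."""
    return CODE_BY_NAME.get(normalize(name))


def decode(code: str):
    """Return the standard name for a geocode, or None if unknown."""
    return NAME_BY_CODE.get(code)
```

Normalizing the input before the lookup corresponds to the step in which a geocoder turns a description into normalized data. Here it only handles case, surrounding whitespace and Unicode composition, not alternative spellings or translations.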
Hierarchical naming
In the geocode context, space partitioning is the process of dividing a geographical space into two or more disjoint subsets, resulting in a mosaic of subdivisions. Each subdivision can be partitioned again, recursively, resulting in a hierarchical mosaic.

When the names of the subdivisions are expressed as codes, and the code syntax can be decomposed into parent-child relations through a well-defined syntactic scheme, the set of geocodes forms a hierarchical system. A geocode fragment can be an abbreviation or a numeric or alphanumeric code.
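The decomposition of a code into parent-child relations can be sketched in a few lines. Two syntactic schemes are assumed here for illustration: fragments joined by a separator, as in the code DE.NW.CE, and fragments of fixed width, as in the numeric code 17060102 with two digits per level. Both codes appear in the examples of this section, and the function names are hypothetical.

```python
def ancestors_by_separator(code: str, sep: str = "."):
    """List the parent codes of a separator-delimited geocode, nearest first."""
    parts = code.split(sep)
    return [sep.join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def ancestors_by_width(code: str, width: int = 2):
    """List the parent codes of a geocode built from fixed-width fragments."""
    return [code[:i] for i in range(len(code) - width, 0, -width)]


def is_within(code: str, other: str, sep: str = ".") -> bool:
    """Return True if `code` is `other` or one of its descendants."""
    return code == other or code.startswith(other + sep)
```

With these helpers, `ancestors_by_separator("DE.NW.CE")` gives `["DE.NW", "DE"]` and `ancestors_by_width("17060102")` gives `["170601", "1706", "17"]`. The containment test compares whole fragments, so a code such as `DE.NWX` is not mistaken for a child of `DE.NW`.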
A popular example is the ISO 3166-2 geocode system, representing country names and the names of their administrative subdivisions, separated by a hyphen. For example, DE is Germany, a simple geocode, and its subdivisions are DE-BW for Baden-Württemberg, DE-BY for Bayern, ..., DE-NW for Nordrhein-Westfalen, etc. The scope is only the first level of the hierarchy. For more levels there are other conventions, like HASC (Hierarchical Administrative Subdivision Codes). HASC codes are alphabetic and their fragments have constant length.

Two geocodes of a hierarchical geocode system with the same prefix represent different parts of the same location. For instance, DE.NW.CE and DE.NW.BN represent geographically interior parts of DE.NW, the common prefix.

Changing the subdivision criteria, we can obtain other hierarchical systems. For hydrological criteria, for example, there is the US hydrologic unit code (HUC), a numeric representation of basin names in a hierarchical syntax schema. HUC 17 is the identifier of "Pacific Northwest Columbia basin", and HUC 1706 that of "Lower Snake basin", a spatial subset of HUC 17 and a superset of 17060102.
Systems of regular grids
Inspired by the classic alphanumeric grids, a discrete global grid is a regular mosaic which covers the entire Earth's surface. The regularity of the mosaic is defined by the use of cells of the same shape across the whole grid, or of "near the same shape and near same area" in a region of interest, like a country.

All cells of the grid have an identifier, and the center of the cell can be used as the reference for converting a cell ID into a geographical point. When a compact human-readable expression of the cell ID is standardized, it becomes a geocode.
Geocodes of different geocode systems can represent the same position on the globe, with the same shape and precision, but differ in string length, digit alphabet, separators, etc. Non-global grids also differ in scope, and in general are geometrically optimized for local use.
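The relation between a point, a cell identifier and the cell center can be illustrated with a toy grid. The scheme below is hypothetical and does not correspond to any standardized geocode system: it assumes cells of one degree in latitude and longitude and an identifier made of a zero-padded row and column.

```python
CELL_DEG = 1.0  # assumed cell size in degrees; purely illustrative
COLS = int(360 / CELL_DEG)
ROWS = int(180 / CELL_DEG)


def encode(lat: float, lon: float) -> str:
    """Return a toy cell ID 'RRR-CCC' for a latitude and longitude in degrees."""
    # Points on the upper edges (lat = 90 or lon = 180) go to the last cell.
    row = min(int((lat + 90.0) // CELL_DEG), ROWS - 1)
    col = min(int((lon + 180.0) // CELL_DEG), COLS - 1)
    return f"{row:03d}-{col:03d}"


def decode(cell_id: str):
    """Return the (lat, lon) of the center of a toy cell."""
    row, col = (int(part) for part in cell_id.split("-"))
    lat = -90.0 + (row + 0.5) * CELL_DEG
    lon = -180.0 + (col + 0.5) * CELL_DEG
    return lat, lon
```

Encoding is lossy: every point in a cell maps to the same identifier, and decoding returns only the cell center. Cells of equal angular size also shrink in area toward the poles, so a grid like this one has cells of the same shape in degrees but not of the same area.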
Hierarchical grids
Each cell of a grid can be transformed into a new local grid, in a recurring process. For example, the cell TQ 2980 is a sub-cell of TQ 29, which is a sub-cell of TQ. A system of geographic regular grid references is the base of a hierarchical geocode system.

Two geocodes of a hierarchical geocode grid system can use the prefix rule: geocodes with the same prefix represent different parts of the same broader location. For instance, TQ 28 and TQ 61 represent geographically interior parts of TQ, the common prefix.

A hierarchical geocode can be split into keys. The Geohash 6vd23gq is the key q of the cell 6vd23g, which is a cell of 6vd23, and so on, with per-digit keys. The OLC 58PJ642P is the key 2P of the cell 58PJ64, which is a cell of 58PJ, and so on, with two-digit keys. In OLC the + separator follows the eighth digit, and the pair after it is again a two-digit key: 58PJ642P+48 is the key 48 of the cell 58PJ642P. Digits beyond the tenth follow a second, single-digit key schema, so OLC uses two key schemas. Some geocode systems also use an initial prefix with a non-hierarchical key schema.

In general, as a technical and non-compact optional representation, geocode systems also offer the possibility of expressing their cell identifier with a fine-grained schema, by a longer path of keys. For example, the Geohash 6vd2, which is a base32 code, can be expanded to base4 0312312002, which is also a schema with per-digit keys. Geometrically, each Geohash cell is a rectangle that subdivides space recurrently into 32 new rectangles, so base4, subdividing into 4, is the encoding-expansion limit.

The uniformity of shape and area of cells in a grid can be important for other uses, like spatial statistics. There are standard ways to build a grid covering the entire globe with cells of equal area, regular shape and other properties: a Discrete Global Grid System (DGGS) is a series of discrete global grids satisfying all standardized requirements defined in 2017 by the OGC.
When the human-readable codes obtained from the cell identifiers of a DGGS are also standardized, the result can be classified as a DGGS-based geocode system.
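Returning to the Geohash example in this section, the expansion of the base32 code 6vd2 into base4 can be sketched as follows. The sketch assumes the usual Geohash alphabet (digits and lowercase letters without a, i, l and o) and five bits per character. The function names are hypothetical.

```python
GEOHASH_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_to_bits(code: str) -> str:
    """Expand a Geohash string into its bit string, 5 bits per character."""
    return "".join(format(GEOHASH_BASE32.index(ch), "05b") for ch in code)


def geohash_to_base4(code: str) -> str:
    """Re-express a Geohash as base4 digits, 2 bits per digit."""
    bits = geohash_to_bits(code)
    if len(bits) % 2:
        raise ValueError("an odd number of bits has no base4 form")
    return "".join(str(int(bits[i:i + 2], 2)) for i in range(0, len(bits), 2))


def parent_cells(code: str):
    """List the enclosing cells of a Geohash, nearest first (per-digit keys)."""
    return [code[:i] for i in range(len(code) - 1, 0, -1)]


print(geohash_to_base4("6vd2"))  # 0312312002
```

Each base4 digit carries two bits, so only codes with an even number of base32 characters convert exactly in this sketch. A code of odd length, such as 6vd23gq, has an odd number of bits and is rejected. The prefix rule is visible in `parent_cells("6vd23gq")`, which starts with `6vd23g` and `6vd23`.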