Cost distance analysis


In spatial analysis and geographic information systems, cost distance analysis or cost path analysis is a method for determining one or more optimal routes of travel through unconstrained space. The optimal solution is that which minimizes the total cost of the route, based on a field of cost density that varies over space due to local factors. It is thus based on the fundamental geographic principle of Friction of distance. It is an optimization problem with multiple deterministic algorithm solutions, implemented in most GIS software.
The various problems, algorithms, and tools of cost distance analysis operate over an unconstrained two-dimensional space, meaning that a path could be of any shape. Similar cost optimization problems can also arise in a constrained space, especially a one-dimensional linear network such as a road or telecommunications network. Although they are similar in principle, the problems in network space require very different algorithms to solve, largely adopted from graph theory. The collection of GIS tools for solving these problems are called network analysis.

History

Humans seem to have an innate desire to travel with minimal effort and time. Historic, even ancient, roads show patterns similar to what modern computational algorithms would generate, traveling straight across flat spaces, but curving around mountains, canyons, and thick vegetation.
However, it was not until the 20th century that geographers developed theories to explain this route optimization, and algorithms to reproduce it. In 1957, during the Quantitative revolution in Geography, with its propensity to adopt principles or mathematical formalisms from the "hard" sciences, William Warntz used refraction as an analogy for how minimizing travel cost will make transportation routes change direction at the boundary between two landscapes with very different friction of distance. His principle of "parsimonious movement," changing direction to minimize cost, was widely accepted, but the refraction analogy and mathematics was not, largely because it does not scale well to normally complex geographic situations.
Warntz and others then adopted another analogy that proved much more successful in the common situation where travel cost varies continuously over space, by comparing it to terrain. They compared the cost rate to the slope of a terrain surface, both being mathematical derivatives of an accumulated function or field: total elevation above a vertical datum in the case of terrain. Integrating the cost rate field from a given starting point would create an analogous surface of total accumulated cost of travel from that point. In the same way that a stream follows the path of least resistance downhill, the streamline on the cost accumulation surface from any point "down" to the source will be the minimum-cost path. Additional lines of research in the 1960s further developed the nature of the cost rate field as a manifestation of the concept of friction of distance, studying how it was affected by various geographic features.
At the time, this solution was only theoretical, lacking the data and computing power for the continuous solution. Raster GIS provided the first feasible platform for implementing the theoretical solution by converting the continuous integration into a discrete summation procedure. Dana Tomlin implemented cost distance analysis in his Map Analysis Package by 1986, and Ronald Eastman added it to IDRISI by 1989, with a more efficient "pushbroom" cost accumulation algorithm. Douglas further refined the accumulation algorithm, which is basically what is implemented in most current GIS software.

Cost raster

The primary data set used in cost distance analysis is the cost raster, sometimes called the cost-of-passage surface, the friction image, the cost-rate field, or cost surface. In most implementations, this is a raster grid, in which the value of each cell represents the cost of a route crossing the cell in a horizontal or vertical direction. It is thus a discretization of a field of cost rate, a spatially intensive property. This cost is a manifestation of the principle of friction of distance.
A number of different types of cost may be relevant in a given routing problem:
  • Travel cost, the resource expenditure required to move across the cell, usually time or energy/fuel.
  • Construction cost, the resources required to build the infrastructure that makes travel possible, such as roads, pipes, and cables. While some construction costs are constant, others are spatially variant, such as property acquisition and excavation.
  • Environmental impacts, the negative effects on the natural or human environment caused by the infrastructure or the travel along it. For example, building an expressway through a residential neighborhood or a wetland would incur a high political cost.
Some of these costs are easily quantifiable and measurable, such as transit time, fuel consumption, and construction costs, thus naturally lending themselves to computational solutions. That said, there may be significant uncertainty in predicting the cost prior to implementing the route. Other costs are much more difficult to measure due to their qualitative or subjective nature, such as political protest or ecological impact; these typically require operationalization through the creation of a scale.
In many situations, multiple types of cost may be simultaneously relevant, and the total cost is a combination of them. Because different costs are expressed in different units, they usually cannot be directly summed, but must be combined by creating an index. A common type of index is created by scaling each factor to a consistent range, then combining them using weighted linear combination. An important part of the creation of an index model like this is Calibration, adjusting the parameters of the formula to make the modeled relative cost match real-world costs, using methods such as the Analytic hierarchy process. The index model formula is typically implemented in a raster GIS using map algebra tools from raster grids representing each cost factor, resulting in a single cost raster grid.

Directional cost

One limitation of the traditional method is that the cost field is isotropic or omni-directional: the cost at a given location does not depend on the direction of traversal. This is appropriate in many situations, but not others. For example, if one is flying in a windy location, an airplane flying in the direction of the wind incurs a much lower cost than an airplane flying against it. Some research has been done on extending cost distance analysis algorithms to incorporate directional cost, but it is not yet widely implemented in GIS software. IDRISI has some support for anisotropy.

Least-cost-path algorithm

The most common cost distance task is to determine the single path through the space between a given source location and a destination location that has the least total accumulated cost. The typical solution algorithm is a discrete raster implementation of the cost integration strategy of Warntz and Lindgren, which is a deterministic optimization.
  1. Inputs: cost field raster, source location, destination location
  2. Accumulation: Starting at the source location compute the lowest total cost needed to reach every other cell in the grid. Although there are several algorithms, such as those published by Eastman and Douglas, they generally follow a similar strategy. This process also creates, as an important byproduct, a second raster grid usually called the backlink grid or movement direction grid, in which each cell has a direction code representing which of its eight neighbors had the lowest cost.
  3. # Find a cell that is adjacent to at least one cell that already has an accumulated cost assigned
  4. # Determine which neighbor has the lowest accumulated cost. Encode the direction from the target to the lowest-cost neighbor in the backlink grid.
  5. # Add the cost of the target cell to the neighbor accumulated cost, to create the accumulated cost of the target cell. If the neighbor is diagonal, the local cost is multiplied by
  6. # The algorithm must also take into account that indirect routes may have lower cost, often using a hash table to keep track of temporary cost values along the expanding fringe of computation that can be reconsidered.
  7. # Repeat the procedure until all cells are assigned.
  8. Drain: In keeping with the terrain analogy, trace the optimal route from the given destination back to the source like a stream draining away from a location. At its most basic, this is accomplished by starting at the destination cell, moving in the direction indicated in the backlink grid, then repeating for the next cell, and so on until the source is reached. Recent software adds some improvements, such as looking across three or more cells to recognize straight lines at angles other than the eight neighbor directions. For example, the function in GRASS can recognize the "knight's move" and draw a straight line bypassing the middle cell.

    Corridor analysis

A slightly different version of the least-cost path problem, which could be considered a fuzzy version of it, is to look for corridors more than one cell in width, thus providing some flexibility in applying the results. Corridors are commonly used in transportation planning and in wildlife management.
The solution to this problem is to compute, for every cell in the study space, the total accumulated cost of the optimal path between a given source and destination that passes through that cell. Thus, every cell in the optimal path derived above would have the same minimum value. Cells near this path would be reached by paths deviating only slightly from the optimal path, so they would have relatively low cost values, collectively forming a corridor with fuzzy edges as more distant cells have increasing cost values.
The algorithm to derive this corridor field is created by generating two cost accumulation grids: one using the source as described above. Then the algorithm is repeated, but using the destination as the source. Then these two grids are added using map algebra. This works because for each cell, the optimal source-destination path passing through that cell is the optimal path from that cell to the source, added to the optimal path from that cell to the destination. This can be accomplished using the cost accumulation tool above, along with a map algebra tool, although ArcGIS provides a tool that automates the process.