Set (abstract data type)
In computer science, a set is an abstract data type that can store distinct values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.
Some set data structures are designed for static or frozen sets that do not change after they are constructed. Static sets allow only query operations on their elements — such as checking whether a given value is in the set, or enumerating the values in some arbitrary order. Other variants, called dynamic or mutable sets, allow also the insertion and deletion of elements from the set.
A multiset is a special kind of set in which an element can appear multiple times in the set.
Type theory
In type theory, sets are generally identified with their indicator function : accordingly, a set of values of type may be denoted by or. The characteristic function of a set is defined as:In theory, many other abstract data structures can be viewed as set structures with additional operations and/or additional axioms imposed on the standard operations. For example, an abstract heap can be viewed as a set structure with a
min operation that returns the element of smallest value.Operations
Core set-theoretical operations
One may define the operations of the algebra of sets:-
union: returns the union of sets S and T. -
intersection: returns the intersection of sets S and T. -
difference: returns the difference of sets S and T. -
subset: a predicate that tests whether the set S is a subset of set T.Static sets
-
is_element_of: checks whether the value x is in the set S. -
is_empty: checks whether the set S is empty. -
sizeorcardinality: returns the number of elements in S. -
iterate: returns a function that returns one more value of S at each call, in some arbitrary order. -
enumerate: returns a list containing the elements of S in some arbitrary order. -
build: creates a set structure with values x1,x2,...,xn. -
create_from: creates a new set structure containing all the elements of the given collection or all the elements returned by the given iterator.Dynamic sets
-
create: creates a new, initially empty set structure. - *
create_with_capacity: creates a new set structure, initially empty but capable of holding up to n elements. -
add: adds the element x to S, if it is not present already. -
remove: removes the element x from S, if it is present. -
capacity: returns the maximum number of values that S can hold.
Additional operations
There are many other operations that can be defined in terms of the above, such as:-
pop: returns an arbitrary element of S, deleting it from S. -
pick: returns an arbitrary element of S. Functionally, the mutatorpopcan be interpreted as the pair of selectors,whererestreturns the set consisting of all elements except for the arbitrary element. Can be interpreted in terms ofiterate. -
map: returns the set of distinct values resulting from applying function F to each element of S. -
filter: returns the subset containing all elements of S that satisfy a given predicate P. -
fold: returns the value A|S| after applyingAi+1 := Ffor each element e of S, for some binary operation F. ''F must be associative and commutative for this to be well-defined. -
clear: delete all elements of S''. -
equal: checks whether the two given sets are equal. -
hash: returns a hash value for the static set S such that ifequalthenhash = hash
-
sum: returns the sum of all elements of S for some definition of "sum". For example, over integers or reals, it may be defined asfold. -
collapse: given a set of sets, return the union. For example,collapse. May be considered a kind ofsum. -
flatten: given a set consisting of sets and atomic elements, returns a set whose elements are the atomic elements of the original top-level set or elements of the sets it contains. In other words, remove a level of nesting – likecollapse,but allow atoms. This can be done a single time, or recursively flattening to obtain a set of only atomic elements. For example,flatten. -
nearest: returns the element of S that is closest in value to x. -
min,max: returns the minimum/maximum element of S.Implementations
nearest or union. Implementations described as "general use" typically strive to optimize the element_of, add, and delete operations. A simple implementation is to use a list, ignoring the order of the elements and taking care to avoid repeated values. This is simple but inefficient, as operations like set membership or element deletion are O, as they require scanning the entire list. Sets are often instead implemented using more efficient data structures, particularly various flavors of trees, tries, or hash tables.As sets can be interpreted as a kind of map, sets are commonly implemented in the same way as maps – in this case in which the value of each key-value pair has the unit type or a sentinel value – namely, a self-balancing binary search tree for sorted sets, or a hash table for unsorted sets average-case, but O. A sorted linear hash table may be used to provide deterministically ordered sets.
Further, in languages that support maps but not sets, sets can be implemented in terms of maps. For example, a common programming idiom in Perl that converts an array to a hash whose values are the sentinel value 1, for use as a set, is:
my %elements = map @elements;
Other popular methods include arrays. In particular a subset of the integers 1..n can be implemented efficiently as an n-bit bit array, which also support very efficient union and intersection operations. A Bloom map implements a set probabilistically, using a very compact representation but risking a small chance of false positives on queries.
The Boolean set operations can be implemented in terms of more elementary operations, but specialized algorithms may yield lower asymptotic time bounds. If sets are implemented as sorted lists, for example, the naive algorithm for
union will take time proportional to the length m of S times the length n of T; whereas a variant of the list merging algorithm will do the job in time proportional to m+''n''. Moreover, there are specialized set data structures that are optimized for one or more of these operations, at the expense of others.Language support
One of the earliest languages to support sets was Pascal; many languages now include it, whether in the core language or in a standard library.- In C++, the Standard Template Library provides the
settemplate class, which is typically implemented using a binary search tree ; SGI's STL also provides thehash_settemplate class, which implements a set using a hash table. C++11 has support for theunordered_settemplate class, which is implemented using a hash table. In sets, the elements themselves are the keys, in contrast to sequenced containers, where elements are accessed using their position. Set elements must have a strict weak ordering. - The Rust standard library provides the generic
andtypes. - Java offers the interface to support sets, and the sub-interface to support sorted sets.
- Apple's Foundation framework provides the Objective-C classes
,,,, and. The CoreFoundation APIs provide the and types for use in C. - Python has built-in since 2.4, and since Python 3.0 and 2.7, supports non-empty set literals using a curly-bracket syntax, e.g.:
; empty sets must be created usingset, because Python usesto represent the empty dictionary. - The.NET Framework provides the generic
andclasses that implement the genericinterface. - Smalltalk's class library includes
SetandIdentitySet, using equality and identity for inclusion test respectively. Many dialects provide variations for compressed storage, for ordering or for weak references. - Ruby's standard library includes a
module which containsSetandSortedSetclasses that implement sets using hash tables, the latter allowing iteration in sorted order. - OCaml's standard library contains a
Setmodule, which implements a functional set data structure using binary search trees. - The GHC implementation of Haskell provides a
module, which implements immutable sets using binary search trees. - The Tcl Tcllib package provides a set module which implements a set data structure based upon TCL lists.
- The Swift standard library contains a
Settype, since Swift 1.2. - JavaScript introduced
as a standard built-in object with the ECMAScript 2015 standard. - Erlang's standard library has a
module. - Clojure has literal syntax for hashed sets, and also implements sorted sets.
- LabVIEW has native support for sets, from version 2019.
- Ada provides the
andpackages.