Abstraction (computer science)
In software, an abstraction provides access while hiding details that otherwise might make access more challenging. It focuses attention on details of greater importance. Examples include the abstract data type which separates use from the representation of data and functions that form a call tree that is more general at the base and more specific towards the leaves.
Rationale
Computing mostly operates independently of the concrete world. The hardware implements a model of computation that is interchangeable with others. The software is structured in architectures to enable humans to create the enormous systems by concentrating on a few issues at a time. These architectures are made of specific choices of abstractions. Greenspun's tenth rule is an aphorism on how such an architecture is both inevitable and complex.Language abstraction is a central form of abstraction in computing: new artificial languages are developed to express specific aspects of a system. Modeling languages help in planning. Computer languages can be processed with a computer. An example of this abstraction process is the generational development of programming language from the first-generation programming language to the second-generation programming language and the third-generation programming language. Each stage can be used as a stepping stone for the next stage. The language abstraction continues for example in scripting languages and domain-specific languages.
Within a programming language, some features let the programmer create new abstractions. These include subroutines, modules, polymorphism, and software components. Some other abstractions such as software design patterns and architectural styles remain invisible to a translator and operate only in the design of a system.
Some abstractions try to limit the range of concepts a programmer needs to be aware of, by completely hiding the abstractions they are built on. The software engineer and writer Joel Spolsky has criticized these efforts by claiming that all abstractions are leaky – that they can never completely hide the details below; however, this does not negate the usefulness of abstraction.
Some abstractions are designed to inter-operate with other abstractions – for example, a programming language may contain a foreign function interface for making calls to the lower-level language.
Abstraction features
Programming languages
Different programming languages provide different types of abstraction, depending on the intended applications for the language. For example:- In object-oriented programming languages such as C++, Object Pascal, or Java, the concept of abstraction has become a declarative statement – using the syntax
function = 0;or the reserved wordsabstractandinterface. After such a declaration, it is the responsibility of the programmer to implement a class to instantiate the object of the declaration. - Functional programming languages commonly exhibit abstractions related to functions, such as lambda abstractions and higher-order functions.
- Modern members of the Lisp programming language family such as Clojure, Scheme and Common Lisp support macro systems to allow syntactic abstraction. Other programming languages such as Scala also have macros, or very similar metaprogramming features. These can allow programs to omit boilerplate code, abstract away tedious function call sequences, implement new control flow structures, and implement domain-specific languages, which allow domain-specific concepts to be expressed in concise and elegant ways. All of these, when used correctly, improve both the programmer's efficiency and the clarity of source code by making the intended purpose more explicit. A consequence of syntactic abstraction is also that any Lisp dialect, and almost any programming language, can in principle, be implemented in any modern Lisp with significantly reduced effort when compared to "more traditional" programming languages such as Python, C or Java.
Specification methods
- Abstract-model based method ;
- Algebraic techniques ;
- Process-based techniques ;
- Trace-based techniques ;
- Knowledge-based techniques.
Specification languages
Control abstraction
Programming languages offer control abstraction as one of the main purposes of their use. Computer machines understand operations at the very low level such as moving some bits from one location of the memory to another location and producing the sum of two sequences of bits. Programming languages allow this to be done in the higher level. For example, consider this statement written in a Pascal-like fashion:To a human, this seems a fairly simple and obvious calculation. However, the low-level steps necessary to carry out this evaluation, and return the value "15", and then assign that value to the variable "a", are actually quite subtle and complex. The values need to be converted to binary representation and the calculations decomposed into assembly instructions. Finally, assigning the resulting value of "15" to the variable labeled "a", so that "a" can be used later, involves additional 'behind-the-scenes' steps of looking up a variable's label and the resultant location in physical or virtual memory, storing the binary representation of "15" to that memory location, etc.
Without control abstraction, a programmer would need to specify all the register/binary-level steps each time they simply wanted to add or multiply a couple of numbers and assign the result to a variable. Such duplication of effort has two serious negative consequences:
- it forces the programmer to constantly repeat fairly common tasks every time a similar operation is needed
- it forces the programmer to program for the particular hardware and instruction set
Structured programming
In a simple program, this may aim to ensure that loops have single or obvious exit points and to have single exit points from functions and procedures.
In a larger system, it may involve breaking down complex tasks into many different modules. Consider a system which handles payroll on ships and at shore offices:
- The uppermost level may feature a menu of typical end-user operations.
- Within that could be standalone executables or libraries for tasks such as signing on and off employees or printing checks.
- Within each of those standalone components there could be many different source files, each containing the program code to handle a part of the problem, with only selected interfaces available to other parts of the program. A sign on program could have source files for each data entry screen and the database interface.
- Either the database or the payroll application also has to initiate the process of exchanging data with between ship and shore, and that data transfer task will often contain many other components.
Data abstraction
Data abstraction enforces a clear separation between the abstract properties of a data type and the concrete details of its implementation. The abstract properties are those that are visible to client code that makes use of the data type—the interface to the data type—while the concrete implementation is kept entirely private, and indeed can change, for example to incorporate efficiency improvements over time. The idea is that such changes are not supposed to have any impact on client code, since they involve no difference in the abstract behaviour.For example, one could define an abstract data type called lookup table which uniquely associates keys with values, and in which values may be retrieved by specifying their corresponding keys. Such a lookup table may be implemented in various ways: as a hash table, a binary search tree, or even a simple linear list of pairs. As far as client code is concerned, the abstract properties of the type are the same in each case.
Of course, this all relies on getting the details of the interface right in the first place, since any changes there can have major impacts on client code. As one way to look at this: the interface forms a contract on agreed behaviour between the data type and client code; anything not spelled out in the contract is subject to change without notice.
Manual data abstraction
While much of data abstraction occurs through computer science and automation, there are times when this process is done manually and without programming intervention. One way this can be understood is through data abstraction within the process of conducting a systematic review of the literature. In this methodology, data is abstracted by one or several abstractors when conducting a meta-analysis, with errors reduced through dual data abstraction followed by independent checking, known as adjudication.Abstraction in object-oriented programming
In object-oriented programming theory, abstraction involves the facility to define objects that represent abstract "actors" that can perform work, report on and change their state, and "communicate" with other objects in the system. The term encapsulation refers to the hiding of state details, but extending the concept of data type from earlier programming languages to associate behavior most strongly with the data, and standardizing the way that different data types interact, is the beginning of abstraction. When abstraction proceeds into the operations defined, enabling objects of different types to be substituted, it is called polymorphism. When it proceeds in the opposite direction, inside the types or classes, structuring them to simplify a complex set of relationships, it is called delegation or inheritance.Various object-oriented programming languages offer similar facilities for abstraction, all to support a general strategy of polymorphism in object-oriented programming, which includes the substitution of one data type for another in the same or similar role. Although not as generally supported, a configuration or image or package may predetermine a great many of these bindings at compile time, link time, or load time. This would leave only a minimum of such bindings to change at run-time.
Common Lisp Object System or Self, for example, feature less of a class-instance distinction and more use of delegation for polymorphism. Individual objects and functions are abstracted more flexibly to better fit with a shared functional heritage from Lisp.
C++ exemplifies another extreme: it relies heavily on templates and overloading and other static bindings at compile-time, which in turn has certain flexibility problems.
Although these examples offer alternate strategies for achieving the same abstraction, they do not fundamentally alter the need to support abstract nouns in code – all programming relies on an ability to abstract verbs as functions, nouns as data structures, and either as processes.
Consider for example a sample Java fragment to represent some common farm "animals" to a level of abstraction suitable to model simple aspects of their hunger and feeding. It defines an
Animal class to represent both the state of the animal and its functions:public class Animal extends LivingThing
With the above definition, one could create objects of type and call their methods like this:
thePig = new Animal;
theCow = new Animal;
if )
if )
theCow.moveTo;
In the above example, the class
Animal is an abstraction used in place of an actual animal, LivingThing is a further abstraction of Animal.If one requires a more differentiated hierarchy of animals – to differentiate, say, those who provide milk from those who provide nothing except meat at the end of their lives – that is an intermediary level of abstraction, probably DairyAnimal who would eat foods suitable to giving good milk, and MeatAnimal who would eat foods to give the best meat-quality.
Such an abstraction could remove the need for the application coder to specify the type of food, so they could concentrate instead on the feeding schedule. The two classes could be related using inheritance or stand alone, and the programmer could define varying degrees of polymorphism between the two types. These facilities tend to vary drastically between languages, but in general each can achieve anything that is possible with any of the others. A great many operation overloads, data type by data type, can have the same effect at compile-time as any degree of inheritance or other means to achieve polymorphism. The class notation is simply a coder's convenience.