Programming language


A programming language is an engineered language for expressing computer programs.
Programming languages typically allow software to be written in a human-readable manner.
Execution of a program requires an implementation. There are two main approaches for implementing a programming language: compilation, where programs are compiled ahead of time to machine code, and interpretation, where programs are directly executed. In addition to these two extremes, some implementations use hybrid approaches such as just-in-time compilation and bytecode interpreters.
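As an illustration of the bytecode-interpretation approach, the following C sketch executes a small stack-based instruction set directly, one instruction at a time, without any ahead-of-time translation to machine code. The opcode names and program layout are invented for this example and do not correspond to any particular language's implementation.

#include <stdio.h>

/* Opcodes for a tiny, hypothetical stack machine (illustrative only). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

/* A dispatch loop reads each instruction and performs it immediately;
   the bytecode is never translated to native machine code. */
static void interpret(const int *code) {
    int stack[64];
    int sp = 0; /* stack pointer */
    for (int pc = 0; ; ) {
        switch (code[pc++]) {
        case OP_PUSH:  stack[sp++] = code[pc++];         break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:   sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_PRINT: printf("%d\n", stack[sp - 1]);    break;
        case OP_HALT:  return;
        }
    }
}

int main(void) {
    /* Bytecode for (2 + 3) * 4; running it prints 20. */
    int program[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD,
                      OP_PUSH, 4, OP_MUL, OP_PRINT, OP_HALT };
    interpret(program);
    return 0;
}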
The design of programming languages has been strongly influenced by computer architecture, with most imperative languages designed around the ubiquitous von Neumann architecture. While early programming languages were closely tied to the hardware, modern languages often hide hardware details via abstraction, aiming to enable better software with less programmer effort.

Related

Programming languages have some similarity to natural languages in that they can allow communication of ideas between people. That is, programs are generally human-readable and can express complex ideas. However, the kinds of ideas that programming languages can express are ultimately limited to the domain of computation.
The term computer language is sometimes used interchangeably with programming language, although some contend that they are different concepts. One view holds that programming languages are a subset of computer languages; another uses computer language to describe languages used in computing that are not considered programming languages. A third view regards a programming language as a theoretical construct for programming an abstract machine, and a computer language as the subset thereof that runs on a physical computer, which has finite hardware resources.
John C. Reynolds emphasizes that a formal specification language is as much a programming language as is a language intended for execution. He argues that textual and even graphical input formats that affect the behavior of a computer are programming languages, despite the fact they are commonly not Turing-complete, and remarks that ignorance of programming language concepts is the reason for many flaws in input formats.

History

Early developments

The first programmable computers were invented during the 1940s, and with them, the first programming languages. The earliest computers were programmed in first-generation programming language: machine language. This code was very difficult to debug and was not portable between different computer systems. To improve the ease of programming, assembly languages were invented, which diverged from machine language to make programs easier for humans to understand, although they did not increase portability.
Initially, hardware resources were scarce and expensive, while human resources were cheaper. Therefore, cumbersome languages that were time-consuming to use but closer to the hardware for higher efficiency were favored. The introduction of high-level programming languages revolutionized programming. These languages abstracted away the details of the hardware, instead being designed to express algorithms that could be understood more easily by humans. For example, arithmetic expressions could now be written in symbolic notation and later translated into machine code that the hardware could execute. In 1957, Fortran was invented. Often considered the first compiled high-level programming language, Fortran has remained in use into the twenty-first century.

1960s and 1970s

Around 1960, the first mainframes—general-purpose computers—were developed, although they could only be operated by professionals and the cost was extreme. Data and instructions were input by punch cards, meaning that no input could be added while a program was running. The languages developed at this time were therefore designed for minimal interaction. After the invention of the microprocessor, computers in the 1970s became dramatically cheaper. New computers also allowed more user interaction, which was supported by newer programming languages.
Lisp, implemented in 1958, was the first functional programming language. Unlike Fortran, it supported recursion and conditional expressions, and it also introduced dynamic memory management on a heap and automatic garbage collection. For the next decades, Lisp dominated artificial intelligence applications. In 1978, another functional language, ML, introduced inferred types and polymorphic parameters.
After ALGOL was released in 1958 and 1960, it became the standard in computing literature for describing algorithms. Although its commercial success was limited, most popular imperative languages—including C, Pascal, Ada, C++, Java, and C#—are directly or indirectly descended from ALGOL 60. Its innovations adopted by later programming languages included greater portability and the first use of a context-free BNF grammar. Simula, the first language to support object-oriented programming, also descends from ALGOL and achieved commercial success. C, another ALGOL descendant, has sustained popularity into the twenty-first century. C allows more direct access to low-level machine operations than other contemporary languages. Its power and efficiency, generated in part by flexible pointer operations, come at the cost of making it more difficult to write correct code.
Prolog, designed in 1972, was the first logic programming language, communicating with a computer using formal logic notation. With logic programming, the programmer specifies a desired result and allows the interpreter to decide how to achieve it.

1980s to 2000s

During the 1980s, the invention of the personal computer transformed the roles for which programming languages were used. New languages introduced in the 1980s included C++, a superset of C that can compile C programs but also supports classes and inheritance. Ada and other new languages introduced support for concurrency. The Japanese government invested heavily in the so-called fifth-generation languages, which added support for concurrency to logic programming constructs, but these languages were outperformed by other concurrency-supporting languages.
Due to the rapid growth of the Internet and the World Wide Web in the 1990s, new programming languages were introduced to support Web pages and networking. Java, based on C++ and designed for increased portability across systems and security, enjoyed large-scale success because these features are essential for many Internet applications. Another development was that of dynamically typed scripting languages—Python, JavaScript, PHP, and Ruby—designed to quickly produce small programs that coordinate existing applications. Due to their integration with HTML, they have also been used for building web pages hosted on servers.

2000s to present

During the 2000s, there was a slowdown in the development of new programming languages that achieved widespread popularity. One innovation was service-oriented programming, designed to exploit distributed systems whose components are connected by a network. Services are similar to objects in object-oriented programming but run in a separate process. C# and F# cross-pollinated ideas between imperative and functional programming. After 2010, several new languages—Rust, Go, Swift, Zig, and Carbon—competed for the performance-critical software for which C had historically been used. Most of the new languages use static typing, while a few, such as Julia, use dynamic typing.
Some new programming languages, such as Scratch and LabVIEW, are classified as visual programming languages (VPLs), and others, such as Ballerina, mix textual and visual programming. This trend has also led to projects, such as Google's Blockly, that help developers create new VPLs, and many game engines, such as Unreal and Unity, have added support for visual scripting.

Definition

A language can be defined in terms of syntax and semantics, and often is defined via a formal language specification.

Syntax

A programming language's surface form is known as its syntax. Most programming languages are purely textual; they use sequences of text including words, numbers, and punctuation, much like written natural languages. On the other hand, some programming languages are graphical, using visual relationships between symbols to specify a program.
The syntax of a language describes the possible combinations of symbols that form a syntactically correct program. The meaning given to a combination of symbols is handled by semantics. Since most languages are textual, this article discusses textual syntax.
Programming language syntax is usually defined using a combination of regular expressions (for lexical structure) and Backus–Naur form (for grammatical structure). Below is a simple grammar, based on Lisp:

expression ::= atom | list
atom ::= number | symbol
number ::= [+-]?['0'-'9']+
symbol ::= ['A'-'Z''a'-'z'].*
list ::= '(', expression*, ')'

This grammar specifies the following:
  • an expression is either an atom or a list;
  • an atom is either a number or a symbol;
  • a number is an unbroken sequence of one or more decimal digits, optionally preceded by a plus or minus sign;
  • a symbol is a letter followed by zero or more of any characters; and
  • a list is a matched pair of parentheses, with zero or more expressions inside it.
The following are examples of well-formed token sequences in this grammar: 12345, () and (a b c232 (1)).
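As an illustration (the recognizer below is not part of the grammar itself), such a grammar can be checked by a short recursive-descent program, with one procedure per grammar rule. This C sketch makes a simplifying assumption: a symbol ends at whitespace or a parenthesis, rather than greedily matching .* as written above.

#include <ctype.h>
#include <stdio.h>

static const char *s; /* cursor into the input being checked */

static int expression(void); /* forward declaration for recursion */

static void skip_ws(void) { while (isspace((unsigned char)*s)) s++; }

/* number ::= [+-]?['0'-'9']+ */
static int number(void) {
    if (*s == '+' || *s == '-') s++;
    if (!isdigit((unsigned char)*s)) return 0;
    while (isdigit((unsigned char)*s)) s++;
    return 1;
}

/* symbol ::= a letter, then characters up to whitespace or a parenthesis */
static int symbol(void) {
    if (!isalpha((unsigned char)*s)) return 0;
    while (*s && !isspace((unsigned char)*s) && *s != '(' && *s != ')') s++;
    return 1;
}

/* list ::= '(', expression*, ')' */
static int list(void) {
    if (*s != '(') return 0;
    s++;
    skip_ws();
    while (*s && *s != ')') { /* zero or more expressions */
        if (!expression()) return 0;
        skip_ws();
    }
    if (*s != ')') return 0;
    s++;
    return 1;
}

/* expression ::= atom | list, where atom ::= number | symbol */
static int expression(void) {
    skip_ws();
    if (*s == '(') return list();
    if (*s == '+' || *s == '-' || isdigit((unsigned char)*s)) return number();
    return symbol();
}

int main(void) {
    const char *tests[] = { "12345", "()", "(a b c232 (1))" };
    for (int i = 0; i < 3; i++) {
        s = tests[i];
        int ok = expression();
        skip_ws();
        ok = ok && *s == '\0'; /* the whole input must be consumed */
        printf("%-16s %s\n", tests[i], ok ? "well-formed" : "rejected");
    }
    return 0;
}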
Not all syntactically correct programs are semantically correct. Many syntactically correct programs are nonetheless ill-formed, per the language's rules, and may result in an error on translation or execution. In some cases, such programs may exhibit undefined behavior. Even when a program is well-defined within a language, it may still have a meaning that is not intended by the person who wrote it.
Using natural language as an example, it may not be possible to assign a meaning to a grammatically correct sentence or the sentence may be false:
  • "Colorless green ideas sleep furiously." is grammatically well-formed but has no generally accepted meaning.
  • "John is a married bachelor." is grammatically well-formed but expresses a meaning that cannot be true.
The following C language fragment is syntactically correct, but performs operations that are not semantically defined (the operation *p >> 4 has no meaning for a value of complex type, and p->im is not defined because the value of p is the null pointer):

complex *p = NULL;
complex abs_p = sqrt(*p >> 4 + p->im);

If the type declaration on the first line were omitted, the program would trigger an error on the undefined variable p during compilation. However, the program would still be syntactically correct since type declarations provide only semantic information.
The grammar needed to specify a programming language can be classified by its position in the Chomsky hierarchy. The syntax of most programming languages can be specified using a Type-2 grammar, i.e., they are context-free grammars. Some languages, including Perl and Lisp, contain constructs that allow execution during the parsing phase. Languages that have constructs that allow the programmer to alter the behavior of the parser make syntax analysis an undecidable problem, and generally blur the distinction between parsing and execution. In contrast to Lisp's macro system and Perl's BEGIN blocks, which may contain general computations, C macros are merely string replacements and do not require code execution.
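For example, the following fragment shows C macro expansion as pure textual substitution: the preprocessor rewrites the source before parsing begins, and no computation takes place during the expansion itself.

#include <stdio.h>

/* Expanded by the preprocessor as a textual substitution. */
#define SQUARE(x) ((x) * (x))

int main(void) {
    /* SQUARE(3 + 1) becomes ((3 + 1) * (3 + 1)) as text before the
       compiler parses the program; no code runs at expansion time. */
    printf("%d\n", SQUARE(3 + 1)); /* prints 16 */
    return 0;
}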