Domain-specific language


A domain-specific language (DSL) is a computer language specialized to a particular application domain. This is in contrast to a general-purpose language (GPL), which is broadly applicable across domains. There is a wide variety of DSLs, ranging from widely used languages for common domains, such as HTML for web pages, down to languages used by only one or a few pieces of software, such as MUSH soft code. DSLs can be further subdivided by the kind of language, and include domain-specific markup languages, domain-specific modeling languages, and domain-specific programming languages. Special-purpose computer languages have always existed in the computer age, but the term "domain-specific language" has become more popular due to the rise of domain-specific modeling. Simpler DSLs, particularly ones used by a single application, are sometimes informally called mini-languages.
The line between general-purpose languages and domain-specific languages is not always sharp, as a language may have specialized features for a particular domain but be applicable more broadly, or conversely may in principle be capable of broad application but in practice be used primarily for a specific domain. For example, Perl was originally developed as a text-processing and glue language, for the same domain as AWK and shell scripts, but later came to be used mostly as a general-purpose programming language. By contrast, PostScript is a Turing-complete language, and in principle can be used for any task, but in practice is narrowly used as a page description language.

Use

The design and use of appropriate DSLs is a key part of domain engineering: the aim is to use a language suited to the domain at hand, which may mean using an existing DSL or general-purpose language, or developing a new DSL. Language-oriented programming considers the creation of special-purpose languages for expressing problems as a standard part of the problem-solving process. Creating a domain-specific language, rather than reusing an existing language, can be worthwhile if the language allows a particular type of problem or solution to be expressed more clearly than an existing language would allow and the type of problem in question reappears sufficiently often. Pragmatically, a DSL may be specialized to a particular problem domain, a particular problem representation technique, a particular solution technique, or other aspects of a domain.

Overview

A domain-specific language is created specifically to solve problems in a particular domain and is not intended to be able to solve problems outside of it. In contrast, general-purpose languages are created to solve problems in many domains. The domain can also be a business area. Some examples of business areas include:
  • life insurance policies
  • combat simulation
  • salary calculation
  • billing
A domain-specific language is somewhere between a tiny programming language and a scripting language, and is often used in a way analogous to a programming library. The boundaries between these concepts are quite blurry, much like the boundary between scripting languages and general-purpose languages.

In design and implementation

Domain-specific languages are languages with very specific goals in design and implementation. A domain-specific language can be a visual diagramming language, such as those created by the Generic Eclipse Modeling System; a programmatic abstraction, such as the Eclipse Modeling Framework; or a textual language. For instance, the command-line utility grep has a regular expression syntax which matches patterns in lines of text. The sed utility defines a syntax for matching and replacing regular expressions. Often, these tiny languages can be used together inside a shell to perform more complex programming tasks.
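As a rough illustration of how such a textual mini-language is used, the following sketch imitates grep-style matching and sed-style substitution with Python's standard `re` module. The sample log lines are invented for the example, and Python's regular expression dialect differs in details from the POSIX syntax used by grep and sed, so this is an analogy rather than a reproduction of either tool.

```python
import re

# Hypothetical input lines, standing in for a text file.
lines = [
    "error: disk full",
    "info: backup started",
    "error: network down",
]

# grep-like: keep only the lines that match a pattern
# (here, lines that start with "error").
pattern = re.compile(r"^error")
matching = [line for line in lines if pattern.search(line)]

# sed-like: replace each match of a pattern,
# similar in spirit to the sed command s/^error/ERROR/.
replaced = [re.sub(r"^error", "ERROR", line) for line in lines]

print(matching)
print(replaced)
```

The pattern `^error` is itself a small program in the regular expression language; Python only hands it to the matching engine.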
The line between domain-specific languages and scripting languages is somewhat blurred, but domain-specific languages often lack low-level functions for filesystem access, interprocess control, and other functions that characterize full-featured programming languages, scripting or otherwise. Many domain-specific languages do not compile to byte-code or executable code, but to various kinds of media objects: Graphviz exports to PostScript, GIF, JPEG, etc., while Csound compiles to audio files, and a ray-tracing domain-specific language like POV-Ray compiles to graphics files.

Data definition languages

A data definition language like SQL presents an interesting case: it can be deemed a domain-specific language because it is specific to one domain, namely working with relational databases, and is often called from another application. However, SQL has more keywords and functions than many scripting languages, and is often thought of as a language in its own right, perhaps because of the prevalence of database manipulation in programming and the amount of mastery required to become an expert in the language.
Further blurring this line, many domain-specific languages have exposed APIs, and can be accessed from other programming languages without breaking the flow of execution or calling a separate process, and can thus operate as programming libraries.
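A minimal sketch of this library-style use, assuming Python's standard `sqlite3` module as the host-side API: SQL statements are passed as strings to an in-process database engine, and the results come back as ordinary Python values. The `invoice` table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

# An in-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO invoice (customer, amount) VALUES (?, ?)",
    [("alice", 120.0), ("bob", 80.0), ("alice", 30.0)],
)

# The SQL text is the DSL program; Python passes it along and reads the result.
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM invoice GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

print(rows)
```

No separate process is started here: the SQL engine runs inside the host program, which is what lets the language behave like a programming library.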

Programming tools

Some domain-specific languages expand over time to include full-featured programming tools, which further complicates the question of whether a language is domain-specific or not. A good example is the functional language XSLT, specifically designed for transforming one XML graph into another, which has been extended since its inception to allow for various forms of filesystem interaction, string and date manipulation, and data typing.
In model-driven engineering, many examples of domain-specific languages can be found, such as OCL, a language for decorating models with assertions, and QVT, a domain-specific transformation language. However, languages like UML are typically general-purpose modeling languages.
To summarize, an analogy might be useful: a very little language is like a knife, which can be used in thousands of different ways, from cutting food to cutting down trees. A domain-specific language is like an electric drill: it is a powerful tool with a wide variety of uses, but a specific context, namely, putting holes in things. A general-purpose language is a complete workbench, with a variety of tools intended for performing a variety of tasks. Domain-specific languages should be used by programmers who, looking at their current workbench, realize they need a better drill and find that a particular domain-specific language provides exactly that.

Domain-specific language topics

External and Embedded Domain Specific Languages

DSLs implemented via an independent interpreter or compiler are known as external domain-specific languages. Well-known examples include TeX and AWK. A separate category, embedded domain-specific languages, is typically implemented within a host language as a library; such languages tend to be limited to the syntax of the host language, though this depends on the host language's capabilities.
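The idea of an external DSL can be sketched with a toy interpreter. The mini-language below (commands `base`, `bonus` and `deduct` for a salary calculation) is entirely hypothetical and far simpler than TeX or AWK; it only illustrates that an external DSL has its own syntax, which a separate interpreter reads as text.

```python
def run(program: str) -> float:
    """Interpret a hypothetical mini-language for salary calculation."""
    total = 0.0
    for lineno, raw in enumerate(program.splitlines(), start=1):
        # Strip comments (introduced by '#') and skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise SyntaxError(f"line {lineno}: expected '<command> <amount>'")
        command, arg = parts
        # An amount ending in '%' is a percentage of the running total.
        if arg.endswith("%"):
            amount = total * float(arg[:-1]) / 100
        else:
            amount = float(arg)
        if command == "base":
            total = amount
        elif command == "bonus":
            total += amount
        elif command == "deduct":
            total -= amount
        else:
            raise SyntaxError(f"line {lineno}: unknown command {command!r}")
    return total


program = """
base 3000      # monthly base salary
bonus 10%      # percentage of the running total
deduct 200
"""
result = run(program)
print(result)
```

Because the program text is parsed independently, its syntax does not have to resemble Python at all; the cost is that the parser, error messages and tooling must be written separately.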

Usage patterns

There are several usage patterns for domain-specific languages:
  • Processing with standalone tools, invoked via direct user operation, often on the command line or from a Makefile
  • Domain-specific languages which are implemented using programming language macro systems, and which are converted or expanded into a host general-purpose language at compile time or at run time
  • Embedded domain-specific languages (eDSLs), also known as internal domain-specific languages, which are implemented as libraries in a "host" programming language. An eDSL leverages the syntax, semantics and runtime environment of the host language and adds domain-specific primitives that allow programmers to use the host language to create programs that generate code in a "target" programming language. Multiple eDSLs can easily be combined into a single program, and the facilities of the host language can be used to extend an existing eDSL. Other possible advantages of using an eDSL are improved type safety and better IDE tooling. Examples of eDSLs: SQLAlchemy "Core", an SQL eDSL in Python; jOOQ, an SQL eDSL in Java; LINQ's "method syntax", an SQL eDSL in C#; and an HTML eDSL in Kotlin.
  • Domain-specific languages which are called from programs written in general-purpose languages like C or Perl, to perform a specific function, often returning the results of operation to the "host" programming language for further processing; generally, an interpreter or virtual machine for the domain-specific language is embedded into the host application
  • Domain-specific languages which are embedded into user applications and which are used to execute code that is written by users of the application, dynamically generated by the application, or both.
Many domain-specific languages can be used in more than one way. DSL code embedded in a host language may have special syntax support, such as regexes in sed, AWK, Perl or JavaScript, or may be passed as strings.
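The embedded pattern can be illustrated with a small sketch in which ordinary Python expressions build SQL text. The classes `Column`, `Condition` and `Select` are hypothetical and written only for this example; they are not the API of SQLAlchemy, jOOQ or LINQ, and the sketch handles integer comparisons only.

```python
class Condition:
    """A fragment of SQL text built from host-language expressions."""

    def __init__(self, sql: str) -> None:
        self.sql = sql

    def __and__(self, other: "Condition") -> "Condition":
        return Condition(f"({self.sql} AND {other.sql})")


class Column:
    """A column reference; comparisons return SQL conditions, not booleans."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __gt__(self, value: int) -> Condition:
        return Condition(f"{self.name} > {int(value)}")

    def __lt__(self, value: int) -> Condition:
        return Condition(f"{self.name} < {int(value)}")


class Select:
    """A query under construction; to_sql() emits the target-language code."""

    def __init__(self, table: str, *columns: Column) -> None:
        self.table = table
        self.columns = columns
        self.condition = None

    def where(self, condition: Condition) -> "Select":
        self.condition = condition
        return self

    def to_sql(self) -> str:
        cols = ", ".join(c.name for c in self.columns)
        sql = f"SELECT {cols} FROM {self.table}"
        if self.condition is not None:
            sql += f" WHERE {self.condition.sql}"
        return sql


age = Column("age")
salary = Column("salary")
query = Select("employee", age, salary).where((age > 30) & (salary < 5000))
print(query.to_sql())
```

Here Python is the host language and SQL is the target language: operator overloading supplies the domain-specific primitives, while the host's own syntax, variables and tooling remain available to the programmer.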

Design goals

Adopting a domain-specific language approach to software engineering involves both risks and opportunities. A well-designed domain-specific language manages to find the proper balance between these.
Domain-specific languages have important design goals that contrast with those of general-purpose languages:
  • Domain-specific languages are less comprehensive.
  • Domain-specific languages are much more expressive in their domain.
  • Domain-specific languages should exhibit minimal redundancy.

Idioms

In programming, idioms are methods imposed by programmers to handle common development tasks, e.g.:
  • Ensure data is saved before the window is closed.
  • Edit code whenever command-line parameters change because they affect program behavior.
General-purpose programming languages rarely support such idioms, but domain-specific languages can describe them, e.g.:
  • A script can automatically save data.
  • A domain-specific language can parameterize command line input.