Computational creativity


Computational creativity is a multidisciplinary endeavour that is located at the intersection of the fields of artificial intelligence, cognitive psychology, philosophy, and the arts.
It is the application of computer systems to emulate human-like creative processes, facilitating the generation of artistic and design outputs that mimic innovation and originality.
The goal of computational creativity is to model, simulate or replicate creativity using a computer, to achieve one of several ends:
  • To construct a program or computer capable of human-level creativity.
  • To better understand human creativity and to formulate an algorithmic perspective on creative behavior in humans.
  • To design programs that can enhance human creativity without necessarily being creative themselves.
The field of computational creativity concerns itself with theoretical and practical issues in the study of creativity. Theoretical work on the nature and proper definition of creativity is performed in parallel with practical work on the implementation of systems that exhibit creativity, with one strand of work informing the other.
The applied form of computational creativity is known as media synthesis.

Theoretical issues

Theoretical approaches concern the essence of creativity: in particular, under what circumstances a model can be called "creative" if eminent creativity is about rule-breaking or the disavowal of convention. This is a variant of Ada Lovelace's objection to machine intelligence, as recapitulated by modern theorists such as Teresa Amabile: if a machine can do only what it was programmed to do, how can its behavior ever be called creative?
Indeed, not all computer theorists would agree with the premise that computers can only do what they are programmed to do—a key point in favor of computational creativity.

Defining creativity in computational terms

Because no single perspective or definition seems to offer a complete picture of creativity, the AI researchers Newell, Shaw and Simon developed the combination of novelty and usefulness into the cornerstone of a multi-pronged view of creativity, one that uses the following four criteria to categorize a given answer or solution as creative:
  1. The answer is novel and useful
  2. The answer demands that we reject ideas we had previously accepted
  3. The answer results from intense motivation and persistence
  4. The answer comes from clarifying a problem that was originally vague
Margaret Boden focused on the first two of these criteria, arguing instead that creativity should be defined as "the ability to come up with ideas or artifacts that are new, surprising, and valuable".
Mihaly Csikszentmihalyi argued that creativity must instead be considered in a social context, and his DIFI (Domain-Individual-Field Interaction) framework has since strongly influenced the field. In DIFI, an individual produces works whose novelty and value are assessed by the field (other people in society), which provides feedback and ultimately adds the work, now deemed creative, to the domain of societal works from which individuals may later draw influence.
Whereas the above reflects a top-down approach to computational creativity, an alternative thread has developed among bottom-up computational psychologists involved in artificial neural network research. During the late 1980s and early 1990s, for example, generative neural systems were driven by genetic algorithms. Experiments involving recurrent networks successfully hybridized simple musical melodies and predicted listener expectations.
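The melody-hybridization experiments of that era can be illustrated with a short sketch. The following Python example is not any of the original systems; it assumes melodies encoded as lists of MIDI pitch numbers and uses a deliberately toy fitness function (preferring stepwise motion) to show how a genetic algorithm splices and mutates parent melodies.

    import random

    # Minimal sketch: a genetic algorithm that "hybridizes" two parent melodies.
    # Melodies are lists of MIDI pitch numbers; the fitness function below
    # (rewarding small melodic intervals) is purely illustrative.

    PARENTS = [
        [60, 62, 64, 65, 67, 69, 71, 72],   # C major scale fragment
        [60, 64, 67, 72, 67, 64, 60, 55],   # arpeggiated figure
    ]

    def crossover(a, b):
        """Splice two melodies at a random point."""
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def mutate(melody, rate=0.1):
        """Randomly nudge individual pitches by a semitone or two."""
        return [p + random.choice([-2, -1, 1, 2]) if random.random() < rate else p
                for p in melody]

    def fitness(melody):
        """Toy criterion: reward stepwise motion (small intervals)."""
        return -sum(abs(b - a) for a, b in zip(melody, melody[1:]))

    def evolve(parents, generations=50, pop_size=30):
        population = [mutate(crossover(*random.sample(parents, 2))) for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            survivors = population[: pop_size // 2]
            children = [mutate(crossover(*random.sample(survivors, 2)))
                        for _ in range(pop_size - len(survivors))]
            population = survivors + children
        return max(population, key=fitness)

    print(evolve(PARENTS))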

Historical evolution of computational creativity

The use of computational processes to generate creative artifacts has a long history. In the late 1700s, methods for composing music combinatorially were explored, with such methods attributed to prominent figures including Mozart, Bach, Haydn, and Kirnberger. This approach extended to analytical endeavors as early as 1934, when simple mechanical models were built to explore mathematical problem solving. Professional interest in the creative aspects of computation was also commonly addressed in early discussions of artificial intelligence. The 1956 Dartmouth Conference listed creativity, invention, and discovery as key goals for artificial intelligence.
As the development of computers allowed systems of greater complexity, the 1970s and 1980s saw the invention of early systems that modelled creativity using symbolic or rule-based approaches. The field of creative storytelling investigated several such models. Meehan's TALE-SPIN generated narratives through the simulation of character goals and decision trees, while Dehn's AUTHOR approached generation by simulating an author's process for crafting a story. Beyond narrative generation, computational creativity expanded into artistic and scientific domains.
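To illustrate the general flavor of goal-driven story generation (without reproducing Meehan's actual TALE-SPIN code), the following toy Python sketch lets a character satisfy a goal by recursively satisfying sub-goals drawn from a small, hypothetical rule table; the trace of actions becomes the narrative.

    # Toy sketch in the spirit of goal-driven story generation (not Meehan's
    # actual TALE-SPIN code): a character satisfies a goal by recursively
    # satisfying sub-goals from a small rule table, and the trace of actions
    # is narrated as the story.
    WORLD = {"honey": "in the oak tree"}

    # Each rule maps a goal to the sub-goals that must be satisfied first.
    PLANS = {
        ("Joe Bear", "eat honey"): [("Joe Bear", "know where the honey is"),
                                    ("Joe Bear", "go to the honey")],
        ("Joe Bear", "know where the honey is"): [("Joe Bear", "ask Irving Bird")],
    }

    def satisfy(goal, story):
        actor, want = goal
        for subgoal in PLANS.get(goal, []):      # recursively achieve prerequisites
            satisfy(subgoal, story)
        if want == "ask Irving Bird":
            story.append(f"{actor} asked Irving Bird where the honey was.")
            story.append(f"Irving Bird said the honey was {WORLD['honey']}.")
        elif want == "go to the honey":
            story.append(f"{actor} walked to the honey {WORLD['honey']}.")
        elif want == "eat honey":
            story.append(f"{actor} ate the honey and was not hungry anymore.")

    story = ["Joe Bear was hungry."]
    satisfy(("Joe Bear", "eat honey"), story)
    print(" ".join(story))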
Artistic image generation was one of the disciplines that saw early potential in computationally generated artifacts. One of the most prominent examples was Harold Cohen's AARON, which produced art through the composition and adaptation of figures based on a large set of symbolic rules and heuristics for visual composition. Other systems tackled creativity in scientific endeavors: BACON was said to rediscover natural laws such as Boyle's law and Kepler's third law through hypothesis testing in constrained search spaces.
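The kind of constrained hypothesis search attributed to BACON can be sketched as follows. This is not the original BACON program; it simply searches a small space of power-law combinations a^i * T^j of planetary data for one that stays nearly constant, which recovers the T^2/a^3 invariant of Kepler's third law.

    from itertools import product

    # Illustrative sketch (not the original BACON code): search a small space of
    # power-law combinations a^i * T^j for one whose value stays (nearly)
    # constant across observations.

    # (a, T): semi-major axis in AU and orbital period in years for a few planets.
    DATA = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000), (5.203, 11.862)]

    def spread(values):
        """Relative spread of a list of positive values."""
        return (max(values) - min(values)) / (sum(values) / len(values))

    best = None
    for i, j in product(range(-3, 4), repeat=2):
        if i == 0 and j == 0:
            continue
        vals = [(a ** i) * (t ** j) for a, t in DATA]
        s = spread(vals)
        if best is None or s < best[0]:
            best = (s, i, j)

    s, i, j = best
    print(f"most invariant combination: a^{i} * T^{j}  (relative spread {s:.4f})")
    # Expected to find a^-3 * T^2 (or an equivalent form), i.e. Kepler's third law.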
By the 1990s, modeling techniques became more adaptive, attempting to implement cognitive rules of creativity for generation. Turner's MINSTREL introduced TRAMs (Transform-Recall-Adapt Methods) to simulate the creative re-use of prior material in generative storytelling, while Pérez y Pérez's MEXICA modeled the creative writing process using cycles of engagement and reflection. As systems increasingly incorporated models of internal evaluation, another approach that emerged was the combination of symbolic generation with domain-specific evaluation metrics, modeling both generative and selective steps of creativity.
In the field of computational humor, the JAPE system generated pun-based riddles using Prolog and WordNet, applying symbolic pattern-matching rules over a large lexical database to compose wordplay. WordNet, a lexical database developed by George Miller and his team at Princeton University, has served, along with the word-mapping structures it inspired, as the backbone of several syntactic and semantic AI programs. A notable system for music generation was David Cope's EMI (Experiments in Musical Intelligence, also known as Emmy), which was trained on the styles of composers such as Bach, Beethoven, and Chopin and generated novel pieces in those styles through pattern abstraction and recomposition.
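The lexical lookups underlying pun schemata can be illustrated with a brief sketch. JAPE itself was written in Prolog; the Python example below instead uses NLTK's WordNet interface and a single, hypothetical riddle template to show how glosses of a compound word's parts can be slotted into a wordplay schema.

    # Toy sketch in the spirit of JAPE's schema-based riddles (JAPE itself was
    # written in Prolog); here NLTK's WordNet interface supplies the lexical
    # knowledge, and the riddle template is a simplified stand-in for JAPE's rules.
    import nltk
    from nltk.corpus import wordnet as wn

    nltk.download("wordnet", quiet=True)

    def gloss(word):
        """Return a short WordNet definition for the most common noun sense."""
        synsets = wn.synsets(word, pos=wn.NOUN)
        return synsets[0].definition() if synsets else None

    def compound_riddle(part1, part2):
        """Fill a riddle schema from the glosses of a compound word's parts."""
        g1, g2 = gloss(part1), gloss(part2)
        if g1 and g2:
            return (f"What do you get if you cross {part1} ({g1}) "
                    f"with {part2} ({g2})? A {part1}{part2}!")
        return None

    print(compound_riddle("cat", "fish"))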
In the 2000s and beyond, machine learning began influencing creative system design. Researchers such as Mihalcea and Strapparava trained classifiers to distinguish humorous from non-humorous text using stylistic and semantic features. Meanwhile, custom computational approaches led to chess systems like Deep Blue generating quasi-creative gameplay strategies through search algorithms and parallel processing, constrained by specific rules and patterns for evaluation.
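A minimal sketch of that classification setup is shown below. It does not use Mihalcea and Strapparava's original features or corpus; it simply trains a scikit-learn bag-of-words pipeline on a handful of made-up examples to separate humorous from non-humorous sentences.

    # Minimal sketch of a humor classifier (not Mihalcea and Strapparava's
    # original features or corpus): bag-of-words features plus a linear model.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = [
        "I used to be a banker but I lost interest.",        # humorous
        "Velcro: what a rip-off.",                           # humorous
        "The meeting is scheduled for 10 a.m. on Tuesday.",  # non-humorous
        "The report summarizes quarterly sales figures.",    # non-humorous
    ]
    labels = [1, 1, 0, 0]  # 1 = humorous, 0 = non-humorous

    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    model.fit(texts, labels)

    print(model.predict(["I wondered why the ball kept getting bigger. Then it hit me."]))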
The institutional development of computational creativity grew alongside its technical advances. Dedicated workshops such as the International Joint Workshop on Computational Creativity (IJWCC) grew out of interdisciplinary conferences on AI and creativity held in the late 1990s and early 2000s, and the field later coalesced around the annual International Conference on Computational Creativity (ICCC), first held in 2010. More recently, with the advent of deep learning, Transformers, and further refinements in machine learning architectures, computational creativity has gained new tools for its implementation.

Machine learning for computational creativity

While traditional computational approaches to creativity rely on the explicit formulation of prescriptions by developers and a certain degree of randomness in computer programs, machine learning methods allow computer programs to learn heuristics from input data, enabling creative capacities within those programs. In particular, deep artificial neural networks can learn patterns from input data that allow for the non-linear generation of creative artefacts. Artificial neural networks were used to model certain aspects of creativity as early as 1989, when Peter Todd first trained a neural network to reproduce musical melodies from a training set of musical pieces and then used a change algorithm to modify the network's input parameters, allowing the network to generate new music in a highly uncontrolled, random manner. In 1992, Todd extended this work using the so-called distal teacher approach that had been developed by Paul Munro, Paul Werbos, D. Nguyen and Bernard Widrow, Michael I. Jordan, and David Rumelhart. In this approach, two neural networks are used, one of which supplies training patterns to the other.
In later efforts by Todd, a composer would select a set of melodies that define a melody space, position them on a two-dimensional plane with a mouse-based graphical interface, train a connectionist network to produce those melodies, and then listen to the new "interpolated" melodies that the network generates for intermediate points on the plane.
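The interpolation idea can be sketched as follows. This is not Todd's original connectionist architecture; it is a tiny NumPy feed-forward network that maps a point on a 2-d plane to an eight-note melody, is trained on two anchor melodies placed at opposite corners, and is then queried at an intermediate point.

    import numpy as np

    # Sketch of the interpolation idea (not Todd's original architecture): a tiny
    # feed-forward network maps a 2-d plane coordinate to an eight-note melody.
    rng = np.random.default_rng(0)

    coords = np.array([[0.0, 0.0], [1.0, 1.0]])              # positions chosen by the composer
    melodies = np.array([[60, 62, 64, 65, 67, 69, 71, 72],   # MIDI pitches of the anchor melodies
                         [60, 64, 67, 72, 67, 64, 60, 55]], dtype=float)
    targets = melodies - 64.0                                # center pitches to ease training

    W1 = rng.normal(0.0, 0.5, (2, 16)); b1 = np.zeros(16)
    W2 = rng.normal(0.0, 0.5, (16, 8)); b2 = np.zeros(8)

    def forward(x):
        h = np.tanh(x @ W1 + b1)
        return h @ W2 + b2, h

    for _ in range(5000):                                    # plain gradient descent on squared error
        out, h = forward(coords)
        err = out - targets
        gW2 = h.T @ err; gb2 = err.sum(axis=0)
        dh = (err @ W2.T) * (1.0 - h ** 2)
        gW1 = coords.T @ dh; gb1 = dh.sum(axis=0)
        for p, g in ((W2, gW2), (b2, gb2), (W1, gW1), (b1, gb1)):
            p -= 0.005 * g

    midpoint, _ = forward(np.array([[0.5, 0.5]]))            # an intermediate point in the plane
    print(np.round(midpoint + 64.0))                         # a melody "between" the two anchors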

Language models and hallucination

Language models such as GPT and LSTM-based models are used to generate texts for creative purposes, such as novels and scripts. These models occasionally demonstrate hallucination, in which erroneous material is presented as factual, and some creators deliberately exploit this hallucinatory tendency to capture unintended results. Ross Goodwin's 1 the Road, for example, uses an LSTM model trained on literary corpora to generate a novel that refers to Jack Kerouac's On the Road, based on multimodal input captured by a camera, a microphone, a laptop's internal clock, and a GPS unit throughout a road trip. Brian Merchant described the novel as "pixelated poetry in its ragtag assemblage of modern American imagery". Oscar Sharp and Ross Goodwin created the experimental sci-fi short film Sunspring in 2016, written with an LSTM model trained on the scripts of science-fiction movies from the 1980s and 1990s. Rodica Gotca critiqued the result for its overall lack of focus on narrative and of an intention to create grounded in the background of human culture.
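In practice, one common knob for inviting such unexpected output is the sampling temperature: scaling a model's next-token scores before sampling makes unlikely continuations more or less probable. The sketch below uses made-up token scores purely for illustration.

    import numpy as np

    # Illustrative sketch of temperature sampling; the token list and scores
    # are made up. Higher temperatures flatten the next-token distribution,
    # making unlikely (potentially "hallucinatory") continuations more probable.
    tokens = ["road", "highway", "moon", "typewriter", "seagull"]
    logits = np.array([3.2, 2.9, 0.4, -0.5, -1.1])           # hypothetical model scores

    def sample(logits, temperature, rng):
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())                # softmax with temperature
        probs /= probs.sum()
        return rng.choice(len(logits), p=probs)

    rng = np.random.default_rng(7)
    for t in (0.3, 1.0, 2.0):
        picks = [tokens[sample(logits, t, rng)] for _ in range(8)]
        print(f"temperature {t}: {picks}")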
Nevertheless, researchers highlight the positive side of language models' hallucination for generating novel solutions, provided that the correctness and consistency of the response can be controlled. Jiang et al. propose a divergence-convergence flow model for harnessing these hallucinatory effects. They summarize the types of such effects discussed in current research as factuality hallucinations and faithfulness hallucinations, which can be divided into smaller classes such as factual fabrication and instruction inconsistency. While the divergence stage involves generating potentially hallucinatory content, the convergence stage focuses on selecting the hallucinations that are useful to the user through intent recognition and evaluation metrics.
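A minimal sketch of such a divergence-convergence pipeline is given below. The functions generate_candidates and score_against_intent are hypothetical stand-ins for a language-model sampling call and an evaluation metric, respectively; the point is only the two-stage structure of wide generation followed by intent-based filtering.

    from typing import Callable

    # Minimal sketch of a divergence-convergence pipeline in the spirit of
    # Jiang et al.; generate_candidates and score_against_intent are
    # hypothetical stand-ins for a language-model call and an evaluation metric.
    def divergence_convergence(
        prompt: str,
        generate_candidates: Callable,    # e.g. high-temperature LM sampling: (prompt, n) -> texts
        score_against_intent: Callable,   # e.g. relevance/consistency scoring: (prompt, text) -> float
        n_candidates: int = 20,
        keep: int = 3,
    ):
        # Divergence: cast a wide, possibly hallucinatory net of candidates.
        candidates = generate_candidates(prompt, n_candidates)
        # Convergence: rank candidates by how well they serve the user's intent
        # and retain only the most useful ones.
        ranked = sorted(candidates, key=lambda c: score_against_intent(prompt, c), reverse=True)
        return ranked[:keep]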