Artificial intelligence visual art


Artificial intelligence visual art, or AI art, is visual artwork generated or enhanced with artificial intelligence programs, most commonly text-to-image models. The process of automated art-making has existed since antiquity. The field of artificial intelligence was founded in the 1950s, and artists began to create art with artificial intelligence shortly after the discipline's founding. A select number of these creations have been showcased in museums and have been recognized with awards. Throughout its history, AI has raised many philosophical questions related to the human mind, artificial beings, and the nature of art in human–AI collaboration.
During the AI boom of the 2020s, text-to-image models such as Midjourney, DALL-E and Stable Diffusion became widely available to the public, allowing users to quickly generate imagery with little effort. Commentary about AI art in the 2020s has often focused on issues related to copyright, deception, defamation, and its impact on more traditional artists, including technological unemployment.

History

Early history

Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems.
In the 19th century, Ada Lovelace wrote that "computing operations" could potentially be used to generate music and poems. In 1950, Alan Turing's paper "Computing Machinery and Intelligence" considered whether machines could convincingly mimic human behavior. Shortly after, the academic discipline of artificial intelligence was founded at a research workshop held at Dartmouth College in 1956.
Since its founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence, issues that myth, fiction, and philosophy had explored since antiquity.

Artistic history

Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as algorithmic art, computer art, digital art, or new media art.
One of the first significant AI art systems is AARON, developed by Harold Cohen beginning in the late 1960s at the University of California, San Diego. AARON uses a symbolic, rule-based approach to generate technical images, typical of the GOFAI era of programming, and Cohen developed it with the goal of being able to code the act of drawing. AARON was exhibited in 1972 at the Los Angeles County Museum of Art. From 1973 to 1975, Cohen refined AARON during a residency at the Artificial Intelligence Laboratory at Stanford University. In 2024, the Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early robotic drawing machines.
Karl Sims has exhibited art created with artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines. In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution. In 1997, Sims created the interactive artificial evolution installation Galápagos for the NTT InterCommunication Center in Tokyo. Sims received an Emmy Award in 2019 for outstanding achievement in engineering development.
In 1999, Scott Draves and a team of several engineers created and released Electric Sheep as a free software screensaver. Electric Sheep is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundación Telefónica Life 4.0 prize for Electric Sheep.
In 2014, Stephanie Dinkins began working on Conversations with Bina48. For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman. In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture of people of color."
In 2015, Sougwen Chung began Mimicry, an ongoing collaboration between the artist and a robotic arm. In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. In 2018, an auction sale of artificial intelligence art was held at Christie's in New York, where the AI artwork Edmond de Belamy, created with a generative adversarial network, sold for US$432,500, almost 45 times its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective.
In 2024, the Japanese film generAIdoscope was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi. All video, audio, and music in the film were created with artificial intelligence.
In 2025, the Japanese anime television series Twins Hinahima was released. The anime was produced and animated with AI assistance, which was used to convert photographs into anime illustrations during the cutting process; the results were later retouched by art staff. Most of the remaining elements, such as characters and logos, were hand-drawn with various software.

Technical history

Deep learning, characterized by multi-layer neural networks that attempt to mimic the structure of the human brain, rose to prominence in the 2010s, causing a significant shift in the world of AI art. In the deep learning era, the main families of models used for generative art are autoregressive models, diffusion models, generative adversarial networks (GANs), and normalizing flows.
In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network, a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" that tries to distinguish generated images from real ones, with each network improving against the other. Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images.
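A minimal sketch of this adversarial setup, using PyTorch with tiny illustrative networks and placeholder data rather than any specific published model:

```python
# Minimal GAN sketch: a generator maps noise to images, while a discriminator
# tries to tell generated images from real ones.
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28  # illustrative sizes, not from any paper

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, img_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # real/fake logit
)

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(100):
    # Stand-in for a batch of real images; a real project would load a dataset.
    real = torch.rand(32, img_dim) * 2 - 1
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator: label real images 1 and generated images 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: try to make the discriminator label fakes as real.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

In practice the placeholder batch would come from a dataset of example images, which is how the network learns a particular aesthetic.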
In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia. The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. Later, in 2017, a conditional GAN learned to generate images of the 1,000 classes of ImageNet, a large visual database designed for use in visual object recognition research. By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models.
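The DeepDream effect can be approximated by gradient ascent on the input image to amplify the activations of a chosen layer of a pretrained convolutional network; the following sketch assumes PyTorch and torchvision, with the layer choice, step count, and input picked arbitrarily for illustration:

```python
# DeepDream-style sketch: gradient ascent on an input image to amplify the
# activations of a chosen convolutional layer.
import torch
from torchvision import models

cnn = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:20].eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in for a photo
optimizer = torch.optim.Adam([image], lr=0.05)

for _ in range(50):
    optimizer.zero_grad()
    activations = cnn(image)
    loss = -activations.norm()  # ascend: make the chosen layer fire strongly
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        image.clamp_(0, 1)      # keep pixel values in a displayable range
```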
Autoregressive models have also been used for image generation; PixelRNN, for example, generates an image one pixel at a time using a recurrent neural network. Soon after the Transformer architecture was proposed in "Attention Is All You Need", it was used for autoregressive generation of images, though without text conditioning.
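A toy sketch of the pixel-by-pixel idea, in the spirit of PixelRNN but not its actual architecture, assuming PyTorch and an untrained model:

```python
# Toy autoregressive sampler: each pixel value is drawn from a distribution
# conditioned on all previously generated pixels.
import torch
import torch.nn as nn

num_levels, hidden = 256, 128                  # 8-bit grayscale; sizes illustrative
embed = nn.Embedding(num_levels + 1, hidden)   # +1 for a "start" token
rnn = nn.GRU(hidden, hidden, batch_first=True)
head = nn.Linear(hidden, num_levels)           # logits over the next pixel value

@torch.no_grad()
def sample_image(height=8, width=8):
    pixels, state = [num_levels], None         # begin with the start token
    for _ in range(height * width):
        x = embed(torch.tensor([[pixels[-1]]]))   # embed the previous pixel
        out, state = rnn(x, state)                # update the recurrent state
        probs = head(out[:, -1]).softmax(dim=-1)
        pixels.append(torch.multinomial(probs, 1).item())
    return torch.tensor(pixels[1:]).reshape(height, width)

print(sample_image())   # untrained weights, so the output is random noise
```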
The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN to allow users to generate and modify images such as faces, landscapes, and paintings.
In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks.
In 2021, OpenAI released a series of images created with its text-to-image model DALL-E 1, which builds on the influential generative pre-trained transformer (GPT) architecture used in GPT-2 and GPT-3; it is an autoregressive generative model with essentially the same architecture as GPT-3. Later in 2021, EleutherAI released the open-source VQGAN-CLIP, based on OpenAI's CLIP model. Diffusion models, generative models that create synthetic data by learning from existing data, were first proposed in 2015, but they only surpassed GANs in image quality in early 2021. The latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion, developed through a collaboration between Stability AI, the CompVis Group at Ludwig Maximilian University of Munich, and Runway.
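A rough sketch of the core diffusion training idea, with a toy denoiser and placeholder data rather than the architecture of any system named above: noise is added to an image at a randomly chosen level, and a network is trained to predict that noise so the process can later be reversed to generate new images.

```python
# Toy diffusion-model training step: noise an image, predict the added noise.
import torch
import torch.nn as nn

T = 1000                                   # number of noise levels
betas = torch.linspace(1e-4, 0.02, T)      # a common linear noise schedule
alphas_bar = torch.cumprod(1 - betas, 0)   # cumulative signal retention

denoiser = nn.Sequential(nn.Linear(28 * 28 + 1, 256), nn.ReLU(),
                         nn.Linear(256, 28 * 28))   # illustrative tiny network
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

for step in range(100):
    x0 = torch.rand(32, 28 * 28)           # stand-in for a batch of real images
    t = torch.randint(0, T, (32,))         # random noise level per image
    eps = torch.randn_like(x0)
    a = alphas_bar[t].unsqueeze(1)
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps            # noised images
    pred = denoiser(torch.cat([xt, t.float().unsqueeze(1) / T], dim=1))
    loss = ((pred - eps) ** 2).mean()                     # predict the added noise
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```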
In 2022, Midjourney was released, followed by Google Brain's Imagen and Parti, which were announced in May 2022, Microsoft's NUWA-Infinity, and the source-available Stable Diffusion, which was released in August 2022. DALL-E 2, a successor to DALL-E, was beta-tested and released the same year. Stability AI offers a Stable Diffusion web interface called DreamStudio; plugins are available for Krita, Photoshop, Blender, and GIMP, and Automatic1111 provides a web-based open-source user interface. Stable Diffusion's main pre-trained model is shared on the Hugging Face Hub.
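For example, a pre-trained Stable Diffusion checkpoint can be loaded from the Hub with the diffusers library; the checkpoint identifier and prompt below are shown for illustration only:

```python
# Sketch: loading a Stable Diffusion checkpoint from the Hugging Face Hub with
# the diffusers library and generating an image from a text prompt.
import torch
from diffusers import StableDiffusionPipeline

# "runwayml/stable-diffusion-v1-5" is one widely used checkpoint identifier.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # assumes a GPU; use "cpu" and float32 otherwise

image = pipe("an oil painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```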
Ideogram was released in August 2023; the model is known for its ability to generate legible text.
In 2024, Flux was released. The model can generate realistic images and was integrated into Grok, the chatbot used on X, and Le Chat, the chatbot of Mistral AI. Flux was developed by Black Forest Labs, founded by the researchers behind Stable Diffusion. Grok later switched to its own text-to-image model, Aurora, in December of the same year. Several companies have also integrated AI image-generation models into their image-editing products: Adobe has released and integrated the Firefly model into Premiere Pro, Photoshop, and Illustrator, and Microsoft has publicly announced AI image-generation features for Microsoft Paint. Examples of text-to-video models of the mid-2020s include Runway's Gen-4, Google's VideoPoet, OpenAI's Sora, which was released in December 2024, and LTX-2, which was released in 2025.
In 2025, several models were released. GPT Image 1 from OpenAI, launched in March 2025, introduced new text rendering and multimodal capabilities, enabling image generation from diverse inputs like sketches and text. Midjourney v7 debuted in April 2025, providing improved text prompt processing. In May 2025, Flux.1 Kontext by Black Forest Labs emerged as an efficient model for high-fidelity image generation, while Google's Imagen 4 was released with improved photorealism. Flux.2 debuted in November 2025 with improved image reference, typography, and prompt understanding.