DALL-E 2: Recombinant Art & Design

May 17, 2022

For about a year and a half, I’ve struggled to come up with a good label for what I believe is the next observable layer of human creativity.

In my YouTube series GPT-X, DALL-E, and our Multimodal Future, I made a video predicting this emerging area, which I described as “mixing and texturing”.

It seemed so important that I opened the series with this lesson and described it as “the essence of multimodal creativity”.

It basically involved writing prompts that combine things in new and interesting ways. Here are some screenshots from the video:

Now that I have access to DALL-E 2, not only am I creating art in this format, but I’m also seeing it everywhere from other beta users. You can distill the format of each prompt down to a kind of basic math expression:

Bakz T. Future 🇨🇦 @bakztfuture

Illustration of cool oversized grizzly bears in the style of NFT'S #dalle2 #dalle

= Illustration ∪ Cool Oversized Grizzly Bear ∪ Style of NFT’s

Bakz T. Future 🇨🇦 @bakztfuture

"Photo of a drumset made of cheese" #dalle #dalle2

= Photo ∪ Drumset ∪ Cheese

Bakz T. Future 🇨🇦 @bakztfuture

"Photo of a confident oversized grizzly bear dressed in an abstract high fashion outfit wearing sunglasses on the fashion runway at Paris fashion week" #dalle2 #dalle

= Photo ∪ Confident Oversized Grizzly Bear ∪ (Dressed in abstract high fashion outfit + wearing sunglasses) ∪ (Fashion runway + Paris Fashion week)

= Pencil Sharpener Industrial Design ∪ (Lamborghini OR The Style of Picasso OR … etc)
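To make the expression idea concrete, here is a minimal Python sketch of how an expression with OR alternatives could be expanded into individual prompts. This is only an illustration of the notation, not how DALL-E 2 works internally. The function name `expand` and the connecting phrases “inspired by” and “in the style of” are my own assumptions, and only the two alternatives named above are used.

```python
from itertools import product

def expand(*slots):
    """Each slot is a list of alternatives (the OR).
    One choice per slot is joined into a prompt (the ∪)."""
    return [" ".join(choice) for choice in product(*slots)]

# Hypothetical phrasing for the pencil sharpener expression above.
prompts = expand(
    ["Pencil sharpener industrial design"],
    ["inspired by Lamborghini", "in the style of Picasso"],
)

for prompt in prompts:
    print(prompt)
```

Adding another slot, for example a list of mediums, multiplies the number of prompts, which is roughly what it feels like to explore combinations by hand.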

Since we are creating images from scratch using nothing but text prompts, we end up taking simple or unusual things and combining them in interesting ways to make something completely new.

Currently, it appears it’s all about:

Generating images using text prompts
Combining the essence of two or more objects, fragments, items, art styles, art techniques, mediums, characters, specific artistic works, periods in history, colors, or ideas
Prepending or appending text into a prompt, piling in more text descriptions to see what the model generates
Using modifiers like “hyper realistic”, “unreal engine”, or “photo of” as a prompt design technique to get the desired quality
Using high-level language to describe the characteristics and qualities of a desired art piece as fully as possible
Creating things which have never been seen before or cannot even be unseen
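The habits above can be sketched as a small prompt-building helper. This is a rough sketch under my own assumptions: `build_prompt` and its parameters are hypothetical names, and the output is just a text string you would paste into DALL-E 2, not a call to any real API.

```python
def build_prompt(subject, made_of=None, medium=None, modifiers=()):
    """Combine a subject with an optional material, medium, and modifiers."""
    # Combine the essence of two things: a subject and a material.
    core = subject if made_of is None else f"{subject} made of {made_of}"
    # Prepend a medium such as "Photo" or "Illustration".
    if medium is not None:
        core = f"{medium} of a {core}"
    # Append quality modifiers, separated by commas.
    return ", ".join([core, *modifiers])

base = build_prompt("drumset", made_of="cheese", medium="Photo")
styled = build_prompt(
    "drumset",
    made_of="cheese",
    medium="Photo",
    modifiers=["hyper realistic", "unreal engine"],
)

print(base)
print(styled)
```

Here `base` reproduces the cheese drumset prompt from earlier, and `styled` shows the “piling in more text” habit of appending modifiers to see what the model does with them.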

I’m a complete art history noob

One of the reasons I’ve struggled to come up with a name for this phenomenon is that I wondered whether an art form already exists that describes it. It’s certainly possible that all art is already some example of this kind of on-the-fly mixing of elements. My art history experience is limited, but you could make comparisons to avant-garde art, mixed media, surrealism, maybe collage, and the closest is probably eclecticism.

I’ve heard of other art movements too, such as ones where people play a kind of game of telephone, passing ideas along and ending up with something completely new.

Surprisingly, I found that this hilarious scene from The Simpsons, portraying Yoko Ono, almost describes the arbitrary nature of drafting DALL-E 2 Prompts:

Although I think the scene is meant to mock the strangeness of her work, you can almost imagine a DALL-E 2 generation of, “A Single Plum, Floating in Perfume, Served In a Man’s Hat” posted on the DALL-E 2 subreddit or elsewhere and it would fit right in! In fact, I’ve generated it here:

A Single Plum, Floating in Perfume, Served inside a Man’s Hat, digital art

However, please keep in mind that not all DALL-E 2 generations are meant to be as strange or distasteful as this one!

The Key Differentiator? You are combining the essence of at least two things

In the past, with mixed media or even music beat sampling, you were very explicitly combining previous works together. In the world of multimodal AI models like DALL-E 2, by contrast, the system is able to understand the “gist” of not just objects like apples but also the collective works of Shakespeare or the general look and feel of brutalist industrial product design. Keep in mind, DALL-E isn’t explicitly copying a specific image and pasting it when it generates something. Loosely, it has learned the ideas or features/representations that comprise something and incorporates that overall understanding into your work. The system’s ability to understand a higher level of hierarchy, compositionality, and abstraction shared between the modalities of images and language allows Multimodal Artists to create things artists simply could not dream of in the past … which is why I think it’s important to find a suitable label for this new art style and movement!

As I’ve already stated in my podcast, I believe DALL-E 2 will be a multibillion-dollar product and will shape a multitude of industries around the world. It’s important to have a clear label for this unique phenomenon for all the new creatives who will be using it. In order to build on it and write cool prompts, we’ll need an easy way to describe it! That is why I’m proposing the term “Recombinant Art”.

Introducing Recombinant Art - The Essence of Multimodal Creativity

The word recombinant, coming from genetic engineering, means:

Recombinant means formed by the process of recombination, which is the process of combining two different genetic materials to produce a new genetic combination with specific characteristics.

Wikipedia’s definition:

Recombinant DNA (rDNA) molecules are DNA molecules formed by laboratory methods of genetic recombination (such as molecular cloning) that bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome.
Recombinant DNA is the general name for a piece of DNA that has been created by combining at least two fragments from two different sources.

What I like about this term is that it captures the creator’s perspective of intentionally mixing at least two things. In addition, DNA is similar to the idea of the “essence” of something, which multimodal AI models allow us to mix together. I also like the genetic engineering context, which sounds so space age and suits an area driven by artificial intelligence. There is an evolutionary angle too, which I like. Odds are, people will be inspired by early DALL-E 2 generations and build on top of the culture. I think it’s likely we will see Recombinant Art combined with other works of Recombinant Art to make entirely new things.

The term also sounds like something you’d hear in the tech community, and I’m proud of this because we’re looking at an art and cultural revolution which may be uniquely driven by the tech industry. Many of the earliest users of DALL-E 2 are first and foremost technologists.

Finally, the term sounds pleasing to me. In the future, I could see multimodal AI artists asking, “What does the Recombinant of that look like?” or saying, “This one is my favourite Recombinant of the two”. Maybe it could even become slang, like “What’s the recomb in the style of Picasso?”.

Please keep in mind that the term Recombinant Art appears to already exist, in this case within the genetic engineering community itself, but I hope it’s OK if the term is shared with the multimodal AI world too.

Altogether I would define Recombinant Art as:

AI-generated works which involve combining the essence of at least two fragments, ideas, or things in new, surprising, or unexpected ways.

Where did Recombinant Art come from? How did we end up here?

I think Recombinant Art is a result of a few factors:

DALL-E 2, Multimodal AI Model Architecture and Limitations

By design, text-to-image model architectures like this one receive a limited length of text as input and output an image as a result. The core way to use DALL-E 2 is to input text and get an image back. What I’m trying to say here is that I believe Recombinant Art has emerged simply as a reflection of the hard technical specifications of the multimodal AI model itself. In the future, as DALL-E becomes even more capable and intelligent, we will unlock even newer kinds of artistic possibilities.

Bias of Technological Community

As I alluded to earlier, I find it fascinating that DALL-E 2 prompts feel a lot like simple mathematical expressions. I think this unique approach to generating art may be a reflection of the tech community itself, which is heavily logical. Perhaps this is a convenient shorthand for technologists trying to be more emotionally expressive, creative, and artistic.

Big Opportunity, Higher Level of Artistic Abstraction

I think this is a massive opportunity in an area which hasn’t really been explored at this level before. The higher level of artistic abstraction achieved by something like DALL-E 2 means more people than ever can be artists and create things we’ve never seen before. Many existing artists immediately see the tremendous creative opportunity in a system like DALL-E.

Where does Recombinant Art fit in the larger scheme of things?

If I had to create some kind of “layered taxonomy” about how Recombinant art fits into the grander scheme of things in the pop culture/digital media world, I would describe it like so:

Through advances in technology, sampling became highly influential in pop music, and this development was followed by remixing. In both of these cases, I would say previous works were pretty explicitly used and mixed together using digital tools. In 2022, the difference is that with multimodal AI we are using implicit representations, or the “essence”, of previous works and things in order to create entirely new works of art. I believe we are now in this age, a Cambrian explosion in fact, of Recombinant Art. Also, jumping ahead, as I argued in a podcast episode last year, I think we’ll see Recombinant Texturing enter the limelight as the next art style movement in the multimodal AI art space.

Multimodal by Bakz T. Future
