Diffusion Delusion Confusion

homepage ❥~ TIANRAN QIAN ✶.✩

Exploring how to make space in this alien aesthetic machine~

Tianran is a transdisciplinary, research-based creative practitioner born in Hangzhou, China, and currently based in Brooklyn, New York. Her work investigates the aesthetic mediation of digital writing, the materiality of artificial intelligence, agency and algorithmic affect, as well as space-making and community organizing practices. Her multimedia projects encompass independent publishing, photography, installation, writing, and workshops.

Professionally, she has worked as a digital talent agent and producer at the intersection of new media art and marketing. This role provided her with market-driven insights into how new media infrastructure shapes visual culture and value production, transforming urban fabric and public space.

MFA: Design and Technology
@Parsons School of Design, The New School

BA: Global Liberal Studies in Contemporary Culture and Creative Production
@ New York University

Research interests:
- computational aesthetics
- technopolitics, infrastructure
- digital writing as imperial technology
- (im)materiality and transmediality of digital media
- translingual practice
- embodiment and affect
- meaning making and value production
- human agency, space-making and creative intervention
- epistemology and knowledge representation

Contact:
tianran.space@gmail.com
@_tianran_

👀

Panel ❥ Diffusion Delusion Confusion ⦦⦧ ❥

2024.06.16

Diffusion Delusion Confusion:
Zoom Practice Sharing Panel on AI Image Generation and Diffusion Models

Participants:
Tianran Qian
Yufeng Zhao
Jakob Sitter
Shane Fu
Tuğrul Şalcı

Poster Design:
Sirui Liao

Why did we do this?

This panel stems from my interest in AI-generated images, as the materiality of visuality today has completely transformed. Images used to be paintings and photographs; now they could be statistical renderings or infographics generated with a huge amount of energy and infrastructure, with millions of contributing users as labor.
Luckily, I met these friends, avid and whimsical practitioners who have been thinking about and using these tools to creatively and critically engage with data, platforms, interactions and so on. I’m hosting this panel because of this freaky feeling I get from AI, especially from generated images. None of them could surprise us anymore at this point, but every single one of them here is so freaky to look at.
And so I wanted them to meet, because not that many people around me, except for these four here, are engaging with the materiality of AI. Maybe it’s a combination of technical barriers and media and visual literacy that has become a high invisible wall between us and these platforms and techniques.
Generational gap! Just creating a more cozy intimate discussion space for ourselves.

Situating it in the everyday

However, the practices of our everyday life have become increasingly infused with and mediated by AI. It is shaping our world, and transforming how social and economic life takes place. It produces new ways of doing things, accelerates and automates existing practices, transforms social and economic relations, and creates new horizons for cultural activity.
Two years ago, we were fascinated by DALL·E 1 as if it were a magical tool. But it soon became an ambient norm. Whenever a new model comes out, it completely changes the game. Artists and creative practitioners are at the forefront of exploring these machine learning black boxes, and I believe it is valuable to host this panel as an opportunity to document how we’re thinking about this now, as it is gonna change quickly again with the perpetual updates.
But before they share their practices and thoughts, I want to lay down some frameworks to ground our discussion a little bit first. I’ll start by introducing the denoising process of diffusion models.

Diffusion models

Diffusion models are a class of generative models that learn to generate data by reversing a diffusion process. The diffusion process involves gradually adding noise to data, while the reverse process (denoising) involves removing this noise step by step to generate new data samples.
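To make the "gradually adding noise" half concrete, here is a minimal sketch in Python with NumPy. It assumes the closed-form Gaussian noising used in DDPM-style models with a linear noise schedule. The function names, the schedule values, and the tiny random "image" are all illustrative assumptions, not taken from any specific model discussed in the panel.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed values, for illustration only)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    # alpha_bars[t] = fraction of the original signal variance left at step t
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def add_noise(image, t, alpha_bars, rng):
    """Jump straight to step t of the forward (noising) process."""
    noise = rng.standard_normal(image.shape)
    noisy = np.sqrt(alpha_bars[t]) * image + np.sqrt(1.0 - alpha_bars[t]) * noise
    return noisy, noise

rng = np.random.default_rng(0)
# Stand-in for a real picture: an 8x8 grid of values in [-1, 1]
image = rng.uniform(-1.0, 1.0, size=(8, 8))
betas, alpha_bars = make_schedule()

slightly_noisy, _ = add_noise(image, 10, alpha_bars, rng)      # still recognizable
almost_pure_noise, _ = add_noise(image, 999, alpha_bars, rng)  # nearly all static
```

Early steps barely change the picture, while the last step leaves almost nothing of it. During training, a model is shown such noisy versions and learns to estimate the noise that was added, which is what lets it run the process in reverse.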

Noise

Noise becomes a key concept here. I have a theoretical definition coming up, but it is such a strong metaphor by itself and a concept we’re familiar with. What do you think noise is?
Claude Shannon, the father of information theory, defined information as entropy, and his foundational definition classifies noise as a separate source, added to the information channel. (Wendy Chun, Updating to Remain the Same)

De-noising

Let’s see how noise works with this film here by artist and researcher Eryk Salvaggio, as it clearly illustrates the generative process.
Every AI generated image by diffusion models starts from one single static frame of noise that’s randomly generated. Hidden between every pixel, there is a potential latent hidden image.
These diffusion models ask you to dream or imagine, by asking you to describe a picture. They steer us towards the end through noise by reference to your words. There is no image in the static frame yet, and so you describe a constellation and ask the model to find it in the stars. The model can only find approximations.
It first starts by removing noise that doesn’t fit your descriptions. Then it adds noise. Then it examines the image again, to see if the new noise introduces new paths to the image described. It does this repeatedly until it offers you an image.
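The remove-noise, add-noise, repeat loop can be sketched as a toy DDPM-style sampler. Big assumption: a real diffusion model uses a trained neural network, steered by your text prompt, to guess the noise at each step. Here that network is replaced by a hypothetical "oracle" function that already knows the target picture, so the sketch only shows the shape of the loop, not how a model actually learns or reads words.

```python
import numpy as np

rng = np.random.default_rng(0)
num_steps = 200
betas = np.linspace(1e-4, 0.05, num_steps)  # assumed noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

# The picture "described by the prompt": a bright square on a dark background
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

def oracle_predict_noise(x_t, t):
    """HYPOTHETICAL stand-in for the trained network: it cheats by knowing
    the target and returns exactly the noise separating x_t from it."""
    return (x_t - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

# Start from one frame of pure random static
x = rng.standard_normal(target.shape)

for t in range(num_steps - 1, -1, -1):
    # 1. Remove the noise that does not fit the description
    eps = oracle_predict_noise(x, t)
    x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    # 2. Add a little fresh noise back (skipped on the final step)
    if t > 0:
        x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)

# x now holds the generated picture
```

Because the oracle is perfect, the loop lands on the target itself. A trained model can only approximate that guess, which is why it finds approximations rather than the exact constellation you described.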

What creates this noise?

As we saw, in the beginning, there was noise. But what creates this noise? An AI model sees, knows, and experiences nothing. It can only follow instructions, drawing on what it has learned from a vast online dataset of images paired with text descriptions.
These paired images and texts are assemblages of human visual culture, including landscape paintings, art works, selfies, medical records, historical photographs, advertisements, scientific illustrations, and so on.

The models are not archives that store these images. Instead, they transform these images and their descriptions into data in an unusual way, slowly destroying them to preserve them. Each image has a small amount of information removed, but it’s easier to see it as an addition of noise. As the image degrades, it breaks apart following the familiar pattern of Gaussian noise. This noise creates clusters of pixels around the densest areas of visual information. The model learns how this noise travels across images, and the result is an image of pure noise.

Flower

The trajectory of this noise is recorded, then compared to the text that described the image. If the image has a flower, it’s added to the image of other flowers. Visual information becomes categories. Categories are labeled by the words found alongside the image. Generated flowers bloom backward into noise, then a new set of flowers can bloom from the debris.

Michelangelo vs. Diffusion models

To Michelangelo, every block of stone has a statue inside it and it is the task of the sculptor to discover it. He famously said “I saw the angel in the marble and carved until I set him free”, which emphasizes the sculptor’s visionary ability to see the potential within raw materials, and the meticulous skill required to reveal this potential.
To diffusion models, words are the sculptor that carves images out of noise. An image of noise contains every single pathway to every possible image. Same principle as Every Icon, if you know this work. Our words are the constraint. The more “precise” the words, the “clearer” the image.
But what is considered “precise” in the language of the machine, defined by whom, with what framework of standards? What does the clarity of the image hope to reveal? Who is the subject, looking at what object? From what angle, captured through which lens? What categories were documented to enter the data pool, in what way and for whose purpose?

Diffused City by Yufeng Zhao

Here is an example. Diffused City, developed by our first panelist Yufeng, is a program where you can type in a city’s name to generate its street view. I typed in New York here and got this general, average impression of the city. But when I typed in Bushwick sunset, this more specific view showed up, presenting how it probably looks, based on the central tendency of that dataset.
Every AI-generated image functions as an infographic or visualization of the dataset. Noise is clustered around the central tendencies of the images within the dataset. These tendencies are shaped by categories assigned to them through captions found on the internet.

“Mean Image”, Hito Steyerl

Artist Hito Steyerl sharply calls every AI-generated image a “mean” image. This is an image she generated with the prompt “Hito Steyerl”, and this came out. She looks better obviously; this image isn’t really flattering. And in the speech, she jokingly called the diffusion model a “mandatory artificially intelligent hair loss device” :)
So this looks rather mean, or demeaning. But the question is, what kind of mean? Whose mean is it? Quoting her, “this is an approximation of how society through a filter of average internet garbage sees me. All it takes is to remove the noise of reality from my photos, and just extract the social signal instead.”
Mean is a composite of: 1. shabby minor origins, 2. norm, 3. nasty, 4. meaning, 5. commons, 6. financial means / instrument. But all of these are far from hallucination, because they pick up on latent social patterns that encode conflicting significations or vector coordinates, and visualize real existing social attitudes.
Millions of pictures are scraped from the internet and averaged into a mean or medium that drives the output of all these images. These mean images are what she calls statistical renderings, which shift the focus of representation from establishing facts to expressing probabilities. They are data visualizations that represent the norm by signaling the mean. They replace likenesses with likelinesses.
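As a very literal illustration of "averaging into a mean", here is a small NumPy sketch of a pixel-wise composite. To be clear about the assumption: a diffusion model does not average pixels like this. It learns a statistical distribution over a dataset, and the "mean" in the sense above is a metaphor for that central tendency. The function name and the synthetic "portraits" are made up for the example.

```python
import numpy as np

def mean_image(images):
    """Pixel-wise average of a list of same-sized grayscale images."""
    stack = np.stack([np.asarray(img, dtype=np.float64) for img in images])
    return stack.mean(axis=0)

rng = np.random.default_rng(0)

# What every picture has in common: a horizontal band
shared_pattern = np.zeros((8, 8))
shared_pattern[3:5, :] = 1.0

# 500 synthetic "portraits": the shared pattern plus individual variation
portraits = [shared_pattern + 0.5 * rng.standard_normal((8, 8)) for _ in range(500)]

composite = mean_image(portraits)
# Individual variation cancels out in the average; the shared pattern remains
```

The composite looks far more like the shared pattern than any single portrait does. What makes each picture particular is treated as noise and averaged away, and only the common signal survives.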

Composite Photos and Eugenics

If we trace back to the history of composite photos here, we will find its relation to eugenics.
Laid out by Wendy Chun in her work Camera Obscura, the father of eugenics Sir Francis Galton defined eugenics as “the science which deals with all influences that improve the inborn qualities of a race.” The eugenical standpoint believes that permanent advance is to be made only by securing the best “blood”.
Composite photographs originated in the 19th century with the goal of creating a standard face to better identify criminals. Composite portraiture was a tool for visualizing different human “types”, suggesting that individuals sharing these facial features were likely criminals. Galton abused correlation and causation here, as he later created composite photos of the “othered” members of the population who were considered socially inferior, including the mentally ill and Jews, for example. Guess who invented this technique? Francis Galton, the inventor of eugenics.
I can’t help but wonder, how are our lives directed by the semiotics of categorical indexes?
As phrased by accelerationist philosopher Mark Fisher, “Indexical AI seems to maintain the vector of history, the feeling of the necessity of capital, reinforcing ‘capitalist realism.’”
All programs are control systems based on feedback loops that predict the future. And the concern is that now, the predicted future only represents the past.
To Wendy Chun, this interplay between verification and prediction results in a really weird and disturbing erasure of history in the name of “machine learning.”

Jakob Seeder

This is not an image of Jakob, but one variation assembled from the central tendencies of all the photos of Jakob that he fed into the trained model, in addition to other keywords that the user inputs.
AI-generated images reveal how image information has been categorized. They then recirculate these categories and biases, resulting in an amplifying feedback loop. They are statistical renderings that reflect the culture in which the image databases were collected, and perhaps inform the principles of the systems designed, which function on the basic logic that people can be labeled, categorized, measured, and calculated in the first place.
Jakob gave up the ownership of his face, and opened its access to his social media audience by allowing people to generate images that are like him. This project brings complex questions regarding ownership and privacy. As this work circulates in the market, its actual cultural and economic value also transforms.

Shane Fu

Speaking of values and real existing social attitudes, Shane can offer us a lot of first-hand insights here. We’re all creators of some sort here who wish to find our audience. Shane has an audience: 588k followers on Instagram. His works have reached millions of people on social media platforms, which also brought him opportunities in the commercial market. He has a perspective that none of us do, also as an active member of the practitioner community, especially in 3D animation and video production.
2.5m, almost 5000 comments, debating intellectual property and artistic agency.
Martin Zeilinger (researcher & curator): “AI has the potential to substantially disturb IP policy and political economies that have formed around the romantic ideal of the author as owner.”
When an AI system isn’t just understood as a tool used by human artists, but as an agential entity (or an assemblage of such entities) capable of “creative” expression, this then problematizes not only aesthetic assumptions regarding the nature of creativity and authorship, but by extension also socio-economic and legal assumptions regarding the ownership or the very “ownability” of such expressions.

Tugrul Salci

Traditionally, when we think about the mastery of art-making, a big part of it is how much control the artist has over their tools. But what does control mean for generative AI? Can one control it, or only collaborate with it? If so, how do we understand at which level the creativity took place? How do we locate the creative agency in an artwork produced by a posthumanist agential assemblage of a human artist and a generative AI system, which is a new kind of non-human expressive agency?
Tugrul has been working with generative art. This work, Cybernetic Flora, explores how artificial intelligence, cyber-ethics, and human-AI co-evolution interweave, forming a unique plane of immanence. With his presentation, we will be better able to understand what it means to co-create with AI, and where the computational creativity lies.

Beatrice Fazi on Computational Aesthetics

Beatrice Fazi is a contemporary philosopher and theorist of computation and technology. I’m especially interested in her investigation of computational aesthetics.
“Digital technology, with its binary elaboration of reality through quantifiable mechanical operations and quantitative inputs and outputs, attends to the real by discretising it. Aesthetics, however, being predicated upon perceptual relations and sensations, would seem to require a continuous—and thus analogue—association with the world.”
To Fazi, the conceptual challenge here is pinpointing what constitutes the aesthetic dimension of digitality itself. What are the foundational elements of this dimension, and what ontology do we need in order to account for the disparity between the continuous (perception and sensation) and the discrete (digital technology)?

🩵🌟🤪👾👽🫶💛