from the an-ai-has-not-written-this dept
After posting the following AI-generated images, I got private replies asking the same question: “Can you tell me how you made these?” So here I will provide the background and the “how to” of creating such AI portraits, and describe the ethical considerations and dangers we now need to address.
Generative AI, unlike analytical artificial intelligence, can create new content. It does not only analyze existing datasets; it generates entirely new images, text, audio, video, and code.
When the ability to generate original images from written text emerged, it became the hottest hype in tech. It all started with the release of DALL-E 2, an improved AI art program from OpenAI. It allowed users to enter text descriptions and get images that looked amazing, adorable, or weird as hell.
Then people started hearing about Midjourney (and its vibrant Discord) and Stable Diffusion, an open-source project. (Google’s Imagen and Meta’s image generator have not been released to the public.) Stable Diffusion allowed engineers to train the model on any image dataset to produce any art style.
Thanks to rapid development by the coding community, more specialized generators were introduced, including new killer apps that create AI-generated art from YOUR photos: Avatar AI, ProfilePicture.AI and Astria AI. With them, you can create your own AI-generated avatars or profile pictures. You can even change some of your features, as demonstrated by Andrew “Boz” Bosworth, Meta CTO, who used AvatarAI to see himself with hair:
Startups like the ones mentioned above are booming:
To use their tools, you need to follow these steps:
1. How to prepare your photos for the AI training
As of now, training Astria AI with your photos costs $10. Each app charges differently for fine-tuning credits (for example, ProfilePicture AI costs $24 and Avatar AI costs $40). Keep in mind that these costs change quickly as they experiment with their business model.
Here are a few ways to improve the training process:
- At least 20 photos, preferably shot or cropped to a 1:1 (square) aspect ratio.
- At least 10 close-ups of the face, 5 medium from the chest up, 3 of the whole body.
- Variation in background, lighting, expressions and eyes looking in different directions.
- No glasses/sunglasses. No other people in the photos.
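For readers who prefer to script their photo prep, here is a minimal sketch of the checklist above. It is not part of any of the apps mentioned: the function names and the shot-type labels ("close_up", "medium", "full_body") are my own assumptions, and you would have to label your photos by hand. It computes a centered square crop box for a photo of a given size and checks whether a set of labeled photos meets the suggested minimums.

```python
from collections import Counter

# Minimums taken from the checklist above.
# The shot-type labels are hypothetical; no app defines them.
MIN_TOTAL = 20
MIN_PER_TYPE = {"close_up": 10, "medium": 5, "full_body": 3}


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def check_photo_set(shot_types):
    """Return a list of problems; an empty list means the set passes."""
    problems = []
    if len(shot_types) < MIN_TOTAL:
        problems.append(
            f"need at least {MIN_TOTAL} photos, have {len(shot_types)}"
        )
    counts = Counter(shot_types)
    for shot_type, minimum in MIN_PER_TYPE.items():
        if counts[shot_type] < minimum:
            problems.append(
                f"need at least {minimum} {shot_type} photos, "
                f"have {counts[shot_type]}"
            )
    return problems
```

The box uses the (left, top, right, bottom) order, so it should be usable with an image library's crop function that expects that order. A centered crop can cut off a face that sits near the edge of the frame, so it is worth eyeballing the results before uploading.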
A trained AI model will be ready about 60 minutes after you upload your photos. Where will you most likely need guidance? Prompting.
2. How to survive the prompting mess
After the training is complete, a few images will be waiting on your page. Those come from “standard prompts” and serve as examples of the app’s capabilities. To create your own prompts, set the className to “person” (as recommended by Astria AI).
Formulating the right prompts for your purpose can take a lot of time. You need patience (and motivation) to keep refining the prompts. But when a text prompt comes to life the way you imagined (or better than you imagined), it feels a bit like magic. To get creative inspiration, I used two search engines, Lexica and Krea. You can search for keywords, scroll until you find an image style you like, and copy the prompt (then change the text to “sks person” to make it your self-portrait).
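The copy-and-swap step can be sketched in a few lines of code. This is only an illustration under my own assumptions: the helper name is made up, the example prompt is invented, and you still have to identify the original subject of the copied prompt yourself. It replaces that subject with “sks person” and leaves the rest of the prompt untouched.

```python
import re

TOKEN = "sks"          # identifier used in the text above
CLASS_NAME = "person"  # class name recommended in the text above


def personalize_prompt(prompt, subject):
    """Swap the first occurrence of the copied prompt's subject
    for '<token> <class name>' (hypothetical helper)."""
    replacement = f"{TOKEN} {CLASS_NAME}"
    pattern = re.compile(re.escape(subject), flags=re.IGNORECASE)
    new_prompt, count = pattern.subn(replacement, prompt, count=1)
    if count == 0:
        raise ValueError(f"subject {subject!r} not found in prompt")
    return new_prompt


copied = "highly detailed realistic portrait of a young woman, dramatic lighting"
print(personalize_prompt(copied, "a young woman"))
```

Raising an error when the subject is missing is a deliberate choice: a prompt without the “sks person” phrase would not produce a self-portrait, so it is better to fail loudly than to submit it unchanged.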
Some prompts are so long that reading them is painful. They usually include the image setting (e.g. “highly detailed realistic portrait”) and style (“art by” one of the popular artists). Since regular people need help crafting those words, we already have an entirely new role for artists under prompt engineering. It is becoming a desirable skill. Keep in mind that no matter how professional your prompts are, some results will look WILD. In one picture I had 3 arms (don’t ask me why).
If you want to avoid the chaos of prompting altogether, I have a friend who just used the default prompts, was happy with the results, and shared them everywhere. For those apps to become more popular, I recommend including more “standard prompts.”
Potential and Benefits
1. It is NOT the END of human creativity
The electronic synthesizer has not killed music, and photography has not killed painting. Instead, they catalyzed new forms of art. AI art is here to stay and can make creators more productive. Creators are going to include such models as part of their creative process. It’s a partnership: AI can serve as a starting point, a sketching tool that offers suggestions, and the creator will improve it further.
2. The road to the masses
Until now, crypto boosters have failed to answer the simple question of “what is it good for?” and have failed to articulate concrete, compelling use cases for Web3. All we got was unnecessary complexity, vague future casting and “crypto countries.” In contrast, AI-generated art has clear utility for creative industries. It is already used in advertising, marketing, gaming, architecture, fashion, graphic design and product design. This Twitter thread offers a variety of use cases, from commerce to medical imaging.
When it comes to AI portraits, I think of another target group: teenagers. Why? Because they spend hours perfecting their photos with different filters. Make image generation tools cheap and easy to use, and they’ll be your heaviest users. Hopefully they won’t use them in their dating profiles.
Downsides and Disadvantages
1. The artists did not consent to being copied by AI
Despite the thriving industry, artists receive no compensation. For example, read about their frustration in the story of how one unwilling illustrator was turned into an AI model. Spoiler alert: she didn’t like being turned into a popular prompt for people to imitate, and now thousands of people (soon to be millions) can copy her style almost exactly.
Copying artists is a copyright nightmare. The input question is: can you use copyrighted data to train AI models? The output question is: can you copyright what an AI model creates? No one knows the answers, and this is just the beginning of this debate.
2. This technology can be easily weaponized
A year ago on Techdirt, I summarized the narratives surrounding Facebook: (1) Amplifying the good/bad or a mirror for the ugly, (2) The fault of the algorithms versus the people who build or use them, (3) Fixing the machine versus the underlying societal problems. I believe this discussion also applies to AI-generated art. It should be viewed through the same lens: good, bad and ugly. While this technology is delightful and beneficial, there are also negative consequences of releasing image manipulation tools and letting humanity play with them.
While DALL-E had some limitations, the new competitors took a “hands-off” approach with no safeguards to prevent people from creating sexual or potentially violent and offensive content. Soon after, some users generated deepfake-style images of naked celebrities. (Look surprised). Google’s Dreambooth (which the AI-generated avatar tools use) made creating deepfakes even easier.
As part of my exploration of the new tools, I also tried DeviantArt’s DreamUp. The “most recent creations” page featured several images of naked teenage girls. It was disturbing and nauseating. On one digital artwork of a teenage girl in the snow, the artist commented, “This one is closer to what I imagined, aside from being nude. Why DreamUp? Obviously I should mention ‘clothing’ in my prompt.” That says it all.
According to the new book Data Science in Context: Foundations, Challenges, Opportunities, advances in machine learning have made deepfakes more realistic, but also improved our ability to detect deepfakes, leading to a “game of cat and mouse.”
In almost every form of technology, there are bad actors playing this game of cat and mouse. Managing user-generated content online is a headache social media companies know all too well. Elon Musk’s first two weeks at Twitter amplified that experience: “he pursued chaos and found it.” Stability AI released an open-source tool with a belief in radical freedom, pursued chaos, and found it in AI-generated porn and CSAM.
Text-to-video isn’t very realistic now, but at the rate AI models are evolving, it will be in a few months. In a world of synthetic media, seeing will no longer be believing, and the basic unit of visual truth will no longer be credible. The authenticity of every video will be questioned. In general, it will become more and more difficult to determine whether a piece of text, audio or video is human-generated or not. This could have major consequences for trust in online media. The danger is that the new, convincing images can take propaganda to a whole new level. Meanwhile, deepfake detectors are making progress. The arms race has begun.
AI-generated art inspires creativity and, as a result, enthusiasm. But as it approaches mass consumption, we can also see the dark side. A revolution of this magnitude can have many consequences, some of which can be downright terrifying. Guardrails are needed now.
Dr. Nirit Weiss-Blatt (@DrTechlash) is the author of The Techlash and Tech Crisis Communication
Filed Under: ai art, generative ai, portraits