AI Prompt Crafting in Stable Diffusion


An aerial view of a city at night, long exposure

Diffusion models have emerged as a powerful tool for creative expression. These models, such as Midjourney, DALL-E, and SDXL, harness the principles of diffusion processes to generate stunningly realistic images based on textual prompts. However, the key to unlocking their full potential lies in the art of prompt design/engineering. In this article, we'll explore design techniques, references and experiments with AI-generated images, and provide examples of tailored prompts for different creative domains. By the end, you'll gain an advanced understanding of prompt design, enabling you to unlock the immense potential of synthetic image generation.

Understanding Prompts

At its core, prompt design involves crafting concise and specific textual inputs that guide diffusion models in generating desired outputs. These prompts provide the AI model with essential information to produce visually compelling results. By carefully crafting prompts, creators can influence the style, content, and overall aesthetic of the generated images. Prompt engineering takes prompt design a step further by fine-tuning the textual inputs to achieve specific objectives through modifiers, weights and parameters. This process requires an understanding of the capabilities and limitations of the underlying model as well as creative intuition. Through iterative experimentation and refinement, prompts can be optimized to produce high-quality and visually coherent outputs.

From the Stable Diffusion Prompt Book we learn that this model was trained on images in the LAION-5B dataset and was developed by CompVis, Stability AI, and RunwayML. The book gets you started quickly, teaches the essential building blocks, and touches on the techniques needed to master Stable Diffusion. This article summarizes the book and complements it with other helpful information I came across in my experiments.

The secret to generating good images has two parts: first, a well-written prompt consisting of modifiers and a good sentence structure, and second, well-adjusted parameters. You can use the defaults, but fine-tuned parameters can sometimes generate much better results. "Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. Think of it as the language you need to speak in order to tell an AI model what to draw."[1]

A beautiful young woman communicating with a cyborg

The recommendation is to start by asking questions:
1. What medium do you want? a photo, painting, sketch, illustration, ...
2. What’s the subject?
3. What details do you want?
- Special Lighting. Soft, ambient, ring light, neon
- Environment. Indoor, outdoor, underwater, in space
- Color Scheme. Vibrant, dark, pastel, neon
- Point of view. Front, Overhead, Side, Aerial view, Isometric view
- Background. Solid color, nebula, forest, landscape
4. In a specific art style? 3D render, Low Poly, Pixar Renders
5. A specific photo type? Macro, telephoto, Polaroid

"This is not an all-inclusive list, but will help you get great results when you start your prompt crafting journey. Don’t be afraid to experiment the more you try different prompts the better you will become."[1]

With the answers to these questions, you can build a complete sentence:
"A painting of a cute goldendoodle wearing a suit, natural light, in the sky, with bright colors, by Studio Ghibli."[1]

The earlier a word is in the sentence, the more importance it will be given. "The order and presentation of our desired output is almost as an important aspect as the vocabulary itself. It is recommended to list your concepts explicitly and separately than trying to cramp it into one simple sentence."[1]

Midjourney recommends being clear about any context or details and thinking about:

Subject: person, animal, character, location, object
Medium: photo, painting, illustration, sculpture, doodle, tapestry
Environment: indoors, outdoors, on the moon, underwater, in the city
Lighting: soft, ambient, overcast, neon, studio lights
Color: vibrant, muted, bright, monochromatic, colorful, black and white, pastel
Mood: sedate, calm, raucous, energetic
Composition: portrait, headshot, closeup, birds-eye view

Prompt structure to get started:

Medium+shot type - artist/reference style - subject - description - environment - colors - light - camera - mood
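The structure above can be sketched as a small helper that assembles a prompt in a fixed order. This is only an illustration: the field names and the `build_prompt` function are hypothetical and not part of any Stable Diffusion tool, and the example values are borrowed from the husky prompt used later in this article.

```python
# Field order mirrors the structure line above:
# medium+shot type, artist/reference style, subject, description,
# environment, colors, light, camera, mood.
PROMPT_ORDER = [
    "medium", "style", "subject", "description",
    "environment", "colors", "light", "camera", "mood",
]


def build_prompt(**parts: str) -> str:
    """Join the non-empty parts in PROMPT_ORDER, separated by commas."""
    unknown = set(parts) - set(PROMPT_ORDER)
    if unknown:
        raise ValueError(f"unknown prompt parts: {sorted(unknown)}")
    ordered = (parts.get(name, "").strip() for name in PROMPT_ORDER)
    return ", ".join(part for part in ordered if part)


# Keyword arguments can be given in any order; the helper reorders them.
prompt = build_prompt(
    subject="a husky",
    medium="Close-up polaroid photo",
    light="soft lighting",
    environment="outdoors",
    camera="24mm Nikon Z FX",
)
print(prompt)
```

Keeping the order in one place makes it easy to try a different ordering later and compare the results.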

After experimenting with these tips and different diffusion models, you will notice that the beginning and the end of a prompt carry more weight, that the middle part is sometimes ignored, and that adjectives may apply to the whole image rather than only to the subject. You will also realize the importance of modifiers, weights and parameters for generating great outcomes. Look up each model and version to learn about its special modifiers or magic words.

Modifiers

In the Stable Diffusion Prompt Book, modifiers "are words that can change the style, format, or perspective of the image. There are certain magic words or phrases that are proven to boost the quality of the image." There are many types of modifiers. The prompt book gives a list, to which I have added some tips of my own.

Photography

Shot type: Close-up, Extreme Close-up, POV, Medium shot, Long shot
Style: polaroid, Monochrome, Long exposure, Color splash, Tilt-shift
Lighting: Soft, Ambient, Ring, Sun, Cinematic, Spotlight, Rim lighting, Sunlight, Backlight, Studio lighting, Volumetric, Crepuscular rays, dimly lit
Context: Indoor, Outdoor, At night, In the park, Studio
Lens: Wide-angle (16mm to 24mm), Telephoto, 24mm, EF 70mm, Bokeh, 100mm, 35mm, 50mm, 800mm
Device: iPhone X, CCTV, Nikon Z FX, Canon, Gopro, Hasselblad, Leica

Tip: Increase the weight of the keyword if you don’t see the effect.

Example prompts from the Stable Diffusion Prompt Book with SDXL Base without refiner:

Close-up polaroid photo, of a husky, soft lighting, outdoors, 24mm Nikon Z FX

Close-up, polaroid, woman, soft lighting, indoor, wide-angle, iPhone X

Extreme Close-up, Monochrome, Old man, Ambient, Outdoor, Telephoto, CCTV

POV, Long exposure, Grey cat, Ring lighting, At night, 24mm, Nikon Z FX
Medium shot, Color splash, Bunny, Sun, In the park, EF 70mm, Canon
Long-shot, Tilt-shift, Ferrari, Cinematic, studio, Bokeh, Gopro
(Tilt-shift combines rotation of the lens plane relative to the image plane, called tilt, with movement of the lens parallel to the image plane, called shift.)

Photography Styles

Polaroid: Still photo of a child sitting in the middle of a wide empty city street, his back to the camera, symmetrical, polaroid photography, highly detailed, crisp quality
Tilt-Shift: Photo of construction site, workers, tilt shift effect, bokeh, Nikon
Product Shot: Product shot of Nike shoes, with soft vibrant colors, 3D blender render, modular constructivism, blue background, physically based rendering, centered
Long Exposure: An aerial view of a city at night, long exposure, Instagram contest
Portrait: Portrait photo of a stormtrooper with his beautiful wife on his wedding day
Color-Splash: Color splash wide photo of red phone booth in the middle of an empty street, detailed, mist, soft vignette
Monochrome: Photo of a staircase in an abandoned building, symmetrical, monochrome photography, highly detailed, crisp quality and light reflections, 100mm lens
Satellite: Google Earth satellite image of New York City, detailed buildings and streets

Polaroid: Still photo of a child sitting in the middle of a wide empty city street, his back to the camera, symmetrical, polaroid photography, highly detailed, crisp quality

Photo of construction site, workers, tilt shift effect, bokeh, Nikon

Product shot of Nike shoes, with soft vibrant colors, 3D blender render, modular constructivism, blue background, physically based rendering, centered

An aerial view of a city at night, long exposure, Instagram contest

Streets in New York City, high resolution scan, nostalgic

Photo of a staircase in an abandoned building, symmetrical, monochrome photography, highly detailed, crisp quality and light reflections, 100mm lens

Cameras

GoPro: Monkey swimming, GoPro footage
CCTV: Darth Vader at a convenience store, pushing shopping cart, CCTV still, high-angle security camera feed
Drone: Drone photo of Tokyo, city center
Thermal: Thermal camera footage from a helicopter, war scene
Hasselblad 500C or CM: by Man Ray, waves of the ocean, clouds in the sky, smooth texture, cinematic, spotlight, Hasselblad 500CM
Leica M3: by Man Ray, empty beach in front of the ocean, smooth texture, spotlight lighting, shoot by leica (Leica M3, Leica 50mm)

Lenses

Telephoto: Alligator emerging from water, telephoto lens
Fish-eye: Night club, people dancing, Fish-eye lens
800mm: Photo of a hummingbird, 800mm lens
Macro: Photo of a ladybug-bee hybrid standing on a tulip, macro lens

800mm: Photo of a hummingbird, 800mm lens

Photo of a yellow ladybug, on a tulip, macro lens, studio light, 800mm lens
Steps: 30, Sampler: Euler a, Schedule type: Exponential, Model: sd_xl_base_1.0, Refiner: sd_xl_refiner_1.0, Refiner switch at: 0.85

Lighting

Nostalgic: Fallout concept art school interior render grim, nostalgic lighting, Unreal Engine 5
Purple Neon: Fallout concept art school interior render grim, realistic purple neon lighting, Unreal Engine 5
Sun Rays: Fallout concept art school interior render grim, sun rays coming through window, Unreal Engine 5

Purple Neon: Fallout concept art school interior render grim, realistic purple neon lighting, Unreal Engine 5

Sun Rays: Fallout concept art school interior render grim, sun rays coming through window, Unreal Engine 5

Art Mediums

Chalk: A sidewalk chalk painting of a beautiful landscape
Graffiti: Wall graffiti art of astronaut holding a super soaker
Airbrush painting: Airbrush painting of a tiger
Water Colors: Watercolor painting of sunset behind mountains, detailed, vaporwave aesthetic
Oil Painting: Oil painting of human Rick Sanchez, contest winner
Clay: Clay model of a city, studio lighting
Fabric: Crochet doll of Spiderman, studio lighting
Pencil Drawing: Pencil painting of the throne from Game of Thrones
Wood: Modern spiral-shaped table design, made of wood, studio lighting
Movie still: Movie still of a beautiful man
Tattoo art: Tattoo art of a beautiful flower
Pixel art: Pixel art of a cat

Crochet doll of Spiderman, studio lighting

Pencil painting of the throne from Game of Thrones

Artists

Portrait: Derek Gores, Miles Aldridge, Jean-Baptiste Carpeaux, Anne-Louis Girodet
Landscape: Alejandro Burdisio, Jacques-Laurent Agasse, Andreas Achenbach, Cuno Amiet
Horror: H.R. Giger, Tim Burton, Andy Fairhurst, Zdzislaw Beksinski
Anime: Makoto Shinkai, Katsuhiro Otomo, Masashi Kishimoto, Kentaro Miura
Sci-fi: Chesley Bonestell, Karel Thole, Jim Burns, Enki Bilal
Photography: Ansel Adams, Ray Eames, Peter Kemp, Ruth Bernhard, Man Ray
Concept artists (video game): Emerson Tung, Shaddy Safadi, Kentaro Miura

Portrait Artists: Using an artist known for portraits can help create a specific style. Some artists' styles have a very profound effect, while others have only a subtle one.

Landscape Artists: When making a landscape, it's smart to specify the time of day (morning, noon, or night) and to set the season.

Horror Artists: Horror artists are known for creating chilling images, but they can be used to make pleasing images when mixed with other artists.

Anime Artists: It’s important when using anime artists to keep in mind the style they focus on and what time period they are from.

Sci-fi Artists: These tend to have very distinctive styles. Remember that you can use not only artists from traditional art mediums but also artists from films.

Photography Artists: You can use noted photographers in your prompts. Try to use landscape or portrait photographers depending on what you are focusing on.

Concept Artists (Video Games): When it comes to concept artists, some will make scenes better while others will make better character designs.

Advanced Technique - Mixing Artist Styles: Building a prompt with several artists can refine your image into something more. You are not limited to two artists; use as many as you want. Experiment and notice the subtle ways in which the artist styles combine.

You can also apply styles, such as:

Academicism painting
Pop-art
Surrealism painting
Art deco illustration
Avant-garde painting
Classicism painting
Op Art


Surrealism painting of modern city, high quality

Illustration

3D illustrations
Stable Diffusion can be used to create any 3D scene or object you can imagine!

Cute panda, origami art
Needle felted scene from the Simpsons, highly detailed, tilt shift, highly textured, action
Isometric assets: Tiny cute isometric kitchen in a cutaway box, soft smooth lighting, soft colors, 100mm lens, 3d blender render
Low Poly: kawaii low poly squirrel character, 3d isometric render, white background, ambient occlusion, unity engine
Pixar Renders: 3d fluffy Lion, closeup cute and adorable, cute big circular reflective eyes, long fuzzy fur, Pixar render, unreal engine cinematic smooth, intricate detail, cinematic
3D Item Render: Tiny isometric Alarm Clock, soft smooth lighting, soft colors, 3d blender render, trending on polycount, modular constructivism, physically based rendering

Cute panda, origami art

kawaii low poly squirrel character, 3d isometric render, white background, ambient occlusion, unity engine

3d fluffy Lion, closeup cute and adorable, cute big circular reflective eyes, long fuzzy fur, Pixar render, unreal engine cinematic smooth, intricate detail, cinematic

More illustrations

Children’s book: Elephant-turtle hybrid, in Children’s book illustration style
Vector: Vector illustration of Living Room in Flat Style, pastel color palette
Scientific Illustration: Anatomy of Pikachu, dissection Scientific illustration from a biology book
Comic: Retro comic style artwork, highly detailed batman, comic book cover, symmetrical, vibrant
Caricature: Caricature art of spiderman sitting on a bed having a nervous breakdown
Propaganda Poster: USSR propaganda poster. Eat Oreo!
Movie Poster: Adventurous trash can, movie poster
Psychedelic Art: Hypnotic illustration of a dear face, hypnotic psychedelic art by Dan Mumford, pop surrealism, dark glow neon paint, mystical, Behance
Splash Art: Splash art of an armored mage channeling arcane magicks, mana shooting from his hands, mystical energy in the air, action shot, heroic fantasy art, special effects, hd octane render
Ukiyo-e: Peppa pig, in Ukiyo-e style
Stickers: Die-cut sticker, Cute kawaii Goldendoodle character sticker, white background, illustration minimalism, vector, pastel colors
Fantasy Maps: DnD map with roads, for printing, highly detailed, with many towns
Pop up paper card: pop up paper card of a beautiful city

Retro comic style artwork, highly detailed batman, comic book cover, symmetrical, vibrant

Retro comic style artwork, highly detailed batman talking to a retro character, comic book cover, vibrant

Character design
When creating a character, first give a broad description such as "male orc", then add more to it, such as "metallic armor". After that, build up the details while generating images, and make sure to add artists that fit the character.

Emotions

Simple feelings modifiers can set the atmosphere of the scene!

Cute sad girl toy, curly hair, standing character, soft smooth lighting, soft pastel colors, skottie young, 3d blender render, polycount, modular constructivism, physically based rendering

Cute happy girl toy, curly hair, standing character, soft smooth lighting, soft pastel colors, skottie young, 3d blender render, polycount, modular constructivism, physically based rendering

Positive emotions

Cosy: Cosy vintage bedroom, octane render by weta digital, exotic colorful pastel, ray traced lighting and reflections
Romantic: Photo of a couple shopping, romantic lighting
Joyful: Joyful photo of a husky puppy splashing water at the beach, canon eos r3
Energetic: Energetic waves of the ocean
Hope: Woman, filled with hope, in a beautiful dress on the beach
Lust: Painting of a couple, filled with lust, by mike mignola
Peaceful: A peaceful Japanese city street, dreamy, soft colors, studio ghibli style
Satisfaction: Old man looking at the camera, filled with satisfaction, Canon EOS 5D Mark IV

Glitch of two prompts: Cosy vintage bedroom New York — Cosy vintage bedroom, octane render by weta digital, exotic colorful pastel, ray traced lighting and reflections
Steps: 30, Sampler: Euler a, Schedule type: Exponential, Model: sd_xl_base_1.0

Joyful photo of a husky puppy splashing water at the beach, canon eos r3

Negative emotions

Depressing: Depressing photo, futuristic park
Loneliness: Girl sitting in window, reading a book, loneliness
Grim: Grim painting of a lake with ducks
Regret: Painting of a man looking at photo album, filled with regret
Suffering: Digital painting showing the suffering of a woman, sitting on a bench in the forest, by goro fujita
Hopelessness: Man, hopelessness, black and white, looking into the camera, sketch, intricate details
Fear: Child running towards the camera, in fear, by atey ghailan and mike mignola
Disgust: Photo of a child looking at his food with disgust

Aesthetics

Vibrant

Weirdcore: Weirdcore image of a zoo
Dreamcore: Photo of neighborhood, Dreamcore style
Vaporwave: Vaporwave pool

Gloomy

Liminal Space: Flooded, liminal space, underground city carpark, lighting with lensflares, photorealistic 8 k, eerie
After Hours: After hours, stairs to the park
Brutalism: Abandoned building, brutalism architecture, flowers growing
Post-Apocalyptic: Photo in a Post-Apocalyptic town, with houses and cars

Historic

Baroque: Painting of Danny DeVito, in Baroque cloth and style
Sovietwave: People walking in the street, Sovietwave
Wild West: Photo of a car driving in a town, Wild West
Film Noir: Chandler and monica, detailed faces, Film Noir style

SDXL has 77 predefined styles you can apply:

3D Model, Abstract, Advertising, Alien, Analog Film, Anime, Architectural, Cinematic, Collage, Comic Book, Craft Clay, Cubist, Digital Art, Disco, Dreamscape, Dystopian, Enhance, Fairy Tale, Fantasy Art, Fighting Game, Film Noir, Flat Papercut, Food Photography, GTA, Gothic, Graffiti, Grunge, HDR, Horror, Hyperrealism, Impressionist, Isometric Style, Kirigami, Legend of Zelda, Line Art, Long Exposure, Lowpoly, Minecraft, Minimalist, Monochrome, Nautical, Neon Noir, Neon Punk, Origami, Paper Mache, Paper Quilling, Papercut Collage, Papercut Shadow Box, Photographic, Pixel Art, Pointillism, Pokémon, Pop Art, Psychedelic, RPG Fantasy Game, Real Estate, Renaissance, Retro Arcade, Retro Game, Silhouette, Space, Stacked Papercut, Stained Glass, Steampunk, Strategy Game, Street Fighter, Super Mario, Surrealist, Techwear Fashion, Texture, Thick Layered Papercut, Tilt-Shift, Tribal, Typography, Watercolor, Zentangle, base

Magic words

HDR, UHD, 64K: Quality words like HDR, UHD, 4K, 8k, and 64K can make a dramatic difference.

A Landscape
A landscape, HDR, UHD, 64K

Highly detailed: Quality words like highly detailed can make a dramatic difference.

Joann of Arc portrayed by Jennifer Lawrence, highly detailed, concept

Studio lighting: Studio lighting could really add some nice texture to the image

A cinematic film still of Morgan Freeman starring as 50 Cent, portrait, 40mm lens, shallow depth of field, close up, studio lighting

Professional: Adding "professional" can greatly improve the color contrast and details in the image

Empty temple, professional photograph

Trending on artstation

Portrait photo of a beautiful female cyborg from 1920, trending on artstation

Unreal engine

Hyper realistic 4 d model, unreal engine

Vivid Colors: Adding "vivid colors" adds life to your images

Photo from a city street in the 1970s, vivid colors

Bokeh: Bokeh blurs the background and highlights the subject. It’s like iPhone portrait mode.

A cute totoro in a yard, bokeh

High resolution scan: Want a historic looking photo? Add "High resolution scan"

Aerial view of New York City, 1930, High resolution scan

Aerial view of New York City, 1930, High resolution scan (AI hallucination)

Portrait photo of a beautiful female cyborg from 1920, trending on artstation

biomechanical style cyborg, blend of organic and mechanical elements, futuristic, cybernetic, detailed, intricate, Model: sd_xl_base_1.0, Refiner: sd_xl_refiner_1.0

Advanced Prompting

Commas
In Stable Diffusion prompts, commas aren't strictly necessary, but they can enhance clarity by separating distinct attributes or concepts. Proper prompt structuring, including the use of commas, can influence the AI's interpretation and the quality of the generated image. Commas improve readability and help specify a list of attributes, styles or elements.

Use commas to separate the subject, setting, additional elements and the reference style. A comma-structured prompt also allows you to define weights for individual segments. However, the importance of commas may vary depending on the model's training and its interpretation of prompt structure.

Weights

Prompt weights allow you to assign varying levels of importance to different elements of your prompt. Using them can improve accuracy, efficiency, and control over the output. In Midjourney, a prompt weight is written as a double colon ("::") followed by a number, such as "::2". This number specifies the influence the prompt element should have on the generated image. The default weight is 1. Weights can be negative to exclude elements.

Relative scaling of weights is crucial. A weight of 0.5 compared to 2 impacts the image similarly to a weight of 1 compared to 4. Beyond a certain range, weight scaling has diminishing returns, so weights over 10 are usually unnecessary.
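The point about relative scaling can be checked with a few lines of arithmetic. The sketch below assumes that only the ratio between weights matters and expresses each weight as a share of the total. Whether a given tool normalizes in exactly this way is an assumption, and the segment names are made up for illustration.

```python
def relative_weights(weights: dict[str, float]) -> dict[str, float]:
    """Express each weight as a share of the sum of all weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("the sum of all weights must be positive")
    return {segment: weight / total for segment, weight in weights.items()}


# Two weightings with the same 1:4 ratio (hypothetical prompt segments).
small = relative_weights({"castle": 0.5, "forest": 2})
large = relative_weights({"castle": 1, "forest": 4})

print(small)
print(large)
```

Both calls yield the same shares, which is why 0.5 against 2 behaves like 1 against 4.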

In Stable Diffusion web UIs such as AUTOMATIC1111, a comparable way to adjust keyword strength is to use prompt parentheses () and prompt brackets [].

Parentheses can increase and decrease, brackets can only decrease.

(keyword) increases the strength of the keyword by a factor of 1.1 and is the same as (keyword:1.1).

[keyword] decreases the strength by a factor of 0.9 and is the same as (keyword:0.9).
[[keyword]] is equivalent to (keyword: 0.81)
[[[keyword]]] is equivalent to (keyword: 0.73)
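The equivalences above follow from multiplying the factor once per nesting level. Here is a minimal sketch of that arithmetic. The `emphasis_multiplier` helper is hypothetical, it only handles plain nested parentheses or brackets, and it does not parse the explicit (keyword:1.3) form.

```python
def emphasis_multiplier(token: str) -> float:
    """Return the strength implied by nested () or [] around a keyword.

    Each pair of parentheses multiplies by 1.1, each pair of brackets by 0.9.
    """
    factor = 1.0
    while len(token) >= 2:
        if token[0] == "(" and token[-1] == ")":
            factor *= 1.1
        elif token[0] == "[" and token[-1] == "]":
            factor *= 0.9
        else:
            break
        token = token[1:-1]  # strip one nesting level and look again
    return round(factor, 2)


for example in ["keyword", "(keyword)", "((keyword))", "[keyword]", "[[keyword]]", "[[[keyword]]]"]:
    print(example, emphasis_multiplier(example))
```

Rounded to two decimals, three bracket levels give 0.9 × 0.9 × 0.9 = 0.729, which is the 0.73 quoted above.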

Keyword blending: You can mix two keywords; the proper term for this is prompt scheduling. The syntax is [keyword1 : keyword2: factor]. The factor, a number between 0 and 1, controls at which step keyword1 is switched to keyword2. For more information, read the article Blending in Stable Diffusion.
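To make the role of the factor concrete, the sketch below converts it into a step number for a given step count. It assumes the switch happens at the factor multiplied by the total number of steps, rounded down. The exact rounding and step indexing may differ between implementations, and both function names are hypothetical.

```python
def switch_step(factor: float, total_steps: int) -> int:
    """Step index at which keyword1 hands over to keyword2 (assumed rounding)."""
    if not 0 <= factor <= 1:
        raise ValueError("factor must be between 0 and 1")
    return int(factor * total_steps)


def keyword_at_step(step: int, keyword1: str, keyword2: str,
                    factor: float, total_steps: int) -> str:
    """Return the keyword that is active at a zero-based sampling step."""
    return keyword1 if step < switch_step(factor, total_steps) else keyword2


# [cat : dog : 0.5] with 30 steps: "cat" for steps 0-14, "dog" for steps 15-29.
print(switch_step(0.5, 30))
print(keyword_at_step(14, "cat", "dog", 0.5, 30))
print(keyword_at_step(15, "cat", "dog", 0.5, 30))
```

A small factor switches early, so keyword2 shapes most of the image. A factor close to 1 lets keyword1 set the composition before keyword2 adds late detail.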

Negative prompts: You can use the negative prompt to tell Stable Diffusion what not to include in the image. This is especially useful when paired with using the same seed for the new generation.

In summary, start with a clear description of the desired image or concept. Break down the prompt into key components and assign weights accordingly. Highlight essential elements with higher weights and prioritize their placement in your prompt. You can also use negative weights or negative prompts to exclude elements. Continuously experiment and adjust weights for optimal results.

In Midjourney you can use the effects of '--style raw'. "Images made with --style raw have less automatic beautification applied, which can result in a more accurate match when prompting for specific styles."[2] It reduces interpretation and gives you more control over the generated image.

Woman, filled with 'hope' in a beautiful dress on the beach — (Emma Watson:0.5), (Scarlett Johansson:0.9), (Angelina Jolie:1.2), photo of young woman, perfect eyes, filled with hope, in a beautiful dress, on the beach, studio lighting, looking at the camera

Stable Diffusion Parameters

Resolution
Choose the width and height of the generated images. Higher resolutions increase both the required VRAM and the generation time. It’s important to know at what resolution the model was trained: most models are trained at 512x512 pixels, while SDXL is trained at 1024x1024, and in general these dimensions provide the best quality and composition. Even with all other parameters fixed, changing the resolution will completely change the generated image, although it may keep similar colors and composition. If you want bigger images, use an upscaler.

Classifier Free Guidance (CFG) – default is 7
You can see this parameter as a “Creativity vs. Prompt” scale. Lower numbers give the AI more freedom to be creative, while higher numbers force it to stick to the prompt.

Prompt: a red bird drinking water from a lake, children's book painting
CFG 0: Completely ignores the prompt
CFG 4: Missing the red color
CFG 7: Good balance
CFG 15: Too high, starts creating artifacts

This parameter does not affect the VRAM needed, or the generation time.
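Classifier-free guidance is commonly described as combining two noise predictions at every step, one made without the prompt and one made with it. The toy sketch below applies that combination to plain lists of numbers. The values are made up, and a real model works on large tensors rather than two-element lists.

```python
def apply_cfg(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Move from the unconditional prediction toward the conditional one.

    guided = uncond + scale * (cond - uncond)
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # prediction without the prompt (made-up numbers)
cond = [1.0, 3.0]    # prediction with the prompt (made-up numbers)

print(apply_cfg(uncond, cond, 0))  # the prompt has no influence
print(apply_cfg(uncond, cond, 1))  # exactly the prompt-conditioned prediction
print(apply_cfg(uncond, cond, 7))  # pushed well past it in the prompt's direction
```

With a scale of 0 the result equals the unconditional prediction, which matches the observation that CFG 0 ignores the prompt. Large scales exaggerate the difference between the two predictions, which is one way to read the artifacts seen at high values.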

Step count
This parameter does not affect the VRAM needed, but generation time grows in direct proportion to it. Stable Diffusion creates an image by starting with a canvas full of noise and denoising it gradually to reach the final output; this parameter controls the number of denoising steps. Higher is usually better, but only up to a point. Beginners are advised to stick with the default or a lower count, like 30.

Seed – default is “random”
Seed is a number that controls the initial noise. The seed is the reason that you get a different image each time you generate when all the parameters are fixed. By default, on most implementations of Stable Diffusion, the seed automatically changes every time you generate an image. You can get the same result back if you keep the prompt, the seed and all other parameters the same.
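The link between seed and starting noise can be illustrated with Python's standard random number generator. This is only a stand-in: real implementations draw a much larger noise tensor with their own framework's generator, and `initial_noise` is a hypothetical helper.

```python
import random


def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Draw `size` Gaussian noise values from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(42)
same_b = initial_noise(42)
different = initial_noise(43)

print(same_a == same_b)     # same seed, same starting noise
print(same_a == different)  # different seed, different starting noise
```

Because the denoising process starts from this noise, fixing the seed along with every other setting is what makes a result reproducible.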

Sampler
Diffusion samplers are the methods used to denoise the image during generation. Because they differ in how they calculate the next step in the image production, they take different amounts of time and different numbers of steps to reach a usable image. We suggest that beginners use DDIM, since it's fast and can usually generate good images with only 10 steps, making it easy and fast to experiment and improve. Give Euler a a try, too; it is also fast.
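The parameters in this section can be collected in one place and handed to a generation library. The sketch below assumes the Hugging Face diffusers library and its `StableDiffusionXLPipeline`, which this article does not otherwise use. The `GenerationSettings` class, its defaults, and the fixed seed are hypothetical. The sampler is left out because diffusers configures it separately through scheduler classes.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    prompt: str
    negative_prompt: str = ""
    width: int = 1024       # SDXL training resolution mentioned above
    height: int = 1024
    cfg_scale: float = 7.0  # default CFG mentioned above
    steps: int = 30
    seed: int = 1234        # hypothetical fixed seed for reproducibility

    def to_pipeline_kwargs(self) -> dict:
        """Map the settings to the keyword names used by diffusers pipelines."""
        return {
            "prompt": self.prompt,
            "negative_prompt": self.negative_prompt or None,
            "width": self.width,
            "height": self.height,
            "guidance_scale": self.cfg_scale,
            "num_inference_steps": self.steps,
        }


def generate(settings: GenerationSettings):
    """Run SDXL base with the given settings (needs a GPU, torch and diffusers)."""
    # Imported here so the settings class can be used without these libraries.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    # In diffusers the seed is supplied through a seeded torch generator.
    generator = torch.Generator(device="cuda").manual_seed(settings.seed)
    return pipe(generator=generator, **settings.to_pipeline_kwargs()).images[0]


settings = GenerationSettings(
    prompt="a red bird drinking water from a lake, children's book painting",
    negative_prompt="blurry",
)
print(settings.to_pipeline_kwargs())
```

Only the settings object is built and printed here; calling `generate(settings)` downloads the model and requires suitable hardware.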

Important tips

When to use what CFG value?

CFG 2 - 6: Creative, but might not follow the prompt
CFG 7 - 10: Recommended for most prompts. Good balance between creativity and guided generation
CFG 10 - 15: When you’re sure that your prompt is good/specific enough.
CFG 16 - 20: Not generally recommended unless the prompt is well detailed. Might affect coherence and quality

In prompts with multiple subjects, it’s a good idea to increase the CFG scale.

The power of seeds
Some seeds are just better, so try to save a good seed and slightly tweak the prompt to get what you’re looking for while keeping the same composition. This can also be used to test the effect of different modifiers.
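One way to run such a modifier test is to prepare a batch of prompts that share a seed and differ only in the appended modifier. The sketch below only builds that batch. The `modifier_test_jobs` helper and the seed value are hypothetical, and the jobs would still need to be passed to whichever tool generates the images.

```python
def modifier_test_jobs(base_prompt: str, modifiers: list[str], seed: int) -> list[dict]:
    """Build one baseline job plus one job per modifier, all with the same seed."""
    jobs = [{"prompt": base_prompt, "seed": seed}]
    jobs += [{"prompt": f"{base_prompt}, {modifier}", "seed": seed} for modifier in modifiers]
    return jobs


jobs = modifier_test_jobs(
    "A landscape",
    ["HDR, UHD, 64K", "highly detailed", "vivid colors"],
    seed=1234,  # hypothetical saved seed
)
for job in jobs:
    print(job)
```

Because the seed stays fixed, differences between the resulting images can be attributed to the modifier rather than to a new starting noise.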

Token efficiency
Your prompt is limited to 75 tokens. If you are working with a long prompt try to be efficient with words. A typical example is when using an artist as a modifier to get a particular style. Here are a few prompts and their token counts.

● A horse in the style of Vincent Van Gogh (11)
● A horse by Vincent Van Gogh (7)
● A horse by Van Gogh (6)
● Horse by Van Gogh (6)

The order of words can be as important as the words themselves. This trick is especially useful when trying to make unusual creations.

Pink ice cream truck with machine gun mounted on it, technical
Machine gun mounted on top of a pink ice cream truck, technical

In the above example, the machine gun doesn’t appear unless you put it at the start of the prompt.

Other features: Img2Img, inpainting and outpainting

Sketch to professional art: With img2img you can turn a simple sketch into beautiful art using a text description.

Img2Img Variation: Img2Img is useful for creating a variation of an image and getting similar images. If you want to create an image but something isn't quite right, you can use Img2Img to remake it.

Inpainting: This technique can be used to fix a part of an image, by completely removing or changing the subject in an image, or just fixing a small detail.

Outpainting/uncropping: You can use this technique to expand real or generated images. It can be very useful, since Stable Diffusion likes to crop images.

OpenArt Showcase prompts in Stable Diffusion XL