How to use ChatGPT + Midjourney to generate pictures in batches

Visual content matters a great deal to content creators, but producing it can be difficult without design skills. As artificial intelligence technology advances, AI tools can help us create it. For example, ChatGPT and Midjourney can be combined to generate images in batches.

First, a brief look at the two tools. ChatGPT is a conversational language model that generates human-like text from the text it is given. Midjourney is an image generation tool that produces pictures from a text description.

First, give ChatGPT the background knowledge about Midjourney:

Brainhole AI: Do you know Midjourney?

ChatGPT: I'm not sure what you're referring to, as Midjourney could refer to a number of different things. Can you provide more context or background information? This way I can better answer your questions.

Then tell it what Midjourney is:

Brainhole AI: Midjourney can generate images from natural language descriptions, called "prompts", similar to OpenAI's DALL-E and Stable Diffusion.

ChatGPT: I understand what you mean about Midjourney. Midjourney is an image generation system based on artificial intelligence technology that can generate corresponding images by inputting natural language prompts. Similar to other image generation systems such as DALL-E and Stable Diffusion, Midjourney is based on deep learning and generative adversarial network technology and is able to generate high-quality, realistic images.

Unlike other image generation systems, Midjourney's design goal is to provide users with a simpler and more intuitive way to generate images, that is, through natural language description. Users only need to enter a short prompt, such as "A pink fox sleeps under a tree," and Midjourney will generate an image that matches the description of the prompt.

Midjourney's model is an improved version based on the GPT model, using a Transformer encoder and decoder structure similar to DALL-E, which can understand and interpret natural language cues and convert them into corresponding images.

Overall, Midjourney is an innovative image generation technology that makes image generation simpler and more intuitive through natural language prompts.

Once ChatGPT understands what Midjourney is, it can be primed to create some preliminary prompts.

Brainhole AI: Suppose you are a Midjourney prompt generator, and your job is to generate a fixed-format prompt based on the information provided by the user. Midjourney will then generate images from the prompts you produce. Please be as imaginative and descriptive as possible so Midjourney can generate unique and interesting images. Remember that prompts must be written in English. Here is an example of a prompt: Iron Man standing in front of the Great Wall of China

ChatGPT: Here are some potential prompts that could be generated for Midjourney:

A cat sitting on a rainbow-colored cloud

A surfer catching a massive wave on a sunny day

An astronaut exploring an alien planet filled with neon-colored plants

A mermaid swimming through a coral reef with a school of colorful fish

A wizard casting a spell in a dark and mysterious forest

A unicorn galloping across a field of golden flowers

A dragon breathing fire atop a medieval castle

A pirate sailing on a ship through a stormy sea

A robot walking through a futuristic city filled with neon lights

A superhero flying through the clouds with a cityscape in the background

Each of these prompts is designed to spark the imagination and create a vivid mental image that Midjourney can use to generate a unique and interesting image. Whether it's a magical fantasy world, a futuristic sci-fi landscape, or an epic adventure on the high seas, the possibilities are endless with Midjourney and a well-crafted prompt.
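ChatGPT's reply mixes the prompts with an introductory sentence and a closing comment, so it helps to separate them before using the prompts in bulk. The following is a minimal sketch, assuming the reply has been copied into a Python string by hand. The function name `extract_prompts` and the 40-word cutoff are hypothetical choices for illustration, and the filtering is a heuristic rather than a reliable parser.

```python
import re


def extract_prompts(reply: str, max_words: int = 40) -> list[str]:
    """Pull one-line prompts out of a free-text chat reply (heuristic)."""
    prompts = []
    for line in reply.splitlines():
        line = line.strip()
        # Drop a leading speaker label such as "ChatGPT:".
        line = re.sub(r"^ChatGPT:\s*", "", line)
        # Drop list markers such as "1. ", "2) " or "- ".
        line = re.sub(r"^(\d+[.)]|[-*•])\s+", "", line)
        if not line:
            continue
        # A line ending with a colon is treated as an introduction.
        if line.endswith(":"):
            continue
        # A very long line is treated as commentary (assumed cutoff).
        if len(line.split()) > max_words:
            continue
        prompts.append(line)
    return prompts


reply = (
    "ChatGPT:Here are some potential prompts that could be generated for Midjourney:\n"
    "\n"
    "A cat sitting on a rainbow-colored cloud\n"
    "\n"
    "A surfer catching a massive wave on a sunny day\n"
    "\n"
    "Each of these prompts is designed to spark the imagination and create a vivid "
    "mental image that Midjourney can use to generate a unique and interesting image. "
    "Whether it's a magical fantasy world, a futuristic sci-fi landscape, or an epic "
    "adventure on the high seas, the possibilities are endless with Midjourney and a "
    "well-crafted prompt.\n"
)

for prompt in extract_prompts(reply):
    print(prompt)
```

The word cutoff may need adjusting for longer prompt formats, since a very detailed prompt could be mistaken for commentary.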

We pick a few interesting prompts at random and try them in Midjourney.

A unicorn galloping across a field of golden flowers

A superhero flying through the clouds with a cityscape in the background

These simple prompts do produce the pictures they describe, but they give little control over style and detail. To get that control, we need to further prime ChatGPT to generate prompts in a fixed format.

Brainhole AI: This is a command format of Midjourney. Please help me generate 5 different commands according to this format. The format is as follows: [Pet breed] + [Pet clothing] + [Pet behavior] + [Camera model] + [High definition]

Chihuahua wearing a suit and dancing cha-cha, shot on a Sony DSLR, 70-200mm lens f/4.0, cinematic

ChatGPT: The following are 5 commands generated according to the Midjourney command format:

Labrador Retriever wearing a bow tie and playing catch, shot on a Canon EOS R, 50mm lens f/1.8, high definition.

Persian cat wearing a crown and napping on a cushion, shot on a Nikon Z6 II, 85mm lens f/1.8, cinematic.

Poodle wearing a hoodie and running in a park, shot on a Fujifilm X-T4, 35mm lens f/2.0, high resolution.

Beagle wearing a graduation gown and receiving a diploma, shot on a Sony A7S III, 24-70mm lens f/2.8, artistic.

Siamese cat wearing a bow tie and sitting on a bookshelf, shot on a Panasonic GH5, 12-60mm lens f/3.5-5.6, vibrant.

Let’s choose two prompts and try them out.

Persian cat wearing a crown and napping on a cushion, shot on a Nikon Z6 II, 85mm lens f/1.8, cinematic.

Poodle wearing a hoodie and running in a park, shot on a Fujifilm X-T4, 35mm lens f/2.0, high resolution.

The generated pictures all show pets, each wearing different clothes and doing something different. The photos also look more realistic, because the prompt specifies the camera model and the sharpness of the shot.
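The request used in this step has a regular shape: a list of bracketed slots joined with "+", followed by one reference prompt. The sketch below shows one way to compose such a request as a string, so that only the slots and the example change between subjects. The helper name `build_format_request` and its parameters are hypothetical, and the resulting text is still meant to be pasted into ChatGPT by hand.

```python
def build_format_request(slots: list[str], example: str, count: int = 5) -> str:
    """Compose the chat message that asks for `count` prompts in a slot format."""
    # Render the slots as "[A] + [B] + [C]", matching the article's notation.
    fmt = " + ".join(f"[{slot}]" for slot in slots)
    return (
        f"Please help me generate {count} different commands according to this format. "
        f"The format is as follows: {fmt}. "
        f"Here is a reference command: {example}"
    )


pet_request = build_format_request(
    ["Pet breed", "Pet clothing", "Pet behavior", "Camera model", "High definition"],
    "Chihuahua wearing a suit and dancing cha-cha, shot on a Sony DSLR, "
    "70-200mm lens f/4.0, cinematic",
)
print(pet_request)
```

Keeping the slot list separate from the wording makes it easy to reuse the same request for a new subject by swapping in different slots and a different reference prompt.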

At this point, the initial priming of our ChatGPT-based Midjourney prompt generator is complete. From here, we only need to repeat the previous step with a different format for each subject. For example, to describe a car:

Brainhole AI: Please help me generate 5 different commands according to this format. The format is as follows: [Car model] + [Background description] + [Camera model] + [Aperture and lens] + [3 random high-quality keywords]. Here is a reference command: Mercedes-Benz S-Class Silver on a coastal road at sunset, shot on a Leica mirrorless, 50mm lens f/1.4, dreamy

ChatGPT: The following are 5 commands generated according to the Midjourney command format:

Porsche 911 Carrera Red in an urban setting at night, shot on a Canon DSLR, 24-70mm lens f/2.8, cinematic, vibrant, moody.

Lamborghini Aventador Blue in a mountainous landscape at sunrise, shot on a Sony mirrorless, 85mm lens f/1.8, epic, surreal, breathtaking.

BMW M5 Black in a futuristic cityscape at dusk, shot on a Nikon DSLR, 70-200mm lens f/4.0, urban, dynamic, futuristic.

Audi R8 White in a desert landscape at noon, shot on a Fujifilm mirrorless, 35mm lens f/1.4, vivid, stark, high-contrast.

Ferrari 458 Italia Yellow in a forested area at dawn, shot on a Panasonic mirrorless, 24mm lens f/1.4, moody, atmospheric, immersive.

We try three of them in Midjourney:

Porsche 911 Carrera Red in an urban setting at night, shot on a Canon DSLR, 24-70mm lens f/2.8, cinematic, vibrant, moody.

Ferrari 458 Italia Yellow in a forested area at dawn, shot on a Panasonic mirrorless, 24mm lens f/1.4, moody, atmospheric, immersive.

Audi R8 White in a desert landscape at noon, shot on a Fujifilm mirrorless, 35mm lens f/1.4, vivid, stark, high-contrast.
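When many prompts are generated this way, it can be useful to check that each one still follows the requested car format before sending it to Midjourney. The following is a rough sketch using a regular expression. The pattern and the name `check_car_prompt` are assumptions written for the car prompts shown above, not a general Midjourney syntax.

```python
import re

# Assumed shape: "<car and background>, shot on a <camera>, <lens> lens f/<aperture>, <keywords>"
CAR_PROMPT = re.compile(
    r"^(?P<subject>.+?), shot on an? (?P<camera>[^,]+), "
    r"(?P<lens>[\d-]+mm) lens f/(?P<aperture>[\d.-]+), (?P<keywords>.+?)\.?$"
)


def check_car_prompt(prompt: str, n_keywords: int = 3) -> bool:
    """Return True if the prompt matches the car format with n_keywords keywords."""
    match = CAR_PROMPT.match(prompt.strip())
    if match is None:
        return False
    keywords = [k.strip() for k in match.group("keywords").split(",")]
    return len(keywords) == n_keywords and all(keywords)


generated = (
    "Porsche 911 Carrera Red in an urban setting at night, shot on a Canon DSLR, "
    "24-70mm lens f/2.8, cinematic, vibrant, moody."
)
reference = (
    "Mercedes-Benz S-Class Silver on a coastal road at sunset, shot on a Leica "
    "mirrorless, 50mm lens f/1.4, dreamy"
)

print(check_car_prompt(generated))  # three keywords, as the format requests
print(check_car_prompt(reference))  # the reference prompt has only one keyword
```

Under this check the reference prompt itself fails the three-keyword rule, which suggests that ChatGPT took the keyword count from the format description rather than from the example.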

The time-travel and mash-up pictures that have become popular recently can also be generated in batches, such as an Iron Man series:

Brainhole AI: Please help me generate 5 different commands according to this format: [Iron Man] + [China background description] + [Era] + [Lighting effect] + [Camera model] + [Aperture and lens] + [Realistic high-quality keyword]. The following is a reference command: Ironman at a Chinese supermarket trying to buy some food, 1990s, Cinematic Lighting, shot on a Sony DSLR, 50mm lens f/2.8, ultra detailed

ChatGPT: Iron Man in front of the Forbidden City during the Ming Dynasty, Romantic Lighting, shot on a Canon mirrorless, 24-70mm lens f/2.8, hyper-realistic

Iron Man at a Chinese temple, ancient architecture, Vintage Lighting, shot on a Nikon DSLR, 85mm lens f/1.4, epic

Iron Man at a traditional Chinese street market, bustling crowds, Nostalgic Lighting, shot on a Panasonic mirrorless, 12-35mm lens f/2.8, cinematic

Iron Man on the Great Wall of China, overlooking the sunset, Dramatic Lighting, shot on a Fuji mirrorless, 56mm lens f/1.2, vivid

Iron Man in a Chinese tea house, enjoying the tea ceremony, Cozy Lighting, shot on a Sony DSLR, 35mm lens f/1.8, intimate

Again, we try three of them in Midjourney:

Iron Man in front of the Forbidden City during the Ming Dynasty, Romantic Lighting, shot on a Canon mirrorless, 24-70mm lens f/2.8, hyper-realistic

Iron Man at a traditional Chinese street market, bustling crowds, Nostalgic Lighting, shot on a Panasonic mirrorless, 12-35mm lens f/2.8, cinematic

Iron Man on the Great Wall of China, overlooking the sunset, Dramatic Lighting, shot on a Fuji mirrorless, 56mm lens f/1.2, vivid
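ChatGPT does not always fill every slot. In the Iron Man replies above, the Forbidden City prompt folds the era into the scene description, so it has one comma-separated field fewer than the reference command. A simple way to spot this is to compare field counts, as in the sketch below. The name `missing_slots` is hypothetical, and the check only counts fields. It cannot tell whether a field holds the right kind of content.

```python
def count_fields(text: str) -> int:
    """Count the non-empty comma-separated fields in a prompt."""
    return len([part for part in text.split(",") if part.strip()])


def missing_slots(prompt: str, reference: str) -> int:
    """How many comma-separated fields the prompt lacks relative to the reference."""
    return max(0, count_fields(reference) - count_fields(prompt))


reference = (
    "Ironman at a Chinese supermarket trying to buy some food, 1990s, "
    "Cinematic Lighting, shot on a Sony DSLR, 50mm lens f/2.8, ultra detailed"
)
forbidden_city = (
    "Iron Man in front of the Forbidden City during the Ming Dynasty, "
    "Romantic Lighting, shot on a Canon mirrorless, 24-70mm lens f/2.8, hyper-realistic"
)
great_wall = (
    "Iron Man on the Great Wall of China, overlooking the sunset, "
    "Dramatic Lighting, shot on a Fuji mirrorless, 56mm lens f/1.2, vivid"
)

print(missing_slots(forbidden_city, reference))
print(missing_slots(great_wall, reference))
```

Prompts that come back with missing fields can be regenerated or completed by hand before they are used.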

There is also the famous Jianguo series ("Jianguo" is a Chinese internet nickname for Donald Trump):

Brainhole AI: Please help me generate 10 different commands according to this format: [Trump] + [Emotions] + [Description of Chinese background] + [Era] + [Lighting effects] + [Camera model] + [Aperture and lens] + [Realistic high-quality keyword]. The following is a reference command: Donald Trump smiling but struggling to write Chinese calligraphy with a brush, surrounded by Chinese students observing and giving guidance, 1990s, Cinematic Lighting, shot on a high-quality DSLR with a 50mm lens, ultra-realistic

ChatGPT: Trump staring thoughtfully out of the window of a traditional Chinese teahouse, 1980s, Golden Hour lighting, shot on a Canon mirrorless with a 35mm lens, hyperrealistic.

Trump wearing a traditional Chinese robe and hat, pretending to meditate in a bamboo forest, 2000s, Soft Lighting, shot on a Nikon DSLR with a 85mm lens, painterly.

Trump in a Chinese market, bargaining with vendors over the price of a jade statue, 1990s, Overhead Lighting, shot on a Fujifilm mirrorless with a 24-70mm lens, cinematic.

Trump standing on the Great Wall of China, arms crossed, looking serious and contemplative, 2020s, Dramatic Lighting, shot on a Sony mirrorless with a 70-200mm lens, epic.

Trump posing with a group of Chinese children, dressed up in traditional Chinese clothing, 1980s, Natural Lighting, shot on a Leica mirrorless with a 50mm lens, nostalgic.

Trump sampling Chinese cuisine in a bustling restaurant, 2010s, Neon Lighting, shot on a Panasonic mirrorless with a 12-35mm lens, vibrant.

Trump giving a speech to a large crowd of Chinese people, waving the American flag, 1990s, Spotlight Lighting, shot on a Canon DSLR with a 24-105mm lens, dynamic.

Trump practicing Tai Chi with a group of Chinese elders in a park, 2000s, Soft Lighting, shot on a Pentax mirrorless with a 35mm lens, peaceful.

Trump walking through a traditional Chinese garden, admiring the flowers and architecture, 1980s, Natural Lighting, shot on a Olympus mirrorless with a 50mm lens, tranquil.

Trump playing ping-pong with a group of Chinese children, laughing and having a good time, 1990s, Soft Lighting, shot on a Nikon DSLR with a 50mm lens, heartwarming.

A few of them, tried in Midjourney:

Trump wearing a traditional Chinese robe and hat, pretending to meditate in a bamboo forest, 2000s, Soft Lighting, shot on a Nikon DSLR with a 85mm lens, painterly.

Trump standing on the Great Wall of China, arms crossed, looking serious and contemplative, 2020s, Dramatic Lighting, shot on a Sony mirrorless with a 70-200mm lens, epic.

Trump sampling Chinese cuisine in a bustling restaurant, 2010s, Neon Lighting, shot on a Panasonic mirrorless with a 12-35mm lens, vibrant.

Ping-pong diplomacy in the new era:

Trump playing ping-pong with a group of Chinese children, laughing and having a good time, 1990s, Soft Lighting, shot on a Nikon DSLR with a 50mm lens, heartwarming.
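Once the prompts from the different series have been collected, they still need to be submitted to Midjourney one at a time through its `/imagine` command in Discord. The sketch below only prepares the text. It removes repeated prompts and prefixes each one so the lines are ready to paste. The function name `to_imagine_lines` is hypothetical, and submitting the lines is assumed to be a manual step.

```python
def to_imagine_lines(prompts: list[str]) -> list[str]:
    """Deduplicate prompts and format each one as an /imagine command line."""
    seen = set()
    lines = []
    for prompt in prompts:
        # Trim whitespace and a trailing period left over from the chat reply.
        cleaned = prompt.strip().rstrip(".")
        key = cleaned.lower()
        if not cleaned or key in seen:
            continue
        seen.add(key)
        lines.append(f"/imagine prompt: {cleaned}")
    return lines


collected = [
    "A unicorn galloping across a field of golden flowers",
    "a unicorn galloping across a field of golden flowers.",
    "Persian cat wearing a crown and napping on a cushion, shot on a Nikon Z6 II, "
    "85mm lens f/1.8, cinematic.",
]

for line in to_imagine_lines(collected):
    print(line)
```

Deduplicating on the lowercased text keeps the first spelling of each prompt and avoids spending generations on repeats.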

With this method, you can generate high-quality pictures in batches in a very short time, which makes your creative work more efficient and your results more vivid and interesting. It also shows that artificial intelligence can open up more possibilities for creative work and help us express our ideas. I hope everyone will try this technique and develop their own unique style!

Origin blog.csdn.net/2301_76935063/article/details/130201065