# Image generation with Gemini

Gemini can generate and process images conversationally. You can prompt Gemini with text, images, or a combination of both to perform image-related tasks such as image generation and editing. All generated images include a [SynthID watermark](/responsible/docs/safeguards/synthid).

Image generation may not be available in all regions and countries. Review our [Gemini models](/gemini-api/docs/models#gemini-2.0-flash-preview-image-generation) page for more information.

**Note:** You can also generate images with [Imagen](/gemini-api/docs/imagen), our specialized image generation model. See the When to use Imagen section for details on how to choose between Gemini and Imagen.

## Image generation (text-to-image)

The following code demonstrates how to generate an image based on a descriptive prompt. You must include `responseModalities`: `["TEXT", "IMAGE"]` in your configuration. Image-only output is not supported with these models.

```python
from io import BytesIO

from google import genai
from google.genai import types
from PIL import Image

client = genai.Client()

contents = ('Hi, can you create a 3d rendered image of a pig '
            'with wings and a top hat flying over a happy '
            'futuristic scifi city with lots of greenery?')

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents=contents,
    config=types.GenerateContentConfig(
        # Both modalities are required; image-only output is not supported.
        response_modalities=['TEXT', 'IMAGE']
    )
)

# The response interleaves text parts and inline image parts.
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save('gemini-native-image.png')
        image.show()
```

![AI-generated image of a fantastical flying pig](/static/gemini-api/docs/images/flying-pig.png)
AI-generated image of a fantastical flying pig

## Image editing (text-and-image-to-image)

To perform image editing, add an image as input.
The following example loads a local image with Pillow and sends it together with a text prompt. For multiple images and larger payloads, check the [image input](/gemini-api/docs/image-understanding#image-input) section.

```python
from io import BytesIO

from google import genai
from google.genai import types
from PIL import Image

# Load the image to edit from disk.
source_image = Image.open('/path/to/image.png')

client = genai.Client()

# Adjacent string literals are concatenated into a single prompt string.
text_input = ('Hi, this is a picture of me. '
              'Can you add a llama next to me?')

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents=[text_input, source_image],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        edited_image = Image.open(BytesIO(part.inline_data.data))
        edited_image.show()
```

## Other image generation modes

Gemini supports other image interaction modes based on prompt structure and context, including:

* **Text to image(s) and text (interleaved):** Outputs images with related text.
    * Example prompt: "Generate an illustrated recipe for a paella."
* **Image(s) and text to image(s) and text (interleaved):** Uses input images and text to create new related images and text.
    * Example prompt: (With an image of a furnished room) "What other color sofas would work in my space? Can you update the image?"
* **Multi-turn image editing (chat):** Keep generating and editing images conversationally.
    * Example prompts: [upload an image of a blue car], "Turn this car into a convertible.", "Now change the color to yellow."

## Limitations

* For best performance, use the following languages: EN, es-MX, ja-JP, zh-CN, hi-IN.
* Image generation does not support audio or video inputs.
* Image generation may not always trigger:
    * The model may output text only. Try asking for image outputs explicitly (e.g. "generate an image", "provide images as you go along", "update the image").
    * The model may stop generating partway through. Try again or try a different prompt.
* When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
* There are some regions and countries where image generation is not available. See [Models](/gemini-api/docs/models) for more information.

## When to use Imagen

In addition to using Gemini's built-in image generation capabilities, you can also access [Imagen](/gemini-api/docs/imagen), our specialized image generation model, through the Gemini API.

Choose **Gemini** when:

* You need contextually relevant images that leverage world knowledge and reasoning.
* Seamlessly blending text and images is important.
* You want accurate visuals embedded within long text sequences.
* You want to edit images conversationally while maintaining context.

Choose **Imagen** when:

* Image quality, photorealism, artistic detail, or specific styles (e.g., impressionism, anime) are top priorities.
* You are performing specialized editing tasks like product background updates or image upscaling.
* You are infusing branding or style, or generating logos and product designs.

Imagen 4 should be your go-to model when you start generating images with Imagen. Choose Imagen 4 Ultra for advanced use cases or when you need the best image quality. Note that Imagen 4 Ultra can only generate one image at a time.

## What's next

* Check out the [Veo guide](/gemini-api/docs/video) to learn how to generate videos with the Gemini API.
* To learn more about Gemini models, see [Gemini models](/gemini-api/docs/models/gemini) and [Experimental models](/gemini-api/docs/models/experimental-models).
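
As a follow-up to the Limitations section: because the model may return text only or stop partway through, it can help to check a response for image parts before saving anything, and to try again if none arrived. The following is a minimal sketch, assuming response parts expose `text` and `inline_data.data` as in the examples in this guide. `split_parts` and `generate_with_retry` are hypothetical helper names, not part of the SDK, and the `SimpleNamespace` objects are stand-ins so the sketch runs without calling the API.

```python
from types import SimpleNamespace


def split_parts(parts):
    """Separate text strings from raw image bytes in a list of response parts."""
    texts, images = [], []
    for part in parts:
        if getattr(part, "text", None) is not None:
            texts.append(part.text)
        elif getattr(part, "inline_data", None) is not None:
            images.append(part.inline_data.data)
    return texts, images


def generate_with_retry(generate, max_attempts=3):
    """Call `generate()` until the result contains at least one image.

    `generate` is a hypothetical zero-argument callable that returns a list
    of parts, for example a function wrapping client.models.generate_content
    and returning response.candidates[0].content.parts.
    """
    texts = []
    for _ in range(max_attempts):
        texts, images = split_parts(generate())
        if images:
            return texts, images
    # No attempt produced an image; return the last text so the caller can inspect it.
    return texts, []


# Stand-in parts for a dry run (assumption: real parts have the same attributes).
text_only = [SimpleNamespace(text="Here is a description.", inline_data=None)]
with_image = [
    SimpleNamespace(text="Here is your image.", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(data=b"\x89PNG")),
]

# The first attempt yields text only, the second includes an image.
responses = iter([text_only, with_image])
texts, images = generate_with_retry(lambda: next(responses))
print(len(images))
```

In real use, the image bytes returned by `generate_with_retry` could be opened with `Image.open(BytesIO(...))` as in the earlier examples. If every attempt returns text only, rephrasing the prompt to ask for an image explicitly is likely more effective than raising `max_attempts`.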