# Image generation with Gemini

Source: <https://ai.google.dev/gemini-api/docs/image-generation>

---

Gemini can generate and process images conversationally. You can prompt Gemini with text, images, or a combination of both to achieve various image-related tasks, such as image generation and editing. All generated images include a [SynthID watermark](/responsible/docs/safeguards/synthid).

Image generation may not be available in all regions and countries. Review our [Gemini models](/gemini-api/docs/models#gemini-2.0-flash-preview-image-generation) page for more information.

**Note:** You can also generate images with [Imagen](/gemini-api/docs/imagen), our specialized image generation model. See the When to use Imagen section for details on how to choose between Gemini and Imagen.

## Image generation (text-to-image)

The following code demonstrates how to generate an image from a descriptive prompt. You must include `responseModalities`: `["TEXT", "IMAGE"]` in your configuration (`response_modalities` in the Python SDK). Image-only output is not supported with these models.
    
    
    from google import genai
    from google.genai import types
    from PIL import Image
    from io import BytesIO
    
    client = genai.Client()
    
    contents = ('Hi, can you create a 3d rendered image of a pig '
                'with wings and a top hat flying over a happy '
                'futuristic scifi city with lots of greenery?')
    
    response = client.models.generate_content(
        model="gemini-2.0-flash-preview-image-generation",
        contents=contents,
        config=types.GenerateContentConfig(
            response_modalities=['TEXT', 'IMAGE']
        )
    )
    
    # The response can contain both text parts and image parts.
    for part in response.candidates[0].content.parts:
        if part.text is not None:
            print(part.text)
        elif part.inline_data is not None:
            # inline_data.data holds the image bytes.
            image = Image.open(BytesIO(part.inline_data.data))
            image.save('gemini-native-image.png')
            image.show()
    

![AI-generated image of a fantastical flying pig](/static/gemini-api/docs/images/flying-pig.png) AI-generated image of a fantastical flying pig

## Image editing (text-and-image-to-image)

To perform image editing, add an image as input. The following example loads a local image with PIL and passes it alongside the text prompt. For multiple images and larger payloads, check the [image input](/gemini-api/docs/image-understanding#image-input) section.
    
    
    from google import genai
    from google.genai import types
    from PIL import Image
    from io import BytesIO
    
    # Load the input image to edit.
    image = Image.open('/path/to/image.png')
    
    client = genai.Client()
    
    # Adjacent string literals are joined into a single prompt string.
    text_input = ('Hi, this is a picture of me. '
                  'Can you add a llama next to me?')
    
    response = client.models.generate_content(
        model="gemini-2.0-flash-preview-image-generation",
        contents=[text_input, image],
        config=types.GenerateContentConfig(
            response_modalities=['TEXT', 'IMAGE']
        )
    )
    
    for part in response.candidates[0].content.parts:
        if part.text is not None:
            print(part.text)
        elif part.inline_data is not None:
            # Use a separate name so the input image is not overwritten.
            edited = Image.open(BytesIO(part.inline_data.data))
            edited.show()
    

## Other image generation modes

Gemini supports other image interaction modes based on prompt structure and context, including:

  * **Text to image(s) and text (interleaved):** Outputs images with related text.
    * Example prompt: "Generate an illustrated recipe for a paella."
  * **Image(s) and text to image(s) and text (interleaved):** Uses input images and text to create new related images and text.
    * Example prompt: (With an image of a furnished room) "What other color sofas would work in my space? Can you update the image?"
  * **Multi-turn image editing (chat):** Keep generating and editing images conversationally.
    * Example prompts: (After uploading an image of a blue car) "Turn this car into a convertible.", "Now change the color to yellow."
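The multi-turn mode above can be sketched with the SDK's chat interface. This is a minimal sketch under stated assumptions, not a definitive implementation: `collect_parts`, `run_turns`, and `demo` are hypothetical helper names, and `demo` assumes that `client.chats.create` accepts the same `response_modalities` configuration as `generate_content` and that `send_message` accepts a list mixing text and a PIL image.

```python
from io import BytesIO


def collect_parts(response):
    """Split one response into its text chunks and image bytes, in order."""
    texts, images = [], []
    for part in response.candidates[0].content.parts:
        if getattr(part, "text", None) is not None:
            texts.append(part.text)
        elif getattr(part, "inline_data", None) is not None:
            images.append(part.inline_data.data)
    return texts, images


def run_turns(chat, messages):
    """Send each message on the same chat so earlier edits stay in context."""
    results = []
    for message in messages:
        response = chat.send_message(message)
        results.append(collect_parts(response))
    return results


def demo(image_path):
    """Hypothetical end-to-end usage; needs the google-genai SDK and an API key."""
    from google import genai
    from google.genai import types
    from PIL import Image

    client = genai.Client()
    chat = client.chats.create(
        model="gemini-2.0-flash-preview-image-generation",
        config=types.GenerateContentConfig(
            response_modalities=["TEXT", "IMAGE"]
        ),
    )
    turns = [
        ["Turn this car into a convertible.", Image.open(image_path)],
        "Now change the color to yellow.",
    ]
    for i, (texts, images) in enumerate(run_turns(chat, turns)):
        print("\n".join(texts))
        for j, data in enumerate(images):
            # Save every image returned in this turn under a distinct name.
            Image.open(BytesIO(data)).save(f"turn-{i}-image-{j}.png")
```

Because `run_turns` only relies on a `send_message` method, it can be exercised with a stub chat object before pointing it at the real API. Calling `demo('/path/to/car.png')` would run the two example prompts against the service.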


## Limitations

  * For best performance, use the following languages: EN, es-MX, ja-JP, zh-CN, hi-IN.
  * Image generation does not support audio or video inputs.
  * Image generation may not always trigger: 
    * The model may output text only. Try asking for image outputs explicitly (e.g. "generate an image", "provide images as you go along", "update the image").
    * The model may stop generating partway through. Try again or try a different prompt.
  * When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
  * Image generation is not available in some regions and countries. See [Models](/gemini-api/docs/models) for more information.
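Since image generation may not always trigger, one way to handle a text-only reply is to check the response for image data and retry with a more explicit request. The following is a minimal sketch, not part of the SDK: `has_image`, `generate_with_retry`, the `nudge` text, and the `max_attempts` default are all hypothetical choices made for illustration.

```python
def has_image(response):
    """Return True if any part of the first candidate carries inline image data."""
    candidates = getattr(response, "candidates", None) or []
    if not candidates:
        return False
    content = getattr(candidates[0], "content", None)
    parts = getattr(content, "parts", None) or []
    return any(getattr(p, "inline_data", None) is not None for p in parts)


def generate_with_retry(generate, prompt, max_attempts=3,
                        nudge=" Please generate an image."):
    """Call generate(prompt) until the response contains an image.

    `generate` is any callable that takes a prompt string and returns a
    response object. The last response is returned even if it has no image,
    so the caller should still check it with has_image().
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt_prompt = prompt
    response = None
    for _ in range(max_attempts):
        response = generate(attempt_prompt)
        if has_image(response):
            return response
        # Text-only reply: ask for the image explicitly on the next attempt.
        attempt_prompt = prompt + nudge
    return response
```

With the SDK, `generate` could be a small wrapper such as `lambda p: client.models.generate_content(model=..., contents=p, config=...)`. Keeping the API call behind a callable makes the retry logic easy to test without network access.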


## When to use Imagen

In addition to using Gemini's built-in image generation capabilities, you can also access [Imagen](/gemini-api/docs/imagen), our specialized image generation model, through the Gemini API.

Choose **Gemini** when:

  * You need contextually relevant images that leverage world knowledge and reasoning.
  * Seamlessly blending text and images is important.
  * You want accurate visuals embedded within long text sequences.
  * You want to edit images conversationally while maintaining context.


Choose **Imagen** when:

  * Image quality, photorealism, artistic detail, or specific styles (e.g., impressionism, anime) are top priorities.
  * You need to perform specialized editing tasks, such as product background updates or image upscaling.
  * You want to infuse branding or style, or generate logos and product designs.


Imagen 4 should be your go-to model when you start generating images with Imagen. Choose Imagen 4 Ultra for advanced use cases or when you need the best image quality. Note that Imagen 4 Ultra can only generate one image at a time.

## What's next

  * Check out the [Veo guide](/gemini-api/docs/video) to learn how to generate videos with the Gemini API.
  * To learn more about Gemini models, see [Gemini models](/gemini-api/docs/models/gemini) and [Experimental models](/gemini-api/docs/models/experimental-models).