# Image generation with Gemini

Source: <https://ai.google.dev/gemini-api/docs/image-generation>

---
Gemini can generate and process images conversationally. You can prompt Gemini with text, images, or a combination of both to achieve various image-related tasks, such as image generation and editing. All generated images include a [SynthID watermark](/responsible/docs/safeguards/synthid).

Image generation may not be available in all regions and countries; review the [Gemini models](/gemini-api/docs/models#gemini-2.0-flash-preview-image-generation) page for more information.

**Note:** You can also generate images with [Imagen](/gemini-api/docs/imagen), our specialized image generation model. See the When to use Imagen section for details on how to choose between Gemini and Imagen.
## Image generation (text-to-image)

The following code demonstrates how to generate an image based on a descriptive prompt. You must include `responseModalities: ["TEXT", "IMAGE"]` in your configuration. Image-only output is not supported with these models.
```python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

client = genai.Client()

contents = ('Hi, can you create a 3d rendered image of a pig '
            'with wings and a top hat flying over a happy '
            'futuristic scifi city with lots of greenery?')

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents=contents,
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        # Decode the returned image bytes and save them to disk.
        image = Image.open(BytesIO(part.inline_data.data))
        image.save('gemini-native-image.png')
        image.show()
```
*AI-generated image of a fantastical flying pig*
## Image editing (text-and-image-to-image)

To perform image editing, add an image as input. The following example demonstrates passing an image loaded with PIL. For multiple images, larger payloads, and base64-encoded input, see the [image input](/gemini-api/docs/image-understanding#image-input) section.
```python
from google import genai
from google.genai import types
from PIL import Image
from io import BytesIO

image = Image.open('/path/to/image.png')

client = genai.Client()

text_input = ('Hi, this is a picture of me. '
              'Can you add a llama next to me?')

response = client.models.generate_content(
    model="gemini-2.0-flash-preview-image-generation",
    contents=[text_input, image],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        # Decode and display the edited image.
        image = Image.open(BytesIO(part.inline_data.data))
        image.show()
```
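When calling the REST API directly rather than the Python SDK, input images are sent as base64-encoded inline data. Below is a minimal sketch, using only the standard library, of wrapping a local image file as such a part; the `inline_data` payload shape and the file path are assumptions for illustration, so check the image input documentation for the exact request format.

```python
import base64


def inline_image_part(path, mime_type='image/png'):
    """Read an image file and wrap it as a base64 inline-data part
    suitable for embedding in a JSON request body."""
    with open(path, 'rb') as f:
        encoded = base64.b64encode(f.read()).decode('utf-8')
    return {'inline_data': {'mime_type': mime_type, 'data': encoded}}
```

A part built this way can be placed alongside a text part in the request's `contents` list.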
## Other image generation modes

Gemini supports other image interaction modes based on prompt structure and context, including:

* **Text to image(s) and text (interleaved):** Outputs images with related text.
  * Example prompt: "Generate an illustrated recipe for a paella."
* **Image(s) and text to image(s) and text (interleaved):** Uses input images and text to create new related images and text.
  * Example prompt: (With an image of a furnished room) "What other color sofas would work in my space? Can you update the image?"
* **Multi-turn image editing (chat):** Keep generating and editing images conversationally.
  * Example prompts: [upload an image of a blue car] "Turn this car into a convertible." "Now change the color to yellow."
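The multi-turn flow above can be sketched with the SDK's chat interface. This is a minimal sketch, not a definitive implementation: it assumes the `client.chats` API, the same preview model as the earlier examples, and a placeholder image path, and it only calls the API when a key is configured.

```python
import os


def edit_image_conversationally(image_path):
    # Requires the google-genai SDK and a configured API key.
    from google import genai
    from google.genai import types
    from PIL import Image

    client = genai.Client()
    chat = client.chats.create(
        model="gemini-2.0-flash-preview-image-generation",
        config=types.GenerateContentConfig(
            response_modalities=['TEXT', 'IMAGE']
        ),
    )
    # First turn: upload the image and request an edit.
    chat.send_message([Image.open(image_path),
                       'Turn this car into a convertible.'])
    # Follow-up turn: refine the previous result in context.
    return chat.send_message('Now change the color to yellow.')


if os.environ.get('GEMINI_API_KEY'):
    response = edit_image_conversationally('/path/to/car.png')
```

Because the chat object carries the conversation history, each follow-up edit applies to the most recent image without re-uploading it.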
## Limitations

* For best performance, use the following languages: EN, es-MX, ja-JP, zh-CN, hi-IN.
* Image generation does not support audio or video inputs.
* Image generation may not always trigger:
  * The model may output text only. Try asking for image outputs explicitly (e.g. "generate an image", "provide images as you go along", "update the image").
  * The model may stop generating partway through. Try again or try a different prompt.
* When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
* Image generation is not available in some regions and countries. See [Models](/gemini-api/docs/models) for more information.
## When to use Imagen

In addition to using Gemini's built-in image generation capabilities, you can also access [Imagen](/gemini-api/docs/imagen), our specialized image generation model, through the Gemini API.

Choose **Gemini** when:

* You need contextually relevant images that leverage world knowledge and reasoning.
* Seamlessly blending text and images is important.
* You want accurate visuals embedded within long text sequences.
* You want to edit images conversationally while maintaining context.

Choose **Imagen** when:

* Image quality, photorealism, artistic detail, or specific styles (e.g., impressionism, anime) are top priorities.
* You are performing specialized editing tasks like product background updates or image upscaling.
* You are infusing branding or style, or generating logos and product designs.
Imagen 4 should be your go-to model when starting to generate images with Imagen. Choose Imagen 4 Ultra for advanced use cases or when you need the best image quality. Note that Imagen 4 Ultra can only generate one image at a time.
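Generating with Imagen uses a separate SDK method from the Gemini examples above. The following is a minimal sketch: the `generate_images` call and `GenerateImagesConfig` follow the google-genai SDK's pattern, but the model ID `imagen-4.0-generate-001` is an assumption, so check the [Imagen guide](/gemini-api/docs/imagen) for current model names. The API call only runs when a key is configured.

```python
import os


def generate_with_imagen(prompt):
    # Requires the google-genai SDK and a configured API key.
    # The model ID below is an assumption; see the Imagen docs.
    from google import genai
    from google.genai import types

    client = genai.Client()
    response = client.models.generate_images(
        model="imagen-4.0-generate-001",
        prompt=prompt,
        config=types.GenerateImagesConfig(number_of_images=1),
    )
    # Each generated image exposes its bytes for saving or display.
    return response.generated_images[0].image


if os.environ.get('GEMINI_API_KEY'):
    image = generate_with_imagen(
        'A photorealistic robot barista making coffee.')
```

Note that Imagen takes a plain text prompt and returns only images, so there is no `response_modalities` setting as in the Gemini examples.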
## What's next

* Check out the [Veo guide](/gemini-api/docs/video) to learn how to generate videos with the Gemini API.
* To learn more about Gemini models, see [Gemini models](/gemini-api/docs/models/gemini) and [Experimental models](/gemini-api/docs/models/experimental-models).