Media generation input format for a large vision model.
The image bytes or Cloud Storage URI to make the prediction on. Required for editing; not needed for generation. The presence of this field determines whether the call is treated as editing or generation.
prompt string
The text prompt for generating the images. This is required for both editing and generation.
The masked region will be edited based on the text content provided. The mask can be either an image or a polygon. It should not be provided without an image. Optional field for editing images.
The reference images to be used for editing and customization capabilities. Imagen 3 Capability adds support for multiple reference images, each of which can be a mask, control, style, or subject image. Depending on the reference type, the reference_config field will be populated with the corresponding config.
| JSON representation |
|---|
{ "image": { object ( |
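As a rough illustration of how the presence of `image` separates editing from generation, the sketch below assembles an instance as a plain Python dictionary. The helper names `build_instance` and `call_kind` are hypothetical and exist only for this example; the `prompt` and `image` keys are the only names taken from the fields described above.

```python
from typing import Any, Dict, Optional


def build_instance(prompt: str, image: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble an instance dict; `prompt` is required in both modes."""
    if not prompt:
        raise ValueError("prompt is required for both editing and generation")
    instance: Dict[str, Any] = {"prompt": prompt}
    if image is not None:
        # Supplying an image marks the call as an editing request.
        instance["image"] = image
    return instance


def call_kind(instance: Dict[str, Any]) -> str:
    """Classify an instance by whether it carries an `image` field."""
    return "editing" if "image" in instance else "generation"
```

This is client-side bookkeeping only; how the service itself interprets the instance is described by the field documentation, not by this sketch.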
Image
mimeType string
The MIME type of the content of the image. Only the MIME types listed below are supported:
- image/jpeg
- image/png
data Union type
data can be only one of the following:
bytesBase64Encoded string
Base64 encoded bytes string representing the image.
gcsUri string
| JSON representation |
|---|
{ "mimeType": string, // data "bytesBase64Encoded": string, "gcsUri": string // Union type } |
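A minimal sketch of building an Image object in Python, assuming the request body is assembled as a plain dictionary. The function name `make_image` and its parameters are hypothetical; the keys `mimeType`, `bytesBase64Encoded`, and `gcsUri` follow the JSON representation above. The check enforces that exactly one member of the `data` union is set.

```python
import base64
from typing import Dict, Optional

# MIME types listed as supported in the Image description.
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def make_image(
    mime_type: str,
    raw_bytes: Optional[bytes] = None,
    gcs_uri: Optional[str] = None,
) -> Dict[str, str]:
    """Build an Image dict from raw bytes or a Cloud Storage URI (not both)."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    # `data` is a union: exactly one of the two members may be provided.
    if (raw_bytes is None) == (gcs_uri is None):
        raise ValueError("provide exactly one of raw_bytes or gcs_uri")
    image = {"mimeType": mime_type}
    if raw_bytes is not None:
        # JSON cannot carry raw bytes, so they are sent as a Base64 string.
        image["bytesBase64Encoded"] = base64.b64encode(raw_bytes).decode("ascii")
    else:
        image["gcsUri"] = gcs_uri
    return image
```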
Mask
data Union type
| JSON representation |
|---|
{ // data "image": { object ( |
BoundingPolyList
| JSON representation |
|---|
{
"polygons": [
{
object ( |
ReferenceImage
A ReferenceImage is an image that is used to provide additional context for the image generation or editing.
The actual image data of the reference image.
referenceId integer
The ID of the reference image. It must be unique within the request.
The type of the reference image.
reference_config Union type
reference_config can be only one of the following:
A config for a mask image.
A config for a control image.
A config for a style image.
A config for a subject image.
| JSON representation |
|---|
{ "referenceImage": { object ( |
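Because each `referenceId` must be unique within the request, a client may want to check this before sending. The following is a sketch under the assumption that reference images are held as a list of dictionaries, each carrying a `referenceId` key; the function name `check_unique_reference_ids` is hypothetical.

```python
from typing import Any, Dict, List


def check_unique_reference_ids(reference_images: List[Dict[str, Any]]) -> None:
    """Raise ValueError if two reference images share a referenceId."""
    seen = set()
    for ref in reference_images:
        ref_id = ref["referenceId"]
        if ref_id in seen:
            raise ValueError(f"duplicate referenceId: {ref_id}")
        seen.add(ref_id)
```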
MaskImageConfig
Config for masked image editing using Imagen 3 Capability.
maskMode enum (MaskMode)
Mode used to generate the mask if a mask is not provided.
dilation number
Dilation to be used with this Mask. This value is used to dilate the mask before applying the edit mode.
maskClasses[] integer
The segmentation classes which are used in the MASK_MODE_SEMANTIC mode.
| JSON representation |
|---|
{
"maskMode": enum ( |
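A hedged sketch of assembling a MaskImageConfig as a Python dictionary. The function name `make_mask_image_config` is hypothetical. Rejecting `maskClasses` outside `MASK_MODE_SEMANTIC` is an assumption of this example, based only on the statement that the classes are used in that mode; the service may behave differently. No valid range for `dilation` is stated here, so none is enforced.

```python
from typing import Any, Dict, List, Optional


def make_mask_image_config(
    mask_mode: str,
    dilation: Optional[float] = None,
    mask_classes: Optional[List[int]] = None,
) -> Dict[str, Any]:
    """Build a MaskImageConfig dict, omitting fields that were not supplied."""
    config: Dict[str, Any] = {"maskMode": mask_mode}
    if dilation is not None:
        config["dilation"] = dilation
    if mask_classes is not None:
        # Assumption: segmentation classes are only meaningful in semantic mode.
        if mask_mode != "MASK_MODE_SEMANTIC":
            raise ValueError("maskClasses is only used with MASK_MODE_SEMANTIC")
        config["maskClasses"] = list(mask_classes)
    return config
```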
ControlImageConfig
Config for control image used for editing.
Type of control image.
enableControlImageComputation boolean
Whether to compute the control image for the request.
superpixelRegionSize integer
Region size of the superpixel control image.
superpixelRuler number
Ruler of the superpixel control image.
| JSON representation |
|---|
{
"controlType": enum ( |
StyleImageConfig
Config for style image used for editing.
styleDescription string
Description of the style image.
| JSON representation |
|---|
{ "styleDescription": string } |
SubjectImageConfig
Config for subject image used for editing.
subjectDescription string
Description of the subject image.
Type of subject image.
| JSON representation |
|---|
{
"subjectDescription": string,
"subjectType": enum ( |