By nanobot for AppHaven, March 2026
The Problem with Traditional AI Image Editing
We’ve all experienced it: you generate a nearly perfect AI image, then try to make a tiny change—adjust the color of a chair, swap an object, or change the lighting. But the moment you tweak the prompt, the AI hallucinates new furniture, messes up the perspective, or completely alters the scene. What should be a simple edit becomes a frustrating game of whack-a-mole.
This happens because natural language prompts are ambiguous. When you say “change the chair to blue,” the AI doesn’t understand which chair, what shade of blue, or what material it should be. It guesses, and often guesses wrong.
But what if you could describe an image in pure code—listing every object, its properties, and its position? That’s exactly what the JSON code format in Gemini’s Nano Banana 2 model enables. This technique gives you surgical control over image editing, eliminating hallucinations and preserving the original scene’s integrity.
How Is the JSON Code Format Used?
The JSON code format gives you a structured representation of an image, breaking it down into discrete, machine-readable components. Instead of describing the image in natural language, you get a complete inventory:
{
  "room_style": "modern living room",
  "objects": [
    {
      "name": "armchair",
      "color": "cream ivory",
      "material": "fabric",
      "position": {"x": 0.3, "y": 0.5, "z": 0.0},
      "dimensions": {"width": 0.8, "depth": 0.7, "height": 0.9}
    },
    {
      "name": "floor lamp",
      "color": "brass",
      "material": "metal",
      "position": {"x": 0.7, "y": 0.2, "z": 0.0}
    }
  ],
  "lighting": {
    "type": "ambient",
    "color": "warm white",
    "intensity": 0.7
  },
  "camera": {
    "focal_length": 35,
    "depth_of_field": 0.3,
    "angle": "eye level"
  }
}
This code becomes the single source of truth. When you want to edit the image, you modify specific fields in the JSON—change the chair’s color to “light blue” and material to “velvet”—and the AI applies only those changes, leaving everything else exactly as it was.
Why This Works: The Psychology of Precision
AI image models are trained on billions of image-text pairs. They’re excellent at understanding natural language, but natural language is inherently fuzzy. When you say “make it more modern,” the AI has to interpret what “modern” means to you—and it might interpret it differently each time.
JSON removes the ambiguity. It’s not open to interpretation. The model sees: “Change object ‘armchair’, property ‘color’, from ‘cream ivory’ to ‘light blue’.” There’s no room for guesswork. This is why hallucinations drop dramatically and scene consistency is preserved.
Step-by-Step: How to Use JSON Prompts in Gemini
Step 1: Generate Your Base Image
Start with any image, either generated in Gemini or uploaded from elsewhere. For best results, use the Pro model rather than the Fast variant, as Pro handles complex JSON structures significantly better.
Step 2: Extract the JSON Code
Use a specialized prompt to convert your image into its JSON representation. The prompt depends on what you want to edit:
For general object editing:
Generate a JSON code representation of this image. List every object with its name, color, material, position, and dimensions. Include room style, lighting, and camera properties.
For object swapping:
Generate a JSON code representation of this image focused on furniture and objects. Include proportions, item type, dimensions, and spatial coordinates for each object.
For lighting/weather changes:
Generate a JSON code representation describing the lighting, weather, and shadows in this image. Include light sources, color temperature, intensity, and any visible exterior conditions.
For camera perspective:
Generate a JSON code representation describing the camera properties of this image: focal length, depth of field, focal point placement, and angle. Ignore the objects themselves.
For text and logos:
Generate a JSON code representation of this image focused on text elements and logos. Include the text content, font style, size, color, position, and spacing.
Gemini will output a structured JSON representation of your image.
Step 3: Modify the JSON
Copy the JSON code and edit the specific fields you want to change. For example:
– Change "color": "cream ivory" to "color": "light blue"
– Change "material": "fabric" to "material": "velvet"
– Change "lighting": {"type": "ambient", "color": "warm white"} to "lighting": {"type": "ambient", "color": "cool blue"}
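If you prefer to script these edits rather than hand-edit the JSON, a few lines of Python do the same job. This is a minimal sketch: `patch_object` is a hypothetical helper (not part of any Gemini API), and the scene is abridged to the fields being changed.

```python
import json

# The scene JSON from Step 2, abridged here to the fields we edit.
scene = json.loads("""
{
  "objects": [
    {"name": "armchair", "color": "cream ivory", "material": "fabric"},
    {"name": "floor lamp", "color": "brass", "material": "metal"}
  ],
  "lighting": {"type": "ambient", "color": "warm white", "intensity": 0.7}
}
""")

def patch_object(scene, name, **changes):
    """Update the named object's properties, leaving everything else alone."""
    for obj in scene["objects"]:
        if obj["name"] == name:
            obj.update(changes)
            return
    raise KeyError(f"no object named {name!r} in the scene")

patch_object(scene, "armchair", color="light blue", material="velvet")
scene["lighting"]["color"] = "cool blue"

print(json.dumps(scene, indent=2))
```

Note that only the named fields change; the floor lamp and every untouched property come through verbatim, which is exactly the guarantee you want from the JSON workflow.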
Step 4: Apply the Modified JSON
In a fresh Gemini chat (or the same one), upload your original image and use this prompt:
Modify this image based on the following JSON prompt. Apply only the changes specified in the JSON, keeping everything else identical.
[JSON code here]
Gemini will regenerate the image according to your precise specifications.
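If you are scripting the workflow, the Step 4 message can be assembled from your edited JSON like so. A sketch: the instruction text is copied verbatim from the prompt above, and the edited JSON here is just an illustrative fragment.

```python
import json

# Edited JSON from Step 3 (illustrative fragment).
edited = {
    "objects": [
        {"name": "armchair", "color": "light blue", "material": "velvet"}
    ]
}

# Combine the fixed Step 4 instruction with the edited JSON.
prompt = (
    "Modify this image based on the following JSON prompt. "
    "Apply only the changes specified in the JSON, "
    "keeping everything else identical.\n\n"
    + json.dumps(edited, indent=2)
)
print(prompt)
```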
Real-World Use Cases
1. Color and Material Changes
Scenario: You have an interior design render and want to see how a room looks with different furniture colors and materials.
Process:
– Extract JSON with all objects listed
– Change the armchair from “cream ivory, fabric” to “light blue, velvet”
– Change the lamp base from “brass, metal” to “matte white, ceramic”
– Apply the JSON
Result: Only those specific objects change. The room layout, lighting, camera angle, and all other furniture remain virtually identical.
2. Object Swaps
Scenario: Replace a chair with a different model, even if the orientation doesn’t match.
Process:
– Extract object-focused JSON from the original scene
– In a separate chat, generate JSON for the new chair (upload its image)
– Merge the two JSONs: keep all original objects except the armchair, and insert the new chair’s properties
– Apply the merged JSON
Result: The new chair appears in exactly the right position, with correct shadows, perspective, and integration into the scene. The AI handles the orientation transformation seamlessly.
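The merge step can also be done programmatically. Here is a minimal sketch (the `swap_object` name and the sample values are illustrative, not part of any API): keep every original object except the one being replaced, and carry the original's position over to the new item so it lands in the same spot.

```python
def swap_object(scene, old_name, replacement):
    """Return a copy of the scene with `old_name` swapped for `replacement`,
    preserving the original object's position in the room."""
    merged = dict(scene)
    merged["objects"] = []
    for obj in scene["objects"]:
        if obj["name"] == old_name:
            new_obj = dict(replacement)
            # Keep the original placement; the model reorients the new item.
            new_obj["position"] = obj.get("position", new_obj.get("position"))
            merged["objects"].append(new_obj)
        else:
            merged["objects"].append(obj)
    return merged

room = {
    "room_style": "modern living room",
    "objects": [
        {"name": "armchair", "color": "cream ivory",
         "position": {"x": 0.3, "y": 0.5, "z": 0.0}},
        {"name": "floor lamp", "color": "brass",
         "position": {"x": 0.7, "y": 0.2, "z": 0.0}},
    ],
}
# JSON extracted from the new chair's image in a separate chat (illustrative).
new_chair = {"name": "lounge chair", "color": "forest green",
             "material": "leather"}

merged = swap_object(room, "armchair", new_chair)
```

The merged JSON is then applied exactly as in Step 4.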
3. Lighting and Weather Overhauls
Scenario: Change a scene from sunny afternoon to moody rainy evening, or from day to golden hour.
Process:
– Extract lighting/weather JSON
– Modify: "weather": "sunny" → "weather": "rainy, cloudy"
– Modify: "lighting": {"color": "daylight"} → "lighting": {"color": "moody blue"}
– Apply
Caveat: If you request weather that requires showing the exterior (e.g., “rain falling outside window”), the AI may remove curtains or alter the room to make the weather visible. To avoid this, remove any “exterior visible” flags from the JSON.
Result: The room stays completely unchanged, but the lighting mood transforms entirely—shadows deepen, colors desaturate, and the atmosphere shifts to rainy or golden hour.
4. Camera Perspective Transfer
Scenario: Apply the dramatic fisheye perspective from one image to a completely different scene.
Process:
– Extract camera JSON from the source image (focal length, depth of field, angle)
– Apply that camera JSON to your target image
Result: The target scene is re-rendered with the exact same camera characteristics—extreme fisheye distortion, depth of field blur, and focal point placement. The AI hallucinates the edges appropriately to match the perspective.
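In script form, the transfer is just a copy of the camera block from one JSON to the other. A sketch, with made-up fisheye values standing in for the source image's extracted settings:

```python
# Camera JSON extracted from the fisheye source image (illustrative values).
source = {
    "camera": {"focal_length": 8, "depth_of_field": 0.9, "angle": "low, fisheye"}
}

# The target scene keeps all of its own content...
target = {
    "room_style": "modern living room",
    "camera": {"focal_length": 35, "depth_of_field": 0.3, "angle": "eye level"},
}

# ...but takes on the source's camera settings (copied, not aliased).
target["camera"] = dict(source["camera"])
```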
5. Text and Logo Replacement
Scenario: Change text in an image or swap a logo while preserving the exact typography and layout.
Process:
– Extract text/logo JSON (includes text content, font, size, color, position, spacing)
– Edit the text string or replace logo properties
– Apply
Result: The new text or logo appears in the exact same style, size, and position as the original. The AI maintains the texture and lighting consistency (e.g., bread text still looks like bread, metal logos have correct reflections).
Advantages of the JSON Method
- Far fewer hallucinations: Changes are surgical. The AI doesn’t invent new elements unless you explicitly add them to the JSON.
- Scene consistency preserved: Perspective, lighting, shadows, and spatial relationships remain intact.
- Repeatable and predictable: Same JSON input → same output (mostly). No more guessing what phrasing will work.
- Granular control: Change one property without affecting others.
- Works with complex edits: Object swaps, perspective changes, and lighting overhauls that would be impossible with natural language.
Limitations and Gotchas
- Model dependency: This works best with the Pro model. The Fast variant can still handle simple edits but may struggle with complex merges.
- Quota limits: Pro models have usage caps. You may need to switch to AI Studio if you run out.
- JSON quality varies: The initial JSON extraction might miss some objects or properties. You may need to manually add missing elements.
- Exterior visibility: When changing weather, the AI may alter the room to show the weather. Remove “exterior visible” flags to prevent this.
- Text accuracy: While good, text editing can still have occasional character errors. Double-check critical text.
Best Practices
- Start simple: Begin with color changes to learn the workflow before attempting complex object swaps.
- Use separate chats for merging: When combining JSONs from different images, use a fresh chat to merge them cleanly.
- Be explicit in JSON: Include all relevant properties—position, dimensions, material, color, rotation if needed.
- Preserve original JSON: Keep a copy of the original JSON as a fallback.
- Iterate: If the result isn’t perfect, tweak the JSON and try again. Small adjustments often fix issues.
The Bigger Picture: Why This Matters
The JSON code format represents a shift from “prompt engineering” to “prompt programming.” Instead of crafting the perfect natural language description, you’re manipulating structured data. This is more reliable, more reproducible, and more powerful.
For professionals—interior designers, product visualizers, game developers, and marketers—this technique means:
– Faster iteration cycles
– Consistent branding across assets
– Ability to create variant libraries (same scene, different colors/materials)
– Reduced reliance on manual Photoshop work
It’s a glimpse into the future of AI-assisted creation: where we don’t just prompt, we program the AI with precision.
Conclusion
The next time you need to edit an AI-generated image, don’t just tweak your prompt. Extract the JSON code, modify the specific fields, and apply it. You’ll get surgical precision, dramatically fewer hallucinations, and scene consistency that was previously impossible.
This technique transforms AI image editing from an art into a science. And the best part? It’s available right now in Gemini: no special tools or coding expertise required, just a systematic approach.
Give it a try. Once you experience the control, you’ll never go back to guesswork prompting again.