Midjourney 3D Frames Done Right - My Process

AI Prompt Asset
Commercial photography of a fit young woman with messy brown ponytail sitting inside a giant physical 3D Instagram post frame prop, wearing dusty rose Nike cropped t-shirt and matching athletic shorts with chunky white Nike Air Force style sneakers, the frame is thick glossy white plastic with rounded corners and realistic UI details: circular profile picture with woman's face, username "June" in bold sans-serif, blue "Follow" button with rounded corners, heart icon showing 785 likes in red, comment bubble outline, paper plane share icon, bookmark icon at bottom right, frame interior displays moody industrial gym with hazy blue volumetric light cutting through atmospheric steam and visible overhead fluorescent tubes, outside the frame is seamless medium-gray cyclorama studio backdrop with soft gradient, subject lit by large softbox key light from camera left creating subtle shadow on right cheek, rim light from behind separating hair from dark gym background, shallow depth of field with frame edges slightly soft, 8k, photorealistic, advertising campaign aesthetic, shot on Sony A7R V with 85mm f/1.4 lens --ar 1:1 --style raw --v 6.1


The Spatial Problem of Containment

Creating a convincing 3D social media frame requires solving a fundamental spatial paradox: the subject must appear physically contained within a constructed object while that object simultaneously functions as a window into a different environment. Most attempts at this technique fail because they describe the frame as a graphic element rather than a physical one, or because they allow environmental conditions to bleed across boundaries that should remain distinct.

The breakthrough lies in understanding how diffusion models construct space. Midjourney does not render true 3D geometry with occlusion and lighting physics. Instead, it assembles visual patterns based on semantic relationships and spatial descriptors in your prompt. Write "Instagram frame" and the model tends to draw on associations with flat UI screenshots and digital overlays. Write "giant physical 3D prop" and it draws instead on sculpture, installation art, and set photography: a different visual vocabulary that favors dimensional results.

This distinction between graphic and physical treatment determines every subsequent element. A graphic frame has no thickness, casts no shadow on its supporting surface, and interacts with light as a flat plane. A physical prop has volume, creates contact shadows where it meets the floor, and exhibits material properties like edge highlights, surface reflections, and subtle color bleeding from environmental bounce light. The prompt must construct this physicality explicitly because the default interpretation leans toward digital flatness.

Building the Frame as Object

Material specification anchors the frame in physical reality. "Thick glossy white plastic" provides three critical pieces of information: dimensional presence (thick), surface interaction (glossy), and material category (plastic). "Thick" prevents the wireframe-thin borders that appear when frame edges are unspecified. "Glossy" invites the specular highlights that the eye reads as curved or rounded geometry. "Plastic" sets the expected reflectivity: softer, broader highlights than polished metal, but a crisper, more defined surface than fabric.

The rounded corners matter beyond aesthetics. Sharp 90-degree corners in generated imagery often appear truncated, aliased, or visually unstable because the model struggles with extreme geometric precision. "Rounded corners" produces more reliable geometry while simultaneously signaling contemporary UI design language. This bridges the physical prop and digital reference without contradiction.

Scale specification must appear early in the prompt structure. The prompt's opening clause names the subject and immediately places her "sitting inside a giant physical 3D Instagram post frame prop," which establishes relative proportions before the model commits to a composition; the frame's material details follow. Late or implicit scale references often produce frames that dwarf the subject, or subjects awkwardly cropped by undersized borders. The word "giant" functions as an explicit scale anchor.
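One way to keep this ordering deliberate is to assemble the prompt from named segments. The sketch below is only an illustration: `build_prompt`, `SEGMENT_ORDER`, and the segment names are hypothetical, not part of any Midjourney tooling. It joins segments in a fixed order so the subject and the scale anchor always come before the material details, whatever order you wrote them in.

```python
# Hypothetical helper: emit prompt segments in a fixed order so the
# subject and the scale anchor ("giant") always come first.
SEGMENT_ORDER = ["subject_and_scale", "frame_material", "interior", "exterior"]


def build_prompt(segments: dict[str, str]) -> str:
    """Join the named segments in SEGMENT_ORDER, separated by commas."""
    missing = [name for name in SEGMENT_ORDER if name not in segments]
    if missing:
        raise ValueError(f"missing segments: {missing}")
    return ", ".join(segments[name].strip() for name in SEGMENT_ORDER)


# Segments deliberately listed out of order; the text is taken from the prompt.
segments = {
    "exterior": "outside the frame is seamless medium-gray cyclorama studio backdrop",
    "frame_material": "the frame is thick glossy white plastic with rounded corners",
    "interior": "frame interior displays moody industrial gym",
    "subject_and_scale": (
        "Commercial photography of a fit young woman sitting inside "
        "a giant physical 3D Instagram post frame prop"
    ),
}

prompt = build_prompt(segments)
print(prompt)
```

Keeping the subject and the scale word in a single segment avoids a stray comma between "sitting inside" and the frame description.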

Environmental Separation and Lighting Zones

The most technically challenging aspect of this composition is maintaining distinct environmental conditions inside and outside the frame. Without explicit separation, Midjourney produces muddy spatial results: the studio background acquires atmospheric haze from the gym interior, or the industrial lighting bleeds onto the cyclorama, collapsing the dimensional illusion.

The solution is zone-specific lighting specification. The exterior environment—"seamless medium-gray cyclorama studio"—requires clean, controlled studio lighting that reads as photographic infrastructure. The interior environment—"moody industrial gym with hazy blue volumetric light"—demands atmospheric treatment with particulate matter (steam) scattering light into visible rays. These systems must be described as independent rather than continuous.

The rim light specification serves a critical compositional function. By placing light "from behind" the subject, you create luminance separation between the figure and the dark gym background. Without this edge definition, the subject's hair and shoulders visually merge with the interior scene, destroying the frame's portal effect. The rim light is not decorative; it maintains figure-ground hierarchy across the depth separation.

Color temperature differentiation reinforces spatial zones. Warm skin tones under neutral studio lighting contrast with the cool blue volumetric interior, creating intuitive depth cues. When both zones share similar color temperatures, the frame reads as a transparent overlay rather than a dimensional aperture.
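The zone-by-zone approach can be made concrete by modeling each zone as its own record and rendering each one separately. This is a minimal sketch under assumptions: the `LightingZone` class and the `zones_are_independent` check are hypothetical names invented for illustration, and the check is a crude string comparison, not a guarantee about what the model renders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LightingZone:
    """One spatial zone of the composition (illustrative structure only)."""
    location: str     # phrase that places the zone relative to the frame
    environment: str  # what the zone contains
    lighting: str     # how the zone is lit or finished

    def describe(self) -> str:
        return f"{self.location} {self.environment} with {self.lighting}"


def zones_are_independent(a: LightingZone, b: LightingZone) -> bool:
    """Crude check: two zones must not share a location or lighting phrase."""
    return a.location != b.location and a.lighting.lower() != b.lighting.lower()


interior = LightingZone(
    location="frame interior displays",
    environment="moody industrial gym",
    lighting="hazy blue volumetric light cutting through atmospheric steam",
)
exterior = LightingZone(
    location="outside the frame is",
    environment="seamless medium-gray cyclorama studio backdrop",
    lighting="soft gradient",
)

zone_text = ", ".join(zone.describe() for zone in (interior, exterior))
print(zone_text)
```

Rendered this way, the two zone descriptions reproduce the corresponding wording in the prompt, and each zone's lighting phrase stays attached to its own location phrase.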

UI Specification as Design System

Rendering legible, correctly positioned interface elements requires treating the UI as a designed system rather than a feature list. The model's text generation capabilities remain limited; it excels at visual patterns but struggles with arbitrary character strings. Strategic specification works within these constraints.

Username selection matters. "June" is short, common, and uses letterforms that the model renders reliably. Complex or unusual names produce character artifacts. The specification "bold sans-serif" provides visual weight without requiring precise font identification that the model cannot execute.

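Icon specification benefits from color and shape specificity. A heart icon with the count "in red" activates a more established visual pattern than an abstract "like button." "Outline" marks the comment bubble as an unfilled glyph without requiring a precise glyph description. Naming the familiar elements (circular profile picture, username, "Follow" button, heart, comment bubble, paper plane, bookmark) lets the model fall back on the Instagram layout it has seen in countless interface screenshots: profile and username top-left, follow button top-right, interaction icons along the bottom. The prompt states only one position explicitly, the bookmark at bottom right.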

The like count "785" functions as texture rather than information. The model processes numerical strings as visual patterns; specific numbers produce more consistent results than variable text. Three digits fit the expected visual space without overflow or compression artifacts.
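These text constraints can be turned into a quick pre-flight check. The sketch below is a hypothetical lint, not a Midjourney feature: the function name `check_ui_text` and the thresholds (six letters, three digits) are assumptions chosen to match the "June" and "785" examples, not documented limits of the model.

```python
def check_ui_text(
    username: str,
    like_count: int,
    max_name_len: int = 6,  # assumed threshold, tune to taste
    max_digits: int = 3,    # assumed threshold, matches "785"
) -> list[str]:
    """Return warnings for UI text that may render poorly (heuristic only)."""
    warnings = []
    if not (username.isascii() and username.isalpha()):
        warnings.append("username contains characters other than plain letters")
    if len(username) > max_name_len:
        warnings.append(f"username is longer than {max_name_len} characters")
    if len(str(abs(like_count))) > max_digits:
        warnings.append(f"like count has more than {max_digits} digits")
    return warnings


print(check_ui_text("June", 785))
print(check_ui_text("Xx_Qu33n_xX", 1284503))
```

The first call returns an empty list; the second trips all three heuristics.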

Camera and Post-Processing Parameters

The "shot on Sony A7R V with 85mm f/1.4 lens" specification serves multiple technical purposes. The 85mm focal length produces natural perspective compression that flatters the figure without the distortion of wider angles. The f/1.4 aperture creates shallow depth of field that softens frame edges slightly, reinforcing the subject as focal point while maintaining frame readability.
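For a sense of how shallow that depth of field would be on a real lens, the standard thin-lens approximation can be worked through. This is a sketch under stated assumptions: the 3 m subject distance and the 0.03 mm circle of confusion (a common full-frame convention) are not from the prompt, and Midjourney does not simulate optics, so the camera phrase acts as a style cue, not a physical setting.

```python
import math


def depth_of_field_mm(
    focal_mm: float,
    f_number: float,
    subject_mm: float,
    coc_mm: float = 0.03,  # assumed circle of confusion for full frame
) -> tuple[float, float]:
    """Return (near, far) limits of acceptable sharpness in millimetres."""
    hyperfocal = focal_mm**2 / (f_number * coc_mm) + focal_mm
    near = subject_mm * (hyperfocal - focal_mm) / (hyperfocal + subject_mm - 2 * focal_mm)
    if subject_mm >= hyperfocal:
        far = math.inf
    else:
        far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far


# 85mm at f/1.4 with the subject an assumed 3 m from the camera.
near, far = depth_of_field_mm(focal_mm=85, f_number=1.4, subject_mm=3000)
print(f"sharp from {near:.0f} mm to {far:.0f} mm ({far - near:.0f} mm total)")
```

Under these assumptions the sharp zone is roughly 10 cm deep, which is consistent with a seated subject in focus and frame edges that fall slightly soft.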

The "--style raw" parameter is essential for this prompt type. Midjourney's default styling applies aesthetic adjustments that often push colors and contrast beyond photographic realism. For product and commercial photography, the raw style reduces that default treatment and keeps the controlled, calibrated look of professional advertising imagery. The "--ar 1:1" parameter sets a square aspect ratio, which reinforces the Instagram reference while providing balanced composition space.
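When a prompt is stored or reused, it can help to separate the descriptive text from the trailing parameters so each can be edited on its own. The helper below is a hypothetical sketch (`split_parameters` is not a Midjourney API). It assumes every parameter starts with " --" and appears after the descriptive text, as in the prompt above.

```python
def split_parameters(prompt: str) -> tuple[str, dict[str, str]]:
    """Split a prompt into its descriptive text and its trailing --parameters."""
    text, *raw_params = prompt.split(" --")
    params = {}
    for raw in raw_params:
        # "style raw" -> name "style", value "raw"; flags without a value get "".
        name, _, value = raw.strip().partition(" ")
        params[name] = value.strip()
    return text.strip(), params


text, params = split_parameters(
    "shot on Sony A7R V with 85mm f/1.4 lens --ar 1:1 --style raw --v 6.1"
)
print(text)
print(params)
```

For the tail of the prompt used here, this yields the camera phrase as text and a dictionary with the `ar`, `style`, and `v` values.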

Related techniques for controlled product photography appear in our organic product photography guide, which explores studio lighting hierarchies in depth. For understanding how material specification affects rendering, the porcelain bust prompt breakdown demonstrates surface property construction. The broader context of commercial photography approaches is covered in Midjourney's official documentation.

Conclusion

Successful 3D frame prompts require architectural thinking: constructing distinct spatial zones with specific material, lighting, and atmospheric properties, then maintaining their separation through explicit description. The frame is not a border but a physical object; the interior is not a background but a contained environment. This conceptual clarity produces the dimensional, commercially viable results that flat graphic approaches cannot achieve.

Label: Product

Key Principle: Treat the frame as a physical prop with material properties and the interior as a separate environment with independent lighting. Environmental separation prevents the most common failure: flat, confused space where frame and content visually collapse.