Ultra-Realistic Ghost Mask: The Exact AI Prompt Revealed
Why Split-Tone Lighting Creates Dimension in Masked Portraits
The most critical technical decision in this prompt is not the mask description—it's the environmental lighting architecture. The specification of 5600K green-tinted atmospheric haze on the left and 3200K warm sepia key light on the right establishes what cinematographers call motivated color contrast, and understanding why this works reveals how to control any portrait's dimensional depth.
Color temperature in AI image generation operates as a constraint on the model's interpretation of light sources. When you specify a temperature differential of approximately 2400K between environmental and key lighting, you force the neural network to maintain distinct color spaces rather than averaging toward neutral white. The model's training on photography includes white balance correction algorithms—which means it "knows" that mixed lighting produces color casts. By making those casts intentional and extreme, you prevent the model from retreating to the safe neutral gray that collapses images into flatness.
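The size of that 2400K gap can be made concrete. The sketch below uses the well-known Tanner Helland black-body curve fit to approximate the sRGB color of each light source—an illustrative assumption, since image models do not literally run this conversion, but it shows how far apart the two color casts sit:

```python
import math

def kelvin_to_rgb(kelvin):
    """Approximate sRGB values for a black-body source at the given color
    temperature, using the Tanner Helland curve fit (roughly 1000K-40000K)."""
    t = kelvin / 100.0
    # Red channel: saturated below 6600K, falls off above
    r = 255.0 if t <= 66 else 329.698727446 * (t - 60) ** -0.1332047592
    # Green channel: logarithmic rise below 6600K
    if t <= 66:
        g = 99.4708025861 * math.log(t) - 161.1195681661
    else:
        g = 288.1221695283 * (t - 60) ** -0.0755148492
    # Blue channel: absent at low temperatures, saturated above 6600K
    if t >= 66:
        b = 255.0
    elif t <= 19:
        b = 0.0
    else:
        b = 138.5177312231 * math.log(t - 10) - 305.0447927307
    clamp = lambda v: max(0, min(255, round(v)))
    return clamp(r), clamp(g), clamp(b)

key_light = kelvin_to_rgb(3200)  # warm sepia key: roughly (255, 184, 123)
haze = kelvin_to_rgb(5600)       # cool haze base: roughly (255, 239, 225)
print(key_light, haze)
```

The blue channel nearly doubles between the two temperatures—a spread large enough that averaging the two sources toward neutral would be a visible error, which is exactly what forces the model to keep them distinct.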
The green haze specifically references chemical warfare atmospherics established in military shooter game aesthetics. This isn't arbitrary color choice: green in this context signals toxicity, night vision interference, and environmental threat. The 5600K cool temperature prevents it from reading as supernatural or fantasy—it's environmental, not magical. When you describe haze rather than smoke, you trigger volumetric lighting calculations: light scattering through particulate matter, atmospheric perspective degradation, the softening of edges at distance.
The warm sepia key at 3200K serves the portrait function. Portrait photography historically relies on warm key lights because skin undertones (melanin, blood vessels, subcutaneous fat) reflect warm wavelengths more flatteringly. But the critical addition is the Rembrandt pattern—a specific lighting geometry in which a key light positioned 45 degrees above and to the side of the subject creates a small triangular highlight on the shadow-side cheek. This pattern is recognized across training data as "dramatic portraiture," providing the model with clear compositional intent rather than vague "dramatic lighting" that produces inconsistent results.
The Material Stack: Building Physical History Through Layered Description
The mask in the reference image succeeds because it reads as used—not merely weathered in a generic sense, but specifically damaged in ways that reveal material history. The prompt achieves this through what I call material stacking: describing surface layers in the order they would be encountered physically, with each layer's condition implying time and use.
Consider the alternative: "weathered white skull mask." This produces a surface that looks old, but uniformly so—perhaps some scratches, perhaps some dirt, but no narrative of how it became that way. The improved prompt specifies "cracked bone-white paint revealing aged black substrate." This creates a specific damage mechanism: paint failure through flexion and impact, revealing the material beneath. The model renders cracks with dimensional depth because paint has thickness; it renders the substrate as "aged black" rather than pure black because exposure and oxidation have modified the original material.
The vertical black stripe receives "dripping paint texture"—another specific damage state. Paint drips when applied thickly or when heated, suggesting either hasty field application or environmental stress. This detail prevents the stripe from reading as printed or manufactured; it reads as applied, with the irregularity of human hands.
This principle extends to the tactical gear. "Carbon fiber ear cups" provides specific material reflectance—subdued, directional, modern. "MOLLE webbing" references a specific attachment system with regular geometric spacing, preventing generic "tactical vest" interpretation that might produce fantasy armor or outdated equipment. Each material specification constrains the model to real-world references with consistent visual properties.
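Material stacking can be treated as a small data-structure problem: keep the layers in the order they would be encountered physically, and pair each material with the damage mechanism that explains its condition. This hypothetical helper (names and layer entries are illustrative) assembles such a stack into a prompt fragment:

```python
# Surface layers in outermost-to-innermost order, each paired with the
# damage mechanism or condition that implies its history.
layers = [
    ("cracked bone-white paint", "revealing aged black substrate"),
    ("vertical black stripe", "dripping paint texture"),
    ("carbon fiber ear cups", "subdued directional reflectance"),
    ("MOLLE webbing", "regular geometric spacing, field-worn edges"),
]

def stack_materials(layers):
    """Join (material, condition) pairs into one comma-separated prompt
    fragment, preserving the physical layer ordering."""
    return ", ".join(f"{material}, {condition}" for material, condition in layers)

print(stack_materials(layers))
```

Because the model reads prompts left to right, preserving this physical ordering keeps the damage narrative intact: paint before substrate, surface before attachment system.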
Optical Specificity: Why f/1.4 and Large Format Matter
Depth of field specification in prompts frequently fails because photographers and cinematographers use the term differently, and the model must resolve this ambiguity. Specifying "shallow depth of field" without optical parameters produces variable results: sometimes extreme blur, sometimes moderate separation, sometimes artificial background replacement rather than optical defocus.
The inclusion of f/1.4 provides specific optical behavior. At this aperture, full-frame lenses exhibit characteristic aberrations: cat's eye bokeh at frame edges, focus plane curvature, and longitudinal chromatic aberration (green and magenta fringing on high-contrast edges just outside the focal plane). These "imperfections" signal optical reality to viewers trained on photography. The model, recognizing this aperture reference from training data, simulates these characteristics.
But f/1.4 on a full-frame sensor produces extremely shallow focus—often too shallow for portrait work where facial features must remain sharp. The additional specification of "large format digital with medium format depth characteristics" resolves this. Large format sensors (typically 4x5 inches or larger in film equivalence) provide shallower depth of field at equivalent apertures because equivalent framing requires longer focal lengths. Medium format digital sensors (roughly 44x33mm) split the difference—shallower than full-frame, deeper than large format. This combination produces the portrait-optimal behavior: sharp eyes and mask surface, graduated falloff through the tactical gear, creamy environmental separation.
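The sensor-format claim can be checked with the standard hyperfocal-distance depth-of-field formulas. The focal lengths and circle-of-confusion values below are illustrative assumptions chosen for equivalent framing of a 2-meter portrait:

```python
def depth_of_field(focal_mm, f_number, subject_mm, coc_mm):
    """Total depth of field (mm) from the standard thin-lens approximation;
    assumes the subject sits well inside the hyperfocal distance."""
    h = focal_mm ** 2 / (f_number * coc_mm) + focal_mm  # hyperfocal distance
    near = subject_mm * (h - focal_mm) / (h + subject_mm - 2 * focal_mm)
    far = subject_mm * (h - focal_mm) / (h - subject_mm)
    return far - near

subject = 2000  # portrait distance: 2 m
# Full-frame: 85 mm at f/1.4, circle of confusion ~0.030 mm
ff = depth_of_field(85, 1.4, subject, 0.030)
# Medium format (44x33 mm): ~108 mm for equivalent framing, CoC ~0.038 mm
mf = depth_of_field(108, 1.4, subject, 0.038)
print(f"full-frame: {ff:.1f} mm, medium format: {mf:.1f} mm")
```

At these settings the full-frame combination yields roughly 45 mm of total focus depth while the medium-format equivalent yields roughly 35 mm—enough to hold eyes and mask surface sharp while the tactical gear falls off, which is the behavior the prompt is asking for.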
The "octane render material definition" specification serves a different function. Octane is a GPU-based renderer known for spectral lighting, accurate subsurface scattering, and physically-based material systems. Referencing it in the prompt signals to the model that material properties should follow physical light interaction rules—specular highlights should match surface roughness, translucency should respect material thickness, reflections should show appropriate Fresnel falloff.
The Eye Problem: Maintaining Humanity in Full Coverage Masks
Masked portraits face a specific technical challenge: the eyes are the only visible human element, yet they're partially obscured and often in shadow. Without specific instruction, the model renders eyes as generic spheres with simplified iris detail—"dead eyes" that destroy portrait engagement.
The solution in this prompt is specification of biological processes: "moisture and vascular detail." Moisture in eyes manifests as meniscus highlights on the cornea, tear film variation, and subtle reflections of environment. Vascular detail appears as fine red branching in the sclera—the white of the eye. These details signal living tissue under environmental stress. The special forces context implies alertness, fatigue, adrenaline—physiological states visible in eye tissue.
The constraint "visible around eye perimeters" prevents the model from attempting full face skin rendering that would conflict with mask coverage. It focuses texture generation on the small exposed areas, maximizing detail density where it matters.
Style Reference as Production Context
The specification "Call of Duty: Modern Warfare II promotional photography style" operates as a complete visual system reference. Unlike generic "cinematic," which spans film noir to Marvel blockbusters, this references a specific production context: military shooter game marketing from 2022, characterized by desaturated color palettes with selective saturation (the green haze), high contrast with lifted shadows, and a particular treatment of tactical gear as fetish object.
This reference provides color grading behavior—shadow tint toward blue-green, highlight rolloff that preserves detail, midtone contrast that emphasizes material texture. It provides compositional conventions—tight framing on gear, shallow depth isolating subjects from environment, atmospheric effects as framing devices. It provides lighting quality—motivated sources with visible motivation, environmental interaction with particulate matter.
For creators working in adjacent spaces—military fiction, tactical gear marketing, action cinematography—this specific reference produces more consistent results than broader genre terms. The model's training data includes substantial COD promotional material, making this a high-confidence reference.
Applying These Principles to Your Own Work
The technical architecture of this prompt—split environmental/key lighting, material stacking, optical specificity, biological detail, production context—transfers to any portrait subject requiring dimensional presence. The underlying principle: replace quality judgments with physical specifications.
When you want "dramatic lighting," specify the pattern, temperature, and quality. When you want "realistic materials," specify layer order, damage mechanisms, and aging processes. When you want "cinematic depth," specify sensor format, aperture, and focal length. The model renders physics; your job is to describe the physics you want rendered.
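This substitution rule can even be made mechanical. The hypothetical linter below (the mapping table is illustrative, drawn from the substitutions described above) flags vague quality adjectives in a draft prompt and suggests the physical parameters to specify instead:

```python
# Map vague quality judgments to the physical parameters that should
# replace them, per the principle above.
SUBSTITUTIONS = {
    "dramatic lighting": "lighting pattern (e.g. Rembrandt), color temperature in Kelvin, hard/soft quality",
    "realistic materials": "layer order, damage mechanism, aging process",
    "cinematic depth": "sensor format, aperture (f-number), focal length",
}

def lint_prompt(prompt):
    """Return (vague_phrase, suggested_specs) pairs found in the prompt."""
    lowered = prompt.lower()
    return [(phrase, spec) for phrase, spec in SUBSTITUTIONS.items()
            if phrase in lowered]

for phrase, spec in lint_prompt("masked portrait, dramatic lighting, cinematic depth"):
    print(f"replace '{phrase}' with: {spec}")
```

Run against a draft prompt before generation, this catches the adjectives most likely to produce inconsistent results and points back to the physical specifications that constrain the model.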
For additional exploration of cinematic portrait techniques, see our guides on mastering street portraits and cyberpunk character lighting. For technical background on the rendering systems referenced, Midjourney's documentation provides parameter specifications.
Key Principle: Replace quality judgments ("realistic," "dramatic," "cinematic") with physical specifications: temperatures in Kelvin, aperture values, material layers, and light patterns. The model renders physics, not adjectives.