The piece I’m currently working on features a mountainous landscape at dusk in the background. I wasn’t happy with the results I got from Stable Diffusion’s default model, v1-5-pruned-emaonly, so I decided to compare a few model / sampler combinations.

Models in this comparison:

MoistMix V1 (w/ VAE)¹
Openjourney (aka MidJourney v4)
Protogen x5.8 Rebuilt (Scifi+Anime) Official Release
Dreamlike Diffusion 1.0
Elldreth’s Lucid Mix

If you care less about the what and how then feel free to jump straight to the image results. (Don’t you wish every recipe site did that?)

Test procedure and parameters

The base prompt² was the following:

stunning epic landscape painting, giant valley of breathtaking beauty, valley surrounded by enchanted stunning mythical mysterious otherworldly majestic imposing magnificent awe-inspiring monumental breathtaking towering mountains, mountains have fractured mighty snowy peaks, mist clings to the mountains in the distance, clouds at dusk, dark blue sky, beauty of nature, grandeur, crisp, clean, dusk, after sunset, purple hues, mystical twilight, shadows, stark, eerie, silence, stillness, fading light, fading glow, chiaroscuro, beautifully lit, cinematic lighting, dramatic lighting, natural lighting, naturalism, luminism, epic composition, golden ratio, accurate, detailed

I ran this prompt with two variations. One variation included artist names at the end:

art painting style by Thomas Moran and Albert Bierstadt and Caspar David Friedrich and Daniel Ridgway Knight and Ivan Aivazovsky and Ivan Shishkin and Alexandre Calame

The other variation had just a generic “landscape painting” appended. The goal was to find out how the models would do with an artistic style close to the Hudson River School, and also what each model’s “built-in” interpretation of my prompt would be.³
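Here is a minimal sketch of how the two prompt variations could be assembled in Python. The base prompt is shortened to its first two phrases for brevity, `build_prompt` is a hypothetical helper, and both the comma-joining and the position of the trigger word (placed first here) are assumptions rather than a record of exactly how the prompts were typed in.

```python
# Shortened stand-in for the full base prompt quoted above.
BASE_PROMPT = "stunning epic landscape painting, giant valley of breathtaking beauty"

ARTISTS = [
    "Thomas Moran",
    "Albert Bierstadt",
    "Caspar David Friedrich",
    "Daniel Ridgway Knight",
    "Ivan Aivazovsky",
    "Ivan Shishkin",
    "Alexandre Calame",
]


def build_prompt(base, with_artists, trigger_word=None):
    """Return the base prompt with either the artist suffix or the generic suffix."""
    if with_artists:
        suffix = "art painting style by " + " and ".join(ARTISTS)
    else:
        suffix = "landscape painting"
    parts = [base, suffix]
    if trigger_word:
        # Assumption: the trigger word is put at the start of the prompt.
        parts.insert(0, trigger_word)
    return ", ".join(parts)


generic_prompt = build_prompt(BASE_PROMPT, with_artists=False)
artist_prompt = build_prompt(BASE_PROMPT, with_artists=True)
openjourney_prompt = build_prompt(BASE_PROMPT, with_artists=True, trigger_word="mdjrny-v4 style")
```

Keeping the suffix as the only difference between the two variations means any change in the output can be attributed to the artist names rather than to other wording.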

These models were tested with a few select samplers. Given the prompt, these samplers seemed to give the best results for landscapes:

DDIM
DPM++ 2S a
DPM++ SDE
DPM++ SDE Karras

Other settings were a CFG scale of 7.5 with 20 denoising steps. Both prompts were run on 16 seeds with each model and sampler combination, leading to a total of 640 images (2 prompts × 16 seeds × 5 models × 4 samplers).
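To double-check that image count, the test matrix can be enumerated in a few lines of Python. This is only a sketch of the bookkeeping, not the script that produced the images: the seed values are placeholders (the actual seeds are not listed here), the model names are abbreviated, and the dictionary keys are made up for illustration.

```python
from itertools import product

MODELS = [
    "MoistMix V1",
    "Openjourney",
    "Protogen x5.8",
    "Dreamlike Diffusion 1.0",
    "Elldreth's Lucid Mix",
]
SAMPLERS = ["DDIM", "DPM++ 2S a", "DPM++ SDE", "DPM++ SDE Karras"]
PROMPT_VARIANTS = ["generic", "artists"]
SEEDS = list(range(16))  # placeholder seeds, not the ones actually used


def build_jobs():
    """Enumerate every model / sampler / prompt variant / seed combination."""
    return [
        {
            "model": model,
            "sampler": sampler,
            "variant": variant,
            "seed": seed,
            "cfg_scale": 7.5,
            "steps": 20,
        }
        for model, sampler, variant, seed in product(
            MODELS, SAMPLERS, PROMPT_VARIANTS, SEEDS
        )
    ]


jobs = build_jobs()
print(len(jobs))  # 5 models * 4 samplers * 2 variants * 16 seeds = 640
```

Grouping the jobs by model and prompt variant gives 64 images per group (4 samplers × 16 seeds).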

Results

As expected, using artists in the prompt leads to somewhat comparable results across models. That not only shows in the composition, but interestingly also in the colour palette: Hudson River School was an art movement of the mid-19th century, and as it happens, old oil paintings often suffer from yellowing. As a result, Stable Diffusion’s idea of those artists’ paintings is “should have yellowish tones”, even though the original paintings were rather full of lush greens. Now, if you are using Stable Diffusion in a photo editor (e.g. GIMP with Stable Boy) then it’s fairly easy to correct colours, but it’s something to be aware of nonetheless.
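If you would rather script a rough correction than open an editor, one simple option is a gray-world white balance. The sketch below is not something this comparison used. It assumes the image is already available as a list of (R, G, B) tuples in the 0 to 255 range, and `gray_world_balance` is a made-up name. Note that gray-world assumes the scene averages out to neutral gray, so it can overcorrect images that are meant to be warm or purple-tinted, like a dusk sky.

```python
def gray_world_balance(pixels):
    """Rescale R, G and B so that each channel's mean matches the overall mean."""
    n = len(pixels)
    if n == 0:
        return []
    # Mean value of each colour channel across the whole image.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3
    # Per-channel gain; leave a channel untouched if it is entirely zero.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255, max(0, round(p[c] * gains[c]))) for c in range(3))
        for p in pixels
    ]


# Two yellowish pixels: strong red and green, weak blue.
yellowish = [(200, 180, 100), (220, 200, 120)]
balanced = gray_world_balance(yellowish)
```

After balancing, the three channel means of the sample are roughly equal, which pulls the yellow cast back towards neutral.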

Double horizons

It is worth noting that many generated images have two or more horizons layered above one another, with mountains of various sizes floating in between. This reminds me of the difficulty of generating anatomically correct hands: the individual mountains look good, but the problem is that they can fit anywhere in the picture when surrounded by fog or clouds. From the perspective of a diffusion process it “looks right”.

Some models are certainly more likely to generate double horizons than others, depending on the characteristics of the images they were trained on.

Leaving out artists from the prompt seemed to be a better approach to get a feel for each model’s own “style”. MoistMix very much respected the “valley” in the prompt and barely had any double horizons. Openjourney produced desaturated / monochrome landscapes with an emphasis on fog and mist, while staying detailed, crisp, and photorealistic. Dreamlike Diffusion produced highly saturated images (and really took the “purple hues” part of the prompt to heart), while Elldreth’s Lucid Mix produced dreamy images with imposing-looking mountains lit by the setting sun. Protogen’s landscapes varied in contrast, but for the most part were lacking in detail.

In the grids below you’ll also notice that samplers DPM++ SDE and DPM++ 2S a lead to images with higher contrast and saturation compared to the other samplers used in this comparison.

Here are the generated grids, click for full size (3648×8600 px):

Imgur on mobile

You can download all full size grid images from the Imgur album. Unfortunately, if you are reading this post on a mobile, Imgur will redirect you to their mobile site, which will only have lower resolution images. Best to download them from the desktop site. ¯\_(ツ)_/¯

Model	Generic prompt	Prompt w/ Hudson River School artists
Dreamlike Diffusion
Elldreth’s Lucid Mix
MoistMix
Openjourney
Protogen

Conclusion

While I love the eerie, photorealistic landscapes of Openjourney, MoistMix (without artists in the prompt) is the one that comes closest to the atmosphere and style of mountains that I was looking for. Elldreth’s Lucid Mix gets an honourable mention.

Makes you wonder: did Openjourney with its trigger word in the prompt have an advantage in this comparison? Did MoistMix with its VAE? Honestly I haven’t tested MoistMix without VAE. Should be easy enough to compare results with the same settings, but maybe that’s for another time.

I hope you found this post informative and useful. If so, drop me a line in the comments and let me know. I would love to hear your thoughts and feedback. And while we’re at it: which Stable Diffusion models would you recommend? I’m always on the lookout for new and exciting tips. Thanks!

¹ What’s a VAE?
² Prompts for Openjourney included the model’s trigger word “mdjrny-v4 style”.
³ Prompts for MoistMix were run with its VAE loaded.