Eval with 6 Different Models

Dataset Summary

Evaluation of six different image generation models on the same prompt. Each evaluation includes written reasoning from content experts.

Example

Prompt: A realistic baby chimpanzee hanging from a branch in a misty rainforest, cinematic lighting, highly detailed fur, natural background, photorealistic style.

1. Great expression, fur rendering, and dramatic lighting. Some minor oddities in the fingers on the feet, but overall very believable.

2. Natural motion and body proportions; decent integration into the background. The left foot looks a bit anatomically uncanny.

3. Lighting and foliage are great, but the monkey’s fur is too flat, and the fingers on the feet look unrealistic.

4. Anatomical accuracy, fur texture, and lighting look nearly photorealistic. The face looks wrong, and the left foot is proportionally too large.

5. Awkward limb anatomy breaks realism. One hand missing. The background looks unnaturally filled with greenery.

6. Highly stylized to the point of looking like a bad oil painting. Skin texture and face are clearly artificial. Extra fingers and incorrect anatomy.
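
For readers who want to work with the expert reasoning programmatically, the following is a minimal sketch of one way to hold a few of the comments above as records and search them by keyword. The class name `ExpertNote`, its field names, and the helper `entries_mentioning` are hypothetical and chosen only for illustration; they are not the dataset's actual schema. Only entries 4, 5, and 6 are included, with their comments copied verbatim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpertNote:
    # Hypothetical record layout; the real dataset columns may differ.
    entry: int    # number of the entry in the example above (1-6)
    comment: str  # expert reasoning, copied verbatim


NOTES = [
    ExpertNote(
        4,
        "Anatomical accuracy, fur texture, and lighting look nearly "
        "photorealistic. The face looks wrong, and the left foot is "
        "proportionally too large.",
    ),
    ExpertNote(
        5,
        "Awkward limb anatomy breaks realism. One hand missing. The "
        "background looks unnaturally filled with greenery.",
    ),
    ExpertNote(
        6,
        "Highly stylized to the point of looking like a bad oil painting. "
        "Skin texture and face are clearly artificial. Extra fingers and "
        "incorrect anatomy.",
    ),
]


def entries_mentioning(notes, term):
    """Return entry numbers whose comment contains `term` (case-insensitive)."""
    needle = term.lower()
    return [n.entry for n in notes if needle in n.comment.lower()]


if __name__ == "__main__":
    print(entries_mentioning(NOTES, "anatomy"))
    print(entries_mentioning(NOTES, "foot"))
```

A plain substring match is used here for simplicity; it treats "anatomy" and "anatomical" as different terms, so a broader search would need stemming or a list of related keywords.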