Image Generation — Methodology

Purpose
We create a fair, repeatable way to compare image-generative AI models side by side.

Test Setup
• Output size: 1024 × 1024 (1:1 square).
• Why square: neutral for both portrait and landscape use cases.
• Prompts: a rotating pool of everyday scenes (people, products, environments, text).
• Seeds: for each scene/model pair, we generate one image from each of 5 random seeds and select the single best result.
• Settings: each model's highest-quality settings that run reliably on our setup.
• Post-processing: none (no upscaling, inpainting, face fixers, or manual edits). The only permitted adjustment is a center-crop to 1024 × 1024, if required.
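As a rough illustration of the setup above, the sketch below draws five random seeds and computes a center-crop box for a 1024 × 1024 target. The function names, the 32-bit seed range, and the choice to reject images smaller than the target are assumptions made for this example; they are not taken from our actual tooling.

```python
import random

NUM_SEEDS = 5   # five random seeds per scene/model, as described above
TARGET = 1024   # edge length of the square output, in pixels


def draw_seeds(rng: random.Random, count: int = NUM_SEEDS) -> list[int]:
    """Draw `count` distinct random seeds (the 32-bit range is an assumption)."""
    return rng.sample(range(2**32), count)


def center_crop_box(width: int, height: int, target: int = TARGET) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a centered target x target crop."""
    if width < target or height < target:
        # Assumption: no upscaling is allowed, so a smaller image cannot be cropped.
        raise ValueError("image is smaller than the crop target")
    left = (width - target) // 2
    top = (height - target) // 2
    return (left, top, left + target, top + target)
```

The returned box uses the (left, top, right, bottom) order that common image libraries expect for a crop call, so an image already at 1024 × 1024 yields a box covering the full frame and is left unchanged.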

Best-of-Five Selection
We review the five random-seed outputs for each scene/model pair and pick the best one. We record and display the winning seed, and keep the other four seeds on file for audits and reproduction. Selection balances prompt adherence, clarity, realism, and minimal artifacts. Tie-breakers favor cleaner subject integrity and lighting.
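The record-keeping described above could be captured in a small structure like the one below. This is a minimal sketch: the `SelectionRecord` type, its field names, and the `record_selection` helper are hypothetical, and the choice of winner is still made by a human reviewer and simply passed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionRecord:
    scene: str
    model: str
    winning_seed: int              # displayed with the published result
    other_seeds: tuple[int, ...]   # kept on file for audit and reproduction


def record_selection(scene: str, model: str, seeds: list[int], winning_seed: int) -> SelectionRecord:
    """Store the reviewer's pick together with the four seeds that were not chosen."""
    if len(set(seeds)) != 5:
        raise ValueError("expected five distinct seeds")
    if winning_seed not in seeds:
        raise ValueError("winning seed must be one of the five generated seeds")
    others = tuple(s for s in seeds if s != winning_seed)
    return SelectionRecord(scene, model, winning_seed, others)
```

Keeping all five seeds in one record means any published result can be regenerated and compared against the alternatives that were passed over.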

Scoring Characteristics (Images)
• Prompt adherence
• Ground appearance (contact with ground, shadows/reflections)
• Lighting & shadows
• Human/subject integrity (faces, hands, proportions; or object geometry)
• Text fidelity (when applicable)
• Style control
• Artifacts check (extra fingers, warps, banding, smudged text)

Video Generation — Methodology

Purpose
We compare video-generative AI models using a consistent pipeline and objective checks.

Hardware & Output
• Hardware: videos are generated through ComfyUI on an NVIDIA RTX 5070 Ti (16 GB VRAM). Generation time depends on this GPU.
• Output target: up to 720p resolution at 24 frames per second.
• Clip duration: up to 5 seconds per clip.
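To make the output target concrete, the sketch below expresses it as a small configuration object and derives the frame budget from it: 24 frames per second over 5 seconds gives at most 120 frames. The `VideoTarget` class and its `allows` check are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoTarget:
    max_height: int = 720   # "up to 720p"
    fps: int = 24           # frames per second
    max_seconds: int = 5    # maximum clip duration

    def max_frames(self) -> int:
        """Largest frame count that fits the duration limit at the target frame rate."""
        return self.fps * self.max_seconds

    def allows(self, height: int, fps: int, frame_count: int) -> bool:
        """Check whether a clip fits within the shared comparison target."""
        return (
            height <= self.max_height
            and fps == self.fps
            and frame_count <= self.max_frames()
        )
```

For example, `VideoTarget().allows(720, 24, 120)` holds for a full-length clip, while a 1080-pixel-high clip would fall outside the shared target.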

Generation Process
• Scenes: a rotating pool of everyday scenarios (people, motion, objects, environment, camera moves).
• Seeds: for each scene/model pair, we generate one clip from each of 5 random seeds and select the single best result.
• Settings: each model's highest-quality settings that run reliably on our hardware.
• Post-processing: none (no interpolation, upscaling, color grading, or manual edits). Clips may be trimmed for timing only.

Best-of-Five Selection
We compare the five seed outputs per scene/model and choose the strongest clip. We record and display the winning seed with the published result (and keep the other four seeds on file). Selection balances motion quality, realism, and prompt match, with the fewest artifacts.

Scoring Characteristics (Videos)
• Prompt adherence
• Realism (lighting, materials, physics—or faithful stylization)
• Motion consistency (smooth frame-to-frame, low wobble/jitter)
• Subject integrity (faces/limbs/objects stay coherent)
• Ground appearance (feet/tires planted; correct contact shadows/reflections)
• Camera behavior (stable pans/zooms/dollies)
• Artifacts check (flicker, banding, ghosting, trajectory drift, text melt)

Notes on Fairness
• Prompts, categories, and output specs are kept consistent across models.
• We do not edit outputs to make them look better. Choosing the best of five random seeds keeps testing practical while still showing what each model can do.
• If a model supports higher native limits, we document that separately, but comparisons default to the shared target above for apples-to-apples results.
