VEFX-Bench — Video Editing Benchmark

A comprehensive benchmark for evaluating video editing models across 300 videos, 9 task categories, and 3 quality dimensions.

Benchmark version: v0

Download Dataset

Download the benchmark dataset from Hugging Face. It contains 300 source videos with their corresponding editing instructions.

Dataset: https://huggingface.co/datasets/xiangbog/VEFX-Bench
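One way to fetch the dataset programmatically is through the `huggingface_hub` Python package. The sketch below is an illustration, not an official download script: the repository ID is taken from the URL above, while the function name `download_benchmark` and the default target directory `vefx_bench` are hypothetical. The file layout inside the repository is not described here, so inspect the downloaded folder before relying on any particular structure.

```python
from pathlib import Path

# Repository ID taken from the dataset URL above.
REPO_ID = "xiangbog/VEFX-Bench"


def download_benchmark(local_dir: str = "vefx_bench") -> Path:
    """Download a snapshot of the dataset repository and return its local path.

    `local_dir` is a hypothetical default; choose any writable directory.
    """
    # Imported lazily so this module loads even if huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the URL lives under /datasets/
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_benchmark()` requires network access and `pip install huggingface_hub`.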

Submission Format

After running your video editing model on all benchmark videos, package the results as follows:

1. Create a .zip file containing 300 edited videos.
2. Name the files 0000.mp4, 0001.mp4, …, 0299.mp4.
3. Each video should be the edited version of the corresponding original video in the benchmark dataset.

Tip: Ensure all 300 files are present and correctly numbered. Missing or misnamed files will receive a score of 0.
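The packaging steps above can be sketched as a small Python script that checks the file names before writing the archive. This is a minimal sketch under stated assumptions: the helper names (`expected_names`, `find_problems`, `build_submission`) are hypothetical, and placing the videos at the root of the .zip (rather than inside a subfolder) is an assumption, since the required internal layout is not specified here.

```python
import zipfile
from pathlib import Path

NUM_VIDEOS = 300  # number of benchmark videos stated above


def expected_names(n: int = NUM_VIDEOS) -> list[str]:
    """Return the required file names: 0000.mp4 through 0299.mp4 for n=300."""
    return [f"{i:04d}.mp4" for i in range(n)]


def find_problems(output_dir: Path, n: int = NUM_VIDEOS) -> tuple[list[str], list[str]]:
    """Return (missing, unexpected) file names found in output_dir."""
    present = {p.name for p in Path(output_dir).iterdir() if p.is_file()}
    expected = expected_names(n)
    missing = [name for name in expected if name not in present]
    unexpected = sorted(present - set(expected))
    return missing, unexpected


def build_submission(output_dir: Path, zip_path: Path, n: int = NUM_VIDEOS) -> Path:
    """Zip the expected videos, refusing to proceed if any are missing."""
    missing, _ = find_problems(output_dir, n)
    if missing:
        raise FileNotFoundError(f"{len(missing)} missing file(s), e.g. {missing[0]}")
    zip_path = Path(zip_path)
    # MP4 is already compressed, so the files are stored without recompression.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for name in expected_names(n):
            # Assumption: videos sit at the archive root, not in a subfolder.
            zf.write(Path(output_dir) / name, arcname=name)
    return zip_path
```

For example, `build_submission(Path("edited_videos"), Path("submission.zip"))` would raise an error instead of producing an incomplete archive, which guards against missing or misnamed files being scored as 0.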

Task Categories

The benchmark covers 9 distinct editing categories:

Attribute Editing
Camera Angle Editing
Camera Motion Editing
Creative Edit
Instance Editing
Instance Motion Editing
Quantity Editing
Style Editing
Visual Effect Editing

Evaluation Metrics

Each video is evaluated along three complementary dimensions using the VEFX-Reward model. The Overall score is the average of all three.
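As an illustration of the averaging described above, the per-video Overall score can be computed as follows. The function names are hypothetical. The second helper, which averages over all videos and counts a missing video as 0 (following the tip in the submission section), is an assumption about how a benchmark-level number might be aggregated, not a documented procedure.

```python
from typing import Optional


def overall_score(if_score: float, rq_score: float, ee_score: float) -> float:
    """Overall score for one video: the mean of IF, RQ, and EE."""
    return (if_score + rq_score + ee_score) / 3.0


def mean_overall(scores: list[Optional[tuple[float, float, float]]]) -> float:
    """Average the Overall score across videos.

    Assumption: a missing video (None) contributes 0 to the average.
    """
    if not scores:
        raise ValueError("no videos to score")
    total = 0.0
    for triple in scores:
        if triple is not None:
            total += overall_score(*triple)
    return total / len(scores)
```

For instance, a video scored IF = 4, RQ = 3, EE = 2 has an Overall score of 3.0.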

Instructional Following (IF)

Measures how well the edited video follows the editing instruction.

Range: 1-4 · Higher is better

Render Quality (RQ)

Measures the visual rendering quality of the edited video.

Range: 1-4 · Higher is better

Edit Exclusivity (EE)

Measures whether only the intended region or attribute was edited, without side effects elsewhere in the video.

Range: 1-4 · Higher is better

Scoring Rubric

Each dimension is scored on a 1–4 scale:

Score  Level      Description
4      Excellent  Fully satisfies the criterion with no noticeable issues.
3      Good       Mostly satisfies the criterion with minor shortcomings.
2      Fair       Partially satisfies the criterion with noticeable issues.
1      Poor       Fails to satisfy the criterion or has severe issues.
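For tooling that reports results, the rubric can be held as a small lookup table. This is only a sketch: the names `RUBRIC` and `describe` are hypothetical, and it covers the four integer levels only; how fractional scores map to levels is not specified here.

```python
# Rubric levels and descriptions copied from the table above.
RUBRIC = {
    4: ("Excellent", "Fully satisfies the criterion with no noticeable issues."),
    3: ("Good", "Mostly satisfies the criterion with minor shortcomings."),
    2: ("Fair", "Partially satisfies the criterion with noticeable issues."),
    1: ("Poor", "Fails to satisfy the criterion or has severe issues."),
}


def describe(score: int) -> str:
    """Return a one-line description for an integer rubric score (1 to 4)."""
    if score not in RUBRIC:
        raise ValueError(f"score must be one of {sorted(RUBRIC)}, got {score!r}")
    level, description = RUBRIC[score]
    return f"{score} ({level}): {description}"
```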
Submit Your Results