I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength

🎉🎉🎉 ICLR 2025 🎉🎉🎉

1 ByteDance China     2 University of Science and Technology of China (USTC)     3 Institute of Automation, Chinese Academy of Sciences (CASIA)


🌼 Overview

We propose I2VControl-Camera, a novel camera control method for image-to-video generation, offering high control precision and adjustable motion strength.

Teaser Image

💡 Method and Analysis

Users can control the camera movement with high precision: the camera movement is converted into point trajectories, which then drive the control process. We lift the input image from 2D to 3D as an RGBD point cloud. When the camera moves, the 3D points can be regarded as moving in the camera coordinate system. We then project them onto the 2D image plane according to the current camera pose to obtain the 2D point trajectories.
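The lift-and-project step described above can be illustrated with a minimal sketch. It is not the released implementation: it assumes a pinhole camera with intrinsics `(fx, fy, cx, cy)`, a per-pixel depth map, and one 3x4 `[R | t]` extrinsic per frame that maps first-camera coordinates to that frame's camera coordinates. All function names are hypothetical.

```python
def unproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the first camera's frame."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(point, extrinsic):
    """Apply a 3x4 [R | t] matrix (first-camera frame -> current-camera frame)."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsic)


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def point_trajectories(depth_map, intrinsics, extrinsics):
    """Return {(u, v): [(u_t, v_t) for each frame]} for every pixel of the input image.

    depth_map  : list of rows of depth values (assumed to come from a depth estimator)
    intrinsics : (fx, fy, cx, cy), assumed known
    extrinsics : list of 3x4 matrices, one per output frame
    """
    fx, fy, cx, cy = intrinsics
    tracks = {}
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            p = unproject(u, v, d, fx, fy, cx, cy)
            tracks[(u, v)] = [
                project(transform(p, E), fx, fy, cx, cy) for E in extrinsics
            ]
    return tracks
```

With an identity extrinsic every pixel maps back to itself. A translation of the points by `t_x` along the camera x-axis shifts a pixel at depth `d` horizontally by `fx * t_x / d`, so nearer points move further across the image, which is the parallax the trajectories encode.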

Pipeline Image

Moreover, we apply a scalar value to control the motion strength of the subjects in the video, which is decoupled from the camera movement.

The following figure shows two samples: the top one demonstrates a pan-left camera movement, while the bottom one shows the camera sliding to the right. For each sample, we show a preview (the RGBD point cloud rendered directly onto the 2D image plane according to the extrinsic matrix) alongside our generated result. The generated result follows the control signal almost at the pixel level (see the green boxes), even when a movable object is present (the cat in the red box).

Pixel Image

We test the same camera control signal with different motion strength values. When the motion strength is set to 0, the entire scene is nearly static even though the image contains movable objects (polar bear, astronaut, wolf); when the motion strength is large, the main objects move noticeably.

Adjust Image

🔥 Pixel-level Control & Visual Comparisons

We show our camera control results alongside the ground-truth preview, which demonstrates our pixel-level control capability. We also list the results of the compared methods for qualitative comparison. Our control precision is significantly higher than that of the compared methods.

Input & GT Preview CameraCtrl MotionCtrl Ours

🔥 Combinations of multiple camera movements

The following samples contain combinations of multiple camera movements.

Input & GT Preview CameraCtrl MotionCtrl Ours
move left + pan right
rotate + move up + tilt down
rotate + zoom in
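A combined movement such as "move left + pan right" can be written as a single sequence of extrinsics by setting the camera center and the camera rotation independently at each frame. The sketch below is only an illustration, not how the page's samples were produced. It assumes a camera frame with x to the right, y down, and z forward, a world frame equal to the first camera's frame, and linear interpolation over time. The function names are hypothetical.

```python
import math


def pan_rotation(theta):
    """Camera-to-world rotation for panning right by theta radians
    (assumed axes: x right, y down, z forward)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def world_to_camera(rotation_c2w, center):
    """Build the 3x4 [R | t] extrinsic from a camera-to-world rotation and a camera center."""
    # R is the transpose of the camera-to-world rotation; t = -R @ center.
    R = [[rotation_c2w[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(R[i][j] * center[j] for j in range(3)) for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)]


def move_left_pan_right(num_frames, distance, angle):
    """Linearly interpolate a 'move left + pan right' path over num_frames frames."""
    extrinsics = []
    for k in range(num_frames):
        a = k / (num_frames - 1) if num_frames > 1 else 0.0
        center = [-a * distance, 0.0, 0.0]  # camera center slides toward -x (left)
        extrinsics.append(world_to_camera(pan_rotation(a * angle), center))
    return extrinsics
```

Each resulting matrix maps first-camera coordinates into the frame-k camera, so the sequence can be fed to any routine that projects the RGBD point cloud under a per-frame camera pose.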

🔥 Multiple dynamic objects

The following samples contain multiple dynamic objects, where our method can still achieve precise control and natural dynamics.

Input & GT Preview CameraCtrl MotionCtrl Ours

🔥 Multiple motion strength

We show results under different motion strengths. As the motion strength increases, the amplitude of the motion grows, showing a direct positive correlation with the motion strength value.

Input & GT Preview MS=0 MS=200 MS=600

🔥 Experiment on DiT base model (Seaweed)

We present some results on another base model, Seaweed; these results demonstrate that our method is applicable beyond a single base model.

Pan Zoom Tilt Rotate

Citation

@inproceedings{i2vcontrolcamera,
    title={I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength},
    author={Feng, Wanquan and Liu, Jiawei and Tu, Pengqi and Qi, Tianhao and Sun, Mingzhen and Ma, Tianxiang and Zhao, Songtao and Zhou, Siyu and He, Qian},
    booktitle={The Thirteenth International Conference on Learning Representations (ICLR)},
    year={2025}
}