Tencent Open Sourced a 3D World Model

https://news.ycombinator.com/rss Hits: 15
Summary

涓枃闃呰 We introduce HunyuanWorld-Voyager, a novel video diffusion framework that generates world-consistent 3D point-cloud sequences from a single image with user-defined camera path. Voyager can generate 3D-consistent scene videos for world exploration following custom camera trajectories. It can also generate aligned depth and RGB video for efficient and direct 3D reconstruction. Sep 2, 2025: 馃憢 We release the code and model weights of HunyuanWorld-Voyager. Download. Join our Wechat and Discord group to discuss and find help from us. Wechat Group Xiaohongshu X Discord 馃帴 Demo Demo Video demo.mp4 Camera-Controllable Video Generation Input Generated Video output.mp4 output7.mp4 output9.mp4 Multiple Applications Video Reconstruction Generated Video Reconstructed Point Cloud output1.mp4 output2.mp4 Image-to-3D Generation output5.mp4 output11.mp4 Video Depth Estimation depth.mp4 depth2.mp4 鈽笍 HunyuanWorld-Voyager Introduction Architecture Voyager consists of two key components: (1) World-Consistent Video Diffusion: A unified architecture that jointly generates aligned RGB and depth video sequences, conditioned on existing world observation to ensure global coherence. (2) Long-Range World Exploration: An efficient world cache with point culling and an auto-regressive inference with smooth video sampling for iterative scene extension with context-aware consistency. To train Voyager, we propose a scalable data engine, i.e., a video reconstruction pipeline that automates camera pose estimation and metric depth prediction for arbitrary videos, enabling large-scale, diverse training data curation without manual 3D annotations. Using this pipeline, we compile a dataset of over 100,000 video clips, combining real-world captures and synthetic Unreal Engine renders. Performance Quantitative comparison on WorldScore Benchmark. 馃敶 indicates the 1st, 馃煝 indicates the 2nd, 馃煛 indicates the 3rd. 
Method WorldScore Average Camera Control Object Control Content Alignment 3D Consistency Photome...
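
The summary notes that aligned depth and RGB video allows direct 3D reconstruction. As a rough illustration of why, the sketch below back-projects a single depth map into a point cloud using the standard pinhole camera model. This is not Voyager's code: the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions introduced here for illustration, and a real pipeline would also apply each frame's camera pose to place the points in a shared world frame.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into camera-space 3D points (pinhole model).

    Hypothetical helper for illustration only; it is not part of the
    HunyuanWorld-Voyager codebase. `depth[v][u]` is the depth along the
    optical axis at pixel column u, row v. Non-positive depths are treated
    as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel and unit focal length.
toy_depth = [
    [1.0, 0.0],
    [2.0, 2.0],
]
toy_points = backproject_depth(toy_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Because every valid pixel maps to exactly one 3D point, a depth video that is pixel-aligned with its RGB frames yields a colored point cloud per frame without a separate reconstruction step.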

First seen: 2025-09-03 11:55

Last seen: 2025-09-04 01:58