Tencent Open Sourced a 3D World Model

https://news.ycombinator.com/rss Hits: 15

Summary

We introduce HunyuanWorld-Voyager, a novel video diffusion framework that generates world-consistent 3D point-cloud sequences from a single image with a user-defined camera path. Voyager can generate 3D-consistent scene videos for world exploration following custom camera trajectories. It can also generate aligned depth and RGB video for efficient and direct 3D reconstruction.

Sep 2, 2025: We release the code and model weights of HunyuanWorld-Voyager.

Demo

The demos cover camera-controllable video generation (an input image and the generated video) and three applications: video reconstruction (a generated video and its reconstructed point cloud), image-to-3D generation, and video depth estimation.

HunyuanWorld-Voyager Introduction

Architecture

Voyager consists of two key components:

(1) World-Consistent Video Diffusion: A unified architecture that jointly generates aligned RGB and depth video sequences, conditioned on existing world observations to ensure global coherence.

(2) Long-Range World Exploration: An efficient world cache with point culling, combined with auto-regressive inference and smooth video sampling, for iterative scene extension with context-aware consistency.

To train Voyager, we propose a scalable data engine, i.e., a video reconstruction pipeline that automates camera pose estimation and metric depth prediction for arbitrary videos, enabling large-scale, diverse training data curation without manual 3D annotations. Using this pipeline, we compile a dataset of over 100,000 video clips, combining real-world captures and synthetic Unreal Engine renders.

Performance

Quantitative comparison on the WorldScore Benchmark. 🔴 indicates the 1st, 🟢 indicates the 2nd, 🟡 indicates the 3rd.
Method WorldScore Average Camera Control Object Control Content Alignment 3D Consistency Photome...
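
The summary says Voyager produces aligned depth and RGB video so that 3D reconstruction can be done directly. As a rough illustration of why aligned depth makes this straightforward, here is a minimal sketch that back-projects one depth frame into a colored point cloud using a standard pinhole camera model. This is not code from the HunyuanWorld-Voyager repository: the function name `unproject_rgbd` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
ColoredPoint = Tuple[float, float, float, Color]


def unproject_rgbd(
    depth: List[List[float]],
    rgb: List[List[Color]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[ColoredPoint]:
    """Back-project a depth map into camera-space points (hypothetical helper).

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx and y = (v - cy) * z / fy.
    """
    points: List[ColoredPoint] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Treat non-positive depth as invalid and skip the pixel.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            # The color comes from the same pixel because RGB and depth are aligned.
            points.append((x, y, z, rgb[v][u]))
    return points


# Tiny 2x2 frame with one invalid depth value (illustrative data only).
depth_frame = [[1.0, 2.0], [0.0, 4.0]]
rgb_frame = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
cloud = unproject_rgbd(depth_frame, rgb_frame, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

To merge frames into one scene, each frame's points would also need to be transformed by that frame's camera pose; that step is omitted here.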

First seen: 2025-09-03 11:55

Last seen: 2025-09-04 01:58

