Progressive temporal compensation and semantic enhancement for Exo-to-Ego video generation

Transforming video from an exocentric (third-person) perspective to an egocentric (first-person) one is challenging because of the limited overlap between the two perspectives. Existing approaches often neglect temporal dynamics, which are critical for capturing motion cues and reappearing objects, and do not fully exploit the semantics that can be inferred from the source view. To address these limitations, we propose a Progressive Temporal Compensation and Semantic Enhancement (PCSE) framework for exocentric-to-egocentric video generation. The Progressive Temporal Compensation (PTC) module focuses on long-term temporal dependencies, progressively aligning exocentric temporal patterns with egocentric representations. By employing a reliance-shifting mechanism with a progression mask, PTC gradually reduces dependence on egocentric supervision, enabling more robust target-view learning. Moreover, to leverage high-level scene context, we introduce a Hierarchical Dual-channel Transformer (HDT), which jointly generates egocentric frames and their corresponding semantic layouts via dual encoder-decoder architectures with hierarchically processed transformer blocks. To further enhance structural coherence and semantic consistency, the generated semantic layouts guide frame refinement through an Uncertainty-aware Semantic Enhancement (USE) module. USE dynamically estimates uncertainty masks to locate and refine ambiguous regions, yielding more coherent and visually accurate results. Extensive experiments demonstrate that PCSE achieves leading performance among cue-free methods.
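The reliance-shifting idea behind PTC can be illustrated with a small sketch. The abstract does not specify how the progression mask is built, so everything below is an assumption made for illustration: the per-frame granularity, the linear schedule, and the names `progression_mask` and `shift_reliance` are hypothetical and are not taken from the paper. The sketch only shows how a mask could hand more and more positions over from egocentric (supervision) features to exocentric-derived ones as training progresses.

```python
def progression_mask(num_frames: int, progress: float) -> list[int]:
    """Hypothetical mask: 1 = use the egocentric feature, 0 = use the exocentric-derived one.

    Assumes a linear schedule: at progress 0.0 every frame relies on egocentric
    supervision, at progress 1.0 none does. The real schedule is not given in the abstract.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    keep = round(num_frames * (1.0 - progress))  # frames still relying on ego features
    return [1] * keep + [0] * (num_frames - keep)


def shift_reliance(ego_feats: list, exo_feats: list, mask: list[int]) -> list:
    """Pick, per frame, the egocentric or the exocentric-derived feature according to the mask."""
    if not (len(ego_feats) == len(exo_feats) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [ego if m else exo for ego, exo, m in zip(ego_feats, exo_feats, mask)]


if __name__ == "__main__":
    ego = ["ego0", "ego1", "ego2", "ego3"]
    exo = ["exo0", "exo1", "exo2", "exo3"]
    for step in (0.0, 0.5, 1.0):
        print(step, shift_reliance(ego, exo, progression_mask(len(ego), step)))
```

In this toy version the mask is binary and positional; a learned or soft mask would follow the same pattern with a weighted blend instead of a hard selection.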
