홍 주영 – Page 3 – Robotics and Computer Vision Lab

Author: 홍 주영

[CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

[CVPR 2025] Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

[CVPR 2025] MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval

[ICLR 2025] TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval

[CVPR 2025] Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

[CVPR 2025] Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval

[CVPR 2025] Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions

[CVPR 2023] Clover: Towards A Unified Video-Language Alignment and Fusion Model

[CVPR 2020] End-to-End Learning of Visual Representations from Uncurated Instructional Videos

[Neurocomputing 2022] CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval and captioning
