Robotics and Computer Vision Lab

안 우현 on [arXiv 2024] ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation 07/10/2025
Hello 우진, thank you for the good question. As you said, this paper aims at metric depth estimation that is robust to scale while still predicting absolute distance values…
신 인택 on [CVPR 2024] Towards Automated Movie Trailer Generation 07/09/2025
Hello 유진, thank you for the good review. It is impressive that the task can be tackled even without metadata, and that it even outperforms previous papers. I slightly…
손 우진 on [arXiv 2024] ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation 07/08/2025
Hello 우현, thank you for the good review. I have a question, so I am leaving a comment! As I understand it, this paper ultimately aims to obtain a metric depth map that is robust to scale.…
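손 우진 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation 07/08/2025
Hello, thank you for the good review. I would like to ask one simple question. You said that the mean and variance obtained from the text are used to sample a value fitted to the image, but the image-fitted sampling…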
김 영규 on [CVPR 2024] SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation 07/08/2025
Hello 우진, thank you for the review. It is interesting that performance is better in the zero-shot setting. It seems SAM is that good at segmentation, but although it is good quantitatively…

[CVPR 2025] Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

[CVPR 2022] Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

[arXiv 2025] Scalable Real2Sim: Physics-Aware Asset Generation Via Robotic Pick-and-Place Setups

ICRA 2025 Attendance Report

[CVPR 2020] On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

[ICLR 2025] Dense Video Object Captioning from Disjoint Supervision

[arXiv 2024] EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model

[arXiv 2025] OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

[ICCV 2023] Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

[ICRA 2025] DexMimicGen: Automated Data Generation for Bimanual Dexterous Manipulation via Imitation Learning
