Robotics and Computer Vision Lab

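손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation 08/14/2025
Hello, thank you for reading the review. In the paper, the mean and variance obtained from the text act as a prior that represents the distribution of the various scenes that fit the text. However…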
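손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation 08/14/2025
Hello, thank you for reading the review. In the latent space, the representation is a d-dimensional vector that does not have the same form as image-space information, so to match the image dimensions, all…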
손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation 08/14/2025
Hello, thank you for reading the review. The paper does not mention which interval was used as the 1%, but as you said, it is selected at random, so at a particular epoch…
정 윤서 on [ICCV 2023] CLIPTER: Looking at the Bigger Picture in Scene Text Recognition 08/13/2025
Thank you for the comment. As you can see from the model architecture, the text encoder is not used. Only the image encoder part of the VLM is taken to embed the scene image…
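정 윤서 on [TPAMI 2025] Instruction-Guided Scene Text Recognition 08/13/2025
Hello. Thank you for the comment. 1. Quite literally, the condition can be understood as additional information about the image that is provided in advance. The question, for example, in the image…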

[CVPR 2023] Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels

[WACV 2021] TCA: Temporal Context Aggregation for Video Retrieval with Contrastive Learning

[ECCV 2016] Identity Mappings in Deep Residual Networks

[AAAI 2023] Towards Global Video Scene Segmentation with Context-Aware Transformer

[CVPR 2023] Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning

[AAAI 2020] Background Suppression Network for Weakly-Supervised Temporal Action Localization

[CVPR 2015] Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images

[ECCV 2022] Masked Discrimination for Self-Supervised Learning on Point Clouds

[ICASSP 2022] AudioCLIP: Extending CLIP to Image, Text and Audio

[CVPR 2020] ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
