Robotics and Computer Vision Lab

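손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation, 08/14/2025
Hello, thank you for reading the review. In the paper, the mean and variance obtained from the text serve as a prior that represents the distribution of the various scenes that fit the text. However…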
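손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation, 08/14/2025
Hello, thank you for reading the review. In the latent space, the representation is a d-dimensional vector that does not have the same form as image-space information, so to match the image dimensions, all…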
손 건화 on [CVPR 2024] WorDepth: Variational Language Prior for Monocular Depth Estimation, 08/14/2025
Hello, thank you for reading the review. The paper does not mention which interval was used as the 1%, but as you said, it is selected at random, so at a particular epoch…
정 윤서 on [ICCV 2023] CLIPTER: Looking at the Bigger Picture in Scene Text Recognition, 08/13/2025
Thank you for your comment. As you can see from the model architecture, the text encoder is not used. Only the image encoder part of the VLM is taken to embed the scene image…
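정 윤서 on [TPAMI 2025] Instruction-Guided Scene Text Recognition, 08/13/2025
Hello. Thank you for your comment. 1. Quite literally, the condition can be understood as additional information about the image that is provided in advance. The question, for example, in the image…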

[ICML 2021] Learning Transferable Visual Models From Natural Language Supervision (CLIP) – Part 1

[KCCV 2023] Conference Attendance Report

KCCV 2023 Attendance Report

[KCCV 2023] Conference Attendance Report

KCCV 2023 Attendance Report

[AAAI 2020] M3ER: Multiplicative Multimodal Emotion Recognition using Facial, Textual, and Speech Cues

[ICCV 2021] Group-Free Object Detection via Transformers

[AAAI 2020] Real-time Scene Text Detection with Differentiable Binarization

[CVPR 2023] Localized Semantic Feature Mixers for Efficient Pedestrian Detection in Autonomous Driving

[CVPR 2023] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
