Robotics and Computer Vision Lab

손 우진 on [CVPR 2026] TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation, 05/19/2026
Thank you for the comment, 영규. I think so too. However, during training the thermal image is converted into noise, and through the dual classifier-free guidance training scheme, accurately…
손 우진 on [CVPR 2026] TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation, 05/19/2026
Thank you for the good comment, 승현. I understand you are referring to dynamic objects, pedestrians in particular. First of all, pedestrians carry heat whether or not they are moving, so…
손 우진 on [CVPR 2026] TherA: Thermal-Aware Visual-Language Prompting for Controllable RGB-to-Thermal Infrared Translation, 05/19/2026
Thank you for the comment, 기현. For this paper, you can consider the entire dataset to have been captured outdoors. As with the KAIST dataset or the MS2 dataset, where TIR and RGB are together…
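황 찬미 on [CVPR 2026] Think, Then Verify: A Hypothesis–Verification Multi-Agent Framework for Long Video Understanding, 05/18/2026
Hello 유짐, thank you for the comment. The iteration count is not fixed at one! Additional iterations are possible when needed, with a loop that searches for more evidence in the verification stage and one that regenerates the hypotheses and clues themselves…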
김 주연 on [IROS 2025] OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving, 05/18/2026
Hello 영규, thank you for the good comment. I am still following up on this myself, but a topic that comes up in connection with caregiving is the preference-aware area: recognizing a person's preferences and…

[arXiv 2026] DeepSeek-OCR 2: Visual Causal Flow

[arXiv 2025] GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

[ECCV 2024] FoundPose: Unseen Object Pose Estimation with Foundation Features

[NeurIPS 2025] Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

[ICCV 2025] LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

MineWorld: A Real-Time and Open-Source Interactive World Model on Minecraft

[arXiv 2025] WorldMM: Dynamic MultiModal Memory Agent for Long Video Understanding

[CVPR 2023] Open-vocabulary Attribute Detection

[NeurIPS 2025] KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

[arXiv 2025] GaMO: Geometry-aware Multi-view Diffusion Outpainting for Sparse-View 3D Reconstruction
