신 인택 – Page 2 – Robotics and Computer Vision Lab


Author: 신 인택

[CVPR 2024] Open-Vocabulary Calibration for Fine-tuned CLIP

[ICLR 2024] CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Wrapping Up as a Teaching Assistant for the Summer 2025 URP

[IEEE 2024 IJCNN] Image Caption Method from Coarse to Fine Based On Dual Encoder-Decoder Framework

[IEEE CBMI 2024] Is CLIP the main roadblock for fine-grained open-world perception?

[arXiv 2025] Fine Tuning without Catastrophic Forgetting via Selective Low Rank Adaptation

[NeurIPS 2024] SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

2025 First-Half Retrospective

[CVPR 2023] Finetune Like You Pretrain: Improved Finetuning of Zero-Shot Vision Models

[ECCV 2022] Simple Open-Vocabulary Object Detection with Vision Transformers
