조 원 – Page 4 – Robotics and Computer Vision Lab

Author: 조 원

[AAAI 2021] BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation

[ACM MM 2021] Video Similarity and Alignment Learning on Partial Video Copy Detection

[NeurIPS 2020] Labelling unlabelled videos from scratch with multi-modal self-supervision

[CVPR2021] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

[ECCV2018] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

[CVPR2021] T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval

[arXiv2021] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

[CVPR 2021] Multi-shot Temporal Event Localization: a Benchmark

[arXiv2021] Self-supervised Video Retrieval Transformer Network

[CVPR2021] Self-supervised Video Hashing via Bidirectional Transformers
