Robotics and Computer Vision Lab

AI in Sensing, AI in Perception, AI in Action

Profile

정 의철

About Posts
[ICCV 2025] How Can Objects Help Video-Language Understanding?
  • Posted on: 12/15/2025
  • Comments: 4 Comments
[CVPR 2024] Koala: Key frame-conditioned long video-LLM
  • Posted on: 12/01/2025
  • Comments: 3 Comments
[arXiv 2025] VideoRAG: Retrieval-Augmented Generation over Video Corpus
  • Posted on: 11/24/2025
  • Comments: 1 Comment
[arXiv 2024] SLOWFAST-LLAVA: A STRONG TRAINING-FREE BASELINE FOR VIDEO LARGE LANGUAGE MODELS
  • Posted on: 11/17/2025
  • Comments: 9 Comments
[arXiv 2022] Disentangled Representation Learning for Text-Video Retrieval
  • Posted on: 10/13/2025
  • Comments: 2 Comments
[2025 ICLR] BRIDGING INFORMATION ASYMMETRY IN TEXT-VIDEO RETRIEVAL: A DATA CENTRIC APPROACH
  • Posted on: 09/29/2025
  • Comments: No Comments
[ICCV 2025] Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval
  • Posted on: 09/22/2025
  • Comments: 5 Comments
[2025 CVPR] Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
  • Posted on: 09/08/2025
  • Comments: 4 Comments
[2023 ICCV] Unified Coarse-to-Fine Alignment for Video-Text Retrieval
  • Posted on: 09/01/2025
  • Comments: 2 Comments
[2023 CVPR] Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring
  • Posted on: 08/18/2025
  • Comments: 5 Comments

NEW POST

  • [CoRL 2025] Steering Your Diffusion Policy with Latent Space Reinforcement Learning
  • [CVPR 2025] Scale Efficient Training for Large Datasets
  • [AAAI 2026] SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection
  • [ICLR 2026] HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
  • [RSS 2025] Robot Data Curation with Mutual Information Estimators

New Comment

  1. 이 승현 on [arXiv 2026] VideoAfford: Grounding 3D Affordance from Human-Object-Interaction Videos via Multimodal Large Language Model, 03/23/2026

    Thank you for the question. First, I looked up RenderNet, which was used as the action encoder: it can generate and control consistent characters and high-quality images, a powerful AI image…

  2. 최 인하 on [arXiv 2026] VideoAfford: Grounding 3D Affordance from Human-Object-Interaction Videos via Multimodal Large Language Model, 03/23/2026

    Hello 승현, thank you for the great review. It is fascinating that affordance interaction patterns are learned from HOI videos. I have a question about the action encoder! The action…

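  3. 이 예은 on [CVPR 2025] Scale Efficient Training for Large Datasets, 03/23/2026

    Hello 우진, thank you for the question! Yes, that is right. Of course, it is still self-evident that more data brings more benefits, but when there is too much, saturation problems can also arise…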

  4. 이 예은 on [CVPR 2025] Scale Efficient Training for Large Datasets, 03/23/2026

    Hello 찬미, thank you for the question! I was also puzzled by that part at first; as for the method specifically filtering out 'samples with high loss that contribute less to training'…

  5. 이 예은 on [CVPR 2025] Scale Efficient Training for Large Datasets, 03/23/2026

    Hello 주영, thank you for the question! The paper does not specifically state how long pruning takes. However, the time required for pruning, relative to the model training time…

Only a strong tenacity that never gives up makes the small difference.

Design by SejongRCV