Paper – Page 26 – Robotics and Computer Vision Lab

[2023 CVPR] Prototype-based Embedding Network for Scene Graph Generation

Hello. The paper I am introducing this time is from the SGG task and proposes a model for unbiased relation prediction. In SGG, data imbalance often arises for particular relations or object pairs, so the model…

Paper X-Review

[CVPR 2023] Semantic Prompt for Few-Shot Image Recognition

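Hello! Since last week I have been reading Few-shot learning papers for the dark data project, but the concept was somewhat unfamiliar to me, so I had not managed to write a review until now, haha. And finally, with this paper, a review…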

Paper X-Review

[ICCV 2023] Distribution-Consistent Modal Recovering for Incomplete Multimodal Learning

Hello! This time I came across a paper that would be good to cite in a related work section, so I read it to follow up in a bit more detail. Let's begin! 1. Introduction Many researchers so far have heterogeneous…

Paper X-Review

[CVPR 2023] Turning a CLIP Model into a Scene Text Detector

Hello, this is my forty-fourth X-Review. This paper is Turning a CLIP Model into a Scene Text Detector, published at CVPR 2023. Let's get right to it. 1. Introduction…

Paper X-Review

[2021 CVPR] Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

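Hello. The paper I am introducing this time is an SGG paper proposed to mitigate the long-tail problem. Specifically, existing SGG models do not adequately handle semantic ambiguity and try to predict only a single deterministic relation…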

Paper X-Review

[ECCV 2024] Is user feedback always informative? Retrieval Latent Defending for Semi-Supervised Domain Adaptation without Source Data

Since early this year I have read quite a few papers in the (Online) Test-Time Adaptation (TTA) field. One of the key issues recent TTA papers take up is the catastrophic forgetting encountered during long-term TTA…

Conference Paper X-Review

[CoRL 2023 Oral] Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping

This paper is LERF-TOGO, which implements one of the goals we are pursuing in our ongoing LLM robot project. With LERF, a NeRF that carries VLM feature information, the paper…

Paper X-Review

[CVPR 2024] Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Hello, it has been a while since my last review of a TTA paper. Let's get right to it. 1. Introduction When a model trained on a source domain dataset is deployed in the real world, on a new target domain not encountered during training…

Paper X-Review

[MM 2024] Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding

Hello. In today's X-Review I would like to introduce <Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding>, a paper published at ACM MM 2024. KAIST's 정준선…

Paper X-Review

[ICLR 2024] CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Hello. In this week's X-Review I will introduce <CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction>, a paper published as a Spotlight at ICLR 2024. It was selected as an ICLR Spotlight, and insightful…
