July 2024 – Page 2 – Robotics and Computer Vision Lab

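[ECCV 2022] Detecting Twenty-thousand Classes using Image-level Supervision

The paper I am reviewing this time presents a method for overcoming the limited diversity of object detection datasets. The work comes from Meta AI (hereafter, Meta) and the University of Texas, and was published at ECCV 2022. So, the review…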

Conference X-Review

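[NeurIPS 2022] Flamingo: a Visual Language Model for Few-Shot Learning

For a while I plan to review LMMs and a range of VLMs. The paper I am reviewing this time is Flamingo, a Visual Language Model (VLM) published by Google DeepMind. As the title suggests, it can perform a variety of tasks even in a few-shot setting…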

X-Review

[ICASSP 2023] Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation

This paper proposes a method that performs the speech enhancement and speech separation tasks end to end (e2e) and uses gradient modulation to avoid losing information that is useful for the downstream separation task; speech enhancement…

Conference News Paper X-Review

[CoRL 2023 oral] VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

This is a really fun paper. It uses an LLM to reason about robot manipulation and generate commands from explicit instructions, and uses a VLM (~OVD) to obtain an understanding of 3D space for the robot…

Paper X-Review

[ACM MM 2022] Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition

Hello, this is my fortieth X-Review. This time the paper is Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition, published at ACM MM 2022. Let's get right into it. 😵…

Conference X-Review

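[CVPR 2024] Open-vocabulary object 6D pose estimation

The 6D pose estimation paper I am reviewing this time is likewise aimed at estimating object pose in a more general-purpose way: given a text prompt, it estimates the pose of the corresponding object of interest…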

Paper X-Review

[CVPR 2024] Enhancing Multimodal Cooperation via Sample-level Modality Valuation

pdf code & dataset [2024.07.31: added explanation of Part 2.2] I am currently writing a paper that proposes a method for addressing multimodal imbalance and multimodal bias. However…

Paper X-Review

[NeurIPS 2023 Spotlight] 3D-LLM: Injecting the 3D World into Large Language Models

Hello, this is my fortieth X-Review. This time the paper is 3D-LLM: Injecting the 3D World into Large Language Models, published as a Spotlight at NeurIPS 2023. Let's start the review right away! 1….

Paper X-Review

[arXiv 2023] LLM4VG: Large Language Models Evaluation for Video Grounding

Hello, in this week's X-Review I will introduce the paper <LLM4VG: Large Language Models Evaluation for Video Grounding>, posted to arXiv in late 2023. It is not a methods paper; rather, existing LLMs and multi-modal…

Paper X-Review

[CVPR 2021] Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

Hello, this is 허재연. The paper I am covering this time was written at Microsoft Research Asia and published at CVPR 2021, and has been cited about 420 times so far. Existing contrastive-learning-based self-supervised learning methods such as SimCLR and MoCo…
