Robotics and Computer Vision Lab

황 유진 on [arXiv2025] Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything 06/05/2026
Hello, thank you for reading the review. As you said, this is done through prompting, so there is no fixed template for each modality. If you are curious about the template used for prompt generation, the paper's…
김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation? 06/05/2026
Hello 재찬, thank you for your comment. 1. I am curious about that as well, so I will check whether any later work has addressed a similar problem. 2. Diverse…
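김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation? 06/05/2026
Hello 승현, thank you for your comment. The module responsible for distribution debiasing deals specifically with velocity, so I think it would be somewhat limited. As for the authors' design intent, diverse…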
김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation? 06/05/2026
Hello 인하, thank you for your comment. As for that material, given the same amount of data, task diversity would teach the model about a variety of interactions…
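김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation? 06/05/2026
Hello 주연, thank you for your comment. As you said, in addition to velocity, force data of a different character may also be collected for the same state when the operator changes or for other reasons…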

[arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation?

[ICLR 2026] CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally

[CVPR 2025] Video Depth Anything

[NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models

[HRI 2026] Learning Human Preferences over a Human-Robot Collaboration Based on Explicit and Implicit Human Feedback

[CoRL 2024] APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs

[ICML 2026] VideoBrain: Learning Adaptive Frame Sampling for Long Video Understanding

[ICLR 2026 Workshop] World Action Models are Zero-shot Policies

[CVPR 2026] EgoX: Egocentric Video Generation from a Single Exocentric Video

[ICLR 2026] AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models
