Robotics and Computer Vision Lab

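황 유진 on [arXiv2025]Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything, 06/05/2026
Hello, and thank you for reading the review. As you said, this is done through prompting, so there is no fixed template for each modality. If you are curious about the template used for prompt generation, the paper's…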
김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation?, 06/05/2026
Hello 재찬, thank you for the comment. 1. I am curious about this too, so I will check whether any work since this study has addressed a similar problem. 2. Diverse…
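김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation?, 06/05/2026
Hello 승현, thank you for the comment. The module responsible for distribution debiasing deals specifically with velocity, so I suspect it would be somewhat limited. The authors' design intent is diverse…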
김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation?, 06/05/2026
Hello 인하, thank you for the comment. Regarding that material, when the same amount of data is used, task diversity gives the model knowledge about a variety of interactions…
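김 영규 on [arxiv 2025] Is Diversity All You Need for Scalable Robotic Manipulation?, 06/05/2026
Hello 주연, thank you for the comment. As you said, besides velocity, force can also vary when the operator changes or for other reasons, so data of a different character may be collected for the same state…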

Recent Posts

[arxiv 2026] VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model

[arXiv2025]Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything

[HRI 2026] CareEval: Evaluating Large Language Models for Decision-Making in Physical Robot Caregiving

[CoRL 2025] CoRI: Communication of Robot Intent for Physical Human-Robot Interaction

[NeurIPS 2025] Don’t Just Chase “Highlighted Tokens” in MLLMs: Revisiting Visual Holistic Context Retention

[IROS 2025] FSGlove: An Inertial-Based Hand Tracking System with Shape-Aware Calibration

[CVPR 2026(Highlight)]CLIP Is Shortsighted: Paying Attention Beyond the First Sentence

[CVPR 2024] BoQ: A Place is Worth a Bag of Learnable Queries

[ICLR 2026] UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos

[2025 NIPS] HoliTom: Holistic Token Merging for Fast Video Large Language Models
