김 주연 – Page 5 – Robotics and Computer Vision Lab

Author: 김 주연

[ACL 2023] Context or Knowledge is Not Always Necessary: A Contrastive Learning Framework for Emotion Recognition in Conversations

[COLINGw 2022] Shapes of Emotions: Multimodal Emotion Recognition in Conversations via Emotion Shifts

[ICASSP 2022] Multi-Lingual Multi-Task Speech Emotion Recognition Using wav2vec 2.0

[ICASSP 2023] Knowledge-Aware Bayesian Co-Attention for Multimodal Emotion Recognition

[CVPR 2023] Generative Bias for Robust Visual Question Answering

[NeurIPS 2019] RUBi: Reducing Unimodal Biases for Visual Question Answering

KCCV 2023 Attendance Report

[EMNLP 2022] UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition

[ICASSP 2022] AudioCLIP: Extending CLIP to Image, Text and Audio

[ACL 2020] Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks – Part 1
