Submitted by Crayon-Shinchan 46 OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models · 11 authors 1
Submitted by KID-22 27 OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System · 16 authors 2
Submitted by lyhisme 25 TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs · 7 authors 2
Submitted by Guizhen 15 GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning · 7 authors 0 2
Submitted by minsoo2333 15 EpiCache: Episodic KV Cache Management for Long Conversational Question Answering · 5 authors 3
Submitted by worstcoder 15 DiffusionNFT: Online Diffusion Reinforcement with Forward Process · 10 authors 1
Submitted by taesiri 14 SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? · 19 authors 2
Submitted by yusufma555 11 ByteWrist: A Parallel Robotic Wrist Enabling Flexible and Anthropomorphic Motion for Confined Spaces · 7 authors 2
Submitted by comar 10 VideoFrom3D: 3D Scene Video Generation via Complementary Image and Video Diffusion Models · 3 authors 25 2
Submitted by AdinaY 10 FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions · 29 authors 2
Submitted by Umean 8 Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels · 10 authors 2
Submitted by MElHuseyni 6 Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications · 5 authors 1
Submitted by hjeon2k 6 QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models · 5 authors 6 2
Submitted by JonasGeiping 5 Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM · 9 authors 2
Submitted by taesiri 4 ContextFlow: Training-Free Video Object Editing via Adaptive Context Enrichment · 4 authors 2
Submitted by MrZilinXiao 3 MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction · 7 authors 2
Submitted by yeliudev 3 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning · 7 authors 8 3
Submitted by sileod 3 Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning · 3 authors 9 1
Submitted by cmhungsteve 3 V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts · 6 authors 1
Submitted by HJOK 3 AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing? · 4 authors 1
Submitted by danielm1405 2 Accurate and Efficient Low-Rank Model Merging in Core Space · 8 authors 2 2
Submitted by richardcsuwandi 1 Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs · 6 authors 1 2
Submitted by skrishna 1 D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models · 9 authors 2
Submitted by mrajbrahma 1 DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context · 3 authors 2
Submitted by mandipgoswami 1 BeepBank-500: A Synthetic Earcon Mini-Corpus for UI Sound Research and Psychoacoustics Research · 1 author 0 2
Submitted by SteveZeyuZhang 1 VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery · 10 authors 2
Submitted by abhiram4572 1 When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs · 4 authors 3 2
Submitted by starriver030515 1 From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature · 7 authors 2 2
Submitted by SteveZeyuZhang 1 StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes · 6 authors 3 2
Submitted by Geralt-Targaryen 1 CodeFuse-CR-Bench: A Comprehensiveness-aware Benchmark for End-to-End Code Review Evaluation in Python Projects · 7 authors 2
Submitted by hao-li 1 From Hugging Face to GitHub: Tracing License Drift in the Open-Source AI Ecosystem · 5 authors 2
Submitted by akhaliq 1 DEXOP: A Device for Robotic Transfer of Dexterous Human Manipulation · 12 authors 2
Submitted by dyyyyyyyy - SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning · 6 authors 2
Submitted by lucadellalib - FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation · 3 authors 2