/u/Successful-Western27

Text-Guided Seamless Video Loop Generation Using Latent Cycle Shifting

I've been examining Mobius, a new approach to generating seamless looping videos from text prompts. The key technical innovation here is a latent shift-based framework that ensures smooth transitions between the end and beginning frames of…
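To make the latent-shift idea concrete, here is a minimal sketch, not the Mobius implementation. It assumes the video latents are an array of shape `(frames, channels, height, width)` treated as a cycle, and that rotating them between denoising steps lets a temporal model see the last and first frames as neighbors. The names `cyclic_shift`, `denoise_with_cycle_shift`, `denoise_step`, and `stride` are hypothetical, and the denoiser is a stand-in callable.

```python
import numpy as np


def cyclic_shift(latents: np.ndarray, offset: int) -> np.ndarray:
    """Rotate frames along the time axis; frames pushed off the end wrap to the front."""
    return np.roll(latents, shift=offset, axis=0)


def denoise_with_cycle_shift(latents, denoise_step, num_steps, stride=1):
    """Alternate a (hypothetical) denoising step with a cyclic frame shift.

    Assumption: shifting between steps means the seam between the last and
    first frame moves through the sequence, so it is processed like any
    other pair of adjacent frames.
    """
    total_shift = 0
    for step in range(num_steps):
        latents = denoise_step(latents, step)
        latents = cyclic_shift(latents, stride)
        total_shift += stride
    # Undo the accumulated rotation so the original frame order is restored.
    return cyclic_shift(latents, -total_shift)


if __name__ == "__main__":
    frames = np.arange(4, dtype=float).reshape(4, 1, 1, 1)
    print(cyclic_shift(frames, 1).ravel())  # [3. 0. 1. 2.]
```

Passing an identity function as `denoise_step` returns the latents in their original order, which is a quick way to check that the shifts cancel out.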

Test-Time Routing Optimization for Multimodal Mixture-of-Experts Models

This paper introduces a test-time optimization method called R2-T2 that improves routing in mixture-of-experts (MoE) models without requiring retraining. The core idea is using gradient descent during inference to optimize how inputs get routed to diff…
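As a rough illustration of optimizing routing at inference time, the sketch below runs gradient descent on the routing logits of a toy mixture while the experts stay frozen. It is not the R2-T2 procedure: the squared-error surrogate loss, the `target` vector, and the names `optimize_routing` and `mixture_loss` are all assumptions made for the example, since the excerpt does not say what objective is used.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    z = np.exp(x - np.max(x))
    return z / z.sum()


def mixture_loss(logits, expert_outputs, target) -> float:
    """Squared error between the routed mixture and a target (assumed surrogate)."""
    mixed = softmax(logits) @ expert_outputs
    return 0.5 * float(np.sum((mixed - target) ** 2))


def optimize_routing(logits, expert_outputs, target, lr=1.0, steps=50):
    """Gradient descent on routing logits only; expert outputs are frozen.

    expert_outputs has shape (num_experts, dim), one row per expert.
    """
    logits = np.asarray(logits, dtype=float).copy()
    for _ in range(steps):
        weights = softmax(logits)
        residual = weights @ expert_outputs - target   # dL/d(mixed output)
        grad_weights = expert_outputs @ residual       # dL/d(weights)
        # Chain rule through softmax: w_j * (g_j - sum_i w_i * g_i)
        grad_logits = weights * (grad_weights - weights @ grad_weights)
        logits -= lr * grad_logits
    return logits


if __name__ == "__main__":
    experts = np.eye(3)                   # toy experts: each emits a one-hot vector
    target = np.array([0.0, 0.0, 1.0])    # hypothetical target favoring expert 2
    tuned = optimize_routing(np.zeros(3), experts, target)
    print(softmax(tuned))
```

No model weights change here; only the per-input routing distribution moves, which is what makes this kind of adjustment possible without retraining.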

Chain of Draft: Streamlining LLM Reasoning with Minimal Token Generation

This paper introduces Chain-of-Draft (CoD), a novel prompting method that improves LLM reasoning efficiency by iteratively refining responses through multiple drafts rather than generating complete answers in one go. The key insight is that LLMs can bu…

Visual Perception Tokens Enable Self-Guided Visual Attention in Multimodal LLMs

The researchers propose integrating Visual Perception Tokens (VPT) into multimodal language models to improve their visual understanding capabilities. The key idea is decomposing visual information into discrete tokens that can be processed alongside t…

AlchemyBench: A 17K Expert-Verified Materials Synthesis Dataset with LLM-Based Automated Evaluation

This work introduces an LLM-based system for evaluating materials synthesis feasibility, trained on a new large-scale dataset of 17K expert-verified synthesis records. The key innovation is using the LLM as an expert-level judge to filter proposed materials based on …

Evaluating LLMs on Complex Temporal Reasoning Using Chinese Dynastic History

A new benchmark dataset called Chinese Temporal Mapping (CTM) tests LLMs on temporal reasoning using Chinese historical knowledge. The dataset contains 2,306 multiple-choice questions spanning major Chinese dynasties, evaluating both pure temporal logi…

Improving LMM Visual Reasoning Through Iterative Self-Synthesis and Expert-Guided Feature Selection

This work introduces a novel methodology for multimodal foundation models to self-synthesize training data that enhances both their cognitive capabilities and explainability. The core technique involves generating synthetic data through recursive self-…

Auto-Weighted Multi-Graph Learning for Distributed Data Under Privacy Constraints

This approach introduces a novel method for learning graph structures across distributed data sources while preserving privacy. The core idea is using an auto-weighted multiple graph learning framework that allows clients to maintain local graph repres…

Model Editing Reality Check: Performance Gaps Between Controlled Tests and Real-World QA Applications

The key contribution here is a rigorous real-world evaluation of model editing methods, specifically introducing QAEdit – a new benchmark that tests editing effectiveness without the artificial advantages of teacher forcing during evaluation. Main tech…

Exploring Non-Algorithmic Modes of Computing: A Framework for Natural and Artificial Computation

This paper examines fundamental differences between artificial and biological computing systems through the lens of representation and interpretation. The key technical contribution is a formal analysis framework that contrasts how machines and organis…