63 repositories
VLM2Vec (Public)
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

ClawBench (Public)
Open-source benchmark for browser AI agents on daily tasks.

StructEval (Public)
Evaluating LLMs' abilities to generate structural output [TMLR2025]

OpenResearcher (Public)
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

verl-tool (Public)

RationalRewards (Public)

Pixel-Reasoner (Public)
Pixel-Level Reasoning Model trained with RL [NeurIPS25]

VideoEval-Pro (Public)
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation [TMLR26]

SWE-Next (Public)

VisPhyWorld (Public)

SWE-QA-Pro (Public)

RewardHarness (Public)
Self-evolving agentic reward framework for image-editing evaluation: 47.4% on EditReward-Bench from only 100 preference demos, no reward-model training. arXiv …

EditReward (Public)
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]

Critique-Coder (Public)

Hierarchical-Reasoner (Public)

ImagenWorld (Public)
Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks [ICLR 2026]

MMLU-Pro (Public)
The code and data for "MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark" [NeurIPS 2024]

EvolveCoder (Public)

MMMU (Public)

Context-Forcing (Public)

VisualWebInstruct (Public)

VisCoder2 (Public)
The official code of "VisCoder2: Building Multi-Language Visualization Coding Agents" [ICLR26]

BrowserAgent (Public)

Mantis (Public)

VideoScore2 (Public)

VideoScore (Public)
Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]

ImagenHub (Public)
A one-stop library to standardize the inference and evaluation of all the conditional image generation models. [ICLR 2024]

General-Reasoner (Public)

QuickCodec (Public)

QuickVideo (Public)
Quick Long Video Understanding [TMLR2025]