MiniCPM Legacy Topics

Auxiliary topics for earlier MiniCPM releases. For the current flagship model, see the main README.

Chinese version: README-legacy-cn.md

BitCPM4: Quantization

BitCPM4 is a family of ternary-quantized models derived from the MiniCPM series through quantization-aware training (QAT). It significantly improves both training efficiency and parameter efficiency.

Improvements to the training method
- Hyperparameters are searched with a wind-tunnel experiment on a small model.
- Training proceeds in two stages: high-precision training first, then QAT. This reuses the trained high-precision models and significantly reduces the compute required for the QAT phase.

High parameter efficiency
- With a bit width of only 1.58 bits, BitCPM4 achieves performance comparable to full-precision models with a similar parameter count.
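
The 1.58-bit figure follows from ternary weights: each weight takes one of three values, which carries log2(3) ≈ 1.585 bits. The sketch below illustrates the idea in plain Python. It assumes absmean scaling, a rule used by some published ternary schemes; the source does not say which rule BitCPM4 uses, and `ternary_quantize` is a hypothetical helper, not part of any MiniCPM API.

```python
import math


def ternary_quantize(weights):
    """Map float weights to codes in {-1, 0, +1} plus one shared scale.

    Assumption: absmean scaling. The BitCPM4 rule may differ.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clamp into the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


# Three possible values per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)

weights = [0.42, -0.07, -0.9, 0.03, 0.55, -0.31]
codes, scale = ternary_quantize(weights)
print(codes, round(scale, 2), round(bits_per_weight, 3))
```

Multiplying each code by `scale` gives the dequantized approximation of the original weight.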

BitCPM4 Evaluation

BitCPM4's performance is comparable to that of full-precision models of the same size.

BitCPM4 Inference

BitCPM4's parameters are stored in a fake-quantized format, so the models can be run directly within the Hugging Face framework.
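
Because the checkpoint is fake-quantized, loading it should look like loading any other causal language model with `transformers`. The following is a minimal sketch under assumptions: the model id `openbmb/BitCPM4-1B` and the need for `trust_remote_code=True` are not stated in this document, so check the model card before relying on either.

```python
def generate_with_bitcpm4(prompt, model_id="openbmb/BitCPM4-1B", max_new_tokens=64):
    """Generate a continuation with a fake-quantized checkpoint.

    Assumptions: `model_id` is a hypothetical default, and the checkpoint
    is assumed to require `trust_remote_code=True`.
    """
    # Imported lazily so this module loads without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_with_bitcpm4("Write a haiku about small models.")` downloads the weights on first use, so it needs network access and enough memory for the chosen checkpoint.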

MiniCPM4 Application

MiniCPM4-Survey: Trustworthy Survey Generation

MiniCPM4-Survey is an open-source LLM agent model jointly developed by THUNLP, Renmin University of China, and ModelBest. Built on MiniCPM4-8B, it accepts a user's query as input and autonomously generates a trustworthy, long-form survey paper.

Key features include:

- Plan-Retrieve-Write Survey Generation Framework: We propose a multi-agent generation framework that operates in three core stages: planning (defining the overall structure of the survey), retrieval (generating appropriate retrieval keywords), and writing (synthesizing the retrieved information into coherent section-level content).
- High-Quality Dataset Construction: We gather and process a large number of expert-written survey papers to construct a high-quality training dataset, and we collect a large number of research papers to build a retrieval database.
- Multi-Aspect Reward Design: We design a reward system that evaluates survey quality along three aspects (structure, content, and citations) and use it as the reward function in the RL training stage.
- Multi-Step RL Training Strategy: We propose a Context Manager that retains essential information while keeping reasoning efficient, and we construct a Parallel Environment to keep RL training cycles efficient.
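
The plan-retrieve-write loop can be sketched as a small pipeline. This is an illustrative skeleton only: in MiniCPM4-Survey each stage is driven by the model, whereas here every stage is a hand-written stub, and all names (`Section`, `plan`, `retrieve`, `write`, `generate_survey`) and the toy corpus are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    keywords: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    text: str = ""


def plan(query):
    # Planning: fix the overall structure of the survey (stub outline).
    return [Section(f"Introduction to {query}"), Section(f"Methods for {query}")]


def retrieve(section, corpus):
    # Retrieval: derive keywords, then look them up in the paper database.
    section.keywords = [w.lower() for w in section.title.split() if len(w) > 3]
    section.sources = [
        doc_id
        for doc_id, abstract in corpus.items()
        if any(k in abstract.lower() for k in section.keywords)
    ]
    return section


def write(section):
    # Writing: turn the retrieved material into cited section text (stub).
    cites = ", ".join(f"[{d}]" for d in section.sources) or "no sources found"
    section.text = f"{section.title}: draft grounded in {cites}."
    return section


def generate_survey(query, corpus):
    return [write(retrieve(s, corpus)) for s in plan(query)]


corpus = {
    "paper-1": "An introduction to ternary quantization.",
    "paper-2": "Training methods for small language models.",
}
survey = generate_survey("quantization", corpus)
for section in survey:
    print(section.text)
```

Keeping the three stages as separate functions mirrors the framework's separation of concerns: the outline is fixed before any retrieval happens, and each section is written only from what was retrieved for it.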

Demo and Quick Start

See here

Performance Evaluation

Method	Relevance	Coverage	Depth	Novelty	Avg.	Fact Score
Naive RAG (driven by G2FT)	3.25	2.95	3.35	2.60	3.04	43.68
AutoSurvey (driven by G2FT)	3.10	3.25	3.15	3.15	3.16	46.56
Webthinker (driven by WTR1-7B)	3.30	3.00	2.75	2.50	2.89	--
Webthinker (driven by QwQ-32B)	3.40	3.30	3.30	2.50	3.13	--
OpenAI Deep Research (driven by GPT-4o)	3.50	3.95	3.55	3.00	3.50	--
MiniCPM4-Survey	3.45	3.70	3.85	3.00	3.50	68.73
MiniCPM4-Survey (w/o RL)	3.55	3.35	3.30	2.25	3.11	50.24

Performance comparison of survey generation systems. "G2FT" stands for Gemini-2.0-Flash-Thinking, and "WTR1-7B" denotes Webthinker-R1-7B. Fact Score was not evaluated for Webthinker, which has no citation functionality, or for OpenAI Deep Research, which does not include citations in exported results.
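
The Avg. column appears to be the unweighted mean of the four quality scores (Relevance, Coverage, Depth, Novelty). That reading is inferred from the numbers rather than stated in the source, and the short check below recomputes it from the table values.

```python
# (Relevance, Coverage, Depth, Novelty) and the reported Avg., copied from the table.
rows = {
    "Naive RAG (G2FT)": ([3.25, 2.95, 3.35, 2.60], 3.04),
    "AutoSurvey (G2FT)": ([3.10, 3.25, 3.15, 3.15], 3.16),
    "Webthinker (WTR1-7B)": ([3.30, 3.00, 2.75, 2.50], 2.89),
    "Webthinker (QwQ-32B)": ([3.40, 3.30, 3.30, 2.50], 3.13),
    "OpenAI Deep Research (GPT-4o)": ([3.50, 3.95, 3.55, 3.00], 3.50),
    "MiniCPM4-Survey": ([3.45, 3.70, 3.85, 3.00], 3.50),
    "MiniCPM4-Survey (w/o RL)": ([3.55, 3.35, 3.30, 2.25], 3.11),
}


def mean(values):
    return sum(values) / len(values)


# Largest difference between the recomputed mean and the reported Avg.
max_gap = max(abs(mean(scores) - reported) for scores, reported in rows.values())
print(f"largest gap: {max_gap:.4f}")
```

Every gap is within two-decimal rounding, which is consistent with a plain mean. Fact Score is reported on a different scale and is not part of the average.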

MiniCPM4-MCP: Tool Use with Model Context Protocol

MiniCPM4-MCP is an open-source on-device LLM agent model jointly developed by THUNLP, Renmin University of China, and ModelBest, built on MiniCPM4 with 8 billion parameters. It can solve a wide range of real-world tasks by interacting with various tools and data resources through the Model Context Protocol (MCP). MiniCPM4-MCP currently supports the following:

- Tool use across 16 MCP servers: The servers span several categories, including office, lifestyle, communication, information, and work management.
- Single-tool calling: It can perform single- or multi-step tool calls using a single MCP-compliant tool.
- Cross-tool calling: It can perform single- or multi-step tool calls using different MCP-compliant tools.
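
For orientation, MCP frames a tool invocation as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such requests for a two-step, cross-tool plan. The tool names and arguments are hypothetical and do not correspond to any of the 16 servers, and `make_tool_call` is an illustrative helper rather than part of MiniCPM4-MCP.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def make_tool_call(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# A cross-tool, multi-step plan; tool names and arguments are hypothetical.
steps = [
    make_tool_call("search_papers", {"query": "ternary quantization"}),
    make_tool_call("write_file", {"path": "notes.md", "content": "draft notes"}),
]
payload = [json.dumps(step) for step in steps]
print(payload[0])
```

In a single-tool scenario every step would use the same `name`; in a cross-tool scenario, as here, later steps call a different tool, typically using the result of an earlier call.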

Demo

A demo is available at this link.

Performance Evaluation

MCP Server	gpt-4o func	gpt-4o param	gpt-4o value	qwen3 func	qwen3 param	qwen3 value	minicpm4 func	minicpm4 param	minicpm4 value
Airbnb	89.3	67.9	53.6	92.8	60.7	50.0	96.4	67.9	50.0
Amap-Maps	79.8	77.5	50.0	74.4	72.0	41.0	89.3	85.7	39.9
Arxiv-MCP-Server	85.7	85.7	85.7	81.8	54.5	50.0	57.1	57.1	52.4
Calculator	100.0	100.0	20.0	80.0	80.0	13.3	100.0	100.0	6.67
Computor-Control-MCP	90.0	90.0	90.0	90.0	90.0	90.0	90.0	90.0	86.7
Desktop-Commander	100.0	100.0	100.0	100.0	100.0	100.0	100.0	100.0	100.0
Filesystem	63.5	63.5	31.3	69.7	69.7	26.0	83.3	83.3	42.7
Github	92.0	80.0	58.0	80.5	50.0	27.7	62.8	25.7	17.1
Gaode	71.1	55.6	17.8	68.8	46.6	24.4	68.9	46.7	15.6
MCP-Code-Executor	85.0	80.0	70.0	80.0	80.0	70.0	90.0	90.0	65.0
MCP-Docx	95.8	86.7	67.1	94.9	81.6	60.1	95.1	86.6	76.1
PPT	72.6	49.8	40.9	85.9	50.7	37.5	91.2	72.1	56.7
PPTx	64.2	53.7	13.4	91.0	68.6	20.9	91.0	58.2	26.9
Simple-Time-Server	90.0	70.0	70.0	90.0	90.0	90.0	90.0	60.0	60.0
Slack	100.0	90.0	70.0	100.0	100.0	65.0	100.0	100.0	100.0
Whisper	90.0	90.0	90.0	90.0	90.0	90.0	90.0	90.0	30.0
Average	80.2	70.2	49.1	83.5	67.7	43.8	88.3	76.1	51.2
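
The table does not define its three columns. One plausible reading, offered here as an assumption, is that `func` scores whether the correct function was chosen, `param` whether the correct parameter names were supplied, and `value` whether the parameter values also match. That nested reading is at least consistent with func ≥ param ≥ value holding in every row. A minimal sketch of scoring under that assumption, with hypothetical helper names and toy data:

```python
def score_call(predicted, reference):
    """Return (func, param, value) correctness flags for one tool call.

    Assumed definitions, not taken from the source: each level only
    counts as correct if the previous level is correct too.
    """
    func_ok = predicted["name"] == reference["name"]
    param_ok = func_ok and set(predicted["arguments"]) == set(reference["arguments"])
    value_ok = param_ok and predicted["arguments"] == reference["arguments"]
    return func_ok, param_ok, value_ok


def accuracy(pairs):
    """Percentage of calls that are correct at each of the three levels."""
    flags = [score_call(p, r) for p, r in pairs]
    return tuple(round(100.0 * sum(col) / len(flags), 1) for col in zip(*flags))


reference = {"name": "get_time", "arguments": {"timezone": "Asia/Shanghai"}}
pairs = [
    ({"name": "get_time", "arguments": {"timezone": "Asia/Shanghai"}}, reference),
    ({"name": "get_time", "arguments": {"timezone": "UTC"}}, reference),
    ({"name": "get_time", "arguments": {"zone": "Asia/Shanghai"}}, reference),
    ({"name": "get_date", "arguments": {"timezone": "Asia/Shanghai"}}, reference),
]
func_acc, param_acc, value_acc = accuracy(pairs)
print(func_acc, param_acc, value_acc)
```

Note that the Average row in the table is not the plain mean of the per-server rows, so it may be weighted, for example by the number of test cases per server; the source does not say.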

MiniCPM Intel AIPC Client: A New Edge Large Model Powerhouse

Developed jointly by ModelBest (Mianbi Intelligence) and Intel, the MiniCPM Intel AIPC Client is an on-device large-model client designed for machines with Intel Core Ultra series processors. It gives developers, researchers, and AI enthusiasts a low-latency, efficient, and privacy-preserving local large-model experience.

Key Features

- Deep Intel Hardware Adaptation: Fully compatible with Intel Core Ultra series processors and tightly integrated with the hardware, so users can run large models smoothly on local devices without relying on cloud services.
- Extreme Optimization Based on OpenVINO: Deeply optimized with the OpenVINO inference framework, reaching up to 80 tokens per second, which keeps responses fast for both quick queries and complex tasks.
- Privacy and Security Assurance: The client is deployed locally and all data processing is completed on the device, which eliminates the privacy risks of uploading data to the cloud. This matters most in scenarios with strict data privacy requirements.
- Catering to Diverse User Groups: Developers following cutting-edge technology, researchers focused on academic work, and enthusiasts exploring AI applications all get easy access to local large models.

System Requirements

Recommended processor: Intel Core Ultra 7 or higher (mobile version)
Recommended RAM: 32GB or above

Download

download
