ChatSpeed v1.2.8

github-actions released this 24 Feb 14:26

🚀 New Features

  • 429 Exponential Backoff Retry Mechanism: Added intelligent retry functionality to the CCProxy module to effectively handle server-side rate limiting scenarios.

    • When the backend returns a 429 (Too Many Requests) status code, the system automatically performs exponential backoff retries.
    • The number of retry attempts can be configured in the settings interface (0 to 10; the default of 0 disables retries).
    • Smart backoff strategy: 1 second for the 1st retry, 2 seconds for the 2nd, 4 seconds for the 3rd, and so on, up to a maximum of 32 seconds.
    • Supports the Retry-After response header from the server, prioritizing the server-suggested wait time.
    • This feature covers all proxy request paths: direct forwarding, protocol conversion, and Embedding requests.
  • Multi-Vendor Reasoning Support: Extended compatibility for reasoning/thinking modes across major AI providers.

    • OpenAI o1/o3 Series: Automatically maps reasoning configuration to the reasoning_effort parameter (low/medium/high).
    • MiniMax: Supports the reasoning_split parameter and correctly parses reasoning_details in responses.
    • Zhipu GLM, DeepSeek, Kimi (Moonshot): Automatically injects the thinking parameter to enable reasoning mode.
    • Qwen Series: Supports enable_thinking and thinking_budget parameter configuration.
    • Reasoning content is now properly forwarded to the frontend for display and preserved across multi-turn conversations.
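
The 429 retry schedule described above can be sketched in a few lines of Python. This is an illustration only, not CCProxy's actual code: the names `backoff_delay` and `send_with_retry` are hypothetical, `send` is assumed to be a callable returning `(status, headers, body)`, and only the delay-in-seconds form of `Retry-After` is handled. Whether the server-suggested wait is itself capped at 32 seconds is not stated in these notes, so the sketch leaves it uncapped.

```python
import time
from typing import Callable, Dict, Optional, Tuple

MAX_BACKOFF_SECONDS = 32.0  # upper bound stated in the release notes
MAX_RETRY_SETTING = 10      # settings range is 0-10, default 0 (no retry)

Response = Tuple[int, Dict[str, str], bytes]


def backoff_delay(retry_number: int, retry_after: Optional[str] = None) -> float:
    """Seconds to wait before the given retry (1-based).

    A numeric Retry-After value takes priority; otherwise the delay is
    1, 2, 4, ... seconds, capped at MAX_BACKOFF_SECONDS.
    """
    if retry_number < 1:
        raise ValueError("retry_number is 1-based")
    if retry_after is not None:
        value = retry_after.strip()
        # Assumption: only the delay-seconds form is parsed; an HTTP-date
        # value falls through to the exponential schedule.
        if value.isascii() and value.isdigit():
            return float(int(value))
    return min(2.0 ** (retry_number - 1), MAX_BACKOFF_SECONDS)


def send_with_retry(
    send: Callable[[], Response],
    max_retries: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` and retry on HTTP 429, at most `max_retries` times."""
    max_retries = max(0, min(max_retries, MAX_RETRY_SETTING))
    retries_done = 0
    while True:
        status, headers, body = send()
        if status != 429 or retries_done >= max_retries:
            return status, headers, body
        retries_done += 1
        # Assumption: headers is a plain dict keyed by "Retry-After".
        sleep(backoff_delay(retries_done, headers.get("Retry-After")))
```

With the default `max_retries=0`, a 429 response is returned to the caller immediately, matching the "no retry" default. Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.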

🪄 Improvements

  • Tool Compatibility Mode Parser Overhaul: Completely rewrote the XML tool call parsing engine for significantly improved robustness.

    • Added a manual scanning parser as the primary method; it correctly handles parameter content that contains unescaped code.
    • Supports self-closing tag format (e.g., <arg name="x" value="y" />), adapting to more model output styles.
    • Supports single-quoted, double-quoted, and unquoted attribute values for stronger fault tolerance.
    • Automatically cleans up redundant closing tags generated by models (e.g., Kimi may output duplicate </cs:tool_use>).
    • Added passthrough support for the logit_bias parameter for advanced configuration.
  • Proxy Statistics Dashboard Overhaul: Completely redesigned the Proxy Stats page for better data visualization and user experience.

    • KPI Metric Cards: Added four key performance indicator cards at the top: Total Requests, Total Tokens, Error Rate, and Active Models for at-a-glance insights.
    • Tabbed Layout: Organized all charts into two tabbed sections for cleaner UI and better space utilization.
      • Trend Analysis Tab: Daily Token consumption trends (default) and Daily Requests with Error Rate dual-axis chart.
      • Distribution Analysis Tab: Model Token Usage (default), Model Usage Count, Provider Token Usage (new), and Error Code Distribution.
    • Horizontal Bar Charts: Converted all pie charts to horizontal bar charts for easier comparison of top items.
    • Smart Axis Formatting: Implemented automatic formatting for large numbers using Chinese units (万 for 10k+, 亿 for 100M+).
    • Visual Polish: Applied dashed grid lines with solid axis lines for better readability; fixed tooltip display issues.
    • Provider Token Analytics: Added new backend API and chart for provider-level token consumption analysis.
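
The axis formatting rule above (万 from 10,000 upward, 亿 from 100,000,000 upward) can be illustrated with a short Python sketch. The function name `format_axis_value` and the choice of one decimal place are assumptions made for the example; the dashboard's real frontend code may round or label differently.

```python
def format_axis_value(value: float) -> str:
    """Format a number with Chinese units: 万 (10^4) and 亿 (10^8)."""

    def trim(x: float) -> str:
        # One decimal place, with a trailing ".0" removed.
        return f"{x:.1f}".rstrip("0").rstrip(".")

    magnitude = abs(value)
    if magnitude >= 100_000_000:
        return trim(value / 100_000_000) + "亿"
    if magnitude >= 10_000:
        return trim(value / 10_000) + "万"
    return trim(value)


print(format_axis_value(9_999))        # 9999
print(format_axis_value(12_345))       # 1.2万
print(format_axis_value(250_000_000))  # 2.5亿
```

Checking the larger unit first keeps the two thresholds from overlapping, and using the absolute value for the comparison lets negative values pick the same unit as their positive counterparts.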
