
Refine tiflash FAQ and configuration docs#20252

Merged
ti-chi-bot[bot] merged 23 commits into pingcap:master from JaySon-Huang:refine_tiflash_docs
Jul 4, 2025

Conversation

@JaySon-Huang
Contributor

@JaySon-Huang JaySon-Huang commented Apr 23, 2025

First-time contributors' checklist

What is changed, added or deleted? (Required)

  • tiflash/create-tiflash-replicas.md
    • Added instructions for adjusting the remove-peer restriction when rebalancing Regions from old to new TiFlash nodes.
  • tiflash/tiflash-configuration.md
    • Removed the PD scheduling parameters section.
    • Clarified that floating-point numbers can be used for profiles.default.max_memory_usage_for_all_queries since v6.6.0.
    • Removed the multi-disk deployment instructions for versions earlier than v4.0.9 because those versions have reached EOL.
  • tiflash/troubleshoot-tiflash.md
    • Added a check for CPU support for vector extension instruction sets (AVX2 for AMD64, ARMv8 for ARM64).
    • Reorganized the 'TiFlash replica is always unavailable' section, providing a more structured approach to troubleshooting.
    • Removed the 'Data replication gets stuck' section and integrated its content into the 'TiFlash replica is always unavailable' section.

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions (in Chinese).

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot ti-chi-bot bot added missing-translation-status This PR does not have translation status info. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 23, 2025
Comment on lines -11 to -27
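## PD scheduling parameters

You can adjust the parameters using [pd-ctl](/pd-control.md). If you deploy the cluster using TiUP, you can use `tiup ctl:v<CLUSTER_VERSION> pd` instead of the `pd-ctl -u <pd_ip:pd_port>` command.

- [`replica-schedule-limit`](/pd-configuration-file.md#replica-schedule-limit): controls the rate at which replica-related operators are generated. Operations such as taking nodes offline and adding replicas are affected by this parameter.

> **Note:**
>
> Do not set this value higher than `region-schedule-limit`. Otherwise, normal Region scheduling among TiKV nodes is affected.

- `store-balance-rate`: limits the Region scheduling rate of each TiKV store or TiFlash store. Note that this parameter takes effect only on stores newly added to the cluster. To make a change take effect immediately, use the method below.

> **Note:**
>
> Starting from v4.0.2 (inclusive), the `store-balance-rate` parameter is deprecated and the `store limit` command has changed in some ways. For details of the changes, see the [store-limit documentation](/configure-store-limit.md).

- Use the `pd-ctl -u <pd_ip:pd_port> store limit <store_id> <value>` command to set the Region scheduling rate of a specific store. (You can get `store_id` by running the `pd-ctl -u <pd_ip:pd_port> store` command.) If the rate is not set for a specific store, the store inherits the `store-balance-rate` setting. You can also run the `pd-ctl -u <pd_ip:pd_port> store limit` command to view the current settings.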
Contributor Author


This content is somewhat outdated, and the instructions for speeding up TiFlash replication duplicate those in create-tiflash-replicas.md#speed-up-tiflash-replication.


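- Use the `pd-ctl -u <pd_ip:pd_port> store limit <store_id> <value>` command to set the Region scheduling rate of a specific store. (You can get `store_id` by running the `pd-ctl -u <pd_ip:pd_port> store` command.) If the rate is not set for a specific store, the store inherits the `store-balance-rate` setting. You can also run the `pd-ctl -u <pd_ip:pd_port> store limit` command to view the current settings.

- [`replication.location-labels`](/pd-configuration-file.md#location-labels): indicates the topological relationship of TiKV instances. The order of the keys represents the hierarchy of the labels. When TiFlash is enabled, you need to use [`pd-ctl config placement-rules`](/pd-control.md#config-show--set-option-value--placement-rules) to set the default value. For details, see [geo-distributed-deployment-topology](/geo-distributed-deployment-topology.md).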
Contributor Author


This duplicates the later section "Schedule replicas by topology labels".

Comment on lines -600 to -610
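#### TiDB cluster versions earlier than v4.0.9

In versions earlier than TiDB v4.0.9, TiFlash only supports distributing the main data of the storage engine across multiple disks. Multi-disk deployment is configured through the `path` (`data_dir` in TiUP) and `path_realtime_mode` parameters.

Multiple data storage directories in `path` are separated by commas, for example `/nvme_ssd_a/data/tiflash,/sata_ssd_b/data/tiflash,/sata_ssd_c/data/tiflash`. If a node has multiple disks, it is recommended to put the directory on the best-performing disk first to make better use of the node's performance.

If a node has multiple disks of the same specification, you can leave the `path_realtime_mode` parameter empty (or explicitly set it to `false`). This means that data is balanced among all storage directories. However, because the latest data is still written only to the first directory, the disk that holds this directory is busier than the other disks.

If a node has multiple disks of different specifications, it is recommended to set the `path_realtime_mode` parameter to `true` and to put the directory on the best-performing disk first in the `path` parameter. This means that the first directory stores only the latest data, and older data is balanced among the other directories. Note that in this case, the planned capacity of the first directory needs to be about 10% of the total capacity.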
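
To make the directory roles described above concrete, here is a minimal Python sketch. It is not TiFlash code: the function name `describe_paths` and the role labels are hypothetical, and the logic only restates the behavior described in the removed text for versions earlier than v4.0.9.

```python
def describe_paths(path: str, path_realtime_mode: bool = False) -> dict[str, str]:
    """Map each directory in a comma-separated `path` value to its role.

    Hypothetical helper: it mirrors the prose description only and does not
    read or validate a real TiFlash configuration.
    """
    dirs = [d.strip() for d in path.split(",") if d.strip()]
    roles = {}
    for index, directory in enumerate(dirs):
        if index == 0:
            # The first directory always receives the latest data.
            roles[directory] = (
                "latest data only" if path_realtime_mode else "latest data + balanced data"
            )
        else:
            roles[directory] = (
                "older data (balanced)" if path_realtime_mode else "balanced data"
            )
    return roles


roles = describe_paths(
    "/nvme_ssd_a/data/tiflash,/sata_ssd_b/data/tiflash,/sata_ssd_c/data/tiflash",
    path_realtime_mode=True,
)
for directory, role in roles.items():
    print(f"{directory}: {role}")
```

With `path_realtime_mode=True`, only the first (fastest) directory is marked as holding the latest data, which is why the text recommends sizing it at about 10% of the total capacity.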

#### TiDB cluster versions v4.0.9 and later
Contributor Author

@JaySon-Huang JaySon-Huang Apr 23, 2025


v4.0.x reached EOL on 2024-04-02.
Users are also unlikely to deploy a TiDB cluster earlier than v4.0.9 when reading the latest version of the docs.

@ti-chi-bot

ti-chi-bot bot commented Apr 23, 2025

@Lloyd-Pottiger: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Member

@CalvinNeo CalvinNeo left a comment


lgtm

@ti-chi-bot

ti-chi-bot bot commented Apr 23, 2025

@CalvinNeo: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 26, 2025
@JaySon-Huang JaySon-Huang changed the title Refine some tiflash docs Refine tiflash FAQ and configuration docs Apr 26, 2025
Comment on lines -248 to -262
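## TiFlash data replication gets stuck

If data replication to TiFlash starts normally but all or some of the data stops replicating after a period of time, you can confirm or resolve the issue by taking the following steps:

1. Check the disk space.

Check whether the disk usage ratio is higher than the value of `low-space-ratio` (0.8 by default. When the space usage of a node exceeds 80%, PD avoids migrating data to this node as much as possible to prevent the disk space from being exhausted).

- If the disk usage is greater than or equal to `low-space-ratio`, the disk space is insufficient. In this case, delete unnecessary files, such as the `space_placeholder_file` file under the `${data}/flash/` directory (if necessary, set `reserve-space` to 0MB after deleting the file).
- If the disk usage is less than `low-space-ratio`, the disk space is sufficient. Go to the next step.

2. Check whether there is any `down peer` (a `down peer` that is not fully cleaned up might cause replication to get stuck).

- Run the `pd-ctl region check-down-peer` command to check whether there is any `down peer`.
- If a `down peer` exists, run the `pd-ctl operator add remove-peer <region-id> <tiflash-store-id>` command to remove it.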
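
Step 1 above amounts to comparing a disk usage ratio with `low-space-ratio`. A minimal Python sketch of that comparison follows. The function names `is_low_space` and `check_directory` and the example path are hypothetical, and the 0.8 default is taken from the text. PD's own accounting of store usage may differ from what the operating system reports.

```python
import shutil


def is_low_space(used_bytes: int, total_bytes: int, low_space_ratio: float = 0.8) -> bool:
    """Return True when the usage ratio is greater than or equal to low_space_ratio."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= low_space_ratio


def check_directory(data_dir: str, low_space_ratio: float = 0.8) -> bool:
    """Apply the same check to the filesystem that holds data_dir."""
    usage = shutil.disk_usage(data_dir)
    return is_low_space(usage.used, usage.total, low_space_ratio)


if __name__ == "__main__":
    # "/" is a placeholder; point this at the TiFlash data directory instead.
    print("low space:", check_directory("/"))
```

If the check returns `True`, the text suggests freeing space first; otherwise continue with the `down peer` check in step 2.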
Contributor Author


This part is merged into the "TiFlash replica is always unavailable" section.


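If you encounter a problem that the above methods cannot resolve, you can package the TiFlash log folder and ask a question in the [AskTUG](http://asktug.com) community.

## TiFlash replica is always unavailable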
Contributor Author


This section is moved to before the "Data is not replicated to TiFlash" section.

Member

@CalvinNeo CalvinNeo left a comment


lgtm

@ti-chi-bot

ti-chi-bot bot commented Apr 28, 2025

@CalvinNeo: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@hfxsd hfxsd self-assigned this May 9, 2025
@hfxsd hfxsd added translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR. needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. and removed missing-translation-status This PR does not have translation status info. labels May 9, 2025
@hfxsd hfxsd self-requested a review May 9, 2025 08:07
@hfxsd
Collaborator

hfxsd commented May 12, 2025

/bot-review

@github-actions

✅ AI review completed, 21 comments generated.

hfxsd and others added 3 commits June 13, 2025 12:57
Signed-off-by: JaySon-Huang <tshent@qq.com>
@JaySon-Huang JaySon-Huang force-pushed the refine_tiflash_docs branch from 905063b to 4a65996 Compare June 13, 2025 04:57
@ti-chi-bot ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 13, 2025
Co-authored-by: xixirangrang <hfxsd@hotmail.com>
@Oreoxmt Oreoxmt self-requested a review June 30, 2025 02:47
JaySon-Huang and others added 2 commits July 4, 2025 07:57
Co-authored-by: Aolin <aolinz@outlook.com>
Co-authored-by: Aolin <aolinz@outlook.com>
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jul 4, 2025
@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jul 4, 2025
@ti-chi-bot

ti-chi-bot bot commented Jul 4, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-07-04 02:27:31.523906074 +0000 UTC m=+1621104.247085057: ☑️ agreed by Oreoxmt.
  • 2025-07-04 03:32:16.401172951 +0000 UTC m=+1624989.124351934: ☑️ agreed by hfxsd.

@hfxsd
Collaborator

hfxsd commented Jul 4, 2025

/approve

@ti-chi-bot

ti-chi-bot bot commented Jul 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hfxsd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Jul 4, 2025
@ti-chi-bot ti-chi-bot bot merged commit c615f6a into pingcap:master Jul 4, 2025
7 checks passed
ti-chi-bot pushed a commit to ti-chi-bot/docs-cn that referenced this pull request Jul 4, 2025
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #20617.
But this PR has conflicts, please resolve them!


Labels

approved area/develop This PR relates to the area of TiDB App development. lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR.

6 participants