Commit 8e60110

feat: update docs content

1 parent ecf3c80 commit 8e60110

File tree: 6 files changed, +58 −56 lines


docs/03-getting-started/02-install-lower-layer-system/02-install-kubernetes.md
Lines changed: 1 addition & 1 deletion

````diff
@@ -40,7 +40,7 @@ sudo apt-mark hold kubelet kubeadm kubectl
 
 ## Start Kubernetes cluster (cloud)
 
-This step may run into [Question 1: kube-proxy report iptables problems](/docs/getting-started/install-lower-layer-system/faqs#question-1-kube-proxy-report-iptables-problems), [Question 2: calico and coredns are always in initializing state](/docs/getting-started/install-lower-layer-system/faqs#question-2-calico-and-coredns-are-always-in-initializing-state) and [Question 3: metrics-server keeps unsuccessful state](/docs/getting-started/install-lower-layer-system/faqs#question-3metrics-server-keeps-unsuccessful-state).
+This step may run into [Question 1: kube-proxy report iptables problems](/docs/getting-started/install-lower-layer-system/faqs#question-1-kube-proxy-report-iptables-problems), [Question 2: calico and coredns are always in initializing state](/docs/getting-started/install-lower-layer-system/faqs#question-2-calico-and-coredns-are-always-in-initializing-state) and [Question 3: metrics-server keeps unsuccessful state](/docs/getting-started/install-lower-layer-system/faqs#question-3-metrics-server-keeps-unsuccessful-state).
 
 ### Reset environment (Skip on first installation)
````
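The anchor fixes in this commit all follow from how the docs generator slugifies headings: a heading like `Question 3: metrics-server ...` only gets a hyphen between `3` and `metrics-server` when the colon is followed by a space. A rough sketch of that rule (an approximation of GitHub-style sluggers, not the exact Docusaurus implementation):

```python
import re

def slugify(heading: str) -> str:
    """Approximate GitHub-style heading-to-anchor slug:
    lowercase, drop punctuation, turn spaces into hyphens."""
    s = heading.lower()
    s = re.sub(r"[^a-z0-9 \-]", "", s)  # drops ':' and other punctuation outright
    return s.replace(" ", "-")

# Without a space after the colon, '3' and 'metrics-server' fuse in the slug:
print(slugify("Question 3:metrics-server keeps unsuccessful state"))
# → question-3metrics-server-keeps-unsuccessful-state
# With the space, the corrected anchor used by this commit comes out:
print(slugify("Question 3: metrics-server keeps unsuccessful state"))
# → question-3-metrics-server-keeps-unsuccessful-state
```

This is why the commit both adds the space in the FAQ headings and rewrites every cross-reference anchor to match.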
docs/03-getting-started/02-install-lower-layer-system/03-install-kubeedge.md
Lines changed: 3 additions & 3 deletions

````diff
@@ -27,7 +27,7 @@ cp keadm-v1.9.2-linux-amd64/keadm/keadm /usr/local/bin/
 
 ## Start KubeEdge (cloud)
 
-This step may run into [Question 4: 10002 already in use](/docs/getting-started/install-lower-layer-system/faqs#question-410002-already-in-use).
+This step may run into [Question 4: 10002 already in use](/docs/getting-started/install-lower-layer-system/faqs#question-4-10002-already-in-use).
 
 ### Reset environment (Skip on first installation)
 
@@ -90,7 +90,7 @@ Use `journalctl -u cloudcore.service -xe` to check if cloudcore is running normally
 
 ## Join KubeEdge cluster (edge)
 
-This step may run into [Question 5: edgecore file exists](/docs/getting-started/install-lower-layer-system/faqs#question-5edgecore-file-exists).
+This step may run into [Question 5: edgecore file exists](/docs/getting-started/install-lower-layer-system/faqs#question-5-edgecore-file-exists).
 
 If `keadm join` fails, you can retry from `keadm reset`.
 
@@ -121,7 +121,7 @@ Execute on the edge:
 keadm join --cloudcore-ipport=114.212.81.11:10000 --kubeedge-version=1.9.2 --token=9e1832528ae701aba2c4f7dfb49183ab2487e874c8090e68c19c95880cd93b50.eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3MTk1NjU4MzF9.1B4su4QwvQy_ZCPs-PIyDT9ixsDozfN1oG4vX59tKDs
 ```
 
-This step may run into [Question 25: keadm join error on edge nodes](/docs/getting-started/install-lower-layer-system/faqs#question-25keadm-join-error-on-edge-nodes).
+This step may run into [Question 25: keadm join error on edge nodes](/docs/getting-started/install-lower-layer-system/faqs#question-25-keadm-join-error-on-edge-nodes).
 
 ```bash
 # Check for logs
````

docs/03-getting-started/02-install-lower-layer-system/04-install-edgemesh.md
Lines changed: 1 addition & 1 deletion

````diff
@@ -8,7 +8,7 @@ custom_edit_url: null
 
 ## EdgeMesh environment preparation (cloud)
 
-This step may run into [Question 6: TLSStreamPrivateKeyFile not exist](/docs/getting-started/install-lower-layer-system/faqs#question-6tlsstreamprivatekeyfile-not-exist)
+This step may run into [Question 6: TLSStreamPrivateKeyFile not exist](/docs/getting-started/install-lower-layer-system/faqs#question-6-tlsstreamprivatekeyfile-not-exist)
 
 Step 1: Remove the taint on the master node.
 ```bash
````

docs/03-getting-started/02-install-lower-layer-system/08-faqs.md
Lines changed: 49 additions & 48 deletions

````diff
@@ -6,77 +6,74 @@ custom_edit_url: null
 
 # FAQs
 
-[TBD]
+## Question 1: kube-proxy report iptables problems
 
-### Question 1: kube-proxy report iptables problems
-
-```
+```bash
 E0627 09:28:54.054930 1 proxier.go:1598] Failed to execute iptables-restore: exit status 1 (iptables-restore: line 86 failed ) I0627 09:28:54.054962 1 proxier.go:879] Sync failed; retrying in 30s
 ```
 
-Solution:clear iptables directly.
+**Solution:** clear iptables directly.
 
-```
+```bash
 iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
 ```
 
-### Question 2: calico and coredns are always in initializing state
+## Question 2: calico and coredns are always in initializing state
 
 The following message will occur when using `kubectl describe <podname>`, which is roughly related to network and sandbox issues.
-```
+```bash
 Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "7f5b66ebecdfc2c206027a2afcb9d1a58ec5db1a6a10a91d4d60c0079236e401" network for pod "calico-kube-controllers-577f77cb5c-99t8z": networkPlugin cni failed to set up pod "calico-kube-controllers-577f77cb5c-99t8z_kube-system" network: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: i/o timeout, failed to clean up sandbox container "7f5b66ebecdfc2c206027a2afcb9d1a58ec5db1a6a10a91d4d60c0079236e401" network for pod "calico-kube-controllers-577f77cb5c-99t8z": networkPlugin cni failed to teardown pod "calico-kube-controllers-577f77cb5c-99t8z_kube-system" network: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: i/o timeout]
 ```
 
-Reason: Such a problem occurs when a k8s cluster is initialized more than once
+**Reason:** Such a problem occurs when a k8s cluster is initialized more than once
 and the network configuration of k8s was not deleted previously.
 
-Solution:
+**Solution:**
 ```bash
 # delete k8s network configuration
 rm -rf /etc/cni/net.d/
 
 # reinitialize k8s with the instruction
 ```
 
-### Question 3:metrics-server keeps unsuccessful state
+## Question 3: metrics-server keeps unsuccessful state
 
-Reason:master node does not add taint.
+**Reason:** master node does not add taint.
 
-Solution:
+**Solution:**
 ```bash
 # add taint on master node
 kubectl taint nodes --all node-role.kubernetes.io/master node-role.kubernetes.io/master
 ```
 
-### Question 4:10002 already in use
+## Question 4: 10002 already in use
 
 Error message 'xxx already in use' occurs when using `journalctl -u cloudcore.service -xe`.
 
-Reason:The previous processes were not cleaned up.
+**Reason:** The previous processes were not cleaned up.
 
-Solution: Find the process occupying the port and directly kill it.
+**Solution:** Find the process occupying the port and directly kill it.
 ```bash
 lsof -i:xxxx
 kill xxxxx
 ```
 
-### Question 5:edgecore file exists
+## Question 5: edgecore file exists
 
 When attempting to create a symbolic link in installing edgecore, the target path already exists and cannot be created.
-```
-execute keadm command failed: failed to exec 'bash -c sudo ln /etc/kubeedge/edgecore.service /etc/systemd/system/edgecore.service && sudo systemctl daemon-reload && sudo systemctl enable edgecore && sudo systemctl start edgecore', err: ln: failed to create hard link '/etc/systemd/system/edgecore.service': File exists
-, err: exit status 1
+```bash
+execute keadm command failed: failed to exec 'bash -c sudo ln /etc/kubeedge/edgecore.service /etc/systemd/system/edgecore.service && sudo systemctl daemon-reload && sudo systemctl enable edgecore && sudo systemctl start edgecore', err: ln: failed to create hard link '/etc/systemd/system/edgecore.service': File exists, err: exit status 1
 ```
 
-Reason: `edgecore.service` already exists in the `/etc/systemd/system/` directory
+**Reason:** `edgecore.service` already exists in the `/etc/systemd/system/` directory
 if edgecore is installed more than once.
 
-Solution: Just delete it.
+**Solution:** Just delete it.
 ```bash
 sudo rm /etc/systemd/system/edgecore.service
 ```
 
-### Question 6:TLSStreamPrivateKeyFile not exist
+## Question 6: TLSStreamPrivateKeyFile not exist
 
 ```bash
 TLSStreamPrivateKeyFile: Invalid value: "/etc/kubeedge/certs/stream.key": TLSStreamPrivateKeyFile not exist
````
````diff
@@ -86,21 +83,25 @@ sudo rm /etc/systemd/system/edgecore.service
 
 ```
 
-Solution: Check whether directory `/etc/kubeedge` has file `certgen.sh` and run `bash certgen.sh stream`.
+**Solution:** Check whether directory `/etc/kubeedge` has file `certgen.sh` and run `bash certgen.sh stream`.
 
-### Question 7:edgemesh log shows edge-edge interconnection succeeds but cloud-edge cannot connect
+## Question 7: edgemesh log shows edge-edge interconnection succeeds but cloud-edge cannot connect
 
-#### Troubleshooting
+**Troubleshooting:**
 
-First review the **location model** and determine whether the edgemesh-agent container on the **visited node** exists and is running normally.
+First, based on the **location model**, check whether the edgemesh-agent container exists on the **visited node** and whether it is operating normally.
 
-**This situation occurs very often**, because the master node usually has taints that evict other pods, so edgemesh-agent cannot be deployed there; it can be fixed by removing the node taints so that edgemesh-agent gets deployed.
+![Q7-2](/img/FAQs/Q7-2.png)
 
-If the edgemesh-agent on both the visiting node and the visited node started normally but this error is still reported, the two nodes may not have discovered each other; troubleshoot as follows:
+This situation is common since the master node usually has taints, which will evict other pods, thereby causing the deployment failure of edgemesh-agent.
+This issue can be resolved by **removing the node taints**.
+
+If both the visiting node's and the visited node's edgemesh-agent have started normally while this error is still reported,
+it may be due to unsuccessful discovery between the visiting node and the visited node. Please troubleshoot in this way:
 
 1. First, the edgemesh-agent on each node has a peer ID, for example
 
-```bash
+```
 edge2:
 I'm {12D3KooWPpY4GqqNF3sLC397fMz5ZZfxmtMTNa1gLYFopWbHxZDt: [/ip4/127.0.0.1/tcp/20006 /ip4/192.168.1.4/tcp/20006]}
````
````diff
@@ -120,7 +121,7 @@ Solution: when deploying edgemesh and performing `kubectl apply -f build/agent/resources/`
 
 ![Q7](/img/FAQs/Q7.png)
 
-### Question 8: the master's GPU exists but no GPU resources are found
+## Question 8: the master's GPU exists but no GPU resources are found
 
 This mainly concerns the server case; use `nvidia-smi` to check the graphics card.
 
@@ -150,7 +151,7 @@ Solution: when deploying edgemesh and performing `kubectl apply -f build/agent/resources/`
 
 ```
 
-### Question 9: the Jetson's GPU exists but no GPU resources are found
+## Question 9: the Jetson's GPU exists but no GPU resources are found
 
 In theory `k8s-device-plugin` already supports tegra, i.e. the Jetson series boards: before querying the GPU it checks whether the architecture is tegra and, if so, queries the GPU the tegra way (the reason was quoted in [[#GPU 支持]]), yet strangely the tegra architecture is not detected:
 
````
````diff
@@ -197,7 +198,7 @@ sudo apt-get update
 sudo apt-get install -y nvidia-container-toolkit
 ```
 
-### Question 10: lc127.0.0.53:53 no such host/connection refused
+## Question 10: lc127.0.0.53:53 no such host/connection refused
 
 During the Sedna installation stage, an error occurs in the logs: `lc127.0.0.53:53 no such host/connection refused`.
 
@@ -226,7 +227,7 @@ Solution:
 1. If the Sedna install script removed hostNetwork, check the clusterDNS section of edgecore.yaml, **paying particular attention to whether it was set a second time and then overwritten by the later setting**
 2. If hostNetwork was not removed, add `nameserver 169.254.96.16` to the host's `/etc/resolv.conf`
 
-### Question 11: 169.254.96.16: no such host
+## Question 11: 169.254.96.16: no such host
 
 Check whether the edgemesh configuration is correct:
 
````
````diff
@@ -235,7 +236,7 @@ Solution:
 
 
 
-### Question 12: `kubectl logs <pod-name>` times out
+## Question 12: `kubectl logs <pod-name>` times out
 
 ![Q12-1](/img/FAQs/Q12-1.png)
 
@@ -249,14 +250,14 @@ Solution:
 
 
 
-### Question 13: `kubectl logs <pod-name>` hangs
+## Question 13: `kubectl logs <pod-name>` hangs
 
 Possible cause: a previous `kubectl logs` was interrupted with ctrl+c before it finished, leaving later calls stuck
 Fix: restart edgecore/cloudcore: `systemctl restart edgecore.service`
 
 
 
-### Question 14: CloudCore reports a certificate error
+## Question 14: CloudCore reports a certificate error
 
 ![Q14](/img/FAQs/Q14.png)
 
````
````diff
@@ -266,7 +267,7 @@ Solution:
 
 
 
-### Question 15: deleting a namespace gets stuck in terminating
+## Question 15: deleting a namespace gets stuck in terminating
 
 In theory just waiting should work (but I waited half an hour without success!)
 **Method 1**, which did not really help; it stayed stuck
@@ -438,7 +439,7 @@ guest@cloud:~/yby$ cat sedna.json
 
 ```
 
-### Question 16: deployment fails after force-deleting a pod
+## Question 16: deployment fails after force-deleting a pod
 
 #### Problem description
 
````
````diff
@@ -449,7 +450,7 @@ guest@cloud:~/yby$ cat sedna.json
 
 Because --force does not actually stop the running workload, the original docker container may still be running. The current practice is to manually delete the corresponding containers on the edge node (including pause; on pause, see [大白话 K8S(03):从 Pause 容器理解 Pod 的本质 - 知乎 (zhihu.com)](https://zhuanlan.zhihu.com/p/464712164)), then restart edgecore: `systemctl restart edgecore.service`
 
-### Question 17: after deleting the deployment, pod, etc., the container still restarts automatically
+## Question 17: after deleting the deployment, pod, etc., the container still restarts automatically
 
 ```
 journalctl -u edgecore.service -f
@@ -463,7 +464,7 @@ guest@cloud:~/yby$ cat sedna.json
 systemctl restart edgecore.service
 ```
 
-### Question 18: large-scale Evicted (disk pressure)
+## Question 18: large-scale Evicted (disk pressure)
 
 #### Cause
 
````
````diff
@@ -511,7 +512,7 @@ systemctl restart kubelet
 
 You will find that deployment works normally again (this is only a stopgap; the disk space still needs to be cleaned up)
 
-### Question 19: the system does not support the --dport option when running an iptables command.
+## Question 19: the system does not support the --dport option when running an iptables command.
 
 #### Problem description
 
@@ -525,7 +526,7 @@ systemctl restart kubelet
 
 At this point, `sudo update-alternatives --config iptables` can switch the version; the command offers 3 choices, of which number 1 is the legacy version (sudo is required for the switch to succeed). After switching, run `iptables -t nat -A OUTPUT -p tcp --dport 10351 -j DNAT --to $CLOUDCOREIPS:10003` as root; ideally it produces no output.
 
-### Question 20: after running keadm join, journalctl reports a token format error.
+## Question 20: after running keadm join, journalctl reports a token format error.
 
 #### Problem description
 
````
````diff
@@ -535,7 +536,7 @@ systemctl restart kubelet
 
 Either the token changed after cloudcore.service was restarted, making the token used in keadm join stale, or the wrong token was entered when running keadm join. In that case, first obtain the correct token again on the cloud side, then on the edge side redo the whole sequence starting from keadm reset.
 
-### Question 21: after restarting edgecore.service, journalctl reports a mapping error
+## Question 21: after restarting edgecore.service, journalctl reports a mapping error
 
 #### Problem description
 
@@ -545,7 +546,7 @@ systemctl restart kubelet
 
 Check whether the format of the /etc/kubeedge/config/edgecore.yaml file is wrong. YAML files cannot be indented with tabs; spaces must be used.
 
-### Question 22: after restarting edgecore.service, journalctl reports connection refused
+## Question 22: after restarting edgecore.service, journalctl reports connection refused
 
 #### Problem description
 
````
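The mapping error in Question 21 above usually comes down to tab indentation in `edgecore.yaml`, which YAML forbids. A quick way to spot the offending lines before restarting edgecore (a sketch with a hypothetical helper, not part of KubeEdge tooling):

```python
def tab_lines(text: str) -> list[int]:
    """Return 1-based line numbers that contain a tab character.
    YAML parsers reject tabs used as indentation, so any hit is suspect."""
    return [i for i, line in enumerate(text.splitlines(), start=1) if "\t" in line]

# A fragment shaped like edgecore.yaml; the third line is tab-indented.
sample = "modules:\n  edged:\n\tenable: true\n"
print(tab_lines(sample))  # the offending line numbers
```

In practice you would read `/etc/kubeedge/config/edgecore.yaml` into `text`, replace each reported line's tabs with spaces, then restart edgecore.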
````diff
@@ -559,7 +560,7 @@ systemctl restart kubelet
 lsof -i:xxxx
 kill xxxxx
 
-### Question 23: Shutting down problems when deploying metrics-service
+## Question 23: Shutting down problems when deploying metrics-service
 
 #### Problem description
 
@@ -571,7 +572,7 @@ kill xxxxx
 When deploying kubeedge, the port exposed in the metrics-service parameters is automatically overridden to port 10250, consistent with
 the port where the actual service later in components.yaml runs. Alternatively, manually change the port in the parameters to 10250.
 
-### Question 24: 169.254.96.16:53: i/o timeout
+## Question 24: 169.254.96.16:53: i/o timeout
 
 #### Problem description
 
````
````diff
@@ -597,7 +598,7 @@ client tries to connect global manager(address: gm.sedna:9000) failed, error: di
 
 Cause: docker is not reachable from inside China and the newly joined edge had not been configured accordingly, so the kubeedge/edgemesh-agent image could not be pulled. Configure it, then restart docker and edgecore.
 
 
-### Question 25:keadm join error on edge nodes
+## Question 25: keadm join error on edge nodes
 
 Execution of `keadm join` on edges reported errors.
````
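Several of the FAQs above (Questions 4 and 22) come down to a stale process still holding a port, diagnosed with `lsof -i` and fixed with `kill`. The failure itself is easy to reproduce: binding a second listener to an occupied port raises the same "address already in use" error that cloudcore hits on 10002. A small sketch (the port is picked by the OS here, not KubeEdge's 10002):

```python
import errno
import socket

# First listener grabs a free port, the way a running cloudcore holds 10002.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
first.listen(1)
port = first.getsockname()[1]

# A second bind to the same port fails the way a restarted cloudcore does.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
except OSError as e:
    print(e.errno == errno.EADDRINUSE)  # the "already in use" condition
finally:
    second.close()
    first.close()  # the fix: stop the process that holds the port
```

Closing (killing) the first listener is exactly what `lsof -i:<port>` followed by `kill <pid>` accomplishes for cloudcore.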

src/pages/index.js
Lines changed: 1 addition & 1 deletion

````diff
@@ -22,7 +22,7 @@ function HomepageHeader() {
 dark: require('@site/static/img/dayu-trans.png').default,
 }}>
 </ThemedImage>
-<p className="hero__subtitle">
+<p className={clsx('hero__subtitle', styles.heroSubtitle)}>
 <Translate>
 {/*{siteConfig.tagline}*/}
 Provide infrastructure for cloud-edge collaborative stream data analysis.
````

src/pages/index.module.css
Lines changed: 3 additions & 2 deletions

````diff
@@ -8,8 +8,9 @@
 }
 /* var(--hero-border-color) */
 .heroSubtitle {
-  /* color: var(--ifm-font-color-base); */
-  color: var(--ifm-font-color-base-inverse);
+  /*color: var(--ifm-font-color-base);*/
+  color: var(--ifm-color-primary);
+  /*color: var(--ifm-font-color-base-inverse);*/
   font-size: 1.5rem;
 }
 
````
