Commit a30bda8

Merge pull request #393 from Well2333/captcha
Solve screenshot mobile captcha
2 parents aa10b7b + ed3fcdc commit a30bda8

10 files changed

Lines changed: 149 additions & 22 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ HarukaBot targets different push scenarios (fan groups, entertainment groups, live-notification groups
 - [go-cqhttp](https://github.com/Mrs4s/go-cqhttp): a stable and complete CQHTTP implementation.
 - [bilibili-API-collect](https://github.com/SocialSisterYi/bilibili-API-collect): very detailed Bilibili API documentation.
 - [bilibili_api](https://github.com/Passkou/bilibili_api): a Bilibili API library implemented in Python.
-- [HarukaBot_Guild_Patch](https://github.com/17TheWord/HarukaBot_Guild_Patch) a patch that lets HarukaBot work in guilds (channels). (Merged into HarukaBot)
+- [HarukaBot_Guild_Patch](https://github.com/17TheWord/HarukaBot_Guild_Patch)a patch that lets HarukaBot work in guilds (channels). (Merged into HarukaBot)
 
 ## Support and Contributing

Binary file (362 KB), preview not shown

docs/faq.md

Lines changed: 2 additions & 4 deletions
@@ -69,12 +69,10 @@ UID is not the live room ID!
 ## Incomplete Playwright dependencies
 
 ::: tip
-On Linux, Playwright needs additional dependencies installed before it can run Chromium. Playwright currently supports only Ubuntu officially, so running HarukaBot on Ubuntu is **strongly recommended**. If dependency installation fails on a non-Ubuntu system, look for a solution in [Playwright Issues](https://github.com/microsoft/playwright/issues)!
+On Linux, Playwright needs additional dependencies installed before it can run Chromium. Playwright currently [officially supports](https://github.com/microsoft/playwright/blob/main/packages/playwright-core/src/server/registry/nativeDeps.ts) three Ubuntu LTS versions (18.04, 20.04, 22.04) and one Debian version (11), so running HarukaBot is **recommended only** on those four releases. If dependency installation fails on any other distribution, look for a solution in [Playwright Issues](https://github.com/microsoft/playwright/issues)!
 :::
 
-Ubuntu: `playwright install-deps`
-
-CentOS (for reference only): `yum install -y atk at-spi2-atk cups-libs libxkbcommon libXcomposite libXdamage libXrandr mesa-libgbm gtk3`
+Command: `playwright install-deps`
 
 ## pytz.exceptions.UnknownTimeZoneError: 'Can not find timezone ' appears at startup

docs/level-0/ch02.md

Lines changed: 8 additions & 8 deletions
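@@ -23,9 +23,15 @@
 
 :::
 
-After completing Alibaba Cloud [account registration](https://help.aliyun.com/knowledge_detail/37195.html) and [real-name verification](https://help.aliyun.com/document_detail/48263.html), open Alibaba Cloud's [Developer Growth Plan](https://developer.aliyun.com/plan/grow-up) and choose to buy a Simple Application Server (轻量应用服务器). Pick one of **Beijing, Shanghai, Guangzhou, or Shenzhen** as the region, set the image type to **System Image**, and select `Windows 2012 R2` as the system image. Choose a suitable purchase duration, then click Pay Now to complete the purchase. The full process is shown below.
+::: warning Note
+
+For the system image selected below, the Windows version must be no lower than `2016 数据中心` (2016 Datacenter); HarukaBot no longer supports Windows versions older than this.
+
+:::
+
+After completing Alibaba Cloud [account registration](https://help.aliyun.com/knowledge_detail/37195.html) and [real-name verification](https://help.aliyun.com/document_detail/48263.html), open Alibaba Cloud's [Developer Growth Plan](https://developer.aliyun.com/plan/grow-up) and choose to buy a Simple Application Server (轻量应用服务器). Pick one of **Beijing, Shanghai, Guangzhou, or Shenzhen** as the region, set the image type to **System Image**, and select `Windows 2016 数据中心版` (2016 Datacenter Edition) as the system image. Choose a suitable purchase duration, then click Pay Now to complete the purchase. The full process is shown below.
 
-::: details Why I chose the lightweight service over ECS
+::: details Why I chose a Simple Application Server over ECS
 
 The only reason for choosing a Simple Application Server over ECS is that it is easier for beginners to configure.
 
@@ -49,12 +55,6 @@
 2. [VSCode](https://code.visualstudio.com/sha/download?build=stable&os=win32-x64-user)
 3. [go-cqhttp](https://github.com/Mrs4s/go-cqhttp/releases/latest/download/go-cqhttp_windows_amd64.exe)
 
-::: details go-cqhttp downloads too slowly, or GitHub is blocked and you cannot download it?
-
-[go-cqhttp via the fastgit mirror](https://download.fastgit.org/Mrs4s/go-cqhttp/releases/latest/download/go-cqhttp_windows_amd64.exe)
-
-:::
-
 ## 2.3 Your progress
 
 If you have all of the materials above ready, you now hold the key to the door of a new world. So what are you waiting for? Let's move on to the next chapter and walk through that door!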

docs/level-0/ch03.md

Lines changed: 2 additions & 2 deletions
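@@ -15,7 +15,7 @@
 
 1. First, return to the Alibaba Cloud [home page](https://www.aliyun.com/) and click 控制台 (Console) in the top-right corner
 2. Under `保有资源的云产品` (cloud products with resources), select the `轻量应用服务器` (Simple Application Server) you just activated
-3. In the server list, select the `Windows 2012 R2` server you just activated
+3. In the server list, select the `Windows 2016 数据中心版` (2016 Datacenter Edition) server you just activated
 4. Once inside, select the `远程连接` (Remote Connection) option in the `服务器运维` (Server O&M) drop-down on the left
 5. Under `2.通过远程桌面工具连接` (2. Connect via a remote desktop tool), click `重置服务器密码` (Reset Server Password) to set the connection password used by **mstsc**
 6. After going through the password-setting steps, select `是的,请立即重启服务器` (Yes, restart the server now) and note the **IP address and account** shown on this page
@@ -29,7 +29,7 @@
 
 1. First, return to the Alibaba Cloud [home page](https://www.aliyun.com/) and click 控制台 (Console) in the top-right corner![](/ch03-1.jpg)
 2. Under `保有资源的云产品` (cloud products with resources), select the `轻量应用服务器` (Simple Application Server) you just activated![](/ch03-2.jpg)
-3. In the server list, select the `Windows 2012 R2` server you just activated![](/ch03-3.jpg)
+3. In the server list, select the `Windows 2016 数据中心版` (2016 Datacenter Edition) server you just activated![](/ch03-3.jpg)
 4. Once inside, select the `远程连接` (Remote Connection) option in the `服务器运维` (Server O&M) drop-down on the left![](/ch03-4.jpg)
 5. Under `2.通过远程桌面工具连接` (2. Connect via a remote desktop tool), click `重置服务器密码` (Reset Server Password) to set the connection password used by **mstsc**![](/ch03-5.jpg)
 6. After going through the password-setting steps, select `是的,请立即重启服务器` (Yes, restart the server now) and note the **IP address and account** shown on this page![](/ch03-6.jpg)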

docs/usage/settings.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ HARUKA_DIR="./data/"
 Whether a command in a group must @-mention the bot first. Set to `False` to trigger commands directly.
 
 ```json
-Haruka_TO_ME=False
+HARUKA_TO_ME=False
 ```
 
 ## HARUKA_LIVE_OFF_NOTIFY

haruka_bot/config.py

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ class Config(BaseSettings):
     haruka_dynamic_interval: int = 0
     haruka_dynamic_at: bool = False
     haruka_screenshot_style: str = "mobile"
+    haruka_captcha_address: str = "https://captcha-cd.ngworks.cn"
     haruka_dynamic_timeout: int = 30
     haruka_dynamic_font_source: str = "system"
     haruka_dynamic_font: Optional[str] = "Noto Sans CJK SC"
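
The new `haruka_captcha_address` setting is turned into a service base URL in `captcha.py`, in the form `{scheme}://{host}:{port}/captcha/select`. A minimal sketch of that derivation follows, assuming the standard library's `urllib.parse` in place of `yarl` (which the commit actually uses) and a hypothetical helper name, `captcha_base_url`. Unlike `yarl`, `urlsplit` does not fall back to a scheme's default port, so the sketch supplies one explicitly.

```python
from urllib.parse import urlsplit

# Default ports for the schemes this sketch handles (assumption: http/https only).
DEFAULT_PORTS = {"http": 80, "https": 443}


def captcha_base_url(address: str) -> str:
    """Build '<scheme>://<host>:<port>/captcha/select' from a configured address.

    Hypothetical helper for illustration; it is not part of HarukaBot.
    """
    parts = urlsplit(address)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"unsupported captcha address: {address!r}")
    # Use the explicit port when present, otherwise the scheme default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"{parts.scheme}://{parts.hostname}:{port}/captcha/select"


print(captcha_base_url("https://captcha-cd.ngworks.cn"))
# → https://captcha-cd.ngworks.cn:443/captcha/select
```

An address with an explicit port, such as `http://localhost:8080` (a made-up value), keeps that port in the result.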

haruka_bot/utils/browser.py

Lines changed: 9 additions & 5 deletions
@@ -11,6 +11,7 @@
 
 from ..config import plugin_config
 from .fonts_provider import fill_font
+from .captcha import resolve_captcha
 
 _browser: Optional[Browser] = None
 mobile_js = Path(__file__).parent.joinpath("mobile.js")
@@ -56,11 +57,14 @@ async def get_dynamic_screenshot_mobile(dynamic_id):
     )
     try:
         await page.route(re.compile("^https://static.graiax/fonts/(.+)$"), fill_font)
-        await page.goto(
-            url,
-            wait_until="networkidle",
-            timeout=plugin_config.haruka_dynamic_timeout * 1000,
-        )
+        if plugin_config.haruka_captcha_address:
+            page = await resolve_captcha(url, page)
+        else:
+            await page.goto(
+                url,
+                wait_until="networkidle",
+                timeout=plugin_config.haruka_dynamic_timeout * 1000,
+            )
         # The dynamic has been deleted or is under review
         if page.url == "https://m.bilibili.com/404":
             return None
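
The hunk above makes captcha handling opt-in: a non-empty `haruka_captcha_address` routes navigation through `resolve_captcha`, while an empty string keeps the plain `page.goto` path. A minimal sketch of that dispatch follows. `FakePage`, `fake_resolve_captcha`, `open_dynamic`, and the example URL are all hypothetical stand-ins that only record which path ran; no browser is involved.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a Playwright page; it only records calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def goto(self, url: str, **kwargs) -> None:
        self.calls.append(f"goto:{url}")


async def fake_resolve_captcha(url: str, page: FakePage) -> FakePage:
    """Hypothetical stand-in for resolve_captcha, which also navigates itself."""
    page.calls.append(f"resolve_captcha:{url}")
    return page


async def open_dynamic(url: str, page: FakePage, captcha_address: str, timeout_s: int = 30) -> FakePage:
    # A non-empty address enables the captcha-aware path; "" disables it.
    if captcha_address:
        return await fake_resolve_captcha(url, page)
    await page.goto(url, wait_until="networkidle", timeout=timeout_s * 1000)
    return page


url = "https://m.bilibili.com/dynamic/1"  # placeholder URL for the sketch
with_captcha = asyncio.run(open_dynamic(url, FakePage(), "https://captcha-cd.ngworks.cn"))
without_captcha = asyncio.run(open_dynamic(url, FakePage(), ""))
print(with_captcha.calls)
print(without_captcha.calls)
```

Under this reading, configuring an empty `haruka_captcha_address` would restore the navigation behaviour from before this commit.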

haruka_bot/utils/captcha.py

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
+import contextlib
+from typing import Optional
+
+import httpx
+from nonebot.log import logger
+from playwright._impl._api_structures import Position
+from playwright.async_api import Page, Response, TimeoutError  # Playwright's TimeoutError, not the builtin
+from pydantic import BaseModel
+from yarl import URL
+
+from ..config import plugin_config
+
+
+class CaptchaData(BaseModel):
+    captcha_id: str
+    points: list[list[int]]
+    rectangles: list[list[int]]
+    yolo_data: list[list[int]]
+    time: int
+
+
+class CaptchaResponse(BaseModel):
+    code: int
+    message: str
+    data: Optional[CaptchaData]
+
+
+async def resolve_captcha(url: str, page: Page) -> Page:
+    captcha_image_body = ""
+    last_captcha_id = ""
+    captcha_result = None
+
+    async def captcha_image_url_callback(response: Response):
+        nonlocal captcha_image_body
+        logger.debug(f"[Captcha] Get captcha image url: {response.url}")
+        captcha_image_body = await response.body()
+
+    async def captcha_result_callback(response: Response):
+        nonlocal captcha_result, last_captcha_id
+        logger.debug(f"[Captcha] Get captcha result: {response.url}")
+        captcha_resp = await response.text()
+        logger.debug(f"[Captcha] Result: {captcha_resp}")
+        if '"result": "success"' in captcha_resp:
+            logger.success("[Captcha] Captcha callback verification succeeded")
+            captcha_result = True
+        elif '"result": "click"' in captcha_resp:
+            pass
+        else:
+            if last_captcha_id:
+                logger.warning(f"[Captcha] Captcha callback verification failed, reporting: {last_captcha_id}")
+                async with httpx.AsyncClient() as client:
+                    await client.post(
+                        f"{captcha_baseurl}/report",
+                        json={"captcha_id": last_captcha_id},
+                    )
+                last_captcha_id = ""
+            captcha_result = False
+
+    captcha_address = URL(plugin_config.haruka_captcha_address)
+    page.on(
+        "response",
+        lambda response: captcha_image_url_callback(response)
+        if response.url.startswith("https://static.geetest.com/captcha_v3/")
+        else None,
+    )
+    page.on(
+        "response",
+        lambda response: captcha_result_callback(response)
+        if response.url.startswith("https://api.geetest.com/ajax.php")
+        else None,
+    )
+
+    with contextlib.suppress(TimeoutError):
+        await page.goto(
+            url,
+            wait_until="networkidle",
+            timeout=plugin_config.haruka_dynamic_timeout * 1000,
+        )
+
+    captcha_baseurl = f"{captcha_address.scheme}://{captcha_address.host}:{captcha_address.port}/captcha/select"
+    while captcha_image_body or captcha_result is False:
+        logger.warning("[Captcha] Human verification required, trying to solve the captcha automatically")
+        captcha_image = await page.query_selector(".geetest_item_img")
+        assert captcha_image
+        captcha_size = await captcha_image.bounding_box()
+        assert captcha_size
+        origin_image_size = 344, 384
+
+        async with httpx.AsyncClient() as client:
+            captcha_req = await client.post(
+                f"{captcha_baseurl}/bytes",
+                timeout=10,
+                files={"img_file": captcha_image_body},
+            )
+            captcha_req = CaptchaResponse(**captcha_req.json())
+            logger.debug(f"[Captcha] Get Resolve Result: {captcha_req}")
+            assert captcha_req.data
+            last_captcha_id = captcha_req.data.captcha_id
+        if captcha_req.data:
+            click_points: list[list[int]] = captcha_req.data.points
+            logger.warning(f"[Captcha] Recognized {len(click_points)} coordinates, clicking")
+            # Derive the scale from the source image size and the rendered size, then compute the actual click positions
+            for point in click_points:
+                real_click_points = {
+                    "x": point[0] * captcha_size["width"] / origin_image_size[0],
+                    "y": point[1] * captcha_size["height"] / origin_image_size[1],
+                }
+                await captcha_image.click(position=Position(**real_click_points))
+                await page.wait_for_timeout(800)
+            captcha_image_body = ""
+            await page.click("text=确认")  # "确认" is the Confirm button label on the page
+            geetest_up = await page.wait_for_selector(".geetest_up", state="visible")
+            if not geetest_up:
+                logger.warning("[Captcha] No captcha verification result detected, retrying")
+                continue
+            geetest_result = await geetest_up.text_content()
+            assert geetest_result
+            logger.debug(f"[Captcha] Geetest result: {geetest_result}")
+            if "验证成功" in geetest_result:  # page text meaning "verification succeeded"
+                logger.success("[Captcha] Geetest page tip reports success")
+            else:
+                logger.warning("[Captcha] Geetest verification failed, retrying")
+
+    return page
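
Two small pieces of `resolve_captcha` can be looked at in isolation: how the solver's points, given in the 344×384 source image, are scaled to the rendered element's bounding box, and how the Geetest response text is classified by substring. A minimal sketch follows, assuming hypothetical helper names (`scale_points`, `classify_result`) that do not exist in the commit.

```python
from typing import Optional

ORIGIN_IMAGE_SIZE = (344, 384)  # source image size hard-coded in the commit


def scale_points(points, box_width, box_height, origin=ORIGIN_IMAGE_SIZE):
    """Map points from source-image pixels to positions inside the rendered element."""
    return [
        {"x": x * box_width / origin[0], "y": y * box_height / origin[1]}
        for x, y in points
    ]


def classify_result(body: str) -> Optional[bool]:
    """Mirror the commit's substring checks on the response text.

    True: success. None: the response asks for clicks, and the commit leaves
    its state unchanged in that case. False: anything else, treated as failure.
    """
    if '"result": "success"' in body:
        return True
    if '"result": "click"' in body:
        return None
    return False


# A point at the centre of the source image lands at the centre of a 258x288 box.
print(scale_points([[172, 192]], 258.0, 288.0))
# → [{'x': 129.0, 'y': 144.0}]
print(classify_result('{"result": "success"}'))
# → True
```

Because the match is on the literal substring, including the space after the colon, a compact body such as `{"result":"success"}` would be classified as a failure. If the response is plain JSON, parsing it would be a more robust check, though the commit does not state the response format.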

haruka_bot/version.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 from packaging.version import Version
 
-__version__ = "1.5.4"
+__version__ = "1.6.0"
 VERSION = Version(__version__)
