@sanchezzzhak
Collaborator

@sanchezzzhak commented Dec 30, 2022

For Matomo, you will need to add a new HTTP_X_CLIENT key:
https://github.com/matomo-org/matomo/blob/d1eaaca1b7abdddea15ff8d1d8e2075b6f92c672/core/Http.php#L996-L1012
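For readers unfamiliar with the key name: server-side keys such as HTTP_X_CLIENT follow the CGI convention of uppercasing the request header name, replacing dashes with underscores, and prefixing `HTTP_`. A minimal Python sketch of that mapping, assuming the header on the wire is spelled `X-Client` (the function name is hypothetical and not part of Matomo or this library):

```python
def header_to_server_key(header_name: str) -> str:
    """Map a request header name to its CGI-style server key.

    Illustrative helper only: uppercases the name, replaces '-' with '_',
    and adds the 'HTTP_' prefix.
    """
    return "HTTP_" + header_name.strip().upper().replace("-", "_")


if __name__ == "__main__":
    print(header_to_server_key("X-Client"))  # assumed header spelling
    print(header_to_server_key("From"))
```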

[!] This PR is worth revisiting once we see more bots identifying themselves through the xClient header.

@sanchezzzhak linked an issue Dec 30, 2022 that may be closed by this pull request
@liviuconcioiu
Collaborator

@sanchezzzhak I have something to add:

  1. Can you also include the HTTP_FROM header? According to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/From this header is a must for any crawler. So if it is present, the request should be detected as a bot, for example as a generic bot.
  2. Here are some HTTP_FROM values that should be added, to be detected as specific bots:
googlebot(at)googlebot.com
bingbot(at)microsoft.com
[email protected]
[email protected]
[email protected]
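
The two points above could be sketched roughly as follows. This is a Python illustration only, not the library's implementation: the function names, the lookup table, and the bot labels are assumptions, and only addresses that appear in this thread are used. The `(at)` spelling is normalized to `@` before matching.

```python
from typing import Mapping, Optional

# Hypothetical lookup table: normalized From value -> bot label.
# The addresses come from this thread; the labels are illustrative.
KNOWN_FROM_BOTS = {
    "googlebot@googlebot.com": "Googlebot",
    "bingbot@microsoft.com": "BingBot",
    "gptbot@openai.com": "GPTBot",
}


def normalize_from(value: str) -> str:
    """Trim, lowercase, and turn the '(at)' spelling into '@'."""
    return value.strip().lower().replace("(at)", "@")


def detect_bot_by_from(headers: Mapping[str, str]) -> Optional[str]:
    """Return a bot label based on HTTP_FROM, or None if the header is absent.

    A known address maps to a specific bot; any other non-empty value
    falls back to a generic bot label, as proposed in point 1.
    """
    raw = headers.get("HTTP_FROM")
    if raw is None or not raw.strip():
        return None
    return KNOWN_FROM_BOTS.get(normalize_from(raw), "Generic Bot")
```

Whether a bare HTTP_FROM header is strong enough evidence on its own is the open question here; the fallback branch is easy to drop if the statistics do not support it.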

@sanchezzzhak
Collaborator Author

@liviuconcioiu Perhaps the implementation of HTTP_FROM should be added separately, since this PR is in limbo.
It would be good to have approximate statistics before implementing new features.

@sanchezzzhak
Collaborator Author

ChatGPT supports the HTTP_FROM header:

From: gptbot(at)openai.com
User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)

@liviuconcioiu
Collaborator

Here is a complete list of what I have so far:

bingbot(at)microsoft.com
googlebot(at)googlebot.com
gptbot(at)openai.com
[email protected]
[email protected]
[email protected]
TGVnaXRpbWF0ZSBsaW5rIHRyYWNrZXI=
[email protected]
[email protected]
[email protected]
pigafetta-bot(at)visual-seo.com
[email protected]
"<?=print(9347655345-4954366);?>"
[email protected]
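
Not every value in this list is a mailbox: one entry is Base64 text and another looks like a PHP injection probe. A rough Python sketch of how observed From values might be triaged before any matching is attempted; the function name, the category labels, and the deliberately loose mailbox pattern are assumptions for illustration, not part of the library:

```python
import base64
import binascii
import re

# Loose shape check only (local@domain.tld); not a full address validator.
MAILBOX_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_from_value(value: str) -> str:
    """Sort a raw From value into 'mailbox', 'base64', or 'other'."""
    # Drop surrounding whitespace/quotes and normalize the '(at)' spelling.
    candidate = value.strip().strip('"').replace("(at)", "@")
    if MAILBOX_RE.match(candidate):
        return "mailbox"
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return "other"
    # Only count it as Base64 if it decodes to non-empty printable text.
    return "base64" if decoded and decoded.isprintable() else "other"
```

With this sketch, `pigafetta-bot(at)visual-seo.com` is classified as a mailbox, the `TGVnaXRpbWF0ZSBsaW5rIHRyYWNrZXI=` entry as Base64, and the quoted `<?=print(...)?>` string as other.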

Development

Successfully merging this pull request may close these issues.

Improvements to the ClientHints - HTTP_X_CLIENT