Subject
- #robots.txt
- #GPT
- #OpenAI
- #Crawler
- #ChatGPT
Created: 2024-07-27 23:29
OpenAI (GPT) operates crawler bots.
Since GPT needs to keep collecting data in order to learn and improve, OpenAI crawls the web.
In the early days it reportedly relied on Wikipedia data and news from various media outlets, and it also ran a large number of crawlers, which caused controversy. Now, however, OpenAI officially operates GPTBot, and this bot respects robots.txt: if you block it, it will not collect your data.
Related content: https://platform.openai.com/docs/bots
For example, writing the following in robots.txt blocks only GPTBot:
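A minimal example; the user-agent token here matches the one documented at the link above.

```
# robots.txt — block OpenAI's training crawler site-wide; other bots are unaffected
User-agent: GPTBot
Disallow: /
```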
If you instead want to allow the recently released GPTSearch while still blocking GPTBot, you can do it as follows.
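A sketch assuming the user-agent tokens listed in the linked bot documentation, where the search crawler is identified as OAI-SearchBot:

```
# robots.txt — allow OpenAI's search crawler but block the training crawler
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
```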
By combining these directives to suit your site, you can prevent unwanted crawling by GPT.