argument: Notizie/News - Digital Governance
According to a Tom's Hardware article, several AI companies are bypassing the Robots Exclusion Protocol (robots.txt) to scrape website content without authorization. The protocol, designed in the 1990s to keep web crawlers from overloading websites, has traditionally been honored by crawlers. However, TollBit, a content licensing startup, reports widespread non-compliance among AI agents. Its analytics indicate that various AI systems use publisher data for training without permission, which has led to disputes between AI firms and publishers. Forbes, for instance, has accused the AI search startup Perplexity of plagiarizing its content.

The robots.txt protocol is not legally enforceable, but violations have prompted some publishers, such as the New York Times, to sue for copyright infringement, while others are negotiating licensing agreements. The dispute highlights the tension between AI developers and content creators as AI-generated summaries become more common. AI firms argue that accessing content that is not behind a paywall breaks no law; publishers disagree, especially where paid content is involved.

TollBit positions itself as an intermediary for content licensing agreements, helping publishers monetize content used by AI systems. The startup tracks AI traffic and negotiates fees, though it has not disclosed its clients.
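
For context, the check that a compliant crawler is expected to perform before fetching a page can be sketched in a few lines of Python. This is only an illustrative sketch and does not come from the Tom's Hardware article or from TollBit: the robots.txt rules, the `ExampleAIBot` and `OtherBot` user-agent names, and the example.com URLs are all hypothetical. The only real component is `urllib.robotparser` from Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler entirely and
# keeps every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A compliant crawler would run this check before every request.
for agent, url in [
    ("ExampleAIBot", "https://example.com/articles/1"),
    ("OtherBot", "https://example.com/articles/1"),
    ("OtherBot", "https://example.com/private/report"),
]:
    print(agent, url, may_fetch(parser, agent, url))
```

Nothing in the protocol forces a crawler to run this check or to obey its result. Compliance is voluntary, which is why the non-compliance TollBit reports is technically trivial for a crawler and hard for a publisher to prevent through robots.txt alone.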