Use Squarespace? Be Aware of AI Crawlers Scanning Your Content
News | By Stephan Jukic | December 1, 2023
In November, Squarespace added a new setting that lets site owners on the platform opt out of artificial intelligence crawlers.
However, the setting defaults to “opt-in”. In other words, you need to manually check whether it’s on and disable it if you’re not comfortable with AI crawling.
AI crawlers trawl the web for training data that is often later used to teach generative AI tools how to create human-like content, be it visual or text-based.
As many content creators whose work is online have complained, this could lead to their content being copied by an AI in all kinds of ways. Some companies, like OpenAI (creators of ChatGPT), have collected training data this way on an enormous scale.
Others, like Adobe, also use content from their own customers for AI training, but for their own in-house AI models, and in exchange they offer generative tools that are copyright-safe.
If you wouldn’t want this to happen to your own photos or other digitized media, you might want to opt out of AI crawling wherever possible, including on Squarespace if you use the platform.
Most content creators of any kind do want certain types of crawling algorithms to find their content and crawl through it. After all, Google does exactly this for the sake of indexing your content to make it visible in search.
However, third-party AI crawlers for LLM training are a slightly different thing. In many cases, they scan through others’ media not for the sake of directly benefitting its creators but so that they can feed their LLMs with more material.
I wouldn’t quite call this parasitical behavior, but it does skirt the definition quite closely.
In any case, you often can disable this tracking, at least formally.
Pro portrait photographer Miguel Quiles, who recently discovered the AI crawler option while updating his Squarespace site, has even made a neat video that details where the feature is located and how you can cut it off.
As he explains, “Out of curiosity, I clicked on it to see what it was about and saw the artificial intelligence crawlers option.”
Naturally, he was pretty surprised and unhappy to see that it was activated by default. Whether you care or not about AI sucking up your media for its LLM training, it’s something that Squarespace should at least let you opt into voluntarily instead of the other way around.
Squarespace has actually responded to wider criticism of this with the following:
“We understand the concern. To clarify, AI models currently train on all public web data. This feature is not actively opting you in, but instead offers you the option to opt out and prevent your site from being crawled by artificial intelligence crawlers.”
How true this framing of the situation by Squarespace is can be debated. It’s widely known that major platforms are routinely paid by third parties to offer their user data for all kinds of purposes.
Squarespace would be a surprising exception if it didn’t and, well, its privacy policy certainly doesn’t suggest that it avoids collecting and selling your data.
The company hasn’t responded to requests for further clarification from other sites, but if it does we’ll update this post as needed.
If you’d like to turn off AI crawling on your site, open your settings panel and click on the section called “Crawlers”. There you’ll see two options: one is called “Search Engine Crawlers” and the other is called “Artificial Intelligence Crawlers”.
You might want to leave “Search Engine Crawlers” active if you don’t want to lower your search visibility. The “Artificial Intelligence Crawlers” option, however, can be toggled off in order to opt out.
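If you’d like to double-check the result after flipping the toggle, one rough way is to look at the robots.txt rules your site serves. The sketch below assumes that the opt-out shows up as robots.txt rules, which this article has not confirmed for Squarespace. The crawler names in the list are only an illustrative sample (GPTBot is OpenAI’s crawler, CCBot is Common Crawl’s), and the helper name blocked_agents is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample of AI crawler user agents; not an exhaustive list.
AI_AGENTS = ["GPTBot", "CCBot"]


def blocked_agents(robots_txt, agents=AI_AGENTS, url="https://example.com/"):
    """Return the agents that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt content, used here instead of a live site.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_agents(sample))  # prints ['GPTBot']
```

To check a live site instead of a pasted sample, the same standard-library parser can load the file itself through its set_url() and read() methods. Keep in mind that robots.txt is only a request: it works when a crawler chooses to respect it, which fits the “at least formally” caveat above.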