AI companies are violating a basic social contract of the web and ignoring robots.txt (www.theverge.com)
Posted by Andy Reid@lemmy.world to Technology@lemmy.world · English · 10 months ago · 193 comments
Cross-posted to: technology@midwest.social, technology@beehaw.org, wolnyinternet@szmer.info, technology@lemmy.zip
wise_pancake@lemmy.ca · English · 58 points · edited · 10 months ago
robots.txt is a file served from a standard location on web servers (example.com/robots.txt) which sets guidelines for how scrapers should behave.
That can range from saying “don’t bother indexing the login page” to “Googlebot go away”.
It’s also in the first paragraph of the article.
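To make the two examples above concrete, here is a rough sketch of how a well-behaved scraper might check those rules before fetching a page. It uses Python's standard `urllib.robotparser`; the robots.txt contents, the `SomeOtherBot` name, and the `/login` and `/articles` paths are made up for illustration and are not taken from the article or any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt covering both cases from the comment:
# "Googlebot go away" and "don't bother indexing the login page".
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /login
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot is asked to stay away from the whole site.
googlebot_home = parser.can_fetch("Googlebot", "https://example.com/")

# Any other crawler is only asked to skip the login page.
other_login = parser.can_fetch("SomeOtherBot", "https://example.com/login")
other_article = parser.can_fetch("SomeOtherBot", "https://example.com/articles")

print(googlebot_home, other_login, other_article)
```

Nothing here is enforced by the server: the file only states what the site owner is asking for, and it is up to the scraper to run a check like this and honor the answer.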