Hacker News
Show HN: Copy from URL – A tiny tool to bypass robots.txt for AI chatbot (copyfromurl.com)
2 points by ngkw on July 2, 2024 | 6 comments
Hello HN! I'd like to share a very small project I've built called "Copy from URL" (https://copyurl.io). It's a minimal tool designed to solve a common frustration when working with AI chatbots like ChatGPT or Claude.

The Problem: Many of us need to copy website content for AI queries. However, popular sites often use robots.txt to block AI crawlers, resulting in errors when trying to use these sites with AI.

The Solution: Copy from URL is a simple, no-frills web tool that lets you extract text content from any webpage, bypassing robots.txt restrictions. Here's all it does:

- You enter a URL
- Click "Go!"
- Get clean, copyable text content
- Use it in your AI interactions

Key Features:

- No registration
- Open-source (GitHub link in comments)
- Completely free
- Bare-bones functionality

Tech Stack: Built with minimal dependencies using Next.js, React, and Tailwind CSS. Uses Cheerio for HTML parsing.

Privacy & Security: We don't collect or store any data. Everything happens client-side.

This is a very small project born out of personal necessity. It's not feature-rich or complex, just a simple tool that does one thing. I thought others might find it useful too. I'd love to hear your thoughts or suggestions, especially if you've encountered similar issues with AI interactions!
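
For readers curious what the "get clean, copyable text" step might involve, here is a minimal sketch in Python. It is not the project's code: the post says the tool uses Cheerio in a Next.js app, while this swaps in Python's standard-library html.parser purely for illustration. The names TextExtractor, SKIPPED_TAGS, and html_to_text are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical choice: tags whose contents are never shown to a reader.
SKIPPED_TAGS = {"script", "style", "noscript", "template"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIPPED_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []     # one entry per non-empty text run

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace so the output pastes cleanly.
            text = " ".join(data.split())
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling html_to_text on a fetched page body would give plain text to paste into a chat. A real tool would likely also want to drop navigation, footers, and other page chrome, which this sketch does not attempt.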



GitHub repo: https://github.com/Amunzen/CopyFromURL

Any comments are welcome! Thanks.


Haha, I got the URL wrong above. Here's the right one: https://copyfromurl.com


Or you know you can just ignore robots.txt like everyone else. What do you think robots.txt does exactly?
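
For context on that question: robots.txt is an advisory file. A crawler that chooses to honor it reads the rules and checks each URL itself, and nothing on the server enforces the result. A minimal sketch of that check using Python's standard-library urllib.robotparser follows. The ROBOTS_TXT contents, the example.com URL, and the is_allowed helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: asks one AI crawler to stay out, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    for agent in ("GPTBot", "Mozilla/5.0"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

The check runs entirely on the client side of the request, so whether it happens at all is up to whoever operates the fetcher.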


Thanks for your comment, JSDevOps! I appreciate your input. I understand that robots.txt can be bypassed by some. The idea behind "Copy from URL" is to address a specific need when interacting with AI chatbots through their websites, like ChatGPT or Claude, where we don't have control over robots.txt settings (that's up to OpenAI or Anthropic). If you use the API, you can just ignore this, but that's not the case I'm targeting. This tool is designed for those times when you want to quickly copy content from a site to use in your AI conversations. It's just a simple way to make that process smoother. Hope that clarifies the intention behind the tool!


To be honest, I'm really frustrated that OpenAI and Anthropic don't open certain links. It's not just because of robots.txt restrictions, but also due to copyright issues and their excessive caution regarding media content. Isn't it annoying to deal with that? So, the idea was to create a tool that allows us to easily bypass these limitations ourselves.


Aka "isn't it annoying to respect the rule of law?"

This reminds me of this post by Gergely Orosz: https://twitter.com/GergelyOrosz/status/1808515221817885017



