Let's be honest here. Curl to 98% of people is http/1 requests with some post params and maybe json body, with some custom headers. Most language standard libs (which might be using libcurl) can facilitate writing that relatively quickly. And probably with a more 'modern' CLI.
Curl though, is a very wide breadth project. It currently supports the following protocols: DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP.
Each one of those is a massive undertaking of protocol/spec management, edge case handling, cross compatibility, hacks, and much much more to make them work. Then to put a stable C api on top of it all to be a cross language toolkit is a MASSIVE undertaking.
But, again, the "idiots" who posted those comments, aren't talking about any of this. Perhaps they shouldn't care to. Most folks I know don't even know about the other protocols curl supports and have only interfaced with curl through its http. Frankly there are nicer http cli's out there with less code that _can_ be written in a weekend (assuming they piggy back on a http lib).
What Daniel Stenberg achieved is giving the world a fantastic, reliable, cross protocol, cross platform and cross language network library that can be used as the foundation for many projects. I'm sure few of those cited would claim they could do that.
A question that arises out of this is, should the 90% use cases be handled by a small, simple tool, or by the tip of the iceberg of a large, complex tool such as curl?
I can see the advantages of standardizing on a complex, powerful tool for simple use cases. For one thing, it may be the only way to standardize: simple versions are too easy to write, and therefore you get dozens of competitors, none of whom are popular enough to shake out their edge case bugs.
It's also nice not to have to find, install, and learn a new tool when you stray over that 90% boundary.
With software dependencies, I think the advantages of small, simple libraries win out over generality. Supporting powerful use cases makes an API more complex and often means that simple, 90% use cases require weird incantations to make them work. Here I think of the times I've had to get into the guts of Jackson despite never doing anything remotely exotic with JSON. The 95% of your codebase that can be simple should be simple, so you can devote your attention to the things that need to be complex.
Counter-example: Jenkins. It does what you ask of it, its base install is "naked" and only contains the minimum functionality in the core.
Everything then becomes a plugin. Git. GitHub. Branch for multi-branch pipelines. Credentials management. And on and on and on.
Now you have to stay on top of maintaining the plugins in addition to the core. Also, many plugins require other plugins, so just to do some basic stuff like set up a multi-branch pipeline from a GitHub repo you're suddenly staring down the barrel of dozens and dozens of bespoke plugins with varying levels of quality and support.
A monolithic application like curl is a dream to me by comparison. Everything is tested in every release. Sub-components are kept up to date by the maintainer. No plugins fighting each other's plugins.
From afar it's easy to praise simplicity and modularization, but honestly monoliths can be undervalued too.
I can definitely see your point, having experienced the same thing with plugins for SBT, the Scala build tool. I didn't really consider the case of a small core with a multitude of plugins as a twist on the small, simple tool. I think you're right that a plugin architecture lets a thousand flowers bloom, but you don't get long-term stability, because people move on to other tools and stop maintaining the plugins they wrote.
For example, VSCode plugins are great because VSCode is thriving, and Emacs packages are a crapshoot because many of the programmers who wrote them have moved on. Eventually VSCode plugins will be like Emacs packages.
Also: node. Everything is a module, and every module requires a hundred more. Projects with thousands of dependencies become common. No one understands what is actually “under the hood” and hardly anyone cares. “It just works” most of the time. Good enough.
I would be surprised if nobody had tried to make a mini-curl that could go in busybox. The idea of having a tiny version of the program which handles the 90% cases by itself but can call out to the real-deal bigger brother when necessary is a nice one. This sort-of happens already with lots of common tools which are shadowed by shell builtins; why not curl?
> A question that arises out of this is, should the 90% use cases be handled by a small, simple tool, or by the tip of the iceberg of a large, complex tool such as curl?
But nobody is forced to use curl? People use it because it's convenient, shoot themselves in the foot and then lash out at the author of the tool for their own choices. Where's the fault?
I like curl, but this isn't true. When you're SSH-ing onto a box, you often don't have permissions to install your favorite CLI tool, and even if you do have said permissions it's inconvenient to have to install it each time you SSH onto the box (not a major inconvenience, mind you). Moreover, in many cases you need to run a script or some other software that depends directly on curl.
In general the "nobody is forced to use it" arguments rarely pan out (I remember this was a canned argument from C++ folks circa 2011: "C++ is the best language because it has every feature and if you don't like some features, you aren't forced to use them!").
Well, your distro chose to include one lib that can be used with most protocols out there. They include the multi-tool, and many of the apps also included on the box require it.
If you want to use something else, you have to install it on the host. Every box comes with bash, but we still install other languages and frameworks on the host so we can run our applications with tools that make sense.
If you want a different tool, make it part of your default install.
The point is that you’re not always the person who gets to decide which distro, which tools to install atop the distro, or which dependencies your scripts will use. If you own those choices, of course you can add in your own tool, but you frequently don’t own those choices.
But cannot this charge be levied upon all tools and utilities? If you do not have the permission to bring your own tools it stands to reason that you will have to use the tools already in place, be that curl or some other random assortment of literally anything else. I don't much see the moral basis behind showing up at the construction yard and then lamenting over your lack of choice simply because your employer only brought Makita-brand tools.
No one picks curl to be on a box; it's a core library for everything else on the host. It's not a Makita drill, it's more like electrical power at the site, and this guy is complaining that the tool uses gas or propane.
Doesn't stop anyone else from using a power saw or charging batteries. If you need propane, bring it.
I guess I don't understand the complaint? You're worried that other people are using it for their own projects? The reason it's on every box is because it exposes the API and it's a library for half the shit on the host.
If you don't get to choose anything on the host, why are you concerned about what other people use? If you can install your own apps, add the lib you want.
I agree. At any rate, the system default curl is rarely compiled with all those features enabled, so in practice one sticks to the http/https/ftp/ftps subset of curl all the time.
Yeah, very easy to write an HTTP client on top of Berkeley sockets. Until chunked responses come into play, and HTTPS, and HTTP/2, and HTTP/3...
(This is actually something I'm very worried about: it used to be easy to cobble together a very basic HTTP client, but not anymore with the new HTTP protocol versions and all the bells and whistles.)
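To make the point concrete, here is roughly what that happy-path client looks like: a plain HTTP/1.0 GET over a Berkeley socket, with no TLS, no chunked decoding, no redirects and no proxy support. This is only an illustrative sketch (the function name and fixed port 80 are mine), not anything from curl:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fetch http://host/path and dump the raw response to stdout. */
int naive_get(const char *host, const char *path)
{
    struct addrinfo hints = { .ai_socktype = SOCK_STREAM }, *res;
    if (getaddrinfo(host, "80", &hints, &res) != 0)
        return -1;

    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
        if (fd >= 0)
            close(fd);
        freeaddrinfo(res);
        return -1;
    }
    freeaddrinfo(res);

    char req[1024];
    snprintf(req, sizeof req,
             "GET %s HTTP/1.0\r\nHost: %s\r\nConnection: close\r\n\r\n",
             path, host);
    write(fd, req, strlen(req));

    /* HTTP/1.0 + Connection: close means "read until EOF", which is
     * exactly what keeps this toy version so short. */
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    return 0;
}
```

Everything the sketch skips (TLS, proxies, redirects, content encodings, timeouts, retries) is exactly where the weekends start to multiply.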
And of course, CURL deals with multiple protocols as the OP mentioned. I've used libcurl for sending emails and FTP but mainly for HTTP.
Using the multi interface, it's relatively trivial (a few hundred lines of C) to fetch using hundreds or thousands of concurrent connections. There are even some courtesy opts for rate limiting per host.
It's so good he/they built c-ares for asynchronous DNS lookups, IIRC something that wasn't so readily available in years gone by.
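A minimal sketch of what that looks like with the multi interface, for anyone who hasn't used it. The URLs are placeholders, error checking is trimmed, and CURLMOPT_MAX_HOST_CONNECTIONS (one of those per-host courtesy knobs) assumes a libcurl that isn't ancient:

```c
#include <curl/curl.h>

int main(void)
{
    const char *urls[] = { "https://example.com/", "https://example.org/" };
    curl_global_init(CURL_GLOBAL_DEFAULT);

    CURLM *multi = curl_multi_init();
    /* Be polite: cap concurrent connections per host. */
    curl_multi_setopt(multi, CURLMOPT_MAX_HOST_CONNECTIONS, 2L);

    int i;
    for (i = 0; i < 2; i++) {
        CURL *easy = curl_easy_init();
        curl_easy_setopt(easy, CURLOPT_URL, urls[i]);
        curl_easy_setopt(easy, CURLOPT_FOLLOWLOCATION, 1L);
        curl_multi_add_handle(multi, easy);  /* responses go to stdout by default */
    }

    /* Single-threaded loop driving all transfers at once. */
    int still_running = 1;
    while (still_running) {
        curl_multi_perform(multi, &still_running);
        curl_multi_wait(multi, NULL, 0, 1000, NULL);
    }

    /* Reap completed transfers. */
    CURLMsg *msg;
    int left;
    while ((msg = curl_multi_info_read(multi, &left))) {
        if (msg->msg == CURLMSG_DONE) {
            curl_multi_remove_handle(multi, msg->easy_handle);
            curl_easy_cleanup(msg->easy_handle);
        }
    }

    curl_multi_cleanup(multi);
    curl_global_cleanup();
    return 0;
}
```

Scaling the same loop to thousands of transfers is mostly a matter of adding handles as others finish, which is also where asynchronous name resolution via c-ares starts to matter.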
Agreed. They're harder to debug, MITM, trace, etc.
As a silver lining: I have been less worried recently, seeing that most of the adoption of http/2-3 (at least their odd features) has been at the edge. Most developers are still writing http/1 endpoints and leaving the edge to optionally up-convert them for transport efficiency.
> What Daniel Stenberg achieved is giving the world a fantastic, reliable, cross protocol, cross platform and cross language network library that can be used as the foundation for many projects.
One of which, curl(1), is a constant ass-and-life saver for me personally and professionally. Of course the library itself ends up being used by me as well (in C, C++ and Tcl). curl to me is in the league of sqlite, aka "absolutely gorgeous and more often than not all you need".
In the midst of so much open source drama and angst, I'm still just profoundly grateful for Daniel Stenberg's work on curl.
I'm no expert on open source management but Daniel has for years been a positive and steady force working on curl.
This is him publishing the direct criticisms he's receiving as "thanks" for his efforts and it guts me he's got to put up with these ignorant comments.
You've changed my mind on this being an odd design decision.
Baking in libcurl for the user (or optionally allowing them to dynamically link) if they want networking was maybe the most sane/pragmatic choice you could have made for a new language.
True. I'm kind of surprised (again) by Daniel, after 20 years of maintaining curl, being so salty about "people not appreciating the achievement". Yes, for the overwhelming majority of users curl in 2021 is just a simple http/https request client. And, yes, in 2021 that could be written over a weekend with a better CLI, mainly because most languages already have all the important stuff implemented in a [standard] library (ironically, sometimes via libcurl).
> I'm kind of surprised (again) by Daniel after 20 years of maintaining curl being so salty about "people not appreciating the achievement"
You're surprised by him being salty? I can't imagine what it's like to be on the receiving end of the water torture of people continually belittling the work you have done over 20 years and given to the community.
Not to mention the guys who threatened to kidnap/kill him if he didn't fly like 10000km immediately, at his own expense of course, to solve their bug pro bono.
That was honestly my first thought. Of course it's easy to rewrite curl nowadays. It can't be anything more complicated than slapping a command line interface on top of libcurl...
At first I laughed... then I thought about the sheer number of command line options that you can find in the man pages. After thinking about that, even writing the command line interface would be a non-trivial undertaking.
> Let's be honest here. Curl to 98% of people is http/1 requests with some post params and maybe json body, with some custom headers.
100%, that's why we see so many smaller projects pop up (on GitHub and the like) that support basically just this. No shade to those projects, improving the UI for this subset is a worthy cause, it's just not anything near cURL.
Curl has been a lifesaver for me when I had to interact with a legacy third-party server. The Python requests module was regularly hanging when making requests to this server, whereas I never observed these issues with curl. Curl appears to be better at dealing with such legacy edge cases, as it has been around for more than two decades.
Why even get mad at curl? If you hate curl and all you want is a simple http/1 client just spend a few minutes to write a Python script w/requests that takes verbs and payloads from the command line rather than make the curl author's day shitty.
I'm one of the 98%, and I don't like using cURL for this reason -- the CLI feels clunky and its gigantic number of options are distracting for my simple use case.
Can anyone recommend a good alternative for the base case?
I recognize the huge amount of time and effort that went into making cURL, but I agree that the gigantic number of options is distracting. When I read the cURL man page, I have a hard time finding what I want because there are literally dozens of screens of options.
That said, I still will reach for cURL even when simpler options exist because it's ubiquitous. Same thing with Bash, grep, and sed.
Yeah. I once thought I would have a look at what it would mean to just send a single file over SMB. What better way than to just reverse engineer the little bits of communication between a client and a server?
The simplest solution would have been to give up the instant I started reading the captured traffic. I gave up after about 40 minutes of trying to figure out where to start.
SMB is... not very straightforward. I needed to reverse engineer it once to write an exploit, and it's difficult to properly formulate even basic requests, let alone the more complex stuff the protocol supports.
What better thing to do with a universal resource locator syntax than to have a universal client?
> If a project is only going to ever use http, why would I need to bundle in all of the others?
Your project might only ever think to use http, but my project might appreciate having one client for any URL, instead of having to parse URLs myself and decide on separate clients based on a part of it.
It is a good library API with good support for async and that sort of thing. As a user it is less work to integrate the Nth protocol in a library I understand than to read the docs and try to reverse engineer some new API's model of how to do things, including asynchrony, configuration, TLS/cert management, etc. There are a thousand ways to skin the cat, and libcurl is a good one. I want to learn a bunch of new distributed protocols because those are real things; I don't want to learn a bunch of different APIs to do the same things, because those are arbitrary wrappings around the real thing. If an API is successful enough it becomes real in its own right, but cleverness in API design isn't that useful without the popularity.
My first thought is flexibility and interoperability. If you have one tool, cross platform, that supports a wide variety of protocols, you can use it on both ends of a connection, and easily swap out protocols without having to:
1) install a new tool on both sides
2) learn a new API for each protocol
3) bug check for it
It also lets you do all this dynamically, so switching protocols on the fly is trivial.
What's frustrating is people who think that they can radically simplify something without knowing anything about it. The same people think they can implement a text editor in 1000 lines of code because that's how the first text editors were made. Why can't we go back to those simple days?
We can't because modern software handles cases that those didn't. For example, rendering and editing text seems straightforward until you look closer. Text Rendering Hates You (https://gankra.github.io/blah/text-hates-you/) and Text Editing Hates You Too (https://lord.io/blog/2019/text-editing-hates-you-too/) are two amazing reads that analyze this in detail. They explain the hidden complexity to the vast majority of us who never knew. We simply slapped down a <textarea> or <EditText> and called it a day.
If we tried to reimplement this stuff from scratch, we would likely start with a simple implementation, then tack on bits when we tried to handle issues like right-to-left layouts, Unicode, emoji, anti-aliasing, text overlapping and so on. All this before getting into implementing features that Dennis Ritchie didn't have to, like supporting plugins, releasing on multiple operating systems, accessibility or even syntax highlighting. Problems already known to and tackled by existing text editors today.
If you look at work outside of your area of expertise and you feel like starting a sentence like "why don't you just ...", then reconsider. If the person you're talking to is halfway competent (they probably are), they've already considered what you're about to say. "Have you considered using less memory/CPU or removing this feature that I personally don't need?" Why yes, we have.
All those people don't realise that tools like curl did start with a few lines back then. And after the first two minutes of testing, the devs started adding additional code to handle failures, invalid inputs, different platforms, different versions of a standard, etc.
One can write a lot of things in 100 lines if it doesn't have to meet higher standards than a basic chat app tutorial in the "getting started" section of a programming language.
> One can write a lot of things in 100 lines if it doesn't have to meet higher standards than a basic chat app tutorial in the "getting started" section of a programming language.
I'd argue that even a basic chat app is massively complex, only that all the really complex parts have already been solved. I once tried to write one on the LOWEST level I could. Realized after a month that there were not enough hours left in my life to finish it.
Your "Software Isn't Bloated" post reminds me of an old post from Joel Spolsky about how complexity (ie. bloat) creeps in from countless unforeseen edge cases.
"Back to that two page function. Yes, I know, it’s just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I’ll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn’t have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95."
This is what I think of whenever I see someone taking a dogmatic stance on function length, conditional nesting depth, functional purity, OO-purity, etc. "Here's a person who has never worked on text rendering, text editing, GUI event systems, image decoding, etc."
I have a chunk of code for implementing pointers, cursors, button clicks, and drag-and-drop in games. It's some of the gnarliest code I've ever written. It's also some of the most resistant code to "good design patterns" I've ever seen. I've written it and tweaked it and rewritten it and thrown it away and started from scratch, and it always comes out the same. When you're dealing with things that have to meet long-established user expectations, it's always going to be full of little edge cases that you can't abstract away.
If you want to see extremely toxic, combine this mentality from the business side with confirmation bias from developers who think the same way.
You get development efforts with massive scope people brush off as nothing. I've worked with a lot of these sort of groups where someone entrepreneurial understands enough to think they can simplify and improve something. Then they get a developer, two, or small team that agree. They start the undertaking or even worse, do the work under contract for someone else where they have contractual deliverables. Some way in they start to realize just how shortsighted and cavalier they were about brushing off uncertainties they either knew or didn't know about. "It can't be that hard! It's just software, it's all virtual! We don't need a bunch of capital! We just need a little bit of math! ... "
All those uncertainties start to become quantities and they realize the absolute pit they dug themselves into. At that point they look for someone to swoop in and cleanup their mess. Usually it takes careful reading of the contract deliverables with liberal interpretation of what is stated and how those things can be minimally met. You really need people familiar with the types of software and systems involved and needed to achieve your goals to narrow down these uncertainties. If you don't then you're digging your own grave.
>If we tried to reimplement this stuff from scratch, we would likely start with a simple implementation, then tack on bits when we tried to handle issues like right-to-left layouts, Unicode, emoji, anti-aliasing, text overlapping and so on.
This is why I always try to look back as far as I can in a source's history to see how it started. It at least gives me a sense of the amount of work it takes to get from simple to mature.
And in fact even the "simplest" editors seemed to take quite some time even for our heroes. Ken Thompson has stated [1] that he devoted a week out of a month for what I understand was the first version of ed(1):
[...] [my wife] was gone a month to California and I allocated a week each to the operating system, the shell, the editor, and the assembler, to reproduce itself [...]
An editor seems insanely simple until you realize that computers really don't like "inserting" data into existing data in memory, and then you're staring down the barrel of continuous memcpys.
There's a very simple algorithm to deal with that. You make a buffer that is bigger than the file being edited. You copy the start of the file to the beginning of the buffer, and the end of the file to the end of the buffer. The gap in the middle is where the cursor is. Then each keystroke by the user is a very quick operation. If the user types in so much information that the gap closes up, then you need to do a buffer expand operation and copy large amounts of data around, but that won't happen very often.
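A toy sketch of that gap-buffer scheme, just to make the shape concrete (the fixed buffer size and names are invented; a real editor would grow the buffer and deal with multi-byte text):

```c
#include <string.h>

typedef struct {
    char   buf[4096];
    size_t gap_start;   /* cursor position: end of the text before the gap */
    size_t gap_end;     /* start of the text after the gap                 */
} GapBuffer;

/* Load text (assumed to fit) with the cursor, i.e. the gap, at the end. */
void gb_init(GapBuffer *g, const char *text)
{
    size_t n = strlen(text);
    memcpy(g->buf, text, n);
    g->gap_start = n;
    g->gap_end   = sizeof g->buf;
}

/* A keystroke is O(1): it just eats one byte of the gap. */
void gb_insert_char(GapBuffer *g, char c)
{
    if (g->gap_start < g->gap_end)
        g->buf[g->gap_start++] = c;
    /* else: the gap has closed; a real editor reallocates and copies here */
}

/* Moving the cursor slides characters across the gap one at a time. */
void gb_move_left(GapBuffer *g)
{
    if (g->gap_start > 0)
        g->buf[--g->gap_end] = g->buf[--g->gap_start];
}
```

Local edits stay cheap; only a long-distance cursor jump or a closed gap costs a big copy.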
Ah, darn. That'll teach me to comment without refreshing the page first. I refer you to the earlier DonHopkins post.
Today I would probably reach for an RRB tree and define some kind of simple heuristic for transients, giving me free undo/redo, and reasonably simple support for multiple cursors. Back then a gap buffer seems like a great idea. Two linked lists is a simple enough idea.
Kudos to you for supporting multiple cursors from the start, which a buffer with a gap can't support efficiently. Douglas Engelbart would have approved!
>History of key products: The first instance of a collaborative real-time editor was demonstrated by Douglas Engelbart in 1968, in The Mother of All Demos. Widely available implementations of the concept took decades to appear.
Vis-à-vis mamelons, Ted Selker (who invented the Trackpoint) actually built a prototype Thinkpad keyboard with TWO Trackpoints, which he loved to show at his New Paradigms for Using Computers workshop at IBM Almaden Research Lab.
While I'm not sure if this video of the 1995 New Paradigms for Using Computers workshop actually shows a dual-nippled Thinkpad, it does include a great talk by Doug Engelbart (14:22), and quite a few other interesting people! (James Gosling of Sun Microsystems talks about capabilities of Sun's new web browser HotJava, at 24:36! A classic!)
The multi-Trackpoint keyboard was extremely approachable and attractive, and everybody who saw them instantly wanted to get their hands on them and try them out! (But you had to keep them away from babies.) He made a lot of different prototypes over time, but unfortunately IBM never shipped a Thinkpad with two nipples.
That was because OS/2 (and every other contemporary operating system and window system and application) had no idea how to handle two cursors at the same time, so it would have required rewriting all the applications and gui toolkits and window systems from the ground up to support dual trackpoints.
The failure to inherently support multiple cursors by default was one of Doug Engelbart's major disappointments about mainstream non-collaborative user interfaces, because collaboration was the whole point of NLS/Augment, so multiple cursors weren't a feature so much as a symptom.
Bret Victor discussed it in the few words on Doug Engelbart that he wrote on the day of Engelbart's death:
>Say you bring up his 1968 demo on YouTube and watch a bit. At one point, the face of a remote collaborator, Bill Paxton, appears on screen, and Engelbart and Paxton have a conversation.
>"Ah!", you say. "That's like Skype!"
>Then, Engelbart and Paxton start simultaneously working with the document on the screen.
>"Ah!", you say. "That's like screen sharing!"
>No. It is not like screen sharing at all.
>If you look closer, you'll notice that there are two individual mouse pointers. Engelbart and Paxton are each controlling their own pointer.
>"Okay," you say, "so they have separate mouse pointers, and when we screen share today, we have to fight over a single pointer. That's a trivial detail; it's still basically the same thing."
>No. It is not the same thing. At all. It misses the intent of the design, and for a research system, the intent matters most.
>Engelbart's vision, from the beginning, was collaborative. His vision was people working together in a shared intellectual space. His entire system was designed around that intent.
>From that perspective, separate pointers weren't a feature so much as a symptom. It was the only design that could have made any sense. It just fell out. The collaborators both have to point at information on the screen, in the same way that they would both point at information on a chalkboard. Obviously they need their own pointers.
>Likewise, for every aspect of Engelbart's system. The entire system was designed around a clear intent.
>Our screen sharing, on the other hand, is a bolted-on hack that doesn't alter the single-user design of our present computers. Our computers are fundamentally designed with a single-user assumption through-and-through, and simply mirroring a display remotely doesn't magically transform them into collaborative environments.
>If you attempt to make sense of Engelbart's design by drawing correspondences to our present-day systems, you will miss the point, because our present-day systems do not embody Engelbart's intent. Engelbart hated our present-day systems.
The multiple-cursors problem goes to show how much of computing is STILL strongly single-user, even for all our multiuser underpinnings.
Arguably the only "single user" devices should be things like the mouse itself, as multiple people can see the same screen (and maybe even use the same keyboard).
Some games implemented this to allow two-player on the same machine - each would get a joystick or "their half" of the keyboard.
The Thinkpad with two Trackpoints was ostensibly single user, but multi-hand, which applications still are not designed to cope with, let alone multi user multi hand/finger/trackpoint!
Given a program that supported multiple trackpoints, two people could use the two Trackpoints on each side of the keyboard, just like you describe with keys.
Now we have multitouch APIs on mobile devices, at least. But they're not a good match for supporting multi-mouse/trackpoint, since they only support tracking fingers while they're touching the screen, not pointing around without pressing like a mouse/trackpoint does.
Close, but not ideal - lines are occasionally very long, and you're wasting nodes on sequences of newlines. There's a number of useful data structures that can do this well, like a 'rope'.
> What's frustrating is people who think that they can radically simplify something without knowing anything about it.
I run into this problem a lot with my manager, who knows just enough to get himself in trouble. There's probably a term for it, but I call it the "complexity trap". He sees a problem, thinks he has an "aha!" moment, tells me about a simple solution, and asks me to implement it. This solution, of course, disregards most of the complexity inherent in the problem.
Complex problems require equally complex solutions, and if the solution is simple, it disregards the complexity of the problem, either shoveling down the ladder of abstraction, or hoping for some future state where it's not as complex.
I’ve found that a small amount of mathematical rigor reduces the size of most complex code by a few orders of magnitude.
It has the effect of short circuiting the complexity of the problem, because it defines all the corner cases in a compact form.
Interfaces with sloppy spaghetti systems can usually be hidden away at the edges of the system. The main problem is that less experienced engineers invariably come in and say, "A ha! I have a special corner case that doesn't fit the abstraction," and then try to add copy-paste methods, boundary violations, and so on.
As long as there’s someone that diligently guards against such things, it works out OK. Results vary wildly after those people leave.
HP printer drivers are a famous example of the sort of complexity reduction I’m talking about. They used to copy the entire source tree for each printer (font rasterizer, dithering algorithms, and all), then assign a full time team to maintain it.
Eventually they had many thousands of full time engineers maintaining the result, and print quality was still terrible across the product line.
The open source drivers reverse engineered the printers, and factored out as much common logic as possible. The printer specific code was then reduced to just implementing the wire protocol and device geometry stuff. At that point, they had better output than HP, with 1% the developer resources.
Someone at HP did the same thing internally, reimplemented all the drivers in a unified way, and was promptly promoted to the executive team.
> I’ve found that a small amount of mathematical rigor reduces the size of most complex code by a few orders of magnitude.
I agree, though this is generally limited to singular problems, and usually becomes harder to implement the higher up the abstraction train you go. I've found this especially true in business logic, where you don't know what the requirements or even end goal is; by the time you're ready to apply some nice mathematical based optimizations at a high abstraction level, doing so often requires a lot of work exposing all of the information you need from lower abstraction layers. All comes down to planning.
> As long as there’s someone that diligently guards against such things, it works out OK. Results vary wildly after those people leave.
This is specifically why I mentioned my manager haha, who will both come up with crazy edge cases that in many cases don't even apply, and will come up with "aha!" solutions to problems that ignore major edge cases. Perks of not working for a software company.
Very interesting history on the HP printer drivers, thanks for sharing that. I do remember some of the evolution that they went through, from the Linux perspective, and it was amazing how that tangled mess just started to magically work, circa 2008 if I remember right.
We also saw this with COVID. Plenty of people with no knowledge in infectious disease stating that the public solution was stupid and wrong but instead the obvious, simple and effective solution would be to "just do ....".
Personally if that happens at work and somebody (often managers) tells us to solve a difficult problem by "why don't you just..." I usually tell them the code is available and they "can just do it" themselves. So far I have never had any takers.
Yeah, the inner source model is useful in several ways. "That sounds interesting, we welcome pull requests" is much nicer than "there are about three fundamental things you are overlooking."
Obviously everyone could rewrite curl*, in 2021, or so they think.
Here's the magic in that trick: we have, collectively, amassed a heap of knowledge on HTTP and its behavior in the wild; this was made possible by curl, in the first place. Once this is handwaved away, of course it's easy to do. "Look at me, [standing on the shoulders of giants,] I have a really great view from up here!"
(*small print: and this is speaking about the happy path, where client, its stack, network, server stack, and server all align perfectly; this happens pretty much never)
I wonder what percentage of attempts to rewrite curl in modern languages end up eventually just calling code from libcurl if you trace down through their standard libraries far enough?
This is quite acceptable, IMNSHO. If you're reinventing a wheel, of course you need to know why your predecessors have not used the Obvious Simple Shortcut That Intuitively Makes Sense. Was it not available back when this was written? Was it to keep compatibility with Windows XP? Or perhaps your idea really did not occur to the developers.
Most time - when reimplementing stuff - is not spent coding, or even debugging, but on research.
Every so often (less so in recent years) I come across a bad homespun implementation of an HTTP client embedded in a larger project. Usually, a few lines of code to open a socket and POST some data to a hard coded URL. I always replace these with curl, usually over the objections of the original programmer.
The problem with HTTP (v1) is that it seems very simple to write a cut-down client. But your simple client had better handle proxies, various encodings, chunked bodies, redirects, HTTPS (with all the machinery that implies), and a whole host of annoying little features that the client is not in control of.
"But I don't need to implement talking through a proxy, we don't use one"
You don't use one now, but in 6 months you might. Or your clients might install one without your knowledge.
It is actually pretty simple to write an HTTP server (I wouldn't advise it) because the server controls a lot of the conversation. But clients are best left to either the OS or curl.
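For what it's worth, the replacement usually ends up looking something like this sketch (the function name and error handling are mine, purely illustrative): a libcurl easy handle that gets proxy support, redirects, content decoding and TLS without any extra client code.

```c
#include <curl/curl.h>

/* Hypothetical drop-in for the usual hand-rolled "open a socket and POST
 * some data" helper. The response body goes to stdout by default. */
int post_json(const char *url, const char *body)
{
    CURL *h = curl_easy_init();
    if (!h)
        return -1;

    struct curl_slist *hdrs =
        curl_slist_append(NULL, "Content-Type: application/json");

    curl_easy_setopt(h, CURLOPT_URL, url);
    curl_easy_setopt(h, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(h, CURLOPT_HTTPHEADER, hdrs);
    curl_easy_setopt(h, CURLOPT_FOLLOWLOCATION, 1L);   /* redirects        */
    curl_easy_setopt(h, CURLOPT_ACCEPT_ENCODING, "");  /* gzip/deflate/... */
    /* Proxies, chunked decoding and HTTPS need no code here at all. */

    CURLcode rc = curl_easy_perform(h);

    curl_slist_free_all(hdrs);
    curl_easy_cleanup(h);
    return rc == CURLE_OK ? 0 : -1;
}
```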
At some point people started using Expect headers (with content length) and 100-Continue for long POSTs, and curl just handled it before I'd even heard of it (outside of push-type apps).
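For context: for large POSTs libcurl adds an "Expect: 100-continue" header and briefly waits for the server's interim response before shipping the body, which is the behaviour being described. When a legacy server mishandles that, the usual workaround is to suppress the header with an empty value. A small hedged sketch (the helper name is made up):

```c
#include <curl/curl.h>

/* POST a large body without the 100-continue handshake. */
static CURLcode post_without_expect(CURL *h, const char *url,
                                    const char *body, long body_len)
{
    /* An empty value tells libcurl to drop the header it would add itself. */
    struct curl_slist *hdrs = curl_slist_append(NULL, "Expect:");

    curl_easy_setopt(h, CURLOPT_URL, url);
    curl_easy_setopt(h, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(h, CURLOPT_POSTFIELDSIZE, body_len);
    curl_easy_setopt(h, CURLOPT_HTTPHEADER, hdrs);

    CURLcode rc = curl_easy_perform(h);
    curl_slist_free_all(hdrs);
    return rc;
}
```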
Just a little tangent but I think everyone should know cURL… not postman, not httpie or any other more “friendly” tool. cURL is the standard, pre-installed, do-everything, scriptable tool to interact with web APIs. I could say mostly the same stuff about knowing Bash. It separates the barely effective from hyper-effective individuals on simple day to day (programming) tasks.
>Just a little tangent but I think everyone should know cURL… not postman, not httpie or any other more “friendly” tool.
Five years ago I was a partner in a startup. We got some money from the government right about the time I said I had to take a few months off to do consulting and make some money. My partner said he wanted to use the money to make an app; I said no for various reasons, but he went ahead and did the app anyway.
So I was helping them with the app a couple of hours every night - while consulting 8+ hours a day. Anyway, the guy making the app was using postman; I'd never used postman. He had a problem with a part of the API not working.
I sent him a curl request to post a two line json document to the api and showed it got the correct response back.
His response - I'm not a curl expert or anything, it would really help me out if you downloaded and installed postman and tried to make it work!
My response - I'm not a curl expert or anything, I showed you it worked in curl because that is basically the default and should establish it is possible to do. You figure out how to do it in your tool.
I've often jokingly said to people "curl or it didn't happen", suggesting they send me a single curl command I can run to reproduce it. However, every time someone's actually provided a curl command over a postman file, it has either shown it's not a bug or I've found the bug within minutes of running the command.
I'm a fan of postman for some things, it sure is easier to modify and re-run http requests over and over in some cases. But on the whole if you're trying to debug an issue, dropping down to curl solves so many problems and is pretty easy to share with anyone no matter their OS for the most part.
I am always highly annoyed when someone has documentation for their web API and all that they provide is some postman collection of crap (or a crappy code example in an arbitrary language) and some vague abstract description of their API. Then I'll have to spend X amount of time on reverse engineering it to curl so I can see what's actually going on. Often I'll discover that they're doing really weird stuff (double base64 encoding for example) and that I'm the first to notice that it is weird. When I tell them I use curl they think I'm some hacker using some obscure tool.
Too many people in our field just take the first thing that works and they won't bother to actually look at it.
Sure, then they copy that nonsense command and send it to you, without really understanding what all of the options are, and just treating it like a magic blob.
I really wish busybox would bundle curl instead of wget. In the age of containers, I often find myself having to use wget when operating inside of one with busybox as shell, and I get it wrong on first try.
What usually trips me up is the syntax: `curl -O http://host/file` (goes to stdout without -O) instead of `wget http://host/file` (goes to file without -O)
It doesn't even bundle a proper wget, just a massively stripped down HTTP client it aliases to wget. Sure, providing minimal tools is the point of busybox, but having to sniff what actually runs when you do `wget` is massively annoying. (e.g. until recently, busybox wget didn't believe in HTTPS by default)
Mentioning wget makes me embarrassed for the days (probably 20 years ago) where I used to prefer wget. Curl annoyed me because I couldn’t “simply download a file”. Needed to provide an option just to write the file to disk.
Of course now I feel really icky if im ever forced to use wget. And curl’s feature of defaulting to stdout is of course a perfect thing and makes way more sense than the alternative (for many reasons).
It shows how standard it is that you can right-click a network request in browser dev tools and "Copy as cURL" is there as one of only a few options. :)
1998 was around the time i first got into linux (slackware) and the internet. For a long time, I thought curl was one of those commands that was just a core part of linux and a team of bofhs write and maintain. It was sort of inspiring to find that it was just 1 guy and he did it fairly recently. Mad props to Daniel, whose website and tool have been invaluable to me for 23 years.
It's both sad and funny. Saw similar comments for "fman", a simple file manager. Simple, until you start thinking about the details: https://www.catnapgames.com/2018/11/20/fman/
I think there's more truth to this comment than your presentation suggests.
I bet at least a subset of people who think they could reimplement curl easily, think of curl as a CLI tool to make HTTP requests. It's just a little bit of python/requests or node/fetch code to cover some basics for that. I think these people simply aren't aware that their programming language's built-in libraries often call in to libcurl for the dirty work.
Exactly. My point was that some people think they can build it in a day because they see their favorite library already supports sending HTTP; they might not be aware that some of those libraries are wrappers around the cURL library.
In 99.99% of cases it is a CLI tool to make HTTP requests.
The people saying they could rewrite it are really saying "I could rewrite Curl (using an existing HTTP library) so that 99.99% of the use cases work", which is probably true.
What's interesting is that the next Uber may be created by somebody naive about the effort it will take. Being naive can be an advantage in this case. Sometimes people who KNOW the effort are discouraged and never try. And it's the guy who is clueless who ends up embarking on the quest.
I wouldn't be surprised if the first comment is right that you could build a HTTP client sufficient for 90% of the usecases in ~100 lines or so. It's that other 10% that takes 90% of the time, effort and complexity, and for which having libraries like curl is invaluable.
I think there are two classes of cURL users. There are the "let me download this file to the current folder" users who always forget the -O; that use case is quite simple and seems like it could be easily replaced.
And then there are those who use all the API features and realize it's a complex program.
But the missing link is that even the first use case has many complex edge-cases that aren't obvious at first.
As a Delphi programmer, I have so far been avoiding libcurl successfully, and have spent years rewriting curl because I need to make HTTP requests.
At first it was simple. I was making Windows apps, and Windows has an API for HTTP requests. Just call wininet with the URL, and it handles everything. No need to use someone's library if there is a standard OS API. And it is fully integrated in Windows: you get the default proxy settings and user-defined HTTPS certificates. It even shares cookies with Internet Explorer (which I did not want, so I wrote my own cookie parser).
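In C, the "just call wininet with the URL" era looks roughly like the sketch below (the Delphi code goes through the same Win32 API); INTERNET_OPEN_TYPE_PRECONFIG is what picks up the system proxy settings for free. The function name and error handling are mine:

```c
#include <windows.h>
#include <wininet.h>
#include <stdio.h>
/* link with wininet.lib */

/* Fetch a URL and dump it to stdout; proxies, redirects and TLS are
 * handled by the OS. */
int fetch_to_stdout(const char *url)
{
    HINTERNET inet = InternetOpenA("demo-agent", INTERNET_OPEN_TYPE_PRECONFIG,
                                   NULL, NULL, 0);
    if (!inet)
        return -1;

    HINTERNET req = InternetOpenUrlA(inet, url, NULL, 0, 0, 0);
    if (!req) {
        InternetCloseHandle(inet);
        return -1;
    }

    char  buf[4096];
    DWORD got = 0;
    while (InternetReadFile(req, buf, sizeof buf, &got) && got > 0)
        fwrite(buf, 1, got, stdout);

    InternetCloseHandle(req);
    InternetCloseHandle(inet);
    return 0;
}
```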
Then I made a Linux port. Suddenly there is no API. But I also avoid C, since it is unsafe, while Delphi/Pascal has memory-safe strings and arrays. So I needed an HTTP Pascal library. There are several, which also support other protocols I have never heard of. I went with the Synapse library, and it caused a lot of problems: incomplete cookie handling, no gzip encoding, OpenSSL messing everything up...
Then I made an Android port. The standard stable system library was Apache HttpComponents, so I use that. Shortly after, Google makes a new Android version, and removes Apache HttpComponents. Then I had to find a new library.
In hindsight, all the ports were useless projects. I should have told people to use WINE and be done with it. A stable API is much better than a library. Or I should have written my own HTTP client from the start, which would have been much easier than switching libraries all the time, especially since I had to implement half of it (header parsing, OpenSSL interfacing) anyway when the libraries did not work properly.
It’s not that these devs don’t understand the complexity of curl, since it does make things appear simple, it’s because they don’t understand the complexity of the protocols implemented by curl.
Long ago, at ${WORK} we implemented the HTTP 1.1 protocol on top of platform TCP sockets. It took several months to implement a correct subset of the protocol; then, when we needed more, we scrapped it and just used libcurl!
If you're a fan of curl and have benefited from it but can't contribute via code, docs, etc, I'd recommend their OpenCollective page: https://opencollective.com/curl
Many employers let you expense $50-100/month without much effort so get a handful of friends and it has a bigger impact than one corporate contributor.
I could write a simplified curl in a weekend; I could even write an HTTP server in the same weekend. The problem is they would only work with each other, and with no HTTPS of course; that "s" is really annoying.
And even that is not as straightforward as one might think. I did it in C, Unix, TCP sockets, nothing unusual, but when you start putting significant load on it, you start getting errors left and right and you have to handle them properly. These are all documented, but easy to overlook.
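The kind of "documented but easy to overlook" detail meant here: under load, a write() on a TCP socket can be interrupted or come back partial, so even sending a buffer reliably needs a loop. A minimal sketch, assuming a blocking socket:

```c
#include <errno.h>
#include <unistd.h>

/* Keep writing until the whole buffer is out or a real error occurs. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real error: ECONNRESET, EPIPE, ... */
        }
        buf += n;                  /* partial write: advance and keep going */
        len -= (size_t)n;
    }
    return 0;
}
```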
With C and something like libev you won't have issues with load before you have issues with weird HTTP stuff that works for other clients or servers. I just had to modify my squid config to allow the Airflow Python client to do a GET with a four-character body, `null` (and no url-form-encoded content type).
This is quite common, unfortunately. In other news, I could do the jobs of every single other person in my company comfortably. Only my job is difficult.
All the best art, whether it be books, music, movies or software, makes it seem so effortless. Ironically, it's when you see something bad that you kinda realize what goes into it. Like, huh, hundreds of talented professionals worked really hard on this movie, and it still turned out this bad. Now imagine if I, a complete amateur, would give it a go...
> Now imagine if I, a complete amateur, would give it a go...
Now that one is actually tricky. I'd probably enjoy a mediocre amateur production more than a mediocre Hollywood production. It's a bit like the uncanny valley idea - it's easier to look at something meh than at something you know is a bit off from being good.
Excluding the bragging, I don't think this part of the first comment is incorrect: "You can implement a 90% usecase HTTP client in < 100 lines without golfing." We use the general-purpose curl machine for convenience, but a minimal version would suffice for the majority of cases. (Just an observation, I'm not suggesting we do that.)
This is a problem I fight against tooth and nail almost daily, and it's closely related to the not-invented-here syndrome [0]. I would even go so far as to call it the Dunning–Kruger effect of technical knowledge. For context, I'm a frontend developer slash consultant who usually gets called in when development slowly grinds to a halt due to an overbearing number of bugs and regressions. Almost exclusively, at their core these problems can be reduced to underestimating complexity and trying to build something from scratch. It's a very common theme: "I can build form validation from scratch, that's easy." But then there are async validators, fields that have circular references, and timezone issues. "I can build a table from scratch," but then there's endless scrolling, custom filters and performance issues. I think the common denominator is that all of these issues look simple when you address the smallest possible usable case, but quickly blow up in your face when you start building without thinking about what you'll need to add later. Unless you want to hire me, don't be that guy. Forms, tables, network layers and most other things are already solved problems, so please don't try to build them from scratch unless you really need to and have a good plan.
I'd attribute it to developers that have a complete lack of empathy and professionalism towards both their future selves and their colleagues.
Once one has gotten burned sufficiently by undocumented convoluted code I think there's an understanding that starts forming on how to make choices in libraries and structuring code. The problem is devs that jump around from greenfield to greenfield project and never get to experience that pain.
This is definitely a problem. There's also the problem of devs that can't do the simplest task without pulling in 5 massive libraries. The thing that keeps programming fun is that every case is a little different.
Of all the libraries I have ever used, Curl is the one I used the most and found the most useful. I have never had the idea to rewrite it or even use another library to replace it.
With one of the comments I immediately thought "this is HN". And I was right.
But on a second look I'm quite certain it wasn't the wording or the sentiment, but the beige background color that I hadn't consciously noticed before.
It seems like the correct response for these "reimplementation" types of criticism is to encourage the person to do so and share the results.
I have, for personal gratification, attempted (with varied success) to reimplement several utilities. The experiences were quite eye-opening, seeing how well designed the "standard" utilities are, and identifying new ways to improve my code.
Why do so many people indulge in casual mockery? Is it because we are told not to respect authority or institutions but instead to respect ideas? I see no reverence anymore; just casual mockery everywhere one turns. I am constantly awestruck by how amazingly beautiful everything is in the GNU/Linux world. What would I do without FOSS!
This is just the usual thing where people forget that it is the final 5-10% of a good project that takes 99% of the time. It is ok, they will remember, the hard way, one day soon!
I'd be fairly proud of myself if I managed to write a working HTTP/0.1 library and command-line client in a weekend. I can barely imagine the amount of work put into cURL.
More and more APIs use HTTP as the transport for various reasons, but one of them is that HTTP has considered large number of edge cases and has been tested "in the fire" - rolling your own custom API with security and encryption is much more fraught with dangers.
But I would bet that those only use the GET or POST functionality of HTTP, and that the real reason it's used is because it's easy to quickly get something up and running.
It's a short-term gain, but quite inefficient in the long run.
ZeroMQ is a perfectly valid replacement for HTTP in internal networks. There's no reason for everything to be human readable.
Micro-services, for example, are pretty much all HTTP + JSON. The only place for HTTP is the browser and purely for historical reasons. Using it for anything server-server is a waste of network and CPU.
Blurring does little to hide identity; as it is, the only difference in finding that Twitter profile without even needing any tools is Googling "<word> <word> <word> <word> twitter" instead of "username twitter". Doesn't hurt that the profile pic is included either.
Simply filling the space with a solid color is not only easier but infinitely stronger.
And there's the fact that it would be really easy to rewrite a usable subset of curl if you use your language's `import curl`, which has, behind the scenes, had as much development effort put into it as curl itself.
Indeed. And that default client is, statistically, likely to use libcurl under the hood (because who has time to fix all those pesky corner cases that you wouldn't think of in 10 years? Obviously someone who's been doing that stuff for 25 years.).
This happens everywhere. The devil is in details. If we use an application and the surface we see looks simple, we assume it must be simple to build :) Then we start to see all the complexity behind.
Shameless plug: this kind of post was the reason I wrote about it recently too [1]
I never used Curl or looked at the source code, I don't know about the specifics, but rewriting a simpler copycat from scratch is not always a bad idea.
I've done that a few times, to prove a point or because the original was bad and too difficult to maintain.
I don't think it does. curl has been maintained for nearly 25 years now; there's a benefit to that, and I'd need someone to come up with very compelling reasons to write a new implementation before abandoning it beyond "my code is shorter".
Every piece of software that people actually use will face criticism. But quite honestly, for someone with the clout of Mr Stenberg to out some poor Joe Coder just because they dared to have an opinion is just as pathetic, if not more so.
Writing an HTTP 1 client in a weekend is doable, and Rust is probably a better language today than C is (with some caveats, obviously). Is having that opinion enough to be publicly humiliated on Mr Stenberg's blog? I sure hope not.