Oh wow. Just heard that Notesnook is on the front page of HN. I am the co-founder, so if you have any questions, feel free to ask.
Oh and to clear a few confusions:
1. Notesnook is 100% open source. That includes the server, client apps, and everything else. It's not partially open source.
2. Zero knowledge does not mean Zero Knowledge Proof but Zero Knowledge as in we, the company and people behind Notesnook, have no knowledge regarding what you have in your notes. I see that this might be more accurately called "no knowledge".
There's no point in offering users self hosting if you are going to throw them into a fire by doing so. Notesnook is rapidly evolving and changing, and hoping for self hosted users to keep up is impractical. Instead, we want to first stabilize the backend, and then think about self hosting.
After v3, our primary focus will be on self hosting and getting an audit done.
> Notesnook is rapidly evolving and changing, and hoping for self hosted users to keep up is impractical
I don't know...does self-hosting equal slow changes? For me that's part of being a self-hoster: I have to keep the software up to date, and I actually appreciate it if updates are frequent.
> After v3, our primary focus will be on self hosting
Sounds good, good luck on further development, if I ever see a self-hosting guide I'll check y'all out ;)
Not slow changes. Offering self hosting officially means that we have to be aware of users who are self hosting before we make any drastic changes, write migration guides, and give some sort of support. All that has an impact on productivity.
Build an export and import and offer it with Docker and I'm on board. I don't care much if I have to take a few extra steps... but I'm done hosting my private stuff in the cloud. Self hosting is the future.
Yes, it's a PWA, so offline first. It uses Yjs with WebRTC for p2p generically, and the "last mile" will be a "light" Electron app that you can use via WebRTC to store files.
There's no real backend server, so really the only infrastructure is the signalling, and it's straightforward to scale that.
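For anyone curious what that wiring roughly looks like, here is a minimal sketch with Yjs and y-webrtc. The room name, signalling URL, and render() are placeholders I made up, not this project's actual code:

import * as Y from "yjs";
import { WebrtcProvider } from "y-webrtc";

// One shared Y.Doc per workspace; y-webrtc finds peers through the signalling
// server and then syncs document updates directly between browsers.
const doc = new Y.Doc();
const provider = new WebrtcProvider("my-workspace", doc, {
  signaling: ["wss://signaling.example.com"], // the only server-side piece
});

const notes = doc.getMap("notes");
// stand-in for whatever the UI layer does with the updated state
const render = (state) => console.log("notes changed", state);
notes.observe(() => render(notes.toJSON())); // fires on local and remote edits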
Might still be a niche, but it serves my day job needs. Field work and reporting asynchronously is a pain.
As someone who has been doing local-first for the last 3 years with Notesnook, let me tell you: it's not all gardens and roses. Local first has its own very unique set of problems:
1. What to do with stale user data? What happens if a user doesn't open the app for a year? How do you handle migrations?
2. What about data corruption? What happens if the user has a network interruption during a sync? How do you handle partial states?
3. What happens when you have merge conflicts during a sync? CRDT structures are not even close to enough for this.
4. What happens when the user has millions of items? How do you handle sync and storage for that? How do you handle backups? How do you handle exports?
One would imagine that having all your data locally would make things fast and easy, but oh boy! Not everyone has a high-end machine. Mobiles are really bad with memory. iOS and Android have insane levels of restrictions on how much memory an app can consume, and for good reason, because most consumer mobile phones have 4-6 GB of RAM.
All these problems do not exist in a non-local-first situation (but other problems do). Things are actually simpler in a server-first environment because all the heavy lifting is done by you instead of the user.
> 1. What to do with stale user data? What happens if a user doesn't open the app for a year? How do you handle migrations?
version = db.query("select value from config where key='version'").fetch_one()
switch (version) {
case 1:
  db.migrate_to_version_2()
  fallthrough
case 2:
  db.migrate_to_version_3()
  // ... and so on
}
// each migrate_to_version_N bumps the stored version, so re-read it
version = db.query("select value from config where key='version'").fetch_one()
assert(version == 3)
start_sync()
Just don't delete the old cases. Refuse to run sync if device is not on the latest schema version.
One of my Django projects started in 2018 and has over 150 migration files, some involving major schema refactors (including introducing multi-tenancy). I can take a DB dump from 2018, migrate it, and have the app run against master, without any manual fixes. I don't think it's an unsolved problem.
> 2. What about data corruption? What happens if the user has a network interruption during a sync? How do you handle partial states?
Run the sync in a transaction.
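A minimal sketch of that idea with e.g. better-sqlite3 (the notes table and columns are made up): either a whole pulled batch commits, or none of it does.

import Database from "better-sqlite3";

const db = new Database("notes.db");
db.exec("CREATE TABLE IF NOT EXISTS notes (id TEXT PRIMARY KEY, body TEXT)");
const upsert = db.prepare(
  "INSERT INTO notes (id, body) VALUES (?, ?) " +
  "ON CONFLICT(id) DO UPDATE SET body = excluded.body"
);

// Apply a pulled batch atomically: a dropped connection mid-sync can leave
// the batch unapplied, but never a half-written local state.
const applyBatch = db.transaction((items) => {
  for (const item of items) upsert.run(item.id, item.body);
});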
> 3. What happens when you have merge conflicts during a sync? CRDT structures are not even close to enough for this.
CRDTs are probably the best we have so far, but what you should do depends on the application. You may have to ask the user to pick one of the possible resolutions. "Keep version A, version B, or both?"
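A contrived sketch of that last-resort flow (every name here is hypothetical):

// Try a trivial auto-merge first; otherwise ask the user, and keep both
// versions if they can't or won't decide.
async function resolveConflict(local, remote, askUser) {
  if (local.body === remote.body) return local;  // nothing to resolve
  const choice = await askUser(local, remote);   // "A" | "B" | "both"
  if (choice === "A") return local;
  if (choice === "B") return remote;
  return { ...local, conflictedCopy: remote };   // keep both, surface in the UI
}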
> 4. What happens when the user has millions of items? How do you handle sync and storage for that?
Every system has practical limits. Set a soft limit to warn people about pushing the app too far. Find out who your user with a million items is. Talk to them about their use cases. Figure out if you can improve the product, maybe offer a pro/higher-priced tier.
> Mobiles are really bad with memory. iOS and Android have insane levels of restrictions on how much memory an app can consume, and for good reason, because most consumer mobile phones have 4-6 GB of RAM.
You don't load up your entire DB into memory on the backend either. (Well your database server is probably powerful enough to keep the entire working set in memory, but you don't start your request handler with "select * from users".)
You're asking very broad questions, and I know these are very simplistic answers - every product will be slightly different and face unique trade-offs. But I don't think the solutions are outside of reach for an average semi-competent engineer.
> You may have to ask the user to pick one of the possible resolutions. "Keep version A, version B, or both?"
For structured data, with compound entities, linked entities, both, or even both in the same entity, that can be a lot more complicated.
If a user has updated an object and some of its children, is that an atomic change, or might they want the child/descendant/parent/ancestor/linked updates to go through even if the others don't? All of them or some? If you can't automatically decide this (which you possibly can't in a way that will satisfy a large enough majority of use cases), how do you present the question to the user (bearing in mind this might be a very non-technical user)?
Also what if another user wants to override an update that invalidates part/all of their own? Or try to merge them? Depending on your app this might not matter (the user might always be me on different devices, likely using one at once, that is easier to understand than the user interacting with others potentially making many overlapping updates).
I think you misunderstand. My intention was not to say local-first is bad or impossible; it's not. We have been local-first at Notesnook since the beginning and it has been going alright so far.
But anyone looking to go local-first or build a local-first solution should have a clear idea of what problems can arise. As I said in the original comment: it's not all gardens and roses.
Just a few weeks back a user came to us after failing to migrate GBs of their data off of Evernote. This, of course, included attachments. They had successfully imported & synced 80K items, but when they tried to log in on their iPhone, the sync was really, really slow. They had to wait 5 hours just to get the count up to 20K items, and that's when the app crashed, resetting the whole sync progress to 0.
In short, we had not considered someone syncing 80K items. To be clear, 80K is not a lot of items even for a local-first sync system, but you do have to optimize for it. The solution consisted of extensively utilizing batching & parallelization on both the backend & the user's device.
The result? Now their 80K items sync within 2 minutes.
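The general shape of that kind of fix, as a made-up sketch (every name is hypothetical, not Notesnook's actual code): pull in fixed-size pages, persist each page before asking for the next, and checkpoint the cursor so a crash resumes instead of starting over. The parallel part (several pages in flight at once) is left out for brevity.

async function pullAll({ fetchBatch, saveBatch, loadCursor, saveCursor }) {
  let cursor = await loadCursor();                        // resume from last checkpoint
  while (true) {
    const { items, next } = await fetchBatch(cursor, 500); // 500 items per request
    if (items.length === 0) break;
    await saveBatch(items);                               // write locally before advancing
    await saveCursor(next);                               // checkpoint: a crash won't reset to 0
    cursor = next;
  }
}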
The problem wouldn't exist. This was about the phone fetching 80k new items from the server. If the phone just shows the item you're looking at, one at a time, and doesn't try to sync everything, there's no such problem.
I've been working on mobile apps developed for education in Afghanistan, rural Rwanda, etc for the last 9 years. I used to think that sync was the way, but I have learned from painful experience.
4 (extended): What happens when the user has access to millions of items, but they probably only want a few (e.g. an ebook library catalog)? Do you waste huge amounts of bandwidth and storage to transfer data, of which 99.9% will be useless? We had a situation where the admin user logging in on a project that had been running for years resulted in a sync of 500MB+, 99.99% of that data the admin would never directly use.
Also: do you make users wait for all that old data to sync before they can do anything when they login?
Relying on sync is basically relying on push-only data access. I think in reality most offline/local first applications will work best when they push what a user probably wants (e.g. because they selected it for offline sync, starred it, etc) and can pull any other data on demand (in a way that it can be accessed later offline).
Query-based replication works when you know what the user probably wants to have in advance (e.g. a device in a warehouse needs records for that stock in that warehouse, not others). But that's still push.
You still need pull on demand access when a user opens any random item where we don't know in advance what they probably want (e.g. a discussion board scenario).
I'd say you're spot on except for point (3). There are a number of CRDT and event-log approaches that, when combined properly in order to preserve user intent, can solve almost all merge issues for applications that do not require strong consistency.
> 4. What happens when the user has millions of items?
Partial replication is a problem I haven't seen many people solving but it is definitely the next frontier in this space.
I am the developer of RxDB, a local-first javascript database, and I made multiple apps with it and worked with many people creating local first apps. The problems you describe are basically solved.
> What to do with stale user data?
Version/schema migration in RxDB (same for IndexedDB and WatermelonDB) can be done with simple JavaScript functions. This just works. https://rxdb.info/data-migration.html
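Roughly what that looks like in RxDB, as a sketch based on the linked docs: bump the schema version and provide a strategy that transforms each stored document from the previous version. The collection, fields, and the already-created database handle `db` are assumptions for illustration.

const notesSchemaV1 = {
  version: 1,
  primaryKey: "id",
  type: "object",
  properties: {
    id: { type: "string", maxLength: 100 },
    body: { type: "string" },
    createdAt: { type: "number" },
  },
  required: ["id"],
};

await db.addCollections({
  notes: {
    schema: notesSchemaV1,
    migrationStrategies: {
      // runs once per stored document to upgrade it from v0 to v1
      1: (oldDoc) => {
        oldDoc.createdAt = new Date(oldDoc.created).getTime();
        delete oldDoc.created;
        return oldDoc;
      },
    },
  },
});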
> What about data corruption?
Offline-first apps are built on the premise that internet connections drop or do not exist at all. The replication protocols are built for exactly that purpose, so they do not have any problems with that. https://rxdb.info/replication.html
> What happens when you have merge conflicts during a sync?
> What happens when the user has millions of items?
There is a limit on how much you can store on a client device. If you need to store gigabytes of data, it will just not work. You are right at that point.
> How do you handle backups? How do you handle exports?
Some domains just don't have these issues, like note taking where data is small and there are many acceptable ways to handle conflicts, like backup files and three way merges.
Maybe the question is less "How to make this work local first" and more "How to squash down the problem domain into something that just naturally fits with local first"?
I wish we had something like an embeddable version of SyncThing, that had incoming file change notifications, conflict handler plugins, and partial sync capabilities, and an easy cloud storage option.
I think most everything I've ever wanted to p2pify could probably be built on top of that, except social media which is a separate really hard thing.
I don't think even you understand what you just said.
Consumer devices are notorious for their reliability problems, compared to a full-blown server that you have 100% control over, with almost insane amounts of RAM & CPU power and a lot of guarantees.
Running migrations on a server is far, far different from running them on every user's device. The sheer scale of it is different.
> Using 4 GB per user on your backend works?
That was a comment on the average RAM on a consumer device - not the total RAM required per user.
> Running migrations on a server is far, far different from running them on every user's device. The sheer scale of it is different.
Well, it's not only just that. Among other things, some of the application instances would be outdated but still need to work, so you would need to support _all_ the DB schemas you have ever had for your app.
I know I understand what I said, and I am not convinced by anything you said.
What reliability problem would prevent you from running local-first software but doesn't interfere with running a thin client?
Why would the business logic part of your app require more RAM on the end-user device that it requires per-user (or per-document, etc) on a server?
Why do you claim that running migrations is so fundamentally different here and there?
If you want to argue I would appreciate you doing it with real arguments and experiences rather than even more unsubstantiated claims and statements like "you don't understand what you just said".
I don't think that's true. We have been using ASP.NET Core + SignalR for everything in Notesnook[0]. It has great performance, libraries for everything out there, very good ergonomics, and a mature ecosystem.
I know you can probably get more performance out of Rust or Go, but I don't think req/s benchmarks really hold up in reality. If you are running a startup, you'll probably see 1 to 2 req/s unless you go viral. .NET Core can easily handle that and more.
The author is going into technicalities without much actual substance, ending with: it depends.
I think whenever we, as programmers, try to pin down a certain principle, it bites us. Hard. DRY was cool as an observation but when it got turned into a law we saw the spaghetti code.
Duplication, on the other hand, is detested almost as much as the goto statement. Let me tell you, it's not that bad. Duplicate code makes everything more flexible. It helps you to NOT bend over backwards in order to change a line of code. It allows you to NOT touch anyone else's code.
So many good things. Of course, I agree with the author's summary of the bad things that can happen with duplicated code. But there's a litmus test for that:
If you have to make changes in multiple blocks of duplicated code in order to change the behavior of something, there's a problem. DRY out the code so you only have to touch 1 place.
If, however, 2 blocks of code LOOK similar but aren't actually the same, and changing one block doesn't make the other block outdated and stale, you are good to go.
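A contrived illustration of that second case:

// These look identical today, but they encode unrelated rules that change
// for different reasons - merging them would couple username policy to
// project-name policy by accident.
function isValidUsername(name) {
  return name.length >= 3 && name.length <= 32;
}

function isValidProjectName(name) {
  return name.length >= 3 && name.length <= 32; // a coincidence, not a shared rule
}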
Judge and decide. It's just 2 approaches that when taken to an extreme can cause a lot of pain, but if used with common sense, nothing is simpler.
> Duplication, on the other hand, is detested almost as much as the goto statement.
Honestly, even the goto statement isn't that bad. It's pretty useful in C code. I'm not saying anyone should put it in a new language, but the amount of hate it gets is really just related to BASIC monstrosities from the 1970s, not any real-world applications of it.
Anyone who says X must be done in Y way is almost certainly wrong. Who is Tiago Forte to tell me how I should organize my thoughts? Am I a robot to follow so simply in the footsteps of another? I would like a more general approach, please.
Any serious note taker eventually realizes that the way their brain works doesn't really fit into PARA, ARPA, or any other standardized system.
Note taking is a messy business. It's not supposed to be organized because you take notes to become more organized, not to organize more notes.
Just take your notes.
Organization will automatically happen. Humans are very bad at doing things randomly. There's always an order, a sequence, a method to the way we do things. Who's to say your method is better than mine?
People will surely, as you say, adopt certain habits, e.g. ordering notes chronologically for lack of a better idea.
That doesn't mean that there isn't a "best" system, which anyone who wants to be productive should adopt. But what I regret is dogmatism unsupported by empirical evidence of superiority (=unscientific claims), and the world of productivity systems (including note-taking) is full of that.
Notes may contain ideas that need to be captured so that they aren't forgotten.
Notes may contain event logs that need to be captured to verify later what actually happened (or not).
Notes may contain task lists, potentially with commitments and/or assignees and associated promised completion dates.
(Many notes can be repurposed later: an informal list of things may become a list of chapter titles in a book one day - I guess that view makes me a pragmatist [pragmaticist, more precisely, cf. Peirce], maybe not surprising for a student [yours truly] of a student [Spärck Jones] of a student [Masterman] of Wittgenstein).
The good thing about computers as tools that support our personal knowledge management is that we do not need to worry that much about organization compared to e.g. a paper-based system, because we can always fulltext-index everything, so there is a high chance of re-finding things (known-item search), at least as long as we give things specific names or can recall sets of keywords we likely used in describing something.
EDIT: PS: My personal note on note-taking is that indexed plain text beats any particular software anytime - because notes (any data, really) live longer than the note taking software of the day, so you don't want lock in.
I think all note apps are just databases. Some day an LLM might be able to make sense of the messy notes in my note taking app, but I am not holding my breath. My notes are not supposed to stand the test of time. They are ephemeral thoughts relevant only for a specific period - a hedge against the risk of forgetting them.
The most an AI should do is find relevant notes on a topic I search. That's it. I don't need its robotic voice summarizing my grotesque ideas in a painless, dispassionate way. The last thing I need is more of my ideas seeping into more of my computers in hopes of somehow making me smarter.
These guys also tried to submit PRs on the Notesnook[0] repository. The PRs are really good, but there's no way to talk to the actual developer working behind those PRs. They have a single central account named "gitstart", and all PRs that any of their users work on fall under that account. Of course, we couldn't accept PRs from them because they don't follow the DCO[1], i.e., the DCO requires that the committer MUST NOT be an organization.
I talked to them about this and they said they'd work on it etc. etc. Not sure if anything has changed in that regard or not.
> The PRs are really good but there's no way to talk to the actual developer working behind those PRs.
I'd really like an avenue to get into the US market as a remote worker, but am being unfairly treated by this job market. It is a pity as I am both a highly skilled programmer and have nearly a decade of experience. I'd consider this service if it could serve to showcase my skills, but if I am not going to get any credit for doing the work personally, there doesn't seem to be much point to it.
If you are a highly skilled programmer, why not just start contributing to open source repos yourself? You don't need to go through GitStart to get there.
Many companies, including my own and the commercial open source companies mentioned in this post, consider open source contributions a major factor in hiring remote talent.
If you're comfortable sharing, could you elaborate on how a transition from unpaid contributions to some kind of paid work arrangement typically happens?
In your experience, is there usually a more or less deterministic path to a stage where the question of starting to get paid usually comes up? Who initiates it?
The result of this discussion may very well be that the company isn't interested in this kind of relationship for any number of good reasons. But what's important, I think, is for a contributor to be able to have the right expectations coming in. As in:
- Should I join on a purely for fun basis and see where it goes from there, keeping in mind things most likely will stay this way going forward.
- Or if everyone is happy with the quality of code, communication, etc across a number of pull requests, then it's definitely OK and expected to bring up the question of payment/employment.
Your public contributions are a showcase of your alleged skills, in most cases.
There is no deterministic way to transition from unpaid to paid. It's just one signal among many that a recruiter or company looking for services would look into.
I don’t think you’re being treated any more unfairly than anyone else trying to break into the US market. It’s simply a case that a decade of experience and being a skilled programmer are not enough to stand out from the crowd.
Depending on your location there may be legal restrictions preventing US companies from hiring you - even through a B2B contract.
Most companies hiring remotely don't care if your 1-person company is US-based or not.
The ones that care to hire within vetted countries for $reasons usually will not accept exceptions. A notable example is GitHub, which has a list of countries they hire from (even though they're owned by MSFT and could hire on the Moon if they wanted).
Having a company is mostly for tax purposes. It makes everything easier. I think the hiring company doesn't care if the contract is done with a business or an individual. Both are usually limited liability and offer no advantage in case of contract breaches.
If you are highly skilled already, it should be very viable to just work with OSS projects directly. For example in Python, the Django & DRF projects are always looking for contributors (though Django can take a long time to land substantial features).
In my experience as a hiring manager it was quite rare to see lots of OSS work in candidates’ GitHub accounts, but I’d absolutely prioritize those that had good work in OSS. (Also worth emphasizing that technical design, collaboration, and documentation are important and underrated, and can also be showcased in an OSS project. If you can demonstrate good communications in an async OSS environment, that would probably reflect well on your ability to contribute as a remote employee.)
All that said I’m not sure that OSS is the best resume builder. For big companies you need to drill LeetCode and system design. Perhaps for startups it is not the worst use of your time.
> For big companies you need to drill LeetCode and system design. Perhaps for startups it is not the worst use of your time.
Exactly. Any hiring manager with a brain and not bound by clueless corporate processes would use OSS contributions as a decent signal for proficiency and social skills.
That means nothing in a big corp though. The hiring panel will never accept a candidate that fails the Leetc0d3 test because that means other panels could do the same and then it all falls apart for them. Status quo and all.
To be fair, it’s a hard optimization problem. If you are trying to remove bias from your hiring process then it is difficult to objectively score things like OSS contributions. (I do agree it’s something most bigcorps could do better.)
As a small company you don’t need to try to remove bias with objective metrics (indeed, “culture fit” and “thinks like me” can be good heuristics for building a small tight-knit and high-performing team) but when you hit the company size where you must introduce multiple layers of management, then fully trusting each line manager’s subjective judgements can lead to very disparate quality and other political/organizational issues.
We currently attribute commits back to every single dev involved in a PR (including reviewers) as co-authors. We also actively work with our customers to allow devs to mention their contributions in their CV publicly. And you can always reach out to them directly if they have an open position (especially mentioning your experience working with them through GitStart)
What would be an ideal way to attribute the hard work back to the devs in our case?
> Of course, we couldn't accept PRs from them because they don't follow the DCO[1], i.e., the DCO requires that the committer MUST NOT be an organization.
I read the certificate. It isn't long:
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Where does this require that the submitter not be an organization? Even if you're fully committed to that idea, the obvious approach would appear to be:
(1) GitStart releases their patch under the open source license of the project's choice.
(2) An employee of GitStart submits it, in their official capacity, to the project, asserting under clause (b) that the contribution is based on (consists entirely of) previous work that is covered under an appropriate license.
That's a weird document. It seems that it actually wants the submitter to certify that either A & B are true, or C is, and that D is acceptable. But it lists them like you must accept all 4, even though A & C can't both be true at the same time.
Yes, it's malformed. There is no expectation that (a) and (c) can be simultaneously satisfied, since they are linked by a chain of "or"s. But it seems clear that the intent of the document is that you're supposed to certify two things:
I. At least one of (a), (b), (c) is true.
II. I understand and agree as specified in (d).
In other words, (d) is not parallel to (a), (b), and (c) -- it's a drafting mistake to put it in the same sequence of clauses.
It is on our roadmap to properly integrate the DCO on the platform, as it's a hard blocker for all repos under the Linux Foundation. We already add each dev that touches the PR internally as a co-author now.
The problem lies in the requirement that the "committer must NOT be an organization", which would require us to do a full review of our legal contract with devs so that this condition is upheld properly.
It's a lot easier to trust an individual than it is to trust an entire organization which is effectively trusting every individual who may have access at that org at any given time.
We often think of orgs as these big trusty things but honestly they are just groups of individuals, and it's harder to trust N individuals than it is to trust one individual, especially when the N is opaque and unbounded.
What they should do is co-sign the PRs/commits so it's the worker's personal account + the GitStart account. Then the workers are actually building a git repertoire for themselves which puts them a step closer to independence. That might go too much against the exploiting cheap labor theme that seems to be lurking in the dark here though. That said, I would love to be pleasantly surprised if GitStart is OK with co-signing commits alongside accounts directly in the control of their workers. Would definitely be based, doubt it will happen.
What do you mean by co-signing? We currently do attribute the work using "Co-authored-by:" in the commit message. I'm not sure if there's any other/better way to do it.
On top of that, we are thinking about a developer profile you can share as part of your resume. I'd love to hear some suggestions on what you think we should include in it.
I retract my previous statement! If you're already doing the Co-authored-by, that is fantastic, then their names are clickable on GitHub, and the developers themselves will be able to directly interact through their own GitHub accounts if they are @'ed by the customer. This is great. Definitely advertise that aspect.
In general my advice is this is one of those gray area markets where it is extremely easy to exploit workers, so you should do everything humanly possible to make sure that:
a) workers in this program are progressing with the goal being they can eventually "graduate" to the point where a middle-man is not required. This won't be possible across the board but an evil version of this startup would be designed to keep workers in this situation as long as possible, so just don't be that.
b) Allow customers who want to direct hire specific workers to do so seamlessly and don't do hefty referral fees that are enough to stop a lean startup from being able to pull the trigger. I've personally witnessed startups unable to pull the trigger in such situations because they can't afford to "buy the person out" of whatever referral company owns them. Don't be like that. If you absolutely must do a referral fee, structure it so they at least get back the money in credits with your program or something.
c) Treat workers like the assets they are -- each worker that eventually gets a job through your program becomes a marketing asset that will drive referrals from their home country when they eventually find success and tell their friends how they succeeded. Make sure you have created a positive enough experience for them that they will actually recommend you and dear god give them referral fees when they send you someone.
Your advice is spot on, and it's why we wanted to build something better than the current status quo!
a) we already have a sizeable group of alumni who have gone through GitStart over time, with many still in touch. We are working to bring them all together on Discord
b) there is no current restriction or even a referral fee for devs and companies to work with each other. The only thing we ask is for devs to either be full time on the platform, or work with the company directly and pause GitStart
c) good people recommend more good people! And we have a program where alumni get free credits for their own companies (over 5 have launched their own company and used those free credits)
We currently do not have a referral program for alumni to recommend devs (it is there for currently active devs) but that’s a great idea to roll out!
I would NOT call this frugal at all, actually. You are overspending in a lot of places and underspending in others. $281.32 per month for such a simple service is...astonishing. Replace Vercel and Intercom with Cloudflare and Chatwoot (self-hosted), and you have saved $120/mo. Get rid of Ahrefs for Ubersuggest (or no SEO tool at all) to save about $80 more. That leaves you with $81.32/mo. If you try even harder, you can easily bring that under $50.
But I get it. If X tool saves time, why not pay for it? I wouldn't call that frugal though because you are still spending where you can save.
Use one of the OSS password managers to save on 1P. If you have a cloud sync service you already use, KeePassXC + cloud sync will solve having your passwords available everywhere you are. You can even use Syncthing for free (though I'd at least make a one-off donation) since you can use it peer-to-peer between your devices.
SEO really needs disruption here on price. Ubersuggest seems decent.
On Intercom: OP can probably go into more detail on why Intercom's widget is so much better. If it's just animations and UX, then yeah go find some alternative or self host.
What's useful in the self-hosting space is finding either a tool to manage server provisioning & management (e.g. Ploi), or a hosting provider that makes it easy to install (e.g. DO marketplace). Then you're only spending a small fee on the management and only spending on hosting cost (which for a long time can be negligible).
Spending some time to DIY with reliable tools (automated where possible) is IMO still preferable to spending this much money if starting from scratch.
Does nobody here see the irony in bending over backwards to avoid spending money on software products when the entire purpose of your business is to sell a software product?
I could see it if we were talking about a side project with no commercial intent. But this post is about an indie SaaS founder...if a few hundred dollars a month is a barrier to success, then that's a pretty good sign your indie startup isn't starting up.
I find these threads endlessly amusing considering that, for hundreds of years, starting a business involved massive investments in real estate, physical equipment, etc. We're living the dream, yet somehow quibbling over spending $10/month on a piece of software. If it saves you just 2 hours per year, then it's worth it.
I generally agree, but I'd say it depends. I'm not too surprised that software developers are less likely to spend money on software - they do have another option non-programmers don't have. I'd also not be shocked to hear that trained accountants starting a company are less likely to hire accounting firms than non-accountants.
And most software I deal with isn't exactly free of opportunity cost - I dabbled a bit with accounting tools to the point of frustration and time sunk that I decided to roll my own (built around ledger, in case anyone cares) in about a day. Writing my own tool is sometimes faster for me than trying to figure out how to use one that's not designed around my specific needs.
Sure, $10 doesn't seem like much, but all these little invoices can creep up to a non-trivial sum.
Yes, but shouldn't I be making even more money by using these services to optimize those other things? If, say, Calendly helps me get booked for appointments, appointments that wouldn't have happened without that app, then isn't that spending money to make money?
If you think so, it's probably true. That's why I can't really disagree with parent :D
I find the cost and value of tooling options quite difficult to fully grasp, it often comes down to gut feeling IMO.
What I would want to caution folks for is that the cost of a tool is not only what it actually costs. It's also the amount of time that has to be invested into usage, and potentially migrating to something else a while down the line in case of discontinuation, disappearing capabilities, new requirements, ToS or pricing changes, ...
Is it bending over backwards though? The problem is that SaaS solutions scale in cost as well, so when you grow you pay more. For a small startup, there is always a cut-off where the price of the SaaS service(s) explodes well past its utility, but by then you are hooked and you can afford it. But should you? Personally I enjoy profits, and more profits are better. I cannot accept using tools that I don't really need but, come success, cannot easily get rid of anymore. And then suddenly that $10/mo is $23k/mo, and I knew that when I integrated it in the first place, but I thought it would save me 2 hours per year. And at the new price the utility is not suddenly saving me 100 or more hours a year; more like 4.
I don't doubt it's possible, but I would love an example of a SaaS product that successfully scales pricing from $10/m to $23,000/m on customer segments that aren't enterprise-scale and has the kind of lock-in you describe.
I ask because I would like to quit my job and build a competitor ASAP.
Also, the problem isn't the potential cost down the road. If you're scaling on any SaaS product to the $23k/m tier, then congrats, you've made it!
The problem is, there’s a 90%+ chance all your cost optimizing upfront was a giant waste of time, because a vast majority of startups don’t work out. And the ones that do often pivot.
Two things: the first is that it's easier to estimate costs with services like 1P due to it being mostly static, unlike other services where costs are quite dynamic ranging from different factors like usage and hours spent. The second is that once you reach that scale (more than 10 members), your costs are so large on personnel alone that $10/person is an absurdity to be frugal about when it can save you time.
If you have employees, you’re paying hundreds of thousands of dollars per year for them. Paying an extra $10/m to save 2 hours of their time is still a profitable trade.
I think others have addressed your comment well, but I'll add my 2c:
I think it's more about minimising the cost with effective alternatives rather than just taking whatever the "brand name VC startup/scaleup" is in the space and forking over hundreds of dollars.
You will most likely always have to spend money to make money, I agree. But if starting from nothing, having an immediate -$200/month (up to $500+ once the free credits run out) before you actually make any money does not sound like a good time.
Also take into account when running multiple such projects. Survivorship bias in our space is real, and it takes a lot of hits to "make it big". That can easily pile up to thousands for a handful of projects while none of them get near break-even before you go bankrupt.
All it takes is finding a good-enough alternative in the space, or if you trust the software enough to self-host, do that.
There's nothing wrong with actually trying to be frugal. We're supposed to have the skills to be more self-sufficient in this space, and I don't think it's a negative to want to actually do so. And neither is paying for something if you feel the premium is worth it.
It depends a lot on your time and experience... Also, what you want to know. There's no guarantees at all with a SaaS startup, and if you can increase your knowledge base at a minimal time sink it may be worth it. I have a dozen things I tinker with and no expectations, so I treat it more as a hobby.
I am spending about $140/mo on a dedicated server currently, mostly because I want to run my own mail server without depending on a relay host or nickel and dime costs ($2-10/mo per address). This now runs just about everything I use. Nearly all my apps are dockerized so I can port them to another host pretty easily.
I've also been playing with writing for Cloudflare directly (workers & pages) as well as CockroachLabs (CockroachDB Cloud). Both with relatively low startup pricing and can grow as needed based on usage/demand, both with a relatively clean self-host strategy as an exit.
It would be too easy for me to wind up paying $20-30/month or more on a dozen or more projects. I try to keep this down, as if I don't touch something for a month, or don't complete it for a year, it adds up quick... If I keep baseline costs low, it's less of an issue.
If I was 20 years younger, I'd probably have gotten more of these things done or put more effort in. The tech costs are a lot better today than 25-30 years ago when I was starting out. But I'm still pretty mindful of it.
That's just being nice. We developers are a right bunch of cheap bastards. At least I am. I'm working on it, but it's just so ingrained and pervasive, it's weird. I've been trying to figure out where it comes from for a while now, and how to get people to work past it, with little success.
If there was a magic debugger that made debugging 100x easier and faster, but it wasn't free and open source, I'd still have a hard time using it even though it would save me a bunch of time, and easily pay for itself in time and frustration saved.
I take frugal to mean being smart or conservative with spending. However, optimizing to the extent some people are talking about here would redefine the term to mean "irrational" for me.
In the context of a business, nearly any piece of software is a trivial cost compared to hiring a human (or yourself) to do a task.
> Does nobody here see the irony in bending over backwards to avoid spending money on software products when the entire purpose of your business is to sell a software product?
No, because nobody thinks that it'll happen to them. Their company's/startup's/side project's product will be worth paying money for, and surely nobody's going to try to start an open-source product that will compete with them.
Pay more, no matter how unreasonable it is because you can afford it. You’re living in the US bubble, where VC funded startups feed each other in often unsustainable ways. People were taught this argument to keep the wheel spinning.
The right way to build a startup is: offload your infra, self host everything you can while balancing the maintenance effort/cost.
Looks to me like the 1P biz rate is $7.99 per chair per month, so once you hire employee number 100 you're spending nearly $9,600/yr ($7.99 × 100 × 12 = $9,588). That shouldn't put you out of business, but it still could be spent elsewhere?
I used to be CTO of an Uber Eats style service that ran on hosting costs of about £100pm. Thousands of orders every few days, real time driver tracking, self hosted invoice generation, the works. Hetzner servers, Docker Swarm and Cloudflare for the tech.
It is truly amazing to me the value that is provided by renting hardware instead of VPSes, as long as you're willing to roll your own infrastructure instead of buying into a cloud provider's.
If you know what you're doing it's easy enough to roll out a multi-region distributed system with HA and backups on a pretty modest (<100pcm) budget that can handle competitive QPS.
However, most people do not - some will learn, but most will fall for the cloud marketing depts and become infra renters for life. Teach a man to fish, and so on.
Another reason is that if you look for an investor, one of the first things they ask is how / where you host it. If it is anything remotely DIY you'll be turned down.
It was actually painful to see startups spending thousands a month on hosting that could, as you say, easily be achieved for under £100 pcm. They would have had to get a contractor to set it up and for support, but it would have worked out much cheaper and they could have bumped the salaries of their workers.
I'm sure you can guess - pure risk aversion. Your business idea is risky enough, and they would need engineers to assess your (possibly ever-changing) DIY stack.
You see the same thing in the corporate world for in-house stuff. Your manager (and your manager's manager) don't want to hear about in-house or self-hosted things that AWS can provide.
This is totally understandable. It's a repeat of the whole "nobody ever got fired for buying IBM" mantra of computing's early decades.
> some will learn, but most will fall for the cloud marketing depts and become infra renters for life
Do you have any learning recommendations for someone looking to start down this path? I've only ever worked in an infra-renter context, and I've begun exploring the 'rent from Hetzner, manage your own infra' for personal projects, but I would love to learn from the paths of experts where possible.
I'm no expert but hopefully I can still point you toward the happy path: start very small and increase distributed complexity at your own pace until you can fully appreciate the entire end-to-end system and all the processes involved. The book DDIA (Designing Data-Intensive Applications) is a well-known 101, if a little primitive, and has references you can dig into as well.
Ideally, you also have some exposure to this at $job as simply building DIY infra horrors without seeing the real-world context, tradeoffs, etc. in which they typically operate will be misleading.
It's a pretty common abbreviation of high availability in this context, with a heavy implication of active-passive redundancy (although the GP looks like an exception here). It's used more than the full wording, and it's part of the name of some important tools.
So, yeah, it's good that you asked, because it's not as widely known as the people that use it think it is.
You'd be amazed how much you can get out of one of the Hetzner $3/mo ARM servers with the right code.
I use a $6/mo box for my primary business hosting, but I have a $3/mo one that I'm using to build v2, really just to prove what's possible. If you set up your DB and caching right you can do so much with so little...
Similarly, but on the opposite spectrum, I also wonder how far you can go on vertical scaling nowadays. For 200 EUR/month on Hetzner you get a dedicated 80-core ARM CPU, 128GB ECC RAM, 2TB SSD...using a good performance multi-threaded language, what _can't_ you run on that? It's ridiculous value.
> what _can't_ you run on that? It's ridiculous value.
Yeah - I remember a StackOverflow talk[1] where they basically said that they just vertically scale their database.
The fact that they were able to make it work tells me that most businesses should probably just go that route and avoid the headache of distributed systems[2].
2. Obviously a business should probably invest in redundancy when it comes to data (as did StackOverflow), but a pure "Raid 1" setup is the easiest of distributed systems to understand.
Those ARM boxes are incredible for the price, I'm using them to document a hobby K8S cluster because the overall cost is low enough that it won't price a hobbyist out.
Hetzner is great and I love it and use it for all my dev needs. But their availability is limited to a few zones, and for production I do suffer from the latency. Haven't found any other provider that costs anywhere near the same (every comparable server is at least 2x the cost). I use their AX41-NVME. Any suggestions for an alternative?
Vultr has some very cheap offerings and maybe worth a look. I can't speak to their reliability or service, but they have DCs everywhere even in South America and Africa where there are few cloud options outside of AWS. They're more expensive than Hetzner, maybe even a little over your 2x range depending on your load, but if latency is your issue they probably got you covered.
I've been pretty happy with my OVH server, but agreed on support... had better luck with chatgpt and a lot of additional reading getting my CIDR block of addresses configured under ProxMox (which made me far less worried about ChatGPT taking my job any time soon).
In the end, it was an interesting learning experience that I hope not to have to repeat. I only went for a single server as I had several smaller VPSes on DigitalOcean and wanted to add a mail server to the mix, and couldn't reliably send via DO or Linode, so it was easier to consolidate and run on a single/larger host for hobby projects.
I did that once and the data center burned down. Sure I could have the service spread over several centers and build distributed backups etc.
In the end, self hosting and self managing is a money/time trade-off; especially for a side gig I'd use SaaS and managed solutions. The one thing one has to make sure of is to not get locked in with a particular provider, so knowing how to do everything yourself is a very valuable skill.
To be fair, a datacenter burning down is pretty much on the bad luck side of risk management. It's in the same category as your distribution center being hit by an earthquake... I'd guess it wouldn't happen again, but who knows...
Didn't that happen from someone leaving a sink on or something, causing flooding and then the UPS batteries to short and explode?
I was there when someone at AWS accidentally unplugged an entire region.
Shit happens, it doesn't matter where it's hosted. People act like the cloud is infallible or something. You're literally sending lightning bolts worth of electricity through bricks of metal. Anything can happen.
I'm working on a middle-tier solution for this gap -- I call it Nimbus, but basically the idea is to provide managed services at low cost cloud prices.
There's no reason someone should have to run a service like Chatwoot themselves, the software is so good that it's mostly set & forget for most small use cases.
That's where I come in. Unfortunately I don't have ChatWoot yet, but I have (and use) Umami for page view tracking extensively on my own projects now, with Nimbus[1]. The dogfood tastes decent so far.
Yeah, but all improvements will likely be upstreamed/made open source (their license isn't AGPL or anything, but upstreaming just makes sense to me since it's MIT).
The first thing I want to do is add a backup mechanism that isn't just taking a snapshot -- there are a bunch of tools similar to Umami and I don't think any of them have a really good cross-project way of taking backups.
Feels like there should be a page view/analytics backup standard, so you can easily move from a tool like Plausible or Umami and try out a new one, like Fugu.
But outside of advanced functionality I think my platform is just a lot closer on cost. The instance costs don't go up per traffic served (especially since 99% of people won't need that) -- it's more like parts + maintenance (and since Umami is good software it doesn't need a TON of maintenance either, just regular patches and some monitoring/extremely light resilience engineering).
People used to make websites by uploading some php or perl scripts to a shared server (or just buying a domain and pointing it at their home ip). In that context, I can't imagine someone writing an article like this.
I think it's kind of sad and insane that people seem to have forgotten (or came into the industry too late to know) that it's possible to build fun and useful things that way.
I built a forum/social site that way back in the day and it was active and profitable for 10+ years.
It also feels like the art of optimization has been lost.
For small and medium sites a lot of today's crazy build pipeline and distributed asset hosting complexity can be sidestepped if you just focus on optimization and making sure that cache expiration dates for your assets are set correctly.
On the server side, it is considered "normal" these days to have maybe 50+ database queries per request. People then reach for expensive and complex database solutions (clustering, etc) before doing simple app-layer optimizations like caching.
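As an illustration, app-layer caching can be as small as a memoized query helper (a made-up sketch, not tied to any particular framework):

const cache = new Map();

// Serve an expensive query from memory for ttlMs before hitting the DB again.
async function cachedQuery(key, run, ttlMs = 60_000) {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value;
  const value = await run();
  cache.set(key, { value, expires: Date.now() + ttlMs });
  return value;
}

// Usage: const posts = await cachedQuery("front-page", () => db.query("..."), 30_000);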
The "shared server" concept is funny because as much as it seems old fashioned, it embodies the spirit of all the modern things:
* Serverless - in that you don't need to do server admin
* Online IDE - CPanel lets you make changes online to your code
* Continuous Integration - When you save the file online it is in immediate production
* Branching - you can copy a folder from one subdomain folder to another - this effectively gives you a poor man's git and environments in one swoop!
It is like a REPL of web dev. Nothing can come close to it, because once you made your changes they are live, there is no infuriatingly slow publishing step.
Ubersuggest has a comparison page[0]. From my understanding it boils down to Ubersuggest being cheaper and most likely enough for what most users will need, but Ahrefs has a larger tool suite (and much higher cost).
There are a lot of use cases for which you can’t actually save by kicking out Ahrefs. Of all the tools here it is the hardest one to replicate. I agree w the rest - but in many situations Ahrefs is not replaceable.
It provides detailed analytics on the popularity of different websites and the keywords that people search for; it goes really, really deep on this, much deeper than you can go with manual Google searches and various scripts. You can easily recreate Vercel on your own if you're willing to forgo some convenience, for example; there's no equivalent to that for Ahrefs - you can't realistically 'roll your own' or code your own Ahrefs. The web largely works on an advertising model, and Ahrefs is what tells you what has value for advertising purposes and what people will visit, or are visiting now. If you're doing something simple, it may not matter. If you're not, Ahrefs is indispensable.
We have been using Tiptap in production for more than a year in Notesnook[0]. Glad to see it finally launching here on HN!
We have had quite a long and rough ride in search of a stable rich text editor. We began with Quill.js, then migrated to TinyMCE, and finally settled on ProseMirror. Unfortunately, contenteditable is still absolutely horrible in web browsers, especially mobile ones.
Tiptap is a good choice if you are looking for a framework-agnostic and thin abstraction over ProseMirror. However, if you are primarily working with React, you should go with Remirror[1]. Tiptap's APIs are heavily inspired by Remirror (almost a duplicate in some places). Remirror has the edge in the maturity and stability of its API and extensions. The sheer number of utilities it offers to simplify ProseMirror's APIs is astounding. And trust me, you will need a lot of those utilities eventually. ProseMirror is not an easy API - really, really well designed, but not easy.
In the end, though, it's ProseMirror that's doing all the heavy lifting. And no matter how many abstractions you put on top of it, you will have to get really, really close to ProseMirror's internals. Tiptap or Remirror don't make that any easier or harder aside from the initial bootstrapping.
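For reference, the "thin abstraction" part looks roughly like this (a minimal sketch of a Tiptap setup; the element selector and content are placeholders):

import { Editor } from "@tiptap/core";
import StarterKit from "@tiptap/starter-kit";

// A working editor in a few lines - but custom nodes, commands, and node views
// quickly take you back down into ProseMirror itself.
const editor = new Editor({
  element: document.querySelector("#editor"),
  extensions: [StarterKit],
  content: "<p>Hello from ProseMirror, via Tiptap.</p>",
});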
Thank you for your honest feedback. Our goal is to make ProseMirror with Tiptap even easier to use. The demand for modern content editing continues to grow. Tools like Notion have raised the bar. We don't want to hide ProseMirror, we want to complement it.
We believe that a good editor needs not only a frontend, but also easy-to-use backend services that we try to integrate as seamlessly as possible. With our framework-agnostic approach, we support more than just React.
I've found react-prosemirror [1] to be a light ProseMirror React integration that supports node views. I had also considered Remirror before, but its abstraction and internals seem even heavier than Tiptap's.
Would love to see how bfs compares to fdir[0] for directory traversal. Even though fdir is using Node.js underneath, the comparisons I have done with fd & find are pretty close. Of course, bfs would probably be quite a bit faster...but how much faster exactly?
I tried it on my end. Built bfs from source using `make release`:
$ hyperfine -w1 "NODE_ENV=production node ./fdir.mjs" "./bin/bfs /home/thecodrr/ -false"
Benchmark 1: NODE_ENV=production node ./fdir.mjs
Time (mean ± σ): 965.5 ms ± 53.0 ms [User: 703.0 ms, System: 1220.5 ms]
Range (min … max): 858.4 ms … 1041.3 ms 10 runs
Benchmark 2: ./bin/bfs /home/thecodrr/ -false
Time (mean ± σ): 1.530 s ± 0.127 s [User: 0.341 s, System: 2.282 s]
Range (min … max): 1.401 s … 1.808 s 10 runs
Summary
'NODE_ENV=production node ./fdir.mjs' ran
1.58 ± 0.16 times faster than './bin/bfs /home/thecodrr/ -false'
$ cat fdir.mjs
#!/usr/bin/env node
import { fdir } from "fdir";
console.log(await new fdir().onlyCounts().crawl("/home/thecodrr").withPromise());
For some reason, reducing the UV_THREADPOOL_SIZE to 2 gives the best result on my machine (I have heard the opposite in case of macOS):
$ hyperfine -w1 "UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs" "NODE_ENV=production node ./fdir.mjs" "./bin/bfs /home/thecodrr/ -false"
Benchmark 1: UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs
Time (mean ± σ): 355.8 ms ± 16.1 ms [User: 479.4 ms, System: 356.0 ms]
Range (min … max): 328.3 ms … 387.5 ms 10 runs
Benchmark 2: NODE_ENV=production node ./fdir.mjs
Time (mean ± σ): 935.4 ms ± 52.7 ms [User: 695.8 ms, System: 1176.5 ms]
Range (min … max): 850.6 ms … 1031.9 ms 10 runs
Benchmark 3: ./bin/bfs /home/thecodrr/ -false
Time (mean ± σ): 1.534 s ± 0.104 s [User: 0.353 s, System: 2.307 s]
Range (min … max): 1.428 s … 1.773 s 10 runs
Summary
'UV_THREADPOOL_SIZE=2 NODE_ENV=production node ./fdir.mjs' ran
2.63 ± 0.19 times faster than 'NODE_ENV=production node ./fdir.mjs'
4.31 ± 0.35 times faster than './bin/bfs /home/thecodrr/ -false'
Another factor to take into account is that I ran all this on a WSL instance, which may or may not affect the performance. However, since both programs are running on WSL, the comparison should still be fair.