The even better parasite story is the one that enters a rat and crosses its wires so that it is sexually attracted to cat pee. It goes into heat when it sees the cat and naturally gets eaten - the goal was the cat brain the whole time. As heard on NPR http://www.wbur.org/npr/9560048.
It's being run backwards and looks great that way. Imagine playing a couple of pool games, then singling out that one really cool shot you made and saying, "that pool shot is how I play pool."
The parasites did their thing just fine for each individual organism. But that one random one that happened to cross wires in just the right way ended up doing orders of magnitude better. A rare occurrence, but very advantageous once it happens.
There's actually a relatively simple explanation when you look at the neurobiology. First of all, toxo is a parasite that can infect and damage the brain, so it makes sense that it could affect behavior. Secondly, the fear response to cat urine and the attraction to female mice are mediated by the same pathways in the brain (the limbic system, which governs emotion and is closely tied to olfactory input, especially in mice). Making the mice feel attracted to cat urine is actually not that farfetched when you stop to think about it and notice that it's just a matter of "flipping a bit" somewhere in the wiring. And since this adaptation leads to a large fitness increase, it makes sense that it would be strongly selected for, and that the modern version of the parasite would have very specialized mechanisms to take advantage of this quirk in the rat brain.
I get it. The sequence of events seems very analogous to a Rube Goldberg machine, and one would conclude that if the sequence of events didn't go as planned, the parasite would have been doomed.
My theory is that the parasite community for that species can survive other ways just fine, but has stumbled across a scenario that serendipitously gives it an extra boost at the chance of reproducing.
I would imagine that most species that benefit from these complex interactions tend to have left-over backups for survival and reproduction. When species over-adapt to specific scenarios, it seems likely they'd become too susceptible to extinction, or their populations would be periodically thinned out.
FoundationDB is the parasite, wanting to be eaten by the cow (Apple). It was eaten up by ants (companies that need NoSQL, ACID databases) and they brought it to the cow's attention.
Peeps, IIRC, was trematodes[1]. After the opening action sequence, the narrator rambles about how trematodes infect snails to get to the birds that eat the snails. Still involves the intermediate host[2] concept, though.
This is actually kinda ridiculous. I feel so bad for companies that have invested in this product. Just pulling the downloads like that... wow. I'm hoping that paying customers at least got some heads up, data migration is the hardest thing to get right.
Contracts for support are mostly useless in any case when your service is down and you need to get a vendor to respond in a timely fashion. If they are in breach of contract, you have no mechanism to force them into compliance in a short enough timeframe. I've been in situations several times where a vendor with a paid support contract just couldn't fix a problem fast enough and I was forced to dig into their product myself to figure out a fix.
This is why everything you build your product on that can't be replaced in a matter of hours by a competing product should come with source. Worst-case scenario, you fix the issue in the source yourself.
If you're in the last month or so of said contract and your vendor is bought out before you get a chance to renew, then one of two things happens:
1. the new owners aren't interested because they want to cannibalise the product for something else they're doing. The original product will never see the light of day again: no new bug fixes and possibly no more security patches.
2. the support contract cost is hiked from maybe a few hundred dollars a year to several thousand
or the possibility of:
3. one or the other of the above, but with the product's support staff reduced, dispersed or fired.
#1+3 above happened to us after Oracle bought out a bunch of stuff we relied on heavily for our hosting platform. Was a very painful time.
The acquire-hire-kill cycle for startups that target developers has become really frustrating. I'd love to try a lot of these new products, but it's hard to know if it's worth my time.
Here's a simple rule you can apply - if it's closed source, then it's not worth your time.
Like, don't build on top of freaking closed source platforms. After 17 years of the OSI, 24 years of the GPLv2 and myriad open-source alternatives available, you'd think people would learn.
Not for pro users, which I thought was the topic of discussion. Lack of color matching is the first feature that comes to mind, and that's an absolute deal breaker for any print design people.
Startups don't make another photoshop, they make another gimp and hope to grow it into another photoshop over time, or hope to get bought out by a company that makes another photoshop and wants their talent and technology.
For most people, yes. Unfortunately, Photoshop is widely pirated, so it isn't being used at face value.
> Blender does all that Maya does I'm sure as well
If you want to build your own stuff, which isn't uncommon at all, and you have a movie budget of 100 million dollars, you might be better off forking Blender. Also, Autodesk's Maya is not a startup and isn't necessarily being used as a platform, so that's a straw man on your part ;-)
> SCADA systems for industrial control?
You mean Eclipse SCADA?
Funny thing is that I'm working on a software system using it right now. My task is to control and monitor power plants, and nothing on the market was suitable for my needs. Building on top of an open source stack helped tremendously.
Tragedy of the commons in action. Every VC-backed startup is incentivized to sell, but in the process continues to kill goodwill towards dev startups. I'd be super hesitant of relying on any other company for core tech with high switching cost for this reason.
What if startups with high switching costs (DBMS, OS, pretty much anything else infrastructural) added a guarantee to their sales contracts?
For example, it could be that, if they get acquired, they'll open-source the technology, sell it to another for-profit entity that will maintain it, or provide a migration tool.
Even for something like FoundationDB, it'd hardly be any skin off Apple's back to have a few employees spend a few months ensuring that previous customers have some sort of support.
>What if startups with high switching costs (DBMS, OS, pretty much anything else infrastructural) added a guarantee to their sales contracts? For example, it could be that, if they get acquired, they'll open-source the technology, sell it to another for-profit entity that will maintain it, or provide a migration tool.
That would be like them putting a huge paper hat on their head, saying "I'm not a good target for acquisition".
Companies buy them so they (also) get their product/IP etc. If those startups have promised to give it away in such a case, then they are not that good of a buy.
Those are just contingencies if the acquirer doesn't want to sell the product anymore. Basically it's a guarantee that, even if they get acquired, their customers won't be punished.
That scenario won't hurt companies that are being acquired with the intent of keeping the product running, for obvious reasons. It also shouldn't hurt acqui-hire scenarios.
The only time it might hurt is when the acquirer wants to use the technology internally, but not offer it to anyone else. That's fairly rare, and the downside (potentially scaring away that tiny fraction of acquirers) is much smaller than the upside (making potential customers feel safe).
It's hard to really guarantee effectively without putting up serious money.
The typical large closed-source codebase is full of undocumented things and hidden dependencies on random chunks of proprietary environment. Making it usable as open source is a ton of work.
And you can't guarantee that work will happen without putting up some kind of bond or buying insurance, given that the vendor could simply go bankrupt and not honor the contract.
Or choose free software. If it's free software, even if the original developer goes away, you can hire someone to keep working on it, or someone else can decide to make a business supporting it, or the like.
Or you just get left with a broken OSS project whose contributors moved on to shinier things, and nobody really much cares about continuing it, and you don't have the knowledge/time/resources to hire someone to do it.
But if the product is already working, chances are you can get by long enough (to do a graceful migration) with just small bug fixes, changes to support OS updates, etc.
This is what we are now doing for a project with Berkeley DB XML, which hadn't seen updates for five years. When there finally was an update, it was buggy and moved to the Affero GPL 3, which conflicts with other open source licenses used in the project. So we continue to use the five-year-old iteration with a small set of patches.
(Lesson learned: once a product is owned by Oracle, prepare your evacuation plans.)
I was agreeing with the previous posters until I realized this has kind of happened to me. Tastypie was the go-to REST package for Django 3 years ago. Now it has kind of been abandoned and everyone is raving about Django REST framework.
Fortunately it is stable (for my use cases), and it doesn't actually seem to be that big a deal that it isn't being worked on. Django REST framework is a lot nicer though.
One of the problems you get is that once something is stable, the maintainers generally don't have a huge incentive to dedicate already-over-committed time to work on features they don't personally need. Just triaging tickets on a popular project can be a major time commitment, particularly since a fair percentage of them will be helping people understand the API or trouble-shoot something in their project which is causing failures in your code.
In the case of tastypie, I think all of the maintainers have switched jobs at least once in the last few years and at the same time the general Django community has been moving in the direction of simplicity rather than complex generic frameworks. Daniel's list of things he's not interested in implementing in restless is a good list of things which have been painful in tastypie: https://github.com/toastdriven/restless#anti-features
Like I say, it is stable for what I am using it for just now.
I have started using REST framework for other parts of the project, and it seems a lot more consistent with Django's other components (serializers are similar to forms, APIViews are similar to the generic views). In the end that just makes things easier. It's less context switching, essentially, which is really useful when I don't touch that part of the project for a few months and then need to update something.
Thanks for the offer though (and thanks for the framework, it has been useful). And unfortunately I don't have the time to help with maintenance.
I do choose established providers. The problem is for the community as a whole -- each company that this happens to is another hit to the credibility of early startups.
Even if you get a few months of reliable support, you still have to switch which will cost you dearly.
Often what is done is source escrow. The source is given to a third party, and if anything goes wrong the third party releases the source to the one who purchased the product.
Even this sounds like a bad outcome for users of the tool. One minute you're building a product, the next you're stuck maintaining a closed source database.
I've been wondering if there's a sort of Nash equilibrium to these things. How many startups will a restaurant let auction its open tables before losing faith in all of them?
Is it simply a case of the acquirer wanting the technology and maybe the personnel while the acquired company is losing money hand over fist so they just shut the company down?
Sometimes it's because the product is on the way to failure in the market. It might be a great product, but that doesn't mean it'll ever be profitable.
The acquirer may love the technology and use it internally, but it's expensive to keep it in a public marketplace. It requires support staff, marketing, sales, etc.
There are also acquisitions that are purely about customers, so the startup's product is shut down or rolled into the acquirer's existing products.
So there are a lot of good reasons you'd want a company other than for its product, but we can't always tell which reason it is right away.
It's a question of whether the acquirer and/or the acquiree have a good faith social contract with the early adopters who take the risk and believe in them.
There is no transfer pricing mechanism for programmers like there is for footballers. Worse, programmers are often loyal to a particular technology or concept. So in order to extract the valuable staff from a company it's necessary to destroy that thing so the staff can be reallocated profitably.
"So in order to extract the valuable staff from a company it's necessary to destroy that thing so the staff can be reallocated profitably."
I wonder how often this actually works. In most cases where I've seen this (or heard of it happening first hand from people I know) the vast majority of the people you'd want to keep were out the door of the new place pretty close to as soon as possible (meaning, as soon as the contracts allow, or the golden handcuffs are mostly off, or whatever is relevant to the specific situation).
On the inside there tends to be a pretty predictable path: New management tells everyone nothing will change materially, everything inevitably changes very quickly, people get disgruntled and take off for other opportunities (at a quickly accelerating pace as the old guard sees all their former colleagues from the old place leaving the new).
It's Apple. They don't like having lots of public projects, especially ones tangential to their mission, they probably want to weld it firmly to their in-house usage and have it as a competitive advantage which nobody else can buy.
This is really about closed source, proprietary infrastructure technology (not all kinds of software in general), offered by startups, not established companies.
That is to say, people use Oracle, SQL Server, Teradata, etc. and they are not worried that these will go out of business or be sold, or that they will otherwise be left in the dark by a sudden shift in business practices and product availability.
The problem is almost entirely with startups, which are an easy target for bigger companies interested in their technology and their team's skill set. This is even more of an issue because the majority of startups are VC funded and are under pressure to sell or comply with VCs' interests.
So in practice you have three 'safe' ways to build your infrastructure, which are not mutually exclusive: choose OSS software, buy from companies such as Oracle, IBM, SAP, etc., or build it yourself. Depending on the expertise of your team and the funds available for purchases, as well as the quality of the available solutions, OSS is probably the safest way, followed by purchases from large corps. Rolling your own infrastructure is expensive and time consuming, requires committing your developers to building infrastructure instead of, well, building a product, and may not work in the end at all.
Even the companies that can afford to do everything in house (Facebook, Yahoo, Twitter, etc.) choose OSS for the majority of their needs and build on top of that. Google is Google. However, if that infrastructure is your selling point and what differentiates you from the rest, and/or you have very specific needs and it makes more sense to do it this way, doing it yourself can be a great alternative.
We rarely use OSS here, and we don't use any proprietary infrastructure technology. We have built everything ourselves, and it has worked out great so far, but we have put a lot more effort and resources into that, whereas we could have instead invested in the product. If we had to make the choice again, in retrospect, we'd most likely have gone the OSS way.
It’s all about tradeoffs.
Established companies are no guarantee. If you rely on Visual Basic or FoxPro, then good luck to you. Oregon hardly had a good experience with Oracle. Much of IBM is doomed.
FOSS or proprietary to yourself are the only valid options.
> We have built everything ourselves, and it has worked out great so far, but we have put a lot more effort and resources into that, whereas we could have instead invested in the product.
So, you don't use a libc, or a web framework, or any library that implements common logic that other programmers have spent ages refining?
Obviously; I thought I didn't need to mention that by "everything ourselves" I meant our core infrastructure. We didn't write an OS, a compiler toolchain, an editor, a standard library, etc. Though we did write our own JavaScript-like (in syntax/semantics) language/compiler/runtime for server-side needs.
Like I said, if we were to start over, we'd use OSS. But, all things considered, that decision didn't hurt us, and it gave us leverage, know-how and flexibility that helped us be where we are now.
There are two reasons why many companies run a closed source database. For some types of databases, there is no good equivalent in open source. For some types of workloads, the closed source implementations are orders of magnitude more efficient, faster, or scalable than anything in open source, so open source cannot reasonably support the workload.
It happens more often than you'd think. There are still many things in the database world that open source does relatively poorly compared to alternatives.
Many banks and hedge funds use kdb+/q for time series databases. This (very expensive) software is practically unheard of outside of these niche domains. I've been using it for close to 5 years for high frequency data, and honestly nothing out there comes close to the awesomeness of kdb+.
I would be interested to hear what you consider high volume (writes/second). I am supporting a manufacturing system and it sits at the moment at 7,000 writes/second (it is a normal time series, i.e. id, time, value, quality).
7,000 writes per second is pretty low for many high-volume time series needs.
Though it's not usually write throughput that most of these technologies are worried about. It's usually compression using DSP methods, aggregate stream-folding computations, etc. that matter.
The systems I deal with start at millions of writes per second and go up from there. I have heard of systems that do over a billion writes per second, though I have not breached that threshold personally (yet).
From an IoT or sensor network standpoint, 7000 writes/second is an idle server.
I don't have numbers handy. The real power of kdb+/q comes from the column-oriented architecture and the extremely powerful vector functional language q, which is inspired by APL. I highly recommend checking out this article to get a sense of the APL family of languages: https://scottlocklin.wordpress.com/2013/07/28/ruins-of-forgo...
If you want a database for blazing fast data storage and retrieval, there are many options available. You start seeing the real benefits of kdb+/q when you use q to simplify very complex operations that aren't easily done in SQL. Also, q's high-level operators make your code extremely terse. I've written complex backtesting systems that perform data mining on massive datasets - all in one page of very tight q code!
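Not q, but here's a rough Python/NumPy sketch of the column-oriented, vectorized style the APL family encourages (the table and the numbers are made up). The point is that you operate on whole columns at once instead of looping row by row:

    import numpy as np

    # Toy columnar "trade table": parallel arrays, as in a column store.
    sym   = np.array(["AAPL", "MSFT", "AAPL", "AAPL", "MSFT"])
    price = np.array([101.2, 40.1, 101.5, 100.9, 40.3])
    size  = np.array([200, 150, 300, 100, 250])

    # Volume-weighted average price per symbol: whole-column operations,
    # in the spirit of q's wavg ("select size wavg price by sym from trade").
    for s in np.unique(sym):
        m = sym == s
        print(s, (price[m] * size[m]).sum() / size[m].sum())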
There is a free 32-bit version of kdb available (http://kx.com/software-download.php). For the commercial version, pricing information is not publicly available.
The thing with kdb is that almost all uses of it are in-memory deployments. It isn't hard to make something run quickly when it has little persistence or relegates persistence to second-class status.
Datomic is closed source and has features that no open source database currently offers. In particular, it's a time series database of immutable/append-only facts, so its horizontal read scalability is excellent, but it's still ACID and supports joins.
It's definitely a very slow database. You have to be extremely fortunate to have a problem that fits into its niche neatly. I'd sooner figure out a historical insert-only schema for PostgreSQL in future. They're not great about fixing problems with Datomic either; it feels like an afterthought: a means of occupying labor not currently allocated to a Cognitect contract gig, not a priority in its own right.
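For what it's worth, a historical insert-only schema can be pretty simple. Here's a minimal PostgreSQL sketch via psycopg2 (the DSN, table and column names are all hypothetical): rows are only ever added, and "current" state is just the newest fact per entity.

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN
    cur = conn.cursor()

    # Facts are only ever inserted, never updated or deleted.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS account_facts (
            id         bigserial   PRIMARY KEY,
            account_id bigint      NOT NULL,
            balance    numeric     NOT NULL,
            valid_from timestamptz NOT NULL DEFAULT now()
        )""")

    # An "update" is just a new fact; history stays queryable.
    cur.execute(
        "INSERT INTO account_facts (account_id, balance) VALUES (%s, %s)",
        (42, 1300))

    # Current balance: the newest fact per account.
    cur.execute("""
        SELECT DISTINCT ON (account_id) account_id, balance
        FROM account_facts
        ORDER BY account_id, valid_from DESC""")
    print(cur.fetchall())
    conn.commit()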
I think they've improved a lot WRT fixing problems--we had a chat with them after some issues with Datomic in production, and since then (6 months ago) we've had every problem we've discovered get fixed very promptly, and Datomic's continued to scale for us.
coolsunglasses - Why did Datomic seem slow to you? Can you describe the problems you had in detail? I'm not from Cognitect, just someone who is developing some products that currently use Datomic among a few other databases.
Would love to hear some honest feedback. Maybe your struggles were because of the tech, earlier versions, bad hardware config, or mis-applied use case?
Stardog[0], a semantic/ontological[1] database, is probably best in class, and is closed source. Anyone interested in writing an open source triplestore, email me ;)
[0] http://stardog.com/
[1] They've started calling it a graph database, though I think triplestore is the most correct name
When you're best in class you can afford to be proprietary.
Clark & Parsia had a history of open source (e.g. Pellet, which was the best in-memory reasoner for a long time IMO)... but not a lot of luck getting sustainable business subscription revenue. This led to the switch to dual-license AGPL in 2008, and now closed-source Stardog...
The nice thing is that if you were using Stardog and this happened, you could easily move to any of its competitors that implement the same standards, including open source ones. I.e. you might miss a feature, but at least your queries will still run and you won't need to redo your whole app.
SPARQL should really be everyone's first technology to investigate before heading off to anything else. I.e. when you are still pivoting every week, you should have the most generic database tech possible. Only when you scale should you specialize.
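To give a flavour of how generic it is, here's a minimal example with Python's rdflib (the namespace, predicates and data are all made up): the "schema" is just more triples, so pivoting means adding data, not migrating tables.

    from rdflib import Graph, Literal, Namespace

    EX = Namespace("http://example.org/")  # hypothetical vocabulary
    g = Graph()
    g.add((EX.alice, EX.knows, EX.bob))
    g.add((EX.bob, EX.name, Literal("Bob")))

    # Ask for the names of everyone alice knows.
    rows = g.query("""
        PREFIX ex: <http://example.org/>
        SELECT ?n WHERE {
            ex:alice ex:knows ?p .
            ?p       ex:name  ?n .
        }""")
    for row in rows:
        print(row[0])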
One could be "interested in writing a open source triplestore", but why would you go down that path rather than, say, optimizing the heck out of Neo4J?
Well, first, I don't have a high opinion of Neo4j. Secondly, SPARQL queries are pretty distinct, and while, yes, they can be translated into generic graph queries, I'm pretty sure there are some fun optimizations to be had if you focus on their patterns 100%. Thirdly, because it'd be a hell of a lot of fun! Why else would anyone write a database...
> Thirdly, because it'd be a hell of a lot of fun! Why else would anyone write a database...
Presumably, because you have a business, which has a product, which has a nascent feature, which requires some particular set of time-and-space-and-distribution guarantees that no current database on the market makes. This is why, for example, Cassandra was developed.
Do you mean forking the codebase or layering something like N3 over it? (btw, last I checked the Neo4j community version could only scale up and the distributed version was commercial.)
While there are some OSS column store DBs, Vertica is a very well put together solution.
It's very fast, it scales reasonably well, it has support for a wide range of analytic functions, and it comes with good support.
FoundationDB has ACID transactions over the entire DB, over the cluster and over multiple keys. And it's fast.
I've looked over so many open source alternatives, and they claim ACID, but it's a deception based on some narrow interpretation of ACID. It's very annoying to spend time researching to discover the truth between the lines.
I would love to find a fast, scalable open source DB that implements FoundationDB's features.
Clustered indexes. Data in-index. Restoring your database should not be measured in hours for just millions of rows. Statement generation for backups? If you had clustered indexes you'd never finish restoring.
The issue isn't with closed source databases. The issue is with trusting smaller startups who don't have a customer base large enough to avoid dropping.
Oracle and Teradata for example are proven databases with official support available worldwide and a talent pool you can draw from almost immediately. You don't get that with most open source databases (at least those that don't have a parent company, e.g. Datastax, Mongo, MySQL).
> Oracle and Teradata for example are proven databases with official support available worldwide and a talent pool you can draw from almost immediately.
I've watched Oracle try to strangle more than one company once they were dependent on their database. One they succeeded, one managed to migrate to PostgreSQL just in time. If you build your business to be dependent on Oracle they have you by the balls; don't think they're not going to squeeze. And IME the worldwide talent pool is much more available and... well, talented, for PostgreSQL or MySQL or any of the major open-source options.
If no one allowed small-startup technology into production, the industry stagnation would be tremendous, so I disagree that that is the issue. However, once you do rely on a small-startup database, it had better not be closed source. So I disagree: the issue is the closed-source part, not the small-startup part.
The fact is that the world depends on closed source databases. So yes I continue to disagree that you should never use one just because it is closed source.
And I never said that people shouldn't use small startup technologies. Only that when you do you take the risk of the company not being around in a few years. And the people who will take that risk are really other startups or early adopters.
>> I disagree: the issue is the closed-source part, not the small-startup part.
Correct. But you are betting the success of your nascent start up on another nascent start up. This is straight up wrong.
For a large company it's different. They have all the resources to go into months-long migration projects. As a startup, you can't afford time for migrations when you are busy doing the real work.
This is why the world needs early adopters. There needs to be people interested enough in new technologies for their own sake to invest time in them. That's a very different motivation than P&G or Unilever.
I oversimplify it by saying "Companies will pay cash and accept closed source in return for good documentation and someone to answer the phone." Most non-IT buyers don't bring up open vs closed source in purchasing discussions.
Please don't downvote because you disagree. Write a reply. Downvote what is inappropriate and doesn't add to the conversation.
I strongly disagree with your stance on Oracle in particular. I had two arrays at a medium-size academic library 8 years ago. Anything I had to call about regarding the database meant a line item for business review, due to cost, if it wasn't covered by Oracle's service agreement.
PostgreSQL is amazing, and I'd much rather work with that and hire whoever I want for whatever I want to do, either per instance or on annual contracts.
It's funny you mention that... but actually hiring a part-time PostgreSQL DBA is all but impossible. I reached out to most of the support companies listed on the North American website... mainly I wanted someone to set up a small (3-node) replica set of the most recent version of Postgres with plv8 and some sane backup scripts, and pretty much nobody replied. EnterpriseDB won't talk to you without laying out at least $10k to start, and I would rather pay a person (or small company) I can call to get things running... more if it kept running well.
I didn't have the time to delve through all the options out there for this purpose and evaluate each of them, when there are out-of-the-box solutions that were closer to my needs, though not strictly SQL based (Mongo, Rethink, ElasticSearch, Cassandra all come to mind). There is ~$6k/month allocated to hosting costs, and ~$40k/month to the handful of people on the IT team... there isn't much wiggle room there for a small company, and everyone wears a couple of hats. The current application is using MS-SQL (hosted in Azure without redundancy) with data mirrored into MongoDB for searching against... licensing to get a replicated MS-SQL setup for better availability would cost more than our entire next-generation hosting budget. If we could have actually talked to someone at EnterpriseDB who wasn't a salesperson and could do more than send a PDF sheet targeted at managers, that might have swayed me.
Sorry, I'll end my mini rant. In the end, the support I do have from MongoDB (using their backup service) and my experience just using ElasticSearch and Cassandra have been far better: setting up something resembling a high-availability/distributed configuration has been easier than even getting a proof-of-concept PostgreSQL setup working.
I really hope that PostgreSQL gets it together within the next year or so. It would have been my first choice had I been able to actually get some support within a reasonable budget for my needs, or if I had the equivalent of a DBA's salary or more to throw at the problem, which I didn't/don't.
This is part of the horrible brokenness of IT labor.
I don't know the features of PostgreSQL that you want to use, but I'm totally willing to learn if somebody is going to cover my living costs. But I'm not even going to respond to your job ad if you put "PostgreSQL plv8 REQUIRED" in it.
For that matter, if you think it's simple enough for a part-time DBA, then why don't you just assign one of your existing IT people to learning and implementing the RDBMS that you need? Surely not all of them want to do the exact same job forever. PostgreSQL has excellent documentation.
Because our resources (time) are already pretty thin with respect to maintenance as well as our next-generation version. PostgreSQL has several unsupported options for replication, and a commercial one. Unfortunately you need a support contract to even talk about getting the commercial/supported version, though there's ongoing development toward bringing it in the box. I had already expended enough time trying to get up to speed and have something reasonable working, and it was less time to look elsewhere, to another database that had the redundancy/scale features I needed in the box.
If I was hiring a full time DBA, I would have put POSTGRESQL DBA as the job title, and made plv8 a feature requirement that I needed/wanted. As it is, there's no budget for that.
This is probably why a lot of companies like to use closed source solutions. I have mainly been using SQL Server, and there are a lot of consultants who know how it works. In a few years I think there will be more database products with good support from third parties, but currently it is hard to know what to choose.
The problem is, downvotes due to disagreement are codified as acceptable.
Which still sucks, since downvotes can lead to shadowbans.
PS. Just double checked the Guidelines, and this codification is no longer there. However, there's also no guidance to suggest an appropriate reason to downvote.
Reddit, especially certain subs. It's not perfect, and it's hard to find the same density of tech-smart folks as on HN. However, it's more relaxed and friendly, doesn't punish creativity, and doesn't have the same "that opinion or statement is not allowed here" effect I see on HN. It does have a kind of liberal/PC groupthink on some issues, but they're issues I don't like to talk about anyway. Reddit's discussion forum UI is also much friendlier and more sophisticated than HN's. And they have lots of areas that focus on non-HN topics, while also being better at letting you avoid politics and "startup Foo raised/valued-at $X" posts, yet-another-framework posts, etc. Again, you lose some compared to HN, but also gain a lot. Luckily everything on the web is a tab away and we can vote with our eyeballs.
> I don't understand why anyone would run a closed source database, especially with the open source options available.
I generally share this opinion. However, in this case no open source solution came close to the features offered by FoundationDB. There are a couple of attempts (like CockroachDB) which could achieve something similar in the future.
Because HA, simple to use solutions are very expensive, closed source, or operationally costly.
If you think you've got the secret sauce, and you've actually had to put it into action and still hit five nines, awesome. But IME doing that with something like PostgreSQL is non-trivial (read: costly). That's why FoundationDB looked so appealing (to me).
Biased opinion here (Aerospike CTO and Founder), but you might want to check out open-source Aerospike DB. Like Foundation, it's clustered by default, very very good at SSD / Flash, runs great in cloud deployments, etc.
It's been used in production by big ad houses like AppNexus as well as retailers like SnapDeal. Lots of miles on the code.
Was closed source for years, but went open source about 9 months ago.
No high availability in the open-source version. It's meant to be used with partitioning in mind, so if you need cross-partition transactions most of the time, it's slow.
VoltDB has fantastic HA, but yes, not in the OSS version.
You might be surprised how many apps deploy on VoltDB with cross-partition transactions making up a solid chunk of their workload. Yes, they're slower than partitioned operations, but they're still faster than MySQL much of the time.
Most of the apps we see partition very well, especially for writes. The fact that they can run 10k distributed aggregates a second to get a global view is something few other systems can touch.
If you don't need SQL, there are a few pretty compelling options out there, each not without their faults... just the same.
MongoDB, RethinkDB, ElasticSearch, Cassandra, and I'm sure a number of others... each of them has HA options (though RethinkDB is still a few months out from auto-failover, IIRC), and Cassandra has pretty close to linear scalability in production for some very large data loads. I really like each for different reasons, and would lean towards one or another depending on the load.
Not to mention, my speculation is that PostgreSQL will likely have in-the-box replication with failover and/or a multi-master solution in place within a few versions.
PostgreSQL has been speculated to get "it'll be all better RSN" failover for as long as I've been using it (around 8.4 IIRC). ;-)
But yeah, what made FoundationDB's SQL-Layer exciting (for me) was:
- For small clusters it was free.
- Automatic HA
- Operationally Inexpensive (talking admin time/effort/training; far cheaper than PostgreSQL)
- Horizontal Scalability
The things that didn't matter for the 99%:
- It wasn't very fast.
I don't have "big data" problems (I could invent some). Most small shops (I suspect) don't.
The problem I do have is 3AM pagers, availability, wearing too many hats, putting dozens of hours into learning and experimentation to get PostgreSQL to use the hardware it's put on effectively, coming up with complex CARP+REPLICATION+FAILOVER plans, ZFS snapshotting because PostgreSQL still can't match the backup/restore process any commercial database had nailed down two decades ago, backing up the snapshots, figuring out how to partition clients into different table-spaces, blah blah blah.
You sacrifice some single-client performance with FoundationDB, but you solve almost every other problem you've got. And you now have the option of deploying a couple extra nodes to exceed your previously fairly intractable TPS milestones.
It's so easy in fact, you can now autoscale your database with your application servers.
And for your 80% of smaller clients it's absolutely free.
Your description of what matters to many customers doesn't get enough appreciation. "It's faster" too often trumps "it lets me sleep at night" in the battle for attention.
Looks very different in terms of the sort of performance you'd likely expect out of it, though. My initial reading is that if you want acceptable performance you're likely forced into thinking about sharding - and even then you're going to be punished by SQLite's poor concurrency support.
SQLite's concurrency support is irrelevant. ActorDB is a sharded-by-default type of database. The point of ActorDB is to split your dataset into "actors". An actor is an individual unit of storage (a SQLite database). As such, your entire database is scalable across any number of nodes.
A natural fit is something like Dropbox, Evernote, Feedly, etc. Every user has their own actor, and it is a full SQL database, which makes the kinds of queries Dropbox would need very easy to do.
It also has a KV store option. This is basically one large table across all your servers. This table can have foreign-key subtables. From my understanding of FoundationDB, this part is closer to what they were offering.
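For anyone who hasn't seen it, the core idea is roughly this toy Python sketch (standard-library sqlite3, one database file per user; the file layout and schema are made up, and real ActorDB adds replication and distribution on top):

    import sqlite3

    def actor(user_id):
        # Open (or create) this user's own private SQLite DB: the "actor".
        conn = sqlite3.connect("user_%s.db" % user_id)  # hypothetical layout
        conn.execute("CREATE TABLE IF NOT EXISTS todos "
                     "(id INTEGER PRIMARY KEY, note TEXT NOT NULL)")
        return conn

    # Each user's queries run against their own full SQL database, so the
    # dataset shards naturally per user (and, in ActorDB, across nodes).
    with actor("john") as conn:
        conn.execute("INSERT INTO todos (note) VALUES (?)", ("buy milk",))
        print(conn.execute("SELECT * FROM todos").fetchall())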
It's not irrelevant. It means your ability to perform concurrent transactions depends entirely on your ability to decompose your data into a very large number of different actors; otherwise you're bound to be hampered by SQLite's global lock. If you decompose too far, you'll end up doing cross-actor transactions, which per the documentation have a substantial performance impact.
This is not to rubbish it - I've not used it after all - but the claims being made for ActorDB are pretty far away from the claims made for Foundation.
> All distributed databases shard data. If you hammer at only a specific shard area, performance will be limited to the speed of that shard.
Agreed. And what I'm saying is that it appears that ActorDB's per-shard area concurrency is limited to one writer. And that means that SQLite's concurrency support is (contrary to your earlier post) extremely relevant: not just in terms of pure performance, but also ability to perform concurrent operations. If you need more concurrency, your only choice is to shard extremely heavily (which might mean you require more cross-shard operations, which are apparently slow).
As you say, some data models fit the actor model well, but this is still a far cry from the capabilities that were promised by FoundationDB.
> FoundationDB was single process. They had no per-node writer concurrency.
Interesting - I didn't know that. Even so, it depends on what kind of writer concurrency we're talking about, I guess - I presume that ActorDB is limited not just by having to run requests one at a time per process (which is a legitimate tactic to avoid latching overheads and so on), but also by not being able to run any new transactions against an actor that's received a write until that write commits?
> The reason why I said sqlite concurrency support is irrelevant is because ActorDB serializes requests anyway. It must do so for safe replication.
Do you mean by this that the entire cluster can only perform one request at a time? Or am I misreading you?
Individual actors are individual units of replication. What one actor is writing is concurrent to what another is doing.
Read/write calls to an actor are sequential. I'm quite sure this is how other KV stores like Riak do it as well. They have X units per server, and those process requests sequentially. Their actual concurrency is basically how many units per server are running. They may interleave reads/writes per node or they may not.
ActorDB does not allow reads while a write is in progress. It is quite possible we will optimize this part in the future as it is quite doable.
In FoundationDB you never had to think about splitting your dataset into something like your "actors". All transactions are independent and parallelizable, unless they touch a common key - in which case one of the transactions is retried (optimistic concurrency).
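For reference, this is roughly what that looked like with FoundationDB's Python bindings (a sketch from memory of that era's API; the key names and API version are made up for illustration): the @fdb.transactional decorator re-runs the function whenever a conflicting concurrent write is detected.

    import fdb

    fdb.api_version(300)  # pick the version matching your installed client
    db = fdb.open()

    @fdb.transactional
    def transfer(tr, src, dst, amount):
        # Reads and writes in here form one ACID transaction across keys;
        # on a conflicting concurrent write, the decorator retries the function.
        src_val, dst_val = tr[src], tr[dst]
        src_balance = int(src_val) if src_val.present() else 0
        dst_balance = int(dst_val) if dst_val.present() else 0
        tr[src] = str(src_balance - amount).encode()
        tr[dst] = str(dst_balance + amount).encode()

    transfer(db, b"acct/alice", b"acct/bob", 10)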
All sorts of ways around that, so it's unlikely you would get a contract that actually prevents you from being left without support before the term is up.
As a thought experiment: if the company just went bankrupt and had no one working for it any more, your contract won't help, right? Second part: what happens if someone just buys the assets (IP), closes the company, then hires the old team? There are many possible variants here.
It's fairly common to have contract terms requiring the SaaS provider to provide the source code to their (typically closed-source) product in case they go under. Of course, that only helps if you buy the product or support for it and not just rely on free downloads.
I think it's reasonable to expect any company doing anything vaguely important to do configuration management at least to the level of keeping copies of their production software installation binaries.
I don't understand why anyone would run a closed source database, especially with the open source options available.