> “By 2025, graph technologies will be used in 80% of data and analytics innovations, up from 10% in 2021, facilitating rapid decision making across the enterprise.”
What is behind the thought that graph databases are going to grow so much in the next few years? To me they've always had a niche use... Are they really going to be ubiquitous, as this funding seems to assume?
Historically, graph databases did a passable job of supporting data models and queries that were not really possible in SQL (absent proprietary, vendor-specific extensions). That's all over now, because recent versions of SQL support recursive queries that can handle general graphs quite easily. No real need for a specialized solution; even plain-vanilla Postgres is good to go (sketch below).
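For the skeptical, here's a minimal sketch of what that looks like in stock Postgres, assuming a made-up edges(src, dst) adjacency table:

    -- Hypothetical adjacency table: edges(src, dst).
    -- Find every node reachable from node 1. UNION (rather than UNION ALL)
    -- deduplicates rows, which also stops cycles from recursing forever.
    WITH RECURSIVE reachable(id) AS (
        SELECT 1
      UNION
        SELECT e.dst
        FROM edges e
        JOIN reachable r ON e.src = r.id
    )
    SELECT id FROM reachable;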
There's admittedly some ongoing work on extending the SQL standard syntax with some extra sugar for "Property Graph Query". But there's nothing technically wrong with just using basic SQL syntax; the sugar is purely ergonomic. Performance will vary depending on the query optimizer, any INDEX definitions, etc., and is quite a separate concern.
Overall, the graph model is so general that writing slow queries will always be a possibility, so one should be mindful of these concerns. But that's just as true in NoSQL graph DBs.
Something I just found out after looking into status updates on the Property Graph Query (PGQ) work being done in SQL is that it will exactly mirror the work going into GQL (Graph Query Language, a newish standard in its early stages of development, based mostly on Neo4j's Cypher).
To summarize this post[0] by someone involved with the standards:
- GQL (ISO/IEC 39075) is a full database language to create and manage property graphs and create, read, update, and delete nodes and edges (or vertices and relationships)
- SQL/PGQ (ISO/IEC 9075-16) is a new add-on part of the SQL standard which introduces the capability to create property graph views on top of existing tables in an SQL database, as well as the ability to query property graphs using a GRAPH_TABLE function in an SQL FROM clause
- The input to the SQL/PGQ GRAPH_TABLE function is a property graph query, sometimes referred to as Graph Pattern Matching (GPM). GPM is common between SQL/PGQ and GQL: the syntax accepted in a GRAPH_TABLE function in an SQL FROM clause is identical to the syntax in a GQL graph query. Because GPM is the same in both draft standards, changes to GPM for SQL/PGQ also apply to the GPM portions of the GQL specification (sketch below).
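To make the GRAPH_TABLE shape concrete, here's a sketch of what a SQL/PGQ query is expected to look like, with invented graph, label, and property names (the draft syntax may still shift before ratification):

    -- Query a property graph view 'friends_graph' from ordinary SQL.
    -- The MATCH pattern inside GRAPH_TABLE is the shared GPM syntax,
    -- so the same pattern should be valid inside a GQL query too.
    SELECT from_name, to_name
    FROM GRAPH_TABLE (
      friends_graph
      MATCH (a IS Person)-[IS KNOWS]->(b IS Person)
      WHERE a.name = 'Alice'
      COLUMNS (a.name AS from_name, b.name AS to_name)
    );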
---
I also just came across the Apache AGE project[1], which lets you extend a PostgreSQL DB with property graph capabilities right now, including full(?) use of Cypher/GQL (sketch below).
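For a flavor of what that looks like (based on AGE's documented cypher() function; the graph, labels, and data here are invented):

    -- Load the AGE extension and put its catalog on the search path.
    LOAD 'age';
    SET search_path = ag_catalog, "$user", public;
    SELECT create_graph('social');

    -- Write and read the graph with openCypher, from plain SQL.
    SELECT * FROM cypher('social', $$
        CREATE (:Person {name: 'Ann'})-[:KNOWS]->(:Person {name: 'Bob'})
    $$) AS (result agtype);

    SELECT * FROM cypher('social', $$
        MATCH (a:Person)-[:KNOWS]->(b:Person)
        RETURN a.name, b.name
    $$) AS (a_name agtype, b_name agtype);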
At the end of the day Neo4j needs to operate a query planner on top of a relatively standard index structure to present the graph abstraction. There is limited difference between Neo4j's planner and what could be planned from SQL.
Graph DBs make more sense when there is strong evidence either that the natural description of the problem is a graph, or that the underlying storage engine can efficiently model the graph.
So far no graph DB has demonstrated either statement to be true for the majority of problems.
Neo4j is all-in on, "almost everything looks like (or can be made to look like) a graph, so almost everyone should be using a graph database".
As for those specific figures, I'm guessing there's enough wiggle room in "data and analytics *innovations*" (emphasis mine) to find or project almost any trend one wishes. What are data analytics innovations? Why, it's the set of things that will see 80% use of graph technologies! "Graph technologies" is also so potentially vague that it could plausibly be 100% of almost anything related to software.
"Everything looks like a graph" is more damning of the idea of a graph as storage than it is praise. The whole point of a database is to impose _additional_ constraints on the data to ease subsequent application development or data analysis.
Relational data may be a hassle, but it's a hassle you end up having to deal with anyway at some point.
I can see a graph database as being a useful place to stash a ton of shitty data as an initial place to start an ETL but I can't imagine using it as a system of record except in very limited situations.
The additional constraints are also what enable performance optimizations. And not the small ones, the ones that give orders of magnitude improvements. Whereas right now neo4j is slower for graphs than postgres, just with a nicer UI.
Oh, I agree that, barring some actual honest-to-god innovation, the whole product category's niche-by-nature. Just relating the way Neo4j's been positioning themselves.
the point is that "to ease subsequent application development or data analysis" can be done just as well, or better, by a graph DB. You don't have to end up with the hassle of relational data as in an RDBMS.
I've done development on an app with Neo as the back end, and what I liked about it was mainly py2neo and the Cypher query language. Even after developing in it, approaching another graph DB in Dgraph was conceptually impenetrable, as my impression of dgraph was they had a bunch of unnecessary and poor abstractions in their documentation. The next candidate is the Redis graph, but I haven't tried it yet. With Neo, if you learn Cypher, you literally don't need to know anything else about it to be useful in it, which brings me to what I think their real market is.
The opportunity I understood after using Neo was that the big product play would be a kind of mental shift for enterprise data analyst users whose jobs exist in Excel/PowerBI today, with power users using Cognos, and less so devops/SaaS companies/etc. I over-use Apple as an example, but if Apple entered into enterprise data products, Neo would be the kind of thing to be the underlying tech for it: an Apple-y analytics tool would be based on users producing and reasoning about their data with graphs instead of tables. Imagine a kind of Photoshop for data, or a fundamental conceptual change from spreadsheets to graphs. They aren't as competitive as a data tool, but I think they are unrivaled as a knowledge tool.
The tech is really great, but the product piece appears to have been a challenge because the use cases for graphs have been very enterprise-y, which has limited adoption: the people who operate at that higher business-logic level of abstraction that graphs enable are not the people picking and adopting new technologies. The growth will come from younger people who learned Python in high school and have a more data-centric world view. Maybe that's the play.
Anyway, as a user I can see why they got participation in an F round. Imo, they've solved the what/how/why and have done some amazing science and engineering, and what I hope that money buys them is some magic.
> my impression of dgraph was they had a bunch of unnecessary and poor abstractions in their documentation
I'm surprised to hear that. Dgraph uses GraphQL (and DQL, a fork of GraphQL) as the query language -- which is a lot more widely adopted language than Cypher. Dgraph users really like the simplicity and intuitiveness of the language and ease of use of the DB.
Nice! It's been a few years, but what I remember is that if I ever had a chance to ask, I was going to ask this: what problem did calling what laymen call attributes/properties "predicates" solve? Are they just attributes, or is there some other mathematical property of Dgraph predicates that makes them different from normal attributes that a user should know about?
Literally every Dgraph user must necessarily know the answer to that already, or maybe they just mentally black-box it and work around it, but at the time my impression was that non-users don't know this, and if I'm adopting a whole new taxonomy I need extra incentives to know it's worthwhile. It's probably an excellent and even superior technology, but what read as auteurism in the product at the time made me reconsider how much time I wanted to invest before encountering another one.
Anyway, coming from being a Cypher user, the learning curve for the use case of "I want to create nodes of different types with attributes, with relationships of types with attributes, then CRUD them and the vertices with a Flask app" felt a bit steep after that.
SQLite would do the trick, but I wanted consistency from my business logic to a grammar to a data model. It's very easy to encounter graphs and just think we're not smart enough for them or our problem isn't graphy enough, but given the ease with which I could encode a grammar into Cypher, I reluctantly gave up on Dgraph. That said, I'm not a Gremlin/TinkerPop fan either, as from a top-down user use-case perspective it wasn't satisfying.
Dgraph has a lot of users and customers who love your product, and the smartest people I knew recommended it to me, so my issues might not register. But there were a few experiences going through the tutorials that made me wary I was sinking costs into it relative to my use case, e.g. I have 1 week to build a PoC Flask app with a graph on the back end, and then scale it if the customer cares. That's what I used Neo for and didn't use Dgraph for, even though I figured I'd hire developers to rewrite it for Dgraph if it got off the ground.
Anyway, long way round, but I'm a long time believer and user of graph techs and want everyone in that market to succeed.
Amongst all the graph databases I tried, Neo would land third and last. Dgraph and ArangoDB would definitely be ahead in terms of developer experience, from data loading to regular transactional use.
But I do appreciate all the effort Neo4j has put in over the years educating us all on graph databases and use cases, and just drawing attention and awareness.
What this tells me is that the graph DB space has a lot of room in it for someone to come along and make a kick-ass product, because honestly every time I've had a problem a graph DB can solve, I remember I basically only have a few mediocre options to choose from.
Neo4J has been very meh in my experience, but they are the biggest.
I discovered that there is a fork of a previous Neo4j enterprise version known as ONgDB. I don't know if it will have a sufficient pool of maintainers to fix and evolve such a product, but at least it remains fully open source. (The open source code of Neo4j's enterprise version has been removed.)
I investigated using Neo4j back in 2010 for storing the schedules/routes of ocean shipping freight companies, forwarders, etc. The first cut of the system stored the schedule data in a graph-like way in MySQL, but to do recursive SQL queries (needed for the types of queries being performed) was annoying. Things worked well enough, but a graph database would have resulted in a much more logical representation of the data structures being used, and made the code behind the queries easier to develop.
That said, the 2nd system never got off the ground; I quit the badly run startup before finishing it. And now that I have a bit more experience with Neo4j, I'd say it would have been a bear to fully implement. Java is too heavy and Neo4j is a memory pig. It works, and I can't say it is bad, or iffy like TinkerPop, but it is "Enterprise Software" and everything that is associated with that meme.
I have been using TigerGraph for my latest research into modeling the schedules etc. of rail transport. It is much faster than Neo4j and requires far less memory: I can store every bit of data I need in it, unlike with Neo4j (which would need multiple 64GB-RAM servers). And its programming language is pretty nice once you get the hang of it.
So I'd recommend TigerGraph. The downside is that it is not as 'plug and play' as Neo4j, does not have all the mindshare/fancy bells and whistles, and is entirely C++/Unix based. So having some UNIX sysadmin experience is helpful unless you want to use their cloud solution.
> to do recursive SQL queries (needed for the types of queries being performed) was annoying. Things worked well enough, but a graph database would have resulted in a much more logical representation of the data structures being used
I think there's plenty of room to disagree with the view that modeling graph data in SQL is not "logical enough". Though to be fair, there seems to be some ongoing work on adding a "property"-based layer to bare SQL in order to make common graphs-and-properties oriented queries a bit more ergonomic.
For what I was doing, and how I was doing it, working with graph structures directly with Cypher would have been easier. Perhaps "logical" is the wrong word; my intent was to relay my specific experience, not express a general "principle".
It's not about the databases, it's about the migration in the first place.
If you have a problem that can be solved best with a graph database, then there is no problem. Many problems can be better solved with a graph structure. Choose one, and you'll be happy.
But, if your use-case is migrating from MongoDB to a graph database, that's a bit of a red-flag. What data model do you have where you can migrate from a document/schema-less system to a graph database? Maybe the tech lead figured out that a graph model works better for your data. If that's the case, then great -- migrate away.
But given that they want to go from Mongo to a graph DB, the fear is that this is someone who is only chasing the next cool technology and not solving an underlying business problem.
> But given that they want to go from Mongo to a graph DB, the fear is that this is someone who is only chasing the next cool technology and not solving an underlying business problem.
To be fair to the tech lead, I do feel like it was the other way around. MongoDB was foisted on us on a new project (we were previously SQL) by a software architect who left soon after. I've never felt that MongoDB was a good fit for what we want to do, but I want to return to SQL.
New tech lead pushing switching an existing product from infamously-cargo-culted MongoDB, of the much-hyped-but-now-passed Document Databases Are The Future wave, to either of a couple products in the current "X database architecture is The Future" wave? Does that not read like it could just as well be straight-faced parody, as real? The products may be fine, so far as they go, that's not what I'm trying to puzzle out here.
In my experience MongoDB has only given me problems (either performance or data loss). Most likely when someone wants to "solve" something with MongoDB, there is always a better technology to do it (Cassandra, ScyllaDB, S3!, PostgreSQL/JSONB). I could imagine that their current implementation has a half-modeled graph-like structure in MongoDB, and migrating to something else makes sense (I am generally against Neo4J because of their horrible pricing tiers).
they are a solid product - graph and document is a good mix. don't listen to the negative postgres fundos (especially ones who don't understand what a solid and performant database Mongo has become)
The most recent graph DB I've used was Dgraph. I've found the interface good to work with, and it does scale well performance-wise. Memory consumption was still too high for my tastes, and if you need to build common algos on top, like PageRank for example, they don't support that out of the box. If you read through their forum you'll see they may never choose to support things like that natively, so you have to do what I did, which was export the data out. This was maybe 7 months ago now.
I'll also say that working on the entire graph, if you need to, is difficult. They're not oriented around working on the whole graph, more like fragments that you've pared down through your query modifiers. So if you know you're going to be doing a lot of work that requires touching the entire graph, that may change the performance characteristics for you a lot.
I like it and would use it again, but there are rough edges to work around still, and it is young, so know your use case and know the trade-offs you're making.
We use ArangoDB and are super happy. But I guess it depends on your use case. We operate in the area of 1 million records. Everything is super fast and the ability to also have search, graph and document workloads was most important for us.
We used ArangoDB at my previous job. I liked it, though it was my first engineering job so I didn't get deep into the technicals. I thought it had a nice UI and query language
Honestly I never heard of it, I had a few in mind but that wasn't one. I'll give it a try next time I need a graph db maybe it can scratch my itch so to speak.
My concerns basically revolve around memory consumption, query language, and language ecosystem.
Edit: Oh, and I guess around functional extensibility. The last time I used a graph DB I had to export from the DB itself to HDFS and use Spark to do things like PageRank, and I'd rather be able to write that natively in their query language or some UDF-like equivalent.
That's what you can do in Neo pretty easily. The GDS (Graph Data Science) library offers a bunch of algorithms (50+) to run on the graph data directly or on projections, e.g. PageRank on 117M Wikipedia links runs in 20s.
I've had bad experience with Neo4J's memory consumption so I'm wary of that to be honest. I don't disagree that it has those things but we actively chose to go against it because of past issues with resource usage.
Redis is memory-backed, so you have limitations here. When I think of neo4j and its equivalents, it's about super giant datasets that are beyond memory limits.
I've been looking at Bitnine AgensGraph and thus also Apache AGE. These are (related) extensions that use PostgreSQL to implement a complete graph database and the Cypher query language. Interestingly, you can even mix SQL and Cypher in the same statement (sketch after this comment).
This approach of extending PostgreSQL is very appealing to me. There is a great deal of value in the PostgreSQL stack that doesn't need to be reinvented just to deliver a graph database and query language. How much easier is it to adopt graph database techniques when it is simply an extension of database technology nearly everyone is already running? Conceivably one might find some future version of PostgreSQL RDS (and other cloud PostgreSQL services) delivers Cypher.
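To make that mixing concrete, here's a hedged sketch using AGE's cypher() function; the accounts table, the 'social' graph, and the agtype casts are assumptions, and the exact casts vary by AGE version:

    -- Hypothetical: join rows from an ordinary SQL table against the
    -- result of a Cypher pattern match, all in one statement.
    SELECT a.email, r.name
    FROM accounts a
    JOIN cypher('social', $$
             MATCH (m:Manager)-[:MANAGES]->(p:Person)
             RETURN p.name, p.account_id
         $$) AS r(name agtype, account_id agtype)
      ON a.id::text = r.account_id::text;  -- agtype casts vary by version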
Cypher is really cool, and with neosemantics[1] you have the best of both worlds: labeled property graphs and RDF*. They even have a cool reasoner and SPARQL.
That being said, I thought about porting it to PostgreSQL with Apache AGE vs. using Neo4j for a project, because it's faster at least for this use case. Easier said than done, though.
If you want to play with graphs and linked data it's super cool. There is also structr[2], which builds CMIS/Alfresco-ECM-like functionality atop Neo4j with GraalJS scripts.
I always find it amusing how much GraphQL there is without an actual graph DB behind it.
Seems the concept of having fluid relationships is appealing for querying but not structuring/storing... which seems like a disconnect.
I have only seen a few Neo4J systems in serious production workloads and they were ALL on logistics... I'm not sure that it's being positioned (or interpreted) as a nice simple solution to start out on.
Edit: I just checked out neo4j "bloom", and it's definitely a good way to make graph more accessible - they should continue to build further on it.
GraphQL has been such a bad name to deal with. I've seen so much "we need a graph, so isn't graphQL a good idea?" or "this is a graph database, so doing graphQL on it will be way easier & more natural than on a SQL db, right?". Even from technical and semi-technical people.
It's also torturing the definition of "query language." There is no equivalent of "join", or any other typical query feature such as aggregation, grouping, sorting, filtering. GraphQL has as much to do with graphs or query languages as my smart TV has to do with intelligence. It's RPC, but RPC fell out of fashion when SOAP/WSDL/XML died.
That's an interesting point. The beauty of SQL is that behind it there's really good theory: Codd's relational algebra. Whereas document-based, column-based, and GraphQL don't have that. It would be interesting to see research on graph theory over data sets and how to represent it as a formal query language. The majority of graph theory I remember from my CS classes (granted, that was more than 15 years ago) was about graph traversal and path-finding.
Eh, many REST endpoints are de-facto indistinguishable from RPC. Though as far as a general query language for Web clients, you can use SPARQL which interoperates well with REST principles.
Is the latter not the case? I'm genuinely asking here, as this is a point of view expressed by my tech lead (though in our case it's a graph DB vs an existing MongoDB setup).
Absolutely unrelated. You can think of GraphQL as a standardized, glorified RPC that calls arbitrary functions that return some data. Whether the source of data is an RDBMS, a bunch of REST microservices, DynamoDB, redis, SQLite, a flat JSON file, Neo4j, or a D12 die doesn't really matter.
The "graph" part is if your arbitrary data is actually somehow related, you can traverse those relationships in one request instead of having to do many calls in a waterfall.
Not especially, no. It likes trees, or things that can easily and naturally be represented as trees, best, as far as I've ever been able to tell. Granted, I suppose, that's a kind of graph.
I don’t think that’s particularly amusing or a “disconnect.” From what I can tell, the entire point of GraphQL at Facebook was to expose an interface that looks like a graph database but is in fact able to load data from any number of different underlying data stores (without the query developer needing to know about that part).
Not sure if you saw our GraphQL integration, which takes typedefs and converts a GraphQL query into a single Cypher query, which can then be executed directly. https://neo4j.com/product/graphql-library/ has links to docs and API scaffolding tools.
When I started with it back in 2016, it was pretty cool how directly GraphQL mapped to the graph model in the DB.
Series F isn't an inherently bad thing. It just means that they've been around for a while, want funding to accelerate growth, and don't want to go public. Not every company has to grow super fast after 1-2 massive funding rounds and then get acquired.
Enterprise licenses for on-premise Neo4j are certainly for specific customers with specific needs, but there is always the DBaaS (https://neo4j.com/cloud/aura/).
Or if you absolutely need on-premise and are small, there is the startup program for free enterprise licenses (https://neo4j.com/startups/)
Neo4j's entire pricing model, even in cloud, is built around the idea that you'll have one centralized very large graph.
Many companies, like the one I'm at, have the opposite use case -- many, geo-distributed, tiny graphs and multiple (read: 3-5) pre-prod environments. They simply don't have a pricing model that supports customers like us.
They wanted to charge us something like 10% of our ARR for something that was just a component of one microservice.
tiny compared to the scale they're looking to sell but not exactly small. Also network-isolated.
A graph DB is the right tool for this use case. Just not really theirs, although it could be, if they could figure out how to sell it to us at a fair price.
I don't think you're actually supposed to use CE in a production environment, it's more there for learning and teaching, which is why it's free. If you want production grade then use the cloud offering.
I worked on a use case which was clearly graph in nature, so a graph database seemed a natural choice; of course, Neo4j was one of the most-mentioned products even back then.
We did evaluate Neo4j, but put it down due to its complex query language (Cypher) and slowness. It was really an awkward language, super awkward.
We also evaluated ArangoDB and found it much better than Neo4j. Performance was good and its query language was better too.
What we realised in the process is that using graph databases is a cultural transformation as well. SQL is much better understood, better adopted, and better supported by the community.
Ultimately we implemented the use case in Postgres, and thank God we did it that way. IMO, we can still get all the benefits of graphs with SQL databases with little effort.
I joined a small open source project that had decided to use Neo4J instead of a SQL database as its primary store. Simple queries mysteriously caused Neo4J to gobble up huge amounts of memory. Neo4J struggled even though we had fewer than a million items, sometimes getting so stuck that we had to restart the database. There wasn't any good tooling to explain why our queries were so slow. We'd already upgraded our VM beyond what we originally hoped to spend and were reluctant to spend even more on a larger one.
We had a team member who had used Neo4J professionally for years and could not figure it out. And we only had one; every other teammate and new volunteer had to be trained in a strange new way of thinking about databases and a new query syntax. Setting it up to run locally for development was a difficult process. Progress was slow and our code to access the database was messy. We kept being promised that, in exchange for these heavy burdens, Neo4J would do amazing things for us once we started doing graph queries, but we never got there because it couldn't do the basics.
We rewrote the project to run on PostgreSQL. Five tables, properly indexed, lightning fast, easy to set up and understandable by anyone. A hundred million rows and it didn't break a sweat, on the lowest tier of machine. Even graph queries were straightforward and quick.
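For flavor, here's a minimal sketch of the kind of indexed adjacency schema the parent describes; the table and column names are invented, not their actual schema:

    -- One table of items, one of typed links between them.
    CREATE TABLE items (
        id    bigserial PRIMARY KEY,
        kind  text  NOT NULL,
        props jsonb NOT NULL DEFAULT '{}'
    );

    CREATE TABLE links (
        src  bigint NOT NULL REFERENCES items(id),
        dst  bigint NOT NULL REFERENCES items(id),
        kind text   NOT NULL,
        PRIMARY KEY (src, kind, dst)
    );

    -- The reverse-direction index keeps "who points at me" queries fast.
    CREATE INDEX links_dst_idx ON links (dst, kind);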
Advice: Don't use Neo4J as your primary store, and avoid it altogether if you want volunteer or casual contributors. For us, it was all costs and no benefits.
Ever since Sarbanes-Oxley there's been a continuously declining number of new public companies. The reality is the modern legal and regulatory environment makes being a public company significantly more burdensome than it was in 1995.
There's more than abundant amounts of capital in private equity, so the only real reason to go public is to create liquidity for early founders/investors/employees who want to cash out. Given that, arguably you could say going public, instead of raising private capital, is the smell. Or at least an attempt to top-tick the valuation, e.g. WeWork.
Another way I look at this is that wealth has accumulated so disproportionately at the top that the small number of individuals/firms flush with wealth need to find ways to spend the money. So why not dump hundreds of millions on another company? Since there isn't any actual force coming in to take and redistribute it, might as well spend it.
Both. I personally find the fact that these later and later rounds of funding are considered normal a huge indicator that something is very wrong with the current industry.
It means there's a lot of capital being dumped into trying to find some hidden source of profit and it's getting harder and harder to find it.
It's the capital equivalent of going from finding oil in your back yard to blasting it out of tar sands in the Canadian tundra. Sure the capital/oil keeps flowing, but the inherent unsustainability of the system starts to show its face more clearly.
That's exactly the trade off, isn't it? Either you do the work to store the relationships and save on the compute and memory cost later, or you pay as you go to build it in real time with a relational database. It's horses for courses.
To propose a different perspective, a relationship in a graph DB is like a materialized join. You pay on relationship creation (you might be using index lookups to find the nodes to connect, similar to a relational DB); then traversal is just pointer hopping across the relationships to the connected nodes. Aside from the initial lookup of the starting node(s), traversing the graph won't use indexes at all, so each hop becomes a constant-time operation (sketch below).
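To illustrate the contrast (a hedged sketch; friendships(person_id, friend_id) is an invented table):

    -- Relational: each hop in a traversal is re-derived as a join,
    -- typically an index probe whose cost grows with total table size.
    SELECT f2.friend_id              -- friends-of-friends of person 42
    FROM friendships f1
    JOIN friendships f2 ON f2.person_id = f1.friend_id  -- hop 2: index probe
    WHERE f1.person_id = 42;                            -- hop 1: index probe
    -- A native graph store does the same two hops by following record
    -- pointers stored on node 42, with no per-hop index probe.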
These seem to be non-standard SQL extensions, AIUI. At least Neo4j has something of a quasi-standard solution for their querying layer, that might also be supported by other vendors. These extensions are a dead end.
I’m not sure I understand calling this a “dead end”. Almost no one limits themselves to pure ANSI SQL. Pretty much any application of reasonable size in production uses vendor specific APIs. A “quasi-standard” is not a standard.
I was thinking more about technical reasons in terms of the storage layer. The query syntax seems to be the least interesting part of a database, to be perfectly honest.
Okay, that seems more interesting. Any resources on the data structures used to avoid indices? Without table DDL, if node types are arbitrary, that seems like a hard problem to solve in terms of storage layout.
"To understand why native graph technology is so efficient, we step back in time a little to 2010 and the coining of the term index-free adjacency by Rodriguez and Neubauer. The great thing about index-free adjacency is that your graphs are (mostly) self-indexing. Given a node, the next nodes you may want to visit are implicit based on the relationships connecting it. It’s a sort of local index, which allows us to cheaply traverse the graph (very cheaply, at cost O(1) per hop).
Neo4j manages to keep traversal costs so low (algorithmically and mechanically) by implementing traversals as pointer chasing. This implementation option is available to us precisely because we bear the cost of building the storage engine..."[1]
"Each node (entity or attribute) in the graph database model directly and physically contains a list of relationship records that represent the relationships to other nodes. These relationship records are organized by type and direction and may hold additional attributes. Whenever you run the equivalent of a JOIN operation, the graph database uses this list, directly accessing the connected nodes and eliminating the need for expensive search-and-match computations."[2]
> Neo4j manages to keep traversal costs so low (algorithmically and mechanically) by implementing traversals as pointer chasing.
This is how network and navigational databases worked, before modern RDBMS's were introduced. It's very much legacy tech, and skipping the indexing step brings negligible gains if any at all. Where optimizations are worthwhile, they're already being used.
If that were so, then there would be no need for native graph databases at all, and we would not be seeing cases that are possible with Neo4j and native graphs but could not be served by relational DBs.
You may be thinking of non-graph use cases. When hundreds of thousands to millions or more traversals are required to address graphy use cases, if those traversals are implemented as table joins, and the join complexity depends on tables that are millions or billions of rows in size (so, dependence on the total size of the data instead of just the relevant connected relationships), then you can see where pointer hopping on only the relevant connected relationship and node records (proportional only to the elements of the subgraph traversed, not total data) would outperform the relational model. Also, you have the flexibility of being as strict or as lenient as required with the labels of nodes traversed or the relationship types traversed, as well as their direction. That's tougher to do when you may not know what tables are meant to be joined or how, or if you pour all your nodes into a giant table, where the join cost is proportional to your total data.
Relational databases are very good at what they do. But no tool is perfect and covers all use cases easily. Design is a matter of tradeoffs, and some of the design choices that make them excellent in many categories become a weakness in others. We're in an era of big data, huge data, where modeling, traversing, and exploring the connections within this data is increasingly valuable, and increasingly costly due to the sheer amount of data and the complexity of both the connections and the use cases themselves. Native graph databases are a tool for these cases, and can also bring simplicity in modeling and querying, as well as the performance that gives them an edge.
> largest investment in a private database company
I guess this is one of those PR moves that is trying to make something lame sound good? If your customer portfolio includes Walmart, Volvo, and AstraZeneca, why are you raising money a 6th time?
Having some department at e.g. Volvo using your product doesn't mean that you have Volvo as a serious customer paying you like they would as a serious customer.
I take your point that this is a really late round of funding, but this doesn't mean they've caught on like they want to yet.
I agree that what you've described is likely the true situation. It just looks funny to see a company claim to be worth $2 billion and namedrop big brands and yet require a 6th cash injection. There is an incongruence there.
IDK, on the one hand it's a late round. On the other, as a CEO, the ability to just immediately raise 325 million dollars is both impressive and appealing. It's insane how far even one million dollars goes; 325 is mind-boggling to me.
325M isn't all that much when the company has been growing for over 10 years, doubling its staff count every 2 years, with a bunch of VPs added in, supposedly to figure out how to professionally run 100 nerds. That leads to 4 layers of directors, senior managers, and managers below each VP. Those jobs take decent pay and a significant cash bonus. Add significant travel and other expenses from sales folks who won't take an Airbnb for a 3-day trip to NY (5-star Sheraton, please). Add to that that it's pretty common to pour 1/3 of the yearly flow into marketing, and that your IT, HR, and accounting departments have large IT expenses of their own (Concur, ServiceNow, etc.), and you end up with a boat burning 325M per year quite easily.
Is profit not normally extracted from the public? Rideshare costs are definitely increasing as they move towards profitability, this seems to be the expected and reasonable path towards that.
The ability to generate a profit and investment rounds aren't mutually exclusive, but if you're focused on growth you're not going to have profits, because you're going to be spending all of your money on growing. That doesn't necessarily mean you couldn't drastically reduce expenditures and become profitable if investment weren't an option (or were just an unfavorable one).
Welcome to the new economy. VC money gets poured in with an exit in sight. The bags keep growing until it ends up in a public offering.
By that time, the share price is pretty much what it's worth. But at 100 times the round-A price. If all went well, of course. Over 90% of the time, it didn't go well.
Who knows how that will end for Neo4j, but the investors have their eggs in many other baskets anyway.
What matters isn't showing profit anymore, not even significant revenue to justify further funding. All you need is some appealing growth figures, sometimes not even that, just a convincing argument that hyper growth is on the horizon.
At some point millions are put into advertising and a strong sales force to grow revenue manyfold. In the enterprise market, the trick often works pretty well.
and the sheer size of these rounds always astonishes me for very specialized software products. 300 million bucks, that's enough to build a Death Star; what do they do with all of the cash?
(from Hasura) I've been wanting to try out a thing and use Neo4j + Postgres simultaneously. Use Postgres for data, Neo4j for relationships and graph-y queries.
Has anyone tried that? Would love any notes/pointers!
Join data across the two to get the best of both, basically. Hasura doesn't support Neo4j natively yet, but maybe one could use Neo4j's GraphQL wrapper as an input to Hasura?
It's always interesting to hear about huge funding rounds for companies I've never heard of. Even more so when they apparently work with a bunch of companies I have heard of.
They're a fairly major graph database product with a well-funded sales team that has been hammering the "I mean, doesn't everything look kinda like a graph? Isn't your data kinda a graph? You should definitely use us as your database-of-record, or for literally anything else you might use a database for, look how fast we are at graph stuff!" line hard and (apparently) with great success.
I've used it in production, roughly 5+ years ago. Both graph data modeling and Cypher queries are very fun to work with. It's both dynamic and powerful but still gives you a decent amount of structure and ACID guarantees.
One thing that is much easier to model and query, or rather more natural and simple, is authorization and other granular questions you might have about how users and data are connected.
A thing that I can’t wrap my head around however is temporal data modeling with graphs. Haven’t seen or thought of anything too satisfying yet, that meshes well with how I think about graphs. Whereas in SQL it is more explored and clear to me.
I agree that their marketing is very aggressive, but this tech has quite some merit.
I'm sure it's fast, but it also consumes a ridiculous amount of memory. I tried loading a smallish dataset, one that runs on a 100MB-RAM Postgres server just fine, into Neo4j; it wanted gigabytes of memory.
When I used it, it was fast at a very specific set of graph traversal operations. It was extremely easy to step outside those narrow patterns, while doing things that seemed like they should be fine and were very common sorts of things one might want to do with the DB, at which point performance would become terrible. A look at the underlying data structures they use was revealing re: why they were so fast at some things, and so slow at others. And they only achieved that much performance by, as you note, eating memory like it's free (I mean, it is Java...).
It was also alarmingly weak on data-integrity protection features, like constraints, locking, and data-types, at the time. IDK, maybe they've fixed that. Then, it was IMO wholly unsuited to hosting any data set that you couldn't stand to have completely destroyed every so often (so, a very particular kind of caching, which IIRC is exactly what one of their marketing department's favorite names to trot out, Ebay, used it for then).
[EDIT] I would, however, agree with nisa's post elsewhere in the thread, that the Cypher query language is excellent.
Did you do anything to configure that? If I remember correctly (it has been a few years), Neo4j will by default reserve a bunch of space for caches and such (on top of the default reserved heap sizes etc. for Java). As such, the defaults don't necessarily say much.
Yeah, I mean, it's not the first time a database that's best-suited to a niche use-case has successfully had its market hugely expanded by sales & marketing promoting it as a superior replacement for what are in fact better general-use products. Clearly, it's a playbook that works.