Denormalize at scale

Tuna Vargi, Our journey from Graph Databases to PostgreSQL (2016)

VARGI: MongoDB was doing its job quite well for simple querying and storing figure data as documents, but the aggregation and recursive queries were spiraling out of control. ...
Neo4J is great, but it gets slower with complicated queries and increased data scale.
... New query requirements such as private users and private photos which only owners and superusers are able to see started breaking the system, we couldn't cache anything due to completely dynamic calculations ...
As our database grew, Neo4j started getting slower and slower with no clear path for scaling or optimization
... PostgreSQL is perfect for writing aggregation functions, complex queries. However this had performance impacts. Category hierarchies were not changing too often and our data was unidirectional, from parent categories to child categories to figures. So we decided to denormalize and store the parent category information of each category in a concatenated string identifier such as parent1|parent2|parent3|. This approach allowed us to make recursive queries, also we no longer needed Neo4J, MongoDB or any other database but PostgreSQL thanks to the denormalized precalculated paths. Since PostgreSQL uses indexes for like queries, it's working quite fast when we compare with the MongoDB and Neo4j stack.