We're moving into times where development needs to go fast. As a software engineer and agile coach, I'm more and more a fan of velocity-driven design: delivering quality features as fast as possible by applying ASF principles (TDD, YAGNI, ...) and using the right tool for the job.
Personally, I'm no longer interested in spending time optimizing my SQL queries. I don't like thinking about how to shape my aggregate roots so that they fit into a SQL database, be it Oracle or SQL Server. I hate lazy loading and detached objects. Downtime during big data migrations is not acceptable to me. Did I mention I'm not a fan of SQL? :-) But I love writing code that generates business value...
At the project I'm working on now, we use Elasticsearch as our primary datastore, with a backup in SQL Server. This architecture has allowed us to deliver features fast, but we face a challenge: altering our domain means altering over 80,000,000 Elasticsearch documents (20,000,000 per environment). How do we tackle that?
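At its core, a document migration means re-writing every JSON document with a transform function, in batches small enough to bulk-index safely. A minimal sketch of that idea (in Python for brevity; the real code would be C#/.NET, and all names here are illustrative, not the project's actual code):

```python
# Hypothetical sketch: a schema change is a pure function over a JSON
# document, applied batch by batch across the whole index.

def migrate_batch(docs, transform):
    """Apply a schema transform to a batch of JSON documents."""
    return [transform(doc) for doc in docs]

# Example transform: splitting a legacy 'name' field into first/last.
def split_name(doc):
    new_doc = dict(doc)
    first, _, last = new_doc.pop("name").partition(" ")
    new_doc["firstName"] = first
    new_doc["lastName"] = last
    return new_doc

migrated = migrate_batch([{"name": "Ada Lovelace"}], split_name)
```

The hard part is not the transform itself but running it 80,000,000 times without downtime or data loss.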
What it does
The idea of this submission is to work out best practices for doing Elasticsearch (in fact, any NoSQL JSON document store) document migrations. Ideally, the goal is a public, Cegeka-branded GitHub repository sharing a .NET NuGet package that lets us do data migrations the easy way.
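One plausible shape for such a package, borrowed from SQL migration tools like Flyway: migrations are versioned document transforms, applied in order, with the current schema version stored on the document itself so an interrupted run can resume safely. A sketch under those assumptions (illustrative Python; the NuGet package would expose a C# equivalent, and the names are invented):

```python
# Registry of versioned migrations: version number -> transform function.
MIGRATIONS = {}

def migration(version):
    """Decorator registering a document transform under a schema version."""
    def register(fn):
        MIGRATIONS[version] = fn
        return fn
    return register

def upgrade(doc):
    """Apply every migration newer than the document's schema version."""
    current = doc.get("_schemaVersion", 0)
    for version in sorted(v for v in MIGRATIONS if v > current):
        doc = MIGRATIONS[version](doc)
        doc["_schemaVersion"] = version
    return doc

@migration(1)
def add_status(doc):
    # Example migration: introduce a 'status' field with a default value.
    doc.setdefault("status", "active")
    return doc
```

Because `upgrade` is idempotent per version, re-running it over documents that were already migrated is harmless, which matters when a batch job dies halfway through 20,000,000 documents.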
How we built it and outlook on the future of Elasticsearch migrations
We already have a basic project for doing data migrations (we already migrated 8,500,000 docs per environment in 2.34 hours!), but I think we can do better: leverage our multi-tenant hardware to run migrations faster, more safely, and with an automatic safety net.
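One plausible way to use that extra hardware (a sketch under assumptions, not the team's actual design): partition the document IDs across workers, have each worker bulk-write transformed documents into a new index, and switch an index alias only once every partition has succeeded. The untouched old index then is the automatic safety net: rollback is just pointing the alias back. Illustrative Python:

```python
from concurrent.futures import ThreadPoolExecutor

def migrate_partition(docs, transform):
    """Migrate one partition of documents with the given transform."""
    return [transform(doc) for doc in docs]

def migrate_in_parallel(partitions, transform, workers=4):
    """Run one worker per partition; collect results in partition order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda p: migrate_partition(p, transform),
                           partitions)
    migrated = [doc for part in results for doc in part]
    # The real tool would bulk-index `migrated` into the new index here
    # and atomically swap the alias; this sketch just returns the docs.
    return migrated
```

Elasticsearch's own reindex API plus index aliases already support this write-to-new-index-then-swap pattern, so the package would mostly need to orchestrate partitioning, retries, and the final alias switch.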