Archive for April 2016
Last week, I attended a free seminar on Neo4j and came back wondering why I hadn’t looked into the delightful world of graph databases earlier! I’ve never really grokked relational databases, in the sense that it has never been second nature for me to think of most everyday problems in terms of rows and columns. The biggest stumbling block I have is that it is still unnatural for me to think of a relationship between two things as being a column in a relational database. I understand it abstractly but just not natively. Graph databases, by contrast, address this issue directly and seem to map much better to what is intuitive in my head.
Of course, I’m very much a neophyte in the world of graph databases, and learning something new is always a delicate balance between sheer delight in discovering new ways of doing things and head-exploding frustration. The best way for me to learn is to start on a concrete “next small thing”, and I’ve already hit some stumbling blocks I’d like to get answers to:
I want a website like kayak.com that lets you track flights, but instead of showing which tickets and fares are available, it shows you which airline awards might be available.
To do that I need to model each airline’s award chart.
1) Most loyalty programs’ awards are based on regions such as North America, South Asia or Europe. But of course, each airline’s definition of those regions is slightly different. Do we create separate region nodes for each loyalty program?
2) Another example of a tricky region: almost all regions encompass entire countries (e.g. North Asia includes all airports in Japan). The glaring exception is that most airlines treat Hawaii as a separate region from the rest of the US. How do we model Hawaiian airports as a region?
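To make the two questions above concrete, here is a minimal in-memory sketch in Python of one possible answer, not a recommendation from the Neo4j folks: give every loyalty program its own region nodes, and attach airports to regions directly rather than through countries, so Hawaii is just another region. The program names ("ProgramA", "ProgramB") and the `AwardGraph` class are made up for illustration.

```python
from collections import defaultdict


class AwardGraph:
    """Toy model: each (program, region) pair plays the role of a region node."""

    def __init__(self):
        # (program, region name) -> set of airport codes in that region
        self.region_airports = defaultdict(set)
        # (program, airport code) -> region name
        self.airport_region = {}

    def assign(self, program, region, airports):
        """Put airports into a program's region, moving them if already placed."""
        for code in airports:
            key = (program, code)
            previous = self.airport_region.get(key)
            if previous is not None:
                self.region_airports[(program, previous)].discard(code)
            self.airport_region[key] = region
            self.region_airports[(program, region)].add(code)

    def region_of(self, program, airport):
        return self.airport_region.get((program, airport))


graph = AwardGraph()
us_airports = ["LAX", "JFK", "HNL", "OGG"]

# Hypothetical ProgramA carves Hawaii out of North America.
graph.assign("ProgramA", "North America", us_airports)
graph.assign("ProgramA", "Hawaii", ["HNL", "OGG"])

# Hypothetical ProgramB keeps the whole US in one region.
graph.assign("ProgramB", "North America", us_airports)

print(graph.region_of("ProgramA", "HNL"))
print(graph.region_of("ProgramB", "HNL"))
```

Because membership hangs off the airport rather than the country, the Hawaii exception needs no special case; the cost is that every program has to list its airports explicitly.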
One of the things that attracts me to Neo4j is its schema-less NoSQL nature. You are much less locked into your initial design or schema, and as you discover what your real needs are, your database structure can change much more easily.
However, change management is never free. What are the best practices for dealing with changes, from simple node label or relationship name changes to more complicated design changes? Are there pointers to good practices?
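One approach I can imagine, borrowed from how relational schemas are often managed rather than from any Neo4j guidance I have seen, is to keep an ordered list of named migrations and record which ones have already run. The sketch below assumes that idea. The migration ids, the label and relationship names, and the `run` callable (standing in for whatever actually sends Cypher to the database) are all hypothetical.

```python
# Hypothetical migrations: the ids, labels and relationship types are
# illustrative only. Each entry is (id, Cypher statement).
MIGRATIONS = [
    (
        "001_rename_region_label",
        "MATCH (n:Region) SET n:AwardRegion REMOVE n:Region",
    ),
    (
        "002_rename_in_region_rel",
        # A relationship type cannot be changed in place, so create the new
        # relationship and delete the old one. Properties on the old
        # relationship are NOT copied in this simplified statement.
        "MATCH (a)-[old:IN_REGION]->(r) "
        "CREATE (a)-[:BELONGS_TO]->(r) DELETE old",
    ),
]


def apply_pending(run, applied):
    """Run every migration whose id is not in `applied`, in list order.

    `run` is any callable that takes a Cypher string; `applied` is a set of
    ids that is updated in place. Returns the ids that were run this time.
    """
    newly_applied = []
    for migration_id, statement in MIGRATIONS:
        if migration_id in applied:
            continue
        run(statement)
        applied.add(migration_id)
        newly_applied.append(migration_id)
    return newly_applied


# Stand-in for a database session: just collect the statements.
executed = []
applied_ids = set()

first = apply_pending(executed.append, applied_ids)
second = apply_pending(executed.append, applied_ids)
print(first, second)
```

In a real setup the set of applied ids would have to live somewhere durable, perhaps in the graph itself, so that a second run after a restart still skips what has already been done.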
This current project requires loading from a myriad of disparate data sources (airport codes, regions, award charts, rule changes, etc.). What are the best practices for initially loading such data, and then for ensuring that it stays up to date?
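As a starting point for the loading question, one pattern that seems worth trying is to make every load idempotent: key each record on a stable identifier and insert or update, so the same feed can be re-run safely whenever a source changes. Cypher's `MERGE` clause offers a similar match-or-create behaviour. The sketch below shows the idea in plain Python with a dict standing in for the database; the `upsert_airports` function, the row layout and the "feed-v1" source name are assumptions for illustration.

```python
def upsert_airports(store, rows, source):
    """Insert or update airports keyed by airport code.

    `store` is a dict standing in for the database. Returns a
    (created, updated) tuple; rows that match what is stored are skipped.
    """
    created = 0
    updated = 0
    for row in rows:
        # Normalise the key so "hnl" and "HNL " land on the same record.
        code = row["code"].strip().upper()
        record = {"code": code, "name": row["name"], "source": source}
        if code not in store:
            created += 1
        elif store[code] != record:
            updated += 1
        else:
            continue  # already up to date, nothing to write
        store[code] = record
    return created, updated


store = {}
feed = [
    {"code": "hnl", "name": "Honolulu"},
    {"code": "LAX", "name": "Los Angeles"},
]

first = upsert_airports(store, feed, "feed-v1")   # initial load
second = upsert_airports(store, feed, "feed-v1")  # re-run of the same feed

corrected = [{"code": "HNL", "name": "Honolulu (Daniel K. Inouye)"}]
third = upsert_airports(store, corrected, "feed-v1")
print(first, second, third)
```

The counts returned by each run double as a cheap check on freshness: a re-run that reports no creations or updates means the source has not changed since the last load.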