Thursday 4 May 2023

Size does matter, also in Data Modeling!

Recently, my colleague Pascal Desmarets wrote a fantastic article about “Domain Driven Data Modeling”. There was much I liked about that article – especially how Pascal managed to tie together some of the best insights I know of from Agile software development methodologies with the best practices of modern Data Modeling. The two are clearly linked: if you truly want to implement an agile development methodology, then you need data models that follow the principles of that methodology. The reason should be obvious: development concerns are some of the core concerns that we try to address with modern data modeling tools.

In the article, Pascal maps the core principles of one of the great, proven agile software development methodologies (Domain Driven Design) onto the practice of data modeling.

No surprise: these principles are at the core of the Hackolade toolset that we have now spent years developing.

As you will see, Domain Driven Data Modeling has some inherent implications for the SIZE of a data model. This is one of the core points that Pascal tackles in the early part of his article, and a really interesting one to me.

About the size of data models

In the last few months, as I have worked my way into the wonderful world of data modeling, I have had several client discussions where VERY large data models came up in the conversation.

Basically, this has come up in two distinct cases:
  1. Where people wanted to adopt some “industry standard” data model as their reference data model. Think ACORD in insurance, or FHIR in healthcare, but there are plenty of other examples.
  2. Where people, specifically in large organizations that have to deal with lots of integration issues (e.g. as a result of mergers and acquisitions, or other evolutions of systems), have felt the need to create Enterprise Data Models, covering their entire enterprise data architecture.
Both kinds of large data models have great, noble intentions behind them, but we really can and should think carefully about the specific situations where they are useful – and where they are not.

Let’s put the two sides of the coin side by side.

How do we then deal with this, knowing that every business is different and every data modeling, software development, and data governance project has its own specific circumstances to address? This is where the principles of Domain Driven Data Modeling come to the rescue.

All about the balance

Ultimately, we will have to make a judgement call on the right size of the data model that we want to be working with. There are no predetermined guidelines here, except for the ones that we have learned from Domain Driven Data Modeling:
  • Focus on the core of the problems that you are trying to address. Clearly defining what is core to your business is always key.
  • Make sure that you have some kind of limitation of scope and domain – that there is no possibility for the “kitchen sink syndrome” (aka scope creep).
  • Aggregate related things – keep things together that belong together. If you absolutely have to separate them, make sure you have good reasons for doing so.
Keeping these principles in mind will, almost automatically, bring focus and specific borders and limits to the modeling efforts that you are bringing to your business. And that focus will greatly benefit the return that you are likely to see from these efforts.
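To make the three principles above a bit more concrete, here is a minimal, hypothetical sketch (the names and structure are illustrative assumptions on my part, not taken from Pascal's article): an Order aggregate in an “Ordering” bounded context that embeds only what truly belongs together, and references everything else by identifier.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical "Ordering" bounded context: the model contains only what
# this domain needs, limiting scope and avoiding the "kitchen sink".

@dataclass
class OrderLine:
    sku: str
    quantity: int
    unit_price_cents: int

@dataclass
class Order:
    # The aggregate root keeps its lines together: order lines are not
    # modeled (or persisted) independently of the order they belong to.
    order_id: str
    customer_id: str  # a reference into the Customer domain, not an embedded copy
    lines: List[OrderLine] = field(default_factory=list)

    def total_cents(self) -> int:
        # Behavior that operates on the whole aggregate lives with it.
        return sum(line.quantity * line.unit_price_cents for line in self.lines)

order = Order(order_id="o-1", customer_id="c-42")
order.lines.append(OrderLine(sku="SKU-1", quantity=2, unit_price_cents=500))
print(order.total_cents())  # 1000
```

The design choice to reference `customer_id` rather than embed a full customer record is exactly the kind of deliberate boundary these principles encourage: the Customer entity belongs to another domain, so this model stays small and focused.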

As with so many things, sizing your data models is going to be a balancing act. You will need to make sure that the models are not too small (or they will not contribute to the real solution of material problems), and not too large (or they will never deliver any tangible, specific value either). Striking that balance is important – and the principles of Domain Driven Data Modeling should assist in achieving it. And of course, the Hackolade toolset will help you find the balance by providing key functionalities (think references to external definitions, target models derived from multiple polyglot models, etc.) – and support you in the balancing act as you see fit.

Hope this was a useful discussion for you. Would love to hear your thoughts, as well.

All the best

