Wednesday 6 December 2023

Joining Hopsworks: of course I had to get in on the AI/ML hype! Here's why!


I wanted to write a short article to share with you some of my thoughts on why I joined Hopsworks and what my and our plans are for the next couple of months and years - aside from utter world domination, of course :) … Here goes.

2023 has been an interesting year for me. Exiting from Neo4j after a decade was of course a little bit of an earthquake for me - mostly because all of a sudden I was no longer really part of the graph tribe. Even though I wanted the exit and totally agreed to the process, it was still more impactful than I had guessed. Then came a number of interesting but turbulent months where I worked for a number of smaller startups, among them the amazing team at Hackolade - but in all honesty, I just could not find my bearings. So when that quest ended, it felt like I really needed to take a step back and think about what I wanted to do.

That process, and a whole bunch of coincidences that I never could have planned, led me to join the best Machine Learning platform company I could find :) … Here’s what made me do that.
  1. The Hopsworks market seems incredibly interesting to me. Yes, since the ChatGPT hype every man and their dog seems to want to do something with AI - but I am convinced that this is not the next blockchain hype. There is much more to it - and ML and AI systems are here to stay.

  2. The reason why they are here to stay is that there is true value in them. Yes, we have had great data analytics for years. Yes, we have had statistics for years. But being able to use the best of these techniques in a system with operational pipelines that can actually predict the future accurately - that’s pretty much a game changer. If you have seen how ChatGPT is able to predict the words of sentences etc… then you understand how good these ML-based predictions can really be. The results are just very impressive.

  3. The number of business use-cases for that type of predictive power is just without limit. As always, I expect the adoption to be gradual and value-based: the people that get the most value from these platforms will likely start using them first. But, not unlike what I experienced at Neo4j, there is no limit to the potential of Hopsworks. The current customers already prove it.

  4. Hopsworks actually has some amazing product tech that makes it really well poised for success. At the heart of that technology is a feature store (think of it as the beating heart of the ML system: this is where the online and offline data of your ML system gets stored and managed) that, just because it is so damn fast, allows for a completely different way of doing machine learning. RonDB, the key-value store that Mikael Ronström originally conceived for MySQL as NDB, has found another great use case here - and we are going to use it to the max.
And then, there is one final thing that made it easy to join a new adventure: the team. Hopsworks is about 35 people strong these days, and they have been nothing but superbly welcoming to me. It’s been a month of fantastic energy, lots of laughter, great discussions, and a feeling of collective purpose that I am savoring by the minute. I am extremely grateful to be able to start this journey - and will share updates along the way.

If you want to know more about Hopsworks, please go to hopsworks.ai, or hit me up so that we can schedule a chat. I have also included a short intro video at the bottom of this post.

All the best

Rik

PS: You can find a lot of our videos on our YouTube page!





Wednesday 14 June 2023

Just the right amount of thinking things through

This may become a little bit of a weird, “meta” article. But I feel like it’s an important one. It relates to something I have been thinking about a lot lately, both professionally and personally, and that I think holds important life lessons. Maybe it’s because I am turning “half a century” later this year that these types of thoughts and considerations are on my mind, I don’t know.

Here’s the deal: I think that, both personally and professionally, there’s a lot to be said for a) not overthinking things, and b) not underthinking things either. Let me try to explain what I mean by that.

You don’t want to be overthinking

I know that some problems are very hard. It’s super difficult to get it all in your head, to rationalise all the parameters, to assess the impact of all the different factors, and to play out what will be the right decision in a given set of usually ever-changing circumstances. So you think and you think and you think things through – but oftentimes that just does not get you any closer to a practical solution. In my experience, very often you are better off “just getting going”, chopping away at the problem and moving the solution forward in what you think will be the right direction. It’s not possible to solve the whole thing all at once, and overthinking it will not get you closer to that solution. It will just continue to look like a massive sticky hairball, a “big ball of mud” that is impossible to manage or untangle. “Stop thinking, start doing” is often very sound advice.

Friday 9 June 2023

Alignment (or not) between data tech and software development industries

How data technologies and software development have evolved in parallel and synergistic ways




When I think about both the data industry and the software development industry, and try to take stock of the evolution that we have gone through in them in the past 25 years, I can’t help but see that there are some really interesting and important trends, and that these trends are actually really well aligned between both industries. No surprise?

Relational databases and waterfall development dominating the earth

When I started my career in tech, the world was ruled by Relational Database Management Systems (RDBMSs). My university graduation thesis was actually about a sizing study of an early version of MS SQL Server – of all things :) … And at that time, we were developing software in a very well-structured, very well-governed fashion, using waterfall development methodologies. Every step had its clear and defined purpose, and would follow another step. It was all about analysing, designing, developing and testing software in significant chunks of functionality that would each take months to deliver.

Things started to shift…

I remember feeling, at the time, that a shift was starting. We were starting to see some understanding of how we would need to become more agile – that the mythical man-month was, indeed, mythical. That we would need to break up problems into smaller ones. That use cases were going to have to drive our alignment with business stakeholders.

In my experience, with the advent of many new technologies (object orientation, open source operating systems and development frameworks/infrastructures, to name a few), this started to change significantly, and we started to develop software in a truly more agile, iterative way. And forgive me if I get the chronology wrong here, but just around that time, we also started to look at data in a different way, and our industry developed an entire range of new (NOSQL, for lack of a better category name) data technologies. All of a sudden the normalised, table-based data model that had famously overthrown the navigational and network databases of the mainframe was deemed less opportune for modern software development practices. Document databases, key-value stores, graph databases, and wide-column stores each had their unique traits that suited specific use cases – and therefore drove their adoption to sky-high levels. We are still seeing that today.

Parallels between data tech and software development

So in my mind, we have a really interesting, and parallel set of shifts:
  • From waterfall methodologies to agile software development

  • From relational databases to NOSQL databases



About alignment and misalignment

It seems to me that there is an interesting alignment and misalignment here between these shifts:
  • waterfall/agile methodologies are opposed and therefore misaligned,
  • relational/NOSQL technologies are very different and therefore misaligned,
  • relational/waterfall were clearly historically aligned, and share some characteristics for sure,
  • NOSQL/agile were also historically aligned, and share some characteristics too.
I tried to express this in a bigger mindmap, parts of which I have pasted above.


Interesting, no? Clearly, my overview and articulation of this evolution is incomplete and even inexact, but the broad lines of the argument hold true. I would love to hear it if you think otherwise. Note that I do not mean to say that you cannot do agile with RDBMSs, or that you cannot do waterfall with a NOSQL database – but I do not think they are well aligned.

It feels important to point out these types of higher level shifts so that people understand why a modeling tool like Hackolade does what it does, and doesn’t do what it doesn’t do. After all, data modeling:
  • On the one hand touches the software development industry: software developers will use data models to better understand their business domains, and develop better software in an agile, iterative way. Hackolade is super aligned with that approach.
  • On the other hand touches the data industry: data architectures are more and more frequently composed of different, heterogeneous database systems, streaming services, APIs, and many other components. Hence Hackolade's support for all kinds of different physical targets.
And in any case: let’s hope that it encourages the conversation and discussion – we can only collectively get better by doing that.

All the best

Rik

PS: I thought about using a graph method to "close the triangles". Kind of like the old adage "an enemy of my friend is my enemy", one could argue that "misalignment of a concept to an aligned concept will misalign that concept with the original concept". Therefore: NOSQL/Waterfall and RDBMS/Agile would be implied to be misaligned. But I think that is too simplistic, and therefore I consciously did not include it in the article.
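
For the curious, here is a minimal sketch of what that "close the triangles" idea could look like in Python. It is purely an illustration and not something from the mindmap: the `edges` dictionary, the +1/-1 encoding and the `close_triangles` function are all names I made up for this example, and the rule it applies is exactly the simplistic one described above.

```python
from itertools import combinations

# Relationships from the four bullets in the article (hypothetical encoding):
# +1 means "aligned", -1 means "misaligned".
edges = {
    frozenset({"waterfall", "agile"}): -1,
    frozenset({"relational", "NOSQL"}): -1,
    frozenset({"relational", "waterfall"}): +1,
    frozenset({"NOSQL", "agile"}): +1,
}


def close_triangles(edges):
    """Infer missing pairs with the simplistic rule from the PS:
    if a-b is aligned and b-c is misaligned (or the other way around),
    then a-c is implied to be misaligned."""
    nodes = sorted(set().union(*edges))
    implied = {}
    for a, c in combinations(nodes, 2):
        if frozenset({a, c}) in edges:
            continue  # this pair is already stated explicitly
        for b in nodes:
            if b in (a, c):
                continue
            ab = edges.get(frozenset({a, b}))
            bc = edges.get(frozenset({b, c}))
            if ab is None or bc is None:
                continue  # no two-step path through b
            if ab * bc == -1:  # one aligned edge, one misaligned edge
                implied[frozenset({a, c})] = -1
    return implied


for pair in close_triangles(edges):
    print(" / ".join(sorted(pair)), "-> implied misaligned")
```

With the four relationships above, the only pairs left to infer are NOSQL/waterfall and relational/agile, and the rule marks both as misaligned. That is precisely why I find it too blunt: it leaves no room for "not ideal, but workable".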