Friday 26 April 2024
I just posted this on Twitter as @ rvanbruggen
Ever since I started working at @hopsworks , I have seen a pattern similar to the one I saw at @neo4j (and at different startups before/after that), which is that adopting high-tech more often than not hinges on the #valuecase - and we techies royally s*ck at making them.
— Rik Van Bruggen (@rvanbruggen) Apr 26, 2024
from Twitter https://twitter.com/rvanbruggen
April 26, 2024 at 07:30AM
via IFTTT
Thursday 25 April 2024
Wednesday 24 April 2024
Tuesday 23 April 2024
I just posted this on Twitter as @ rvanbruggen
Our #neo4j + @hopsworks event is just a week away! Hope to see you there? https://t.co/N5cBJaSj7h
— Rik Van Bruggen (@rvanbruggen) Apr 23, 2024
from Twitter https://twitter.com/rvanbruggen
April 23, 2024 at 08:30AM
via IFTTT
Monday 22 April 2024
Sunday 21 April 2024
Middenstatie, part 4
Saturday 20 April 2024
Middenstatie, part 3
Friday 19 April 2024
Middenstatie, part 2
I just posted this on Twitter as @ rvanbruggen
RT @jim_dowling: This can be tried out today on https://t.co/qjiGPkJbMh
— Rik Van Bruggen (@rvanbruggen) Apr 19, 2024
from Twitter https://twitter.com/rvanbruggen
April 19, 2024 at 07:05AM
via IFTTT
Thursday 18 April 2024
Middenstatie, part 1
Wednesday 17 April 2024
I just posted this on Twitter as @ rvanbruggen
One thing I have heard a lot of our @hopsworks customers debate, is the question of "Build vs Buy" of #featurestore #ai/#ml infrastructure. I wrote about that over here: https://t.co/AiM7NkfwqS - I hope it is useful.
— Rik Van Bruggen (@rvanbruggen) Apr 17, 2024
from Twitter https://twitter.com/rvanbruggen
April 17, 2024 at 07:30AM
via IFTTT
Tuesday 16 April 2024
I just posted this on Twitter as @ rvanbruggen
Looking forward to speaking at the #belgian #mlops #meetup on May 30th! Should be a good story around how you can leverage the @hopsworks feature store for better #ai and #llm systems on private data. See https://t.co/DOXYzDfIqF if you want to join us.
— Rik Van Bruggen (@rvanbruggen) Apr 16, 2024
from Twitter https://twitter.com/rvanbruggen
April 16, 2024 at 03:43PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
This should be a great event for anyone in #nyc interested in #llm #ai #ml and @hopsworks ! https://t.co/TFNwsEfjiC
— Rik Van Bruggen (@rvanbruggen) Apr 16, 2024
from Twitter https://twitter.com/rvanbruggen
April 16, 2024 at 11:08AM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
Next week, @hopsworks is co-hosting a great event in #paris about #responsibleai. You can join us there: https://t.co/hrrb4iu9pj.
— Rik Van Bruggen (@rvanbruggen) Apr 16, 2024
from Twitter https://twitter.com/rvanbruggen
April 16, 2024 at 08:01AM
via IFTTT
Monday 15 April 2024
I just posted this on Twitter as @ rvanbruggen
We are hosting this event on the integration between #neo4j #graphs and @hopsworks : https://t.co/N5cBJaSj7h - hope to see you there?
— Rik Van Bruggen (@rvanbruggen) Apr 15, 2024
from Twitter https://twitter.com/rvanbruggen
April 15, 2024 at 02:04PM
via IFTTT
Sunday 14 April 2024
Saturday 13 April 2024
Friday 12 April 2024
Thursday 11 April 2024
Wednesday 10 April 2024
I just posted this on Twitter as @ rvanbruggen
At @hopsworks , we think there's a lot of our #softwareengineering #bestpractices that we can take into the emerging, wonderful field of #ml and #ai. We call that a #softwarefactory for #models. This blogpost explains the concept in some detail: https://t.co/F00UBdZ7N4
— Rik Van Bruggen (@rvanbruggen) Apr 10, 2024
from Twitter https://twitter.com/rvanbruggen
April 10, 2024 at 04:31PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
Our @hopsworks CEO @jim_dowling says it well. In #softwareengineering, we have learnt a lot about how to do things. #agile. #devops. #infrastructureacode. #softwarefactories. It's time to bring that experience to #ai and #ml. Now. https://t.co/QuN7RyQiyG
— Rik Van Bruggen (@rvanbruggen) Apr 10, 2024
from Twitter https://twitter.com/rvanbruggen
April 10, 2024 at 10:27AM
via IFTTT
Tuesday 9 April 2024
Monday 8 April 2024
I just posted this on Twitter as @ rvanbruggen
Happy Monday! Here's another great @hopsworks #5minuteinterview with @GoAbiAryan. https://t.co/23OAiueou5
— Rik Van Bruggen (@rvanbruggen) Apr 8, 2024
from Twitter https://twitter.com/rvanbruggen
April 08, 2024 at 09:28AM
via IFTTT
Sunday 7 April 2024
Saturday 6 April 2024
Friday 5 April 2024
Thursday 4 April 2024
Wednesday 3 April 2024
Tuesday 2 April 2024
Monday 1 April 2024
Sunday 31 March 2024
Saturday 30 March 2024
Friday 29 March 2024
I just posted this on Twitter as @ rvanbruggen
Just in time for the Easter weekend: another Hopsworks #5minuteinterview, with Jiri Steuer: https://t.co/qjfVdfjojd !
— Rik Van Bruggen (@rvanbruggen) Mar 29, 2024
from Twitter https://twitter.com/rvanbruggen
March 29, 2024 at 01:48PM
via IFTTT
Thursday 28 March 2024
I just posted this on Twitter as @ rvanbruggen
It's tonight, people! Join us in #amsterdam for a lovely #meetup where #graphs, #datascience, #neo4j come together with #ai, #ml and @hopsworks. Looking forward to seeing you at the @Xebia offices, who are again the kindest of hosts. https://t.co/lm0SqXuVbN https://t.co/LaIdMFexTn
— Rik Van Bruggen (@rvanbruggen) Mar 28, 2024
from Twitter https://twitter.com/rvanbruggen
March 28, 2024 at 11:47AM
via IFTTT
Wednesday 27 March 2024
Tuesday 26 March 2024
Monday 25 March 2024
I just posted this on Twitter as @ rvanbruggen
Super happy about this: we are making our @hopsworks #5minuteinterviews available as a #podcast. All 12 past episodes and all future recordings will be made available on https://t.co/pRCzg9WgIv .
— Rik Van Bruggen (@rvanbruggen) Mar 25, 2024
from Twitter https://twitter.com/rvanbruggen
March 25, 2024 at 04:01PM
via IFTTT
Sunday 24 March 2024
Saturday 23 March 2024
Friday 22 March 2024
I just posted this on Twitter as @ rvanbruggen
@hopsworks @alafroiskiotos And of course also https://t.co/F2QRZNPn7r
— Rik Van Bruggen (@rvanbruggen) Mar 22, 2024
from Twitter https://twitter.com/rvanbruggen
March 22, 2024 at 03:02PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
@hopsworks @alafroiskiotos Warmly recommend that you read some of @alafroiskiotos' other posts: https://t.co/Da5OQpYZpF
— Rik Van Bruggen (@rvanbruggen) Mar 22, 2024
from Twitter https://twitter.com/rvanbruggen
March 22, 2024 at 03:02PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
Here we are again! Publishing another @hopsworks #5minuteinterview with my colleague @alafroiskiotos - talking about his passion for #distributedsystems and how an #ml #featurestore implements that. See https://t.co/YKm1J3Kd4w
— Rik Van Bruggen (@rvanbruggen) Mar 22, 2024
from Twitter https://twitter.com/rvanbruggen
March 22, 2024 at 03:02PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
RT @jim_dowling: Now that Redis open-source is dead, look no further than https://t.co/Uds6edncw4 for an open-source in-memory database. Bu…
— Rik Van Bruggen (@rvanbruggen) Mar 22, 2024
from Twitter https://twitter.com/rvanbruggen
March 22, 2024 at 07:20AM
via IFTTT
Thursday 21 March 2024
I just posted this on Twitter as @ rvanbruggen
I posted about the synergies between @neo4j and @hopsworks before - but now I would like to show you! Take a look at https://t.co/4Uxka2WBBl
— Rik Van Bruggen (@rvanbruggen) Mar 21, 2024
from Twitter https://twitter.com/rvanbruggen
March 21, 2024 at 10:01PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
On my way to the #MLOps #Belgium #meetup in sunny #Ghent - looking forward! https://t.co/gnviCuhMxM
— Rik Van Bruggen (@rvanbruggen) Mar 21, 2024
from Twitter https://twitter.com/rvanbruggen
March 21, 2024 at 05:10PM
via IFTTT
Wednesday 20 March 2024
I just posted this on Twitter as @ rvanbruggen
Next week, the #graphdb #meetup in #amsterdam is kind enough to host another session in the @Xebia offices: https://t.co/VCB0qbfpdy - and I am planning to show some very cool integrations between @neo4j and @hopsworks . Looking forward!
— Rik Van Bruggen (@rvanbruggen) Mar 20, 2024
from Twitter https://twitter.com/rvanbruggen
March 20, 2024 at 04:02PM
via IFTTT
Church
Tuesday 19 March 2024
Love
Monday 18 March 2024
Carwash
Sunday 17 March 2024
White
Saturday 16 March 2024
Rocket
Friday 15 March 2024
I just posted this on Twitter as @ rvanbruggen
It's amazing what this @hopsworks release enables. Now you can create #LLM applications, on your *private* data, in the #cloud or #onprem, at a significantly lower cost. Totally disruptive. https://t.co/5fgitAGUOA
— Rik Van Bruggen (@rvanbruggen) Mar 15, 2024
from Twitter https://twitter.com/rvanbruggen
March 15, 2024 at 11:26AM
via IFTTT
Ghosts
I just posted this on Twitter as @ rvanbruggen
RT @hopsworks: In honour of the EU passing the AI Act yesterday, we're resharing this piece highlighting how Hopsworks provide guardrails f…
— Rik Van Bruggen (@rvanbruggen) Mar 15, 2024
from Twitter https://twitter.com/rvanbruggen
March 15, 2024 at 07:41AM
via IFTTT
Thursday 14 March 2024
I just posted this on Twitter as @ rvanbruggen
Loved the @neo4j #graphsummit yesterday - lots of friends old and new, looking forward to collaborating on synergies with @hopsworks . Maybe you can join us at the #graphdb #meetup in #amsterdam in a few weeks? https://t.co/VCB0qbfpdy
— Rik Van Bruggen (@rvanbruggen) Mar 14, 2024
from Twitter https://twitter.com/rvanbruggen
March 14, 2024 at 09:01PM
via IFTTT
Petrol
Wednesday 13 March 2024
I just posted this on Twitter as @ rvanbruggen
Even more impressive than the chapel in 2019! https://t.co/bnYQKogMJY
— Rik Van Bruggen (@rvanbruggen) Mar 13, 2024
from Twitter https://twitter.com/rvanbruggen
March 13, 2024 at 09:36AM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
RT @hopsworks: Here's the latest episode in our series of @Hopsworks #5minuteinterviews. But - are on a roll again.Thrilled to share this c…
— Rik Van Bruggen (@rvanbruggen) Mar 13, 2024
from Twitter https://twitter.com/rvanbruggen
March 13, 2024 at 09:32AM
via IFTTT
Iron
Tuesday 12 March 2024
I just posted this on Twitter as @ rvanbruggen
Last chance to register for our @hopsworks event in #london on Thursday! Gonna be great - https://t.co/PinOw7Cdvf !
— Rik Van Bruggen (@rvanbruggen) Mar 12, 2024
from Twitter https://twitter.com/rvanbruggen
March 12, 2024 at 11:01PM
via IFTTT
Monday 11 March 2024
I just posted this on Twitter as @ rvanbruggen
Attending the @neo4j #graphsummit in #breda on Wednesday. Looking forward! See https://t.co/L8xO4PDjaJ to register - hook me up if you want to meet and discuss the synergies with @hopsworks !
— Rik Van Bruggen (@rvanbruggen) Mar 11, 2024
from Twitter https://twitter.com/rvanbruggen
March 11, 2024 at 04:01PM
via IFTTT
Sunday 10 March 2024
Thursday 7 March 2024
I just posted this on Twitter as @ rvanbruggen
Looking forward to this! Speaking at the #amsterdam #graphdb #meetup about the synergies between @neo4j and @hopsworks . Exciting! https://t.co/VCB0qbfpdy
— Rik Van Bruggen (@rvanbruggen) Mar 7, 2024
from Twitter https://twitter.com/rvanbruggen
March 07, 2024 at 07:30AM
via IFTTT
Wednesday 6 March 2024
I just posted this on Twitter as @ rvanbruggen
If you haven't registered for our in-person @hopsworks event in #london yet, then today's probably the best day to do it: https://t.co/PinOw7Cdvf - it's truly amazing what you can do with #llms on private data these days!
— Rik Van Bruggen (@rvanbruggen) Mar 6, 2024
from Twitter https://twitter.com/rvanbruggen
March 06, 2024 at 11:01PM
via IFTTT
Tuesday 5 March 2024
I just posted this on Twitter as @ rvanbruggen
Great news for all @hopsworks and @DeltaLakeOSS users! https://t.co/kVRTxHdcNk
— Rik Van Bruggen (@rvanbruggen) Mar 5, 2024
from Twitter https://twitter.com/rvanbruggen
March 05, 2024 at 07:43PM
via IFTTT
Sunday 3 March 2024
I just posted this on Twitter as @ rvanbruggen
It's been 3 years. Time heals all wounds, but I will never forget chasing this beautiful human being through the streets of London on our way to a @neo4j meeting. #covid sucked. https://t.co/FmBsdhfAvv
— Rik Van Bruggen (@rvanbruggen) Mar 3, 2024
from Twitter https://twitter.com/rvanbruggen
March 03, 2024 at 11:58AM
via IFTTT
Friday 1 March 2024
I just posted this on Twitter as @ rvanbruggen
RT @jim_dowling: I have been busy the last few months writing a book for O'Reilly about how to build ML systems (batch, real-time, and LLMs…
— Rik Van Bruggen (@rvanbruggen) Mar 1, 2024
from Twitter https://twitter.com/rvanbruggen
March 01, 2024 at 08:09AM
via IFTTT
Thursday 29 February 2024
I just posted this on Twitter as @ rvanbruggen
As a sidenote: I remember from my @neo4j days how hard it is to get a book like that pulled together. Especially an @OReillyMedia book. I also fondly remember banter with @jimwebber about the best #neo4j book. Never quite resolved that.
— Rik Van Bruggen (@rvanbruggen) Feb 29, 2024
from Twitter https://twitter.com/rvanbruggen
February 29, 2024 at 04:33PM
via IFTTT
I just posted this on Twitter as @ rvanbruggen
This is quite a big thing for the #ai, #ml and #llm communities out there: @jim_dowling just published the 1st chapter of his new book on building #mlsystems with a #featurestore. See https://t.co/50DWGObb4r
— Rik Van Bruggen (@rvanbruggen) Feb 29, 2024
from Twitter https://twitter.com/rvanbruggen
February 29, 2024 at 04:33PM
via IFTTT
Wednesday 28 February 2024
I just posted this on Twitter as @ rvanbruggen
Have you registered for our in-person @hopsworks #event in #London yet? Take a look at https://t.co/PinOw7Cdvf for more details.
— Rik Van Bruggen (@rvanbruggen) Feb 28, 2024
from Twitter https://twitter.com/rvanbruggen
February 28, 2024 at 11:01PM
via IFTTT
Tuesday 27 February 2024
The 3 Whys of Feature Stores for Machine Learning & AI
Start with 3 Whys
Quite a few years ago, I read a really intriguing book by Simon Sinek: Start with Why. The subtitle gives away the essence of the book: How Great Leaders Inspire Everyone to Take Action. Spoiler alert: they do so by explaining WHY something needs to get done, before explaining how and what needs to get done. It’s a very simple but, in my experience, important and intuitive way to communicate effectively with any audience. Whether you are communicating with customers, co-workers or your kids - the WHY usually paves the way for much smoother discussions and actions. Sinek talks about the Golden Circle, which outlines how starting from the inside (why) and working towards the outside (what) is an effective approach for any communication strategy.
Since I started working for Hopsworks, I have had this framework in the back of my mind, as I got to talk to many more users, customers and partners that have been adopting the amazing technology that the team has built. In these discussions, it became clear to me that there are three different “WHY” questions that we need to answer for our community, if we want to be successful in the marketplace. At the risk of misusing the golden circle visualization, I have tried to put these 3 questions in 3 concentric circles in the figure below:
As you can see, you move from the OUTER circle to the INNER circle, and you try to address the following 3 questions:
- Why would you consider using a feature store architecture in the first place? If you find enough solid reasons for doing so, you would proceed to the next “Why” question, being:
- Why would you NOT BUILD, but instead BUY a feature store for your data platform architecture? And if you find enough reasons to BUY and not build, then you would consider the last and final “Why” question, being:
- Why would you specifically choose to buy the Hopsworks feature store for your data platform architecture?
So let’s explore these three WHY questions, and their answers, in a bit more detail.
1. Why Consider a Feature Store for ML/AI?
It’s pretty clear that not everyone needs a Feature Store. A data platform like that is quite specific to ML/AI workloads, and would only realistically be required or used by organizations and teams that have quite a deep understanding of, and investment in, the relatively new fields of machine learning and artificial intelligence. If all you have in your environment is an early stage experiment with ML/AI technology, then most likely you do not yet have a need for a feature store - seems logical, right? So: what are the conditions under which you would want to consider it? What are the reasons for implementing a Feature Store in your organization? Let’s explore this!
Many of these reasons were actually outlined in an earlier article on the Hopsworks Blog, and I believe that the reasons for considering a Feature Store are accurately described there. In this overview, I would like to make the distinction between technical and non-technical (as in, business / organizational / competency-related) reasons. Let’s dig into it:
- Technical reasons for considering a feature store:
- Existing models running in production are expensive - they are hard to debug, review and upgrade, and they are bespoke systems that are difficult and costly to maintain. There’s a growing body of evidence that ML/AI systems that do NOT have a feature store architecture in the backend are simply too expensive because of that - see the other points in this list.
- Monitoring production pipelines is challenging, or impossible. The data that powers AI changes over time, and identifying when there are significant changes that require retraining your AI is not easy.
- Difficulties in managing the lifecycle of feature data, including the tracking of versions and historical changes. This is an elementary requirement for all regulated data processing environments - and a key reason why feature stores align so well with these industries’ requirements.
- Feature data is not centrally managed; it is duplicated, features are re-engineered, and generally data is not reused across the organization.
- Non-technical reasons for considering a feature store:
- Valuable models are created but once the experimentation stage is over they do not bridge the chasm to operations - the models do not consistently generate revenue or savings. This is all about getting the models to deliver value, consistently.
- No cohesive governance in the storage and use of AI assets (feature data and models), everything is done in a bespoke manner, leading to compliance risks.
- Slow time-to-market of AI models, and a general inability to provide very fresh feature data or handle real-time data for ML models, which is critical for industries like finance, retail, or logistics where real-time insights can add significant business value. This point is all about the speed with which the data science team can develop their models and bring them to life in a production environment.
- Hard to derive direct business value from the models, because they exist in isolated environments that do not directly influence business operations. This then obviously makes it much harder to justify the investments required to develop and operationalize the models.
- Slow ramp-up time when onboarding new talent into the ML teams. Sharing available AI assets is complex because operational knowledge is held by a few individuals or groups.
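To make the monitoring point in the technical list above a bit more tangible, here is a minimal sketch of what a very naive drift check on a single numeric feature could look like. This is not the Hopsworks API: the function names `drift_score` and `needs_retraining`, the sample values and the threshold of 3.0 are all assumptions made up for illustration, and a real system would likely use more robust statistics.

```python
from statistics import mean, stdev


def drift_score(reference, recent):
    """Shift of the recent mean away from the reference mean,
    expressed in reference standard deviations (hypothetical metric)."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference values have no variance")
    return abs(mean(recent) - mean(reference)) / ref_std


def needs_retraining(reference, recent, threshold=3.0):
    """Flag the feature when the shift exceeds an (arbitrary) threshold."""
    return drift_score(reference, recent) > threshold


# Made-up values of one feature: training-time baseline vs. two recent windows.
reference = [10.0, 11.0, 9.0, 10.5, 9.5]
stable = [10.2, 9.8, 10.1]
shifted = [15.0, 16.0, 15.5]

print(needs_retraining(reference, stable))   # recent window close to the baseline
print(needs_retraining(reference, shifted))  # recent window far from the baseline
```

Even this toy version needs a stored baseline per feature and a place to run the comparison on every new batch, which is exactly the kind of plumbing that gets painful without a central place for feature data.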
So now we know and understand why an organization requires a feature store - great! But that does not necessarily mean that they will actually go out to look for one in the marketplace! Many organizations, especially the “digital natives” that are tuned in to the latest technology trends (like ML/AI) nowadays have a tendency to at least consider building a software component themselves - instead of buying one. This is a good and worthwhile consideration, as it seems clear to me that there is a minimum of scale and maturity required before wanting to go “all-in” on this brand new technology. For many people, a homegrown solution might be “good enough”.
So how do we decide whether a roll-your-own solution is good enough? Let’s consider some criteria.
2. Why NOT BUILD, but BUY a Feature Store?
In the second layer of the diagram below, we consider some of the reasons / criteria that would warrant looking at the BUY option instead of the BUILD option. Some of these reasons have also been covered in a previous article, but let's revisit them here.
The most common reasons for buying and not building a feature store are:
- Maintenance Burden & Total Cost of Ownership (TCO): clearly, this is something that every mature IT organization will consider. Ultimately, this is related to the technical debt that the organization is willing to incur, given the significant costs that could be associated with it down the line. It’s important to consider not just the short term, but also the longer term implications of a build vs. buy decision.
- Technical complexity: clearly, a piece of infrastructure software like a Feature Store, which will underpin all ML/AI applications that the organization would choose to develop, has a significant amount of technical complexity associated with it. It’s important to consider this, and to investigate the most crucial domains in which a “build” approach could encounter unexpected technical challenges.
- Offline / Online sync: one of the key characteristics of a feature store is that it contains both the historical data of a feature dataset and the most recent values. Both have their use and purpose, and need to be kept in sync inside the feature repository. Feature Stores like Hopsworks do this for you, but in a “build” scenario you would need to take this into account and do all the ETL data lifting yourself.
- Reporting and search: in any large machine learning system where you have dozens/hundreds/thousands of models in production, you would want and need the feature data to be findable, accessible, interoperable, and reusable - according to the so-called “F.A.I.R.” principles that we have described in this post. This seems easy - but if you consider all of the different combinations that you could have between versions of datasets, pipelines and models, it is clear that this is not a trivial engineering assignment.
- Metadata for versioning and lineage: similar to the previous point, a larger ML/AI platform that is hosting a larger number of models will need metadata for its online and offline datasets, and will need to accurately keep track of the versions and lineage of the data. This will increasingly become a requirement, as governance for ML/AI systems will cease to be optional. Implementation of, and compliance with, the EU AI Act will simply mandate this - and the complexity around implementing it at scale is significant.
- Time-travel and point-in-time correct (PITC) joins: if we want to make the predictive results of our ML/AI systems explainable, we will need to be able to offer so-called “time travel” capabilities. This means that we can look at how a particular model yielded specific results based on the inputs that it received at a specific point in time. Feature Stores will need to offer this capability, on top of the requirement to guarantee that the models yield accurate and correct information at a given point in time - something we call “Point-in-time correctness”. Again, the technical complexity of implementing this yourself is not to be underestimated.
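To give a feel for what the offline / online sync and point-in-time points above mean in a “build” scenario, here is a toy, in-memory sketch. It is emphatically not how Hopsworks implements this: the class `TinyFeatureStore`, its methods, the helper `point_in_time_join`, the integer timestamps and the sample data are all hypothetical names and values invented for this illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class TinyFeatureStore:
    """Toy store: full history (offline) plus latest value per entity (online)."""

    def __init__(self):
        self._times = defaultdict(list)   # entity_id -> sorted event times
        self._values = defaultdict(list)  # entity_id -> values, parallel to _times
        self._online = {}                 # entity_id -> value with the latest event time

    def write(self, entity_id, event_time, value):
        """Write once, update both views, so offline and online cannot diverge."""
        times = self._times[entity_id]
        idx = bisect_right(times, event_time)
        times.insert(idx, event_time)
        self._values[entity_id].insert(idx, value)
        # Late-arriving data must not overwrite a newer online value.
        self._online[entity_id] = self._values[entity_id][-1]

    def get_online(self, entity_id):
        """Most recent value, as used when serving a prediction."""
        return self._online.get(entity_id)

    def get_as_of(self, entity_id, as_of):
        """Time travel: latest value with event_time <= as_of, or None."""
        times = self._times.get(entity_id, [])
        idx = bisect_right(times, as_of)
        return self._values[entity_id][idx - 1] if idx else None


def point_in_time_join(store, labels):
    """For each (entity_id, label_time, label), attach the feature value that was
    known at label_time, so no future data leaks into the training set."""
    return [(e, t, store.get_as_of(e, t), y) for e, t, y in labels]


store = TinyFeatureStore()
store.write("cust_1", 1, 100.0)
store.write("cust_1", 5, 250.0)
store.write("cust_1", 3, 180.0)  # arrives late, but belongs between t=1 and t=5

labels = [("cust_1", 2, 0), ("cust_1", 4, 1), ("cust_1", 0, 0)]
training_rows = point_in_time_join(store, labels)
print(store.get_online("cust_1"))
print(training_rows)
```

The sketch fits in one process and one dictionary. Doing the same across a data lake and a low-latency database, with versioning and lineage on top, is where the real engineering effort of a homegrown feature store goes.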
With that, we hope to have outlined some of the key reasons why you should consider buying, not building, your feature store solution. At the end of the day this is a strategic decision that will be different for every organization - as long as the question is honestly asked and answered.
3. Why buy the Hopsworks Feature Store?
Last but not least, we would also like to offer a perspective on why Hopsworks might be the best choice for your environment to those readers who have a) first decided that they need a feature store, and b) also decided that they want to buy such a critical piece of infrastructure and not build it themselves. In line with the previous “Golden Circle” visuals, we now get to the “inner” circle of the diagram:
Obviously we are conscious of the poor readability of the diagram, so here’s a cut-out that is a bit more readable:
As you can see, we think that there are essentially 4 main reasons why the Hopsworks Feature Store solution could be the best possible fit for your environment. Let’s discuss each of these briefly:
- Performance and HA: Hopsworks has been working on the Feature Store for a number of years, with a top team of academic and industry specialists. We have integrated and embedded the best technologies on the market, such as RonDB, and have shown that this currently gives us unparalleled performance. Take a look at these open benchmarks for yourself, and you will see that Hopsworks is in a league of its own with regard to performance. On top of that, we have been leveraging expertise in high availability (HA) to develop a feature store solution that can withstand the most demanding workloads.
- Flexible deployment (serverless / cloud / on-prem): Hopsworks is the only solution on the market that lets you choose the deployment option best suited to your specific environment. You can start small with a multi-tenant serverless environment, grow into a managed cloud deployment in your AWS / Azure / GCP account, or even repatriate the workload onto your own, on-premise hardware. No other solution offers this, today.
- Governance and compliance: Hopsworks has taken great pains to build industry-leading governance capabilities into the product. Versioning, lineage, time travel, search, security, monitoring and reporting - all of the advanced functionalities that a compliant solution will be required to deliver, now and in the future.
- Value for money, TCO: Hopsworks believes that in order for ML/AI to be successful, it needs to deliver value, and it needs to offer its users a clear Return on Investment. That means that the solution needs to be available at a reasonable price, and that consumption-based metrics cannot always be used for billing. We need to allow for testing, training, experimentation, learning and development - without requiring the customer to empty their pockets from day one, and all the while managing the total cost of ownership of the solution.
All the best
Rik