Feast is an open source feature store for machine learning.
At this point you might be asking: why are we thinking about this project? Is it dead on arrival? Instead of answering this directly, we’ll talk through some of the design goals in Feast, and where I perceive the gaps and deficiencies to be that we should think about.
Feature Store Design
In Feast, the feature store is described as three “layers”. It is presented as
If we place this back into our lambda architecture diagram, we can begin to see some patterns
Okay, so based on this model, what does Feast presume? As alluded to, one approach is to simplify the assumptions. In fact, Feast uses a pull model to satisfy the demands of both real-time and batch processing in the serving layer of the infrastructure!
Implicit in other aspects of Feast is that the incoming data is already appropriately modelled and suitable for ingestion. If the data being provided isn’t denormalized, unfortunately Feast can’t really “deal” with this input. This is made more explicit when we consider the Lambda Architecture diagram.
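To make the normalized versus denormalized distinction concrete, here is a minimal sketch using only the standard library. The `customers` and `orders` tables, their columns, and the `denormalize` helper are all hypothetical names made up for illustration; none of this is Feast’s API. It only shows the kind of flattening step that has to happen somewhere upstream before a feature store can ingest one row per entity.

```python
import sqlite3

# Hypothetical normalized source data: customers and their orders live
# in separate tables, linked by customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'AU'), (2, 'NZ');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 5.0);
""")


def denormalize(conn):
    """Flatten the two tables into one feature row per customer."""
    return conn.execute("""
        SELECT c.customer_id,
               c.country,
               COUNT(o.order_id)          AS order_count,
               COALESCE(SUM(o.amount), 0) AS total_spend
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.country
        ORDER BY c.customer_id
    """).fetchall()


feature_rows = denormalize(conn)
for row in feature_rows:
    print(row)
```

Each resulting row is keyed by a single entity (the customer) with its features alongside, which is the shape the ingestion step is assumed to receive.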
What are the Gaps
One of the key goals of Project Maquette is to create something truly end-to-end without fancy infrastructure. Even with Feast there are still gaps, notably the lack of support for a normalized data store (we’ve assumed that the data being ingested is already denormalized).
What should our software applications consist of?
To start off with, let’s see how far we can get using standard libraries only. This would mean that we will rely on:
sqlite3
For the serving and storage of data. For analytics at a later stage we may make use of Pandas, and although serving data may be easier with a web framework, we’ll also consider how far we can get without those dependencies.
After all, wouldn’t it be interesting if we could serve a working feature store using out of the box Python?
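As a rough idea of what that could look like, here is a minimal sketch of a feature store backed by nothing but `sqlite3`. The `feature_values` table, the function names, and the choice of ISO-8601 strings for timestamps are all assumptions made for illustration, not a settled design. The same query serves both the “online” case (latest value per feature) and the “historical” case (value as of a given time).

```python
import sqlite3


def create_store(path=":memory:"):
    """Open (or create) a store with one row per entity/feature/timestamp."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS feature_values (
            entity_id    TEXT NOT NULL,
            feature_name TEXT NOT NULL,
            value        REAL NOT NULL,
            event_ts     TEXT NOT NULL,  -- ISO-8601, so string order = time order
            PRIMARY KEY (entity_id, feature_name, event_ts)
        )
    """)
    return conn


def write_feature(conn, entity_id, feature_name, value, event_ts):
    """Record a feature value observed at event_ts."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO feature_values VALUES (?, ?, ?, ?)",
            (entity_id, feature_name, value, event_ts),
        )


def get_features(conn, entity_id, as_of="9999-12-31T23:59:59"):
    """Return {feature_name: value} using the newest row at or before as_of."""
    rows = conn.execute("""
        SELECT fv.feature_name, fv.value
        FROM feature_values AS fv
        WHERE fv.entity_id = ?
          AND fv.event_ts = (
              SELECT MAX(prev.event_ts)
              FROM feature_values AS prev
              WHERE prev.entity_id = fv.entity_id
                AND prev.feature_name = fv.feature_name
                AND prev.event_ts <= ?
          )
    """, (entity_id, as_of)).fetchall()
    return dict(rows)


store = create_store()
write_feature(store, "driver_1", "rating", 4.8, "2021-01-01T08:00:00")
write_feature(store, "driver_1", "trips_today", 3.0, "2021-01-01T09:00:00")
write_feature(store, "driver_1", "trips_today", 5.0, "2021-01-01T12:00:00")

latest = get_features(store, "driver_1")
morning = get_features(store, "driver_1", as_of="2021-01-01T10:00:00")
print(latest)
print(morning)
```

Keeping every timestamped value rather than overwriting in place is what lets one table answer both kinds of request; the trade-off is that the table grows with every write, which a real system would need to manage.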