An Exciting POC With Layer Just Started
We have started a new journey together with the Layer team. But first, let me summarise what we are doing and why.
As SmartIQ, all our services depend on information extracted from a very large vehicle dataset (around 20M rows) with the help of machine learning algorithms. We are trying to develop the best #vehicle #price #prediction service based on our data. This requires a lot of effort if you want to hit the mark precisely and change people's habits around this topic.
Precise prediction becomes even harder in a very dynamic market like Turkey, where all your data models are directly affected by external factors such as currency rates, inflation, and new car prices.
Even though we have had a very compact data science team to date, we have managed to solve our problems with clever approaches. But every day our engineers and data scientists face a new problem or find an area where we can improve our accuracy.
On the other hand, our customers' usage has started to shift from a decision-support system towards an operational system.
This increases the pressure on us whenever we want to try a new ML algorithm, since we need to test its impact across a very large set of car variations.
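To make this concrete, here is a rough sketch of what testing a candidate model's impact across a variation set could look like. This is purely illustrative: `predict_old`, `predict_new`, and `impact_report` are hypothetical stand-ins, not our actual pricing models or Layer's API.

```python
# Hypothetical sketch: regression-testing a candidate pricing model
# against the current one across a set of car variations before rollout.

def predict_old(car):
    # Stand-in for the current production pricing model.
    return 100.0 + 5.0 * car["age"]

def predict_new(car):
    # Stand-in for the candidate pricing model.
    return 98.0 + 5.5 * car["age"]

def impact_report(cars, tolerance=0.05):
    """Flag variations whose predicted price shifts by more than `tolerance`."""
    flagged = []
    for car in cars:
        old, new = predict_old(car), predict_new(car)
        if abs(new - old) / old > tolerance:
            flagged.append((car, old, new))
    return flagged

# With ~95,000 brand & model combinations, this loop is what makes
# every algorithm change expensive to validate.
variations = [{"age": a} for a in range(10)]
print(len(impact_report(variations)))
```

In practice the variation set is far larger and the tolerance would be tuned per segment, but the shape of the problem is the same: every new model must be diffed against the old one before it touches production prices.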
As business owners, we also want visibility into what is going on within the system, such as which dataset was used to train the current model, and how quickly we can change the model or the training dataset when we need to.
With Layer, you do not need to learn a new UI; all you need to do is prepare your config files and push them to Layer. (You can enjoy your coffee while Layer handles all the tasks you have given it.)
We all have some idea of what DevOps means, but what the heck is #MLOps? For us, it covers:
- Creating data pipelines and being able to reuse them
- Creating a model catalog
- Creating a dataset catalog
- Versioning the dataset and machine learning models
- Managing the hardware resources required to train the models
- Data source & quality management
- Data Science team collaboration
- Continuous testing of #ML #models
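Several of the items above boil down to treating datasets and models as versioned, catalogued artifacts. The sketch below illustrates the idea in plain Python; the `DatasetCatalog` class and its methods are hypothetical, not Layer's actual API.

```python
import hashlib
import json

class DatasetCatalog:
    """Toy dataset catalog: each snapshot gets a content-derived version."""

    def __init__(self):
        # name -> list of (version_hash, metadata) in registration order
        self.entries = {}

    def register(self, name, rows, metadata=None):
        # Hash the canonical JSON form so identical content always
        # produces the same version identifier.
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        version = hashlib.sha256(payload).hexdigest()[:12]
        self.entries.setdefault(name, []).append((version, metadata or {}))
        return version

    def versions(self, name):
        return [v for v, _ in self.entries.get(name, [])]

catalog = DatasetCatalog()
v1 = catalog.register("vehicle_prices", [{"model": "X", "price": 100}])
v2 = catalog.register("vehicle_prices", [{"model": "X", "price": 110}],
                      metadata={"trained_model": "facelift-detector"})
print(v1 != v2)  # different content yields a different version
```

Linking each model to the dataset version it was trained on (the `metadata` field here) is what gives you the lineage and visibility described above.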
Our Use-Case: Facelift Detection and Car Build Year Detection
This is one of the most resource-intensive problems we have been dealing with: determining from images whether a car has had a facelift, and estimating the car's production year.
Hint: A facelift directly affects the output of our vehicle pricing API; you can imagine how critical it is for us, and for you. If you need more info about facelifts in the automotive world, you can check here.
Reasons for selecting Facelift Detection:
- A facelift directly affects our vehicle pricing results
- Model training needs to be pushed to a GPU farm, which is resource-intensive
- The model training data is very large
- The model can change frequently
- We need a dynamic pipeline where we can reuse some parts
- We want Layer to handle all the GPU resources and scaling needs in the background
- We need visibility into which #MLModel links to which #dataset
- We need to be able to understand the resulting impact of each model
As a first step, the Layer team created a prototype in their environment for a specific brand.
(During the POC we will start with YOLOv3 and try different models.) Below you can see the approach:
After creating the Layer artifacts, we will get access to the environment and expand the use case to all brand & model combinations (around 95,000), train our models, extract the #facelift information, and push it back to our #vehicle #data #catalog. The project configuration will look like the one below.
That is all for now. I will share more of our experience in upcoming posts.