Applied Machine Learning Life Cycle for Computer Vision Tasks


Machine learning projects are on everyone's lips, but from customer projects we know that the implementation of AI projects is a mystery to many. That's why we will show you what the life cycle of our machine learning projects looks like in a series of blog posts. Our target audience for this series is project managers, engineers, decision makers, and everyone else planning an AI project.

In this first part of our series we're going to briefly touch upon the individual project phases. We're also going to discuss special challenges we face in the field of computer vision. In later blog posts we will take a closer look at each project phase.

Machine learning projects start like any other technology project. There is a problem or a need, and we begin to explore the task and discuss possible approaches to solve it. However, the execution of AI projects is fundamentally different from traditional technology ventures, because they are more iterative and exploratory in nature. This is why every machine learning project is carried out in a life cycle process.

Rule of Thumb: Start Small, Fail Fast

Machine learning projects always involve a high degree of uncertainty in terms of workload and result quality. To minimize the risk and investment for our customers we strictly follow the start small, fail fast philosophy. This means we build a feature-complete system with the least possible effort to get fast feedback on whether the model and the available data play well together. Then we improve the data and model in iterations (one iteration is a complete life cycle run) to raise the result quality to the needed level.

Let's look at the life cycle phases.

1. Data Collection

Machine learning models should solve a given problem on the basis of data. Therefore everything starts with collecting enough samples with proper metadata.

Quality, quantity, and balance of the data are the decisive points in data collection. The more data we have, and the better its quality and balance, the better the model will learn and the more accurately it will predict.

The quality of the samples is important because wrong or misleading samples or metadata (called noisy data) will confuse the model and dramatically lower the prediction quality. We can improve the quality with data cleaning (phase 2 of the life cycle).

Having balanced data means having roughly the same amount of training data for each class. Unbalanced training data can lead to biased models because the classes are not represented equally.
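A quick way to check balance before training is to count the samples per class. The following is a minimal sketch, not our production tooling; the function names and the example labels are made up for illustration.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the data set as a fraction of all samples."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def imbalance_ratio(labels):
    """Size of the largest class divided by the smallest (1.0 means perfectly balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical labels, for illustration only.
labels = ["cat"] * 80 + ["dog"] * 15 + ["bird"] * 5
shares = class_balance(labels)
ratio = imbalance_ratio(labels)
```

A high ratio like the one in this example would suggest collecting, synthesizing, or augmenting more samples for the underrepresented classes.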

In computer vision projects we often face a lack of training data (correctly labeled images). To increase the quantity and improve the balance of the data, we may be able to use data synthesis, that is, to create training data programmatically ourselves. This process can be very complex, and there are various methods to do it.

Another common method to create more data is called data augmentation. We create additional data by modifying existing samples, e.g. through random cropping, adding noise, changing colors or brightness.
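The augmentation steps named above can be sketched in a few lines. To stay dependency-free, this sketch represents a grayscale image as a plain list of pixel rows; a real project would more likely use an image or tensor library. All function names, the crop size, and the noise and brightness ranges are assumptions chosen for illustration.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window out of the image at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def add_noise(image, sigma, rng):
    """Add Gaussian noise to every pixel and clamp the result to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]


def change_brightness(image, factor):
    """Scale every pixel by a factor and clamp the result to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image, rng):
    """Create one modified copy of the image: crop, then noise, then brightness."""
    out = random_crop(image, 6, 6, rng)
    out = add_noise(out, sigma=5.0, rng=rng)
    return change_brightness(out, rng.uniform(0.8, 1.2))


rng = random.Random(0)  # fixed seed so the run is reproducible
image = [[(x * 8 + y * 8) % 256 for x in range(8)] for y in range(8)]  # toy 8x8 image
augmented = [augment(image, rng) for _ in range(4)]
```

Each call to `augment` yields a slightly different 6x6 sample from the same source image, which is how one labeled image can turn into several training samples.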

2. Data Preparation

Once we have collected enough data, we need to bring it into a structure we can feed to the model.

We clean the data by identifying noisy, false, or misleading samples and correcting them or removing them from the training set. Additionally, we preprocess the data to normalize it. In our case this mostly means scaling or cropping images, converting them into a suitable format, and creating a folder structure we can use for training.
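As a rough illustration of the cropping and scaling step, the sketch below center-crops an image to a square, resizes it with nearest-neighbor sampling, and scales pixel values to the range 0..1. It uses plain lists of pixel rows instead of an image library, and the function names and target size are assumptions made for this example.

```python
def center_crop_square(image):
    """Crop the largest centered square out of the image."""
    h, w = len(image), len(image[0])
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


def resize_nearest(image, size):
    """Resize to size x size by picking the nearest source pixel."""
    h, w = len(image), len(image[0])
    return [[image[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def preprocess(image, size):
    """Crop, resize, and scale pixel values from 0..255 to 0..1."""
    square = center_crop_square(image)
    resized = resize_nearest(square, size)
    return [[p / 255 for p in row] for row in resized]


# Toy grayscale image with 8 rows and 12 columns.
image = [[(x * 10) % 256 for x in range(12)] for _ in range(8)]
result = preprocess(image, size=4)
```

Applying the same function to every sample gives the model inputs of identical size and value range, regardless of how the original images were shot.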

Collecting, cleaning, and preprocessing data are our biggest and most time-consuming challenges. It is not unusual to spend a major portion of the project time on these tasks.

3. Model Evaluation and Training

During model evaluation we take a closer look at different models and model architectures in order to find out which architectures work well with certain data and certain problems.

Some models work well with text, e.g. for translation or term classification. Other models work well with images, e.g. classification models, detection models, or localization models. Our experience, established best practices, and scientific research lead us to the appropriate model for the current project.

Before we start training the model, we split the data set into actual training data (the majority of the data, let's say 75%), validation data (10%), and test data (15%). The actual distribution can vary depending on the amount of data available. Training data and validation data are used for model training. The test data is used after training to evaluate the model performance on unseen data.

Figure: An example of how we split the training data, and where the data sets are used during the life cycle.
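A split with the example proportions from above (75% training, 10% validation, 15% test) could look like the following sketch. The function name, the fixed seed, and the file names are assumptions for illustration; the important points are that the data is shuffled first and that the three sets do not overlap.

```python
import random


def split_dataset(samples, train=0.75, val=0.10, seed=42):
    """Shuffle the samples and split them into training, validation, and test sets.

    Whatever is left after the training and validation shares becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Hypothetical file names standing in for labeled images.
samples = [f"img_{i:04d}.png" for i in range(200)]
train_set, val_set, test_set = split_dataset(samples)
```

For unbalanced data it may be worth splitting per class instead, so that every class appears in all three sets in similar proportions.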

Training a model in the field of computer vision is often more complex and time-consuming than training one for text-based machine learning tasks. This is because we use deep and complex models, and the data needed for these models tends to be very large, up to terabytes. Computation is therefore very time-consuming.

4. Model Validation

After finishing the training as described above, we assess the quality of the model. We work with the model to understand its behavior: which aspects are already solved very well, and which are not. By inspecting the visual data we derive the changes to the training set that are needed to optimize result quality. One such adjustment could be to collect or synthesize more data for a specific category.
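One simple way to find out which categories need more data is to compute the accuracy per class on held-out data. This is a minimal sketch with made-up labels and predictions; the function name is an assumption, and in practice we would look at more than a single metric.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return, for each true class, the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Hypothetical ground truth and model predictions.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "cat", "dog", "cat", "dog", "cat"]
accuracy = per_class_accuracy(y_true, y_pred)
weakest = min(accuracy, key=accuracy.get)  # candidate for additional data
```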

Sometimes we even have to change the model architecture, especially if we find that the model either cannot grasp the task or just memorizes the training set (under- and overfitting).
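A first, rough indication of under- or overfitting comes from comparing training and validation accuracy. The sketch below encodes that comparison; the function name and both thresholds are arbitrary assumptions for illustration and would have to be chosen per project.

```python
def diagnose_fit(train_acc, val_acc, min_acc=0.8, max_gap=0.1):
    """Classify a training run based on training and validation accuracy.

    min_acc and max_gap are illustrative thresholds, not recommended values.
    """
    if train_acc < min_acc:
        # The model does not even fit the training data: it cannot grasp the task.
        return "underfitting"
    if train_acc - val_acc > max_gap:
        # Good on training data, much worse on validation data: memorization.
        return "overfitting"
    return "ok"


print(diagnose_fit(0.55, 0.53))
print(diagnose_fit(0.99, 0.70))
print(diagnose_fit(0.92, 0.90))
```

The three calls cover a model that has not learned the task, one that has memorized the training set, and one whose training and validation results are close together.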

5. Comparison and Feedback

In this step it is time to share the progress we've made so far with our customer. We present our findings on the quality and condition of the model, and we show what worked and what did not. Good teamwork with our customer is essential here. Together, we discuss possible improvements to the model, for example gathering more data and where to get this data from. In close cooperation, we plan the next iteration of model training.

6. Deployment

The deployed version of our current model acts as the quality baseline for the following training iteration. If the model already adds value for the customer, it can be integrated into their prototype or even into production. Meanwhile, we begin the next iteration of training, and the life cycle starts again.
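Using the deployed model as a baseline boils down to a simple comparison on the same test data after each iteration. The following sketch shows the idea; the function name, the minimum gain, and the scores are hypothetical.

```python
def should_replace_baseline(baseline_score, candidate_score, min_gain=0.01):
    """Return True if the candidate beats the deployed baseline by at least min_gain.

    Both scores are assumed to be measured on the same test set with the same metric.
    """
    return candidate_score - baseline_score >= min_gain


# Hypothetical test-set accuracies of the deployed model and two new candidates.
baseline = 0.88
promote_first = should_replace_baseline(baseline, 0.91)
promote_second = should_replace_baseline(baseline, 0.885)
```

Requiring a minimum gain avoids replacing a deployed model because of differences that are within the noise of the measurement.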

Stay tuned for the next article in this series. We're going to talk about all things data: how we collect it, how we clean it, and how we preprocess it. If you need an experienced helping hand with your AI project, just get in touch with us.
