Data Collection Planning

Before you build an application, create a data collection plan. The plan defines which metadata you will capture and which types of events you want to detect.

When starting a new application, we recommend first collecting a small amount of training data and validating your results in a proof of concept. Once you have validated this initial training set, collect a larger data set using what you learned during the proof of concept.

Determining your metadata

Metadata are custom properties that you save with the files you capture so that you can later filter your sensor data by characteristics of those files. Metadata properties are normally attributes of the subject or object you are recording. Planning them up front matters because metadata is what lets you split or exclude data later. Here are a couple of examples of when this is useful:

  1. If you are building a motor fault detection application, you can save the size of the motor you are recording. When you start to build a machine learning algorithm, you might find that you need two models to get accurate results: a small motor model and a large motor model. Because you saved the motor size as metadata, you can easily split the data for the two models.

  2. You could save the subject ID or motor ID as metadata. This ID lets you exclude certain subjects or motors if you find that their data was not recorded correctly, or that one subject or object is an extreme outlier from the rest of your data.
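
The two examples above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the SensiML API: the file names, the `motor_size` and `motor_id` fields, and the helper functions `exclude_ids` and `group_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical capture records: file names and metadata fields are
# illustrative only and do not come from any real project.
captures = [
    {"file": "capture_001.csv", "motor_size": "small", "motor_id": "M01"},
    {"file": "capture_002.csv", "motor_size": "large", "motor_id": "M02"},
    {"file": "capture_003.csv", "motor_size": "small", "motor_id": "M03"},
    {"file": "capture_004.csv", "motor_size": "large", "motor_id": "M04"},
]


def exclude_ids(records, key, bad_ids):
    """Drop records whose metadata value for `key` is in `bad_ids`."""
    return [r for r in records if r[key] not in bad_ids]


def group_by(records, key):
    """Group records by a metadata property, e.g. one group per model."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    return dict(groups)


# Assume motor M03 turned out to be badly recorded or an extreme outlier.
usable = exclude_ids(captures, "motor_id", {"M03"})

# Split the remaining files into one training set per motor size.
by_size = group_by(usable, "motor_size")
small_files = [r["file"] for r in by_size["small"]]
large_files = [r["file"] for r in by_size["large"]]
```

Neither operation would be possible if the motor size and motor ID had not been recorded at capture time, which is why the metadata plan comes before data collection.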

Determining events of interest

Events of interest are the main goal of your application: they are the events you want your sensor application to detect.

Once you have determined your events of interest, you can capture them and train a SensiML Knowledge Pack with a machine learning algorithm that detects those events.
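
As a rough sketch of how a list of events of interest might be checked against a small proof-of-concept data set, the following plain Python counts labeled segments per event and reports any event that has not been captured yet. The event names, the segment list, and the helpers `label_counts` and `missing_events` are assumptions made for illustration; they are not part of the SensiML tooling.

```python
from collections import Counter

# Hypothetical events of interest for a motor fault application.
EVENTS_OF_INTEREST = ["normal", "bearing_fault", "imbalance"]

# Hypothetical labeled segments from a pilot data set: (file, label).
segments = [
    ("capture_001.csv", "normal"),
    ("capture_001.csv", "bearing_fault"),
    ("capture_002.csv", "normal"),
    ("capture_002.csv", "normal"),
]


def label_counts(segments, events):
    """Return the number of labeled segments for each planned event."""
    counts = Counter(label for _, label in segments)
    return {event: counts.get(event, 0) for event in events}


def missing_events(segments, events, min_count=1):
    """List planned events with fewer than `min_count` labeled segments."""
    counts = label_counts(segments, events)
    return [event for event in events if counts[event] < min_count]


counts = label_counts(segments, EVENTS_OF_INTEREST)
missing = missing_events(segments, EVENTS_OF_INTEREST)
```

A check like this can show which events still need to be captured before you move from the proof of concept to a larger collection effort.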