Estimate Weight From a Photo Using Visual Regression in Edge Impulse

From the kitchen to the factory floor, it’s often helpful to know the weights of the objects you are working with. For example, a baker may need to add specific weights of ingredients to their batter to create the perfect cake.

Many of our most useful appliances could become even more helpful if they could estimate the weights of objects. For example, a smart oven could adjust the temperature and cook time based on the weight of the vegetables or meat placed inside, making sure it gets hot enough to kill off harmful bacteria. A washing machine or dryer could adjust cycle time and intensity based on the mass of a load of laundry. And agricultural equipment with the ability to measure mass could help farmers harvest food at exactly the right moment.

It’s not always practical—or possible—to integrate weighing scales into a piece of equipment. But what if we could estimate the mass of objects using other information, such as visual input from a camera? We could combine this visual input with a machine learning algorithm to predict the mass of an object.

Regression models

You can easily train this type of model in Edge Impulse using a technique known as regression. Regression models can take any kind of input and return a numeric output—so they can learn to take in an image and output a predicted weight.

For this project, I used a common household item as the subject of my images: rice. I used a total of 2 cups (approximately 400 grams) to train the model. The goal was a model that predicts the weight of a pile of rice, in grams, with an error of 10% or less.

Once trained, the model reached an accuracy of 99.02% on the Testing dataset. It also transferred well to the real world, with nearly perfect accuracy when I tested it on new piles of rice.
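If you want to check your own model against a target like "10% error or less", one simple approach is to compare its predictions with readings from a kitchen scale. The Python sketch below is only an illustration: the function names and the sample readings are made up, and it is not necessarily how Edge Impulse computes the accuracy figure it reports.

```python
def percent_error(predicted_g, actual_g):
    """Absolute error as a percentage of the true weight."""
    if actual_g == 0:
        # Percentage error is undefined for an empty surface (0 g).
        raise ValueError("actual weight must be non-zero")
    return abs(predicted_g - actual_g) / actual_g * 100.0


def fraction_within_tolerance(pairs, tolerance_pct=10.0):
    """Share of (predicted, actual) pairs whose error is within the tolerance."""
    hits = sum(percent_error(p, a) <= tolerance_pct for p, a in pairs)
    return hits / len(pairs)


# Hypothetical readings in grams: (model prediction, scale reading).
readings = [(195.0, 200.0), (310.0, 300.0), (130.0, 100.0), (52.0, 50.0)]

for predicted, actual in readings:
    error = percent_error(predicted, actual)
    print(f"{actual:.0f} g actual, {predicted:.0f} g predicted: {error:.1f}% error")

print(fraction_within_tolerance(readings))  # 0.75: three of four are within 10%
```

Percentage error is relative to the true weight, so small piles are the hardest to get right: being off by 5 grams is a 2.5% error at 200 grams but a 50% error at 10 grams.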

Tips and tricks for visual regression

The main lesson from this project is that you will need a large amount of data to get the model performing accurately. To predict the mass of a pile of rice, I collected 50 images at every 10-gram step up to 400 grams, for a total of 2050 images. That total works out to 41 weight levels, which means the empty 0-gram case is counted as a level too. Each image in your dataset should be labelled with the weight it represents.
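The dataset arithmetic can be checked with a few lines of Python. This is a sketch under assumptions: the file naming scheme is hypothetical, and starting at 0 grams is inferred from the 2050-image total rather than stated outright.

```python
STEP_G = 10            # weight increment between levels, in grams
MAX_G = 400            # heaviest pile, in grams
IMAGES_PER_LEVEL = 50  # photos taken at each weight level

# 0 g (no rice) through 400 g inclusive gives 41 weight levels.
weight_levels = list(range(0, MAX_G + STEP_G, STEP_G))

# Hypothetical naming scheme "<weight>g_<index>.jpg", paired with a numeric label.
manifest = [
    (f"{weight}g_{index:02d}.jpg", float(weight))
    for weight in weight_levels
    for index in range(IMAGES_PER_LEVEL)
]

print(len(weight_levels), len(manifest))  # 41 2050
print(manifest[0], manifest[-1])
```

Keeping the label as a number rather than a category name is what lets a regression model learn that 210 grams sits between 200 and 220.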

When collecting the data, keep the camera angle and the distance between the camera and the subject consistent. Otherwise the same pile can appear larger or smaller in the frame, which makes its weight harder to judge from the image alone.

Training a visual regression model

To train a visual regression model, first collect a dataset following the tips above. Then create your Impulse with an Image preprocessing block and a Regression block.

You should also choose an appropriate input size based on the content of your images. I started with a 96 by 96 input size, but my lighting conditions at home made the grains of rice very blurry, so for my project I had to use 160 by 160. The best input size will depend on your specific dataset, so feel free to play around!

The Image block also lets you decide whether to use RGB or grayscale inputs. I found that RGB worked best for my dataset, since in grayscale images the grains of rice would somewhat merge into the background. Again, this is extremely dependent on your dataset, so play around until you get the best result possible!
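Input size and color depth both change how much data the model has to process for every image. The short Python calculation below counts the raw input values for the combinations mentioned above. It only counts values; actual memory use and inference time depend on the model and the target device.

```python
def input_values(width, height, channels):
    """Number of raw values fed to the model for one image."""
    return width * height * channels


# Grayscale has 1 channel, RGB has 3.
options = {
    "96x96 grayscale": (96, 96, 1),
    "96x96 RGB": (96, 96, 3),
    "160x160 grayscale": (160, 160, 1),
    "160x160 RGB": (160, 160, 3),
}

for name, (width, height, channels) in options.items():
    print(f"{name}: {input_values(width, height, channels)} values")
```

A 160 by 160 RGB image carries 76,800 values, a little over eight times the 9,216 values of a 96 by 96 grayscale image, so it is worth checking whether the smaller options work for your dataset before settling on the larger one.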

In the Regression block you can start with the default architecture. Your model will accept an image as its input, but it will output a single scalar value representing the predicted weight.
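To make "image in, single number out" concrete, here is a deliberately tiny Python sketch. It is not the Edge Impulse Regression block: it swaps the neural network for one hand-crafted feature (the fraction of bright pixels) and an ordinary least-squares line, and it trains on synthetic 4 by 4 images. Every name in it, and the assumed 25 grams per bright pixel, is made up for illustration.

```python
def bright_fraction(image, threshold=128):
    """Fraction of pixels brighter than the threshold in a grayscale image."""
    pixels = [p for row in image for p in row]
    return sum(p > threshold for p in pixels) / len(pixels)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def make_image(n_bright, size=4):
    """Synthetic size x size image with n_bright white pixels on black."""
    flat = [255] * n_bright + [0] * (size * size - n_bright)
    return [flat[i * size:(i + 1) * size] for i in range(size)]


# Synthetic training set: each bright "rice" pixel is assumed to weigh 25 g.
training = [(make_image(k), 25.0 * k) for k in (0, 4, 8, 12, 16)]
features = [bright_fraction(image) for image, _ in training]
weights = [weight for _, weight in training]
slope, intercept = fit_line(features, weights)


def predict_weight(image):
    """Map an image to a single scalar: the predicted weight in grams."""
    return slope * bright_fraction(image) + intercept


print(predict_weight(make_image(10)))  # 250.0
```

The real model learns its own features from the pixels instead of relying on a brightness threshold, but the shape of the problem is the same: the last step produces one continuous number rather than a set of class scores.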

Depending on your dataset, you may need to experiment with different model architectures to ensure that your model converges properly and trains well. For my dataset, the default architecture, with some additional epochs, gave the best accuracy for the least training time.

Since we’ve trained the model in Edge Impulse, we can now deploy it to nearly any embedded device, from microcontroller-class boards like the Sony Spresense (Cortex-M4) or Himax WE-I Plus to a powerful embedded Linux board like the Raspberry Pi 4 or NVIDIA Jetson Nano.

Next steps

You’ve now had a quick overview of training a simple visual regression model. Thank you for reading this blog post! If you’d like to take a deeper look at the project from this article you can access the public project here:

https://studio.edgeimpulse.com/studio/40106

The best way to learn more is to train your own regression model. Give it a try and let us know how it goes on the Edge Impulse Forum, or via Twitter (@edgeimpulse) or LinkedIn.

For general background on regression in Edge Impulse, take a look at Predict the Future with Regression Models. And if you have any questions, don’t hesitate to ask on the Edge Impulse forum.


Aditya Mangalampalli is a machine learning intern at Edge Impulse.
