By Jason Chandralal On 30 Sep 2022
Machine learning (ML) enables embedded systems to learn automatically from existing data and to use this captured knowledge to make predictions and draw conclusions independently.
Embedded devices running machine learning applications can take on many tasks, especially at the edge, and AI-enabled IoT shipments are expected to grow rapidly in the coming years. Several new applications, such as voice assistants, have been made possible by progress in the field of neural networks. This in turn is increasing demand for processing capability at the endpoint.
What is embedded ML?
Significant recent advances in microprocessor architecture and algorithm design have made it possible to run complex machine learning workloads on even the smallest of microcontrollers. TinyML, supported by an open community and foundation, brings embedded ML to such devices. Expanding upon the definition by TinyML.org, tiny machine learning is broadly defined as a fast-growing field of machine learning technologies and applications, including hardware (dedicated integrated circuits), algorithms, and software, capable of performing on-device analysis of sensor data (vision, audio, IMU, biomedical, etc.) at extremely low power consumption, typically in the range of mW and below. This enables various always-on use cases and targets battery-operated devices. The key advantages are bandwidth, latency, cost savings, reliability, and privacy.
There are primarily three types of learning – supervised learning, unsupervised learning, and reinforcement learning.
Furthermore, there are different types of classification models to be used depending on the data you are going to process, such as images, sound, text, and numerical values. Some of them are logistic regression, decision trees, support vector machines, naive Bayes, k-nearest neighbors, and neural networks.
If you are planning to explore or evaluate using Embedded ML for your devices, there are some key points that you need to consider before you begin deployment.
An approach to Testing ML Systems:
Testing is a technique used to make sure something functions as planned. To prevent poor development, rising downstream costs, and lost time, we should implement tests and identify sources of mistakes or failure as early in the development phase as possible.
Code:
The foundation for testing any software is a framework that adheres to the Arrange-Act-Assert (and cleanup) methodology. In Python, there are many tools, such as unittest and pytest, that suit any implementation and offer a lot of built-in functionality, such as parameterization, filters, and markers, and align with coverage tools to validate many conditions at scale.
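A minimal sketch of this pattern with pytest (the scale_reading helper and its expected values are hypothetical, purely for illustration):

```python
import pytest

def scale_reading(raw: int, factor: float) -> float:
    """Hypothetical helper: scale a raw ADC sensor reading."""
    return raw * factor

# Parameterization lets one test body validate many conditions.
@pytest.mark.parametrize(
    "raw, factor, expected",
    [(0, 0.5, 0.0), (100, 0.5, 50.0), (1023, 1.0, 1023.0)],
)
def test_scale_reading(raw, factor, expected):
    # Arrange: inputs are supplied by the parameter list above.
    # Act: call the unit under test.
    result = scale_reading(raw, factor)
    # Assert: the scaled value matches the expectation.
    assert result == pytest.approx(expected)
```

With the pytest-cov plugin installed, running the same suite under pytest --cov ties these tests into coverage reporting.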
Data:
A machine learning model is trained using the data that is collected and provided to it, which makes data collection a critical factor in the machine learning process. The accuracy of a machine learning model is totally dependent on the data it is given: effective ML algorithms require accurate data to make accurate predictions and decisions in the real world.
Data can be collected from various resources or databases. Data that is collected for a particular problem in the right format is termed a dataset. As far as embedded devices are concerned, data can be collected from various sensors and actuators; different sensor readings can serve as different attributes that combine to form a dataset. A dataset is then analyzed to find patterns and make predictions.
The data that is collected is further segregated into training and test data, as in the sketch below. To validate your data, you first need to understand the use of training data and test data.
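As a brief sketch of this segregation (the file name and label column are hypothetical), scikit-learn's train_test_split is a common choice:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical tabular dataset of sensor readings with a "label" column.
df = pd.read_csv("sensor_data.csv")
X = df.drop(columns=["label"])
y = df["label"]

# Hold out 20% of the records as test data; stratify so both
# splits keep the same class balance as the full dataset.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```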
Once we define what our data should look like, we can use, expand, and adapt these expectations as our dataset grows. The key expectations to validate in data are rows/columns, individual values, and aggregate values. This data needs to be wrapped in a dataset module.
To test the accuracy or correctness of the data, the Great Expectations library can be used, which is quite efficient. It covers the common data-quality issues and, due to its flexibility in testing data, provides an upper hand in automation. For instance, if a machine learning model is being implemented to detect faulty gears, the expected dataset will have both faulty-gear and functional-gear details, and if we look at the collected data closely, there will be certain data patterns. But during collection, faulty-gear data may be stored as functional-gear data, which will in turn affect the development of the machine learning model. So it becomes important to check whether the data being collected is correct. The expected dataset can be tested by applying certain rules or conditions that check whether each collected record (faulty or functional) follows the expected pattern of faulty or functional gear data.
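A minimal sketch of such rules against the gear data, using Great Expectations' pandas-backed API (the column names, classes, and thresholds are hypothetical):

```python
import great_expectations as ge

# Wrap the collected gear data in a Great Expectations DataFrame.
df = ge.read_csv("sensor_data.csv")

# Individual values: every label must be one of the two known classes.
df.expect_column_values_to_be_in_set("label", ["faulty", "functional"])

# Rows/columns: required columns must exist and contain no nulls.
df.expect_column_to_exist("vibration_rms")
df.expect_column_values_to_not_be_null("vibration_rms")

# Aggregate values: the mean vibration should fall in a plausible range.
df.expect_column_mean_to_be_between("vibration_rms", min_value=0.1, max_value=20.0)

# Run all registered expectations; failures flag suspect data.
results = df.validate()
print(results.success)
```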
To implement automated data tests, the GE library can be used to check certain conditions, with assert functions applied at the required checkpoints. Once the expected checkpoints are created, they can be run on incoming data or on data that has already been collected. If the data – a row, a column, or an individual value – does not satisfy the defined conditions, it can be discarded. These tests can be run in pipelines via a Makefile or a workflow orchestrator like Airflow or Kubeflow Pipelines, and the Great Expectations GitHub Action can automate validation of the data pipeline. Lastly, create data documentation for the validation tests and runs; if you use the Great Expectations open-source library, the CLI supports this automatically.
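For instance, a pipeline step might trigger a pre-configured checkpoint like this (a sketch assuming a Great Expectations project has been initialized and a checkpoint named gear_checkpoint configured against the gear dataset):

```python
import sys
import great_expectations as ge

# Assumes `great_expectations init` has been run and that a
# checkpoint named "gear_checkpoint" exists in the project config.
context = ge.get_context()
result = context.run_checkpoint(checkpoint_name="gear_checkpoint")

# Fail this pipeline step if any expectation failed, so suspect
# data never reaches model training further downstream.
if not result.success:
    sys.exit(1)
```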
Models:
The final phase of testing ML systems is to test the models themselves across various phases such as training, evaluation, inference, and deployment.
Training – Here we write tests while we’re developing our training pipelines so that we can spot errors quickly. For instance, we can write tests to check the shape of the data being fed to train the model, or what percentage of the data is used for training, as sketched below.
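A small sketch of such training-pipeline tests (the 80/20 ratio and the synthetic stand-in data are hypothetical):

```python
import numpy as np
import pytest
from sklearn.model_selection import train_test_split

def make_split():
    # Synthetic stand-in for the real gear dataset.
    X = np.random.rand(500, 4)
    y = np.random.randint(0, 2, size=500)
    return train_test_split(X, y, test_size=0.2, random_state=42)

def test_split_proportion():
    X_train, X_test, _, _ = make_split()
    total = len(X_train) + len(X_test)
    # Roughly 80% of the data should be fed to training.
    assert len(X_train) / total == pytest.approx(0.8, abs=0.02)

def test_feature_shape_consistent():
    X_train, X_test, _, _ = make_split()
    # Training and test sets must expose the same feature count.
    assert X_train.shape[1] == X_test.shape[1]
```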
Evaluation/Behavior Testing – Behavioral testing can be described as testing input data against expected outputs. During these tests the model is treated as a black box, i.e. the main focus is on the data that is fed in and the output the model is expected to predict. This type of test is run on different data patterns, so these tests serve as a sanity check on the model’s behavior (see the invariance sketch below).
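As an illustration, an invariance-style behavioral test checks that a negligible perturbation of the input does not flip the prediction (the predict wrapper and the model fixture are hypothetical):

```python
import numpy as np

def predict(model, features: np.ndarray) -> str:
    """Hypothetical wrapper: return 'faulty' or 'functional'."""
    label = model.predict(features.reshape(1, -1))[0]
    return "faulty" if label == 1 else "functional"

def test_invariance_to_sensor_noise(model):
    # `model` is assumed to come from a pytest fixture that loads
    # the trained gear classifier.
    baseline = np.array([0.2, 1.1, 0.7, 0.4])
    noisy = baseline + np.random.normal(0.0, 1e-4, size=baseline.shape)
    # Negligible sensor noise must not change the predicted label.
    assert predict(model, baseline) == predict(model, noisy)
```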
Inference – Ultimately, once the model is built and deployed, it is the end users who will be using the model to derive conclusions. It is best practice to test all such scenarios before deployment.
Deployment – Once we are certain that the model works in a customer-equivalent or accepted environment, we can run system-level tests on the model to determine its quality and efficiency.
The greatest potential for the next computing revolution lies in scaling ML to the billions of smaller, power-constrained endpoint devices. Multiple companies in the embedded space are coming out with innovative solutions to accelerate ML workloads on existing controllers and devices, and open communities like TinyML are bringing industry and academia together on the right platform to produce the best technologies.
Jason Chandralal is general manager of Product Engineering Services at Happiest Minds Technologies. He is responsible for defining and leading test engineering solutions in the area of datacenter technologies, especially focused on SDN and NFV. Jason is also responsible for embedded systems and device-related technologies associated with the Internet of Things and industrial automation. He has over 24 years of experience in telecom, datacom, networking, and IoT product development and testing, working across network equipment provider, telecom service provider, hi-tech, and manufacturing customers, with specialization in networking, testing, and QA. Before Happiest Minds, Jason held multiple senior roles in large PES organizations based in India and did work stints in the US and Europe.