
Machine learning, deep learning and clean data - how do they work together?

Deep learning and machine learning are often used interchangeably, but there are differences between the two. Machine learning is an umbrella term for the process by which an algorithm makes decisions based on available data, then uses the success of each action to make better decisions in similar circumstances, gradually learning the nuance of each factor.

Machine learning

The difference comes in how each system determines success. In basic machine learning, the measure of success will usually be a binary yes/no. An example would be a conveyor belt running at a controlled speed. As an experiment, the engineer increases the speed of the belt to improve productivity. If a sensor recognises that the higher speed is causing jams along the process, the system can slow the belt down to restore order. By changing the pace of the belt to find the optimum efficiency, the algorithm learns, from these few simple data inputs, how fast to run without a human to adjust it.
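The conveyor belt example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `tune_belt_speed`, the stand-in `jam_sensor`, the speed units and the jam threshold of 1.8 are all hypothetical and do not describe any real control system.

```python
def tune_belt_speed(jams_at, start=1.0, step=0.1, max_speed=3.0):
    """Speed the belt up one step at a time until the jam sensor says 'yes'.

    jams_at is a function that takes a speed and returns True (jam) or
    False (no jam), which is the binary yes/no feedback described above.
    """
    speed = start
    while speed + step <= max_speed:
        candidate = round(speed + step, 2)
        if jams_at(candidate):
            # The faster speed caused a jam, so keep the last safe speed.
            break
        speed = candidate
    return speed


def jam_sensor(speed):
    # Hypothetical stand-in for a real sensor: assume jams start above 1.8.
    return speed > 1.8


best_speed = tune_belt_speed(jam_sensor)
print(best_speed)
```

The only feedback the loop ever sees is whether a jam occurred, yet that is enough for it to settle on the fastest speed that runs cleanly.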

Deep learning

Deep learning is a far more sophisticated form of machine learning that can consider hundreds of different inputs from all kinds of data sources in real time. A simplified example would be the same conveyor belt, now taking into account the time of day and the exhaustion of the workers, and perhaps running a little slower towards the end of a shift. It might consider who’s clocked in and how fast they have worked in the past, or whether a large shipment is coming up that requires higher than usual numbers. In deep learning, the algorithm is able to calculate the importance of all these factors without the input of an engineer, and the more data you can provide, the more effective it becomes.
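To make the idea of an algorithm weighing several factors by itself more concrete, here is a very small neural network written in plain Python. It is a sketch, not a production system: it has a single hidden layer, whereas real deep learning stacks many layers and uses far more inputs. The three inputs (`shift_progress`, `past_pace`, `big_order`), the made-up rule used to generate the training records, and every function name are assumptions invented for this illustration.

```python
import math
import random


def make_shift_records(n, rng):
    """Generate synthetic (inputs, best_speed) pairs from a made-up rule."""
    records = []
    for _ in range(n):
        shift_progress = rng.random()   # 0 = start of shift, 1 = end
        past_pace = rng.random()        # how fast this crew worked before
        big_order = 1.0 if rng.random() < 0.3 else 0.0
        # Hidden rule the network has to discover from the data alone.
        best_speed = (1.0 + 0.8 * past_pace - 0.5 * shift_progress
                      + 0.4 * big_order * (1.0 - shift_progress))
        records.append(([shift_progress, past_pace, big_order], best_speed))
    return records


def init_network(n_inputs, n_hidden, rng):
    """Small random starting weights; the engineer sets none of them by hand."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
          for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def forward(net, x):
    """Return the hidden activations and the predicted belt speed."""
    w1, b1, w2, b2 = net
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def mean_squared_error(net, records):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in records) / len(records)


def train(net, records, epochs=200, lr=0.05):
    """Stochastic gradient descent; the weight lists are updated in place."""
    w1, b1, w2, b2 = net
    for _ in range(epochs):
        for x, y in records:
            hidden, predicted = forward((w1, b1, w2, b2), x)
            error = predicted - y
            for j, h in enumerate(hidden):
                # Gradient through tanh, computed before w2[j] is changed.
                grad_hidden = error * w2[j] * (1.0 - h * h)
                w2[j] -= lr * error * h
                b1[j] -= lr * grad_hidden
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * grad_hidden * xi
            b2 -= lr * error
    return w1, b1, w2, b2


rng = random.Random(0)
records = make_shift_records(100, rng)
net = init_network(3, 4, rng)
error_before = mean_squared_error(net, records)
net = train(net, records)
error_after = mean_squared_error(net, records)
print(f"error before training: {error_before:.4f}, after: {error_after:.4f}")
```

Nobody tells the network that a tired crew should slow the belt or that a big order should speed it up. It adjusts its own weights from the records, which is why more (and better) records tend to help.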

In short, machine learning is trial and error, and it improves as it runs more successful trials. Deep learning is about the algorithm working out how much every factor and element of the data matters. We encounter deep learning every day. When YouTube suggests a video, the suggestion is based on your demographic, what you’ve been watching and for how long, what you’ve liked and disliked, what other people are watching, the time of day, the age of the video and so on. Deep learning is brilliant at what it does and, with users watching a billion hours of footage a day, it’s getting a lot of feedback on its process.

Clean data

The YouTube algorithm is able to improve because it gets a lot of ‘clean’ data. Google, which owns YouTube, is one of the best data processors in the world, and it has built a framework that ensures the supercomputers that run the algorithm are getting exactly the information they need to be effective. Clean data is free from inaccurate, corrupt and irrelevant records, which means the algorithm is only processing data that will help it improve.
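As a rough illustration of what removing inaccurate, corrupt and irrelevant records can look like, the sketch below filters a handful of conveyor belt readings. The record layout (`source`, `belt_speed`, `jams`), the `MAX_BELT_SPEED` limit and the sample data are all hypothetical, chosen only to show one check for each kind of bad record.

```python
import math

MAX_BELT_SPEED = 3.0  # hypothetical physical limit of the belt


def is_number(value):
    """True for finite ints and floats, but not for booleans or text."""
    return (isinstance(value, (int, float))
            and not isinstance(value, bool)
            and math.isfinite(value))


def clean_records(raw_records):
    cleaned = []
    for record in raw_records:
        # Irrelevant: readings that did not come from the belt sensor.
        if record.get("source") != "belt_sensor":
            continue
        speed = record.get("belt_speed")
        jams = record.get("jams")
        # Corrupt: missing values or values of the wrong type.
        if not is_number(speed):
            continue
        if not isinstance(jams, int) or isinstance(jams, bool):
            continue
        # Inaccurate: values the belt could not physically produce.
        if not (0 < speed <= MAX_BELT_SPEED) or jams < 0:
            continue
        cleaned.append({"belt_speed": float(speed), "jams": jams})
    return cleaned


raw = [
    {"source": "belt_sensor", "belt_speed": 1.5, "jams": 0},
    {"source": "belt_sensor", "belt_speed": "fast", "jams": 1},       # corrupt
    {"source": "belt_sensor", "belt_speed": 42.0, "jams": 0},         # inaccurate
    {"source": "canteen_thermometer", "belt_speed": 1.2, "jams": 0},  # irrelevant
    {"source": "belt_sensor", "belt_speed": 1.8, "jams": 2},
]
cleaned = clean_records(raw)
print(cleaned)
```

Only the first and last readings survive, so anything trained on `cleaned` sees just the records that can help it improve.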

At Capita IT Resourcing, we understand the importance of clean data and all forms of machine learning and how they contribute to technological progress. We’re always looking out for talented data analysts and specialists. If you’re interested in joining a company with the resources to change the future of computers, browse our careers page here.
