As cloud computing has matured, data are being generated at an enormous scale. Billions of users produce data every day, much of it from mobile devices. Because a large share of these data arrive continuously, second by second, they are often called streaming data. For example, people use their phones to take pictures and upload them to the internet, or to stream videos on YouTube. Processing data at this volume by hand is tedious and quickly becomes impractical. This is where machine learning comes in. Machine learning is a subset of artificial intelligence in which computer models are trained on previous data to make predictions about new data.
With cloud computing, machine learning can use a large number of machines to train a model. One way to organize this is the lambda architecture, which pairs a batch layer that processes the full historical data set with a speed layer that handles newly arriving data. In this setup, the input data are split into multiple parallel streams, and each stream is processed by a different node, so the model is trained on many portions of the data at once. With many nodes working in parallel, training takes far less time than it would on a single machine; without that parallelism, training on data at this scale could take an extremely long time.
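To make the idea of splitting the input into parallel streams concrete, here is a minimal sketch in Python. It is not tied to any particular cloud platform or lambda architecture framework: the function names (split_into_shards, shard_gradient, train_parallel) are my own illustrative choices, the model is a simple line fit y ≈ w*x + b, and a thread pool stands in for a cluster of nodes. Python threads do not give a real speedup for this kind of work, so treat this as a picture of the data flow rather than a performance demonstration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(data, num_shards):
    """Deal records round-robin into num_shards lists (the 'parallel streams')."""
    return [data[i::num_shards] for i in range(num_shards)]


def shard_gradient(shard, w, b):
    """Summed squared-error gradient of y ~ w*x + b over one shard."""
    grad_w = grad_b = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        grad_w += 2 * err * x
        grad_b += 2 * err
    return grad_w, grad_b, len(shard)


def train_parallel(data, num_shards=4, epochs=500, lr=0.1):
    """Gradient descent where each shard's gradient is computed by its own worker."""
    shards = split_into_shards(data, num_shards)
    w, b = 0.0, 0.0
    # Assumption: threads play the role of worker nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        for _ in range(epochs):
            futures = [pool.submit(shard_gradient, s, w, b) for s in shards]
            results = [f.result() for f in futures]
            # Combine the per-shard results into one averaged update.
            n = sum(r[2] for r in results)
            w -= lr * sum(r[0] for r in results) / n
            b -= lr * sum(r[1] for r in results) / n
    return w, b
```

Calling train_parallel on points generated from y = 2x + 1 should return values of w and b close to 2 and 1. Each worker only ever sees its own shard, and the only thing exchanged between workers is the small gradient summary, which is what lets this pattern scale to many machines.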
The trained model is then handed to a node called Append-Only-Memory (AOM). The AOM node takes in the information about the model and writes it to a log file, so that it can be used later when making predictions.
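The logging step can be sketched as a small append-only store. This is only an illustration of the general idea, not the actual AOM node: the class name AppendOnlyLog, its methods, and the choice of one JSON record per line are all assumptions I am making for the example.

```python
import json
import os


class AppendOnlyLog:
    """Hypothetical append-only store: records are added and read, never edited."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Mode "a" always writes at the end of the file, so old entries stay intact.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def read_all(self):
        """Return every logged record, oldest first."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def latest(self):
        """Return the most recent record, or None if nothing has been logged."""
        records = self.read_all()
        return records[-1] if records else None
```

For example, after log.append({"version": 1, "w": 2.0, "b": 1.0}), calling log.latest() returns that same record, and a later append adds a new line rather than overwriting it. Keeping every version makes it possible to look back at earlier states of the model when making predictions.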
Once the model has been trained, an algorithm makes predictions from the log files for different data sets. It does this by taking in a new data set and comparing it to previously recorded data. If it finds nothing similar to the new data set, it treats the data as a new event and adds it to the log file for that specific data set. If it does find something similar, it updates the log file and the training phase starts over.
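The compare-then-decide logic described here can be sketched in a few lines of Python. Several details are assumptions on my part, since the text does not specify them: "similar" is taken to mean a Euclidean distance below a threshold, the log is an in-memory list of records, "updating" the log means incrementing a counter on the matching record, and the names nearest and handle_sample are purely illustrative. The sketch only signals that retraining is needed; it does not run the training itself.

```python
import math


def nearest(records, sample):
    """Return (distance, index) of the logged record closest to sample, or None."""
    if not records:
        return None
    return min((math.dist(r["features"], sample), i) for i, r in enumerate(records))


def handle_sample(records, sample, threshold=1.0):
    """Log the sample as a new event, or update its match and ask for retraining."""
    match = nearest(records, sample)
    if match is None or match[0] > threshold:
        # Nothing similar on record: treat the sample as a new event.
        records.append({"features": list(sample), "count": 1})
        return "new_event"
    # Something similar exists: update that record and signal a retrain.
    records[match[1]]["count"] += 1
    return "retrain"
```

Starting from an empty list, handle_sample(records, [0.0, 0.0]) logs a new event, while a following call with [0.1, 0.1] falls within the default threshold of the first record and returns "retrain". The threshold controls how eagerly samples are merged with existing records, so in practice it would need to be tuned to the data.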
This concludes my article on cloud computing and machine learning.