The Big Deal About Big Data

Javier Rodriguez Alarcon, GSAM

We are creating more data than ever before, and the volume is growing at an exponential rate. This information can be put to uses that were not anticipated when the data was first collected. Advances in technology and computing power have been critical in making sense of the data that is available globally. The growth of distributed databases, where data is spread across several platforms but accessed as a single database rather than held on one platform, allows highly scalable parallel processing of vast amounts of data. For many applications, this can cut processing time by several orders of magnitude.

Yet faster computers and bigger databases do not, on their own, solve the problem of digesting the continuous stream of data that we now have access to. As a result, data processing algorithms have evolved from applying fixed processing rules to learning from the data how to process it. This approach is called Machine Learning.

There are two basic types of Machine Learning algorithms: supervised and unsupervised. Supervised Machine Learning algorithms make predictions based on historical observations. They analyse historical data (‘training data’) and model the relationship between the input data (defined by its ‘features’) and the labelled output data. Unsupervised Machine Learning takes this one step further: without any labelled outputs, it analyses a large set of input data in order to find structure within it, for example by grouping similar observations together. As dynamic models have emerged to analyse data that is difficult to quantify, the focus has been shifting from structured to unstructured data. We can now extract information from language, images and speech. Access to these new types of information, and the ability to capture and process them effectively, has led to entire industries being revolutionised by Big Data.
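The difference between the two types can be illustrated with a minimal sketch. It is not taken from the article: the toy data, the labels "low" and "high", and the function names are all hypothetical, and the two techniques shown (a nearest-centroid classifier and two-group k-means) are just simple stand-ins for the supervised and unsupervised families respectively.

```python
from math import dist

# Hypothetical toy data: each observation has two features.
# The labels are only available in the supervised case.
training_features = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
                     (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
training_labels = ["low", "low", "low", "high", "high", "high"]


def mean_point(points):
    """Return the coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def fit_centroids(features, labels):
    """Supervised: learn one average point (centroid) per known label."""
    grouped = {}
    for point, label in zip(features, labels):
        grouped.setdefault(label, []).append(point)
    return {label: mean_point(points) for label, points in grouped.items()}


def predict(centroids, point):
    """Predict the label whose centroid lies closest to the new point."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


def two_means(points, iterations=10):
    """Unsupervised: split unlabelled points into two groups (k-means, k=2)."""
    centres = [points[0], points[-1]]  # simple deterministic starting guess
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centre.
        assignment = [min((0, 1), key=lambda i: dist(centres[i], p))
                      for p in points]
        # Move each centre to the average of the points assigned to it.
        for i in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old centre if a group ends up empty
                centres[i] = mean_point(members)
    return assignment


# Supervised: features and labels go in, a prediction for new data comes out.
centroids = fit_centroids(training_features, training_labels)
prediction = predict(centroids, (3.9, 4.1))

# Unsupervised: only features go in, a grouping comes out.
groups = two_means(training_features)

print(prediction)
print(groups)
```

The supervised half needs the labels to learn anything, whereas the unsupervised half sees only the features and returns group numbers whose meaning still has to be interpreted by a person.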
