Big Data without algorithms is a dumb data. Algorithms like machine learning, text processing, data mining extract knowledge out of the data and makes it smart data. These algorithms make the data consumable or actionable for humans and businesses. Such actionable data can drive business decisions or predict products that customers most likely to buy next. Amazon and Netflix are popular examples of how the learnings from data can be used for influencing customer decisions. Hence, machine learning algorithms are very important in the era of Big Data. BTW in the field of Big Data, ‘Machine learning’ is considered more broadly ( than what it is really meant by the machine learning professionals) and includes pure statistical algorithms as well as other algorithms that are not based on ‘learning’s.
Earlier today, on 16th June, Microsoft announced a preview of machine learning service called AzureML on its Azure cloud platform. With this service, business analysts may easily apply machine learning algorithms like the ones related to predictive analytics to data.
Machine learning itself has been popular for last few years. Microsoft has recognized the trend and jumped on it. When it comes to big players making machine learning services on cloud, Google had pioneered its PredictionEngine as a service on cloud few years back.
Traditionally data scientists use tools like Matlab, R, Python (NumPym, SciKit, Sklearn) and others for analyzing data. Programmers use open sources like Apache Mahout, Weka for developing Application services using Machine Learning algorithms. However, having machine learning algorithms is not good enough, scaling the machine learning algorithms to big data is very important.
Last year Cloudera did an acqui-hire, Myrrix, and open sourced Machine learning on Hadoop as Oryx. Berkeley’s Ampslab has opensourced its Big Data Machine learning work, called MLBase, in Apache Spark, an open source big data stack becoming rapidly popular.
The momentum in machine learning has already fueled a good amount of venture funding in this area.
- SkyTree got $18Million funding from U.S. Venture Partners, UPS and Scott McNealy.
- Nuotonian grabbed $4 million Atlas Ventures for Big Data Analytics.
- Another startup wise.io raised $2.5 from VCs led by Voyager Ventures. Wise.io would makes it easy to predict customer behavior using machine learning.
- AlpineLabs that came out of EMC raised series B last year from Sierra Ventures, Mission Ventures and others. It provides a studio and easy to assemble set of standard Machine Learning and analytics algorithms.
- Oregon based BigML raised $1.2 million last year to provide easy to use machine learning cloud service.
- RevolutionAnalytics which got $37 (in total) makes R algorithms to work on Map Reduce.
- and the list goes on
There is an interesting Machine learning project called Vowpal Wabbit that initially started at Yahoo and continued at Microsoft. However, Interestingly, instead of VW, Microsoft is making R language and algorithms available on Azure Cloud.
Anyway, the trend of making machine learning services easy to run on Big Data and on Cloud would continue. But having the tools and algorithms available would not enough to solve the problem. We need qualified people who understands which algorithms to use for solving which cases and how to use them (parameterize). Moreover, what we really need is applications using such algorithms to solve the business problems without even having a need for users to understand the algorithms. In my opinion , what we would see in future is such vertical applications / services that would abstract (use but hide) machine learning or prediction algorithms to serve domain specific business needs.