Data on BigData

According to Transparency Market Research:
  • Compound Annual Growth Rate (CAGR) of Big Data projected to be 40% from 2012 to 2018
  • The global big data market was worth USD 6.3 billion in 2012 and is expected to reach USD 48.3 billion by 2018
  • Big Data tools: CAGR of 41.4% from 2012 to 2018
  • Storage: CAGR of 45.3% from 2012 to 2018
  • Major players (by revenue) last year: HP Co., Teradata, Opera Solutions, Mu Sigma and Splunk
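
As a quick sanity check on the figures above, the growth rate implied by the two market sizes can be recomputed. This is only a minimal sketch: the function name `cagr` and the variable names are my own, and the only inputs are the USD 6.3 billion (2012) and USD 48.3 billion (2018) values quoted from the report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate, returned as a fraction (0.40 means 40%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes in USD billion, taken from the list above.
market_2012 = 6.3
market_2018 = 48.3
years = 2018 - 2012  # six compounding periods

rate = cagr(market_2012, market_2018, years)
print(f"Implied CAGR 2012-2018: {rate:.1%}")
```

The result comes out at roughly 40%, which agrees with the headline growth rate in the report.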

Oracle's Big Data Appliance Puts Hadoop, NoSQL, R in a Box

According to Oracle PR:

The Oracle Big Data Appliance is a new engineered system that includes
            an open source distribution of Apache(TM) Hadoop(TM), Oracle NoSQL
            Database, Oracle Data Integrator Application Adapter for Hadoop,
            Oracle Loader for Hadoop, and an open source distribution of R.

Engineered to work together, the Oracle Big Data Appliance is easily
            integrated with Oracle Database 11g, Oracle Exadata Database Machine,
            and Oracle Exalytics Business Intelligence Machine, and is designed to
            deliver extreme analytics on all data types, with enterprise-class
            performance, availability, supportability and security.

        --  Oracle NoSQL Database: Oracle NoSQL Database Enterprise Edition is a
            distributed, highly scalable, key-value database. Unlike competitive
            solutions, Oracle NoSQL Database is easy to install, configure and
            manage, supports a broad set of workloads, and delivers
            enterprise-class reliability backed by enterprise-class Oracle
            support.
        --  Oracle Data Integrator Application Adapter for Hadoop: The new Hadoop
            adapter simplifies data integration from Hadoop and an Oracle Database
            through Oracle Data Integrator's easy to use interface.
        --  Oracle Loader for Hadoop: Oracle Loader for Hadoop enables customers
            to use Hadoop MapReduce processing to create optimized data sets for
            efficient loading and analysis in Oracle Database 11g. Unlike other
            Hadoop loaders, it generates Oracle internal formats to load data
            faster and use less database system resources.
        --  Oracle R Enterprise: Oracle R Enterprise integrates the open-source
            statistical environment R with Oracle Database 11g. Analysts and
            statisticians can run existing R applications and use the R client
            directly against data stored in Oracle Database 11g, vastly increasing
            scalability, performance and security. The combination of Oracle
            Database 11g and R delivers an enterprise-ready deeply-integrated
            environment for advanced analytics.
        --  Oracle NoSQL Database, Oracle Data Integrator Application Adapter for
            Hadoop, Oracle Loader for Hadoop, and Oracle R Enterprise will also be
            available as standalone software products independent of the
            Oracle Big Data Appliance.

Oracle Goes BigWay in Big Data Analytics, NoSQL and Hadoop

In his keynote at Oracle Open World 2011, Mark Hurd announced the new Exalytics analytics appliance, which is geared to execute OLAP and MOLAP workloads, that is, online analytical processing and multidimensional online analytical processing, for deriving business intelligence. Cloud and big data are among the key themes of this year’s Oracle Open World. Oracle’s co-president Safra Catz declared, “We are big data. We are also the cloud.” The push on cloud is all the more significant against the background of last year’s statements by Larry Ellison, who first ridiculed the use of the term “cloud” and then claimed that Oracle was already providing cloud. This year, however, brings the real delivery of cloud BI, cloud-based apps, and so on. In keeping with its vision of end-to-end in a box, Oracle announced the Big Data Appliance in addition to Exadata and Exalytics. The Oracle Big Data Appliance integrates Apache Hadoop, open source R, Oracle’s NoSQL Database, the ODI adapter for Hadoop and Oracle Loader for Hadoop on Linux and the Oracle Java VM in one big box. This combination provides a good platform for big data processing of unstructured and structured data. For more on the Big Data Appliance: https://texploration.wordpress.com/2011/10/04/oracles-big-data-appliance-puts-hadoop-nosql-r-in-a-box/

With the advent of NoSQL databases and MapReduce infrastructures, I had already thought that Oracle could not afford to be left behind by the NoSQL train. Hadoop is gaining significant traction in batch-oriented applications such as unstructured data processing and data warehousing. Hadoop provides a way to distribute data and processing logic across the nodes of a server cluster. It takes the processing logic close to the data. Hadoop, which grew up largely at Yahoo, is based on the MapReduce architecture introduced by Google. In any case, I predict that Oracle's use of Hadoop in its stack will go beyond adding it to the Big Data Appliance. Oracle may also make some acquisitions in this space.
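
To make the MapReduce idea a little more concrete, here is a toy word count in plain Python. It is only a single-process sketch of the map, shuffle and reduce steps, not Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample input are my own, made up for illustration.

```python
from collections import defaultdict


def map_phase(split):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def run_job(splits):
    # On a real cluster each split would be mapped on the node that stores it,
    # which is the "logic close to the data" idea. Here everything runs
    # in one process, one split after another.
    mapped = [pair for split in splits for pair in map_phase(split)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two "splits" standing in for blocks on two nodes.
splits = ["big data on hadoop", "hadoop moves logic to data"]
counts = run_job(splits)
print(counts)
```

The point of the split between the three functions is that the map and reduce steps only ever see one split or one key at a time, which is what lets a framework like Hadoop spread them over many machines.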

Yahoo Spinning off Hadoop Development to Ride on Hadoop Wave

According to a WSJ report, http://on.wsj.com/fMzApi , Yahoo is considering spinning off its Hadoop development into a separate company along the lines of Cloudera. This would help it ride the Hadoop wave. Yahoo has done significant work on Hadoop. Hadoop is an Apache open source project based on the MapReduce concept first introduced by Google. Hadoop is increasingly popular in big data analytics because it scales out, especially on commodity hardware platforms. Hadoop and NoSQL stores such as Cassandra, CouchDB and MongoDB are becoming increasingly popular, especially at social and Web 2.0 sites where the volume of data is huge.

Yahoo has been one of the core contributors to Hadoop. It has contributed from the beginning and still commits a large team to Hadoop development. It also developed additional layers such as Pig to enable data warehousing and business analytics apps to leverage Hadoop. I believe Yahoo uses Hadoop extensively in its content mining, email and so on.

Still, why does Yahoo want to spin it off? Definitely to take advantage of Hadoop’s commercial potential. Startups like Cloudera have already been floated around the Hadoop ecosystem. According to analysts, there is a multi-billion dollar market around the Hadoop ecosystem. By spinning off a separate company, Yahoo can monetize this market without being distracted from its main business. For customers, it is better to deal with a company that has Yahoo's backing. Such a company can focus on delivering specialized solutions around Hadoop. Having said that, I am not yet sure whether this rumored company will also focus on training and services. In any case, this move would help the Hadoop ecosystem mature further.