Posted by: Ranjani Rao, March 6th, 2012

VMware’s latest Spring project, Spring Hadoop, is an exciting development for Java developers using the Spring framework, as it aims to streamline the development of end-to-end solutions that work with Big Data by leveraging Apache Hadoop’s data processing capabilities. It combines Spring’s unique value proposition (UVP), ease of use and simplicity, with the Apache Hadoop platform, while inheriting features from Spring, Spring Batch and Spring Integration.

Adam Fitzgerald, VMware’s director of developer relations, states that the capabilities of Spring Hadoop make it ideal for building end-to-end pipeline solutions for enterprise Java applications. Developers can build and execute complex workloads from within the Spring framework that interact with Hadoop as individual MapReduce requests or as data-streaming (non-Java) results.

By integrating Spring and Hadoop, VMware has taken Spring’s dependency injection mechanism for linking related objects and applied it to Hadoop. Fitzgerald believes this will save developers time and increase the efficiency, testability and portability of applications.

‘VMware is committed to helping developers build, deploy, manage and scale the new wave of data-driven applications,’ said Adrian Colyer, CTO for Cloud and Application Services at VMware, in a statement. ‘By building upon Spring’s strong and versatile foundation of simplifying data access, and leveraging the depth of the Hadoop platform, VMware is delivering a streamlined programming model that makes Spring the natural way to integrate Hadoop systems into the enterprise application landscape.’

VMware staff engineer Costin Leau talks about the new project on the SpringSource blog: ‘Whether one is writing stand-alone, vanilla MapReduce applications, interacting with data from multiple data stores across the enterprise, or coordinating a complex workflow of HDFS, Pig, or Hive jobs, or anything in between, Spring Hadoop stays true to the Spring philosophy offering a simplified programming model and addresses ‘accidental complexity’ caused by the infrastructure. Spring Hadoop, provides a powerful tool in the developer arsenal for dealing with big data volumes.’

Salient features of Spring Hadoop as gathered from Leau’s blog post are:

  • The Spring container enables the execution of MapReduce, Cascading, HBase, Hive and Pig on Spring Hadoop.
  • Read and write access to the Hadoop Distributed File System (HDFS) is enabled through Spring’s resource abstraction and JVM scripting languages such as Groovy, Rhino/JavaScript and JRuby.
  • Declarative and programmatic support is offered for Hadoop tools including FsShell and DistCp.
  • Spring container functionality such as property placeholders and environment support allows you to start small and build the app up with the available configuration options. The creation and submission of the job configuration is handled by the IoC container.
  • Developers need not rewrite their Java MapReduce jobs, because the jobs are treated as objects that are created, configured, wired and managed by the framework.
  • Existing Hadoop Tool implementations are also supported.
  • Hadoop configuration options and a template for client connections to Hadoop are available.
  • Spring Batch integration provides tasklets for various Hadoop interactions.
  • Spring Integration supports event triggering.
  • Spring’s powerful service abstractions can be used in app development.
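
To make the property-placeholder and "jobs as managed objects" points above a little more concrete, here is a minimal sketch in plain Java. It deliberately uses neither Spring nor Hadoop: `JobWiringSketch`, `JobDefinition`, `resolve`, `wire` and the `hd.fs` property are all hypothetical names invented for this illustration, not part of the Spring Hadoop API. It only shows the general idea of a container resolving `${...}` placeholders and handing back a fully configured job object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: plain Java, no Spring or Hadoop dependency.
// Every class, method and property name here is hypothetical.
public class JobWiringSketch {

    // Matches placeholders of the form ${key}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Stand-in for a job description that a container would create and configure.
    public static final class JobDefinition {
        public final String name;
        public final String inputPath;
        public final String outputPath;

        JobDefinition(String name, String inputPath, String outputPath) {
            this.name = name;
            this.inputPath = inputPath;
            this.outputPath = outputPath;
        }

        @Override
        public String toString() {
            return name + ": " + inputPath + " -> " + outputPath;
        }
    }

    // Replaces each ${key} in the template with its value from props.
    // Fails fast when a placeholder has no matching property.
    public static String resolve(String template, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = props.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Plays the role of the container: resolves the settings, then builds the job object.
    public static JobDefinition wire(String name, String input, String output,
                                     Map<String, String> props) {
        return new JobDefinition(name, resolve(input, props), resolve(output, props));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("hd.fs", "hdfs://localhost:9000"); // hypothetical property and value

        JobDefinition job = wire("wordcount", "${hd.fs}/input", "${hd.fs}/output", props);
        System.out.println(job);
    }
}
```

Swapping the value of `hd.fs` is enough to point the same job definition at a different cluster, which is the "start small and build up" benefit the list describes; in Spring Hadoop itself this wiring is done by the IoC container rather than by hand-written code like the above.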

On February 29, 2012, the first milestone release, Spring Hadoop 1.0.0.M1, was made available to the public. It is released under an Apache open source license.
