Java on Azure

In my conversations with customers about Windows Azure, the topic of running Java on Azure often comes up. I explain that Java developers can benefit from the PaaS capabilities of Azure just as much as .NET developers, including the elasticity and ‘self-healing’ properties of our public cloud offering, whilst avoiding having to maintain VMs (or physical machines).

The best way to get an overview of the experience is to take a look at this 5-minute video, as it covers all the steps required to take an existing JSP application in Eclipse and run it on Azure; if you’d rather read than listen, carry on :)

Overall, the experience for Java developers is pretty close to the one for .NET developers, with a couple of notable differences I will touch upon shortly. To start with, let me describe how I got a simple Java application to run on Azure –

I’m starting with Eclipse IDE for Java EE Developers (Indigo) and already have the Azure SDK installed on my laptop, so the only thing I need to do to prepare my Eclipse environment is to install the Azure plug-in for Eclipse, using the Help->Install New Software Eclipse menu item and pointing it at http://dl.windowsazure.com/eclipse

(you can read all about it in the Windows Azure Java Developer Center)

With the IDE ready, I create, for example, a new dynamic web project using Tomcat 7 as the server, add a JSP page to it and test it locally. Normal stuff (I can just about do a hello world).
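To give a concrete (if trivial) example, a minimal JSP page along these lines is all it takes – nothing in it is Azure-specific, it is just a plain JSP served by Tomcat (the file name index.jsp is mine):

```jsp
<%-- index.jsp: a minimal "hello world" page; nothing here is Azure-specific. --%>
<%@ page language="java" contentType="text/html; charset=UTF-8" %>
<html>
  <head><title>Hello from JSP</title></head>
  <body>
    <h1>Hello, Azure!</h1>
    <%-- A scriptlet expression, just to prove the page is compiled and served. --%>
    <p>Server time is: <%= new java.util.Date() %></p>
  </body>
</html>
```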

With my ‘elaborate’ web application working locally, it is time to test it in the Azure Emulator, so I create a new ‘Windows Azure Project’, which is available after installing the plug-in; to get my code included in the Azure project, I export a WAR from the dynamic web project into the approot folder within it.

Last, and this is the only real difference between .NET and Java with regards to deploying on Azure, I need to provide the JDK and the server to run on the Azure role. .NET doesn’t really have the concept of multiple servers – IIS is the de-facto web server and is automatically included in all web roles, and Azure roles also come out of the box with all the .NET runtime versions to date. Java developers, on the other hand, have a choice of servers and can target one of several JDK versions; for this reason the Azure project needs to include the selected JDK and server packages, and the script required to install them.

This is done by placing in the approot folder a jdk.zip containing the JDK I wish to use, as well as a zip file containing the server I wish to use – in my case tomcat7.zip, containing the apache-tomcat-7.0.22 server I downloaded earlier.

Finally, I need to provide the role with a script telling it what to install, and how.

Thankfully, the Azure project template for Eclipse includes sample scripts for the most commonly used servers, namely Tomcat 7, GlassFish OSE 3, JBoss AS 6 and 7, and Jetty, as well as a custom script that can be expanded. These take care of all the hassle, so all I need to do is copy the contents of the sample script provided for Tomcat 7 into the role’s startup.cmd file and make sure all the file names (WAR, JDK and server packages) are correct.
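To give a feel for what such a script does, here is a rough sketch of the steps the Tomcat 7 sample performs. This is illustrative rather than the literal script shipped with the plug-in (which, for example, bundles its own unzip helper), and the file and folder names are the ones from my project:

```bat
@echo off
:: Sketch of a Tomcat 7 startup.cmd; the names (MyWebApp.war, jdk1.6.0_29,
:: apache-tomcat-7.0.22) are from my project and will differ in yours.

:: 1. Unpack the JDK and the server packages placed in approot
::    (the sample script uses an unzip helper bundled with the project template).
cscript /NoLogo util\unzip.vbs jdk.zip %CD%
cscript /NoLogo util\unzip.vbs tomcat7.zip %CD%

:: 2. Drop the application WAR into Tomcat's webapps folder so it auto-deploys.
copy MyWebApp.war apache-tomcat-7.0.22\webapps\

:: 3. Point Tomcat at the unpacked JDK and start the server.
set JAVA_HOME=%CD%\jdk1.6.0_29
cd apache-tomcat-7.0.22\bin
startup.bat
```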

That done, my projects are now ready and, after building, I can run the RunInEmulator.cmd script, also included in the Azure project template, to deploy the role to the local Compute Emulator – this tests both the script and the application and, once deployed, I am able to use my application hosted by the role running in the emulator.

Happy with that, the last step is to prepare the application for deployment to the cloud. This is done by changing a property on the Azure project from “Testing in Emulator” to “Deployment to Cloud” and building again – now the project contains the package to be deployed to Azure, the configuration file accompanying it, and even a shortcut to the management portal.

I use these in the management portal to initiate a new deployment, and several minutes later I’ve got my Apache Tomcat server running my JSP page – and not a VM in sight!

Of course, with the package on Azure, the platform can deploy it time and time again when scaling out or when an instance fails and needs to be re-deployed.

Result! :)

 

One final, temporary note: working on the Windows 8 CTP, I did find a small problem with the script provided. To avoid problems arising from long paths, the script creates a symbolic link at the root of the drive pointing at the location of the files (somewhere deep under the Eclipse workspace). Windows 8 seems very unhappy about unzipping files into a symbolic-link location, and the thing breaks. The temporary solution, if you are working on Windows 8, is to remove the step that CDs into the symbolic-link location; the default location is the approot folder anyway, and everything works just fine. The nature of CTPs, I guess.

Setting up my environment to build packages to run on Hadoop on Azure

It shouldn’t have, and I have only myself to blame, but it took some time before I finally figured out what I needed to do to set up an environment on my laptop that I could use to build Map/Reduce programs in Java to run on Hadoop on Azure. Here’s the set-up I have –

I downloaded and extracted Eclipse 3.6.1 (Helios) from http://archive.eclipse.org/eclipse/downloads/ to my Program Files (x86) directory (it could have been anywhere, of course).

I then downloaded the Hadoop Eclipse plug-in (hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar) and placed it in Eclipse’s plugins folder – I found it here.

Following the excellent instructions in the YDN tutorial (despite the version mismatch), I was able to confirm that the plug-in loads fine and looks as it should. However, given the current lack of proper authentication in Hadoop, Hadoop on Azure does not allow connecting to the cluster from the outside (that would introduce the risk of somebody else connecting to the cluster, as all it takes is guessing the username), which means it was not actually possible for me to connect to the cluster from the Map/Reduce locations panel, or indeed through the HDFS node in the project explorer.

It also appears that the plug-in lags behind the core development – the project templates are not up to date with the most recent changes to the Hadoop classes – but that’s not too much of a problem, as there’s not much code in the templates and it can easily be replaced or corrected.

The bit that, due to my lack of experience with Java and Eclipse, took infinitely longer than it should have was figuring out that this is not enough to build a Map/Reduce project…

Copying the code from the WordCount sample, I kept getting errors on most of my imports until I finally figured out what should have been very obvious – I needed hadoop-core-0.20.203.0.jar and commons-cli-1.2.jar on the build path. The former can be found at http://mirror.catn.com/pub/apache/hadoop/common/hadoop-0.23.0/ and the latter at http://commons.apache.org/cli/download_cli.cgi, although both (and others) also exist on the cluster, so I could RDP into it and use SkyDrive to transfer them over.

That was pretty much it – I could then create a new project, create a new class, paste in the contents of WordCount.java from the sample provided, export the JAR file and use it to submit a new job on Hadoop on Azure.
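For reference, here is a sketch of the kind of class involved; it follows the WordCount sample (using the newer org.apache.hadoop.mapreduce API, built against hadoop-core-0.20.203.0) rather than reproducing it verbatim:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in each input line.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sum the counts for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count"); // the 0.20.x-era Job constructor
        job.setJarByClass(WordCount.class);    // tells Hadoop which JAR to ship to the cluster
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input folder
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output folder (must not exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Exporting this as a JAR from Eclipse and passing the input and output paths as parameters is then enough to submit the job on Hadoop on Azure.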

What took me so long?! :)

The next step would be to test things locally, but I don’t think I’ll go there just yet…