Choosing a Cloud Platform for a Java Application

Recently I had to decide which cloud provider (Platform as a Service, PaaS) to deploy a new application to.
Usually we deploy applications to our own data centers, so this was the first time my team decided to use a cloud provider. We wanted to free ourselves from the operational and maintenance side of deployment. Provisioning servers, configuring and testing them, and setting up monitoring and databases take a lot of time that can be used more effectively on product development. Freeing ourselves from those operational concerns was one of our objectives; I would call it a move from DevOps to NoOps.

About the application itself: it was written in Java with the Spring framework and uses MongoDB for data storage. The application is stateless, so no session stickiness is involved. I was therefore looking for a PaaS provider with Java support as well as MongoDB hosting, either from the same company or from a partner running within the same cloud. I considered the following offerings:

  1. Amazon Elastic Beanstalk with MongoLab
  2. Heroku and MongoHQ
  3. CloudBees with MongoHQ

After comparing them, reading the docs and playing with the products, I decided to go with CloudBees. Here is why:

  • CloudBees provides continuous integration as part of its cloud infrastructure. This includes Jenkins build servers, an Artifactory repository and Sonar quality control tools. Even though I have all of those services installed on our local network, I think it is a great idea to run them on demand in the cloud. Cloud-based development is hot, according to Bob Aiello, author of Configuration Management Best Practices: Practical Methods that Work in the Real World.
  • CloudBees lets you deploy a .war archive to its PaaS using a Maven plugin, a Jenkins plugin, or the command line via the provided SDK. I like this option more than using git, as required by Heroku. Deploying archives such as .war or .ear files fits better into our current build process.
  • CloudBees has a partnership with MongoHQ, so creating a MongoDB database was easy, and the MongoDB administration console is integrated into the same management console as the application.
  • CloudBees offers New Relic monitoring as an add-on, which is a great service for monitoring application performance.
  • The CloudBees SDK lets you test the application locally before uploading it to the cloud.
  • Deployment to JBoss is available as an option for those interested in the full Java EE stack.


My deployment went very well and I was happy with my choice. One day later I stumbled upon an InfoQ article by Michael Yuan titled Java Developer guide to PAAS. I am reposting the article below; the author's conclusion supports my choice, and it is a much more in-depth analysis of the current PaaS offerings. I am sure that in 2012 we will see more and better options for deploying our JVM-based applications to the cloud.

The Java platform is well suited for PaaS since the JVM, the application server, and deployment archives (e.g., WARs and EARs) provide natural isolations for Java applications, allowing multiple developers to deploy applications in the same infrastructure. However, for the past several years, most PaaS offerings were around platforms such as Ruby and Python, whilst Google App Engine was a lone PaaS provider for Java developers. Fortunately, that is starting to change.
In the past year or so, several commercial providers have entered the Java PaaS space. It makes sense since the estimated 10 million Java developers almost certainly represent one of the biggest developer groups in the world. In this article, we will try to compare those PaaS offerings from the developers’ point of view. Specifically, our comparison methodology is to compare the features of each offering in 4 areas:

  • Support for technology platforms and stacks.
  • Support for developer productivity and development processes.
  • Performance and scalability.
  • Pricing and other business concerns.

In this article, we will compare the following Java PaaS offerings (in alphabetical order).

  • Amazon Elastic Beanstalk is Amazon’s Java PaaS offering built on their EC2 cloud. It provides managed Tomcat instances running on EC2, complete with load balancers and on-demand provisioning capabilities for scaling. It integrates with the rest of Amazon Web Services to provide access to managed relational databases (RDS), big data stores (SimpleDB), message queues, email, and other services.
  • CloudBees is a VC-backed startup that is run by JBoss and Sun veterans, and recently raised $14M in two rounds of financing. It may be a new name, but its influence is fast growing in this space. CloudBees brings several unique features into the Java PaaS scene, in particular continuous integration: complete management of the development and deployment cycle in the cloud. In addition, like Heroku, the company includes a marketplace for 3rd party plugins and services.
  • Cloud Foundry is an Open Source initiative from VMware. VMware software powers virtualized data centers, which is the basis of most PaaS offerings. VMware is also the home of Spring Framework, a very popular platform stack in enterprise Java. A unique feature of Cloud Foundry is that it does not have to be a hosted PaaS at all. You can download its code and host a PaaS yourself! In that sense, it is more of a hosting platform than a hosted PaaS service.
  • Google App Engine for Java is perhaps the oldest (and most mature) Java PaaS offering on the market. It has an ambitious goal of linear scalability, and it is not afraid of making drastic changes to the Java platform itself.
  • Heroku for Java is the latest offering from PaaS power house Heroku, which has a deep heritage in the Ruby community.
  • Red Hat OpenShift is Red Hat’s experimental offering in PaaS. Red Hat’s JBoss Application Server (AS) is amongst the most popular Java application servers, and the OpenShift service provides comprehensive JBoss AS support.

 

Supported Technology Platforms and Stacks

One of the most important attributes of a Java PaaS provider is the technology platform and stack it supports. After all, the technology platform is what distinguishes Java PaaS from all other PaaS offerings. Yet, during the long evolution of the Java platform, there have been many competing technology stacks on the platform. For the Java PaaS vendor, I believe that supporting as many different technology stacks as possible is very important.
In this category OpenShift and CloudBees support the widest variety of technologies, from a simple servlet container (typically Tomcat) to full Java EE 6 Web Profile support (JBoss AS 7). The Java PaaS pioneer, Google App Engine, is now lagging behind most newcomers in terms of standards support. Google App Engine does not support the full Java SE platform, and hence offers poor support for many popular frameworks. Google App Engine also requires the user to program to its own network and persistence APIs, as opposed to supporting open standards, resulting in applications that are very hard to port. Similarly, Heroku for Java requires the application to wrap around its own Jetty instance, breaking the more traditional Java EE application deployment model.
The Cloud Foundry project supports the Tomcat container. But its application development and deployment are heavily optimized for the Spring framework, creating a semi-external dependency. Cloud Foundry is well suited to applications based on the Spring framework since its parent company, VMware, is also the owner of Spring. In addition, the platform supports message queuing using RabbitMQ, which is based on the AMQP standard. But its support for other Java frameworks such as Java EE is weak.

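| Feature                           | Amazon Beanstalk | CloudBees | Cloud Foundry | Google App Engine | Heroku for Java | OpenShift |
|-----------------------------------|------------------|-----------|---------------|-------------------|-----------------|-----------|
| Tomcat                            | Yes              | Yes       | Yes           | No                | No              | Yes       |
| Java SE                           | Yes              | Yes       | Yes           | No                | Yes             | Yes       |
| Java EE                           | No               | Yes       | No            | No                | No              | Yes       |
| Support standard Java libraries   | Yes              | Yes       | Yes           | No                | Yes             | Yes       |
| File system access                | Yes              | Yes       | Yes           | No                | Yes             | Yes       |
| Thread access                     | Yes              | Yes       | Yes           | No                | Yes             | Yes       |
| Outbound network connections      | Yes              | Yes       | Yes           | Limited           | Yes             | Yes       |
| MySQL                             | RDS              | Yes       | Yes           | Paid plan         | Yes             | Yes       |
| Commercial relational databases   | RDS              | External  | External      | No                | External        | External  |
| Big Data support                  | SimpleDB         | External  | External      | BigTable          | External        | External  |
| Deploy without special frameworks | Yes              | Yes       | No            | No                | Yes             | Yes       |
| Friendly to migrate existing apps | Yes              | Yes       | No            | No                | No              | Yes       |
| Portability of apps               | High             | High      | Moderate      | Low               | Low             | High      |
| Production ready?                 | Yes              | Yes       | Beta          | Yes               | Beta            | Beta      |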

 

Support for Developer Productivity and Development Processes

A key value of the PaaS is that it makes life easier for application developers, as it removes the overhead for application and resource management. So, developer friendliness and tools integration is an important consideration in our evaluation.
In this category CloudBees is a clear winner. It is not only a PaaS runtime environment, but also an integrated build and test environment. Developers can make use of the Jenkins service to have CloudBees automatically and continuously check out, build, test, and report on code in the repository. This continuous integration process has been adopted by many large teams as a key component of their software development process. However, build server management is often time-consuming and painstaking work for the QA team. CloudBees takes out this pain, and makes the process much more transparent for developers. Recently, Red Hat OpenShift has made progress catching up to CloudBees in this space by supporting Maven and Jenkins integration.
Amazon Beanstalk, OpenShift, and Google App Engine all provide developer tools, SDKs, and IDE plugins that are consistent with other Java-based tools in the market.
Cloud Foundry and Heroku for Java, however, provide tools that are more suited to Ruby developers than to Java developers. Having used their tools, I suspect that many Java developers will take some time to get used to their conventions and terminology. In addition, Cloud Foundry currently suffers from poor documentation. For instance, much of its documentation is in the form of video tutorials. While video tutorials are great for getting developers started, they lack the depth required for deploying serious applications, or for developers who wish to go beyond the scripted scenarios. Its official getting started guides were dated 2007, despite the significant changes the platform has gone through in the last couple of years. Another important point is that, while Cloud Foundry allows developers to set up their own cloud environments, deploying Micro Cloud is significantly more involved than just installing an SDK. That is a barrier that makes Cloud Foundry difficult for many developers.

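| Feature                                  | Amazon Beanstalk | CloudBees | Cloud Foundry | Google App Engine | Heroku for Java | OpenShift |
|------------------------------------------|------------------|-----------|---------------|-------------------|-----------------|-----------|
| IDE tools                                | No               | Yes       | Yes           | No                | No              | Yes       |
| Command line tools                       | Yes              | Yes       | Yes           | Yes               | Yes             | Yes       |
| Web-based console                        | Yes              | Yes       | No            | Yes               | No              | Yes       |
| Testing on dev machine                   | Easy             | Easy      | Hard          | Hard              | Yes             | Easy      |
| Build without non-standard dependency    | Yes              | Yes       | No            | No                | No              | Yes       |
| Source control integration               | No               | Yes       | Yes           | No                | No              | Partly    |
| Integrated build                         | No               | Yes       | No            | No                | No              | Yes       |
| Integrated testing                       | No               | Yes       | No            | No                | No              | No        |
| Access to logs via web                   | No               | Yes       | No            | Yes               | Yes             | Yes       |
| Third party developer / testing services | No               | Yes       | No            | No                | No              | No        |
| API access                               | Yes              | Yes       | No            | No                | Yes             | No        |
| Documentation                            | Good             | Good      | Poor          | Good              | Good            | Good      |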

 

Performance and Scalability

One of the most important features of PaaS is the platform's ability to auto-scale, that is, to increase and decrease server capacity based on real-time traffic demand. It requires the platform provider to load balance requests across a number of servers, monitor the load on each server, and spin up new servers as needed.
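The scaling decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not how any of the reviewed providers actually implements auto-scaling: the class name, the per-server load percentages, the target utilization and the instance limits are all made up for the example.

```java
/** Hypothetical sketch of a load-based scaling decision; not any provider's real algorithm. */
final class AutoScaleSketch {

    /**
     * Returns how many servers should be running so that the average load
     * per server is at most targetPercent, clamped to [min, max].
     *
     * @param loadPercents  current load of each running server, in percent (assumed metric)
     * @param targetPercent desired maximum average load per server, in percent
     * @param min           lower bound on the number of servers
     * @param max           upper bound on the number of servers
     */
    static int desiredInstances(int[] loadPercents, int targetPercent, int min, int max) {
        if (targetPercent <= 0) {
            throw new IllegalArgumentException("targetPercent must be positive");
        }
        int total = 0;
        for (int load : loadPercents) {
            total += load;
        }
        // Total demand divided by what one server should carry, rounded up.
        int needed = (total + targetPercent - 1) / targetPercent;
        return Math.max(min, Math.min(max, needed));
    }

    public static void main(String[] args) {
        // Two servers at 90% with a 60% target: scale up.
        System.out.println(desiredInstances(new int[] {90, 90}, 60, 1, 10)); // prints 3
        // Three servers at 10%: scale down.
        System.out.println(desiredInstances(new int[] {10, 10, 10}, 60, 1, 10)); // prints 1
    }
}
```

A real platform would also smooth the measurements over time to avoid starting and stopping servers on every short spike.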
All PaaS providers support auto-scaling to some extent. But auto-scaling is harder than it looks. For starters, the Java EE application must be configured to access a centralized external database as opposed to a database server co-hosted on the same server. The programming paradigm and tools for all PaaS providers need to force the developer to do that.
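One common way to keep an application pointed at a centralized external database is to read the connection string from the environment instead of hard-coding a co-hosted server. The sketch below assumes a hypothetical variable name, MONGO_URL, and a hypothetical local default; each provider documents its own mechanism and names.

```java
import java.util.Map;

/** Hypothetical sketch: resolve the database location from the environment. */
final class DatabaseConfigSketch {

    // Assumed variable name for illustration; real providers use their own names.
    static final String ENV_KEY = "MONGO_URL";

    // Fallback for local development only (assumed value).
    static final String LOCAL_DEFAULT = "mongodb://localhost:27017/appdb";

    /** Returns the configured connection URI, or the local default when none is set. */
    static String resolveMongoUri(Map<String, String> env) {
        String configured = env.get(ENV_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return LOCAL_DEFAULT;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        // In a deployed app the map would be the real process environment.
        System.out.println(resolveMongoUri(System.getenv()));
    }
}
```

Passing the environment in as a map keeps the lookup easy to test without touching the real process environment.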
An even bigger problem is HTTP sessions. In Java application servers, the state of HTTP sessions is managed in memory by default. To build applications that can be load balanced across different servers, the developer must do one of the following:

  • Configure the load balancer to support "sticky sessions". That is, the load balancer inspects the session ID of every incoming request and always directs requests in the same session to the same server behind it. While this is the simplest approach, it has problems: the load balancer needs to perform more work, the load distribution can become unbalanced over time, and it is difficult to scale down the infrastructure when demand drops, since each server will own some sessions. Because of these issues, few PaaS providers support this option.
  • Set up a shared cache for in-memory HTTP sessions. This way, all servers have all HTTP sessions in-memory at all times. However, replicating the in-memory sessions across a cluster is both bandwidth and computationally intensive. It requires work on the application developer’s side to set up the shared cache and replication strategies.
  • The application could also be configured to persist all HTTP sessions into the external relational database.
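To make the first option concrete, here is a minimal sketch of sticky routing that picks a server by hashing the session ID. It is an assumption for illustration only: real load balancers typically track sessions with cookies or routing tables, and the class and method names are hypothetical. The example also shows the scale-down problem mentioned above, because the same session lands on a different server once the pool shrinks.

```java
/** Hypothetical sketch of sticky-session routing by hashing the session ID. */
final class StickyRouterSketch {

    /** Returns the index of the server that should handle this session. */
    static int serverFor(String sessionId, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(sessionId.hashCode(), serverCount);
    }

    public static void main(String[] args) {
        // The same session always maps to the same server while the pool is stable.
        System.out.println(serverFor("abc", 4)); // prints 2
        System.out.println(serverFor("abc", 4)); // prints 2
        // After scaling down to 3 servers the session moves, and its
        // in-memory state on the old server is lost.
        System.out.println(serverFor("abc", 3)); // prints 0
    }
}
```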

Of all the PaaS platforms reviewed, Google App Engine handles this problem best. Google App Engine is architected to abstract away the notion of individual servers. It automatically creates data stores on separate servers, and saves HTTP sessions into the data store by default. The process is completely transparent to developers. However, the issue with Google App Engine is that raw performance is poor. It is not uncommon for a web request to take 1-3 seconds to complete a round trip to the database.
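The idea of keeping session state in a shared store rather than in one server's memory can be sketched as follows. Everything here is hypothetical: the interface and class names are invented, and an in-process map stands in for the external data store or database. It does not reflect Google App Engine's actual API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction over an external session store. */
interface SessionStoreSketch {
    void save(String sessionId, String key, String value);
    String load(String sessionId, String key);
}

/** Stand-in for an external store; a real one would talk to a database or cache. */
final class SharedMapSessionStore implements SessionStoreSketch {
    private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

    @Override
    public void save(String sessionId, String key, String value) {
        data.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
    }

    @Override
    public String load(String sessionId, String key) {
        Map<String, String> attributes = data.get(sessionId);
        return attributes == null ? null : attributes.get(key);
    }
}

/** An app server that holds no session state of its own. */
final class StatelessServerSketch {
    private final SessionStoreSketch store;

    StatelessServerSketch(SessionStoreSketch store) {
        this.store = store;
    }

    void setUser(String sessionId, String user) {
        store.save(sessionId, "user", user);
    }

    String getUser(String sessionId) {
        return store.load(sessionId, "user");
    }

    public static void main(String[] args) {
        SessionStoreSketch shared = new SharedMapSessionStore();
        StatelessServerSketch serverA = new StatelessServerSketch(shared);
        StatelessServerSketch serverB = new StatelessServerSketch(shared);

        // The first request hits server A, the next one hits server B.
        serverA.setUser("session-1", "alice");
        System.out.println(serverB.getUser("session-1")); // prints alice
    }
}
```

Because neither server owns the session, the load balancer can send each request anywhere, at the price of a store round trip per request.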
Heroku for Java also provides automatic session sharing across server instances because each of its server instances is wrapped around a custom Jetty instance. However, Heroku does not provide transparent auto-scaling. You will have to watch the dashboard and add resources to the app as needed.
For the rest of the standard Java offerings, all of them do a good job forcing the developer to create database tables on a separate, dedicated database server as part of their deployment process. For HTTP sessions, Cloud Foundry uses sticky sessions in its load balancer. As we discussed above, while it makes life easy for developers, it also has some serious scalability issues. The rest of the PaaS offerings leave session management to application developers, although it is not always clear from their documentation.

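| Feature                           | Amazon Beanstalk | CloudBees | Cloud Foundry | Google App Engine | Heroku for Java | OpenShift |
|-----------------------------------|------------------|-----------|---------------|-------------------|-----------------|-----------|
| Built-in load balancer            | Yes              | Yes       | No            | Yes               | Yes             | Yes       |
| Custom domain for load balancer   | Yes              | Yes       | No            | Google Apps       | Yes             | Yes       |
| Auto-scaling of app server        | Yes              | Yes       | Planned       | Yes               | No              | Yes       |
| Auto-scaling of database          | No               | No        | No            | Yes               | No              | No        |
| User defined performance criteria | Yes              | Yes       | Planned       | No                | No              | Yes       |
| Web-based monitoring dashboard    | Yes              | Yes       | No            | Yes               | Yes             | Yes       |
| Clustered HTTP session            | Manual           | Manual    | Manual        | Auto              | Auto            | Manual    |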

 

Pricing and Business Concerns

The pricing of those PaaS offerings is an important consideration for developers. Most service providers offer free service tiers for developers to try out. For smaller Java web sites, those free tiers are excellent choices.
However, as Google App Engine’s recent price hike controversy indicated, cost for high volume web applications can be quite high with PaaS providers.
Another important factor to consider is the availability of support options. Google App Engine and Amazon Web Services both have poor track records providing support. Developers are left on their own to find out answers on forums. Smaller providers with Java specialty tend to provide better technical support, even on public forums. In my view CloudBees provides the best combination of paid ticket-based support, and Java-specific technical know-how amongst support staff.

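| Feature                                   | Amazon Beanstalk | CloudBees       | Cloud Foundry | Google App Engine | Heroku for Java | OpenShift |
|-------------------------------------------|------------------|-----------------|---------------|-------------------|-----------------|-----------|
| Free tier                                 | No               | Yes             | N/A           | Yes               | Yes             | Free      |
| Cost for low traffic entry level web apps | High             | Free            | Free          | Free              | Free            | Free      |
| Cross cloud provider                      | No               | Yes             | Planned       | No                | No              | Planned   |
| Private cloud                             | No               | Yes             | Yes           | No                | No              | Planned   |
| Support                                   | Forum            | Email and Phone | Forum         | Forum             | Email and Phone | Forum     |
| Support quality                           | Poor             | Good            | Good          | Poor              | Okay            | Good      |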

 

What’s Next

In this article, we reviewed 6 well-known vendors in the Java PaaS space. There are, of course, other smaller or lesser-known providers. Examples include:

  • Jelastic: It supports a wide array of combinations of application servers and databases, including variations of the MySQL database and NoSQL databases.
  • WSO2 StratosLive: It is a PaaS offering built on the WSO2 application server, which is a Java EE compliant application server.
  • CumuLogic: It provides a Java application environment based on OpenStack.

We will keep a close eye on these vendors as they could easily grow up to challenge both the market share and mind share of bigger players.
PaaS for Java has come a long way in the past 12 months. The product offerings are still evolving fast. That is great news for Java developers looking for low-cost, scalable, and hassle-free hosted solutions. For Java EE developers, I believe that CloudBees and OpenShift offer the "best of breed" services so far, and with OpenShift still in beta, CloudBees is the winner of this comparison in this highly competitive landscape. If you are willing to venture outside of the Java EE comfort zone, Heroku for Java and Cloud Foundry (beta) are worthy contenders to the venerable Google App Engine.

1 Comment

  1. Nice comparison, but it seems a bit dated. Could you add the date the article was written? OpenShift is no longer in beta & I see no mention of docker.
