2014 was definitely a year of Big Data. But with all the new developments in Big Data, I almost missed one very important new trend in the industry. I think it will change the way we build, deploy and package software, whether it is Big Data frameworks, web sites, Java apps or Python libraries. I believe that in 2015 this technology will take center stage in the minds of software engineers. I am talking about Docker. Docker's concept of the container has gained support not only from the open-source community, but also from industry heavyweights like Amazon, Google and Microsoft. All of them have worked together to contribute to Docker development and adopt it in their cloud offerings. So what is Docker and why is it important?
Docker is a new way for developers and admins to package, ship and run applications, whether on a laptop, a data center VM or in the cloud. Docker is a tool that helps solve common problems with installing, removing, upgrading, distributing, trusting, and managing software. Think of it as a virtualized container that runs your software. But in contrast to VM technology, Docker is very lightweight and fast.
Let’s discuss some common problems and how Docker solves them and makes our lives easier.
Software installation and management.
One of the problems that Docker solves is software installation. Installing software is complex. I have spent several hours installing complex software only to find out that some dependencies are not compatible with my target OS or that some important packages are missing. At the end of the day, manual installations are fragile and not reproducible. Tools like Chef or Puppet try to solve this problem, but even with those tools it’s not easy at all. Docker, on the other hand, makes it a breeze to install and run complex software. And the best part is that it does not matter which operating system you install it on. For example, it took me a couple of minutes to install and start the graphite and statsd combination by issuing only one command:
sudo docker run -d --name graphite -p 80:80 -p 2003:2003 -p 8125:8125/udp hopsoft/graphite-statsd
and that's all! Needless to say, I can remove it just as easily.
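To make the removal concrete, here is a minimal sketch of how the graphite container started above could be torn down. It assumes the container name graphite and the image name from the command above; the run helper and the DRY_RUN variable are my own illustration, not part of Docker. By default the script only prints the commands, so you can read them before running anything.

```shell
#!/bin/sh
# Sketch: tear down the "graphite" container started by the command above.
# DRY_RUN is a hypothetical switch for this example: it defaults to 1, so the
# commands are only printed. Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_graphite() {
    run sudo docker stop graphite                  # stop the running container
    run sudo docker rm graphite                    # delete the stopped container
    run sudo docker rmi hopsoft/graphite-statsd    # delete the downloaded image
}

remove_graphite
```

Stopping and removing the container leaves the host untouched; removing the image as well frees the disk space it took.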
Over time our machines become junk drawers. Each software package we install brings dependencies, and those dependencies bring other dependencies, which in turn may be incompatible with packages you install later. Keeping a computer clean becomes a maintainability problem. Docker provides a way to install and run each program in its own lightweight container. Containers are like shipping containers: from the outside, to the operating system, they all look the same, but the content of each container is different. This allows us to install and clean up software without affecting the underlying OS. With Docker, removing software is also easy and clean.
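As a small illustration of that isolation, the sketch below runs two versions of the same tool side by side, each in its own throwaway container, so neither has to be installed on the host. It assumes the python image with tags 2.7 and 3.4 can be pulled from the public registry; the function name and the DRY_RUN variable are hypothetical. As written it only prints the commands.

```shell
#!/bin/sh
# Sketch: two versions of one tool, each in its own throwaway container.
# Assumption: the "python" image with tags 2.7 and 3.4 is available to pull.
# DRY_RUN is a hypothetical switch: it defaults to 1, so commands are printed.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

python_version_in_container() {
    tag="$1"
    # --rm deletes the container as soon as the command inside it exits.
    set -- sudo docker run --rm "python:$tag" python --version
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

python_version_in_container 2.7
python_version_in_container 3.4
```

Because each container carries its own dependencies, the two versions never conflict, and nothing is left behind on the host apart from the cached images.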
With Docker, you can run the same software on Windows, OS X and Linux. That means that your desktop, your development environment, your company’s server, or your company’s cloud can all run the same programs. It helps software developers better understand the systems that will be running their programs. It means fewer surprises. In effect, you are running the same OS and the same software stack in every environment.
Docker containers make life easier for Big Data engineers as well. Docker containers can run on YARN and take advantage of YARN resource management, which can make installing complex distributed systems a breeze.
I predict that the Docker platform will be under active development in 2015. I expect more people to run Docker containers in production. New ways of building continuous deployment pipelines will be discovered. I highly recommend that you start digging into Docker and how it can help you and your company.
The best place to start is, of course, the Docker web site and the Docker newsletter.