Blog on all Things Cloud Foundry

Creating a BOSH Release for Admin UI, a Monitoring Tool for Cloud Foundry

Alexander Lomov

Some Cloud Foundry components must be available, even if Router fails. Admin UI, a monitoring tool from the Cloud Foundry incubator, is a good example of a utility that you want to have access to no matter what. Getting updates directly from the NATS messaging bus, it gives admins access to CF components, their logs, statistics on DEAs and applications deployed to them, user rights, and other things not available in the Cloud Foundry CLI.

BOSH releases help to achieve high availability by installing important components outside the main CF deployment. By doing so, you can avoid exposing them with Router or deploying them as apps. In addition, they provide the easiest way to bind Cloud Foundry components and custom services.

In this post, I share my experience with creating a BOSH release (not yet available at the time this was published) for a new version of Admin UI. I also provide a temporary workaround for those who need to deploy Admin UI now and cannot wait for the next BOSH release.



Building a Custom BOSH CPI for the Cloud Foundry PaaS: A GCE Example

Alexander Lomov

Portability and cross-platform compatibility are the fundamental principles and also key advantages of the Cloud Foundry PaaS. Despite that, until now, its architecture supported a limited number of cloud platforms: OpenStack, AWS, vSphere, vCloud, and Warden. However, thanks to the efforts of the community some new names have been added to the list of available IaaS vendors. At the end of May, Pivotal released its Google Compute Engine CF-BOSH CPI. Developers are currently discussing ways to create a CPI for Microsoft Azure in the BOSH Developers Google group. Finally, the BOSH team have released an experimental version of the external CPI that can serve as a new way for creating CPIs.

In this post, I will share my experience with developing a custom CPI for Cloud Foundry using the standard CPI mechanism. Read on to learn about the issues I have encountered and get some tips on how to address them.


Hadoop Distributions: Comparison and Top 5 Trends

Kirill Grigorchuk

Ever wondered how Hadoop distros differ from each other? In a recent article for NetworkWorld, I give an overview of how Hadoop became what it is today and explore the differences between the standard edition and the distributions from Hortonworks, Cloudera, and MapR. I also provide insights into 5 major trends that are shaping their evolution in terms of features, ecosystem, enterprise adoption, etc.

Read the article to learn about:

– The top 5 trends currently affecting the evolution of Hadoop distributions
– Why enterprises need Hadoop distros and how they differ
– How YARN has solved the issues present in Hadoop 1.0
– What will become of Hadoop in the foreseeable future

Continue to the article at NetworkWorld: “Comparing the Top Hadoop Distributions.”


Architectures of OpenShift and Cloud Foundry PaaS—in a Nutshell

Alexander Lomov

When I was digging into the architectures of Cloud Foundry and OpenShift for a high-level overview, it became obvious that many of their major components have similar functionality:

  • Routers manage user traffic.
  • Working Nodes are used to stage and run Web applications.
  • Managers are the components that manage and monitor Working Nodes and take care of them in case of failures.
  • The Messaging Bus enables collaboration between different parts of a distributed PaaS.

Although similar in functionality, these components use different technologies:

Component / Function | OpenShift | Cloud Foundry
Router | Brokers, HAProxy Gears | Router
Working Nodes | Gears | Warden containers within DEA
Messaging Bus | ActiveMQ | NATS
Managers | Brokers | Cloud Controller, Health Managers
Providing resources and services to applications | Cartridges | Buildpacks and services

OpenShift uses unified abstractions to work with applications in the cloud. Its Gears are designed to run apps with separated access to shared resources and are implemented as lightweight containers. Cartridges provide the actual functionality necessary to run a user application. They add support for programming languages and access to various databases. Cartridges are the add-ons that contain binaries, setup, and control scripts that make it possible to deploy and maintain the functionality of applications.

OpenShift’s Brokers are the point of contact for all application management activities and traffic. They are implemented as daemons responsible for managing user logins, DNS, and application status. Nodes and Broker Support Nodes (BSN) represent the lower layer of the OpenShift architecture: Nodes are the physical machines where the Gears are allocated, while BSN are the Nodes that run Brokers. BSN and Nodes are connected through a Messaging Bus—for this purpose, OpenShift uses ActiveMQ.

Cloud Foundry has more direct and straightforward abstractions. The dynamic routing layer (Router) handles all the traffic. It resolves application traffic and developers’ Cloud Foundry REST API requests. Cloud Controller is responsible for all management tasks. Being the main endpoint for the Cloud Foundry REST API, it uses the UAA module to authenticate and authorize users. Health Manager monitors the status of applications and takes appropriate action when it changes: for example, if an instance is down, it automatically triggers the system to create a new one. The NATS messaging system processes notifications. Services provide everything that an application may need to work properly with other resources, such as databases and external services.

Both OpenShift and Cloud Foundry support container management. Red Hat provides application container isolation via Docker, while Cloud Foundry relies on Warden. Although Docker and Warden serve the same purpose—providing resource isolation to applications (CPU, memory, etc.)—they use different approaches to achieve this. However, with the Decker project, Cloud Foundry is now a step closer to Docker.

It is also worth mentioning that one of the greatest advantages of Cloud Foundry is support for Heroku buildpacks. These bundles of detection and configuration scripts can deploy applications, as well as install all the application dependencies. When you push an application, the system automatically applies the appropriate buildpack—or you can set it manually.
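To make the detection step more concrete, here is a minimal Python sketch of how a platform might pick a buildpack. It is an illustration only: the `BUILDPACKS` table, the marker files, and the `select_buildpack` function are assumptions made for this example, not Cloud Foundry code. Real buildpacks ship their own detection scripts.

```python
# Hypothetical marker files used for illustration; real buildpacks
# decide for themselves through their own detection scripts.
BUILDPACKS = [
    ("ruby", {"Gemfile"}),
    ("nodejs", {"package.json"}),
    ("python", {"requirements.txt", "setup.py"}),
]


def select_buildpack(app_files, requested=None):
    """Return a buildpack name for an app, given its top-level file names."""
    if requested is not None:
        # A buildpack set manually overrides automatic detection.
        return requested
    files = set(app_files)
    for name, markers in BUILDPACKS:
        # The first buildpack whose marker files are present wins.
        if markers & files:
            return name
    raise LookupError("no buildpack detected the application")


if __name__ == "__main__":
    print(select_buildpack(["package.json", "index.js"]))
    print(select_buildpack(["main.go"], requested="go"))
```

The ordered list mirrors the idea that detection is tried buildpack by buildpack until one claims the application, while the `requested` argument stands in for setting the buildpack manually.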

You can find more details on the topic in my basic overview of OpenShift and Cloud Foundry (features and architectures). The document is by no means exhaustive, but only part of a bigger project. Currently, I’m occupied working on Juju charms for Cloud Foundry together with Pivotal and Canonical, so this comparison will be finalized a bit later.


The Cloud Foundry Foundation: a PaaS Revolution?

Renat Khasanshyn

The open source project Cloud Foundry exploded overnight. Just as we are witnessing the tipping point of the Ukrainian revolution, something similar just happened in the cloudy world of IT. Pivotal announced that a group of major IT vendors will form the Cloud Foundry Foundation, a legal entity aimed at providing a formal governance body for the open source software project. The group united around Cloud Foundry pulls a combined market capitalization of almost half a trillion dollars. IT superpowers have spoken.

What happened today might be the beginning of a new era for IT folks in the vendor ecosystem and their customers.

The facts

  • The founding “Platinum sponsors” of the Cloud Foundry Foundation are IBM, HP, Rackspace, SAP, Pivotal, and Pivotal’s parents—VMware and EMC. Each of these companies made a pledge to pay $500,000 per year to the foundation for at least three years. ActiveState and Savvis joined as “Gold sponsors”, pledging $250,000 per year.
  • Cloud Foundry is licensed under the Apache License 2.0. This is important because, compared to the General Public License, the Apache License is much more vendor-friendly. It allows modifications without the requirement to open source these modifications, thus empowering a larger ecosystem of vendors to monetize the code base.
  • According to my observations, over the last 12 months, there were around 100 monthly active contributors to the project.
  • The size of the project’s code-base is no joke. At the moment, it is approaching 600,000 lines of code.

What does this announcement mean for enterprise IT customers?

With this announcement, backers of the foundation are sending their customers a simple message: the traditional way of delivering applications is outdated, and Cloud Foundry will fix it. Here is why: At Altoros, we see three types of customers—those that use Cloud Foundry to make money (as the backbone of an innovation engine while they search for or roll out new business models); those that use Cloud Foundry to save money (by reducing the cost of compliance or increasing the utilization of infrastructure); and those that use Cloud Foundry to deliver productivity gains.

Capturing these benefits is not an easy job, especially when your IT teams design and manage almost each and every development stack nuts-to-bolts, in most cases—one at a time. What we learned is that the companies who became really good at capturing these benefits are using PaaS to limit the number of permutations in their development stacks all the way from virtual machines to the applications. Instead of providing flexibility for each and every stakeholder on each and every layer of the application architecture, they aim to provide a set of pre-defined, pre-approved combinations of run-time environments and database services in the form of APIs, focusing their application development teams on agility and business results.

What are the goals of the Cloud Foundry Foundation?

The goal of the Cloud Foundry Foundation is to promote the development and adoption of Cloud Foundry. Those of you who are bullish on the PaaS technology should read the goal as “to establish the project as a new standard for delivering software applications.”

Was forming the foundation really necessary?

To understand why the Cloud Foundry Foundation was created, let me share with you how the Cloud Foundry project got to where it is today.

  • As of fall 2012, the Cloud Foundry project was led by VMware almost single-handedly. At the time, VMware drove adoption of the technology through a handful of small vendors mainly pushing a hosted “Heroku-style” PaaS offering based on Cloud Foundry. Yet, a few large companies became early adopters of private deployments, including Warner Music Group, Sky, NTT, and Rakuten.
  • For the vast majority of enterprise IT folks, any given open source project is not “solvent,” unless it commands a diverse community of active contributors. When it comes to attracting community and contributors, many companies who were interested in sponsoring the Cloud Foundry project faced a competitive conflict of interest with VMware, either directly or indirectly. Because of that conflict of interest, the growth of the Cloud Foundry ecosystem was limited.
  • In March 2013, EMC and VMware put together their business units (Cloud Foundry, Pivotal Labs, Greenplum, Spring, and Cetas) and spun them off as a separate company named Pivotal. The company employed a total of 1,250 employees and generated about $300 million in annual revenue. EMC commanded a 69% interest in Pivotal, with the remaining 31% owned by VMware.
  • In April 2013, General Electric announced a strategic investment of $105 million in Pivotal. In exchange, GE received 10% of interest in the company. The investment aimed to support GE’s Internet of Things strategy, providing access to technology and talent.
  • In July 2013, IBM announced that it was joining the Cloud Foundry project.
  • Between July 2013 and December 2013, governance of the project became a community-driven process led by IBM. Some project decisions started to happen in collaboration with representatives of 10+ companies, including ActiveState, Altoros, Anynines, Canonical, IBM, Intel, Piston Cloud, Savvis, and Warner Music Group.
  • By the end of 2013, this “community-driven process” became a “working prototype” for the forthcoming Cloud Foundry Foundation.

How will the Cloud Foundry Foundation help vendors and customers?

Speaking the language of the revolution, the foundation offers a set of rules, a legal basis for enforcement of such rules, and—most importantly—guarantees.

  • Guarantees for vendors. In an industry that changes as fast as ours, building new revenue models with open source software requires the roadmap of the “project core” to match the strategy of a given vendor. The decisions about what to include or exclude from that “core” are pretty much the currency of the game. Voting rights on every level of the project (from forming a definition of what the “core” is or is not to accepting or denying an individual contribution) are the denomination of that currency. The charter of the foundation defines, among other things, the currency and its denomination. It specifies in precise detail how the money (decision power and voting rights) is distributed between citizens (various stakeholders in the project). In a nutshell, the foundation provides guarantees to stakeholders in exchange for source code contributions or funding.
  • Guarantees for enterprise customers. Enterprise customers, on the other hand, are also building new (or protecting existing) revenue models. They want to avoid vendor lock-in and de-risk long-term technology liabilities. The foundation addresses the requirements of a typical due-diligence process, provides the needed transparency, and naturally pulls the vote of confidence from the conservative side of the enterprise IT.

What does the announcement mean for IT vendors?

Legacy vendors (infrastructure/hardware, software, and hosting) are forced to reinvent their businesses every now and then. The remarkable rise and domination of AWS disrupted many in the vendor ecosystem. Amazon is known for changing the landscape of many (but not all) industries it has played in. Legacy vendors have to move very fast. Cloud Foundry provides a perfect ground for many vendors to unite around a common goal.

Here is what some of the foundation members could do with Cloud Foundry:

  • SAP could package Cloud Foundry to an as-a-Service layer above its database, creating a Hana-as-a-Service offering. If it does well, SAP might even build the successor to its NetWeaver business on Cloud Foundry.
  • Cloud Foundry may help Rackspace to better fulfill its new mission as a hybrid cloud service provider, delivering a hybrid IaaS/PaaS offering.
  • IBM. Last week, I got a chance to play with IBM BlueMix, a PaaS offering based on Cloud Foundry. Interestingly, BlueMix already contains Liberty, the cloud-ready version of the IBM WebSphere Application Server, SSO and mobile backend as a service. It seems like IBM is creating a unified platform for the cloud infrastructure, big data, and their existing middleware business. Cloud Foundry could very well play one of the key roles in their product strategy.
  • For HP, Cloud Foundry may help in differentiating its software-defined infrastructure all the way from servers to applications. Its strength could be in the granularity of control, which HP’s customers could have over the form of infrastructure ownership, covering every type of deployment—from legacy to in-house private cloud, to hybrid and public clouds. In a deployment built for one of Altoros’s customers, Cloud Foundry performed remarkably well together with HP’s Moonshot, a next-generation 45-node, $65k-$100k micro-server chassis.

One vendor from the Cloud Foundry ecosystem was not present among the members of the foundation. It was Canonical, one of the key beneficiaries of the announcement. Ubuntu, Ubuntu OpenStack, and Cloud Foundry all enjoy perfect synergies, with the exception of conflicting deployment tool chains (Cloud Foundry’s BOSH vs. Canonical’s Juju; both can deploy anything, anywhere). In one of my next posts, I will cover in depth why the union of Cloud Foundry and Ubuntu OpenStack changes the game for many.

The question is—did anyone lose? Well, the PaaS vendors competing with Cloud Foundry just got more work to do, and lots of it. Red Hat will now face even more challenges in building an ecosystem around OpenShift. I really hope that folks at Red Hat will work with the Cloud Foundry community to standardize around run-time and services, which are now incompatible. Wouldn’t it be nice to see the cloud moving to the next level of interoperability, at least at the PaaS level?

Related study:

A High-level Overview of OpenShift and Cloud Foundry PaaS: Features and Architectures


Performance of RAID Arrays on Windows Azure: an Alternative to Horizontal Scaling

Sergey Balashevich

While working with several different NoSQL databases heavily loaded with write requests, we faced a situation where the hard drive became a bottleneck. Scaling the cluster horizontally could easily solve this kind of problem, but it would also increase the monthly payments. This is why we decided to take a look at other options.

The first thing that comes to mind when a DB starts experiencing HDD performance issues is to combine several virtual drives into a RAID array, but how will it work with Windows Azure virtual infrastructure? To check this, we compared the performance of a single virtual drive and different RAID arrays (types: 0, 1, 4, 5, and 6) using the Bonnie++ tool for hard drive subsystem verification.

Below you will find the test results and step-by-step instructions on how to configure a RAID array on your own.


Test 1: RAID performance under Write/Read/Re-write workloads

In the first test, we measured the performance of different RAID arrays for simple read/write operations:

sudo bonnie++ -d /raid1/ -m 'raid1' -u root -n 100:8192:16384:20 -x10 -s 16g -f > raid1.csv

Bonnie++ was run 10 times (-x10). Each test worked with 100 files of 8-16 KB in size and 20 subdirectories. In total, there were 16 GB of “files” in each iteration. Since a large Windows Azure instance has 7 GB of RAM, a 16 GB data set is large enough to keep caching from distorting the results.
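As a rough sketch of how the command above is put together, the Python helper below assembles the same flags and refuses to build a run whose data set would fit in RAM. The function name `bonnie_args` and its parameters are made up for illustration; the flag values are copied from the command in this post.

```python
def bonnie_args(target_dir, label, size_gb, ram_gb, runs=10):
    """Build a bonnie++ argument list like the one used in this test.

    Raises ValueError when the data set is not larger than RAM, because
    cached reads could then distort the measured throughput.
    """
    if size_gb <= ram_gb:
        raise ValueError(
            f"data set ({size_gb} GB) must exceed RAM ({ram_gb} GB) to avoid caching"
        )
    return [
        "bonnie++",
        "-d", target_dir,            # directory on the array under test
        "-m", label,                 # label that appears in the CSV output
        "-u", "root",                # user to run the test as
        "-n", "100:8192:16384:20",   # file creation test spec, copied from the post
        f"-x{runs}",                 # number of repetitions
        "-s", f"{size_gb}g",         # total size of the test data
        "-f",                        # fast mode: skip per-character I/O tests
    ]


if __name__ == "__main__":
    # Only print the command line; a large instance is assumed to have 7 GB of RAM.
    print(" ".join(bonnie_args("/raid1/", "raid1", size_gb=16, ram_gb=7)))
```

Generating the argument list in one place makes it easier to repeat the same run against each array (raid0, raid1, and so on) by changing only the directory and label.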

You can see the first test results below. The x-axis stands for megabytes per second; the y-axis indicates repetitions (we ran each test 10 times).

Write test results:




Pivotal One & Cloud Foundry – Great Promise, Great Challenge

Renat Khasanshyn

Paul Maritz and company just launched Pivotal One and Pivotal CF, and reminded me yet again what a great platform Pivotal is about to become. Earlier today in a Twitter chat led by John Furrier of SiliconAngle, we discussed whether Pivotal One is vapor–or real. This post expands my opinion on the subject.

Great Opportunity

Pivotal’s promise is that using their tool set, an IT architect can fill the entire “meat section” between raw virtual machines and an application. All in one shot. Indeed, at Altoros, we see more and more customer deployments that involve every piece of the pie: NoSQL/NewSQL and Hadoop data stores, real-time and analytics engines, messaging, and apps deployed and scaled with the help of a PaaS layer.

To the naked eye, Pivotal’s offering makes a lot of sense, as it brings a one-stop solution that addresses quite a lot of the needs of next-generation application architecture at an average enterprise IT shop.

Great Challenge(s)

On March 13, 2013, when the Pivotal Initiative was announced, a high bar was set for the company. That is, to achieve $1B in revenue in 5 years. I believe that a few things should come together for this to happen.

If Pivotal can solve two key challenges–making a quantum leap in market leadership for a few more of their products and integrating the entire product suite into a single platform–they will probably not only achieve $1B in revenue in 5 years, but will have an amazing shot at becoming the bellwether of enterprise software moving ahead.

Challenge #1 – “best of breed incumbents”

Competing products are quickly becoming “best of breed” incumbents in five categories of next generation enterprise software where Pivotal is playing:

  1. Hadoop
  2. Massively Parallel RDBMS, with focus on analytics
  3. Next generation databases
  4. Analytics
  5. PaaS



Evaluating the Apriori Algorithm as a Base for a Recommendation Engine

Sofia Parfenovich

Data analytics for a large online store involves a number of challenges. Product data may be complex by nature and reach terabytes in size, your data stores may be (geo-) distributed, association algorithms may require significant memory resources, etc.

One of our customers needed a recommendation engine for a media streaming service to increase sales. My task was to develop a model that would provide relevant movie suggestions to users. Due to the extremely large size of data, the customer wanted to avoid using clustering, which groups data based on purchasing history. The decision was to go with the Apriori algorithm that builds association rules based on frequent sequences found in transactions. However, when working with real data, we stumbled upon some limitations.
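For readers new to the technique, the following Python sketch shows the general idea behind Apriori on a toy scale. It is not the model built for the customer: it works on unordered itemsets rather than sequences, and the function names, thresholds, and sample transactions are all invented for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets present in at least
    min_support (a fraction) of the transactions."""
    min_count = min_support * len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that occur often enough.
        level = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                level[c] = count
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = {
            a | b
            for a in level
            for b in level
            if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence) for rules above the threshold."""
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    yield lhs, itemset - lhs, confidence


if __name__ == "__main__":
    # Invented viewing histories, one set of titles per user.
    history = [
        {"Alien", "Aliens", "Predator"},
        {"Alien", "Aliens"},
        {"Alien", "Predator"},
        {"Aliens", "Predator"},
        {"Alien", "Aliens", "Terminator"},
    ]
    for lhs, rhs, conf in rules(frequent_itemsets(history, 0.5), 0.7):
        print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

Each pass rescans every transaction, which hints at why memory and processing time become a concern once the data reaches the sizes described above.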

In my most recent research, “Using the Apriori Algorithm for a Movie Recommendation Engine,” I came up with:

  • an overview of 4 most popular data processing algorithms for building association rules
  • 3 ways to speed up processing and decrease data size when working with big data
  • 3 methods that can improve the quality of search recommendations based on association rules
  • pros and cons of implementing the Apriori algorithm for building association rules
  • 10 diagrams that illustrate the theory and our findings

Download the white paper to learn more about the Apriori algorithm and what other options (such as clustering) you may have for building a recommendation engine. (Note: The document will be updated with more findings within a month.)


Overview of Performance Bottlenecks in Hadoop

Dmitriy Kalyada

Jack Vaughan, Editor at TechTarget, gathered information on the main Hadoop performance bottlenecks in his article. Big data engineers from Yahoo and other members of the Hadoop ecosystem, myself included, dispel the myths about linear performance increase. I also explain why sometimes it is better to opt for commercial distributions.

Read the full article to learn about the main challenges that Hadoop implementations face and how they can be addressed.



Data Visualization Tools: Flot vs. Highcharts vs. D3.js

Igor Zalutsky

Today’s Web applications deal with massive data sets that require high-performance systems for processing and analysis. However, information becomes even more valuable if you can efficiently visualize it.

We have prepared a comparison of three widespread but very different JavaScript libraries to see how they cope with big data and real-time visualization. The libraries were selected based on popularity, performance, implementation approach, and relevance:

1)    Flot, an open source jQuery plug-in designed for drawing diagrams in Canvas
2)    Highcharts, one of the most popular proprietary libraries
3)    D3.js, a large open-source framework for data visualization

Below is a brief comparative table that will give you a general idea of what big data and real-time visualization capabilities you can expect from these three tools.

Download this document to get a more detailed comparison of Flot, Highcharts, and D3.js with 16 sample diagrams, a vendor-independent overview, as well as information on required code size, platform support, etc.

