My Gripes with Cloud Foundry

by Subbu Allamaraju on March 22, 2012 · 9 comments

I tried setting up Cloud Foundry this week using Micro Cloud Foundry. See my post on ql.io on Cloud Foundry for details. The setup and experience are not that hard, though it took two late nights to figure out the details. But I’m not sure it is ready for prime time. Here are my observations.
  • Cloud Foundry has the notion of multiple apps on a given VM, but is this the right model? I would let the VM worry about isolating CPU, memory and disk so that apps remain independent of one another and tuning can be done independently. The multi-app model may be okay for several small apps running on a single VM, but not otherwise.
  • Cloud Foundry uses an L7 router (nginx) between clients and apps. The router needs to parse HTTP before it can route requests to apps. This approach does not work for non-HTTP protocols like WebSockets. Folks running node.js are going to run into this issue, but there are no easy fixes in the current architecture of Cloud Foundry.
  • As of now, I can’t get chunked responses to work for node.js apps deployed on a Cloud Foundry VM. See https://gist.github.com/2164855 for an example.
  • In the real world, I need to explicitly control routing and load balancing at a tier in front of my apps. But Cloud Foundry adds one more unnecessary routing tier on each VM.
  • There is no support for rolling upgrades. Maybe there are ways to enable this, but it is not clear from the documentation. So availability becomes a manual exercise.
  • The story about logging, monitoring and metrics is vague. I see some references to third-party plug-ins for certain kinds of apps, but support seems to be missing in the PaaS layer.

If you find that my understanding is incorrect, please help clarify and set us in the right direction. My interest is in getting the node.js stack to run well on Cloud Foundry.

Update: A work-around for the chunked encoding issue is to always add the Content-Length to responses.
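
The work-around above can be sketched roughly as follows. This is a minimal illustration, not the code from ql.io: the `sendJson` helper and the payload are hypothetical names made up for the example. The idea is to serialize the whole body first so its byte length is known, then set Content-Length explicitly so node.js does not fall back to chunked transfer encoding.

```javascript
const http = require('http');

// Hypothetical helper (not from ql.io): buffer the whole body up front so
// that its length in bytes is known, then send it with an explicit
// Content-Length header instead of letting node.js chunk the response.
function sendJson(res, payload) {
  const body = Buffer.from(JSON.stringify(payload), 'utf8');
  res.writeHead(200, {
    'Content-Type': 'application/json',
    // Buffer length is in bytes, which is what Content-Length requires.
    'Content-Length': body.length
  });
  res.end(body);
}

// Assumed usage: every response goes through sendJson.
const server = http.createServer((req, res) => {
  sendJson(res, { status: 'ok' });
});

// To try it locally, listen on a port of your choice, for example:
// server.listen(3000);
```

The trade-off is that the response has to be fully buffered before the first byte is sent, so this gives up streaming in exchange for a response the router passes through intact.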

9 comments

Mark Nottingham March 22, 2012 at 2:06 pm

The response that you link to is double-chunked; note the *two* “chunked” transfer encodings. Ew.

Subbu Allamaraju March 22, 2012 at 2:09 pm

What can I say! The node.js app is serving a chunked response. Without CF, the response is:

< HTTP/1.1 200 OK
< X-Powered-By: ql.io/node.js v0.6.10
< vary: accept-encoding
< Server: ql.io-0.4.12/node.js-v0.6.10
< Date: Thu, 22 Mar 2012 22:08:29 GMT
< Connection: keep-alive
< Transfer-Encoding: chunked
< content-type: application/json
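
One way to compare these headers with what comes back through the router is to count how often the chunked coding appears in the raw response headers. The sketch below is only an illustration: `countChunkedCodings` is a hypothetical helper, and the `doubled` input is a made-up example, not captured output.

```javascript
// Hypothetical helper: count occurrences of the "chunked" coding across all
// Transfer-Encoding values in a rawHeaders-style array, which has the shape
// [name1, value1, name2, value2, ...].
function countChunkedCodings(rawHeaders) {
  let count = 0;
  for (let i = 0; i + 1 < rawHeaders.length; i += 2) {
    if (rawHeaders[i].toLowerCase() !== 'transfer-encoding') continue;
    // A single header value may list several codings separated by commas.
    for (const coding of rawHeaders[i + 1].split(',')) {
      if (coding.trim().toLowerCase() === 'chunked') count += 1;
    }
  }
  return count;
}

// Headers as served directly by the node.js app: one chunked coding.
const direct = ['Connection', 'keep-alive', 'Transfer-Encoding', 'chunked'];
// Made-up input illustrating a doubled coding.
const doubled = ['Transfer-Encoding', 'chunked', 'Transfer-Encoding', 'chunked'];

console.log(countChunkedCodings(direct));  // 1
console.log(countChunkedCodings(doubled)); // 2
```

With node's `http` client, the array to pass in would be the response's `rawHeaders` property, which keeps the headers as they were received. A count above 1 points at an intermediary adding a second coding.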

Mark Nottingham March 22, 2012 at 2:17 pm

Maybe try sticking both back-end URLs into REDbot to see if it catches anything. If not, it looks like they’re not properly de-chunking responses, although I’d be really surprised if this was an nginx bug, given how widely deployed it is. Maybe they’re using some sort of custom client side?

Duke Leto (@dukeleto) March 22, 2012 at 4:45 pm

An interesting dissection about trying to get ql.io running on #cloudfoundry by @sallamar http://t.co/580PlC7d

Ken Robertson March 22, 2012 at 8:06 pm

I agree there is room for a few improvements. Here are a few counter points from my own observations:

* Cloud Foundry itself doesn’t do anything with VMs. It doesn’t provide anything at the IaaS level. You might be mislabeling it as the DEA. The DEAs have some built-in handling for ensuring instances are more balanced between nodes, but I think some of that changed in the latest release, supposedly for the better.

* Although it would be nice to have more control over the nginx layer, if you wanted lower level routing of requests, you’d basically need your own IP per application or non-standard external ports. This isn’t necessarily practical either.

* If running your own Cloud Foundry setup, you could customize the process to have the router in front of, or behind, something non-nginx like an F5, and then look at running nginx on each DEA with a port/config per app. I’ve been looking at doing something along these lines, putting nginx closer to the app to make it more customizable.

* Restarts (or deploys overall), metrics, and monitoring are definitely on my mind with the PaaS I’ve been building.

Subbu Allamaraju March 23, 2012 at 7:05 am

* “Cloud Foundry itself doesn’t do anything with VMs. It doesn’t provide anything at the IaaS level. You might be mislabeling it as the DEA. The DEAs have some built-in handling for ensuring instances are more balanced between nodes, but I think some of that changed in the latest release, supposedly for the better.”

You’re right about VM vs DEA. Thanks for correcting.

* “Although it would be nice to have more control over the nginx layer, if you wanted lower level routing of requests, you’d basically need your own IP per application or non-standard external ports. This isn’t necessarily practical either.”

I would prefer having no L7-level interception. AFAIU, the router is a consequence of the particular architecture chosen for the DEA, and would not be needed otherwise.

Jeromy Carriere March 25, 2012 at 4:10 am

Good observations, Subbu. Our experiences, in building our private cloud for X.commerce, align well with Ken’s:

* We’re building from the IaaS up, so the application/DEA/VM allocation is completely within our control. We’re currently doing 1-1-1 allocation to simplify our resource management, but will likely move to Linux containers in the future, to improve efficiency.

* Like Ken notes, we’re going to, over time, need to look at changing where nginx sits in the architecture. Right now, it’s worked fine (aside from minor issues), and has let us get up and running very quickly. Our application architecture is uniform, such that we can push routing and load balancing out of the application’s control.

* Rolling upgrades, logging, monitoring, etc. are things we’re implementing as part of our cloud build.

And thanks for getting chunked responses working quickly!

Ramashri Umale March 26, 2012 at 4:25 pm

Here’s how rolling upgrades can be performed.
https://gist.github.com/923303

Subbu Allamaraju March 26, 2012 at 4:37 pm

Awesome. Thanks for the info.
