Twitter Scale

High Scalability has a great blog post about the architecture of Twitter: how Twitter grew from just a web application to handling thousands of tweets, and the internal architecture of its components. Last week I wrote a blog post about how a simple web application might evolve over time as requirements change, how coming up with a loosely coupled architecture between your components will save you a lot of time, and how it helps immensely when it comes to handling difficult problems. The article above is a wonderful real-life example of how Twitter evolved to handle a massive number of users and tweets. Twitter is all about posting messages that are only 140 characters long. How hard can it be to write a webpage to handle that?! It’s a great example of how simple things can lead to a complex architecture on the backend in order to give a responsive user experience.

The article has a lot of details about problems specific to Twitter and tweets. A couple of big-picture points I want to highlight (straight from the High Scalability article):

  • Twitter no longer wants to be a web app. Twitter wants to be a set of APIs that power mobile clients worldwide, acting as one of the largest real-time event busses on the planet.
  • Internal clients use roughly the same API as external clients.
  • 1+ million apps are registered against 3rd party APIs
  • Tweets are forked off in many different ways, mostly to decouple teams from each other. The search, push, interest email, and home timeline teams can work independently of each other.
  • For performance reasons the system is being decoupled.

Read the original article.

Understanding an Enterprise Service Bus (ESB)

This blog post tries to give you an understanding of what an Enterprise Service Bus is and what the typical functionalities of an ESB are: in what scenarios it can be utilized, for what purpose, and how it will help you solve similar problems in a consistent, standardized way. It also incorporates a wealth of knowledge from real-life production deployments. That’s important. If you have something that can solve world hunger, or claims to be the world’s best at something, and it’s not used in real-life production scenarios, it’s as good as dead. Such things tend to make great folk stories, like the bedtime stories you tell your geek younglings of how, once upon a time, this great and complex thing was built over a weekend with some pizza and Mountain Dew ;-)


Let’s forget about the ES part of ESB and concentrate on the B. The bus. This is not something new. You probably learned about buses when you read about PC architecture. If you look at how data is transferred from the CPU to memory, and how data is received from the keyboard, mouse and other peripheral devices plugged into a PC, they all use a bus. Why? The main reason is that it eliminates point-to-point links between those systems. In today’s world, where thousands of different peripherals can be plugged into a computer, imagine the mess it would create if all of them needed point-to-point links to the CPU, memory etc… So you have a single bus that everything connects to, and it takes care of transferring information back and forth between the connected peripherals.

Integration problems

Companies tend to use a lot of software for their day-to-day activities. This will only increase with time, not decrease. Things that were done using manual labour or hardware also tend to be converted to software, as this excellent blog explains. Using software is great: you can increase the efficiency of business operations. Even though this is true, there are instances where you want the data in one system to be fed into another, or to do some form of consolidation to get a unified overview of what’s going on. Otherwise there’s little to no advantage in having disconnected systems: you then need some kind of manual intervention to keep data in sync among all these different systems.

Collection of services

Because of the sheer tediousness of this, all these enterprise software systems started exposing their functionality as web services. Now you have easier ways of connecting one system to another. Importing data from one system and exporting it to another is just a matter of writing some glue code to call several web services. Just like that, your data synchronization issues between two systems can be solved.

When the number of systems increases, you have to maintain point-to-point links, each with different glue code connecting a pair of systems. Again, it gets error-prone and tedious once you have more than a handful of systems in place. It’s not going to be a scalable solution to the problem.

As you can see from the above image, it gets very complex and hard to maintain as the number of systems increases. To add to the complexity, each system or exposed service might be operating on its own message format, which you have to map from one format to the other in your glue code when connecting different systems.

This is where a bus architecture is useful again. A service bus. Since it’s connecting all enterprise systems, an enterprise service bus. I have no idea how the name actually came about, but it was probably something along those lines :-)

As you can see, this simplifies the process a LOT. Now you can move all the “glue code” logic you had to the ESB.
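To put rough numbers on that, here is a small Python sketch. It assumes the worst case where every system has to talk to every other system, which the diagrams above don’t necessarily claim; the function names are made up for illustration.

```python
def point_to_point_links(systems: int) -> int:
    """Links needed when every system is wired directly to every other one."""
    return systems * (systems - 1) // 2


def bus_links(systems: int) -> int:
    """Links needed when every system only connects to the bus."""
    return systems


# Compare how the two approaches grow as more systems are added.
for n in (3, 5, 10, 20):
    print(f"{n} systems: {point_to_point_links(n)} direct links vs {bus_links(n)} bus links")
```

With 10 systems that is 45 pieces of glue code to maintain against 10 connections to the bus, and the gap keeps widening as you add systems.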

Functions of an ESB

When you’re integrating different systems, the glue code needs certain functions, and these can differ from one system to another. You need to take all of them into account when you’re selecting an ESB, and an ESB should allow you to do all those things. Let’s see what they are.

Expose services

An ESB is, as the name says, a service bus. It should have the ability to expose services. Taking the above diagram as an example, it should connect to the CRM and expose the CRM’s connectivity as a service to other systems: CRMService, maybe. We call that service a proxy service because it proxies requests for the actual CRM service.

Message transformation

Since the message format one system accepts can be different from another’s, an ESB should allow you to transform one message format into another. Typically this means transforming one XML format into another XML format, but also XML -> JSON, JSON -> XML, binary -> XML and so on.
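As a rough illustration of the XML -> JSON case, here is a minimal Python sketch using only the standard library. The customer message is a made-up example, and the conversion is deliberately naive: it ignores XML attributes and would overwrite repeated child tags, which a real ESB mediator would have to handle.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Naive conversion: child tags become keys, leaf text becomes values."""
    children = list(element)
    if not children:
        return element.text
    return {child.tag: xml_to_dict(child) for child in children}


def xml_to_json(xml_text: str) -> str:
    """Parse an XML message and re-emit it as a JSON message."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: xml_to_dict(root)})


# Hypothetical message coming out of the CRM.
crm_message = "<customer><id>42</id><name>Jane</name></customer>"
print(xml_to_json(crm_message))
```

Note that everything comes out as a string (the id is "42", not 42), because plain XML carries no type information.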

Protocol transformation

There are instances where systems expose their services through different protocols: HTTP, HTTPS, JMS, FIX, FTP, SFTP, WebDAV etc… So an ESB should support accepting messages over one protocol and sending them out over a different one.

Routing messages

This is another commonly used function of an ESB. You receive a message and, based on certain values in the message, you want to route it to different systems/services. So an ESB should allow you to traverse the content of the message and filter on any attribute that’s there in the message content.
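A content-based router can be sketched in a few lines of Python. The message layout, the "type" attribute and the service names below are assumptions for the sake of the example, not something any particular ESB prescribes.

```python
import json

# Hypothetical routing table: value of the "type" attribute -> target service.
ROUTES = {
    "order": "OrderService",
    "invoice": "BillingService",
}
DEFAULT_ROUTE = "DeadLetterQueue"


def route(raw_message: str) -> str:
    """Look inside the message content and pick a destination."""
    message = json.loads(raw_message)
    return ROUTES.get(message.get("type"), DEFAULT_ROUTE)


print(route('{"type": "order", "id": 1}'))
print(route('{"type": "complaint", "id": 2}'))
```

The first message goes to OrderService; the second has no matching route and falls through to the default. In a real ESB the same decision is usually expressed as configuration (for example an XPath or JSONPath filter) rather than code.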

Message cloning/splitting/aggregation

Another useful functionality is being able to clone an incoming message and send it to one, two or several services that accept the same message format, all at the same time. The same goes for splitting a message before sending it, and aggregating messages that come back from different services.
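Here is a minimal Python sketch of the three operations. The order message, the price list and the services are all hypothetical, and the calls run one after another here, whereas an ESB would typically dispatch them in parallel.

```python
import copy


def clone(message, services):
    """Send an independent copy of the same message to every service."""
    return [service(copy.deepcopy(message)) for service in services]


def split(order):
    """Break one order message into one message per line item."""
    return [{"order_id": order["order_id"], "item": item} for item in order["items"]]


def aggregate(replies):
    """Merge the per-item replies back into a single summary message."""
    return {"items": len(replies), "total": sum(reply["price"] for reply in replies)}


def price_service(message):
    """Stand-in for a backend service that prices a single item."""
    prices = {"pen": 2, "book": 10}
    return {"item": message["item"], "price": prices[message["item"]]}


order = {"order_id": 1, "items": ["pen", "book"]}

# Split the order, price each part separately, then aggregate the replies.
summary = aggregate([price_service(part) for part in split(order)])
print(summary)

# Clone the untouched order to two (made-up) services at once.
results = clone(order, [lambda m: m["order_id"], lambda m: len(m["items"])])
print(results)
```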

There are many such forms of communication and different ways of processing messages. Based on this knowledge of systems and the different ways of connecting them and processing messages, you can identify certain patterns in these integrations.

Enterprise integration patterns

Fortunately, these have been documented in the excellent Enterprise Integration Patterns book. The book has a pattern catalog that has been extracted by looking at different real-life integration scenarios in the industry. An ESB should be able to support or facilitate implementing these patterns. When you see how different patterns can be implemented through an ESB’s configuration, it becomes easy to understand how a pattern is implemented, and if you need alterations to match it to your specific use case, you can make them very easily. Here are all the configurations showing how an ESB can be used to implement these enterprise integration patterns.

What Is Platform as a Service?

The tech industry is filled with acronyms. People invent new acronyms and buzzwords all the time, which adds to the confusion. One such acronym is PaaS - Platform as a Service. If you ask 5 people what PaaS means, you will probably hear 5 different stories. All of them would be right! So what is PaaS?

To answer that we need to ask what a platform is. The as-a-Service part is easy. You give something as a service. There are no downloads involved; it’s hosted somewhere on the Internet and is accessible through a browser or some other tool that knows how to communicate with a service hosted on the Internet. So what is a platform? This can mean many things and that’s where the confusion lies. A platform can mean,

  1. An operating system
  2. A programming language and associated libraries/frameworks
  3. A suite of products
  4. An application container (e.g.: application servers)

There are companies and products out there which provide a “PaaS” at all these different levels. They also refer to themselves as PaaS providers, or as companies that enable you to use a platform as a service. That is not wrong considering the different meanings of the word platform. Gartner, your friendly neighborhood research company, has tried to define these terms, and some additional ones, to clear up the confusion. Gartner defines PaaS as,

A platform as a service (PaaS) offering, usually depicted in all-cloud
diagrams between the SaaS layer above it and the IaaS layer below, is
a broad collection of application infrastructure (middleware) services
(including application platform, integration, business process management
and database services). However, the hype surrounding the PaaS concept
is focused mainly on application PaaS (aPaaS) as the representative of
the whole category.

This clearly states that a PaaS is about an entire middleware platform, not about any specific application server or programming language/framework. Gartner has also introduced some more acronyms to clear up this confusion: aPaaS and iPaaS.

Gartner’s definition of aPaaS is,

Application platform as a service (aPaaS) is a cloud service that offers
development and deployment environments for application services.

That covers offering an application server as a service, letting users develop and deploy applications on top of it.

Gartner’s definition of iPaaS is,

Integration Platform as a Service (iPaaS) is a suite of cloud services
enabling development, execution and governance of integration flows
connecting any combination of on premises and cloud-based processes,
services, applications and data within individual or across multiple

Even though the Wikipedia page for Google App Engine and the intro document on Google’s help site say it’s a PaaS, Google App Engine is not a PaaS. It’s an aPaaS. To be a PaaS according to the Gartner definition above, it has to offer a set of middleware services, as Stratos does, for example.

AppFactory Picks Up Where SourceForge Left Off

SourceForge, as its title says, is a website for finding, creating and publishing open source software for free. Some very popular projects are still hosted there. If you’re doing a technology-related job, chances are you have come across this website more than once. When you create a project on SourceForge you get all the infrastructure you need for the project: a source code repository, a support ticket system for tracking/reporting issues, a forum-like discussion medium, user reviews, a distribution system for releases, download tracking (with graphs generated for each version of a project release) and so on. This is all very useful. There are many such systems out there that allow you to create projects, host them and distribute releases. Google Code, Launchpad, GitHub and CodePlex are some of them. This seems like a good system to have if you’re a software development shop. If you have various projects going on, it provides an easier way to get builds to QA, a feedback system that the QA team can use to report bugs, and so on. There are open source projects that you can download and install to get a SourceForge-like system for yourself and your fellow developers. If you develop a lot of internal applications that are used inside an organization, this is immensely helpful for that too.

So that’s mainly about the application development aspects: where your infrastructure is hosted, how issue trackers are configured, what releases have been done etc… At this stage you would probably have configured automated build tools too, to run continuous builds from the source. Then there’s the other side: the application runtime. The application runtime will usually involve multiple environments for staging, QA, and production. At any given time an app can be in any of those stages. There was little to no software that would let you see what’s going on in this runtime space. Certainly no open source software that I was aware of.

Until now.

This is one gap that AppFactory is trying to fill. Each of those environments can be configured as a separate PaaS deployment, so you have your staging PaaS, your QA PaaS and your production PaaS. The entire application lifecycle can be managed through a web-based portal: deploying from your staging environment to the QA environment and subsequently into production can all be done through a web-based interface. This follows a checklist approach where you “tick off” the items that need to be carried out before moving from one environment to the next. If the criteria are not met, demotion is also possible. Further, AppFactory includes an issue tracker, a source repository, automated builds, application version management, and a place to create the resources your application will use, like DBs, APIs etc… So it helps at the application development stage too. Which project is at which stage, which products are in production now and in which versions: all of this is business-critical information, and AppFactory makes it visible through a web-based dashboard.
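To make the checklist idea concrete, here is a toy Python sketch of promotion and demotion between environments. This is not AppFactory’s actual API or data model; the stage names come from the description above and everything else (function names, checklist items) is invented for illustration.

```python
# Environments in the order an application moves through them.
STAGES = ["staging", "QA", "production"]


def promote(current: str, checklist: dict) -> str:
    """Move to the next stage, but only if every checklist item is ticked off."""
    pending = [item for item, done in checklist.items() if not done]
    if pending:
        raise ValueError(f"cannot promote from {current}, pending items: {pending}")
    index = STAGES.index(current)
    if index == len(STAGES) - 1:
        raise ValueError("already in production")
    return STAGES[index + 1]


def demote(current: str) -> str:
    """Send the application back to the previous stage."""
    index = STAGES.index(current)
    if index == 0:
        raise ValueError("already at the first stage")
    return STAGES[index - 1]


# Hypothetical checklist with everything ticked off.
checklist = {"code reviewed": True, "tests passed": True}
stage = promote("staging", checklist)
print(stage)          # QA
print(demote(stage))  # staging
```

Calling promote with an unticked item raises an error instead of moving the application, which is the whole point of the gate.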

Samisa has written a nice blog post on how AppFactory revolutionizes application development. Also, this mindmap about AppFactory puts it into the broader context of what it is and what problems it tries to solve.

Honda Insight as a Zombie Response Car?

I saw that someone had posted this picture on Facebook.

This is a very easy mistake to make. You would think that since the Honda Insight is a hybrid, it will give you better gas mileage to go further in a Zombie Apocalypse. But no, the Honda Insight is not a good car to be in in case of a Zombie Apocalypse, let alone to use as a response vehicle! Let me clarify.

Yes, you’re right: in a zombie apocalypse, petrol sheds will all be on fire and petrol will be a very scarce resource. However, getting good gas mileage from a car is the least of your worries when that outbreak happens. There will be debris all over the place and, of course, the zombies. You have to drive your way through all this to rescue people and/or to get away from the mess. When you’re driving through, you’re probably going to hit and run over more than a dozen zombies. Considering the impact that will have on the body of the car, your ride will not get you far. Yes, if your goal is to be discreet and hide for a couple of hours, a Honda Insight is perfect. It will be the last place a zombie would come looking for people. But definitely not as a response vehicle.

This brings us to the question: what would be a good Zombie Response Vehicle?

Image credits - Simon Williams

Since you will be trying to make your escape very fast, is an F1 car a good candidate? Not so. When you try to save someone and flee, you will only be able to save yourself, because F1 cars have only one seat. If a zombie inadvertently waves a hand as you’re passing by, then because of your speed and because your head is sort of exposed, there’s the possibility of serious neck damage. The debris can cause serious damage to the nose of the car as well. Besides, you have to be careful about your left rear tyre even on straight roads where you don’t get any zombies.

F1 car - No good.

Image credits - Joseph Thornton

Since we have a petrol shortage, will a Tesla Model S be a good candidate? Specifically the 85 kWh model. This seems a savvy choice at first, but there are a couple of minor downsides, one of which will get you bitten. You don’t want that. There are records of it outdragging an M5, so it’s well equipped for quick getaways. You can rescue about 4 - 5 people, 6 or 7 if you count the trunk space. As for zombies and debris, they can cause serious body damage. If you’re lucky there will still be electricity at the Supercharger stations, so your getaways are limited to roads leading to Supercharger stations. Also, you have to pray real hard that there will be no zombies after 265 miles. If there are, well … to put it mildly … you’re fucked. You really don’t want that from zombies.

Tesla Model S - No good.

So what would a good zombie response vehicle look like? It has to be fast: it needs a reasonable top speed so that you can gain some momentum and run over zombies, and the debris shouldn’t cause much body damage to the vehicle. It should also handle different terrains so you can take shortcuts and avoid heavily zombified areas that are otherwise too thick to run over. As it turns out, the most suitable zombie response car is not a car but a truck: the Mastiff, used by the UK Army. When you go through that page you’ll see it’s the perfect vehicle to have. You can rescue up to 8 people, and I’m sure in an emergency you could easily squeeze in 10 more. It’s fitted with guns and grenade launchers, and has a top speed of 90kph. Also, by the looks of it and commentary elsewhere, it seems to have torque figures that can pull Mars and swap it with the Moon.

So there, if you’re buying a Honda Insight expecting it to act as a good response vehicle in case of a Zombie Apocalypse, stop now! Buy the Mastiff instead!