Wednesday, July 14, 2010

Disaster Recovery Management (DRM) in the Cloud

Although a disaster may be unlikely in any given period of a system's life, the possibility of one occurring cannot be ignored. Done well, automated disaster recovery in the Cloud becomes largely indistinguishable from routine operations. Even if your entire Cloud infrastructure falls apart, you should have the capabilities in place to restore it on internal servers, at a managed hosting services provider, or at another Cloud provider within minutes or hours.

Through disaster recovery planning, you identify an acceptable recovery state and develop processes & procedures to achieve the recovery state in the event of a disaster.

Defining a disaster recovery plan involves two key metrics:

1. Recovery Point
2. Recovery Time

Recovery Point determines how much data you are prepared to lose, expressed as a number of hours or days of data. Recovery Time determines how much downtime is acceptable, expressed as the number of hours or days needed to get things fully operational once again.
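To make the two metrics concrete, here is a minimal sketch in Python that checks a single incident against a Recovery Point and a Recovery Time. The function names, timestamps and targets are hypothetical and chosen only for illustration; they are not part of any particular DRM product.

```python
from datetime import datetime, timedelta

def meets_recovery_point(last_backup, disaster_time, recovery_point):
    """True if the data written since the last backup fits within the Recovery Point."""
    return disaster_time - last_backup <= recovery_point

def meets_recovery_time(disaster_time, restored_time, recovery_time):
    """True if the system was fully operational again within the Recovery Time."""
    return restored_time - disaster_time <= recovery_time

# Hypothetical incident: nightly backup, failure the next morning.
last_backup = datetime(2010, 7, 13, 23, 0)
disaster = datetime(2010, 7, 14, 9, 30)
restored = datetime(2010, 7, 14, 11, 0)

# Assumed targets: lose at most 24 hours of data, be back within 2 hours.
rp_ok = meets_recovery_point(last_backup, disaster, timedelta(hours=24))
rt_ok = meets_recovery_time(disaster, restored, timedelta(hours=2))
print(rp_ok, rt_ok)
```

In this made-up incident 10.5 hours of data are lost and the outage lasts 1.5 hours, so both targets are met. Tightening the Recovery Point to 8 hours would make the same backup schedule fail the check.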

A DRM plan in the Cloud should consider these four major factors before going into effect:

1. Backup and data recovery (fast, secure, easy)
2. Geographical presence of data center (redundancy of data center according to geographical presence)
3. Recovery Time and Point (should be ~0 hours if possible)
4. Monitoring tools residing in a third cloud service provider’s infrastructure (as monitoring systems cannot live in either your primary or secondary cloud provider’s infrastructure)

WOLF has a DRM plan which expects zero downtime and zero data loss by having multiple synchronized data centers in different geographic locations. In other words, we operate multiple data centers from different infrastructure providers, with dedicated, high-bandwidth connections, serving customers in different geographies.

Though the cold reality is that the cost of losing 24 hours of data is less than the cost of maintaining a zero downtime/zero loss of data infrastructure, we still try to emphasize this level of redundancy in our DRM plan, which gives our customers 99.96% service availability and zero loss of data. 
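As a rough illustration of what an availability percentage means in practice, the sketch below converts it into a downtime budget. The helper name and the 8,760-hour year (leap years ignored) are assumptions made for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in a period at the given availability."""
    return period_hours * (1 - availability_percent / 100)

yearly = allowed_downtime_hours(99.96)
monthly = allowed_downtime_hours(99.96, period_hours=30 * 24)
print(f"{yearly:.2f} hours per year, {monthly * 60:.1f} minutes per 30-day month")
```

Under these assumptions, 99.96% availability leaves roughly 3.5 hours of downtime per year, or a little over 17 minutes in a 30-day month.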

We have implemented a strict backup and recovery management plan for all applications, covering both design and encrypted data, with a daily on-site backup at a different data center. WOLF application designers can also take a backup of the design & data at any time with the click of a button! Thus we guarantee zero data loss with a maximum downtime of approx. 2 hours as per our DRM plan.

With the acceptability of the Cloud and related services increasing daily, the DRM plan could become a point of competitive difference between vendors in the Cloud.

Wednesday, July 7, 2010

Two faces of the SaaS simplicity coin

While there is a consensus that "less is more" with SaaS, application developers/SaaS designers need to consider a couple of factors when contemplating application features for an end customer on the web:

1. Tech-aversion of the target user/segment
2. The real world analogy of a feature

For a non-technical end user, the application developer should incorporate "props" that increase usability. These include pictures, icons or even larger text that draw the user's attention or prioritize a certain process.

For example, a customer in the logistics domain emphasized the importance of setting up driver details and vehicle routes before any data was recorded in the system. The developers used icons and pictures of different sizes & colors to indicate priority.

In addition to this, a feature which can be cut down to a single step for a tech-savvy user may need to be stretched out for the target segment/user. In doing so, the system emulates the real world scenario as closely as possible and reduces both erroneous data and the need for user training, thereby increasing adaptability.

To illustrate this, let's consider the tyre life estimation process; tyres, along with maintenance, are one of the major costs for logistics companies. The SaaS designer deciphered the process as shown below:

Consultation with the maintenance staff revealed a need to track the history of all tyre checks at every point in the process (the number of initial checks, checks for retreading, etc.) as well as the staff involved in each check. Hence a process that could be simplified may instead need to be scrutinized and elaborated for the sake of usability.
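One way such a check history could be modeled is sketched below in Python. The class names, field names and stage labels ("initial", "retreading") are hypothetical and only illustrate the idea of keeping every check and the staff behind it; they do not describe the actual application.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class TyreCheck:
    stage: str        # assumed labels, e.g. "initial" or "retreading"
    checked_on: date
    staff: str        # person who performed the check

@dataclass
class Tyre:
    tyre_id: str
    checks: List[TyreCheck] = field(default_factory=list)

    def record_check(self, stage, checked_on, staff):
        """Append a check instead of overwriting the previous one."""
        self.checks.append(TyreCheck(stage, checked_on, staff))

    def count(self, stage):
        """Number of checks recorded for a given stage."""
        return sum(1 for c in self.checks if c.stage == stage)

    def staff_involved(self):
        """Alphabetical list of everyone who has checked this tyre."""
        return sorted({c.staff for c in self.checks})

tyre = Tyre("T-001")
tyre.record_check("initial", date(2010, 6, 1), "Asha")
tyre.record_check("initial", date(2010, 6, 15), "Ravi")
tyre.record_check("retreading", date(2010, 7, 1), "Asha")
print(tyre.count("initial"), tyre.count("retreading"), tyre.staff_involved())
```

Keeping each check as its own record costs the user a few extra steps, but it preserves exactly the history that a single "current status" field would lose.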

Do you have any similar experiences to share? Please share your stories in the comments.