Design Strategies to Prepare for Azure and SQL Data Services
November 5, 2008 4:05 PM

Introduction

Microsoft Azure is the new cloud-based computing platform from Microsoft.  Azure is a platform that includes web hosting, computing, messaging, and storage components.  In addition to the raw components in Azure, Microsoft has additional complementary services that can be composed within an Azure solution to add even more power.  These include (but are not limited to) Live Services, .Net Services, and SQL Services.

As the senior architect of ADXSTUDIO CMS, I am going to share our perspective on the strategies that we took to successfully port our CMS product into the cloud using Azure, Live ID, and SQL Data Services. This article is not an introduction to Azure or SQL Data Services, but a guide for architects on how to design their applications for Azure and other Microsoft cloud services. The suggestions and guidance I am going to give can be applied to your existing applications, even if you are not going to host them in the cloud. I would recommend that all architects consider integrating these methodologies into their own projects so that they have the power of choice in the future to select where to host the applications - on premises, with a partner, or in the Microsoft cloud.

Azure – Development

You can think of Azure as a sandboxed implementation of IIS 7. The great thing is that you are going to be able to use the same tools and technologies that you already use for web development. However, there are subtle but important differences that you need to consider.

A web role in Azure is a Web Application project (as opposed to a website project). The differences between the two come down to the existence of a project file, the concept of compiling your website, and the use of CodeBehind instead of CodeFile in your page and user control directives. Many of us became accustomed to the website project model, where we could just save a file and refresh our browser and the website would reflect the change - that made for very rapid web development. To use Azure, you should become familiar with the Web Application project and get used to compiling your sites to see the changes. The nice thing is that you can do this now with your existing sites to prepare them for easy insertion into Azure at a later date. If you have a website project with a lot of pages, you can convert it to a web application and have all of the files adjusted to use CodeBehind instead of CodeFile. You will also no longer need to store all of your classes in an App_Code folder, as every class that needs to be compiled is included in the project file.

I would recommend that your web developers get used to using Vista with IIS 7 and convert your website projects to the Web Application Project model that is included in Visual Studio 2008.  With Azure, you cannot use the IIS administration tool to configure the website - you are required to use the web.config to manage the few IIS settings you can.  This is another great reason to get your developers to start their development cycle on Vista, which supports the IIS 7 integrated pipeline.

Azure – Partial Trust

A large majority of web developers do not have a good understanding of the security model baked into the .Net framework. If you don't know what code access security is, Azure will give you the reason you need to educate your developers on that feature of .Net. When you host a site in Azure (including the development fabric), the 'website' runs in partial trust mode. Many websites today are designed to be run in full trust mode, which basically means your code can do whatever you have programmed it to do. When you run in a partial trust environment, your code will be restricted to only the 'safe things' that are allowed in that mode. The mode that Azure is using is most similar to what is known as medium trust in ASP.NET, but there are some subtle differences. When you are running in Azure, your application will not have access to the registry, most of the file system, or environment variables, will not be able to use reflection, and cannot run unmanaged code. This last one can be a gotcha even if you think you don't use unmanaged code - in our case, we were using mutexes from the .Net Framework and those in turn used unmanaged code to obtain a lock.

Running your website in medium trust today on IIS is a good idea from a security standpoint, so I recommend that the first thing you do is to flip your website into medium trust mode and see what breaks. Once your website runs in medium trust mode, you are well on the way to having your code running in Azure. The one significant difference between 'Azure trust' and medium trust is that medium trust in IIS will allow you to use SQL, whereas in Azure you are not allowed to do that. I will cover the data modeling requirements later in this article.

Your web application may require the use of other components. In .NET, you simply add a reference to the component and the DLL is brought into the project at compile time. This is the same with Azure, but because Azure is not running in full trust, the first thing you may notice is that you cannot bring in a reference to some external components. For a component to be used in any mode other than full trust (known as partial trust), you must decorate the assembly with the AllowPartiallyTrustedCallers attribute. That is simple if you have the source code for the assembly; otherwise, you may have to contact the external vendor and get an updated version that is designed to run in Azure.

Azure – Authentication and Security

When you deploy your web application to your own servers, your servers typically live within the network security of your DMZ, and more frequently than not, you have Active Directory in the background for authentication as well as group policies.  When you deploy to the Azure cloud, you don't have any influence over the network configuration of the machine, including any local or domain accounts.  For all intents and purposes, you must consider that you don't have any domain nor any access to local machine security.  As such, you need to ensure that you are using membership and role providers in your web.config, but you cannot use the Active Directory, SQL, or Windows membership providers included in the .Net Framework.  Fortunately, the Azure team provides sample membership and role providers in the Azure SDK that you can use.

The strategy that we took was to integrate tightly with Live ID. Microsoft has announced the availability of their Active Directory Service Connector that will allow an organization to federate their Active Directory with Live ID. One of the key scenarios in our situation was that clients don't want to have to manage another set of credentials for each application that is hosted in the cloud - they want to use their own LAN credentials. When we use Live ID and a customer uses the new Active Directory Service Connector, users can authenticate into the cloud transparently using their own network credentials. We already had a membership provider in our CMS that was integrated with Live ID, so it was a natural fit to use in Azure. I would recommend that every application architect seriously consider the new Active Directory federation capabilities, along with Live ID and the Access Control capabilities provided by .Net Services, when designing an application that may eventually be hosted in the cloud.

Data in the Cloud

Cloud computing requires that you use data services instead of a connection to a SQL database server. As such, Azure does not give you access to a SQL database. Porting your application to run in the cloud will mean that you need to design your application to use a cloud-capable data model. You are going to spend most of your porting efforts in this area.

You have multiple choices for data storage when you are using Azure. Azure comes with a rudimentary blob and table storage mechanism, and you also can use SQL Data Services, which has an entity model and blob storage. The two models are similar but not identical. One of the main differences today is that the query capability in SQL Data Services is much more powerful. We chose to integrate with SQL Data Services mainly because of the richer query capability as well as the stronger security and federation capabilities. There are other advantages to using the storage model in Azure, so I would recommend that you consider both platforms and pick the best one for your situation.

These cloud storage models resemble SQL in some ways and are quite unlike it in others. You have to think of them as a data service, so if you already subscribe to an SOA approach, this will be second nature to you. If you are still using traditional client/server approaches, you are going to have to take some extra time to get your head around some of the design principles. I would recommend training your developers on modern SOA approaches as soon as you can.

Here are some of the differences you are going to find from using traditional SQL Server for your database storage:

  1. You do not have stored procedures or any server-side logic. Azure storage and SDS are raw data storage APIs that presently do not have any server-side processing logic.
  2. You do not have any identity columns (auto-incrementing integers). Your existing SQL model may be reliant upon integer identities for primary keys and foreign key references.
  3. SDS has no concept of a fixed schema. SDS uses a dynamic property bag to store attributes of an entity. Further still, these attributes can be of different types. Relational database models are highly structured, so it is going to feel very different for you to use these new data services.
  4. There are no referential integrity constraints in these new data services. Your application is going to have to take care of its own data consistency instead of relying on the schema of the database to do that for you.
  5. There is no transaction model. You cannot write to multiple entities and issue a commit or a rollback.
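
To make points 4 and 5 concrete, here is a minimal sketch of what "the application takes care of its own consistency" can look like. It is written in Python for brevity rather than .Net, and the `EntityStore` class and `add_page_with_tag` function are hypothetical stand-ins invented for illustration, not part of Azure storage or SDS. The idea is simply that the parent check and the manual undo replace what a foreign key and a rollback would have done in SQL Server.

```python
import uuid


class EntityStore:
    """In-memory stand-in for a schema-less cloud entity store (hypothetical)."""

    def __init__(self):
        self._entities = {}

    def insert(self, entity_id, **properties):
        if entity_id in self._entities:
            raise KeyError(f"entity {entity_id} already exists")
        # A property bag: any set of attributes, of any type.
        self._entities[entity_id] = dict(properties)

    def get(self, entity_id):
        return self._entities.get(entity_id)

    def delete(self, entity_id):
        self._entities.pop(entity_id, None)


def add_page_with_tag(store, site_id, title, tag_name):
    """Insert a page and its tag, enforcing what the SQL schema used to enforce."""
    # Referential integrity: the application checks that the parent exists.
    if store.get(site_id) is None:
        raise ValueError("unknown site: refusing to insert an orphaned page")

    page_id = str(uuid.uuid4())
    store.insert(page_id, kind="page", site=site_id, title=title)
    try:
        if not tag_name:
            raise ValueError("tag name is required")
        tag_id = str(uuid.uuid4())
        store.insert(tag_id, kind="tag", page=page_id, name=tag_name)
    except Exception:
        # No rollback is available, so undo the first write by hand.
        store.delete(page_id)
        raise
    return page_id, tag_id
```

A compensating delete like this is weaker than a real transaction (another reader can see the page between the two writes), which is worth keeping in mind when you decide how to group your entities.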

Data Modeling Recommendations

There are a lot of steps that you can take today with your existing data model, even if you are not prepared to move completely into the cloud.  Taking these approaches today will make it easier to move into Azure / SDS in the future when it becomes important for you to do so.

Consider an ORM Model

There are a number of strong ORM modelling solutions that can be used to generate classes around your data model. The best thing you can do is to design classes for your data and business logic as soon as you can. This will provide a solid framework that will reduce the impact on your application of moving your data from SQL to the cloud in the future. This approach will mean that you no longer will be embedding SQL statements throughout your application - only in your data access layers. Centralizing this in your data access layers will enable you to change the storage model later without having to rewrite your entire application.
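
As a rough illustration of keeping SQL inside the data access layer, the sketch below uses Python with an in-memory SQLite database standing in for SQL Server. The `TagData` class and `tag_cloud` function are hypothetical names chosen for this example. The business logic only ever calls methods on the class, so the SQL text could later be swapped for calls to a cloud data service without touching `tag_cloud`.

```python
import sqlite3


class TagData:
    """Data access class: the only place that knows tags live in SQL."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS tags (id TEXT PRIMARY KEY, name TEXT)"
        )

    def add(self, tag_id, name):
        self._db.execute(
            "INSERT INTO tags (id, name) VALUES (?, ?)", (tag_id, name)
        )

    def names(self):
        rows = self._db.execute("SELECT name FROM tags ORDER BY name")
        return [row[0] for row in rows]


def tag_cloud(tag_data):
    """Business logic: works against the class, never against SQL text."""
    return ", ".join(name.lower() for name in tag_data.names())


tags = TagData(sqlite3.connect(":memory:"))
tags.add("1", "Azure")
tags.add("2", "SDS")
print(tag_cloud(tags))
```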

Migrate stored procedures to your class framework

I have been a longtime proponent of moving ALL of your business logic out of SQL stored procedures and into your class framework. My apologies to the SQL Server team - no offense meant, but T-SQL is the most rudimentary language we have in our development toolsets today. It is not object-oriented, and it does not have polymorphism, inheritance, or any of the other goodies we take for granted in our standard programming languages (like C#). It has rudimentary handling of strings, and it does not have a comprehensive framework behind it. I have found that it is much better to encompass your business logic in your class framework. Now, since Azure and SQL Data Services have no stored procedure capabilities, you are going to have to seriously look at moving your business logic into .Net classes that can be deployed with your application.

Guids are a 'unique' solution

Many database developers use identity columns for primary keys. Identity columns are integers that automatically increment, making them a very convenient choice for a primary key on a table. There are two considerations that you need to look at: Azure and SDS do not have identity columns, and you have to provide your own identity for entities as you insert them. The typical pattern for using identity columns is to insert a row and then retrieve the @@identity value after the insert to obtain the key of that new record. When you program with cloud data services, your application needs to generate the identities of new entities before they are inserted. The best thing that the .Net framework has for that purpose is a Guid. The new pattern that you should get used to is for your application to generate a new Guid, then insert the data item with that as its unique identifier.
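
The generate-then-insert pattern can be sketched in a few lines. This example is in Python, where `uuid.uuid4()` plays the role that `Guid.NewGuid()` plays in .Net; the `new_entity` helper and the entity shapes are assumptions made up for illustration.

```python
import uuid


def new_entity(kind, **properties):
    """Build an entity whose identity is assigned by the application."""
    entity_id = str(uuid.uuid4())  # comparable to Guid.NewGuid() in .Net
    return {"id": entity_id, "kind": kind, **properties}


# The key is known before anything is written, so related entities can
# reference each other without a round trip to read back a generated value.
page = new_entity("page", title="Home")
tag = new_entity("tag", name="news", page=page["id"])
```

Because the identifier exists before the insert, there is nothing equivalent to @@identity to read back afterwards.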

If you want to prepare your existing SQL model to make it easier to move towards a cloud data service in the future, I highly recommend you start using SQL's uniqueidentifier type in your SQL schemas today. A couple of hints so that you don't run into trouble - do NOT use a clustered index on a Guid column, and consider marking the new column as the ROWGUIDCOL.

Provider Pattern

Microsoft introduced the provider pattern in .NET 2.0, and the most famous providers are the membership, role, profile, and site map providers. The key to the provider pattern is that it has a common interface and Microsoft gives us the plumbing so that a provider is wired up in the configuration file instead of in code. Providers are designed to be pluggable and changed at deployment time.

We highly recommend that you design your data models around a provider pattern, so that you can have a separate provider for SQL storage that you use now or when you deploy on premises, and an SDS provider that you can use when you deploy in Azure and need to use cloud data storage. In our case, we had quite a large number of entity types in our data model. We decided to have a master data access provider, wired up in the web.config, that supplies all of the individual data providers for each 'family' (SQL or SDS). For example, we have site markers and tags as separate entities. We wire up the SQLDataAccessProvider in the web.config and that class gives us a reference to the SQL-based tag data provider class and the SQL-based site marker data provider. The SDS provider provides us with a different set of classes that all use SDS for data storage. In our business logic, we access the appropriate class by using DataAccessProvider.TagData to get a handle on the currently configured data provider. This abstracts away all of the plumbing for the individual storage providers from any of the business logic classes that consume or use the data. I recommend that you consider a design that is based on a provider model for any new development in your application. You can start now and integrate this within your existing system; you will be improving your code design and, at the same time, preparing your application for a future change of underlying storage models.
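
The shape of that master provider can be sketched as follows. This is an illustration in Python, not our .Net code: the class names, the `describe` methods, and the `dataAccessProvider` setting are all hypothetical, and a plain dictionary stands in for the web.config section that would name the provider in a real ASP.NET application.

```python
class SqlTagData:
    def describe(self):
        return "tags in SQL"


class SqlSiteMarkerData:
    def describe(self):
        return "site markers in SQL"


class SdsTagData:
    def describe(self):
        return "tags in SDS"


class SdsSiteMarkerData:
    def describe(self):
        return "site markers in SDS"


class SqlDataAccessProvider:
    """Master provider for the SQL family of data providers."""

    def __init__(self):
        self.tag_data = SqlTagData()
        self.site_marker_data = SqlSiteMarkerData()


class SdsDataAccessProvider:
    """Master provider for the SDS family of data providers."""

    def __init__(self):
        self.tag_data = SdsTagData()
        self.site_marker_data = SdsSiteMarkerData()


# Registry of available families, keyed by the name used in configuration.
PROVIDERS = {"sql": SqlDataAccessProvider, "sds": SdsDataAccessProvider}


def load_provider(settings):
    """Pick the provider family named in configuration, not in code."""
    name = settings["dataAccessProvider"]
    if name not in PROVIDERS:
        raise ValueError(f"unknown data access provider: {name}")
    return PROVIDERS[name]()


# Business logic asks the configured provider for the data class it needs.
provider = load_provider({"dataAccessProvider": "sds"})
print(provider.tag_data.describe())
```

Switching the setting from "sds" to "sql" changes every data class at once, which is the property that makes the same build deployable on premises or in the cloud.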

SDS - Slower than Sockets

I overheard one of our developers talking with a customer the other day: "SDS is slower than sockets". That phrase stuck in my mind, and it brings up an important consideration. For the most part, our web servers are not that far away from our SQL servers, and with connection pooling, we are accustomed to very fast turnaround on SQL queries. This is different when you are using cloud data services. You are doing this over HTTP, and HTTP is a stateless protocol that does not maintain a constant connection between the client and server. That is one of the main reasons that cloud data services can scale very easily, but it creates another problem at the other end - there is a lot of overhead to set up a connection, query the data, and receive a response. Simple queries are no longer measured in milliseconds - they are measured in the hundreds or sometimes thousands of milliseconds. Unless you have already considered caching throughout your entire data model, you are likely to notice that your application runs very slowly in the cloud. As such, it is very important to consider the implications of adding a data caching layer into your application. Our developers have implemented a data caching layer just above the raw data providers to minimize some of the code and increase standardization, but caching is also implemented at many different layers. Caching is a topic that is too large to be covered in this blog post (stay tuned for more posts), but what I will say is that you need to consider data caching requirements in your application and you should pay close attention to Velocity and the improvements being made to the caching APIs in .Net 4.0.
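
To show the general idea of a caching layer sitting just above a raw data provider, here is a minimal sketch in Python. It is not our implementation: `CachingTagData` and `SlowTagData` are hypothetical names, the time-to-live policy is one simple choice among many, and the counter merely simulates the HTTP round trips that a real cloud provider would make.

```python
import time


class SlowTagData:
    """Stand-in for a cloud-backed provider; counts simulated round trips."""

    def __init__(self):
        self.round_trips = 0
        self._rows = {}

    def get(self, tag_id):
        self.round_trips += 1
        return self._rows.get(tag_id)

    def save(self, tag_id, value):
        self.round_trips += 1
        self._rows[tag_id] = value


class CachingTagData:
    """Wraps a slow data provider and remembers recent answers."""

    def __init__(self, inner, ttl_seconds=60.0, clock=time.monotonic):
        self._inner = inner
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # tag_id -> (expires_at, value)

    def get(self, tag_id):
        now = self._clock()
        hit = self._cache.get(tag_id)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from memory, no round trip
        value = self._inner.get(tag_id)  # the slow call to the data service
        self._cache[tag_id] = (now + self._ttl, value)
        return value

    def save(self, tag_id, value):
        self._inner.save(tag_id, value)
        self._cache.pop(tag_id, None)  # drop the stale entry on write
```

Because the wrapper exposes the same `get` and `save` methods as the provider it wraps, business logic does not need to know whether it is talking to the cache or to the data service. An in-process cache like this is only consistent within one web server instance, so a shared cache deserves consideration once you scale out.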

Our Experience

We received an early CTP version of the Azure developer cloud fabric in the first week of September, and we got our SDS account in the middle of September. We had four developers working on our product from then until PDC. In that time, those developers upgraded the CMS to support Azure and SQL Data Services and we were able to demonstrate it in our booth on October 27. In addition to that, they wrote a LINQ provider for our CMS, implemented ADO.NET Data Services, supported Live ID, and reworked our page model to support MVC. Finally, in the last three weeks leading up to PDC, we also built a demo website from scratch, integrated with CRM Online and Live ID, running in Azure, and using our CMS, to demonstrate in one of the CRM sessions. And they did this without working overtime. I am very proud of our product development team and the agility and productivity that they demonstrated throughout this process. Special credit also goes to a couple of members of the Microsoft Evangelist Teams for their guidance, encouragement, and support throughout this process.

We were lucky that we had made some critical design decisions over the past couple of years that made our journey into the cloud an easy one. Porting to Azure and SQL Data Services was actually fairly straightforward for us and a natural progression for our APIs. I was pleasantly surprised at how close our SDS data model was to our existing SQL data model, despite SDS looking like a completely different architecture. My advice is not to fear all of these changes, but to carefully consider your architecture and the guidance that I am giving here so that you can make your journey into the cloud a smooth one. Take some of these steps early so that you have the power to choose Azure hosting in the future.

Finally, if you are looking to build a modern website to host on-prem or in the Azure cloud, consider ADXSTUDIO CMS - the first CMS to support Azure and SQL Data Services.

Comments (3)
#re: Porting CMS to Azure
I work at design studio that is skeptical of Azure (which I'm not!). A case study of a fully fledged CMS working within Azure sounds very promising. I'd certainly be interested in posts about security within Azure for corporate usage and use of the WorkerRole (how many scheduled or workflow related tasks have you got set up as WorkerRoles?). PS. very minor typo, you've put "blog" storage rather than "blob" storage a couple of times.
12/10/2008 5:34:08 AM by Andy
#re: Great Article
Great article, very clear and concise. Thank you. My question is, are you suggesting that SOA style development will permanently do away with the 5 database tenets listed, or that this is a temporary limitation of the cloud technology. Those items are axiom from a database developers perspective. Without referential integrity, the world would fall apart in a matter of days; chaos would ensue.
12/18/2008 11:22:34 AM by Steven
#re:
Currently, those tenets are not in the web service. Microsoft has committed to adding more capabilities to their data services that are traditionally a part of a standard relational database, including transactions and referential integrity. That said, those are not there today, but you can implement them by using an SOA approach where all modifications to your data go through a web service layer instead of through raw data manipulation.
12/18/2008 11:27:14 AM by Shan McArthur