Monday, September 27, 2010

I am a big fan of Steve McConnell's body of work. I came across an interesting piece that deserves prompt sharing with practitioners of software engineering.

Some of the outlined descriptions of mistakes make for an interesting read and evoke a feeling of déjà vu in those who have been through them.

Abandonment of planning under pressure
Projects make plans and then routinely abandon them when they run into schedule trouble. This would not be a problem if the plans were updated to account for the schedule difficulties. The problem arises when the plans are abandoned with no substitute, which tends to make the project slide into code-and-fix mode.

Adding people to a late project (this may sound like a lift from the famous Mythical Man-Month, but unfortunately people have yet to learn that lesson)

When a project is behind, adding people can take more productivity away from existing team members than it adds through new ones. Adding people to a late project has been likened to pouring gasoline on a fire.

Assuming global development has a negligible impact on total effort
Multi-site development increases communication and coordination effort between sites. The greater the differences among the sites in terms of time zones, company cultures, and national cultures, the more the total project effort will increase. Some companies naively assume that changing from single-site development to multi-site development will have a
negligible impact on effort, but studies have shown that international development will typically increase effort by about 40% compared to single-site development.

Code-like-hell programming
Some organizations think that fast, loose, all-as-you-go coding is a route to rapid development. If the developers are sufficiently motivated, they reason, they can overcome any obstacles. This is far from the truth. The entrepreneurial model is often a cover for the old code-and-fix paradigm combined with an ambitious schedule, and that combination almost never works.

Confusing estimates with targets
Some organizations set schedules based purely on the desirability of business targets without also creating analytically-derived cost or schedule estimates. While target setting is not bad in and of itself, some organizations actually refer to the target as the ‘estimate,’ which lends
it an unwarranted and misleading authenticity as a foundation for creating plans, schedules, and commitments.

Developer goldplating
Developers are fascinated by new technology and are sometimes anxious to try out new capabilities of their language or environment or to create their own implementation of a slick feature they saw in another product— whether or not it’s required in their product. The effort required to design, implement, test, document, and support features that are not
required adds cost and lengthens the schedule.

But hey, who is rectifying them? This cycle goes on and on in the churn of a typical IT services business whose ultimate objective is Margin with a capital M. Add some office politics and you have a sure-shot recipe for what can be termed "setting up for failure".

The big question still remains: who will bell the cat?

Tuesday, July 6, 2010

Debug = true


It's been ages since I wrote something useful. It was time to break the shackles and dust off my blog, so after a long hiatus I am starting over. This one pertains to common bottlenecks and issues observed in ASP.NET applications.

Microsoft Developer Support ("CSS", or Customer Support Services) is where you're sent within Microsoft when you've got problems. They see the most interesting bugs, thousands of issues and edge cases, and collect piles of data. They report this data back to the ASP.NET team (and other teams) for product planning. With all those cases and all the projects, there are basically two top things that cause trouble in production ASP.NET web sites. Long story short: debug mode and anti-virus software.

The excerpts for this post have been taken from MSDN content and other insightful blog posts.


#1 Issue - Configuration

It seems the #1 issue in support for problems with ASP.NET 2.x and 3.x is configuration.

Symptoms:

  • OOM
  • Performance
  • High memory
  • Hangs
  • Deadlocks

Notes: There are more debug=true cases than there should be.

People continue to deploy debug builds of their sites to production. The web.config can be transformed automatically into a release version at deployment time; a sketch follows below.
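A minimal sketch of such a transform, assuming the Visual Studio 2010-style web.config transformation mechanism (the file name and build step are the standard conventions, not something specific to this post):

<?xml version="1.0"?>
<!-- Web.Release.config: applied on top of web.config when publishing a Release build.
     The xdt:Transform attribute removes the debug attribute from <compilation>,
     so the deployed site runs with debug off. -->
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.web>
    <compilation xdt:Transform="RemoveAttributes(debug)" />
  </system.web>
</configuration>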

Additionally, if you leave debug=true on individual pages, note that this will override the application level setting.
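For reference, here is a hedged sketch of where these two switches live; the element names are standard ASP.NET configuration, and the values shown are only illustrative:

<!-- web.config: the application-level switch, under system.web -->
<configuration>
  <system.web>
    <compilation debug="true" />
  </system.web>
</configuration>

<!-- An individual .aspx page can override the application-level value in its @Page directive -->
<%@ Page Language="C#" Debug="true" %>

Set debug="false" (or simply remove the attribute) at both levels before deploying to production.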

Here's why debug="true" is bad:

  • Overrides request execution timeout making it effectively infinite
  • Disables both page and JIT compiler optimizations
  • In 1.1, leads to excessive memory usage by the CLR for debug information tracking
  • In 1.1, turns off batch compilation of dynamic pages, leading to 1 assembly per page.
  • For VB.NET code, leads to excessive usage of WeakReferences (used for edit and continue support).

An important note: contrary to what is sometimes believed, setting retail="true" in the <deployment> element is not a direct antidote to having debug="true"!
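For reference, that element is the machine-wide deployment setting; a minimal sketch of what it looks like (this goes in machine.config on the web server):

<configuration>
  <system.web>
    <!-- Forces release-oriented behaviour machine-wide -->
    <deployment retail="true" />
  </system.web>
</configuration>

It enforces release-oriented behaviour across the machine, but as the note above says, it should not be treated as a substitute for deploying the site with debug="false" in the first place.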

#2 Issue - Problems with an External (non-ASP.NET) Root Cause

Sometimes when you're having trouble with an ASP.NET site, the problem turns out not to be ASP.NET itself. Here are the top three issues and their causes. This category is for cases that were closed because of external causes outside the direct control of support. The sub-categories are 3rd-party software, anti-virus software, hardware, virus attacks, DoS attacks, etc.

If you've ever run a production website you know there's always that argument about whether to run anti-virus software in production. It's not like anyone's emailing viruses and saving them to production web servers, but you want to be careful. Sometimes IT or security insists on it. However, this means you'll have software that is not your website software trying to access files at the same time your site is trying to access them.

Here's the essence as a bulleted list:

  • Concurrency while under pressure: This causes problems in big software. Make sure your anti-virus software is configured appropriately and that you're aware of which processes are accessing which files, as well as how, why and when.
  • Profile your applications: .NET and the Web are not black boxes. You can see what's happening if you look. Know what bytes are going out the wire. Know who is accessing the disk. Measure twice, cut once, they say? I say measure a dozen times. You'd be surprised how often folks put an app in production and they've never once profiled it.
  • Anti-Virus Software: It can't be emphasized enough that site owners should ensure they are running the latest AV engine and definitions from their chosen anti-malware vendor. They've seen people hitting hangs due to flakey AV drivers that are over two years out of date. Another point about AV software is that it is not just about old-school AV scanning of file access. Many products now do low level monitoring of port activity, script activity within processes and memory allocation activity and do not always do these things 100% correctly. Stay up to date!
  • Know where you're calling out to: Connections to remote endpoints (calling web services, accessing remote file systems, etc.) can all slow you down if you're not paying attention. Is your DNS correct? Did you add your external hosts to a hosts file to remove DNS latency?
  • processModel autoConfig="true": This is in machine.config and people mess with it. Don't assume that you know better than the defaults. Everyone wants to change the defaults, add threads, remove threads, change the way the pool works because they think their textboxes-over-data application is special. Chances are it's not, and you'd be surprised how often people will spend days on the phone with support and discover that the defaults were fine and they had changed them long ago and forgotten. Know what you've changed away from the defaults, and know why.
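For reference, a minimal sketch of the default being discussed; this is the machine-wide setting in machine.config, and the advice above amounts to leaving it exactly like this unless you have measured evidence to the contrary:

<configuration>
  <system.web>
    <!-- autoConfig="true" lets ASP.NET tune thread and connection limits itself -->
    <processModel autoConfig="true" />
  </system.web>
</configuration>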

...and here's the table of details:

Issue: Anti-virus software
Product: All
Description: Anti-virus software is installed onto servers and causes all kinds of problems.
Symptoms:
  • Application restarting
  • Slow performance
  • Session variables are null
  • Cannot install hotfix
  • Intermittent time outs
  • High memory
  • Session lost
  • IDE hangs
  • Deadlocks
Notes: This consists of all AV software reported by our customers. Not every case reports which AV software is being used, so the manufacturer is not always known.
KB821438, KB248013, KB295375, KB817442

Issue: 3rd-party vendors
Product: All
Description: This is a category of cases where the failure was due to a 3rd-party manufacturer.
Symptoms:
  • Crash
  • 100% CPU
  • High memory
  • Framework errors
  • Hang
Notes: The top culprits are 3rd-party database systems and 3rd-party internet access management systems.

Issue: Microsoft component
Product: All
Description: Microsoft software
Symptoms:
  • Intermittent time outs
  • High memory
  • Deadlocks
  • 100% CPU
  • Crash
Notes: Design issues that cause performance issues like sprocs, deadlocks, etc. Profile your applications and the database! (Pro tip: select * from authors doesn't scale.) Pair up DBAs and programmers and profile from end to end.

Hopefully this post at least gets us started.

Thursday, July 30, 2009

Isn't that Impossible?

Credits: Dharmesh Mehta
The contents of this article are the original work of Dharmesh M. Mehta, taken verbatim from his blog posting at http://smartsecurity.blogspot.com/2009/06/isnt-that-impossible.html. I liked the way Dharmesh captured, with a touch of surrealism, the common arguments people make for not implementing security, and hence I am posting it here too.
Permissions from original author: Pending


Not every organization and its people know about software security issues, nor do they respect them.

In most of my workshops conducted with developers for secure coding, I often hear the proclamation, "Isn't that Impossible..." and then the drama starts...

Many developers do not understand how the web works
• “Users can’t change the value of a drop down”
• “That option is greyed out”
• “We don’t even link to that page”

Many developers doubt attacker motivation
• “You are using specialized tools; our users don’t use those”
• “Why would anyone put a string that long into that field?”
• “It’s just an internal application” (in an enterprise with 80k employees and a flat network)
• “This application has a small user community; we know who is authenticated to it” (huh?)
• “You have been doing this a long time, nobody else would be able to find that in a reasonable time frame!”

Many developers do not understand the difference between network and application security
• “That application is behind 3 firewalls!”
• “We’re using SSL”
• “That system isn’t even exposed to the outside”

Many developers do not understand a vulnerability class
• “That’s just an error message” (usually related to SQL Injection)
• “You can’t even fit a valid SQL statement in 10 characters”

Many developers cite incorrect or inadequate architectural mitigations
• “You can’t execute code from the stack, it is read-only on all Intel processors”
• “Our WAF protects against XSS attacks” (well, clearly it didn’t protect against the one I’m showing you)

Many developers cite questionable tradeoffs
• "Calculating a hash value will be far too expensive" (meanwhile, they're issuing dozens of Ajax requests every time a user clicks a link)

There are dozens more. The point is that developer education in security is one of the largest gaps in most SDLCs. How can you expect your developers to write secure code when you don't teach them this stuff? You can only treat the symptoms for so long; eventually you have to attack the root cause.

Friday, July 24, 2009

PDB files.

Most developers realize that PDB files are something that help us debug, but that's about it. Don't feel bad if you don't know what's going on with PDB files because while there is documentation out there, it's scattered around and much of it is for compiler and debugger writers. While it's extremely cool and interesting to write compilers and debuggers, that's probably not a normal developer's job.

What I want to do here is to put in one place what everyone doing development on a Microsoft operating system has to know when it comes to PDB files. This information also applies to both native and managed developers, though I will mention a trick specific to managed developers. I'll start by talking about PDB file storage as well as the contents. Since the debugger uses the PDB files, I'll discuss exactly how the debugger finds the right PDB file for your binary. Finally, I'll talk about how the debugger looks for the source files when debugging and show you a favorite trick related to how the debugger finds source code.

Before we jump in, I need to define two important terms. A build you do on your development machine is a private build. A build done on a build machine is a public build. This is an important distinction because debugging binaries you build locally is easy; it is always the public builds that cause problems.

The most important thing all developers need to know: PDB files are as important as source code! Yes, that's red and bold on purpose. Nobody can ever seem to find the PDB files for the build running on a production server, and without the matching PDB files you have just made your debugging challenge nearly impossible. With a huge amount of effort you can disassemble and find the problems anyway, but having the right PDB files in the first place saves you an enormous amount of work.

As John Cunningham, the development manager for all things diagnostics on Visual Studio, said at the 2008 PDC, "Love, hold, and protect your PDBs." At a minimum, every development shop must set up a Symbol Server. Briefly, a Symbol Server stores the PDBs and binaries for all your public builds. That way, no matter which build someone reports a crash or problem against, you have the exact matching PDB file for that public build available to the debugger. Both Visual Studio and WinDBG know how to access Symbol Servers, and if the binary is from a public build, the debugger will get the matching PDB file automatically.

Most of you reading this will also need to do one preparatory step before putting your PDB files in the Symbol Server. That step is to run the Source Server tools across your public PDB files, which is called source indexing. The indexing embeds the version control commands to pull the exact source file used in that particular public build. Thus, when you are debugging that public build you never have to worry about finding the source file for that build. If you're a one or two person team, you can sometimes live without the Source Server step.

The rest of this entry will assume you have set up Symbol Server and Source Server indexing. One good piece of news for those of you who will be using TFS 2010: out of the box, the build server includes build tasks for source indexing and Symbol Server copying as part of your build.


One complaint against setting up a Symbol Server is that a team's software is too big and complex. There's no way your software is bigger and more complex than everything Microsoft produces. Microsoft source indexes and stores every single build of every product it ships into a Symbol Server. That means everything from Windows, to Office, to SQL, to games and everything in between is stored in one central location.


My guess is that Building 34 in Redmond is nothing but SAN drives to hold all of those files and everyone in that building is there to support those SANs. It's so amazing to be able to debug anything inside Microsoft and you never have to worry about symbols or source (provided you have appropriate rights to that source tree).

With the key infrastructure discussion out of the way, let me turn to what's in a PDB and how the debugger finds them. The actual file format of a PDB file is a closely guarded secret but Microsoft provides APIs to return the data for debuggers.

A native C++ PDB file contains quite a bit of information:
a) Public, private, and static function addresses
b) Global variable names and addresses
c) Parameter and local variable names and offsets where to find them on the stack
d) Type data consisting of class, structure, and data definitions
e) Frame Pointer Omission (FPO) data, which is the key to native stack walking on x86
f) Source file names and their lines

A .NET PDB only contains two pieces of information, the source file names and their lines and the local variable names. All the other information is already in the .NET metadata so there is no need to duplicate the same information in a PDB file.

When you load a module into the process address space, the debugger uses two pieces of information to find the matching PDB file. The first is obviously the name of the file. If you load ZZZ.DLL, the debugger looks for ZZZ.PDB. The extremely important part is how the debugger knows this is the exact matching PDB file for this binary. That's done through a GUID that's embedded in both the PDB file and the binary. If the GUID does not match, you certainly won't debug the module at the source code level.

The .NET compiler, and for native the linker, puts this GUID into the binary and PDB. Since the act of compiling creates this GUID, stop and think about this for a moment. If you have yesterday's build and did not save the PDB file will you ever be able to debug the binary again? No! This is why it is so critical to save your PDB files for every build. Because I know you're thinking it, I'll go ahead and answer the question already forming in your mind: no, there's no way to change the GUID.

However, you can look at the GUID value in your binary. Using DUMPBIN, a command line tool that comes with Visual Studio, you can list all the pieces of your Portable Executable (PE) files. To run DUMPBIN, open the Visual Studio 2008 Command Prompt from the Programs menu, as you will need the PATH environment variable set in order to find DUMPBIN.EXE.

There are numerous command line options to DUMPBIN, but the one that shows us the build GUID is /HEADERS. The important piece to us is the Debug Directories output:
Debug Directories

  Time      Type  Size      RVA       Pointer
  --------  ----  --------  --------  --------
  4A03CA66  cv    4A        000025C4  7C4
  Format: RSDS, {4B46C704-B6DE-44B2-B8F5-A200A7E541B0}, 1, C:\junk\stuff\HelloWorld\obj\Debug\HelloWorld.pdb

With the knowledge of how the debugger determines the correctly matching PDB file, I want to talk about where the debugger looks for the PDB files. You can see this load order for yourself by looking at the Symbol File column of the Visual Studio Modules window while debugging. The first place searched is the directory where the binary was loaded. If the PDB file is not there, the second place the debugger looks is the hard-coded build directory embedded in the Debug Directories in the PE file. If you look at the above output, you see the full path C:\JUNK\STUFF\HELLOWORLD\OBJ\DEBUG\HELLOWORLD.PDB. (The MSBUILD tasks for building .NET applications actually build to the OBJ\ directory and copy the output to the DEBUG or RELEASE directory only on a successful build.) If the PDB file is not in the first two locations, and a Symbol Server is set up on the machine, the debugger looks in the Symbol Server cache directory. Finally, if the debugger does not find the PDB file in the Symbol Server cache directory, it looks in the Symbol Server itself. This search order is why your local builds and public builds never conflict.

How the debugger searches for PDB files works just fine for nearly all the applications you'll develop. Where PDB file loading gets a little more interesting are those .NET applications that require you to put assemblies in the Global Assembly Cache (GAC). I'm specifically looking at you SharePoint and the cruelty you inflict on web parts, but there are others. For private builds on your local machine, life is easy because the debugger will find the PDB file in the build directory as I described above. The pain starts when you need to debug or test a private build on another machine.

On the other machine, what I've seen numerous developers do after using GACUTIL to put the assembly into the GAC is to open up a command window and dig around in C:\WINDOWS\ASSEMBLY\ to look for the physical location of the assembly on disk. While it is subject to change in the future, an assembly compiled for Any CPU is actually in a directory like the following:
C:\Windows\assembly\GAC_MSIL\Example\1.0.0.0__682bc775ff82796a
Example is the name of the assembly, 1.0.0.0 is the version number, and 682bc775ff82796a is the public key token value. Once you've deduced the actual directory, you can copy the PDB file to that directory and the debugger will load it.


If you're feeling a little queasy right now about digging through the GAC like this, you should be, as it is unsupported and fragile. There's a better way that almost no one seems to know about: DEVPATH. The idea is that you set a couple of settings in .NET and it will treat a directory you specify as part of the GAC, so you just need to toss the assembly and its PDB file into that directory and debugging becomes far easier. Only set up DEVPATH on development machines, because any files stored in the specified directory are not version checked as they are in the real GAC.

To use DEVPATH, you will first create a directory that has read access rights for all accounts and at least write access for your development account. This directory can be anywhere on the machine. The second step is to set a system wide environment variable, DEVPATH whose value is the directory you created. The documentation on DEVPATH doesn't make this clear, but set the DEVPATH environment variable before you do the next step.
To tell the .NET runtime that you have DEVPATH set up requires you to add the following to your APP.CONFIG, WEB.CONFIG, or MACHINE.CONFIG as appropriate for your application:
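What follows is a minimal sketch of that setting; the developmentMode element under runtime is what switches DEVPATH probing on:

<configuration>
  <runtime>
    <!-- Tells the runtime to probe the directory named by the DEVPATH environment variable -->
    <developmentMode developerInstallation="true" />
  </runtime>
</configuration>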

Once you turn on development mode, if the DEVPATH environment variable is missing for the process or the path you set does not exist, your application will die at startup with a COMException whose error message is the completely non-intuitive "Invalid value for registry." Also, be extremely vigilant if you do want to use DEVPATH in MACHINE.CONFIG, because every process on the machine is affected. Causing all .NET applications to fail on a machine won't win you many friends around the office.

The final item every developer needs to know about PDB files is how the source file information is stored in a PDB file. For public builds that have had source indexing tools run on them, the storage is the version control command to get that source file into the source cache you set. For private builds, what's stored is the full path to the source files that compiler used to make the binary. In other words, if you use a source file MYCODE.CPP in C:\FOO, what's embedded in the PDB file is C:\FOO\MYCODE.CPP.

Ideally, all public builds are automatically source indexed and stored in your Symbol Server immediately, so you never have to think about where the source code is. However, some teams don't do the source indexing across the PDB files until they have done smoke tests or other blessings to see if the build is good enough for others to use. That's a perfectly reasonable approach, but if you do have to debug the build before it's source indexed, you had better pull that source code to the exact same drive and directory structure the build machine used, or you may have some trouble debugging at the source code level. While both the Visual Studio debugger and WinDBG have options for setting the source search directories, I've found it hard to get right.

For smaller projects, it's no problem because there's always plenty of room for your source code. Where life is more difficult is on bigger projects. What are you going to do if you have 30 MB of source code and you have only 20 MB of disk space left on your C: drive? Wouldn't it be nice to have a way to control the path stored in the PDB file?
While we can't edit the PDB files, there's an easy trick to controlling the paths put inside them: SUBST.EXE. What SUBST does is associate a path with a drive letter. If you pull your source code down to C:\DEV and you execute "SUBST R: C:\DEV", the R: drive will now show at its top level the same files and directories you would see if you typed "DIR C:\DEV". You'll also see the R: drive in Explorer as a new drive. You can achieve the same drive-to-path effect by mapping a drive to a shared directory in Explorer.

What you'll do on the build machine is set a startup item that executes your particular SUBST command. When the build system account logs in, it will have the new drive letter available and that's where you'll do your builds. With complete control over the drive and root embedded in the PDB file, all you need to do to set up the source code on a test machine is to pull it down wherever you want and do a SUBST execution using the same drive letter the build machine used. Now there's no more thinking about source matching again in the debugger.

Not all of the information about PDB files I've discussed in this entry is new, but I hope that by getting it all in one place you'll find it easier to deal with what's going on and debug your applications faster. Debugging faster means shipping faster, so that's always high on the good things scale.

This information is from a Wintellect article by John Robbins.




Thursday, July 9, 2009

Active Directory Federation Services (ADFS)

Federated identity is a standards-based technology. IBM, Sun, and VeriSign all have stakes in this technology. ADFS is simply Microsoft's solution for federation management.
ADFS is part of the R2 release of Windows Server 2003. You cannot purchase or download ADFS separately.

So exactly what is ADFS?
ADFS is a service (actually a series of web services) that provides a secure single-sign-on (SSO) experience which allows a user to access multiple web applications spread across different enterprises and networks.

ADFS fills a much needed gap in the following scenarios:
Extranet Applications
Many organizations host at least one application used by business partners or other outside users. So we stand up this application and supporting infrastructure in our DMZ, right? In most cases this involves at least an IIS server, a SQL Server, and something to authenticate and authorize users. If we plan on having more than just a handful of users and need advanced user management, then chances are we're going to put an Active Directory forest in our DMZ. Ok, great. Now we just secure it all, create user accounts for our business partners, and we're off and running. But wait... all of a sudden our internal users need access to the extranet application as well. So now what? We could create and manage a second user account for each internal user needing access (not to mention resetting users' passwords when they forget their DMZ account information). Our second option is to open up several ports and create a trust relationship between the two domains. This would give us the ability to provide extranet application access to the intranet AD users; however, opening the required ports decreases our security and makes our internal AD environment more vulnerable to attack. This is where ADFS comes in.


ADFS gives us the ability to set up what are known as federation servers in our internal network and DMZ. The federation servers then securely (via certificates and SSL) allow our internal AD users to acquire a "token" which in turn gives them access to the extranet application. This spares the internal user from having to enter any credentials, just as if the application were sitting on the internal network. And it is all done without exposing internal account information to the DMZ.

B2B Extranet Applications
Now let's take ADFS to the next level. Remember the business partner we want to provide access to our extranet application? Remember how we put AD in the DMZ so we can set up user accounts for our partner? What if our application supports different levels of security? And what if we want to give each user from the business partner unique access to our application? Well, it's all still fairly simple. We just set up and manage a user account for each of those users in our DMZ domain, right? But what if our business partner decides they want all 1,000 of their employees accessing our extranet app? Now we have an account management nightmare. This is where ADFS comes to the rescue again. With ADFS we can provide federated access to accounts from our partner's Active Directory domain. We set up a federation server in our DMZ and our business partner does the same (once again encrypting communication over SSL). We then grant application access to what we call "claims groups", which map to real groups within our partner's domain. Our partner then simply places their domain's already-created and already-managed user accounts into their own group within AD, and suddenly those users are browsing our extranet application, with SSO I might add. Please note that credentials, SIDs, and all other AD account information are NEVER passed between federation servers (or organizations). Federation servers simply provide "tokens" to user accounts when they need to access the application on the other side.

Final Thoughts
ADFS is an exciting new technology that many vendors and companies are beginning to buy into. With that said, keep in mind that it is a "new" technology, so be sure to watch for future standard and protocol changes. You should also be aware that ADFS can be very complicated and confusing to set up the first time; however, it can be simplified.

Please do NOT set up ADFS in production (especially in a B2B scenario) unless you have extensively tested your configuration and are comfortable with its security. After all, you are providing access to one of your applications, extranet or not. I would also suggest some sort of signed legal agreement between the two organizations in a B2B scenario. If you would like to see a follow-up post on how to set up ADFS, please leave a comment or email me. Here are some links to get you started:

http://www.microsoft.com/windowsserver2003/default.mspx
http://www.microsoft.com/WindowsServer2003/R2/Identity_Management/ADFSwhitepaper.mspx
http://technet2.microsoft.com/WindowsServer/en/Library/050392bc-c8f5-48b3-b30e-bf310399ff5d1033.mspx

Wednesday, June 17, 2009

Silverlight


Every new technology brings its own mechanisms to mitigate security threats. This post discusses how Silverlight deals with cross-site scripting.

What is Cross Site Scripting?
Cross-site scripting (XSS) is a type of computer security vulnerability typically found in web applications which allow code injection by malicious web users into the web pages viewed by other users. Examples of such code include HTML code and client-side scripts. An exploited cross-site scripting vulnerability can be used by attackers to bypass access controls such as the same origin policy. Vulnerabilities of this kind have been exploited to craft powerful phishing attacks and browser exploits.

To avoid cross-site scripting (XSS), the Silverlight runtime enforces restrictions in the framework APIs. Any cross-domain request requires that the server has explicitly granted permission for Silverlight clients to access its resources. Cross-domain access means the Silverlight client is making network calls to a domain that is not the same as the domain from which the client itself was downloaded. The restrictions are the same as those Flash-based clients experience.
To allow Flash-based clients to access its resources, a server needs to place a policy file called crossdomain.xml at the root of the domain and grant access permissions in that file.
Silverlight uses the same logic to allow its APIs to access cross-domain resources. It supports the Flash-based policy file, and it also supports a file specific to Silverlight clients named clientaccesspolicy.xml. This is also an XML-based file with a published format, but one different from the Flash format. The Silverlight runtime first tries to download the clientaccesspolicy.xml file and, if it is found, grants access permissions based on it. If this file is not available, it tries to download the Flash-based policy file. If neither is found, access is denied. These files are not downloaded for same-domain access.
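To make the two formats concrete, here are minimal sketches of both policy files; the wide-open grants shown ("*" and the site root) are only illustrative, and a real deployment should scope them as narrowly as possible.

clientaccesspolicy.xml (the Silverlight-specific file, placed at the root of the target domain):

<?xml version="1.0" encoding="utf-8"?>
<access-policy>
  <cross-domain-access>
    <policy>
      <!-- Which calling domains may make requests -->
      <allow-from http-request-headers="*">
        <domain uri="*" />
      </allow-from>
      <!-- Which paths on this server they may reach -->
      <grant-to>
        <resource path="/" include-subpaths="true" />
      </grant-to>
    </policy>
  </cross-domain-access>
</access-policy>

crossdomain.xml (the Flash-style file Silverlight falls back to):

<?xml version="1.0"?>
<!DOCTYPE cross-domain-policy SYSTEM "http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd">
<cross-domain-policy>
  <!-- Silverlight only honors a crossdomain.xml that grants access to all domains -->
  <allow-access-from domain="*" />
</cross-domain-policy>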

Sunday, June 7, 2009

Google Wave

What is Google Wave? It is a new communication service that Google unveiled at Google I/O this week. It is a product, platform and protocol for communication and collaboration designed for today's world. Is that too much technical jargon? Let's make it simple and take it in chewable, bite-sized pieces…

It is like reinventing email, which was designed 40 years ago, i.e. many years before the internet, wikis, blogs, Twitter, forums, discussion boards, etc. existed. The world has evolved, but we are still hooked to the "store-and-forward" architecture of email systems, which mimics snail mail. In spite of the technological advances, we are living in a highly segmented world, with information living on islands: emails, blogs, photo blogs, microblogs like Twitter, web collaboration, net meetings, IM and so on.

In Google Wave you create a wave (which can be an email or IM conversation, a document for collaboration, something to publish on a blog, or just a game to play) and add people to it. Everyone on your wave can use richly formatted text, photos, gadgets, and even feeds from other sources on the web. You can insert a reply or edit the wave directly. Google Wave is an HTML 5 app, built using Google Web Toolkit. It includes a rich text editor and other desktop functions like drag-and-drop. It has concurrent rich-text editing, where you see on your screen instantly what your fellow collaborators are typing in your wave. This means Google Wave integrates email, IM and collaborative document creation into a single experience. The most important feature is that you can also use "playback" to rewind the wave and see how it evolved. My elder son was very excited to see that. He said, "If I am playing chess with my friends using Wave, I will be able to rewind and replay it to see every move. WoHoooooo.."

Google Wave can also be considered a platform with a rich set of open APIs that allow developers to embed waves in other web services and to build new extensions that work inside waves. The Google Wave protocol is designed for open federation, such that anyone's Wave services can interoperate with each other and with the Google Wave service. To encourage adoption of the protocol, Google intends to open source the code behind Google Wave.

Vic Gundotra, of Microsoft fame, is now leading this effort as VP of engineering at Google. Lars and Jens Rasmussen (brothers), who came to Google with the acquisition of Where 2 Technologies in 2004, have been driving this effort at Google for more than 18 months. They also have a credible history and star reputation at Google as the creators of Google Maps.

The underlying assumption is that a large-scale disruptive innovation can dislodge the existing leaders and give others an opportunity to take leading positions. Hence the attempt to create an online world where people can seamlessly communicate and collaborate across various information exchange scenarios including email, IM, blog, wiki and multi-lingual communication (including translation). With this bold move, Google is trying to overcome the challenges of integration by hosting the conversation object on the server, allowing multiple channels of interaction and breaking many barriers in the process. The service seems to combine Gmail and Google Docs into an interesting free-form workspace that could be used to write documents collaboratively, plan events, play games or discuss recent news. Google has announced this as an open source project and is publishing all the standards at www.waveprotocol.org. The ripples of this Google Wave have the potential to impact the technology world for decades to come.

Some helpful links:
Main Site:
http://wave.google.com
API: http://code.google.com/apis/wave
Federation Protocol: http://www.waveprotocol.org
Web Toolkit: http://code.google.com/webtoolkit