A Look at the Microsoft Security Response Center's Playbook

Say what you will about the competence of Microsoft when it comes to security. North of 90 percent of desktop computers run Windows; Internet Explorer is the Web browser for nearly 90 percent of users; and 65 percent of new server units go out the door with Windows installed. Like it or not, how Microsoft handles -- or fails to handle -- security vulnerabilities in its ubiquitous platform directly affects about nine out of every 10 computer users. The nerve center of Microsoft's security operation is the Microsoft Security Response Center. And the MSRC is Ground Zero for nearly all major security threats, given Microsoft's reach.

Stephen Toulouse, security program manager for Microsoft's Security Business and Technology Unit, offered a look inside the processes at the MSRC last week during the Microsoft TechEd 2005 show in Orlando. Toulouse gave a presentation called "Inside the Microsoft Security Response Center Process." Among the interesting things Toulouse revealed: the MSRC relies on the press more than you might think; it takes managing the relationship with vulnerability finders very seriously, with an internal Service Level Agreement to report back to a finder within 24 hours; and it is committed to plowing lessons learned back into the product development process.

Releasing a Security Update

Toulouse dedicated one detailed slide to the process of releasing a security update.

The MSRC generally receives reports through two avenues: direct contact from the finder of a vulnerability via Secure@microsoft.com or an anonymous report through the Microsoft TechNet Security site. The center's policy is a Service Level Agreement of a response within 24 hours to the vulnerability's finder, according to the slide. Unmentioned in Toulouse's slide deck are two other key avenues that are rare but exceptionally dangerous. One is flaws that surface when security researchers publish the details immediately rather than playing by Microsoft's rules. The other is any flaw that comes to light because an attacker discovered it and is already exploiting it.

Once a report rolls in, the center begins a triage process, which consists of assessing the report and its possible impact on customers, understanding the severity of the vulnerability and assigning it a priority. It is at this stage that Microsoft initially rates vulnerabilities critical, important, moderate or low.
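The four ratings form a simple ordering, which can be sketched in a few lines of Python. This is only an illustration: the article does not say how the MSRC scores a report or how priority relates to severity, so the Report record, its field names and the sort-by-severity rule are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """The four ratings named in the article, ordered lowest to highest."""
    LOW = 1
    MODERATE = 2
    IMPORTANT = 3
    CRITICAL = 4


@dataclass
class Report:
    # Hypothetical record; the field names are illustrative only.
    title: str
    severity: Severity


def triage_order(reports):
    """Return reports sorted so the most severe come first (an assumed rule)."""
    return sorted(reports, key=lambda r: r.severity, reverse=True)


queue = [
    Report("example A", Severity.MODERATE),
    Report("example B", Severity.CRITICAL),
    Report("example C", Severity.LOW),
]
ordered = triage_order(queue)
print([r.title for r in ordered])
```

Using an integer-backed enum keeps the ratings comparable, so a critical report always sorts ahead of an important one.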

Toulouse said work begins simultaneously on creating the fix. Together with product teams, the security team investigates the vulnerability's impact, looks for variants and examines how the flaw interacts with surrounding code and design. It is at this stage that the first fix is generated for testing.

Another important part of the process is managing the relationship with the finder of the vulnerability. Among the MSRC's priorities is establishing a communications channel with the finder through which to provide a quick response and regular updates. Longer term, the goal is to build the community of vulnerability finders and to reward and encourage those finders who keep vulnerabilities secret until Microsoft can produce a fix.

Meanwhile, the fix goes through several levels of testing. There's a setup and build verification; a depth test; and an integration and breadth test. Once the lab tests are finished, the Microsoft IT department is corralled into the process. Microsoft IT tests the bug fix on the Microsoft corporate network. In addition to running that network as a test bed for all Microsoft pre-release software, the department must maintain it to support the operations of Microsoft, a Fortune 500 company.

The pre-release version of the vulnerability fix also goes out as a controlled beta to customers and partners. This test group is very limited given the high risk of the fix falling into the wrong hands before Microsoft makes it generally available.

As testing continues, the center gets to work on creating content. This means writing up the familiar security bulletin with its technical description of the vulnerability, instructions for workarounds, discussion of mitigating factors, FAQs and acknowledgements.

With the immediate tasks of patch preparation out of the way, one of the activities the center engages in is updating development tools and practices for product development teams at Microsoft. The steps include updating best practices, testing tools and processes to prevent similar flaws from emerging in future code.

When all those steps are done, the bulletin is ready to be released on the next Patch Tuesday -- the second Tuesday of the month when Microsoft releases all security bulletins.
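The "second Tuesday of the month" rule is easy to compute. Here is a minimal sketch in Python using only the standard library; the function name patch_tuesday is our own label, not anything Microsoft publishes.

```python
import datetime


def patch_tuesday(year, month):
    """Return the date of the second Tuesday of the given month."""
    first = datetime.date(year, month, 1)
    # date.weekday() counts Monday as 0, so Tuesday is 1.
    days_until_tuesday = (1 - first.weekday()) % 7
    first_tuesday = first + datetime.timedelta(days=days_until_tuesday)
    return first_tuesday + datetime.timedelta(days=7)


print(patch_tuesday(2005, 6))  # prints 2005-06-14
```

Because the first Tuesday always falls on the 1st through the 7th, the second Tuesday always lands between the 8th and the 14th.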

Playing Defense After a Patch

The release of a Microsoft patch is much like the starting line of a race. In one lane are the IT departments and consumers worldwide, pushed ahead by Microsoft, trying to test and apply the new patch as quickly as possible. In the other lane are the hackers trying to exploit the new flaw that has come to light because of the patch.

Given the way patches highlight vulnerabilities, what Microsoft does after posting patches is every bit as interesting as how it responds to the initial vulnerability report.

Microsoft's formal process for identifying and addressing those kinds of flaws is called the Software Security Incident Response Plan (SSIRP).

In general, the center employs a four-stage process. The first stage, or "Watch" stage, is a general state of readiness, but it applies especially to the period immediately after a Microsoft patch is released. The stage includes observing available environments to detect any potential issues and leveraging existing relationships with partners and security researchers. In the Watch stage, the MSRC also relies heavily on customer requests and press inquiries to alert Microsoft to dangerous issues.

When the first reports of an exploit arrive, Microsoft goes into the "Alert and Mobilize" stage. During this stage, the center evaluates the severity, pays closer attention to press interest and customer support calls, and mobilizes security response teams into an Emergency Engineering Team and an Emergency Communications Team.

With exploit code or other hard technical data on hand, the MSRC enters the "Assess and Stabilize" step, which includes working on a solution and communicating initial guidance and workarounds to the public. Once Microsoft has developed tools, updates or fixes, the center enters the "Resolve" stage.
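The four stages read like a simple state machine, sketched here in Python. The stage names come from the presentation as reported; the event names are hypothetical labels for the triggers described above, and nothing here is meant to represent how Microsoft actually implements the plan.

```python
from enum import Enum


class Stage(Enum):
    WATCH = "Watch"
    ALERT_AND_MOBILIZE = "Alert and Mobilize"
    ASSESS_AND_STABILIZE = "Assess and Stabilize"
    RESOLVE = "Resolve"


# Event names are hypothetical labels for the triggers the article describes.
TRANSITIONS = {
    (Stage.WATCH, "exploit_reported"): Stage.ALERT_AND_MOBILIZE,
    (Stage.ALERT_AND_MOBILIZE, "technical_data_in_hand"): Stage.ASSESS_AND_STABILIZE,
    (Stage.ASSESS_AND_STABILIZE, "fix_developed"): Stage.RESOLVE,
}


def advance(stage, event):
    """Return the next stage, or the current one if the event does not apply."""
    return TRANSITIONS.get((stage, event), stage)


stage = Stage.WATCH
for event in ["exploit_reported", "technical_data_in_hand", "fix_developed"]:
    stage = advance(stage, event)
    print(stage.value)
```

An event that does not match the current stage leaves the state unchanged, which mirrors the way each stage described above waits on a specific trigger before the next one begins.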

Microsoft officials contend they've come a long way with the patching process since the days of Blaster and SQL Slammer, incidents for which Paul Flessner, Microsoft senior vice president for server applications, made a point of apologizing again at TechEd this month.

According to the Toulouse presentation, Microsoft has an "expedited process for releasing cleaner tools (within 3 days for Sasser vs. 38 days for Blaster)."