Microsoft's Infrastructure Investments Have Given Azure Major AI Chops
- By Jeffrey Schwartz
- September 28, 2016
Microsoft's ongoing upgrades to its public cloud infrastructure mean Azure has the fastest and most AI-optimized network among cloud providers, the company said this week at its Ignite conference in Atlanta.
According to Microsoft, it has been quietly upgrading every node in Azure with software-defined network (SDN) infrastructure, developed using field-programmable gate arrays (FPGAs). The upgrades started two years ago when Microsoft began installing the FPGAs -- effectively SDN-based processors from Altera, now a part of Intel. The massive global SDN upgrade means that the Azure public cloud fabric is now built on a 25Gbps backbone -- up from 10Gbps -- with a 10x reduction in latency.
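To put the backbone figures in perspective, the short calculation below compares idealized transfer times at the two line rates. It is only a back-of-the-envelope sketch: the 1 GB payload and the `transfer_seconds` helper are illustrative assumptions, not numbers or code from Microsoft, and protocol overhead and latency are ignored.

```python
def transfer_seconds(size_bytes: int, link_gbps: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring overhead and latency."""
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)

payload = 10**9  # hypothetical 1 GB payload, chosen only for illustration

old_time = transfer_seconds(payload, 10)  # 10Gbps backbone
new_time = transfer_seconds(payload, 25)  # 25Gbps backbone
speedup = old_time / new_time

print(f"10Gbps: {old_time:.2f}s, 25Gbps: {new_time:.2f}s, speedup: {speedup:.1f}x")
```

Under these assumptions the payload moves in 0.80 seconds at 10Gbps and 0.32 seconds at 25Gbps, a 2.5x improvement in raw throughput that is separate from the latency reduction Microsoft cites.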
The improvements, combined with new GPU nodes that were recently made available in the Azure Portal, also mean that Azure can function as the world's fastest supercomputer, capable of running AI, cognitive computing and even neural network-based applications, Microsoft said.
Microsoft detailed the Azure infrastructure and network upgrades at Ignite. During his keynote session on Monday, CEO Satya Nadella demonstrated some of the AI supercomputing capabilities of the newly bolstered Azure.
"We have the ability, through the magic of the fabric that we've built, to distribute your machine learning tasks and your deep neural nets to all of the silicon that is available so that you can get performance that scales," Nadella said.
Doug Burger, a networking expert from Microsoft Research, joined Nadella on stage to describe why Microsoft made a significant investment in the FPGAs and SDN architecture.
"FPGAs are programmable hardware," Burger explained. "What that means is that you get the efficiency of hardware, but you also get flexibility because you can change their functionality on the fly. And this new architecture that we've built effectively embeds an FPGA-based AI supercomputer into our global hyperscale cloud. We get awesome speed, scale and efficiency. It will change what's possible for AI."
Burger said Microsoft is using a special type of neural network called a "convolutional neural net," which can recognize the content within a collection of images. Adding a 30-watt FPGA to a server turbocharges it, allowing the CPU to recognize images significantly faster.
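For readers unfamiliar with the term, the core operation of a convolutional neural net is sliding a small filter across an image and summing the element-wise products at each position. The sketch below shows that single step in plain Python. It is a generic illustration, not Microsoft's implementation: the `conv2d` function, the 4x4 image and the edge-detecting kernel are all hypothetical, and a real network stacks many such filters with learned weights.

```python
def conv2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)
```

The resulting feature map is nonzero only in the middle column, where the filter straddles the edge. Every output value is an independent multiply-and-add, which is the kind of highly parallel arithmetic that hardware accelerators are well suited to.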
"It gives the server a huge boost for AI tasks," he said.
Showing a more complex task, Burger demonstrated how a high-end configuration with 24 CPU cores, once four FPGA boards are added, can translate the 1,400-page "War and Peace" from Russian to English in 2.5 seconds.
"Our accelerated cognitive services run blazingly fast," he said. "Even more importantly, we can now do accelerated AI on a global scale, at hyperscale."
Applying 50 FPGA boards to 50 nodes, the AI-based cloud supercomputer can translate 5 billion words into another language in less than a tenth of a second, according to Burger, amounting to 100 trillion operations per second.
"That crazy speed shows the raw power of what we've deployed in our intelligent cloud," he said.
In an interview, Burger described the deployment of this new network infrastructure in Azure as a major milestone and differentiator for Microsoft's public cloud.
"This architecture is disruptive," Burger said, noting it's also deployed in the fabric of the Bing search engine. "So when you do a Bing search, you're actually touching this new fabric."
Jeffrey Schwartz is editor of Redmond magazine and also covers cloud computing for Virtualization Review's Cloud Report. In addition, he writes the Channeling the Cloud column for Redmond Channel Partner. Follow him on Twitter @JeffreySchwartz.