Microsoft Launches Preview of Azure Databricks

Microsoft this week unveiled a new service that integrates its Azure platform with a globally distributed streaming analytics solution.

Now available in preview, Azure Databricks is designed to help app users and developers take advantage of machine learning, graph processing and AI-based applications. It was one of many announcements made by Microsoft during the opening keynote of its annual Connect() developer conference, which kicked off Wednesday in New York City.

Azure Databricks will enable organizations to build modern data warehouses that support self-service analytics and machine learning using all data types in a secure and compliant architecture, according to Scott Guthrie, Microsoft's executive vice president of Cloud and Enterprise, speaking at the Connect() keynote.

Databricks was founded by the original creators of Apache Spark. Azure Databricks is effectively a first-party Spark-as-a-Service platform for Azure. "It allows you to quickly launch and scale up the Spark service inside the cloud on Azure," Guthrie said. "It includes an incredibly rich, interactive workspace that makes it easy to build Spark-based workflows, and it integrates deeply across our other Azure services."

Those services include Azure SQL Data Warehouse, Azure Storage, Azure Cosmos DB, Azure Active Directory, Power BI and Azure Machine Learning, Guthrie said. Azure Databricks also integrates with Azure Data Lake Store, Azure Blob Storage and Azure Event Hubs. "It's an incredibly easy way to integrate Spark deeply across your apps and drive richer intelligence from it," he said.

Databricks customers have been pushing the company to build its Spark platform as a native Azure service, said Ali Ghodsi, the company's co-founder and CEO, who joined Guthrie on stage. "We've been hearing overwhelming demand from our customer base that they want the security, they want the compliance and they want the scalability of Azure," Ghodsi said. "We think it can make AI and big data much simpler."

In addition to integrating with the various Azure services, Azure Databricks is designed to let users create new data models. According to Databricks, a user can target data regardless of size or create projects with various analytics services, including Power BI, SQL Server, Streaming, MLlib and Graph.

"Once you manage data at scale in the cloud, you open up massive possibilities for predictive analytics, AI, and real-time applications," according to a technical overview of the Azure Databricks service. "Over the past five years, the platform of choice for building these applications has been Apache Spark. With a massive community at thousands of enterprises worldwide, Spark makes it possible to run powerful analytics algorithms at scale and in real time to drive business insights."

However, deploying, managing and securing Spark at scale has remained a challenge, one that Databricks believes will make the Azure service compelling.

Internally, Databricks is using Azure Container Service to run the Azure Databricks control plane and data planes in containers, according to the company's technical primer. It's also using accelerated networking services to improve performance on the latest Azure hardware specs.

About the Author

Jeffrey Schwartz is editor of Redmond magazine and also covers cloud computing for Virtualization Review's Cloud Report. In addition, he writes the Channeling the Cloud column for Redmond Channel Partner. Follow him on Twitter @JeffreySchwartz.