Microsoft Readies Azure-Based Big Data Offerings

Microsoft on Monday gave an update on some forthcoming additions to its Big Data portfolio, including Azure Data Lake Store.

First announced this past April, Azure Data Lake Store will be released as a preview later this year. Azure Data Lake Store is a new data store, compatible with the Hadoop Distributed File System (HDFS), aimed at enabling organizations to run large analytics workloads. Microsoft describes Azure Data Lake Store as a single repository that lets users capture data of any size or format without requiring changes to the application as data scales. Data can be securely stored and shared, as well as processed and queried from HDFS-based applications and tools, T. K. "Ranga" Rengarajan, Microsoft's corporate vice president for data platform, wrote in a blog post on Monday.

"Azure Data Lake includes all the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape and speed, and do all types of processing and analytics across platforms and languages," Rengarajan said. "It removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics. Azure Data Lake works with existing IT investments for identity, management, and security for simplified data management and governance. It also integrates seamlessly with operational stores and data warehouses so you can extend current data applications."

Complementing Azure Data Lake Store will be the newly announced Azure Data Lake Analytics, an Apache YARN-based service designed to scale dynamically to handle Big Data workloads. The new service will be based on U-SQL, a language that will "unify the benefits of SQL with the power of expressive code," Rengarajan said. "U-SQL's scalable distributed query capability enables you to efficiently analyze data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse."

In an MSDN blog post published on Monday, Michael Rys, a principal program manager for Big Data at Microsoft, explained why U-SQL is suited for Azure Data Lake Analytics:

Taking the issues of both SQL-based and procedural languages into account, we designed U-SQL from the ground-up as an evolution of the declarative SQL language with native extensibility through user code written in C#. This unifies both paradigms, unifies structured, unstructured, and remote data processing, unifies the declarative and custom imperative coding experience, and unifies the experience around extending your language capabilities.

U-SQL is built on the learnings from Microsoft's internal experience with SCOPE and existing languages such as T-SQL, ANSI SQL, and Hive. For example, we base our SQL and programming language integration and the execution and optimization framework for U-SQL on SCOPE, which currently runs hundred thousands of jobs each day internally. We also align the metadata system (databases, tables, etc.), the SQL syntax, and language semantics with T-SQL and ANSI SQL, the query languages most of our SQL Server customers are familiar with. And we use C# data types and the C# expression language so you can seamlessly write C# predicates and expressions inside SELECT statements and use C# to add your custom logic. Finally, we looked to Hive and other Big Data languages to identify patterns and data processing requirements and integrate them into our framework.

Microsoft also announced the general availability of managed clusters for its Azure HDInsight service on Linux, which the company says carries a 99.9 percent uptime SLA. The company is also offering Azure Data Lake Tools for Visual Studio and said that ISV applications can be offered in the Azure Marketplace.

About the Author

Jeffrey Schwartz is editor of Redmond magazine and also covers cloud computing for Virtualization Review's Cloud Report. In addition, he writes the Channeling the Cloud column for Redmond Channel Partner. Follow him on Twitter @JeffreySchwartz.
