In-Depth

SQL Server 2000: All Charged Up!

Microsoft's made its relational database management system fully Web-enabled, scalable, reliable, and faster for development. What justifies putting it into a qualifying heat in your organization?

If you’re interested in databases at all, you already know that Microsoft is about to ship a new version of SQL Server. In line with this year’s naming model, it’s called SQL Server 2000, although internally it’s also called SQL Server 8.0. Microsoft’s design goals for this edition were to make the application fully Web-enabled, scalable, reliable, and fast to develop with. But how did those goals translate to product features?

In this article, I’ll introduce you to some of the more significant new features in SQL Server 2000, so you can start the process of evaluating it for your own organization.

Super Speedway Support

Let’s start with a look at some of the numbers. Microsoft is getting more serious all the time about making SQL Server an enterprise-level database. For starters, if you could afford it, you could run SQL Server on a 32-processor server with 64GB of RAM. That server could host a database containing more than a million terabytes of data. But of course, numbers never tell the whole story. Microsoft has implemented a number of improvements in support for very large, reliable databases.

For example, failover support has been substantially improved, making it easier to run SQL Server in a 24x7 environment without outages due to hardware failures. You can create clusters of up to four nodes, and set up failover or failback to or from any node in the cluster in case of problems. Individual nodes can be reinstalled or rebuilt without affecting other nodes in the cluster.

If you’re maintaining multiple copies of a distributed database, you’ll want to look into the new log shipping feature. With log shipping, copies of a transaction log are fed from a source database to a destination database on a constant basis. This lets you keep the destination database as a warm standby, ready to replace the source database at any time. It also provides an easy way to separate OLAP analysis applications from OLTP data entry applications; let the data entry proceed at your primary server, then use log shipping to update a secondary server where you can run time-consuming analyses. The Database Maintenance Plan Wizard has been enhanced to include log shipping setup. A separate server within your organization can be designated as the Monitor Server to keep track of all log shipping operations.

For those occasions when scaling up your server to the largest possible monster server doesn’t handle the traffic, you can now look at “scaling out.” Microsoft’s tool for doing this is the federated server. With federated servers, you split a table horizontally across as many database servers as necessary. That is, one server contains the first hundred thousand rows of the table, the next contains the next hundred thousand, and so on. SQL Server 2000 sports updateable distributed partitioned views, which is a fancy way to say that a single view can treat this collection of federated tables as a single table. With federated servers, there’s no practical limit to the amount of data you can store and update in a single table. Scaling out has the added bonus of being less expensive than scaling up for equivalent power.

For those maintaining data in heterogeneous sources, the new OPENROWSET function in T-SQL will come as a distinct improvement over the old OPENQUERY. OPENROWSET allows you to specify OLE DB connection information directly in an ad-hoc query. On a practical level, this means that you can use a rowset from any OLE DB provider in place of a native SQL Server base table.

Under the Hood

You may have noticed recently that Microsoft is integrating XML into all of its products. SQL Server 2000 is no exception. For starters, SQL Server 2000 is tightly integrated with Microsoft Internet Information Server. You can define a virtual root in an IIS Web site that refers to a database on SQL Server. Properly constructed URLs directed at that virtual root can contain T-SQL queries, including the new FOR XML clause, to generate XML documents dynamically for display in a Web browser. Figure 1 shows this capability in action.

Figure 1. Retrieving database results as XML via IIS. Note the use of special characters in the URL.

SQL Server 2000 can also work directly with XML documents. The new system stored procedure sp_xml_preparedocument can read any well-formed XML document into memory and parse its contents. From there, the OPENXML keyword can be used in a SELECT statement to make any set of nodes from the document available as an updateable recordset.

It’s become clear that the SQL Server team couldn’t quite get all the XML debugged in time to make it into the box. Fortunately for developers, it appears that new XML technology will continue to roll out on a regular basis. The first offering is the updategram, which (as of this writing) is available as a download for SQL Server 2000 beta 2. With an updategram, you can construct an XML message that shows the state of data before and after an update, send it to the server, and have SQL Server translate the updategram into the appropriate UPDATE, INSERT, and DELETE statements. This facility brings us one step closer the realization of XML as a universal language that can be used by different applications to exchange data.

One more new feature of possible interest to those running databases for the Internet is support for multiple server instances on the same computer. With SQL Server 2000, you can install more than one server on a single computer as completely isolated applications (although they share much of the same disk footprint). This is ideal for service providers that want to let multiple customers share a single computer without sharing a single copy of SQL Server.

Souped-up Tranny

Don’t worry—the database administrator hasn’t been overlooked in the upgrades here. One of the new features of SQL Server 2000 is the use of Kerberos to support authentication across multiple computers. By using Kerberos in conjunction with Windows 2000 Active Directory, servers can pass user credentials among themselves. This is especially useful for distributed querying scenarios. If a user authenticates to one server that then uses OPENROWSET to retrieve data from a second server, Kerberos provides a secure way for the second server to check the user’s credentials.

The backup and restore architecture in SQL Server 2000 has been reworked as well. You can now save a log mark at the start of any transaction, which allows restoring the database to just before or just after that transaction. In addition, SQL Server will use multiple threads to perform backup and restore operations in parallel whenever possible, making it easier to manage very large databases. For example, if a backup set is being striped across multiple devices, SQL Server will devote a thread to each device.

The venerable Database Consistence Checker (DBCC) utility, too, has been improved by the addition of parallelism. As you know, DBCC executes through Transact SQL to perform maintenance tasks, check the status of the server, and validate the data stored on the server. If you have multiple processors on your server, the new version of DBCC will run on all of them, allowing much faster consistency checking in a database. DBCC can also check tables without taking locks, allowing it to run at the same time that tables are being updated.

Moving Around the Track

Anyone who’s maintained a large database installation knows that just getting the data into the database isn’t enough. You frequently need to manipulate and transport the data to multiple locations. SQL Server 2000 sports improvements in this area as well.

Replication, which allows maintaining copies of a database in diverse locations, was an immense step forward when Microsoft first introduced it in version 6.0. In SQL Server 2000, replication has been enhanced in numerous ways:

  • Replication of schema changes on published databases.
  • Replication of scripts to subscribers.
  • Pre- and post-snapshot scripts.
  • Remote agent activation, to better distribute the processing load of replication.
  • Greater parallelism in the replication agents.
  • Vertical filters for merge publications.
  • Alternate synchronization partners for merge publications.
  • New merge replication conflict resolvers.
  • Better support for heterogeneous data sources in replication.
  • Queued updating for transactional replication.
  • Data transformation during the publishing process.
  • Integration of replication publishing with Active Directory.

The result is a robust and full-featured replication service that goes far beyond the original goal of maintaining two identical copies of a database from a single point of data entry.

SQL Server’s other data movement tool, Data Transformation Services (DTS), has been similarly enhanced. The variety of data sources and tasks built into DTS has been expanded, and the documentation and examples for creating your own custom task is improved. Figure 2 shows an example of a DTS package that includes multiple data sources and conditional workflow in the new version of the DTS designer, with the icons for the new tasks and data sources visible.

Figure 2. DTS Package in the Package Designer. This package includes SQL and Send Mail tasks, conditional workflow, and both SQL Server and text data sources.

DTS can also now export entire packages to Visual Basic projects, making it very easy to use DTS from standalone applications.

Analyzing the Results

Microsoft OLAP Server has been renamed Microsoft Analysis Services in this version of SQL Server, and it’s been substantially improved from the version included with SQL Server 7.0. Perhaps the biggest news is the addition of data mining to Analysis Services. With data mining, Analysis Services will help you explore your data for correlations and even predict values for new data. Figure 3 shows a data mining model in action.

Figure 3. The new Data Mining Model Browser. Darker color corresponds to more records; the most important factors are closest to the root of the tree. The Content Navigator window provides a schematic overview of the entire model.

In this case, the data in a set of tables is being analyzed to determine which factors are the most important in predicting the Yearly Income column of a table. The core OLAP portion of Analysis Services has been enhanced as well. A few of the key new features here include:

  • Parent-Child dimensions to handle tables with self joins.
  • Write-enabled dimensions that let you update data through Analysis Manager.
  • Ragged dimensions for better handling of data with an incomplete or inconsistent hierarchy.
  • Dimension filtering to limit the rows included in a dimension.
  • Kerberos authentication and finer-grained security.
  • Actions linked to cube data. For example, you can jump straight from a customer to their Web page.
  • Linked cubes for use by distributed Analysis Servers.
  • A drag-and-drop builder for MDX Expressions.
  • Client-side PivotTable Service connections via HTTP and IIS.

If you’ve been struggling with some of the limitations in SQL Server 7.0’s OLAP Server, you should be pleasantly surprised by all of the work that went into making Analysis Services a much more serious tool for working with large amounts of data.

Improvements in the Pit

Developers haven’t been left behind, either. The core of SQL Server, after all, is the Transact-SQL language. Although there aren’t a lot of new extensions to the language this time around, the ones that were incorporated are significant. SQL Server now supports user-defined functions, or UDFs. By using the CREATE FUNCTION keyword, you can wrap up your own custom processing in a package that’s indistinguishable from a built-in SQL Server function:

CREATE FUNCTION twoTimes
( @input int=0 )
RETURNS int
AS
BEGIN
RETURN 2 * @input
END

After you create a function, it’s easy to use it in a SQL statement:

SELECT OrderID, dbo.TwoTimes(Quantity) AS Extra
FROM [Order Details]

Another nice extension to the language is the addition of indexed views. You can use the CREATE INDEX statement on the results of a view to improve the response time of future statements that retrieve data from that view. New data types in T-SQL include bigint for eight-byte integers, sql_variant for storage of variant data (and tighter integration with, for example, Visual Basic applications), and table for temporary storage of result sets for later use. Many developers will be pleased to know that cascading declarative referential integrity is now fully supported. For those using triggers for custom processing, INSTEAD OF and AFTER triggers are welcome improvements as well. All of this T-SQL is wrapped up in the latest iteration of SQL Server Query Analyzer. Figure 4 shows this tool in its SQL Server 2000 version.

Figure 4. SQL Server Query Analyzer. Note the separate server and client impact analyses.

The Object Browser window offers drag-and-drop access to both the objects in your databases and templates for common bits of SQL (and, yes, you can add your own templates to the list). You can also trace both the client and server impact of queries directly from query analyzer, as well as estimate the execution plan without actually executing the query.

Should You Sign Up for a Test Drive?

Feature lists are a thing of joy to marketers and developers, but for the business side, there’s really only one question: Is it worth spending the money to upgrade to a new version of a complex and core program such as SQL Server? After working with beta versions for most of a year, I can see several key situations where the upgrade is immediately justified:

  • If your organization has made a commitment to new technologies such as XML, Active Directory, or Kerberos, you’ll appreciate the integration with these features.
  • If you’re running a large Web site or internal application that’s pushing the limits of SQL Server 7.0, the improved scalability in SQL Server 2000 will remove those limits. If you have a large amount of data to analyze from a business perspective, Analysis Services can pay for the cost of the upgrade all by itself.
  • If you have many servers to administer and develop with, the incremental improvements to both the administrative and development tools can make a substantial difference in your productivity.

Of course, as with any other business-critical software, you should approach a database server upgrade with caution. If at all possible, install SQL Server 2000 on a test server and run your own application in that environment before rolling it out on your production servers. Microsoft makes this process easy with evaluation copies and a wealth of information on the SQL Server Web site at www.microsoft.com/sql/default.htm. Speaking personally, I’ve found SQL Server 2000 to be a polished and stable release, and I look forward to helping my clients implement it.

Featured