Drilling Down on Queries

Now that you understand the basic structure of a SQL application, let's dissect the basis of information retrieval: the query.

For many techies, especially those that focus on systems administration, the language used to communicate with database servers can be a mystery. You can't easily set up a database or build a new application using only GUI tools alone. Fortunately, there's a standard language for communicating with relational database management systems, such as Microsoft's SQL Server 2000.

That's the focus of this article: the basis of the Structured Query Language (SQL) and how you can use it to access data from relational data sources. We'll cover the basic syntax of the four major types of SQL commands and we'll look at some hints and tips for creating useful SQL scripts.

Let's start with the basics: SQL stands for Structured Query Language. If you've read this far, I'm assuming there's a good chance that this doesn't answer all of your questions. So, exactly what is this SQL (pronounced, "sequel") stuff, anyway? Many people feel like they may have missed the first one since it's yet another strange language to learn. One of the main benefits of SQL is that it's a standard for communicating with databases; there's even a standards committee that controls how the language is developed and extended (see "About SQL Standards"). Simply put, SQL queries are used to instruct database servers to make changes to relational data (that is, create, modify, delete and read information stored in database tables). SQL is supported by other relational database platforms, including Oracle, IBM DB/2 and Informix. Furthermore, you can use SQL to access information through other ODBC-compliant data sources, such as Microsoft Excel files, text files, etc. For now, trust me - it's useful!

About SQL Standards
As is often the case with many types of languages, there are several standards and versions. The "official" version of SQL is defined by the American National Standards Institute (ANSI) and is named based on the year the specification because active. For example, ANSI SQL-92 is a standard that's probably the most commonly supported standard. Although most current database platforms support a large portion of the specification, each has its own additions. For example, Oracle uses the Procedural Language/Structured Query Language (PL/SQL) and Microsoft uses Transact-SQL to extend the functionality provided by the ANSI specifications. These extensions define how features like stored procedures, triggers and other database objects are implemented.

In this article, I'll start with building simple SQL queries and then move to advanced queries that include joins between multiple tables. Although I'll be using examples from Microsoft's SQL Server 2000 in this article, the same queries should run fine without modifications.

Setting Up the Basics
Before we get started, you should have access to an installation of SQL Server 7.0 or SQL Server 2000, and you should have the Query Analyzer tool installed. When you open Query Analyzer, you'll be asked to log on to a database server. You can use either standard authentication or a username and password combination. In general, the name of the instance will be the same as the machine name for the SQL Server installation you're trying to access. If you're referring to the local machine, you can simply use a "." for the server name. Once you've got Query Analyzer open (see Figure 1), you're ready to start writing queries. Use the drop-down box at the top to select the Northwind database.

The Query Analyzer has many useful features, which I won't cover here in an attempt to focus on the basics. To find out what those features are, use the toolbar or press F1 to get the Transact-SQL help. You can also get more by using the Object Browser and templates that help you create all types of SQL queries.

Figure 1. Using SQL 2000's Query Analyzer. (Click image to view larger version.)

Types of SQL Queries
Let's jump into the commands that you can use to modify data within your relational databases. There are four main types of SQL statements:

  • SELECT - Used to retrieve information from a database. A SELECT query defines the data you're trying to return and embeds the logic that's required to find and format the results.
  • INSERT - Used to add rows to a table.
  • UPDATE - Used to modify rows in a table.
  • DELETE - Deletes a row from a database table.

In general, these statements are known as Data Manipulation Language (DML) statements. How's that for a brief overview? Well, how about if we dig into the details.

SELECT Queries
A basic SQL SELECT query has two main clauses: SELECT, which tells the server which columns you want to return, and FROM, which tells the server from which tables you want to retrieve data. Here's a query that returns a list of all of the CustomerIDs and their associated Company Names from the Customers table.

SELECT CustomerID, CompanyName
FROM   Customers

It's as simple as that! Figure 2 provides an example of the results from a fairly complicated SELECT query (don't worry about the syntax).

Figure 2. Running a complicated SELECT query in Query Analyzer. (Click image to view larger version.)

Let's look at some features of SQL queries:

  • SQL queries are not case sensitive. SELECT generally means the same things as Select or select. However, various database servers may be case-sensitive, so if you're searching for "Bobby Fischer", you may not receive a row that has a value of "bobby fisher". (By default, SQL Server is case-preserving but not case-sensitive).
  • The formatting of SQL queries is not important to the database server. For example, the above query could have been written on a single line. The formatting of the query is important to developers, however, from a management standpoint. If I write a 50-line SQL statement on a single line, the code will be difficult to read, manage and maintain. There are many different ideas regarding how objects should be named and queries should be written. The short rule is to pick a standard and stick with it.

OK, let's look at some more features of SQL queries. First, you can use the "*" character as a wildcard to return all of the columns from one or more tables. Second, you can use aliases as shortcuts for referring to tables. Aliases are specified in the FROM clause and can significantly reduce the amount of typing required to get the information you want. The following SELECT query uses these shortcuts:

SELECT c.*
FROM   Customers c

There are additional features that make SQL queries much more useful. Here are some additional clauses that you can add to your queries:

  • AS - Use the "as" keyword to provide friendly names for the columns in a result set. For example, if your table defines a column name as "CustomerID", you can format the result as "CustomerNumber" or something more meaningful to your users and developers.
  • WHERE - Perhaps the single most important clause, the WHERE clause defines the criteria for the rows you want to return. You might want to obtain a list of customers that are located in the state of Michigan and have placed at least one order.
  • GROUP BY - Often, you'll want to perform aggregate operations (such as a SUM, which totals the values in a column). The columns in the GROUP BY clause specify the order in which the values will be grouped.
  • ORDER BY - When you're viewing report information, you'll often want to sort the information. The order of the columns specified in the ORDER BY clause determine how the results will be sorted. For example, this query:

SELECT c.CompanyName as CompanyName,
       c.City as CityName, c.Country
FROM Customers c
WHERE (c.Country = 'USA'
   OR c.Country = 'France')
ORDER BY CompanyName

returns basic company information for records in your Customers table. It restricts the display to only those customers who reside in the USA or in France and sorts the results by the name of the company.

Earlier in the article, I promised that we'd stick to the basics. There's a lot more to know about basic SQL statements. For example, JOINs are absolutely vital in retrieving information from most real-world databases. And, there are a lot of other features such as sub-select statements, formatting of result sets, aggregate functions and much more. If you want to dive into the details of building these types of queries, you can start by using the SQL Server Books Online (a great resource).

INSERT Queries
A common SQL operation is to add rows to a table. Enter the SQL INSERT query. An INSERT statement is quite simple, and defines the columns you want to modify, along with the values you want to insert into those fields. This query might be used to add a row to an Employee table:

INSERT INTO Employee (EmployeeID, FirstName, LastName)
VALUES (937, 'John', 'Petrucci')

It's as simple as that - kind of. You can also dynamically generate the list of columns you want to return using a SELECT statement. Again, for more information, see the SQL Books Online.

UPDATE Queries
UPDATE queries are used to modify the data in rows in a database. I might want to increase all of the prices in my Products table for products that currently cost less than $10. A sample query might look like this:

UPDATE Products
SET Price = Price * 1.08
WHERE Price < 10.00="">

This query updates all rows in the Products table. Specifically, it increases the price of all items by eight percent if the item price is less than $10.

DELETE Queries
The DELETE statement is used to remove rows from a database table. This example removes all of the customers that have not placed an order in more than two years from my database:

DELETE FROM OrderHistory
WHERE LastOrderDate < '31-mar-2000'="">

One note of caution that your DBAs will thank me for: Be careful that you don't forget the WHERE clause. A simple two-line statement can easily delete all of the data in your entire OrderHistory table. This is perhaps the most common (and potentially embarrassing) database error you can make.

Tips for Writing SQL Queries
Finally - a place where I can rant about some of my pet peeves of database development! If you're planning to write SQL queries (especially within a development team), it's important that you follow some fairly simple rules. Here are some pointers on getting started:

  • Develop a naming convention - It's important the you create a naming convention for your databases, the tables within them and even specific columns. It can be extremely annoying for other database developers to determine whether the column is supposed to be called "EmployeeNo", "EmployeeNumber", "Employee_Number", etc.
  • Simplify security administration - Most database servers offer you the ability to set database security permissions on specific objects (such as tables). Often, you'll find that creating objects such as views and stored procedures and then placing permissions on those objects can be simpler and more secure than placing permissions on the tables themselves. Also, be sure to take advantage of features such as pass-through authentication and the use of roles for database users.
  • Establish a test environment - Successful developers understand the importance of testing code after making modifications. Be sure that you test interactions with all of your database objects when you modify something like a table. A seemingly innocuous change like renaming a column in a table can wreak havoc on the rest of your database.
  • Measure the performance of your queries and transactions - It's all too easy to think, "It works and returns the correct results, so my job is done." That may be true in some cases, but it's also possible that you've written a query that returns the correct results - the only problem is that it takes four weeks to do so! Be sure to measure performance of your queries through tools like SQL Server's Query Analyzer before you drop this code into production.

I could go on forever with these tips (and I have done so before - just ask my co-workers). But, that would fill another article (hey, there's an idea!).

Waiting for the SQL?
There you have it - the basics of writing SQL queries. There's a lot more to know and, if you're planning to write database-driven applications, you'll soon find some of the strengths and weaknesses of SQL. However, understanding SQL and how it can be used to retrieve and modify data is a useful skill to add to your bag of tricks as a developer or database administrator. Remember to use your new powers for good, not evil! Next time, I cover database design issues.

Portions of this article have been adapted from SQL Server 2000 Backup & Recovery by Anil Desai (Osborne/McGraw-Hill).

Featured