Lots of products offer tools to build a powerful Web site. Big deal, right? Site Server moves to another level, letting you analyze and manage your site better and personalize the visitor experience.
        
        Broaden Your Sites: The Site Server 3.0 Story
        
        
			- By John West
 - November 01, 1998
 
		
        Have you ever been to a Web site where you felt like 
        the host knew you were coming? Maybe it said, “Hi, 
        John,” or perhaps it already knew your address when 
        you went to place an order. Without your asking, it may 
        even have told you about a sale on your favorite type 
        of product. 
      Or perhaps you develop intranet applications at your 
        company. Wouldn’t it be nice to recognize that a 
        particular employee is browsing, without that person having 
        to log on separately from the network? How about providing 
        content tailored to his or her business unit or job? 
      On the back end, wouldn’t it be great to track what 
        your users are doing at your site and have built-in tools 
        for allowing new content to be created, categorized, and 
        deployed to production? 
      In a nutshell, that’s what Microsoft’s Site 
        Server 3.0 is all about. 
      Last year, Microsoft introduced Site Server 2.0, a compendium 
        of purchased technologies that added up to a Web management 
        suite. While Site Server 3.0 still struggles to offer 
        tight integration (separate groups developed the main 
        components of the latest product), this release takes 
        Web management and e-commerce to a new level of affordable 
        functionality. 
      In this article, I introduce you to the functional areas 
        of Site Server. Once you understand what each area encompasses, 
        you may discover new ways to make your intranet or Internet 
        site more valuable to your users. 
      I should state up front: Site Server comes in two flavors. 
        I cover Site Server 3.0 in this article. It allows you 
        to create intranets and Internet sites that are personalized, 
        customized, searchable, and maintainable. Site Server 
        3.0, Commerce Edition, allows you to create online stores 
        and Web-based extranet applications. It includes tools 
        to set up a Web-based storefront, take orders securely, 
        customize the ordering experience based on your business 
        needs, and analyze how your store is used. A story about 
        that edition is planned for a future issue of the magazine. 
      
      
      While Microsoft usually refers to four main areas of 
        Site Server (publishing, search, personalization and membership, 
        and analysis), I prefer to break it down into the areas 
        shown in the chart below. 
      Membership: Don’t I Know You? 
      Membership allows an administrator to maintain information 
        about the members—users who are browsing your site—in 
        a way that’s secure and reliable. This information 
        can be used to secure parts of a site from unauthorized 
        users. It can also be used to create a sense of community 
        by allowing users to view other users’ information, 
        in case you want to provide a directory of users or offer 
        chat capabilities. Membership and personalization can 
        be used together to provide each user with a customized 
        site visit experience. Membership (along with personalization) 
        also provides the ability to cross-sell and up-sell related 
        products within an e-commerce environment. You can recommend 
        products based on past purchasing decisions—currently, 
        one of the holy grails of Web marketing. 
      With Site Server, every member on a site is recognized 
        by his or her credentials. These credentials could be 
        a user name and password, a certificate, a cookie, or 
        some other way of uniquely identifying the visitor. 
      The Membership Directory stores membership and personalization 
        data in an ODBC-compliant database. Choose the database 
        carefully, because data can’t be migrated if you change 
        your mind later. In my experience, the Membership Directory 
        seems to work on SQL Server, not Oracle. (All other parts 
        of Site Server seem to work with Oracle and other ODBC-compliant 
        DBMSs; only the LDAP Membership Directory doesn’t. This 
        limitation is fairly crippling if you want to use another 
        database system, because membership is at the heart of 
        this product.) 
      The Membership Directory is an LDAP-compliant database 
        (see Figure 1). It’s arranged as a hierarchy of objects 
        that defines all aspects of Membership. It includes the 
        Directory schema (the configuration and relationship of 
        the objects within the Membership Directory), user information 
        (for example, user name, password, and birth date), groups, 
        site information (site vocabulary, distribution lists, 
        and application data), and data about various sources 
        of content on your site. 
      
         
          Figure 1. The Membership Directory is hierarchical. As you 
            drill down through it, you begin to see more detail. 
        
      
      The Membership Directory also stores dynamic data. While 
        dynamic data looks and feels like any other data in the 
        Membership Directory, it’s never physically written 
        to the Membership Directory. Instead, it’s kept in 
        memory. This enables information about a user to persist 
        while the user’s session is active. For example, 
        if a user is placing an order, data about the contents 
        of the current order might be kept in dynamic data as 
        that person shops the site. Data like this is needed only 
        for a short time. Once the user places the order or the 
        session times out because he or she has left the site, 
        the memory is freed. 
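
The lifecycle just described can be sketched in a few lines of Python. This is only an illustration of the idea, not how Site Server implements dynamic data; the class name `DynamicDataStore`, its methods, and the 20-minute default timeout are assumptions made for the example.

```python
import time


class DynamicDataStore:
    """Hypothetical in-memory store: data lives only while a session is active."""

    def __init__(self, timeout_seconds=1200.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock      # injectable so expiry can be exercised in tests
        self._sessions = {}      # session id -> (last access time, data dict)

    def put(self, session_id, key, value):
        _, data = self._sessions.get(session_id, (None, {}))
        data[key] = value
        self._sessions[session_id] = (self._clock(), data)

    def get(self, session_id, key, default=None):
        entry = self._sessions.get(session_id)
        if entry is None:
            return default
        last_access, data = entry
        if self._clock() - last_access > self._timeout:
            # The session timed out: free the memory. Nothing was ever written to disk.
            del self._sessions[session_id]
            return default
        self._sessions[session_id] = (self._clock(), data)
        return data.get(key, default)

    def end_session(self, session_id):
        """Called once the order is placed: drop everything held for the session."""
        self._sessions.pop(session_id, None)
```

If several Web servers were pointed at one shared store of this kind, a shopper's in-progress order would be visible no matter which server handled the next request.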
      Tip: If you’ve 
        worked with Active Server Pages (ASP) and specifically 
        the Session and Application objects, you might be wondering 
        if it’s better to use those objects or dynamic data 
        within the Membership Directory. In many cases, there’s 
        no hard and fast rule. However, one case makes the benefit 
        of dynamic data over ASP objects very clear: Web farms. 
        If you have a Web farm that needs to load-balance and 
        be completely fault tolerant, dynamic data is the way 
        to go. Configure each Web server to use the same Membership 
        Directory. That way, even if the user connects to different 
        servers as he or she makes HTTP requests during a session, 
        the data that needs to persist throughout the session 
        will be available to each server via dynamic data. 
      
         
           
            
               
                 
                  
                     
          | Functional Area of Site Server | What It Does |
          | Membership | Allows a site to maintain information about visitors in a way that’s secure and scalable. |
          | Personalization | Allows users to experience a site in a way that’s customized to their needs based on their identity. |
          | Search | Provides powerful mechanisms for users to easily find the information they need in an organization. |
          | Push | Makes it possible to have customized information automatically delivered to users instead of requiring users to search for it. |
          | Analysis | Provides the ability to see usage statistics about your site and to analyze your site for any problems with its structure. |
          | Content Management | Enables users to post content to a site in a structured way. Users must tag content with standard values. Also, content can be sent through an editorial process where another user has to approve the content that has been posted before it appears on the site. |
          | Content Deployment | Allows administrators to set up the Web infrastructure in a way that’s scalable and fault-tolerant by setting up replication projects and routes for content distribution. |
        
      
       Many Routes to Authentication 
      When an administrator sets up a Membership Directory, 
        he or she must select one of two possible choices to store 
        users’ credentials: Windows NT Authentication or 
        Membership Authentication. While the two authentication 
        methods store user credentials in different places, they 
        both use the Membership Directory to store the user’s 
        profile. (With NT Authentication mode, the NT SAM is leveraged 
        for user credentials. With Membership Authentication mode, 
        user credentials are stored in the Membership Directory.) 
        In both modes, all additional user attributes and information 
        are stored in the Membership Directory. 
      NT Authentication interfaces with NT’s security 
        account database to authenticate users. This option is 
        preferable for intranets, since administrators need to maintain 
        only a single user database and users can have a 
        single logon for both file and intranet access. For authentication 
        to be seamless, however, users need Internet Explorer 
        3.x or later. This enables you to use NT Challenge/Response 
        (better known as NTLM) authentication for your Web site. NTLM 
        authentication uses the same credential validation process 
        to access the Web site that NT Server uses to authenticate 
        a user. The key is that NTLM doesn’t require the 
        actual password to be sent across the network. 
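
To see why a challenge/response scheme never needs to put the password on the wire, here is a simplified sketch in Python. It is a generic challenge/response exchange built on HMAC-SHA256, not the actual NTLM algorithm or message format; the function names are hypothetical and chosen only for this illustration.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str) -> bytes:
    """Both sides hold a key derived from the password (a stand-in for a stored hash)."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def make_challenge() -> bytes:
    """Server side: a fresh random value for each logon attempt."""
    return secrets.token_bytes(16)


def respond(challenge: bytes, password: str) -> bytes:
    """Client side: prove knowledge of the password without transmitting it."""
    return hmac.new(derive_key(password), challenge, hashlib.sha256).digest()


def verify(challenge: bytes, response: bytes, stored_key: bytes) -> bool:
    """Server side: recompute the expected response from the stored key and compare."""
    expected = hmac.new(stored_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Only the random challenge and the computed response cross the network, so an eavesdropper never sees the password itself.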
      If you can’t use NTLM, the other option with which 
        to validate users of a Web site when using NT Authentication 
        is basic/clear text. With basic/clear text, users don’t 
        have to maintain two separate accounts or passwords. However, 
        they must re-enter the user names and passwords they use 
        to access resources in the NT domain when getting onto 
        the Web site. Because the password is sent over the 
        network with the first request and again with each subsequent 
        request, it’s a potential security risk. If you have 
        to go this route, make sure you’re using Secure Sockets 
        Layer (SSL). 
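
The risk is easy to demonstrate. HTTP basic authentication only Base64-encodes the user name and password, and Base64 is an encoding, not encryption. The following Python sketch builds the header a browser would send and then reverses it; the function names are hypothetical.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the Authorization header value a browser sends with each request."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header: str) -> tuple[str, str]:
    """What anyone who can read the traffic can do: recover the credentials."""
    scheme, token = header.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    user, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return user, password
```

Without SSL wrapping the connection, anyone who captures a single request can run the second function and obtain a working NT domain logon.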
      A final option besides NTLM or basic/clear text for authenticating 
        users with NT Authentication is certificates. A user’s 
        certificate is manually mapped to an NT user account. 
        This method is the most secure, but also the most complex. 
        If you have a large user population, manually mapping 
        certificates for each user can become burdensome. And 
        since there’s no global way of knowing when a certificate 
        has been revoked, you may not be able to tell that a user 
        has a certificate he or she should no longer have. 
      
         
           
            
               
                 
                  
                     
                      8 Tips to Site Server Savvy 
                     
                     
                      
                        
                          - Plan your site vocabulary carefully. 
                            While it can be changed, doing so 
                            will take some effort and coordination. 
                            As much as possible, try to get it 
                            right the first time. 
 
                          - Use Active Channels or direct mailings 
                            to distribute any changes to important 
                            sections of your site, such as internal 
                            policies and procedures. 
 
                          - If you’re providing personalized 
                            pages to your users, you might allow 
                            them to change the order in which 
                            different content appears on the page. 
                            For instance, some users might want 
                            company news to be displayed first, 
                            while others might want the company’s 
                            prior day’s stock quote to be 
                            at the top. 
 
                          - If you’re providing a personalized 
                            experience via cookies and want to 
                            allow your users to gain access to 
                            their personalization settings from 
                            multiple machines, provide a way of 
                            recreating their cookies on a new 
                            machine. 
 
                          - Schedule automatic analysis reports 
                            about your site and send them to the 
                            appropriate parties or post them to 
                            a secure directory to be viewed at 
                            will. 
 
                          - Create rules so that a user sees 
                            only content that has been posted 
                            since his or her last time online. 
                          
 
                          - Crawl your competitor’s sites. 
                            Getting a direct mailing of changes 
                            that have occurred on a competitor’s 
                            site saves you from having to look 
                            for changes manually. 
 
                          - Because Membership Directory can 
                            store dynamic data (data held in memory 
                            on the server instead of in the physical 
                            directory) about a user, use it to 
                            store a user’s session information, 
                            if you can’t ensure a user will 
                            connect to the same Web server as 
                            he or she traverses your site. 
 
                         
                                 
                          —John West 
        
      
      With Membership Authentication, user credentials are 
        stored in the Membership Directory. If you’re providing 
        membership to Internet users or using NT for Web services 
        and don’t provide each internal user with an NT account, 
        this is probably the way to go. 
      There are several benefits to Membership Authentication 
        over NT Authentication. First, users can create their 
        own accounts. Second, this method scales to millions of 
        users; NT Authentication can accommodate around 40,000 
        users. Third, because credentials are stored with the 
        user’s profile information in the Membership Directory, 
        there’s no need to perform reads from two separate 
        locations. Therefore, performance is better. 
      As with NT Authentication, there are several methods 
        of authenticating a user with this method. You can use 
        basic/clear text, which I described earlier. Another option 
        is Distributed Password Authentication (DPA). This option 
        is similar to NTLM. However, instead of working at the 
        NT Domain level, it works at the root of the Membership 
        Directory level. It can cache user credentials so that 
        you get a single logon for any application using the Membership 
        Directory. As with NTLM authentication, this method only 
        works with IE. Unlike NTLM, where the logon dialog box 
        can’t be customized, you can customize the dialog 
        box using the Site Server SDK. 
      A third option for authenticating a user is HTML Forms 
        Authentication. With this method, a Web form containing 
        a user name and password prompt is displayed in the browser 
        for a user to be authenticated. This method supports the 
        largest user base, since HTML forms are almost universal 
        among browsers. Once a user signs on with this method, 
        he or she gets a cookie that contains authentication information, 
        eliminating the need for another logon unless the session 
        expires. With this method, SSL is highly recommended since 
        the password is transmitted over the network at initial 
        logon. 
      The fourth option involves certificates. This works like 
        NT Authentication, except that instead of mapping a certificate 
        to an NT account, you map it to a Membership Directory 
        account. 
      A fifth option for authentication, Automatic Cookie Authentication, 
        is useful if your users need personalization only, without 
        verifying who they are. Each user gets assigned an arbitrary 
        identifier (a GUID), and the account is placed under the 
        Anonymous container within the Membership Directory. This 
        method is completely insecure, because anyone with access 
        to the cookie can use it to impersonate the original user. 
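
A minimal Python sketch shows both the convenience and the weakness. It assumes a plain dictionary standing in for the Anonymous container and a hypothetical `identify_visitor` function; neither is Site Server's actual interface.

```python
import uuid

# Stand-in for the Anonymous container: GUID string -> personalization attributes.
anonymous_profiles = {}


def identify_visitor(cookie_value):
    """Return (guid, profile), issuing a new GUID if the cookie is missing or unknown."""
    if cookie_value in anonymous_profiles:
        # No proof of identity is asked for: holding the cookie is enough.
        return cookie_value, anonymous_profiles[cookie_value]
    guid = str(uuid.uuid4())
    anonymous_profiles[guid] = {}
    return guid, anonymous_profiles[guid]
```

Whoever presents a known GUID gets that profile back, which is exactly why the method suits personalization but not anything that needs real security.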
      
      Once you can recognize your users, you can set up security 
        on your Web site so that you know what users are accessing 
        your site and restrict or allow access to the content 
        based on their group and user permissions. For your content, 
        you can allow everyone rights to access it, require that 
        users provide specific information before they access 
        it, allow only registered users to access it, or allow 
        only a specific group or groups of those registered users 
        to access it. How you control access is up to you. 
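
The access choices listed above can be modeled as a simple policy check. The Python below is a sketch under stated assumptions: `Visitor`, `ContentPolicy`, and `can_access` are hypothetical names for illustration, not part of Site Server.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Visitor:
    registered: bool = False
    groups: set = field(default_factory=set)
    attributes: dict = field(default_factory=dict)   # information the visitor has supplied


@dataclass
class ContentPolicy:
    require_registered: bool = False
    required_attributes: set = field(default_factory=set)   # must be supplied before access
    allowed_groups: Optional[set] = None                    # None means no group restriction


def can_access(visitor: Visitor, policy: ContentPolicy) -> bool:
    """Apply each restriction in turn; content with an empty policy is open to everyone."""
    if policy.require_registered and not visitor.registered:
        return False
    if not policy.required_attributes.issubset(visitor.attributes):
        return False
    if policy.allowed_groups is not None and not (visitor.groups & policy.allowed_groups):
        return False
    return True
```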
      Personalization: Have It Your Way 
      Personalization works on top of membership to allow you 
        to provide users with a customized experience while browsing 
        your site. By providing personalized services and views 
        of the information on your site, you allow a user to quickly 
        get to the information he or she needs to be as productive 
        as possible. 
      The really cool idea here is delivering content within 
        a page based on a user’s activities—called “passive 
        profiling.” This becomes especially useful in e-commerce, 
        where you can recommend products based on other products 
        the user seems interested in. Cross-selling and up-selling, 
        while common in retail, are really just being introduced 
        on the Web. 
      There are several ways you can personalize content for 
        your users. The most common method is to create Web pages 
        that have content relevant to your users or to automatically 
        deliver content to a user’s mailbox via direct mail 
        (or rather, “direct e-mail”). For example, you 
        might want to send your sales force an e-mail each morning 
        with the current warehouse inventories. You don’t 
        have to maintain the list; you simply create a rule that 
        says, “Send message X to everyone who meets criteria 
        Y and Z,” and Site Server figures the rest out on 
        its own. More on this shortly. 
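
In code terms, a rule of that kind is just a filter over member profiles. The Python sketch below assumes made-up member records and a hypothetical `select_recipients` function; in Site Server the profiles would come from the Membership Directory.

```python
def select_recipients(members, *criteria):
    """Return the e-mail addresses of members who satisfy every criterion."""
    return [m["email"] for m in members if all(test(m) for test in criteria)]


# Hypothetical member profiles, invented for this example.
members = [
    {"email": "ann@example.com", "department": "Sales", "region": "West"},
    {"email": "bob@example.com", "department": "Sales", "region": "East"},
    {"email": "cal@example.com", "department": "IT", "region": "West"},
]

# "Send message X to everyone who meets criteria Y and Z."
in_sales = lambda m: m["department"] == "Sales"
in_west = lambda m: m["region"] == "West"
recipients = select_recipients(members, in_sales, in_west)
```

Because the list is computed each time the rule runs, a new salesperson starts receiving the mailing as soon as his or her profile matches.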
      
         
           
            
               
                 
                  
                     
                      Minimum Daily Requirements 

                      Site Server 3.0 appeared 
                        in final shipping form at the end of April. 
                        It comes in two flavors: the standard 
                        edition for standard business sites and 
                        intranets, and a Commerce Edition for 
                        large-scale sites that entail business-to-business 
                        or business-to-consumer transactions. 
                        The estimated retail price for the standard 
                        edition is $1,239 per server, which includes 
                        five client access licenses; the Commerce 
                        Edition starts at $4,609 with 25 client 
                        access licenses. Upgrade pricing is also 
                        available if you’re currently using 
                        Site Server or Site Server Enterprise 
                        Edition. 
                         Along with all the software needed 
                          to install and customize Site Server 
                          3.0 to meet your needs, the package 
                          includes a copy of FrontPage 98 and 
                          Visual InterDev 1.0. (By the time you 
                          read this, however, Visual InterDev 
                          6.0, a much more robust development 
                          tool than 1.0, should be available.)  
                        Site Server’s hardware requirements 
                          aren’t minimal. To start, I recommend 
                          a dedicated server with at least 128M 
                          of RAM and a dual-processor configuration. 
                          However, to determine requirements for 
                          your specific site, you first have to 
                          determine what functional areas of Site 
                          Server 3.0 you’ll be using and 
                          how heavily. Also, since you can scale 
                          the different functional areas of Site 
                          Server across multiple servers, you’ll 
                          have to consider the benefits of a few 
                          large servers or several smaller ones. 
                          Since the topic of server specifications 
                          could take an entire article, I won’t 
                          cover it here. Make sure you review 
                          the Site Server 3.0 docs carefully when 
                          defining your environment.  
                        You’ll also need an ODBC-compliant 
                          database management system for the personalization 
                          and membership database and the analysis 
                          database. I suggest SQL Server 6.5 or 
                          later for most installations, although 
                          Site Server installs with a default 
                          of Access. (If you’re using SQL Server 
                          6.5 instead of SQL Server 7.0, you’ll need 
                          to apply the latest service pack 
                          and patches for 6.5.)  
                        —John West 
        
      
      Template-based publishing provides the key to personalizing 
        your Web pages. Web content templates are Active Server 
        Pages containing a combination of static content and personalized 
        sections. The server dynamically generates these to create 
        the personalized HTML pages your users see as they browse. 
        A Web page like the one in Figure 2 could have been created 
        with Visual InterDev and the Membership.FormatRuleset 
        Design Time Controls (DTCs). The hyperlinks on this page 
        would be generated dynamically based on the user’s 
        attributes. 
      Direct mail uses its own kind of template, the mail content 
        template. Similar to Web content templates, mail content 
        templates are used to send customized e-mails to members 
        of your user population. You can use two extra metatags 
        on mail content templates. The first, DmailAttachment, 
        lets you specify a URL to a file you want to include as 
        an attachment when the e-mail is sent. The second, DmailFormat, 
        specifies whether the mail is formatted as straight text, 
        MIME, or HTML. 
      
         
          Figure 2. Holt Outlet, an educational 
            toy company at www.holtoutlet.com, recognizes you 
            each time you log on. Notice the author's name under 
            the Shopping Lobby logo. Also, it displays dynamic 
            toy lists based on what it knows about the family's 
            children for whom you're buying gifts. The site uses 
            membership and personalization so that you can view 
            another family's children's toy preferences. For instance, 
            if family members have a child and they've registered 
            their child here, you can view the registry they've 
            created. 
        
      
      Site Server 3.0 includes tools for creating Web and mail 
        content templates. Rule Builder, for instance, allows 
        you to create rules for displaying content on a page. 
        An example of a rule is the following: 
        When 
          - CreditRating > 4 
        Select content where Keywords 
          - Is exactly equal to GoodCredit 
       In this example, visitors with a credit rating greater 
        than four would be shown the best offers available. 
      Another tool, Rule Manager, allows you to create rule 
        set files. These files contain rules that have been prioritized 
        so that content is personalized based on multiple criteria. 
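
The logic behind a prioritized rule set can be sketched in Python. This is an illustration of the concept only: the `Rule` structure, the `select_content` function, the sample catalog, and the choice to stop at the first matching rule are all assumptions, not the format or behavior of Site Server's rule set files.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    priority: int                   # lower number is evaluated first
    when: Callable[[dict], bool]    # condition on the user's attributes
    keyword: str                    # select content whose Keywords equal this value


def select_content(user, rules, catalog):
    """Apply the first matching rule, in priority order, to pick content."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.when(user):
            return [item["title"] for item in catalog if item["keywords"] == rule.keyword]
    return []


# The article's example rule, followed by a hypothetical catch-all rule.
rules = [
    Rule(priority=1, when=lambda u: u.get("CreditRating", 0) > 4, keyword="GoodCredit"),
    Rule(priority=2, when=lambda u: True, keyword="General"),
]
catalog = [
    {"title": "Premium financing offer", "keywords": "GoodCredit"},
    {"title": "Standard catalog", "keywords": "General"},
]
```

A visitor whose CreditRating is 5 would be shown the premium offer, while a visitor rated 4 or lower falls through to the general content.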
      
      Site Server 3.0 also includes several DTCs, which can 
        be used in FrontPage and Visual InterDev to give you a 
        visual interface that writes Active Server Pages scripts 
        automatically, based on parameters you pass the DTC. One 
        DTC example is “Insert Property.” This DTC creates 
        a script that displays a user attribute you specify. 
        For instance, you might want to display the user’s 
        name each time that person visits your site. 
      On a Treasure Hunt 
      A major problem with finding information on a corporate 
        network is that the details probably reside all over—on 
        the file server, in a message or two in Exchange public 
        folders, on an intranet page, and maybe in the customer 
        database. Whew! Where do you begin your research? Site 
        Server 3.0’s Search, of course. [For 
        a related article, see “Search the World Over” 
        by Larry Cooper.—Ed.] 
      Site Server indexes two aspects of content. First, of 
        course, it indexes the words in the content itself. It 
        also indexes the properties of the content, such as the 
        subject of a document, the author, and the creation date. 
        The program pulls these from the content in different 
        ways, depending on the content type. For Office documents, 
        it pulls default and custom properties you’ve enumerated 
        in the document. For HTML documents, it pulls the properties 
        from the metatags on the document. Properties in Exchange 
        are pulled as well. Properties are important because you 
        can specify to search the properties instead of the content 
        itself. For instance, you could search for all documents 
        that were created after a certain date or all documents 
        written by a certain person. Also, searching properties 
        can be quicker than searching the index itself, since 
        the property index is kept in memory as much as possible. 
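
The distinction between the two indexes can be made concrete with a toy catalog in Python. The `Catalog` class and its methods are hypothetical and greatly simplified; they illustrate the idea of separate word and property indexes, not Search's internal structures.

```python
from collections import defaultdict
from datetime import date


class Catalog:
    """Toy catalog: one index over words, one over document properties."""

    def __init__(self):
        self.word_index = defaultdict(set)   # word -> ids of documents containing it
        self.properties = {}                 # document id -> {property name: value}

    def add(self, doc_id, text, **props):
        for word in text.lower().split():
            self.word_index[word].add(doc_id)
        self.properties[doc_id] = props

    def search_words(self, word):
        return sorted(self.word_index.get(word.lower(), set()))

    def search_properties(self, predicate):
        return sorted(d for d, p in self.properties.items() if predicate(p))


# Sample documents invented for the example.
catalog = Catalog()
catalog.add("report.doc", "Quarterly sales report", author="West", created=date(1998, 5, 1))
catalog.add("policy.htm", "Sales policy", author="Cooper", created=date(1997, 1, 15))

by_author = catalog.search_properties(lambda p: p["author"] == "West")
recent = catalog.search_properties(lambda p: p["created"] > date(1998, 1, 1))
```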
      
      Site Server 3.0 allows you to catalog four types of content: 
        Web (http/nntp), file system, Exchange, and ODBC databases. 
      
      
         
           
            
               
                 
                  
                     
                      A Bug Note 

                      If you’re working with 
                        the Inspired Technologies demo in Site 
                        Server 3.0, and you get an error when 
                        setting up your user preferences, it’s 
                        because the userpref.asp page being referenced 
                        contains a bug. To get an updated version 
                        of the page that works, visit the following 
                        hyperlink: www.microsoft.com/siteserver/intranet/Update/ssenhance.asp?A=5&B=1 
                     
                   
        
      
      When indexing Web content, Search follows links from 
        one page to the other recursively. How the crawler works 
        can be configured to your needs. One point to realize, 
        however, is that if access to content depends on answers 
        you submit via a form, the crawler won’t be able 
        to index that content, since it can’t know the values 
        to place in the form. 
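
Here is a minimal sketch of link-following in Python, using an in-memory dictionary in place of real HTTP requests. The `crawl` function and the sample site are assumptions for illustration; the traversal uses a queue rather than literal recursion, but it visits the same set of pages.

```python
from collections import deque


def crawl(start_url, get_links):
    """Visit every page reachable by plain links, each page once."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical site: page -> links found on that page.
site = {
    "/": ["/products", "/about"],
    "/products": ["/", "/products/toys"],
    "/products/toys": [],
    "/about": ["/"],
    "/search-results": [],   # produced only by submitting a form, so nothing links to it
}
pages = crawl("/", lambda url: site.get(url, []))
```

The page that exists only as the result of a form submission never shows up in `pages`, because no link leads the crawler to it.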
      Tip: You may 
        be wondering how ASP pages get indexed. Since ASP pages 
        have script on them, does the script itself get indexed 
        or does the resulting page from the script running get 
        indexed? The answer is the latter. To the Web server, 
        the Search crawler is just another user, and it runs all 
        scripts on a page before returning it to the crawling 
        agent. 
      Files on any operating system can be indexed as long 
        as the server doing the indexing has appropriate access. 
        If the OS is NT, Search won’t just index the content, 
        but will also store the security rights to the files; 
        when a user performs a search, any content to which that 
        person doesn’t have rights will be filtered from 
        the list of results. 
      
         
          Figure 3. You can distribute 
            catalogs to multiple servers to provide load-balancing 
            and fault-tolerance.
        
      
      If Search recognizes the file type, the file will be 
        indexed intelligently; properties from the document, such 
        as title or author, can be read as well as the contents 
        of the file. Search can index Office documents, and 
        third-party indexers, called filters, are also available, 
        such as a filter for Adobe Acrobat PDF files. 
        If Search doesn’t recognize the file type, 
        it can still index it; but if the file has binary data 
        in it as well as the textual content itself, Search will 
        try to index that data as well. This won’t break 
        your system (at least not that I’ve seen), but your 
        index will be bigger. How much bigger depends on how much 
        of the file is extraneous data. 
      Exchange’s public folders can be indexed. Private 
        mailboxes, however, can’t. When querying against 
        a catalog that contains indexed Exchange content, users 
        will see in the results list only the content that they 
        have permissions to. This works differently from the way 
        it works with file system-based content. Search includes 
        the permissions in the index itself when cataloging file 
        system content. It doesn’t do this when cataloging 
        Exchange content. Instead, Exchange permissions get checked 
        at the time of the query. When Search finds Exchange content 
        that matches the query, it communicates with the Exchange 
        server to ensure you have rights to see it. If you do, 
        it gets returned with the result set. If you don’t, 
        you’ll never even know the content exists. 
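
      The difference between the two permission models can be sketched 
        in a few lines of Python. This is only an illustration of the idea, 
        not Search’s actual implementation: the IndexedDoc class, the 
        exchange_allows function, and the sample documents and users are 
        all hypothetical. File content carries a permission snapshot taken 
        at indexing time, while Exchange content is checked against the 
        live permissions when the query runs.

```python
from dataclasses import dataclass


@dataclass
class IndexedDoc:
    title: str
    text: str
    source: str                       # "file" or "exchange"
    allowed: frozenset = frozenset()  # ACL snapshot, stored for file content only


def exchange_allows(user, doc, live_acl):
    """Hypothetical stand-in for asking the Exchange server at query time."""
    return user in live_acl.get(doc.title, set())


def search(term, user, catalog, live_acl):
    """Return titles that match the term and that the user may see."""
    results = []
    for doc in catalog:
        if term.lower() not in doc.text.lower():
            continue
        if doc.source == "file":
            visible = user in doc.allowed                   # checked against the index
        else:
            visible = exchange_allows(user, doc, live_acl)  # checked live
        if visible:
            results.append(doc.title)
    return results


catalog = [
    IndexedDoc("budget.doc", "Q3 budget figures", "file", frozenset({"alice"})),
    IndexedDoc("Budget thread", "Public folder budget discussion", "exchange"),
]
live_acl = {"Budget thread": {"alice", "bob"}}

print(search("budget", "bob", catalog, live_acl))
print(search("budget", "alice", catalog, live_acl))
```

      In this sketch, bob sees only the public folder item, and the file he 
        has no rights to never appears in his results.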
      
         
           
            
               
                 
                  
                     
      Knowing Knowledge Manager
      Knowledge Manager isn’t a functional area of Site Server; it’s 
        an intranet application included with Site Server that leverages 
        the functional areas of Site Server to provide services 
        to your users. It includes the ability to share information, 
        search for content, and have content delivered to you. It 
        uses the concept of “briefs” to help users organize relevant 
        information. These briefs can be created by each user 
        for his or her needs. Also, briefs can be shared so that any 
        user with a need for the information contained in the shared 
        brief can have access to it. 
      You can use Knowledge Manager in two ways on your intranet. 
        If Knowledge Manager does what you need it to do, 
        then by all means implement it for your users. If it doesn’t, 
        use it instead to learn how to pull the different aspects 
        of Site Server together to create applications 
        that empower your users to find the information they need. 
      —John West 
        
      
      The fourth content type you can search is information 
        contained in an ODBC database. For example, if you keep 
        product data in a SQL database, you can have Search catalog 
        the data and make it available to your users. I know what 
        you’re thinking: Why would you want to catalog data 
        in a database when databases can already be searched? 
        Good question. There’s more than one answer for that. 
        First, by indexing the content via Search, results from 
        the database content can be returned on the same page 
        as results from other data sources. Also, data that’s 
        been cataloged from databases is available to all the 
        other tools within Site Server that use Search’s 
        functionality. For instance, database content can be included 
        in daily briefs or pushed to users via Active Channels. 
      
      Indexing content can be a very machine-intensive process. 
        For this reason, Site Server enables you to separate your 
        indexing and querying processes onto different servers. 
        You can even specify that a catalog be propagated to multiple 
        servers to load-balance querying and for fault-tolerance 
        (see Figure 3). 
      
         
          Figure 4. When indexing content 
            that’s displayed based on a querystring variable, 
            you must select the “Follow URLs containing question 
            marks” option.
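
      The effect of the “Follow URLs containing question marks” option 
        can be sketched as a simple filter on the links a crawler queues. 
        This is a minimal illustration in Python, not Search’s own logic: 
        the links_to_crawl function, its follow_question_marks parameter, 
        and the sample paths are hypothetical, loosely modeled on a detail 
        page such as JobDetail.asp that takes a RecNum querystring value.

```python
from urllib.parse import urlsplit


def links_to_crawl(links, follow_question_marks=False):
    """Return the links a crawler would queue, in their original order."""
    queued = []
    for url in links:
        has_query = bool(urlsplit(url).query)
        if has_query and not follow_question_marks:
            continue  # skipped by default, so the detail pages never get indexed
        queued.append(url)
    return queued


links = [
    "/jobs/default.asp",
    "/jobs/JobDetail.asp?RecNum=1",
    "/jobs/JobDetail.asp?RecNum=2",
]

print(links_to_crawl(links))                              # option off
print(links_to_crawl(links, follow_question_marks=True))  # option on
```

      With the option off, only the listing page is queued; with it on, 
        each record’s detail page is queued as well.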
        
      
      You can also choose whether to do full or incremental 
        builds of the catalogs. A full build means that every 
        document gets read every time. Testing has shown that 
        Search can perform a full build on over half a million 
        pages in an eight-hour period. If you have more pages 
        than this, or you don’t have the eight-hour window 
        that most businesses do, then you can take advantage of incremental 
        builds. With incremental builds, documents get re-indexed 
        only if they’ve changed. While the results vary depending 
        on how often content is changed, my testing indicated 
        that incremental builds are two to three times faster 
        than full builds. If you’re indexing a large content 
        base, you’ll probably want to schedule some combination 
        of both of these methods. 
      One thing to be careful of is indexing the file 
        system. As I’ve already mentioned, Search indexes 
        the access control lists along with the content itself 
        when indexing files on an NT server. However, incremental 
        builds don’t detect changes to these permissions. 
        Therefore, you should run a full build of your file system 
        indexes every once in a while. How often depends on how 
        long you can afford to let someone see content in search 
        results after his or her rights to it have been revoked. 
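
      To see why stale permissions survive an incremental build, consider 
        this minimal Python sketch. It assumes that an incremental build 
        decides what to re-read by comparing a content modification stamp; 
        the article doesn’t say how Search actually detects changes, so 
        treat that as an assumption. The function names and sample files 
        are hypothetical.

```python
def full_build(files):
    """Re-read every file: the content stamp and the ACL are both refreshed."""
    return {path: {"modified": f["modified"], "acl": set(f["acl"])}
            for path, f in files.items()}


def incremental_build(index, files):
    """Re-index only files whose content stamp changed since the last build."""
    reindexed = []
    for path, f in files.items():
        entry = index.get(path)
        if entry is None or entry["modified"] != f["modified"]:
            index[path] = {"modified": f["modified"], "acl": set(f["acl"])}
            reindexed.append(path)
    return reindexed


files = {
    "salaries.xls": {"modified": 1, "acl": {"hr", "bob"}},
    "menu.doc": {"modified": 1, "acl": {"everyone"}},
}
index = full_build(files)

# Bob's rights are revoked, but the file's content is untouched.
files["salaries.xls"]["acl"] = {"hr"}
# A different file's content changes.
files["menu.doc"]["modified"] = 2

changed = incremental_build(index, files)
stale = "bob" in index["salaries.xls"]["acl"]   # the index still lists bob

index = full_build(files)
fixed = "bob" not in index["salaries.xls"]["acl"]
```

      Only menu.doc is re-indexed by the incremental pass, so the revoked 
        permission on salaries.xls stays in the index until the next full build.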
      
      Channels: The Power of Broadcasting 
      Push functionality in Site Server 3.0 allows you to deliver 
        content to your users via channels. Channel technology 
        is a method by which you can deliver content directly 
        to the user via the browser (currently, only Internet 
        Explorer 4.x fully supports Active Channels; Netscape 
        also supports channels, but through Netcaster’s Java-based 
        technology, unsupported by IE). Active Channel Server, 
        which comes with Site Server 3.0, delivers the channels. 
        A Channel Definition Format (CDF) file, which is XML-compliant, 
        defines what a channel contains. The content can come 
        from a variety of source types. 
      Users can either receive the content of the channel, 
        which allows them to view the content off-line, or they 
        can just receive notice that the channel has changed and 
        then hyperlink to the content via the channel interface. 
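
      As a rough illustration of what a CDF file looks like, the Python 
        sketch below generates a very small one. The CHANNEL, ITEM, and 
        TITLE elements and the HREF and PRECACHE attributes are written 
        from my recollection of the CDF format and should be checked 
        against Microsoft’s CDF documentation; the build_cdf function and 
        the example URLs are hypothetical. PRECACHE is meant to reflect 
        the choice described above between downloading content for 
        off-line viewing and merely linking to it.

```python
import xml.etree.ElementTree as ET


def build_cdf(channel_url, title, items, precache=True):
    """Build a minimal CDF-style document and return it as a string."""
    channel = ET.Element("CHANNEL", HREF=channel_url)
    ET.SubElement(channel, "TITLE").text = title
    for href, item_title in items:
        item = ET.SubElement(
            channel, "ITEM",
            HREF=href,
            PRECACHE="YES" if precache else "NO",  # download vs. link only
        )
        ET.SubElement(item, "TITLE").text = item_title
    return ET.tostring(channel, encoding="unicode")


cdf = build_cdf(
    "http://intranet.example.com/news/",
    "Company News",
    [
        ("http://intranet.example.com/news/today.asp", "Today's Headlines"),
        ("http://intranet.example.com/news/hr.asp", "HR Updates"),
    ],
    precache=False,
)
print(cdf)
```

      A real channel would typically carry more than this, such as a 
        schedule and logos, which are left out of the sketch.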
      
      Two types of channels exist. Content channels provide 
        content, as the name implies, and software distribution 
        channels send new software programs and updates to clients. 
        With software distribution you can either simply let the 
        user know the new software or software update exists or, 
        in the case of an automated environment, have it automatically 
        install at delivery time. 
      
         
           
            
               
                 
                  
                     
      The Right Kind of Indexing
      When you create a catalog, by default Search Server won't follow 
        URLs with querystrings when crawling the Web site. Many dynamic 
        pages display header information about records in a database 
        and provide a hyperlink for each record that points to a page 
        where more information is displayed. The hyperlink for each 
        record contains a querystring with the record number. 
      For example, you might display job openings on a Web page 
        as a list of links, as in this example: 
          Solution Manager 
          Systems Consultant 
          Product Specialist 
      When a user clicks one of these links, he or she gets taken 
        to JobDetail.Asp and, based on the RecNum passed, the correct 
        job details get displayed. In this example you'd want the job 
        details to be indexed so that users searching for employment 
        will have the best chance of finding the job opportunities 
        they're interested in. 
      If you want Search to follow each of these links, you must 
        turn on the “Follow URLs containing question marks” 
        option by going to the Catalog Properties sheet and choosing 
        the URLs tab, as shown in the screenshot. 
      —John West 
        
      
      User Analysis: Going Beyond Numbers 
      Analysis evaluates two things: how people are using your 
        site and how your content is structured. This is necessary 
        to understand your site from an administrative perspective 
        and also to understand how to make the site more useful 
        to the people who are browsing it. 
      The data for analyzing user patterns on your site comes 
        from your Web servers’ log files. Before you start 
        analysis of your site, you must import the log files into 
        your Analysis database. This database can be either Access 
        or SQL Server. The log files you import can get quite 
        large (many megabytes) on bigger Web sites, so I’d 
        recommend that you use SQL Server. Once you’ve imported 
        the log files, you can run reports against them to see 
        how your site is being used. Fortunately, Site Server 
        3.0 provides dozens of reports to get you started, and 
        you can create custom reports as well. 
      One of the most powerful aspects of usage analysis is 
        being able to merge data from your Web servers’ log 
        files with data from your Membership Directory. The log 
        files record which user made each request, so you can 
        associate each request with that user’s attributes in the 
        Membership Directory and extrapolate usage patterns based 
        on those attributes. 
        For example, if you want to know not only how much a particular 
        area of your site is being used, but also the ages of 
        site visitors, you can include the age attribute for the 
        users from the Membership Directory in the reports you 
        generate. Having this level of integration enables you 
        to more fully understand and optimize your site for your 
        audience. 
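
      Conceptually, this kind of report is a join between log entries 
        and directory attributes. The Python sketch below shows the idea 
        under simplified assumptions: log entries are reduced to (user, 
        path) pairs, the directory is a plain dictionary, and the 
        hits_by_attribute function, user names, and paths are hypothetical. 
        It isn’t how Analysis stores or queries its data.

```python
from collections import Counter


def hits_by_attribute(log_entries, directory, attribute, path_prefix):
    """Count hits under path_prefix, grouped by a directory attribute."""
    counts = Counter()
    for user, path in log_entries:
        if not path.startswith(path_prefix):
            continue
        # Users missing from the directory are grouped under "unknown".
        value = directory.get(user, {}).get(attribute, "unknown")
        counts[value] += 1
    return dict(counts)


log_entries = [
    ("jwest", "/products/widgets.asp"),
    ("akim", "/products/gadgets.asp"),
    ("jwest", "/support/faq.asp"),
    ("guest", "/products/widgets.asp"),
]
directory = {
    "jwest": {"age": "25-34"},
    "akim": {"age": "35-44"},
}

report = hits_by_attribute(log_entries, directory, "age", "/products/")
print(report)
```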
      You can also use Analysis to explore the content of your 
        site. Using a hyperbolic view of your site (see Figure 
        5), you can move around your site’s structure in 
        a graphical way, enabling you to get a better feel for 
        how your site is laid out. In addition to letting you 
        see how pages link to others, this view can be used to 
        see usage patterns, page sizes, and other key information. 
        In addition to this view, more than 20 reports are available 
        to show you the broken links on your site, the number 
        and location of errors encountered when crawling your 
        site, the hierarchy of your site, duplicate files on your 
        site, and so on. 
      
         
          Figure 5. You can click on a 
            page in the hyperbolic view and change your perspective 
            dynamically. This allows you to quickly get to the 
            area you’re most interested in.
        
      
      Understanding analysis is key to optimizing your site 
        for your users. With a full understanding of what content 
        your users want and how they’re getting to it, you 
        can optimize both the content you provide and the placement 
        of that content within your site’s structure. For 
        example, if you find that a popular page is four levels 
        down in your site’s hierarchy and that people aren’t 
        interested in the pages in the preceding levels, you might 
        create a link to the popular page from the home page itself. 
        This will make your site more user-friendly and will also 
        help lessen the load on your server, since fewer pages 
        ultimately have to be served. 
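
      The popular-but-buried page in that example can be found by 
        combining link structure with hit counts. Here is a minimal Python 
        sketch, assuming the site’s links are available as a dictionary 
        and hits as per-page totals; the function names, the thresholds, 
        and the sample site are all hypothetical rather than anything 
        Analysis exposes.

```python
from collections import deque


def page_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def promotion_candidates(links, hits, home, min_depth=4, min_hits=1000):
    """Pages that are both deep in the hierarchy and heavily used."""
    depths = page_depths(links, home)
    return sorted(
        page for page, depth in depths.items()
        if depth >= min_depth and hits.get(page, 0) >= min_hits
    )


links = {
    "/": ["/a"],
    "/a": ["/a/b"],
    "/a/b": ["/a/b/c"],
    "/a/b/c": ["/a/b/c/popular.asp"],
}
hits = {"/a": 40, "/a/b/c/popular.asp": 5000}

candidates = promotion_candidates(links, hits, "/")
print(candidates)
```

      Pages returned by promotion_candidates are the ones worth linking 
        from the home page directly.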
      Complete Control with Content Management 
      
      As intranets get bigger, it becomes harder to post and 
        organize content on a site. As more documents get created, 
        they get scattered throughout the site with no easy way 
        to categorize or find them. Site Server 3.0 makes the 
        process easier by letting you post content to the 
        site, create an editorial process for approval or rejection 
        of the content, and tag content with pertinent information. 
      
      Content Management allows you to post any type of document 
        to the intranet, whether it was created in Notepad, Office, 
        or Lotus SmartSuite. The only limitation is that the user 
        must have a compatible reader application installed on 
        the browsing machine in order to read it. 
      Posting new content to the site can be done in two ways. 
        First, users can add documents to your sites 
        simply by dragging and dropping them in the Web browser, 
        using an ActiveX control that even lets them browse 
        the file system. Second, users whose browser doesn’t 
        support ActiveX controls, such as Netscape, can specify 
        the file path instead. 
      
      Content Management also gives you editorial capabilities. 
        You can set up rules to require that an editor approve 
        content before anyone else can view it. The editor can 
        review the document and either approve or reject it. If 
        the editor approves the document, it becomes visible to 
        everyone on the Web. If the editor rejects it, the user 
        can post a revised version of the material and go through 
        the editorial process again. 
      Even more important than the editorial process is another 
        aspect of Content Management: the ability to tag documents 
        with attributes. Just as you can post any type of document 
        to the intranet, you can also tag documents with attributes 
        such as subject, product line, and company division. These 
        tags make it possible to organize all the documents being 
        added to your Web from different sources using your company’s 
        vocabulary. (What is a vocabulary? Simply a set of 
        predefined choices that users select from when 
        tagging documents with an attribute.) 
      
         
           
            
               
                 
                  
                     
      A Word of Warning
      Make sure after you’ve created catalogs that you try searching 
        for information that shouldn’t be found. For example, you 
        should try to search for terms like “Salary” 
        and “Confidential.” Make sure you only get what you expect! 
        Search doesn’t affect your security in any way, but it 
        does make it easier for users to stumble upon poor security 
        policies already in place. If you don’t find the holes 
        first, they will. 
        
      
      Here’s an example to help you understand how tagging 
        and vocabulary work together. Let’s say you have 
        areas on your intranet dedicated to the different business 
        units within your company, such as Finance, Training, 
        and Production. You then have a required attribute called 
        “Business Unit” for all content posted to the 
        intranet. To ensure that all users choose only from these 
        business units, you create a vocabulary for the Business 
        Unit attribute with values for the three units defined. 
        By doing this, you force anyone who posts a new document 
        to choose from the three values. If you ever need to 
        look up all documents relating to the Training business 
        unit, you won’t have to figure out whether 
        to look under Education, Training, Education Unit, 
        or some other value that looks similar but isn’t. 
        You enforce consistency. Also, as the site administrator 
        you can set up a rule that all documents tagged with a 
        value of Finance for the business unit attribute must 
        be reviewed by the Director of Finance. 
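
      The example above boils down to two checks at posting time: 
        is the tag value in the vocabulary, and does a rule route the 
        document to a reviewer? A minimal Python sketch follows. The 
        VOCABULARY and REVIEW_RULES tables, the submit function, and the 
        idea that content with no matching rule is published immediately 
        are all assumptions made for illustration, not Site Server’s API.

```python
VOCABULARY = {"Business Unit": {"Finance", "Training", "Production"}}

# (attribute, value) -> reviewer who must approve matching documents
REVIEW_RULES = {("Business Unit", "Finance"): "Director of Finance"}


def submit(document, tags):
    """Validate required tags against the vocabulary, then apply review rules."""
    for attribute, allowed in VOCABULARY.items():
        value = tags.get(attribute)
        if value not in allowed:
            raise ValueError(
                f"{attribute!r} must be one of {sorted(allowed)}, got {value!r}"
            )
    reviewers = [
        reviewer
        for (attribute, value), reviewer in REVIEW_RULES.items()
        if tags.get(attribute) == value
    ]
    return {
        "document": document,
        "tags": tags,
        "reviewers": reviewers,
        "status": "pending approval" if reviewers else "published",
    }


posted = submit("q3-forecast.xls", {"Business Unit": "Finance"})
print(posted["status"], posted["reviewers"])
```

      Trying to post with a value such as “Education” raises an error, 
        which is the consistency the vocabulary is there to enforce.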
      I’ll close this section on content management with 
        one gripe. When you post content to the site, it becomes 
        read-only. The only way to revise posted content is to 
        delete it and repost the changed version. It would be 
        much more useful if you could revise posted content in place. 
        With this ability, Site Server 3.0 could become an organization’s 
        document repository. I hope to see this added to a future 
        version. 
      Content Deployment: Divvying Up the Work 
      
      Content deployment is the ability to copy Web projects 
        from one location to another. You can copy a project from 
        one location on a server to another location on the same 
        server, from one server to another, or from one server 
        to many servers. You can even set up routes that traverse 
        several servers in a path. With content deployment, you 
        can roll back projects if necessary to their state before 
        replication occurred. 
      The most common scenario for content deployment is a 
        situation in which you have a development server and production 
        servers. Your developers work against the development 
        server. Once their changes are complete and tested, you 
        can replicate them to all of the production servers in 
        your Web farm. You may even have an intermediate staging 
        server, with a route from development to staging to production. 
        You replicate from the development server to the staging 
        server, and do additional testing against the staging 
        server. This lets you test your changes on that server 
        without affecting the development or production servers. 
        Once you’ve tested on the staging server, the project 
        can be replicated to the production Web farm. 
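
      The development-to-staging-to-production route, along with the 
        rollback ability mentioned earlier, can be modeled in a few lines. 
        This Python sketch is only a conceptual model: the Server class, 
        the replicate function, and the single-level rollback are 
        hypothetical and say nothing about how Site Server performs 
        replication internally.

```python
import copy


class Server:
    def __init__(self, name, project=None):
        self.name = name
        self.project = dict(project or {})
        self._previous = None  # state before the most recent replication

    def receive(self, files):
        """Accept a replicated project, remembering the prior state."""
        self._previous = copy.deepcopy(self.project)
        self.project = copy.deepcopy(files)

    def rollback(self):
        """Restore the state from before the last replication, if any."""
        if self._previous is not None:
            self.project, self._previous = self._previous, None


def replicate(source, targets):
    """Copy the source server's project to each target server."""
    for target in targets:
        target.receive(source.project)


dev = Server("dev", {"default.asp": "v2"})
staging = Server("staging", {"default.asp": "v1"})
farm = [Server("prod1", {"default.asp": "v1"}),
        Server("prod2", {"default.asp": "v1"})]

replicate(dev, [staging])   # test against staging here
replicate(staging, farm)    # then push to the production Web farm
farm[0].rollback()          # undo the change on one production server
```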
      If you have a small development environment, you can 
        place your development and production locations on the 
        same box. While this doesn’t help if your changes 
        hang the server, it does keep your users from seeing changes 
        as they’re being made. 
      If you’ve used Site Server 2.0’s Content Replication 
        System in an environment where many people and departments 
        were developing content, you might have been frustrated 
        by the inability to give users the right to replicate 
        their own content. With Site Server 3.0 you can designate 
        users as operators on their own projects. You can even 
        create custom Web pages to control project replication. 
      
      Powerful and Complex 
      This article is by no means comprehensive. The level 
        of functionality contained in Site Server 3.0 could fill 
        a book (or, in this industry, many books by many publishers!). 
        I’ve simply tried to give you a feel for the product’s 
        power and complexity. I hope you come away from this article 
        with a better feel for what Site Server 3.0 can do for 
        your intranet or Internet site. If you do use Site Server 
        to enable your site, drop me an e-mail with the URL. I’d 
        love to see it in action.