Managing the Search Experience in Microsoft SharePoint 2010

  • 6/24/2010
This chapter from Microsoft SharePoint 2010 Administrator's Pocket Consultant will show how to configure search so that users can easily define and refine both the query and the results.
  • Configuring the Thesaurus and Noise Word Files

  • Defining Authoritative Pages

  • Federated Queries

  • Managed Properties

  • Creating and Managing Search Scopes

  • Search Results Removal

  • Site Collection Search Management

  • Working with Keywords and Best Bets

  • Creating and Customizing Search Centers

  • Customizing Search Pages

  • Working with Query Reporting

  • Local Search Configuration Options

When a user executes a search query, the goal is simple: to get a result set that includes everything relevant to the search and nothing else. Achieving this goal is not so simple, but this chapter will show how to configure search so that users can easily define and refine both the query and the results. The chapter is organized according to the scope of the configurations: starting with the file system and then moving on to the search service application, the site collection, and the search centers.

Configuring the Thesaurus and Noise Word Files

Microsoft SharePoint 2010 continues to provide thesaurus and noise word files to manipulate the search process, but the scope of their usage has been changed in this product. In this section, we discuss the more common ways to configure these elements.

Crawl components no longer use these files to eliminate words from the index. However, query components still use them at query time: the noise word files to remove words from query terms, and the thesaurus files to modify queries.

Noise Word Files

A noise word file is a text file that contains all the words that have little or no refinement value in a search query in your environment. Such words often include your organization’s name, product names, registered names, and so on. Noise words apply only to text content, not metadata.

SharePoint Server 2010 provides noise word and thesaurus files in 54 languages. They are located in a number of directories named Config. The hierarchy of these directories is significant because the installation and implementation of SharePoint Server determine which set of files is used during a query.

Files located in the %ProgramFiles%\Microsoft Office Servers\14.0\Data\Config folder are for SharePoint Foundation 2010 installations. This folder is not used in SharePoint Server 2010.

For a SharePoint Server 2010 standalone server farm or Microsoft Search Server 2010, the files under %ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\config are copied to the %ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\Applications\(serviceGUID)\Config folder to be used at query time.

When you are setting up a complete server farm, whether it contains one server or more, files under %ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\config are copied to all %ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\Applications\(service and service component GUID)\Config folders. However, only files under query component GUIDs are used at query time.

For consistent query responses, all files under all query components on all servers should be identical. If noise word and thesaurus file modifications are known before you create search service applications, the set of files in the %ProgramFiles%\Microsoft Office Servers\14.0\Data\Office Server\config folder can be modified prior to the copy process. These files must be identical on all members of the farm because any member can host the search service components.
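
Because the copies must stay identical, it can help to compare them by hash after any manual edit. The following Python sketch is one possible way to do that, not a SharePoint tool: the function names are made up for illustration, and you would pass in the Config folders (local or UNC paths) of your own query components.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_inconsistent_files(config_dirs):
    """Report files whose copies differ between Config folders.

    Returns {file name: {folder: digest or None}} for every file that is
    missing from a folder or whose content is not identical everywhere.
    """
    folders = [Path(d) for d in config_dirs]
    digests = defaultdict(dict)
    for folder in folders:
        for path in folder.iterdir():
            if path.is_file():
                digests[path.name][str(folder)] = file_digest(path)

    mismatches = {}
    for name, by_folder in digests.items():
        # None marks a folder that does not contain the file at all
        per_folder = {str(f): by_folder.get(str(f)) for f in folders}
        if len(set(per_folder.values())) > 1:
            mismatches[name] = per_folder
    return mismatches
```

An empty dictionary means every file is present and byte-identical in all of the folders you listed. File names are compared exactly as the file system reports them.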

To configure a noise word file, perform the following steps:

  1. Go to the appropriate noise word file, and open it using a text editor such as Notepad.

  2. Enter the words you do not want used in queries, one word per line. Maintaining the list in alphabetical order makes reviewing terms easier.

  3. Save the file.
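
If you maintain the list on several servers, the steps above can also be scripted. This is a minimal Python sketch under stated assumptions: the function name `add_noise_words` is hypothetical, the `encoding` default is a guess (check the encoding of the file you are editing before writing to it), and treating words that differ only by case as duplicates is a choice made for the sketch.

```python
from pathlib import Path


def add_noise_words(path, new_words, encoding="utf-8"):
    """Merge new_words into the noise word file at path.

    Writes one word per line in alphabetical order and returns the
    list that was written.
    """
    path = Path(path)
    existing = []
    if path.exists():
        existing = path.read_text(encoding=encoding).splitlines()

    # Trim whitespace, drop blank lines, and keep the first spelling
    # seen when two entries differ only by case (an assumption).
    seen = {}
    for word in existing + list(new_words):
        word = word.strip()
        if word:
            seen.setdefault(word.lower(), word)

    ordered = sorted(seen.values(), key=str.lower)
    path.write_text("\n".join(ordered) + "\n", encoding=encoding)
    return ordered
```

Run it against a copy of the file first, then distribute the result so that every query component ends up with the same content.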

Configuring the Thesaurus

The thesaurus provides a mechanism to assist users in constructing a query by expanding or replacing query terms as the query is executed against the index. It differs from search suggestions in that the changes are transparent to the user and are not optional for the user. You can create expansion or replacement sets, as well as weight or stem the terms within the expansion or replacement sets.

You can use thesaurus file entries to correct commonly misspelled query terms, add synonyms to queries, or replace query terms. Because modifying these files requires access to the file system of all Web front ends, you probably will find the new functionality of search suggestions easier to maintain.
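
To make the difference between expansion and replacement concrete, here is a toy Python model of the two behaviors. It only illustrates the idea described above and is not how the query component is implemented; the function name `rewrite_query`, the restriction to single-word terms, and the decision to check replacements before expansions are all assumptions made for the sketch.

```python
def rewrite_query(terms, expansion_sets, replacements):
    """Return one list of alternatives for each query term.

    expansion_sets: list of term groups; a term found in a group is
                    expanded to every term in that group.
    replacements:   dict mapping a lowercase pattern to the term that
                    replaces it.
    """
    rewritten = []
    for term in terms:
        key = term.lower()
        if key in replacements:
            # Replacement: the original term is dropped entirely
            rewritten.append([replacements[key]])
            continue
        for group in expansion_sets:
            if key in (member.lower() for member in group):
                # Expansion: the term now matches any member of the group
                rewritten.append(list(group))
                break
        else:
            # No thesaurus entry applies; the term is left unchanged
            rewritten.append([term])
    return rewritten
```

With the expansion set run/jog and the replacements NT5 and W2K mapped to Windows 2000, a query for "W2K jog" would search for Windows 2000 in place of W2K, and for either run or jog in place of jog.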

The thesaurus is configured via an XML file, which has the format of TS<XXX>.XML, where XXX is the standard three-letter code for a specific language. For English, the file name is Tsenu.xml.

The default code for the file is as follows:

<XML ID="Microsoft Search Thesaurus">
<!--  Commented out
    <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
        <expansion>
            <sub>Internet Explorer</sub>
            <sub>IE</sub>
            <sub>IE5</sub>
        </expansion>
        <replacement>
            <pat>NT5</pat>
            <pat>W2K</pat>
            <sub>Windows 2000</sub>
        </replacement>
        <expansion>
            <sub>run</sub>
            <sub>jog</sub>
        </expansion>
    </thesaurus>
-->
</XML>

To create new expansion sets, perform the following steps:

  1. Open Windows Explorer, and go to the location of the thesaurus XML file.

  2. Open the XML file using Notepad or some other text editor.

  3. Enter your expansion terms within the tags using well-formed XML, as illustrated here:

    <expansion>
       <sub>term1</sub>
       <sub>term2</sub>
       <sub>term3</sub>
    </expansion>

  4. Save the file.

  5. Restart the SharePoint Server Search 14 service, which runs as Mssearch.exe (net stop osearch14, followed by net start osearch14).
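
Hand-typed XML breaks easily when a term contains a character such as & or <. As a hedged alternative to typing step 3 by hand, the following Python sketch generates an expansion set with the standard library so that special characters are escaped for you. The helper names are hypothetical, and the two-term minimum is a sanity check added for the sketch rather than a documented rule.

```python
import xml.etree.ElementTree as ET


def build_expansion(terms):
    """Return an <expansion> element with one <sub> child per term."""
    terms = list(terms)
    if len(terms) < 2:
        # Assumption: a set with fewer than two terms expands nothing
        raise ValueError("an expansion set needs at least two terms")
    expansion = ET.Element("expansion")
    for term in terms:
        ET.SubElement(expansion, "sub").text = term
    return expansion


def to_xml_text(element):
    """Serialize an element as indented text (ET.indent needs Python 3.9+)."""
    ET.indent(element, space="    ")
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    print(to_xml_text(build_expansion(["run", "jog"])))
```

Paste the printed element inside the thesaurus element of the file, then save the file and restart the service as described in steps 4 and 5.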

To create new replacement sets, perform the following steps:

  1. Open Windows Explorer, and go to the location of the thesaurus XML file.

  2. Open the XML file using Notepad or some other text editor.

  3. Enter your replacement terms within the tags using well-formed XML. Note that the terms being replaced are in the <pat> elements, and the term that replaces them is in the <sub> element, as in the default file shown earlier. This is illustrated here:

    <replacement>
       <pat>term1</pat>
       <pat>term2</pat>
       <sub>term3</sub>
    </replacement>

  4. Save the file.

  5. Restart the SharePoint Server Search 14 service (net stop osearch14, followed by net start osearch14).
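
Before restarting the service, it is worth confirming that the edited file is still well-formed and contains the sets you expect. The Python sketch below is one way to check, assuming the element names and namespace shown in the default file earlier in this section; `summarize_thesaurus` is a hypothetical helper, and the sample text is the default file's content with the comment markers removed and one expansion set omitted.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <thesaurus> element in the default file
NS = "{x-schema:tsSchema.xml}"

SAMPLE = """<XML ID="Microsoft Search Thesaurus">
    <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
        <replacement>
            <pat>NT5</pat>
            <pat>W2K</pat>
            <sub>Windows 2000</sub>
        </replacement>
        <expansion>
            <sub>run</sub>
            <sub>jog</sub>
        </expansion>
    </thesaurus>
</XML>"""


def summarize_thesaurus(xml_text):
    """Return (expansion sets, replacement sets) found in thesaurus XML.

    Raises xml.etree.ElementTree.ParseError if the text is not
    well-formed. Each replacement set is a (patterns, substitutes) pair.
    """
    root = ET.fromstring(xml_text)
    expansions = [
        [sub.text for sub in node.findall(NS + "sub")]
        for node in root.iter(NS + "expansion")
    ]
    replacements = [
        (
            [pat.text for pat in node.findall(NS + "pat")],
            [sub.text for sub in node.findall(NS + "sub")],
        )
        for node in root.iter(NS + "replacement")
    ]
    return expansions, replacements


if __name__ == "__main__":
    print(summarize_thesaurus(SAMPLE))
```

If the function returns two empty lists, the thesaurus element is probably still inside the comment markers, as it is in the default file, so none of its entries are active. To check a file on disk, ET.parse(path).getroot() can be used in place of ET.fromstring.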