- By Ben Curry
- Configuring the Thesaurus and Noise Word Files
- Defining Authoritative Pages
- Federated Queries
- Managed Properties
- Creating and Managing Search Scopes
- Search Results Removal
- Site Collection Search Management
- Working with Keywords and Best Bets
- Creating and Customizing Search Centers
- Customizing Search Pages
- Working with Query Reporting
- Local Search Configuration Options
Although a search query across the full text of a document might be useful, the power of an enterprise search query comes from its ability to query attributes or properties of objects, whether or not the crawler can read the actual content. The Search schema contains two types of properties:
Crawled properties are automatically extracted from crawled content, and the metadata field is added to the search schema. The text values of crawled properties that are included in the index are treated the same as text content unless they are mapped to a managed property.
Managed properties are created to group common properties with dissimilar names under standardized names and expose this grouping to search tools. Users can perform specific queries over managed properties.
Crawled properties can be columns on a list or document library, metadata for a content type, or document properties of a file created in a Microsoft Office application. If your users create custom names in these scenarios, mapping crawled properties to a managed property will be more difficult than if they used existing properties or columns. Determining which custom properties should be grouped into a managed property is frequently a time-consuming research job, particularly if no naming convention has been established.
Mapping crawled properties to managed properties is valuable because it groups scattered metadata (crawled properties) into single, logical units (managed properties). Multiple crawled properties can be mapped to a single managed property, or a single crawled property can be mapped to multiple managed properties. Managed properties can then be used to create search scopes that let your users limit a search to a portion of the corpus. Managed properties can also be included in the Advanced Search Web part interface to narrow a query to specific properties and in the Refinement Web part to narrow the search results. We will discuss these uses later in this chapter.
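The many-to-many relationship between crawled and managed properties can be pictured with a small Python model. This is only a conceptual sketch and does not call any SharePoint API; the property names (ows_Writer, ows_DocAuthor, ows_CreatedBy, Author, Creator) and the build_lookup helper are hypothetical and chosen for illustration.

```python
from collections import defaultdict

# (crawled property, managed property) pairs. All names are hypothetical.
MAPPINGS = [
    ("ows_Writer", "Author"),      # several crawled properties ...
    ("ows_DocAuthor", "Author"),   # ... grouped under one managed property
    ("ows_CreatedBy", "Author"),
    ("ows_CreatedBy", "Creator"),  # one crawled property mapped twice
]


def build_lookup(mappings):
    """Index the mapping pairs in both directions."""
    by_managed = defaultdict(list)
    by_crawled = defaultdict(list)
    for crawled, managed in mappings:
        by_managed[managed].append(crawled)
        by_crawled[crawled].append(managed)
    return dict(by_managed), dict(by_crawled)


by_managed, by_crawled = build_lookup(MAPPINGS)
print(by_managed["Author"])         # crawled properties behind Author
print(by_crawled["ows_CreatedBy"])  # managed properties fed by ows_CreatedBy
```

A query against Author in this model would draw on three differently named crawled properties, which is the standardization the managed property provides.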
To administer metadata properties, navigate to the Metadata Property Mappings page shown in Figure 9-11 by clicking the Metadata Properties link under the Queries And Results heading of the Search Service Application page.
Figure 9-11 Metadata Property Mappings page.
Use this page to create and modify managed properties and map crawled properties to managed properties. Changes to properties of existing content take effect after the next full crawl, but they are applied to new content during incremental crawls.
On this page, several properties of each managed property are displayed, including a linked name and linked crawled properties mapped to the managed property. If you need to configure a new managed property, click the New Managed Property link to open the property page shown in Figure 9-12. Editing from the context menu opens essentially the same page. There are several sections to configure:
Name And Type. The name must be unique and should follow a naming convention that is meaningful and easy to remember. The data type must match that of the crawled properties that will be mapped to this managed property. Your choices are Text, Integer, Decimal, Date And Time, or Yes/No. There is also a Has Multiple Values check box you can select to indicate that the property has multiple values.
Mappings To Crawled Properties. This is the collection of crawled properties that will be represented by this managed property. This configuration section also includes the option of including values from all mapped crawled properties or including values from a single crawled property determined by the order in which the mapped properties are listed.
Use In Scopes. This Boolean choice determines whether the managed property will be available in the drop-down list when defining search scopes.
Optimize Managed Property Storage. The first of two choices here determines whether the text properties are automatically treated as a hash, which reduces the size but limits comparisons to equal or not equal instead of less than, greater than, order by, and so on. The next choice determines if the managed property will be added to the restricted set of managed properties that are shown in custom search results pages.
Figure 9-12 New (Edit) Managed Property page.
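The two choices in the Mappings To Crawled Properties section can be sketched in Python as follows. This is a conceptual model, not SharePoint code. It assumes that the single-property option picks the first mapped crawled property, in listed order, that actually has a value for the item; that reading of the setting, the resolve_values function, and the property names are assumptions made for illustration.

```python
def resolve_values(item, mapped_in_order, include_all):
    """Return the managed property's values for one crawled item.

    item: dict of crawled property name -> value (missing or empty = no value)
    mapped_in_order: crawled property names in the order listed in the mapping
    include_all: True takes values from every mapped crawled property;
                 False takes only the first mapped property that has a value
                 (assumed interpretation of the order-based option)
    """
    values = [item[name] for name in mapped_in_order if item.get(name)]
    if include_all:
        return values
    return values[:1]


# Hypothetical item: ows_Writer is not set, the other two are.
item = {"ows_DocAuthor": "Smith", "ows_CreatedBy": "Jones"}
order = ["ows_Writer", "ows_DocAuthor", "ows_CreatedBy"]

print(resolve_values(item, order, include_all=True))
print(resolve_values(item, order, include_all=False))
```

Under this assumption, the order of the mapped properties matters only for the single-property option, where it acts as a priority list.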
Other settings for managed properties can be configured programmatically using the Microsoft.Office.Server.Search.Administration.ManagedProperty class or the Windows PowerShell cmdlets for SPEnterpriseSearchMetadataManagedProperty:
MappingDisallowed. Indicates whether a crawled property can be mapped to this managed property.
Retrievable. Affects whether the property can be displayed, sorted, or used with operators. The two settings under Optimize Managed Property Storage also influence this setting.
FullTextQueriable. Governs whether this managed property is stored in the index and can be named in a CONTAINS or FREETEXT clause of a query.
NoWordBreaker. Controls whether the values for this managed property go through a word breaker.
RemoveDuplicates. Determines whether duplicate values are removed when the managed property receives multiple values.
Weight. Adjusts the relevance configuration.
To see all the crawled properties, from the Metadata Property Mappings page click the Crawled Properties link to open the page shown in Figure 9-13. This page presents a view of crawled properties in alphabetical order by name and displays the type, managed property mappings, whether a particular property is included in the index, and whether a particular property is multivalued.
Figure 9-13 Crawled Properties page.
To edit a crawled property, select Edit/Map Property from the context menu, which opens the page shown in Figure 9-14.
Figure 9-14 Edit Crawled Property page.
Within this page, you can manage the mappings of the crawled property to one or more managed properties. The Include Values For This Property In The Search Index option controls whether the property values are included in queries if the crawled property is not mapped to a managed property. Not including the values reduces the size of the index and improves query efficiency, but it affects relevance ranking.
For instance, if this option is not selected and the crawled property is author, simple queries such as Smith return documents containing the word Smith in the body but do not return items whose author property is Smith. However, a query against the managed property with the keyword filter author:Smith returns the documents. The existence of Smith in a property is more relevant than a single instance within the body of a document.
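The author example can be reproduced with a toy index in Python. This is a deliberately simplified model, not how the SharePoint index is implemented, and it ignores ranking entirely: it keeps one full-text index and one separate store for the author managed property. The build_index and search functions, the document names, and the include_property_in_index flag are all hypothetical.

```python
def build_index(docs, include_property_in_index):
    """Build a toy full-text index plus a separate author property store."""
    full_text = {}    # term -> set of document ids
    author_prop = {}  # author value -> set of document ids
    for doc_id, doc in docs.items():
        words = doc["body"].lower().split()
        if include_property_in_index:
            # Property values are treated like ordinary text content.
            words.append(doc["author"].lower())
        for word in words:
            full_text.setdefault(word, set()).add(doc_id)
        # The managed property is always queryable through its own filter.
        author_prop.setdefault(doc["author"].lower(), set()).add(doc_id)
    return full_text, author_prop


def search(query, full_text, author_prop):
    """Answer a one-word query or an author:<value> keyword filter."""
    prefix = "author:"
    if query.lower().startswith(prefix):
        return sorted(author_prop.get(query[len(prefix):].lower(), set()))
    return sorted(full_text.get(query.lower(), set()))


DOCS = {
    "memo.docx": {"author": "Smith", "body": "budget review for the quarter"},
    "notes.docx": {"author": "Jones", "body": "meeting with Smith about budget"},
}

full_text, author_prop = build_index(DOCS, include_property_in_index=False)
print(search("Smith", full_text, author_prop))         # body matches only
print(search("author:Smith", full_text, author_prop))  # property filter
```

With the flag set to False, the simple query finds only the document that mentions Smith in its body, while the author:Smith filter still finds the document that Smith wrote. Setting the flag to True makes the simple query return both.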
Crawled properties are organized into categories. The Categories link opens a page of hyperlinked categories, which are shown in Figure 9-15:
Basic. Contains metadata associated with the gatherer, search, core, and storage property sets. In my environment, there are 10 different GUIDs (property sets) in the Basic Crawled Property Category.
Business Data. Contains metadata associated with content in the Business Data Catalog.
Internal. Contains metadata internal to SharePoint.
Mail. Contains metadata associated with Microsoft Exchange Server.
Notes. Contains metadata associated with Lotus Notes.
Office. Contains metadata contained in Microsoft Office documents such as those created with Word, Excel, PowerPoint, and so on.
People. Contains metadata associated with the people profiles in SharePoint. The majority of this metadata is also mapped to various managed properties from Active Directory and SharePoint information.
SharePoint. Contains metadata that is part of the Microsoft Office schema available out of the box.
Tiff. Contains metadata associated mainly with documents that have been scanned or faxed, along with word-processing and Optical Character Recognition (OCR) information.
Web. Contains HTML metadata associated with Web pages.
XML. Contains metadata associated with the XML filter.
Figure 9-15 Categories page.
Each category can be opened to expose just the crawled properties within that group. You can open the page to edit the properties of each category from its context menu.
Bulk actions on all properties within the category can be taken on the category’s property page, shown in Figure 9-16.
Figure 9-16 Edit Category page.
Enabling all these options not only ensures that crawled properties for this category will be discovered, but also that managed properties are automatically created when new SharePoint columns are created.
Your solution can present these new managed properties to the user. Unfortunately, the name of the automatically generated managed property is not user friendly. Because SharePoint crawled properties are prefixed with ows_, the auto-generated managed property is also prefixed with ows.
For example, if a user creates a new column in a document library called CostCenter, the crawled property will be ows_CostCenter and the managed property will be owsCostCenter. If the column name includes a space, as in Cost Center, the crawled property will be ows_Cost_x0020_Center and the managed property will be owsCostx0020Center.
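The naming pattern in these two examples can be expressed as a short Python sketch. The rules below are inferred only from the CostCenter and Cost Center examples: a space becomes _x0020_, the crawled property gets the ows_ prefix, and the managed property name drops the underscores. Other special characters in column names are not handled, and the function names are hypothetical.

```python
def crawled_name(column_name):
    """Column name -> crawled property name, following the examples in the text.

    Only spaces are encoded here; other characters are left unchanged.
    """
    return "ows_" + column_name.replace(" ", "_x0020_")


def auto_managed_name(crawled):
    """Crawled property name -> auto-generated managed property name."""
    return crawled.replace("_", "")


for column in ("CostCenter", "Cost Center"):
    crawled = crawled_name(column)
    print(column, "->", crawled, "->", auto_managed_name(crawled))
```

Running the loop reproduces both name pairs given above, which makes it easy to predict what an automatically created managed property will be called before the column exists.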
The programming effort to correct the naming scheme can exceed the cost of manual administration of managed properties.
From the context menu or from the Edit Category page, you can delete an empty category. New categories can be created only programmatically or with the Windows PowerShell SPEnterpriseSearchMetadataCategory cmdlets.