Monday, October 19, 2009

SPC 2009 - Enterprise Search Overview

***UPDATE***
Updated with content from Enterprise Search Deep-Dive Session.

Just attended the Enterprise Search Overview for SharePoint 2010 at SPC 2009.  It focused largely on the FAST integration with SP2010.  Below are some quick notes I took.  Obviously I was not able to capture everything, but there is some cool new stuff coming out...

New Capabilities:

  • Increased Scalability over MOSS 2007

  • Multiple Crawlers

  • Query: Partition and mirror the index

  • Sub-second latency scaling to 100 Million documents

  • Rich Content Processing

  • Wizard driven installation

  • Full Fault Tolerance

  • Native 64-bit: Hyper-V support

  • PowerShell Support

  • SCOM support

  • Full search reporting


Search Product Line:

  • Search Server 2010 Express (Still Free)

    • Basic Search



  • SharePoint Server 2010

    • Intranet-wide search, people and expertise search



  • FAST Search Server 2010 for SharePoint

    • Visual experiences, extreme adaptability and Advanced Content Processing




Scalability:

  • Multiple Indexers (Yes! True redundancy!)

  • Indexers are stateless crawlers and do not store a copy of the index.  They crawl, index, and immediately propagate to Query Servers.

  • Index Partitioning

  • Query Mirroring

  • Multiple Property Databases

  • Admin Database + Admin Component (equivalent of SSP database in MOSS)

  • Crawl Distribution

    • Built-in load balancer distributes hosts to crawl databases

    • Crawlers crawl the content that is covered by their crawl database

    • Default configuration can be overridden using host distribution rules



  • Query Distribution

    • Low query latency if all index partitions are equal in size

    • Distribution by hash of documentId

    • Crawlers partition the indexed data and propagate it to query servers



  • Multiple scale-out options now, including multiple crawl databases on the same SQL server or multiple SQL servers
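
To make the crawl and query distribution notes above a bit more concrete, here is a minimal Python sketch. It is not how SharePoint implements it. The function names, the MD5-based hash, and the example hosts and database names are all my own assumptions for illustration. In particular, the hash-based default for hosts is only a stand-in for the built-in load balancer, whose actual logic was not covered in the session.

```python
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def crawl_database_for_host(host, crawl_dbs, host_rules=None):
    """Pick a crawl database for a host.

    An explicit host distribution rule (hypothetical dict) wins over the
    default assignment, which here is a simple hash (an assumption).
    """
    if host_rules and host in host_rules:
        return host_rules[host]
    return crawl_dbs[stable_hash(host) % len(crawl_dbs)]


def partition_for_document(document_id, partition_count):
    """Map a document ID to an index partition by hashing the ID."""
    return stable_hash(str(document_id)) % partition_count


if __name__ == "__main__":
    dbs = ["CrawlDB1", "CrawlDB2"]
    rules = {"hr.example.com": "CrawlDB2"}  # hypothetical override rule
    print(crawl_database_for_host("hr.example.com", dbs, rules))
    print(partition_for_document("doc-42", 4))
```

Hashing the document ID spreads documents roughly evenly across partitions, which ties back to the point that query latency stays low when all index partitions are about the same size.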


Engine Enhancements:

  • Support for regular expressions in Crawl Rules

  • Native support for crawling case sensitive repositories

  • Ability to prioritize Content Sources to distribute crawler resources

  • New Crawl Policy to define how crawler treats error conditions

  • Low indexing downtime (Search now only pauses for approx. an hour during backups)
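
As a rough idea of what regular expressions in crawl rules buy you, here is a small Python sketch. The rule list, the URLs, and the "first matching rule wins" ordering are assumptions for illustration only. This is not the SharePoint crawl rule API or its regex dialect.

```python
import re

# Hypothetical (pattern, include?) pairs, evaluated in order; first match wins.
CRAWL_RULES = [
    # Exclude anything under a four-digit year folder in the archive.
    (re.compile(r"^https?://intranet\.example\.com/archive/\d{4}/.*$"), False),
    # Include Word and PDF documents anywhere else on the intranet host.
    (re.compile(r"^https?://intranet\.example\.com/.*\.(docx?|pdf)$", re.IGNORECASE), True),
]


def should_crawl(url, rules=CRAWL_RULES, default=False):
    """Return True if the first matching rule includes the URL."""
    for pattern, include in rules:
        if pattern.match(url):
            return include
    return default  # no rule matched


if __name__ == "__main__":
    print(should_crawl("http://intranet.example.com/docs/plan.pdf"))
    print(should_crawl("http://intranet.example.com/archive/2008/old.pdf"))
```

A single pattern like the archive rule would previously have needed one wildcard rule per year folder.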


Extensibility Enhancements:

*Note: Protocol handler API still supported

  • Change web part properties - no code

    • Modify XSLT

    • Modify config XML

      • Refinement Panel - control metadata available for refinement

      • Advanced Search - control metadata available for advanced search queries





  • Extend OOB web parts programmatically

    • All OOB web parts are public (Sweet!)



  • Connector Framework

    • Support for attachments

    • Item level security

    • Crawl through entity associations



  • Inline caching for better citizenship

  • Richer crawl options

    • Regular full crawl

    • Time stamp based incremental crawl

    • Change log crawl + deleted count

    • Change log + delete log crawl
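
The difference between the crawl options above can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (the `Item` type, the in-memory index set, and the change log format are all hypothetical) and not the connector framework API.

```python
from datetime import datetime
from typing import NamedTuple


class Item(NamedTuple):
    item_id: str
    modified: datetime


def timestamp_incremental(items, last_crawl):
    """Time stamp based crawl: re-index items modified since the last crawl.

    Deletions are invisible to this approach, since a deleted item simply
    no longer shows up in the enumeration.
    """
    return [item.item_id for item in items if item.modified > last_crawl]


def apply_change_log(index, change_log):
    """Change log crawl: replay ('update' | 'delete', item_id) entries."""
    for action, item_id in change_log:
        if action == "delete":
            index.discard(item_id)
        elif action == "update":
            index.add(item_id)
        else:
            raise ValueError(f"unknown change log action: {action}")
    return index


if __name__ == "__main__":
    items = [
        Item("a", datetime(2009, 10, 1)),
        Item("b", datetime(2009, 10, 18)),
    ]
    print(timestamp_incremental(items, last_crawl=datetime(2009, 10, 10)))
    print(apply_change_log({"a", "b"}, [("delete", "a"), ("update", "c")]))
```

The change log variants let the crawler skip enumerating the whole repository and also pick up deletes, which a time stamp comparison alone cannot see.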




Richer Manageability:

  • Consolidated admin UI dashboard

  • Automated service password management through "managed accounts"

  • PowerShell support for scripted administration

  • Built-in system health monitoring, support for SCOM monitoring and alerting

  • Built-in and extensible search analytics


Search Site:

  • Native Wildcard Search

  • Type-ahead search box (aka Query Suggestions)


Search Results Page New Features:

  • Metadata Extraction

  • Refinement Panel

    • Filtering by Results Type, Site, Author, Modified Date (displayed in a separate panel kind of like federated results)

    • Suggested Searches



  • Did you mean?


FAST - Search Results Page Additional New Features:

  • Thumbnail of document appears with each result (This is awesome!)

  • Preview documents within the search results window

  • View in Browser Link

  • Similar Results

  • Refinement Panel (In addition to items listed above)

    • Dynamically generate metadata from documents for filtering (metadata fields do not have to be populated first)



  • Rich Text Best Bets (including images)

  • Federate Results directly to desktop


People Search:

  • "Address Book Style" Search

    • Phonetic name matching

    • Nickname matching



  • Refinement Panel

    • Filtering on Focus, Job Title, Past Projects, Interests



  • Vanity Search (Self Search) - When a user looks themselves up in People Search

    • Help People Find Me Appears

      • Results return the user's profile as seen by others in the search results.  Also returns some metrics, including the number of searches that led to them and the keywords used





  • Additional Drill-down for Each User

    • Browse in Organizational Chart

    • View Recent Content




Connectors:

  • Indexing Connectors

  • Supports the OpenSearch Standard for Federation
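
For anyone who has not worked with OpenSearch federation: a federated location is essentially described by a URL template with placeholders such as `{searchTerms}`, and optional parameters are marked with a trailing `?`. Below is a minimal Python sketch of filling in such a template. The `fill_template` helper and the example URL are hypothetical, and this is not how SharePoint itself processes the template.

```python
import re
from urllib.parse import quote

# Matches {name} and {name?} placeholders in an OpenSearch URL template.
_PARAM = re.compile(r"\{([A-Za-z:]+)(\?)?\}")


def fill_template(template, **values):
    """Substitute URL-encoded values into an OpenSearch URL template.

    Optional parameters without a value become an empty string;
    a missing required parameter raises KeyError.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return _PARAM.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical federated search endpoint.
    template = "http://search.example.com/results?q={searchTerms}&start={startIndex?}"
    print(fill_template(template, searchTerms="enterprise search"))
```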


Reporting and Administration:

  • Greatly Refined

  • Search Administration dashboard (looks much like the one added with MOSS 2007 SP2) is web part based and can be easily customized

  • Error log reporting will now allow the administrator to select a specific item and decide whether to remove the item from the index or re-crawl.


There were also several web analytics enhancements for more robust reporting.

More later.

Jeff
