Oracle® Secure Enterprise Search Administrator's Guide 10g Release 1 (10.1.8) Part Number B32259-01
Oracle Secure Enterprise Search (SES) provides uniform search capabilities over multiple repositories.
Oracle SES uses a crawler to collect data from these sources. The crawler supports a number of built-in source types, as well as a published plug-in (or connector) architecture for adding new types. Multiple Oracle SES instances can also share content through the federated source type.
Oracle SES supports numerous built-in source types:
Web: A Web source represents the content on a specific Web site. Web sources facilitate maintenance crawling of specific Web sites.
Table: A table source represents content in an Oracle database table or view.
File: A file source is the set of documents that can be accessed through the file protocol.
E-mail: An e-mail source derives its content from e-mails sent to a specific e-mail address. When Oracle SES crawls an e-mail source, it collects e-mail from all folders set up in the e-mail account, including Drafts, Sent Items, and Trash e-mails.
Mailing list: A mailing list source derives its content from e-mails sent to a specific mailing list.
OracleAS Portal: An OracleAS Portal source lets you search across multiple OracleAS Portal repositories, such as Web pages, files on disk, and pages on other OracleAS Portal instances.
Oracle Calendar: An Oracle Calendar source represents the content in an Oracle Calendar repository. Oracle SES can crawl content (meetings and events) and metadata in Oracle Calendar and provide secure full-text search over an Oracle Calendar repository. You can specify more than one thread to crawl. Deleted items are removed from the index during incremental crawling. You can search based on title, author, start or end date (year, month, day), event type, status, or location.
Oracle Content Database: An Oracle Content Database source represents the content in an Oracle Content Database repository.
Note:
Oracle Content Database and Oracle Content Services are the same product. This book uses the product name Oracle Content Database to mean Oracle Content Database and Oracle Content Services. Oracle Content Database sources are certified with Oracle Content Database release 10.2 and Oracle Content Services release 10.1.2.3.
Oracle Applications (Oracle E-Business Suite 11i and Siebel 8): Search Oracle Applications with an Oracle E-Business Suite 11i source or a Siebel 8 source.
Federated: A federated source lets you search secure content across distributed Oracle SES instances.
Additionally, Oracle SES 10.1.8 can find and verify information in the following repositories out of the box, with no additional coding required:
Files in Microsoft NT file systems (NTFS)
EMC Documentum Content Server DocBases
IBM Lotus Notes databases
FileNet Content Engine object stores
FileNet Image Services libraries
Open Text Livelink
Microsoft Exchange
See Also:
Oracle Secure Enterprise Search Release Notes for version information and known issues
Oracle Secure Enterprise Search Installation and Upgrade Guide for installation requirements and tips, upgrade steps, and information on how to get started using Oracle SES
The Oracle SES home page for updated information on known issues, as well as code samples and best practices: http://www.oracle.com/technology/products/oses/index.html
Oracle SES includes the following components:
The Oracle SES crawler is a Java process activated by a set schedule. When activated, the crawler spawns a configurable number of processor threads that fetch information from various sources and index the documents. This index is used for searching sources.
The crawler maps links and analyzes relationships. Whenever the crawler encounters embedded non-HTML or non-textual documents during crawling, it automatically detects the document type, then filters and indexes the document.
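The fetch-and-index cycle described above can be sketched as follows. This is a minimal illustration in Python, not Oracle SES code: the in-memory `DOCUMENTS` dictionary, the `fetch` and `crawl` functions, and the `num_threads` parameter are all hypothetical names chosen for the example, and a real crawler would perform network or file I/O and document filtering where `fetch` simply reads from memory.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical in-memory stand-in for the documents reachable from a source.
DOCUMENTS = {
    "doc1": "oracle secure enterprise search",
    "doc2": "crawler threads fetch documents",
    "doc3": "search the index",
}


def fetch(doc_id):
    """Pretend to fetch a document; a real crawler would do I/O here."""
    return DOCUMENTS[doc_id]


def crawl(doc_ids, num_threads=2):
    """Fetch documents on num_threads workers and build an inverted index."""
    index = defaultdict(set)
    lock = threading.Lock()

    def fetch_and_index(doc_id):
        text = fetch(doc_id)
        # Guard the shared index so concurrent workers do not interleave updates.
        with lock:
            for term in text.split():
                index[term].add(doc_id)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() waits for every task and re-raises any worker exception.
        list(pool.map(fetch_and_index, doc_ids))
    return index


index = crawl(sorted(DOCUMENTS), num_threads=2)
print(sorted(index["search"]))  # ['doc1', 'doc3']
```

The thread count is a parameter because the text describes it as configurable; the index built here is a plain term-to-document map, which is only a stand-in for the search index the crawler maintains.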
Use the Oracle Secure Enterprise Search administration tool to manage and monitor Oracle SES components. For example:
Define sources and crawling scope
Configure the search application
Monitor crawl progress and search performance
See Also:
Oracle SES administration tutorial for help understanding common administrator tasks:
http://st-curriculum.oracle.com/tutorial/SESAdminTutorial/index.htm
Oracle SES administration tool context-sensitive online help
Oracle Secure Enterprise Search provides several APIs. For example, the Crawler Plug-in API enables you to create a custom secure crawler plug-in (or connector) to meet your requirements. With the Web Services API, you can integrate Oracle SES search capabilities into your search application.
Oracle SES also provides an out-of-the-box search application.
Information in an enterprise can be spread across Web pages, databases, mail servers or other collaboration software, document repositories, file servers, and desktops. Oracle SES searches all your data through the same interface. Oracle SES is fully globalized and works with 27 languages including Chinese, Japanese, Korean, Arabic, and Hebrew.
This section introduces a few of the features in Oracle SES.
See Also:
Chapter 3, "Understanding Crawling and Searching" for more features relating to the crawler
Much of the information within an organization is publicly accessible. Anyone is allowed to view it. Therefore, it is relatively easy for a crawler to find and index that information.
However, there are other sources that are protected. These protected sources might only be viewable by certain users or groups of users. For example, while users can search in their own e-mail folders, they should not be able to search anyone else's e-mail.
For protected sources, the Oracle SES crawler will index documents with the proper access control list. When end users perform a search, only documents that they have privileges to view will be returned.
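The idea of indexing each document with an access control list and filtering results at query time can be sketched as follows. This is a simplified illustration, not the Oracle SES implementation: the `IndexedDoc` class, the `secure_search` function, and the convention that an empty ACL marks a public document are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    text: str
    # Users or groups allowed to view the document.
    # Assumption for this sketch: an empty ACL means the document is public.
    acl: frozenset = frozenset()


def secure_search(docs, term, user, groups=()):
    """Return IDs of documents that match term and that the user may view."""
    principals = {user, *groups}
    results = []
    for doc in docs:
        if term not in doc.text.split():
            continue
        if not doc.acl or principals & doc.acl:
            results.append(doc.doc_id)
    return results


docs = [
    IndexedDoc("handbook", "company travel policy"),
    IndexedDoc("alice-mail", "travel booking for alice", frozenset({"alice"})),
    IndexedDoc("hr-memo", "travel budget memo", frozenset({"hr"})),
]

print(secure_search(docs, "travel", "alice"))         # ['handbook', 'alice-mail']
print(secure_search(docs, "travel", "bob", ["hr"]))   # ['handbook', 'hr-memo']
```

In this sketch the e-mail example from the text holds: a search by `bob` never returns `alice-mail`, because the ACL check is applied to every match before it is added to the result list.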
See Also:
"Enabling Secure Search"
Oracle Secure Enterprise Search provides the capability of searching multiple Oracle SES instances with their own document repositories and indexes. It provides a unified framework to search the different document repositories that are crawled, indexed, and maintained separately. A federation broker calls the federation endpoint to collect content matching the search criteria for the sources managed at that endpoint.
Federated search allows a single query to be run across all Oracle SES instances. It aggregates the search results to show one unified result list to the user. User credentials are passed along with the query so that each federation endpoint can authenticate the user against its own document repository.
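The broker and endpoint roles described above can be sketched as follows. This is a toy model under stated assumptions, not the Oracle SES federation protocol: the `Endpoint` class, the `federated_search` function, and the term-frequency score used for merging are all hypothetical, and the user name stands in for whatever credentials a real deployment would pass along.

```python
class Endpoint:
    """Hypothetical federation endpoint with its own documents and access rules."""

    def __init__(self, name, docs):
        self.name = name
        # doc_id -> (text, allowed users); None means anyone may view it.
        self.docs = docs

    def search(self, term, user):
        """Authenticate the user locally and return (score, qualified_id) hits."""
        hits = []
        for doc_id, (text, allowed) in self.docs.items():
            words = text.split()
            if term in words and (allowed is None or user in allowed):
                # Toy relevance score: share of words equal to the term.
                score = words.count(term) / len(words)
                hits.append((score, f"{self.name}:{doc_id}"))
        return hits


def federated_search(endpoints, term, user):
    """Broker: send one query to every endpoint and merge into one result list."""
    merged = []
    for endpoint in endpoints:
        merged.extend(endpoint.search(term, user))
    # Highest score first; ties are broken by document ID for a stable order.
    merged.sort(key=lambda hit: (-hit[0], hit[1]))
    return [doc_id for _, doc_id in merged]


hr = Endpoint("hr", {
    "policy": ("vacation policy", None),
    "salary": ("salary policy review notes", {"alice"}),
})
eng = Endpoint("eng", {"guide": ("coding policy guide", None)})

print(federated_search([hr, eng], "policy", "bob"))
# ['hr:policy', 'eng:guide']
print(federated_search([hr, eng], "policy", "alice"))
# ['hr:policy', 'eng:guide', 'hr:salary']
```

Each endpoint applies its own access check before returning hits, so the broker only merges results and never needs to know the access rules of a remote repository.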
Create a federated source on the Home - Sources page of the Oracle SES administration tool.
The following diagram illustrates Oracle SES federation architecture.
Oracle SES offers a Web services API that lets you integrate Oracle SES search capabilities into your search application.
Oracle SES provides an extensible crawler plug-in (or connector) framework that lets you crawl and index proprietary document repositories.
See Also:
The Oracle Secure Enterprise Search home page at http://www.oracle.com/technology/products/oses/index.html for updated information on known issues, as well as code samples and best practices