Use SharePoint As a Content Source

This article walks you through the process of setting up SharePoint as a content source.

PREREQUISITES

  • You have to be an admin to crawl SharePoint site data. If you are not an admin, then you can crawl only the sites accessible to you.

  • Any SharePoint user can crawl community sites.

PERMISSIONS

  • The permissions work on the basis of email.

  • Site permissions

    • Community sites are visible to all search users.

    • Team sites are visible only to those users who have the right to view them.

    • Admins can view all sites.

    • Users from outside SharePoint cannot view any sites.

    • If you no longer have access to a team site, ask the SharePoint admin to restore your access and then recrawl SharePoint. Do not skip the recrawl.

  • SharePoint may throttle the data it returns. Check the latest SharePoint rate limits.
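
The site-permission rules above can be summarized as a small decision function. This is only an illustrative sketch, not SearchUnify code: the Site and User classes, their field names, and the order in which the rules are checked are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    kind: str  # "community" or "team" (hypothetical labels)
    viewers: set[str] = field(default_factory=set)  # emails allowed to view a team site


@dataclass
class User:
    email: str
    is_admin: bool = False
    in_sharepoint: bool = True  # False for users from outside SharePoint


def can_view(user: User, site: Site) -> bool:
    """Apply the site-permission rules listed above (order is an assumption)."""
    if not user.in_sharepoint:
        return False  # users from outside SharePoint cannot view any sites
    if user.is_admin:
        return True  # admins can view all sites
    if site.kind == "community":
        return True  # community sites are visible to all search users
    # Team sites: permissions work on the basis of email.
    return user.email.lower() in {v.lower() for v in site.viewers}
```

For example, with a team site whose viewers are {"ana@example.com"}, can_view returns True for Ana, False for any other non-admin user, and True for an admin.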

Establish a Connection

  1. Navigate to Content Sources.

  2. Click Add New Content Source.

  3. Find SharePoint using the search box and click Add.

  4. Enter details.
    • Name. Enter a label. Labels help you distinguish content sources from one another.
    • Client URL. Enter the web address of your SharePoint instance.
    • Authentication Type. Select either Basic or OAuth. If you select Basic, enter your SharePoint login ID and password. If you select OAuth, enter the client ID and client secret. For instructions, see Obtain Client ID and Client Secret for SharePoint Authentication.
    • Language. Select the content language.

  5. Click Connect.
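
Before clicking Connect, it can help to check that you have gathered every value the form asks for. The sketch below models the form fields as a hypothetical data class; the class, the function name, and the rule that Client URL must start with http or https are assumptions for illustration, not part of any SearchUnify API. The address contoso.sharepoint.com is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class SharePointConnection:
    name: str
    client_url: str
    auth_type: str  # "Basic" or "OAuth"
    language: str
    login_id: Optional[str] = None       # used with Basic
    password: Optional[str] = None       # used with Basic
    client_id: Optional[str] = None      # used with OAuth
    client_secret: Optional[str] = None  # used with OAuth


def missing_fields(conn: SharePointConnection) -> list[str]:
    """Return the names of required fields that are empty or invalid."""
    problems = [f for f in ("name", "language") if not getattr(conn, f)]
    parsed = urlparse(conn.client_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("client_url")  # assumed rule: a full web address is expected
    if conn.auth_type == "Basic":
        required = ("login_id", "password")
    elif conn.auth_type == "OAuth":
        required = ("client_id", "client_secret")
    else:
        return problems + ["auth_type"]
    problems += [f for f in required if not getattr(conn, f)]
    return problems


conn = SharePointConnection(
    name="Intranet",
    client_url="https://contoso.sharepoint.com",
    auth_type="OAuth",
    language="en",
    client_id="my-client-id",
    client_secret="my-client-secret",
)
print(missing_fields(conn))  # prints []
```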

Set Up Crawl Frequency

The first crawl is always manual and is performed after you configure the content source. In Choose a Date, select a date to start crawling from; only data created after the selected date is crawled. For now, keep Frequency at its default value, Never, click Set, and move to the next section.
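
The effect of Choose a Date can be pictured as a simple filter on creation dates. The sketch below is an illustration only: the document structure and the function name are hypothetical, and whether documents created exactly on the selected date are included is an assumption (here they are excluded).

```python
from datetime import date


def select_for_crawl(documents: list[dict], start: date) -> list[dict]:
    """Keep only documents created after the selected start date."""
    return [doc for doc in documents if doc["created"] > start]


docs = [
    {"title": "Old policy", "created": date(2023, 12, 31)},
    {"title": "New policy", "created": date(2024, 1, 15)},
]
print([d["title"] for d in select_for_crawl(docs, date(2024, 1, 1))])
# prints ['New policy']
```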

Select Fields and Websites for Indexing

SearchUnify can index three SharePoint content types: list, page, and document. In By Content Type, you can choose to index one, two, or all three of them. You can further define which properties (content fields) of these content types are indexed.

  1. Click to view the properties of a content type.

  2. A dialog opens. You can click to remove a content field; removed content fields are not indexed. Use the Name column to find content fields, the Label column to rename them, and the Type column to change the default data type. To edit an existing content field, click its edit icon. Once the configuration is complete, click Save.

  3. OPTIONAL. Repeat the previous two steps for other content types.
  4. Navigate to By Place and use the alphabetical index to find your SharePoint websites. A website named Canopus will be found by clicking the letter C, a website named Sirius by clicking the letter S, and so on. 0-9 lists all the websites that either start with a digit or with a non-ASCII Latin character. Both 6-dimensional and éducation-de-nos-amis will be listed under 0-9.

  5. Use the checkboxes in the Enable column to set websites for indexing. Once you have checked all the websites, click Save.
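
The alphabetical index in By Place can be modeled with a short function, which is handy for predicting where a site will appear. This is a sketch of the behavior described in step 4, not SearchUnify code; the function name is hypothetical, and placing names that start with a symbol (or empty names) under 0-9 is an assumption.

```python
import string


def index_bucket(site_name: str) -> str:
    """Return the index tab under which a site name is listed."""
    first = site_name[:1]
    if first and first in string.ascii_letters:
        return first.upper()  # ASCII letters map to their own tab
    return "0-9"  # digits and non-ASCII characters share one tab


for name in ("Canopus", "Sirius", "6-dimensional", "éducation-de-nos-amis"):
    print(name, "->", index_bucket(name))
```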

Return to the Content Sources screen and start the crawl. If the number in the Total Documents column is one or more, then you have successfully set up SharePoint as a content source.

NOTE.

If the number in Total Documents remains zero, either your SharePoint instance has no data or the content source was not set up successfully.

After the First Crawl

Return to the Content Sources screen and start the crawl from Actions. The number of indexed documents is updated after the crawl is complete. You can view crawl progress from Actions. For details on crawl progress, see View Crawl Logs.

NOTE 1

Review the settings in Rules if there is no progress in Crawl Logs.

NOTE 2

For Mamba '22 and newer instances, search isn't impacted during a crawl. However, in older instances, some documents remain inaccessible while a crawl is going on.

Once the first crawl is complete, open the content source for editing from Actions and set a crawl frequency.

  1. In Choose a Date, click to open a calendar and select a date. Only data created after the selected date is indexed.

  2. Use the Frequency dropdown to select how often SearchUnify should index the data. Whenever Frequency is set to a value other than Never, a third dropdown appears where you can specify the interval. For example, you can set the frequency to Weekly and choose Tuesday as the crawling day. When Frequency is set to Hourly, manual crawls are disabled.

  3. Click Set to save crawl frequency settings. On clicking Set, you are taken to the Rules tab.
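
To see what a Weekly frequency with Tuesday as the crawling day implies, the sketch below computes the next crawl date from a given day. It only illustrates the calendar arithmetic; the function name is hypothetical, and SearchUnify's actual scheduler (including the time of day at which a crawl starts) is not described in this article.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() numbering: Monday is 0, Sunday is 6


def next_weekly_crawl(today: date, crawl_weekday: int = TUESDAY) -> date:
    """Return the first date strictly after `today` that falls on `crawl_weekday`."""
    days_ahead = (crawl_weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


# Wednesday, April 10, 2024 -> the following Tuesday
print(next_weekly_crawl(date(2024, 4, 10)))  # prints 2024-04-16
```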

Last updated: Wednesday, April 10, 2024
