Use Salesforce As a Content Source

SearchUnify can index the data stored in your org, including on Service Cloud, Sales Cloud, and Community Cloud. You can choose to index all the objects and make them searchable for quick reference. Or, you can limit indexing to select objects and fields. For instance, for the case object, you can index both private and public comments on a case, case subjects, and case priority. This article shows how to set up your Salesforce org for indexing.

Prerequisites

Licenses

  • Salesforce Platform license.
  • Knowledge User license, if you intend to index articles.
  • The right set of permissions; see the Permissions Chart for details.

Permissions Chart

Type | Sub-Type | Name | Use
Profile-Level Permission | Administrative Permissions | API Enabled* | Mandatory to call Salesforce APIs by making a user an API user.
Profile-Level Permission | Administrative Permissions | View Setup and Configuration* OR Assign Permission Sets* OR Manage User* | One of these is mandatory to crawl Salesforce permissions or access control settings.
Profile-Level Permission | Administrative Permissions | View All Data | Index feed items. Used only when the goal is to index feed items.
Profile-Level Permission | Administrative Permissions | Manage Data Categories | Index data categories. Preferred: In Default Visibility Settings, set the default visibility for the categories in Category Group Visibility to All Categories. Acceptable: If Category Group Visibility is set to Custom, then before sharing the user with us, assign the user the role, permission sets, and profile needed to index data categories.
Profile-Level Permission | General User Permissions | Manage Articles* | Mandatory permission for indexing and removing article drafts. Used (1) when the goal is to index articles and drafts, and (2) when the goal is to update the index by removing archived articles and drafts.
Profile-Level Permission | General User Permissions | Manage Cases | Index cases.
Profile-Level Permission | Standard Object Permissions | Read | Index only the objects with the Read permission.
Profile-Level Permission | Article Type Permissions | Read | Index only the articles with the Read permission.
Community Members | | Community Member | Index navigational topics on a community. Before sharing the user, ensure that its profile is a member of the community.
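The mandatory rows of the chart can be read as a small pre-flight check. The sketch below is illustrative only. It assumes you have already collected the crawl user's enabled permissions into a set of strings, using the labels from the chart. The function name missing_mandatory_permissions is hypothetical and is not part of SearchUnify or Salesforce.

```python
# Hypothetical pre-flight check; permission names are the labels used in the chart.
ALWAYS_REQUIRED = {"API Enabled"}
# At least one of these is needed to crawl permissions or access control settings.
ONE_OF_REQUIRED = {"View Setup and Configuration", "Assign Permission Sets", "Manage User"}


def missing_mandatory_permissions(enabled, index_articles=False):
    """Return a list of problems; an empty list means the check passed."""
    enabled = set(enabled)
    problems = [f"Missing: {name}" for name in sorted(ALWAYS_REQUIRED - enabled)]
    if not ONE_OF_REQUIRED & enabled:
        problems.append("Missing one of: " + ", ".join(sorted(ONE_OF_REQUIRED)))
    # Manage Articles is mandatory only when articles and drafts are indexed.
    if index_articles and "Manage Articles" not in enabled:
        problems.append("Missing: Manage Articles")
    return problems


# Example: a user with API access but none of the three alternative permissions.
print(missing_mandatory_permissions({"API Enabled"}))
```

An empty list suggests the user is ready to be shared for crawling; any entry points at the row of the chart to revisit.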

Establish a Connection

  1. Navigate to Content Sources and click Add New Content Sources.

  2. Give your content source a name.

  3. Select Sandbox or Production in Client URL. Your selection decides whether SearchUnify indexes the data from your sandbox or production instance.

  4. Select the content language. If your Knowledge Base hosts articles in more than one language, specify them all in Language so that SearchUnify can index your content in every supported language. After making a selection, click Connect.

  5. If you are already logged into Salesforce in another tab, you will be greeted by a "Connection Succeeded" message. Click Next.

You must establish the connection again if the Salesforce user who set it up is deleted from the org. You don't need to reconnect when only that user's Salesforce password has changed. Once the connection has been set up successfully, you are taken to the next step, Set Frequency.

Re-Connect

The Authentication screen is displayed when an existing Content Source is opened for editing. An admin can edit a Content Source for several reasons:

  • To reauthenticate

  • To fix a crawl error

  • To change frequency

  • To add or remove an object or a field for crawling

  • For multiple other reasons

When you edit a Content Source, one of two cases applies:

Case 1: There are no crawl errors and the Content Source authentication is valid. A Connected message is displayed.

Case 2: There is a crawl error or the authentication details have changed, so the Content Source authentication fails or is disrupted. A Re-Connect button is displayed. In both situations, the SearchUnify Content Source connection must be re-authenticated. To do so, enter the authentication details and click Re-Connect.

Set Up Crawl Frequency

The first crawl is always manual and is performed after configuring the content source. In Choose A Date, select a start date; only data created after the selected date is crawled. For now, keep the frequency at its default value, Never, click Set, and move to the next section.

Select Content Fields for Indexing

Two Appinium objects, ViewTrac and LearnTrac, are supported out-of-the-box. The content is shown to searchers according to the permissions defined using the Media-sharing rules under Appinium. Assign permission sets for objects carefully. In the LearnTrac app, permissions are set at the document level, so any change to permissions requires a manual crawl.

  1. You can decide which object fields are indexed. SearchUnify supports standard and custom object fields. To select an object, enter its Object API name. The list of valid object API names can be found in the second column of Object Manager. (Log into Salesforce to view.)

    NOTE.

    SearchUnify supports the indexing of record fields as well. You might want to crawl them if your knowledge base is hosted on a Salesforce org.

    To index article drafts, enter knowledge__kav_draft in the Object API field.

    In order to crawl the feeditem object, add a network scope condition first.

    When adding fields, you can try adding new objects instead. A big advantage of object crawl is that it's safer. When an object crawl fails, the overall index isn't impacted.

    File types supported for attachments data are pdf, doc, docx, ppt, pptx, potx, csv, xsl, txt, rtf, evtx, log.

  2. Add Object becomes clickable if the API name is valid. Otherwise, you receive a warning.

  3. After entering a valid API name, enter a label and click Add Object. If you cannot add an object, verify the object name. Misspelled or incorrect object names throw an error. Users cannot add objects in three other scenarios: (a) when they don't have access to the object in Salesforce, (b) when the credentials used to authenticate their org in SearchUnify have expired, and (c) when SearchUnify APIs are down.

  4. After adding the objects, the next task is to select object fields (content fields) for indexing. Click to select a content field.

  5. Add properties, such as createdDate and accountDescription, one at a time. You can change a property's label and type. Note that when a crawled field is deleted from Salesforce, crawling cannot take place: each time SearchUnify tries to update the index, an error shows up. If you haven't been able to crawl Salesforce for some time, compare the indexed fields of each object in SearchUnify with the fields in your Salesforce org.

  6. Repeat the previous step with all the objects and press Save.
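Step 5 suggests comparing the indexed fields of each object with the fields in your Salesforce org. A minimal sketch of that comparison follows, assuming you have exported both field lists by hand into dictionaries keyed by object API name. The function name stale_fields and the sample field Legacy_Code__c are hypothetical.

```python
def stale_fields(indexed, in_org):
    """Map each object to the indexed fields that no longer exist in the org.

    Both arguments map an object API name to an iterable of field API names.
    Field names are compared in lowercase so that differences in letter case
    alone are not reported.
    """
    report = {}
    for obj, fields in indexed.items():
        org_fields = {name.lower() for name in in_org.get(obj, ())}
        missing = sorted(name for name in fields if name.lower() not in org_fields)
        if missing:
            report[obj] = missing
    return report


# Hypothetical export: Legacy_Code__c was deleted from the Case object in the org.
indexed = {"Case": ["Subject", "Priority", "Legacy_Code__c"], "Account": ["Name"]}
in_org = {"Case": ["Subject", "Priority"], "Account": ["name"]}
print(stale_fields(indexed, in_org))
```

Any field that appears in the report is a candidate for removal from the content source configuration before the next crawl.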

You have successfully added Salesforce as a content source in SearchUnify. Perform a manual crawl to start indexing data in SearchUnify.

Related

Difference between Manual and Frequency Crawls

After the First Crawl

Return to the Content Sources screen and start the crawl from Actions. The number of indexed documents is updated after the crawl is complete. You can view crawl progress in Actions. Documentation on crawl progress is in View Crawl Logs.

Once the first crawl is complete, open the content source for editing from Actions and set a crawl frequency.

  1. In Choose a Date, click to open a calendar and select a date. Only data created after the selected date is indexed.

  2. The following options are available for the Frequency field:

    • When Never is selected, the content source is not crawled until an admin opts for a manual crawl on the Content Sources screen.

    • When Minutes is selected, a new dropdown appears where the admin can choose between four values: 2, 15, 20, and 30. Picking 20 means that the content source crawling starts every 20 minutes.

    • When Hours is selected, a new dropdown is displayed where the admin can choose among eight values: 1, 2, 3, 4, 6, 8, 12, and 24. Picking 8 means that the content source crawling starts every 8 hours.

    • When Daily is selected, a new dropdown is displayed where the admin can pick a value between 0 and 23. If 15 is chosen, then the content source crawling starts at 3 p.m. (1500 hours) every day.

    • When Day of Week is selected, a new dropdown is displayed where the admin can pick a day of the week. If Tuesday is chosen, then content source crawling starts at 0000 hours on every Tuesday.

    • When Day of Month is selected, a new dropdown appears where the admin can select a value between 1 and 30. If 20 is chosen, then content source crawling starts on the 20th of each month.

      It’s recommended to pick a date in the range 1-28. If 30 is chosen, then the crawler may throw an error in February. The error will be “Chosen date will not work for this month.”

    • When Yearly is selected, the content source crawling starts at midnight on 1 January each year.

  3. Click Set to save crawl frequency settings. On clicking Set, you are taken to the Rules tab.
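The frequency options above can be summarized as a function that computes the next crawl start. This is a sketch of the described behaviour, not SearchUnify's scheduler: the function name next_crawl is hypothetical, Day of Week is left out for brevity, and a midnight start for Day of Month is an assumption (the article states a start time only for Daily, Day of Week, and Yearly).

```python
import calendar
from datetime import datetime, timedelta
from typing import Optional

MINUTE_CHOICES = (2, 15, 20, 30)
HOUR_CHOICES = (1, 2, 3, 4, 6, 8, 12, 24)


def next_crawl(now: datetime, frequency: str, value: Optional[int] = None) -> Optional[datetime]:
    """Return the next crawl start after `now`, or None when frequency is Never."""
    if frequency == "Never":
        return None  # only manual crawls run
    if frequency == "Minutes":
        if value not in MINUTE_CHOICES:
            raise ValueError(f"Minutes must be one of {MINUTE_CHOICES}")
        return now + timedelta(minutes=value)
    if frequency == "Hours":
        if value not in HOUR_CHOICES:
            raise ValueError(f"Hours must be one of {HOUR_CHOICES}")
        return now + timedelta(hours=value)
    if frequency == "Daily":
        if value is None or not 0 <= value <= 23:
            raise ValueError("Daily expects an hour between 0 and 23")
        candidate = now.replace(hour=value, minute=0, second=0, microsecond=0)
        return candidate if candidate > now else candidate + timedelta(days=1)
    if frequency == "Day of Month":
        if value is None or not 1 <= value <= 30:
            raise ValueError("Day of Month expects a value between 1 and 30")
        year, month = now.year, now.month
        for _ in range(2):  # check this month, then the next one
            if value > calendar.monthrange(year, month)[1]:
                raise ValueError("Chosen date will not work for this month.")
            candidate = datetime(year, month, value)  # midnight is an assumption
            if candidate > now:
                return candidate
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    if frequency == "Yearly":
        return datetime(now.year + 1, 1, 1)  # midnight on 1 January
    raise ValueError(f"Frequency not covered by this sketch: {frequency}")


print(next_crawl(datetime(2025, 3, 5, 16, 0), "Daily", 15))
```

The Day of Month branch shows why a value in the range 1-28 is the safer choice: with 30, the check fails as soon as the schedule reaches February.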

A crawl usually fails for one of six reasons: (1) Salesforce APIs are down, (2) the authenticated user has been deleted from Salesforce, (3) the authenticated user doesn't have access to the object or field being crawled, (4) the credentials used to connect Salesforce with SearchUnify have expired, (5) a custom field has been removed from SearchUnify, or (6) there is an error in the formula in Manage Fields.

Manage Archived Articles

Archived articles are automatically removed from the SU index if the archiving rate is less than 200 articles per hour. However, if the archiving rate in your org exceeds that limit and you don't want the archived articles to be searchable, place a request with your CSM to delete the extraneous articles.
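As a worked example of the threshold, the archiving rate is the number of articles archived divided by the number of hours over which they were archived. The helper names below are hypothetical. A rate of exactly 200 is treated as over the limit here because the article only guarantees automatic removal below 200.

```python
AUTO_REMOVAL_LIMIT_PER_HOUR = 200  # threshold stated in this article


def archiving_rate(archived_count, hours):
    """Average number of articles archived per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return archived_count / hours


def removed_automatically(archived_count, hours):
    """True when the rate is below the limit, so no CSM request is needed."""
    return archiving_rate(archived_count, hours) < AUTO_REMOVAL_LIMIT_PER_HOUR


# 450 articles archived over 3 hours is 150 per hour, which is under the limit.
print(removed_automatically(450, 3))
# 1000 articles archived over 4 hours is 250 per hour, which is over the limit.
print(removed_automatically(1000, 4))
```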

Crawl Attachments

Files in Salesforce are stored in an object named contentdocument. The files can be linked to objects, such as case, knowledge, and feeds. You can make your files searchable in two ways.

In the first method, the files appear as independent search results. End-users can see sample-file.pdf and sample-file.doc in results after searching sample-file.

In the second method, the files don't appear in results, but the objects they are attached to do. On searching sample-file, end-users find only sample-object, to which sample-file.pdf and sample-file.doc are attached. They can then open the object and download the file.

Crawl the files as a separate object (contentdocument) to adopt the first method.
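The difference between the two methods can be illustrated with a toy model. This is not how SearchUnify matches queries. It only mirrors the sample-file example above, and the search function and record layout are hypothetical.

```python
# One record with two attached files, as in the example above.
records = [
    {"object": "sample-object", "files": ["sample-file.pdf", "sample-file.doc"]},
]


def search(query, records, files_as_separate_object):
    """Return result titles for a query under either attachment method."""
    results = []
    for record in records:
        matches = [name for name in record["files"] if query in name]
        if not matches:
            continue
        if files_as_separate_object:
            results.extend(matches)  # first method: each file is its own result
        else:
            results.append(record["object"])  # second method: only the parent object
    return results


print(search("sample-file", records, files_as_separate_object=True))
print(search("sample-file", records, files_as_separate_object=False))
```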

Crawl Depth

In Salesforce Knowledge, when the Category_nested field is crawled, you get an option to configure crawl depth through Number of Levels in Data Category Group Hierarchy in the Facets column of Search Clients > Edit > Content Sources.

You can select a value between 1 and 5 in Number of Levels in Data Category Group Hierarchy. The selected value is used and shown on the checked facet on a search client.
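One plausible reading of this setting is that it limits how many levels of a data category path a facet exposes. The sketch below illustrates that reading only. The function name, the list representation of a category path, and the trimming behaviour itself are assumptions rather than documented SearchUnify behaviour.

```python
def trim_category_path(path, levels):
    """Keep only the first `levels` entries of a category path (assumed behaviour)."""
    if not 1 <= levels <= 5:
        raise ValueError("levels must be between 1 and 5")
    return path[:levels]


# Hypothetical three-level category path, trimmed to two levels.
print(trim_category_path(["Products", "Hardware", "Laptops"], 2))
```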

Data Deletion and SU Index

From the perspective of a support manager, Knowledge articles in Salesforce are of three types:

  • Online articles are generally available to the public. These articles can be found through search.

  • Draft articles are generally only available internally. These articles can also be found by people with the right role.

  • Archived articles are generally old content that no one wants to see in search results.

SearchUnify cleans its index regularly so that users can find content corresponding to their role. All archived articles are removed from the index so that they aren't in the search results. Similarly, when a draft is published, its status is changed in the index so that everyone can find the newly-published article. Sometimes the opposite happens and a published article is moved back into drafts. This status is captured and the drafts become searchable only internally.
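The status handling described above can be sketched as a mapping from article status to index visibility. The function name, the status strings, and the visibility labels are illustrative assumptions. The sketch shows only the rules in this section, not the actual index format.

```python
# Assumed labels: "public" for everyone, "internal" for users with the right role.
VISIBILITY_BY_STATUS = {"Online": "public", "Draft": "internal"}


def rebuild_visibility(org_articles):
    """Map article IDs to index visibility; archived articles are left out."""
    index = {}
    for article_id, status in org_articles.items():
        if status == "Archived":
            continue  # archived articles are removed from the index
        if status not in VISIBILITY_BY_STATUS:
            raise ValueError(f"Unknown status: {status}")
        index[article_id] = VISIBILITY_BY_STATUS[status]
    return index


# Hypothetical article IDs: a2 is a published article moved back into drafts.
org_articles = {"a1": "Online", "a2": "Draft", "a3": "Archived"}
print(rebuild_visibility(org_articles))
```

Running the same function after a status change in the org reflects the new visibility, which is the effect the index cleaning is described as having.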

For SearchUnify instances on the Q2 '24 release, index cleaning runs with every frequency crawl: data deleted from the Salesforce org is removed from the SearchUnify index, and new articles are added to it.

For all objects except knowledge, the data deleted from the Salesforce org is removed from the SearchUnify index within three years.

However, because of an API limitation in Salesforce, archived feedback cannot be deleted from the index. As a result, a user may find archived feedback in search results even though it has already been removed from the org.

Troubleshooting