Use Document360 as a Content Source

Document360 is a cloud SaaS platform for knowledge management. It is popular among organizations that maintain public or private knowledge bases.

PERMISSIONS

SearchUnify ignores user permissions during searches. All indexed files can be searched by all users.

Establish a Connection

  1. Navigate to Content Sources and click Add New Content Sources.

  2. Find Document360 and click Add.

  3. Enter the following details for authentication:

    • Name. Insert a label for your content source.

    • Client URL. The value is always https://apihub.document360.io.

    • Language. Select the content language.

    • API Key. Enter the API token key. To obtain it, refer to How to Set up API token in Document360. NOTE. While generating the API token, select GET as an Allowed Method for authentication to succeed. Alternatively, you can select all four methods: GET, POST, PUT, and DELETE.

  4. Click Connect.

Once the connection has been set up successfully, you are taken to the next step: Set Frequency.
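
Before entering the API Key in SearchUnify, it can help to confirm that the token works with GET requests. The following is a minimal sketch and not part of SearchUnify. It assumes that Document360 accepts the token in an `api_token` request header and exposes a `/v2/ProjectVersions` endpoint under the Client URL; verify both against the Document360 API documentation. The function names `build_check_request` and `token_is_valid` are hypothetical.

```python
import urllib.error
import urllib.request

CLIENT_URL = "https://apihub.document360.io"  # the Client URL from the steps above


def build_check_request(api_token: str) -> urllib.request.Request:
    """Build a read-only GET request that carries the API token.

    Assumptions (verify in the Document360 API docs): the token travels in
    an `api_token` header and `/v2/ProjectVersions` is a valid GET endpoint.
    """
    return urllib.request.Request(
        url=f"{CLIENT_URL}/v2/ProjectVersions",
        headers={"api_token": api_token},
        method="GET",
    )


def token_is_valid(api_token: str, timeout: float = 10.0) -> bool:
    """Return True if the API answers the GET request with HTTP 200."""
    try:
        request = build_check_request(api_token)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        # A 4xx or 5xx answer, for example when the token is rejected
        # or GET was not selected as an allowed method.
        return False
```

Only a GET request is sent, which matches the minimum Allowed Method that the connector needs.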

Re-Connect

The Authentication screen is displayed when an existing Content Source is opened for editing. An admin can edit a Content Source for several reasons:

  • To reauthenticate

  • To fix a crawl error

  • To change frequency

  • To add or remove an object or a field for crawling

  • For multiple other reasons

When you edit a Content Source, one of two cases applies:

Case 1: There are no crawl errors and the Content Source authentication is valid. A Connected message is displayed.

Case 2: There is a crawl error, or the authentication details have changed. A Re-Connect button is displayed.

In Case 2, the SearchUnify Content Source connection must be re-authenticated. To do so, enter the authentication details and click Re-Connect.

Set Up Crawl Frequency

The first crawl is always manual and is performed after the content source is configured. In Choose A Date, select a start date; only data created after the selected date is crawled. For now, keep the frequency at its default value, Never, click Set, and move to the next section.

Select Content Types and Fields for Indexing

Each content type in Document360 has several properties, such as ID, title, and status. In the Rules section, an admin selects content types and fields for crawling and indexing. By default, the article and page content types are configured.

NOTE.

SearchUnify supports full-page HTML crawling for Document360, which can be used to generate rich snippets on the search page.

  1. In the Rules section, the By Content Type tab is selected by default.

  2. The article and page content types are already configured. Click the edit button in the Actions column to view or edit the fields of a content type.

  3. Editing or deleting a field that is already added is not recommended. NOTE: No custom fields can be added for crawling.

  4. Switch to By Categories. Use the alphabetical index to find categories. For example, Drive__ is shown under the letter D.

  5. Use the checkboxes to select the categories for indexing, then click Save.

You have successfully added Document360 as a content source in SearchUnify. Perform a manual crawl to start indexing data in SearchUnify.

Related

Difference between Manual and Frequency Crawls

After the First Crawl

Return to the Content Sources screen and start the crawl from the Actions column. The number of indexed documents is updated after the crawl is complete. You can also view crawl progress from the Actions column. Crawl progress is documented in View Crawl Logs.

Once the first crawl is complete, open the content source for editing from the Actions column and set a crawl frequency.

  1. In Choose a Date, click to open a calendar and select a date. Only the data created after the selected date is indexed.

  2. The following options are available for the Frequency field:

    • When Never is selected, the content source is not crawled until an admin opts for a manual crawl on the Content Sources screen.

    • When Minutes is selected, a new dropdown appears where the admin can choose between three values: 15, 20, and 30. Picking 20 means that the content source crawling starts every 20 minutes.

    • When Hours is selected, a new dropdown is displayed where the admin can choose among eight values: 1, 2, 3, 4, 6, 8, 12, and 24. Picking 8 means that the content source crawling starts every 8 hours.

    • When Daily is selected, a new dropdown is displayed where the admin can pick a value between 0 and 23. If 15 is chosen, then the content source crawling starts at 3 p.m. (1500 hours) every day.

    • When Day of Week is selected, a new dropdown is displayed where the admin can pick a day of the week. If Tuesday is chosen, then content source crawling starts at 0000 hours on every Tuesday.

    • When Day of Month is selected, a new dropdown appears where the admin can select a value between 1 and 30. If 20 is chosen, then content source crawling starts on the 20th of each month.

      It’s recommended to pick a date in the range 1-28. If 30 is chosen, then the crawler may throw an error in February. The error will be “Chosen date will not work for this month.”

    • When Yearly is selected, the content source crawling starts at midnight on 1 January each year.

  3. Click Set to save crawl frequency settings. On clicking Set, you are taken to the Rules tab.
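
The frequency options above can be read as rules for computing the next crawl start. The sketch below is an illustration only, not SearchUnify's scheduler. The function `next_crawl` and its parameters are hypothetical, timestamps are naive `datetime` values, interval options are measured from `now`, and a midnight start for Day of Month is an assumption that the text above does not state.

```python
import calendar
from datetime import datetime, timedelta

MINUTE_CHOICES = {15, 20, 30}
HOUR_CHOICES = {1, 2, 3, 4, 6, 8, 12, 24}
# Same order as datetime.weekday(): Monday is 0.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def next_crawl(frequency, value, now):
    """Return the next crawl start after `now`, or None for Never."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if frequency == "Never":
        return None  # only manual crawls run
    if frequency == "Minutes":
        if value not in MINUTE_CHOICES:
            raise ValueError("Minutes must be 15, 20, or 30")
        return now + timedelta(minutes=value)
    if frequency == "Hours":
        if value not in HOUR_CHOICES:
            raise ValueError("Hours must be 1, 2, 3, 4, 6, 8, 12, or 24")
        return now + timedelta(hours=value)
    if frequency == "Daily":
        if value not in range(24):
            raise ValueError("Daily takes an hour between 0 and 23")
        run = midnight + timedelta(hours=value)
        return run if run > now else run + timedelta(days=1)
    if frequency == "Day of Week":
        days_ahead = (WEEKDAYS.index(value) - now.weekday()) % 7
        run = midnight + timedelta(days=days_ahead)  # 0000 hours on that day
        return run if run > now else run + timedelta(days=7)
    if frequency == "Day of Month":
        if value not in range(1, 31):
            raise ValueError("Day of Month takes a value between 1 and 30")
        year, month = now.year, now.month
        while True:
            if value > calendar.monthrange(year, month)[1]:
                # For example, 30 in February
                raise ValueError("Chosen date will not work for this month.")
            run = datetime(year, month, value)  # assumed midnight start
            if run > now:
                return run
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    if frequency == "Yearly":
        return datetime(now.year + 1, 1, 1)  # midnight on 1 January
    raise ValueError(f"Unknown frequency: {frequency}")
```

For example, with `now = datetime(2024, 2, 6, 10, 30)`, a Tuesday, `next_crawl("Daily", 15, now)` returns 15:00 on the same day, while `next_crawl("Day of Month", 30, now)` raises the error because February 2024 has only 29 days.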

Limitations

  • Update crawls don't work during frequency crawls because Document360 doesn't offer an API to retrieve only updated data. All data is crawled and indexed again in each frequency crawl, so set the frequency to a daily, weekly, or monthly interval.
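
The cost of this limitation can be estimated with simple arithmetic. The sketch below assumes a hypothetical knowledge base of 5,000 documents; the function names are illustrative and the figures are not measurements.

```python
MINUTES_PER_DAY = 24 * 60


def full_crawls_per_day(interval_minutes: int) -> float:
    """Number of full crawls that start in one day at the given interval."""
    return MINUTES_PER_DAY / interval_minutes


def documents_fetched_per_day(total_documents: int, interval_minutes: int) -> int:
    """Every frequency crawl fetches all documents again, so the daily
    load is the document count multiplied by the crawls per day."""
    return round(total_documents * full_crawls_per_day(interval_minutes))


# Hypothetical knowledge base of 5,000 documents
every_15_minutes = documents_fetched_per_day(5000, 15)            # 96 crawls a day
once_a_day = documents_fetched_per_day(5000, MINUTES_PER_DAY)     # 1 crawl a day
```

Under these assumptions, a 15-minute frequency fetches 480,000 documents a day, compared with 5,000 for a daily crawl, which is why longer intervals are the better fit.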