Use Confluence As a Content Source
SearchUnify can index the pages and blogs stored in your Confluence instance. This article walks you through the process of setting up Confluence as a content source.
PERMISSIONS
Both space-level and page-level permissions are supported.
Space-level permissions. These control access to the space that hosts the content (pages).
Page-level permissions. In Confluence, permissions can also be defined on individual pages. This has an important implication: if a user creates a page and restricts it so that it is visible only to them, no other user can access that page, even if they have access to the parent space. SearchUnify supports such page-level restrictions.
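The way the two permission levels combine can be sketched in a few lines of Python. This is only an illustrative model of the rule described above, not SearchUnify's or Confluence's implementation; the `Page` class, the `can_view` function, and the sample users and space key are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    space: str
    # An empty set means the page has no page-level view restriction.
    restricted_to: set[str] = field(default_factory=set)


def can_view(user: str, page: Page, space_access: dict[str, set[str]]) -> bool:
    """Return True only if both the space-level and page-level checks pass."""
    has_space_access = user in space_access.get(page.space, set())
    passes_page_restriction = not page.restricted_to or user in page.restricted_to
    return has_space_access and passes_page_restriction


# Hypothetical sample data: two users with access to one space.
space_access = {"ENG": {"alice", "bob"}}
public_page = Page("Onboarding", "ENG")
private_page = Page("Draft notes", "ENG", restricted_to={"alice"})

print(can_view("bob", public_page, space_access))     # True
print(can_view("bob", private_page, space_access))    # False
print(can_view("alice", private_page, space_access))  # True
```

In this model, bob can open the space but not the page that alice restricted to herself, which mirrors the behavior a permission-aware search result list would be expected to show.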
Establish a Connection
- Navigate to Content Sources.
- Click Add New Content Source.
- Find "Confluence" through the search box and click Add.
- Give your content source a name.
- In Client URL, enter the web address of your Confluence instance followed by wiki/.
- Select an Authentication Method.
- Basic. Crawls only the data that the login ID can access. Requires your Confluence login ID and an API token. See How to Create an API Token in Confluence or Jira?
- OAuth. Crawls all the data in your Confluence instance. Requires the creation of an app first. Click here to learn how to create the app.
- Click Connect.
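Before entering the values in the form, it can help to see what a well-formed Client URL and a Basic credential pair look like. The Python sketch below only builds the two values and makes no network call. The function names, the example address, and the token are hypothetical, and the sketch assumes that the login ID and API token are combined as a standard HTTP Basic credential; how SearchUnify sends them internally is not described in this article.

```python
import base64


def build_client_url(instance_url: str) -> str:
    """Append wiki/ to the instance address unless it is already there."""
    base = instance_url.strip().rstrip("/")
    if not base.endswith("/wiki"):
        base += "/wiki"
    return base + "/"


def basic_auth_header(login_id: str, api_token: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from a login ID and API token."""
    raw = f"{login_id}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Hypothetical instance address and credentials, for illustration only.
client_url = build_client_url("https://example.atlassian.net")
headers = basic_auth_header("user@example.com", "hypothetical-token")
print(client_url)  # https://example.atlassian.net/wiki/
```

The helper accepts the address with or without a trailing slash or an existing wiki/ suffix, so the result always ends in exactly one wiki/.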
Set Up Crawl Frequency
The first crawl is always manual and is performed after the content source has been configured. For now, leave the frequency at its default value, Never, click Set, and move to the next section.
Select Types and Fields for Indexing
SearchUnify can index Confluence pages and blogs. You can choose to index them both, or select just one of them. You can further index all blog and page fields, or only a few of them.
- Click [icon] to select content fields.
- Use the dropdown in the Name column to add content fields one at a time.
- OPTIONAL. SearchUnify assigns each field a label, a type, and either an isSearchable or isFilterable tag. The values don't require a change, but advanced users can edit them.
- Press Save.
- Repeat steps 2-5 for the second content type.
- Navigate to By Place.
- Use the index to find your projects and check Enable for each one.
- Press Save.
You have successfully set up Confluence as a content source.
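The field settings chosen in this section can be pictured as a small table with one row per field. The Python sketch below models that table under stated assumptions: the `FieldConfig` class, the field names `title` and `space`, and the type value `string` are hypothetical and are not taken from SearchUnify. Only the label, type, and isSearchable or isFilterable attributes come from the steps above.

```python
from dataclasses import dataclass

# The two tags named in the steps above.
ALLOWED_TAGS = {"isSearchable", "isFilterable"}


@dataclass
class FieldConfig:
    name: str
    label: str
    type: str
    tag: str

    def __post_init__(self):
        # Reject any tag other than the two mentioned in the article.
        if self.tag not in ALLOWED_TAGS:
            raise ValueError(f"unknown tag: {self.tag}")


# Hypothetical field selection for the page content type.
page_fields = [
    FieldConfig("title", "Title", "string", "isSearchable"),
    FieldConfig("space", "Space", "string", "isFilterable"),
]

searchable = [f.name for f in page_fields if f.tag == "isSearchable"]
print(searchable)  # ['title']
```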
First Crawl
Return to the Content Sources screen and click [icon] in Actions to start the crawl. The number of indexed documents is updated after the crawl is complete. You can view crawl progress by clicking [icon] in Actions. Documentation on crawl progress is in View Crawl Logs.
NOTE 1
Review the settings in Rules if there is no progress in Crawl Logs.
NOTE 2
For Mamba '22 and newer instances, search isn't impacted during a crawl. However, in older instances, some documents remain inaccessible while a crawl is going on.
Once the first crawl is complete, click [icon] in Actions to open the content source for editing, and set a crawl frequency.
- In Choose a Date, click [icon] to open a calendar and select a date. Only the data after the selected date is indexed.
- Use the Frequency dropdown to select how often SearchUnify should index the data. For example, you can set the frequency to Weekly and choose Tuesday as the crawl day. Whenever Frequency is set to a value other than Never, a third dropdown appears where you can specify the interval. When Frequency is set to Hourly, manual crawls are disabled.
- Click Set to save the crawl frequency settings. You are then taken to the Rules tab.
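The effect of the date and frequency settings can be sketched as follows. This Python example is only a model of the behavior described in the steps above, not SearchUnify's scheduler. The function names and the `date` key on each document are hypothetical, and the sketch assumes that the selected date itself is excluded, which the article does not specify.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday() counts Monday as 0


def next_weekly_crawl(today: date, crawl_weekday: int = TUESDAY) -> date:
    """Return the next date strictly after `today` that falls on crawl_weekday."""
    days_ahead = (crawl_weekday - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=days_ahead)


def dated_after(docs: list[dict], cutoff: date) -> list[dict]:
    """Keep only documents dated after the cutoff (cutoff day excluded by assumption)."""
    return [d for d in docs if d["date"] > cutoff]


# Hypothetical documents with a date attached to each.
docs = [
    {"title": "Old page", "date": date(2023, 6, 1)},
    {"title": "New page", "date": date(2024, 2, 1)},
]

print(next_weekly_crawl(date(2024, 1, 1)))  # 2024-01-02
print([d["title"] for d in dated_after(docs, date(2024, 1, 1))])  # ['New page']
```

With a Weekly frequency and Tuesday as the crawl day, a crawl day that is already under way is skipped in this model and the following Tuesday is returned.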