The first step was to set up the organic crawl of our internal sites, which largely consisted of listing the appropriate entry points:
Screenshot of URL entry points in Search & Promote
We then defined their corresponding URL masks (note the test feature, which lets you try your masks before saving them):
Screenshot of URL masks in Search & Promote
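To make the idea of URL masks concrete, here is a minimal sketch in Python of how include/exclude masks might narrow a crawl. It is not Search & Promote's actual mask syntax or matching logic: the `MASKS` list, the `is_crawlable` helper, the example.edu URLs, and the "first matching mask wins" rule are all assumptions made for illustration.

```python
# Hypothetical masks as (action, URL prefix) pairs.
# This is an illustration only, not real Search & Promote syntax.
MASKS = [
    ("exclude", "https://www.example.edu/archive/"),
    ("include", "https://www.example.edu/"),
]


def is_crawlable(url, masks=MASKS):
    """Return True if the first mask that matches the URL is an include mask.

    Assumption: masks are checked in order and the first match decides.
    """
    for action, prefix in masks:
        if url.startswith(prefix):
            return action == "include"
    # No mask matched, so treat the URL as outside the crawl.
    return False
```

With these assumed masks, a page under `/archive/` is skipped even though it sits below an included entry point, which is the kind of trimming the mask test feature lets you check before saving.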
Search & Promote is licensed by the number of pages crawled: your license covers a certain number of pages, and any pages beyond that are not added to your index. It took a bit of tweaking to figure out what that level was. Helpfully, there’s a cool feature in Search & Promote that lets the crawl continue and counts the number of pages by which you’ve gone over, so you at least have an idea of where you are. From there you can either increase your licensed limit, or identify the larger-than-expected sites and pare down the number of pages found by using the error logs and URL masks.
Compensating for the lack of SEO content
One of the issues I’d talked about previously was a lack of the bare minimum SEO metadata across many sites, most of which we had no direct control over. We tackled this by using the metatag injection feature in Search & Promote, which can be configured to dynamically inject metadata during a crawl, based on a URL pattern. This metadata is then included in the index as if it were already embedded within each page, and can range from standard title/description metatags to custom tags that can be used to create search filters (facets).
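The following Python sketch illustrates the general idea of pattern-based metadata injection. It is not how Search & Promote implements the feature: the `INJECTION_RULES` table, the `inject_metadata` helper, the example.edu URLs, and the precedence rule (values already on the page win, and earlier rules win over later ones) are assumptions chosen for illustration.

```python
import re

# Hypothetical rules: a URL pattern paired with the metadata to inject.
INJECTION_RULES = [
    (
        re.compile(r"^https://www\.example\.edu/library/"),
        {"description": "Library services and resources", "category": "Library"},
    ),
    (
        re.compile(r"^https://www\.example\.edu/"),
        {"category": "General"},
    ),
]


def inject_metadata(url, page_meta, rules=INJECTION_RULES):
    """Return the page's metadata plus values from every rule matching the URL.

    Assumption: metadata already on the page is never overwritten, and
    earlier rules take precedence over later ones.
    """
    merged = dict(page_meta)  # copy so the caller's dict is left untouched
    for pattern, injected in rules:
        if pattern.search(url):
            for key, value in injected.items():
                merged.setdefault(key, value)
    return merged
```

For a page at `https://www.example.edu/library/hours` that only supplies a title, the result would carry the injected description plus a `category` value that could drive a facet.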
We soon found, however, that a significant portion of internal content required authentication to access, which meant that the crawler could not get into that content. The Search & Promote crawler can be given credentials to access that content; however, our concern was that the content was authenticated for a reason, and that showing even a title or extract from authenticated content on a public search might give away too much.
Given that the “we can’t find anything!” comment included authenticated content and applications, we needed an alternate option for this implementation to be successful.
At Murdoch we have a database called the A-Z index, which is maintained by our IT area and, over the past 5-6 years, has grown to include an entry for most of our authenticated content and applications. This was a perfect source of information; now we needed to somehow incorporate this content into our search results.
Enter a feature in Search & Promote called ‘index connectors’.
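As a rough illustration of feeding database entries into a search index, the Python sketch below turns a few A-Z style records into a simple XML feed. Everything in it is assumed: the `AZ_ENTRIES` records, the `build_feed` helper, and the `items`/`item`/`title`/`url`/`description` element names are hypothetical. The format an index connector actually expects should be taken from the product documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, standing in for entries from an A-Z style database.
AZ_ENTRIES = [
    {
        "title": "Staff Timesheets",
        "url": "https://apps.example.edu/timesheets",
        "description": "Submit and approve timesheets (login required).",
    },
    {
        "title": "Room Bookings",
        "url": "https://apps.example.edu/rooms",
        "description": "Book meeting rooms and teaching spaces.",
    },
]


def build_feed(entries):
    """Serialise the entries as one XML document, with one <item> per entry."""
    root = ET.Element("items")
    for entry in entries:
        item = ET.SubElement(root, "item")
        for field in ("title", "url", "description"):
            ET.SubElement(item, field).text = entry[field]
    return ET.tostring(root, encoding="unicode")
```

The appeal of this approach is that the search results only expose the curated title, link, and description from the database, rather than anything pulled from behind the login.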