Reference:BlueSpiceExtendedSearch: Difference between revisions

No edit summary
No edit summary
Tag: 2017 source edit
 
(5 intermediate revisions by the same user not shown)
Line 7: Line 7:
|category=Search and Navigation
|category=Search and Navigation
|license=GPL v3 only
|license=GPL v3 only
|features='''ExtendedSearch''' replaces the default MediaWiki search engine. It is based on ElasticSearch engine, and provides many improvements over the standard MediaWiki search both in terms of quality of indexed content and user interface.  
|features='''ExtendedSearch''' replaces the default MediaWiki search engine. It is based on OpenSearch (up to BlueSpice 4.3: Elasticsearch), and provides many improvements over the standard MediaWiki search both in terms of quality of indexed content and user interface.  


*Search titles.
*Search titles.
Line 28: Line 28:
*exclude patterns
*exclude patterns


===Boosting (server configuration)===
==Boosting (server configuration)==
For boosting by match percent, the following settings are available in the extension itself:<syntaxhighlight lang="json">
For boosting by match percent, the following settings are available in the extension itself:<syntaxhighlight lang="json">
"ESRecentBoostFactor": {
"description": "Value must be between 0 and 1. If set to 1, very recent pages will almost double their score",
"public": false,
"value": "0.5"
},
"ESMatchPercentBoostFactor": {
"ESMatchPercentBoostFactor": {
            "description": "How much to boost the result based on the percent of its title taken up by the search term. Set to 0 to disable",
"description": "How much to boost the result based on the percent of its title taken up by the search term. Set to 0 to disable",
            "public": false,
"public": false,
            "value": "0.5"
"value": "0.5"
        },
},
        "ESMatchPercentTitleField": {
"ESMatchPercentTitleField": {
            "description": "Field on which to base the match percent boosting. If empty, default title field of the source will be used",
"description": "Field on which to base the match percent boosting. If empty, default title field of the source will be used",
            "public": false,
"public": false,
            "value": ""
"value": ""
        },
}
</syntaxhighlight>
</syntaxhighlight>




====System-wide boosting====
===System-wide boosting===
BlueSpice influences the score, after the results have been retrieved, in the following ways:
By default, BlueSpice influences the score, after the results have been retrieved, in the following ways:


* '''Creation date / last edit:''' All things equal, more recently edited pages will rank higher. The amount of influence can be adjusted with '$bsgESRecentBoostFactor' (default 0.5). The value for this can be a number between 0 and 1. 0 means that recency of the page does not change the score and 1 means that very recent changes might get their score doubled.
*'''Creation date / last edit:''' All things equal, more recently edited pages will rank higher. The amount of influence can be adjusted with '$bsgESRecentBoostFactor' (default 0.5). The value for this can be a number between 0 and 1. 0 means that recency of the page does not change the score and 1 means that very recent changes might get their score doubled.
* '''Percent of the page title taken up by the search term: ''' An additional score boost is assigned to results that match a larger part of the title. E.g., if the search term is “foo”, a page called “Foo Bar” will have a good match percentage (50%), since the search term takes 50% of the full page name. The page “Foo quick brown fox jumps” will have a lower percentage boost, because the search term takes up only a small percentage of the whole title. This is controlled by '$bsgESMatchPercentBoostFactor' (default 0.5). Same as for the previous setting, a value of 0 will effectively disable this kind of boosting, while 1 would double the score for exact matches (In our example, page “Foo Bar” would get 50% added to its original score).
*'''Percent of the page title taken up by the search term: ''' An additional score boost is assigned to results that match a larger part of the title. E.g., if the search term is “foo”, a page called “Foo Bar” will have a good match percentage (50%), since the search term takes 50% of the full page name. The page “Foo quick brown fox jumps” will have a lower percentage boost, because the search term takes up only a small percentage of the whole title. This is controlled by '$bsgESMatchPercentBoostFactor' (default 0.5). Same as for the previous setting, a value of 0 will effectively disable this kind of boosting, while 1 would double the score for exact matches (In our example, page “Foo Bar” would get 50% added to its original score).
* Wikipages will always be boosted a bit more than other results types (repofile, socialentity, …)
*'''Wikipages''' will always be boosted a bit more than other results types (repofile, socialentity, …)
* Inside the “wikipage” type, pages in content namespaces and especially pages is NS_MAIN will be boosted additionally. Talk pages will receive no boost.
*Inside the “wikipage” type, pages in '''content namespaces''' and especially pages is '''NS_MAIN''' will be boosted additionally. Talk pages will receive no boost.


Additional configuration is possible for:
*'''ESMatchPercentTitleField:''' Field on which the match analysis is performed. Defaults to nothing, so whatever the search lookup sets. It can be set to "prefixed_text" to calculate match percent on the title with namespace prefix, for example, or to something like "basename" to calculate the match percent only on the page name, without the namespace prefix.


*'''ESMatchPercentTitleField:''' Field on which the match analysis is performed. Defaults to nothing, so whatever the search lookup sets. It can be set to "prefixed_text" to calculate match percent on the title with namespace prefix, for example, or to something like "basename" to calculate the match percent only on the page name, without the namespace prefix.
 
|desc=Full-text search in articles and files, faceted search, fuzzy search, spellchecker and sorting as well as search-as-you-type and auto-complete functionality.
===User-related boosting===
 
Users can adjust how pages are ranked individually. Such adjustments will only be applied for search queries made by that user and do not affect global search ranking.
 
*'''Preferred namespace:''' On the page ''Special:Preferences'' (tab "Extended search"), a user can prioritize namespaces, so that pages in those namespaces will be ranked a bit higher.
*'''Favourite results''': In the list of search results, each result has a little "star"-button in the top right corner. Clicking this marks a result as “favorite” for the user who clicked it. This means that this particular result will be boosted more than other results.
}}
}}
{{wcagCheck
{{wcagCheck
Line 60: Line 72:
|wcagTestdate=2022-08-08
|wcagTestdate=2022-08-08
|wcagLevel=AA
|wcagLevel=AA
|wcagSupport=does not support
|wcagSupport=partially supports
|wcagWorkaround=no
|wcagComments=*filter pills, buttons: don't indicate focus, so they cannot easily be opened with keyboard although technically it would already work.
|wcagComments=*filter pills, buttons: don't indicate focus, so they cannot easily be opened with keyboard although technically it would already work.
*all buttons announce “blank” and cannot be distinguished.
*all buttons announce “blank” and cannot be distinguished.

Latest revision as of 14:49, 10 January 2024

Extension: BlueSpiceExtendedSearch


Overview
Description:

Elasticsearch search backend

State: stable
Dependency: BlueSpice
Developer: HalloWelt
License: GPL-3.0-only
Type: BlueSpice
Category: Search and Navigation
Edition: BlueSpice pro, BlueSpice free, BlueSpice Farm, BlueSpice Cloud
Version: 4.1+

Features

ExtendedSearch replaces the default MediaWiki search engine. It is based on OpenSearch (up to BlueSpice 4.3: Elasticsearch), and provides many improvements over the standard MediaWiki search both in terms of quality of indexed content and user interface.

  • Search titles.
  • Search the full content.
  • Search uploaded or linked files (Office documents and PDFs).
  • Search image data.
  • Search-as-you-type and auto-complete.
  • Ignore upper and lower case (case-insensitive).
  • Search with the operators AND, OR, NOT.
  • Search with wildcards.
  • Search for phrases.
  • Fuzzy search.
  • Search for sentence fragments.

Some aspects of this extension can be configured on Special:BlueSpiceConfigManager, under section "ExtendedSearch". Here wiki administrators can configure:

  • external file paths
  • layout of the autocomplete box
  • language filter
  • exclude patterns

Boosting (server configuration)

For boosting by recency and by match percent, the following settings are available in the extension itself:

    "ESRecentBoostFactor": {
        "description": "Value must be between 0 and 1. If set to 1, very recent pages will almost double their score",
        "public": false,
        "value": "0.5"
    },
    "ESMatchPercentBoostFactor": {
        "description": "How much to boost the result based on the percent of its title taken up by the search term. Set to 0 to disable",
        "public": false,
        "value": "0.5"
    },
    "ESMatchPercentTitleField": {
        "description": "Field on which to base the match percent boosting. If empty, default title field of the source will be used",
        "public": false,
        "value": ""
    }
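The ranges described for these settings can be checked mechanically before a configuration is deployed. The following is a minimal sketch in Python, not part of the extension: the helper name `check_boost_settings` is hypothetical, the fragment is wrapped in braces so that it parses as JSON, and the 0 to 1 range for `ESMatchPercentBoostFactor` is an assumption carried over from the recency setting.

```python
import json

# Values copied from the settings fragment, wrapped in braces to form valid JSON.
SETTINGS_JSON = """
{
  "ESRecentBoostFactor": {"public": false, "value": "0.5"},
  "ESMatchPercentBoostFactor": {"public": false, "value": "0.5"},
  "ESMatchPercentTitleField": {"public": false, "value": ""}
}
"""

def check_boost_settings(raw):
    """Return a list of problems found in the two boost factors (hypothetical helper)."""
    settings = json.loads(raw)
    problems = []
    for name in ("ESRecentBoostFactor", "ESMatchPercentBoostFactor"):
        value = settings[name]["value"]
        try:
            factor = float(value)  # the values are stored as strings such as "0.5"
        except ValueError:
            problems.append(f"{name}: {value!r} is not a number")
            continue
        # Assumption: both factors are meant to stay within 0..1.
        if not 0.0 <= factor <= 1.0:
            problems.append(f"{name}: {factor} is outside 0..1")
    return problems

print(check_boost_settings(SETTINGS_JSON))  # prints [] for the defaults
```

An empty list means both factors parse as numbers and fall inside the assumed range.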


System-wide boosting

By default, BlueSpice influences the score, after the results have been retrieved, in the following ways:

  • Creation date / last edit: All else being equal, more recently edited pages rank higher. The amount of influence is set with '$bsgESRecentBoostFactor' (default 0.5), which takes a number between 0 and 1. A value of 0 means that the recency of a page does not change its score; a value of 1 means that very recently changed pages can have their score almost doubled.
  • Percent of the page title taken up by the search term: An additional score boost is assigned to results that match a larger part of the title. For example, if the search term is “foo”, a page called “Foo Bar” has a good match percentage (50%), since the search term takes up 50% of the full page name. The page “Foo quick brown fox jumps” gets a lower boost, because the search term takes up only a small percentage of the whole title. This is controlled by '$bsgESMatchPercentBoostFactor' (default 0.5). As with the previous setting, a value of 0 effectively disables this kind of boosting, while a value of 1 doubles the score for exact matches (with a value of 1, the page “Foo Bar” from the example would get 50% added to its original score).
  • Wikipages are always boosted a bit more than other result types (repofile, socialentity, …).
  • Within the “wikipage” type, pages in content namespaces, and especially pages in NS_MAIN, receive an additional boost. Talk pages receive no boost.

Additional configuration is possible for:

  • ESMatchPercentTitleField: Field on which the match analysis is performed. It is empty by default, in which case the title field set by the search lookup (the default title field of the source) is used. It can be set to "prefixed_text", for example, to calculate the match percent on the title including the namespace prefix, or to something like "basename" to calculate the match percent only on the page name, without the namespace prefix.
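The arithmetic behind the match-percent boost can be sketched as follows. This is an illustration under assumptions, not the extension's actual implementation: the match percentage is taken as the share of title words that occur in the search term (which reproduces the 50% for “Foo Bar”), and the boost is applied as score × (1 + factor × match share), which is consistent with a factor of 1 doubling the score of an exact match. Both function names are hypothetical.

```python
def title_match_fraction(search_term, title):
    """Share of the title's words that also occur in the search term (case-insensitive)."""
    title_words = title.lower().split()
    term_words = set(search_term.lower().split())
    if not title_words:
        return 0.0
    matched = sum(1 for word in title_words if word in term_words)
    return matched / len(title_words)

def boosted_score(score, search_term, title, boost_factor=0.5):
    """Assumed formula: score * (1 + boost_factor * match share)."""
    return score * (1.0 + boost_factor * title_match_fraction(search_term, title))

# With a factor of 1, "Foo Bar" (50% match) gets 50% added to its score.
print(boosted_score(10.0, "foo", "Foo Bar", boost_factor=1.0))  # 15.0
# A longer title matches a smaller share, so the boost is smaller.
print(boosted_score(10.0, "foo", "Foo quick brown fox jumps", boost_factor=1.0))
# A factor of 0 disables the boost.
print(boosted_score(10.0, "foo", "Foo Bar", boost_factor=0.0))  # 10.0
```

Setting ESMatchPercentTitleField to a field that includes the namespace prefix would, under the same assumptions, lower the match share for pages outside the main namespace, because the prefix adds to the title length.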


User-related boosting

Users can adjust how pages are ranked individually. Such adjustments are only applied to search queries made by that user and do not affect the global search ranking.

  • Preferred namespace: On the page Special:Preferences (tab "Extended search"), a user can prioritize namespaces, so that pages in those namespaces are ranked a bit higher.
  • Favorite results: In the list of search results, each result has a small "star" button in the top right corner. Clicking it marks the result as a “favorite” for the user who clicked it, so that this particular result is boosted more than other results for that user.

Technical Information

This information applies to BlueSpice 4. Technical details for BlueSpice Cloud can differ in some cases.

Requirements

  • MediaWiki: 1.37.0
  • BlueSpiceFoundation: 4.1

Integrates into

  • BlueSpiceArticleInfo
  • BlueSpiceExtendedSearch
  • BlueSpiceExtendedStatistics
  • BlueSpicePrivacy
  • BlueSpiceSimpleFarmer
  • BlueSpiceTagCloud
  • BlueSpiceVisualEditorConnector
  • ContentDroplets
  • VisualEditor

Special pages

  • BSSearchAdmin
  • BSSearchCenter

Permissions

Name | Description | Role
extendedsearch-search-externalfile | Search for external files | accountmanager, admin, author, bot, commenter, editor, maintenanceadmin, reader, reviewer, structuremanager
extendedsearch-search-repofile | Search for files | accountmanager, admin, author, bot, commenter, editor, maintenanceadmin, reader, reviewer, structuremanager
extendedsearch-search-specialpage | Search for special pages | accountmanager, admin, author, bot, commenter, editor, maintenanceadmin, reader, reviewer, structuremanager
extendedsearch-search-wikipage | Search for pages | accountmanager, admin, author, bot, commenter, editor, maintenanceadmin, reader, reviewer, structuremanager

Configuration

Name Value
ESAllowIndexingDocumentsWithoutContent true
ESAutoRecognizeSubpages true
ESAutoSetLangFilter false
ESBackendClass '\\BS\\ExtendedSearch\\Backend'
ESBackendHost '127.0.0.1'
ESBackendPassword ''
ESBackendPort '9200'
ESBackendTransport 'https'
ESBackendUsername ''
ESCompactAutocomplete true
ESDefaultSearchOperator 'AND'
ESEnableSearchHistoryTracking true
ESEnableTypeFilter true
ESExternalFilePaths array ( )
ESIndexPrefix ''
ESLookupModifierRegExPatterns array ( 0 => '[0-9]{2}\\-[0-9]{2}\\-[0-9]{4}', 1 => '[0-9]{4}\\-[0-9]{2}\\-[0-9]{2}', 2 => '[0-9]{2}\\-[0-9]{4}\\-[0-9]{2}', 3 => '[0-9]{2}\\/[0-9]{2}\\/[0-9]{4}', 4 => '[0-9]{4}\\/[0-9]{2}\\/[0-9]{2}', 5 => '[0-9]{2}\\/[0-9]{4}\\/[0-9]{2}', 6 => '[0-9]{2}\\.[0-9]{2}\\.[0-9]{4}', 7 => '[0-9]{4}\\.[0-9]{2}\\.[0-9]{2}', 8 => '[0-9]{2}\\.[0-9]{4}\\.[0-9]{2}', 9 => '[0-9]{2}\\\\[0-9]{2}\\\\[0-9]{4}', 10 => '[0-9]{4}\\\\[0-9]{2}\\\\[0-9]{2}', 11 => '[0-9]{2}\\\\[0-9]{4}\\\\[0-9]{2}', 12 => '[0-9]{1}\\-[0-9]{2}\\-[0-9]{4}', 13 => '[0-9]{4}\\-[0-9]{2}\\-[0-9]{1}', 14 => '[0-9]{1}\\-[0-9]{4}\\-[0-9]{2}', 15 => '[0-9]{4}\\-[0-9]{1}\\-[0-9]{2}', 16 => '[0-9]{2}\\-[0-9]{4}\\-[0-9]{1}', 17 => '[0-9]{2}\\-[0-9]{1}\\-[0-9]{4}', 18 => '[0-9]{1}\\/[0-9]{2}\\/[0-9]{4}', 19 => '[0-9]{4}\\/[0-9]{2}\\/[0-9]{1}', 20 => '[0-9]{1}\\/[0-9]{4}\\/[0-9]{2}', 21 => '[0-9]{4}\\/[0-9]{1}\\/[0-9]{2}', 22 => '[0-9]{2}\\/[0-9]{4}\\/[0-9]{1}', 23 => '[0-9]{2}\\/[0-9]{1}\\/[0-9]{4}', 24 => '[0-9]{1}\\.[0-9]{2}\\.[0-9]{4}', 25 => '[0-9]{4}\\.[0-9]{2}\\.[0-9]{1}', 26 => '[0-9]{1}\\.[0-9]{4}\\.[0-9]{2}', 27 => '[0-9]{4}\\.[0-9]{1}\\.[0-9]{2}', 28 => '[0-9]{2}\\.[0-9]{4}\\.[0-9]{1}', 29 => '[0-9]{2}\\.[0-9]{1}\\.[0-9]{4}', 30 => '[0-9]{1}\\\\[0-9]{2}\\\\[0-9]{4}', 31 => '[0-9]{4}\\\\[0-9]{2}\\\\[0-9]{1}', 32 => '[0-9]{1}\\\\[0-9]{4}\\\\[0-9]{2}', 33 => '[0-9]{4}\\\\[0-9]{1}\\\\[0-9]{2}', 34 => '[0-9]{2}\\\\[0-9]{4}\\\\[0-9]{1}', 35 => '[0-9]{2}\\\\[0-9]{1}\\\\[0-9]{4}', 36 => '[0-9]{2}\\-[0-9]{2}\\-[0-9]{2}', 37 => '[0-9]{2}\\/[0-9]{2}\\/[0-9]{2}', 38 => '[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}', 39 => '[0-9]{2}\\\\[0-9]{2}\\\\[0-9]{2}', 40 => '[0-9]{1}\\-[0-9]{1}\\-[0-9]{4}', 41 => '[0-9]{4}\\-[0-9]{1}\\-[0-9]{1}', 42 => '[0-9]{1}\\-[0-9]{4}\\-[0-9]{1}', 43 => '[0-9]{1}\\/[0-9]{1}\\/[0-9]{4}', 44 => '[0-9]{4}\\/[0-9]{1}\\/[0-9]{1}', 45 => '[0-9]{1}\\/[0-9]{4}\\/[0-9]{1}', 46 => '[0-9]{1}\\.[0-9]{1}\\.[0-9]{4}', 47 => '[0-9]{4}\\.[0-9]{1}\\.[0-9]{1}', 48 => 
'[0-9]{1}\\.[0-9]{4}\\.[0-9]{1}', 49 => '[0-9]{1}\\\\[0-9]{1}\\\\[0-9]{4}', 50 => '[0-9]{4}\\\\[0-9]{1}\\\\[0-9]{1}', 51 => '[0-9]{1}\\\\[0-9]{4}\\\\[0-9]{1}', 52 => '[0-9]{1}\\-[0-9]{1}\\-[0-9]{2}', 53 => '[0-9]{2}\\-[0-9]{1}\\-[0-9]{1}', 54 => '[0-9]{1}\\-[0-9]{2}\\-[0-9]{1}', 55 => '[0-9]{1}\\/[0-9]{1}\\/[0-9]{2}', 56 => '[0-9]{2}\\/[0-9]{1}\\/[0-9]{1}', 57 => '[0-9]{1}\\/[0-9]{2}\\/[0-9]{1}', 58 => '[0-9]{1}\\.[0-9]{1}\\.[0-9]{2}', 59 => '[0-9]{2}\\.[0-9]{1}\\.[0-9]{1}', 60 => '[0-9]{1}\\.[0-9]{2}\\.[0-9]{1}', 61 => '[0-9]{1}\\\\[0-9]{1}\\\\[0-9]{2}', 62 => '[0-9]{2}\\\\[0-9]{1}\\\\[0-9]{1}', 63 => '[0-9]{1}\\\\[0-9]{2}\\\\[0-9]{1}', 64 => '[0-9]{2}\\-[0-9]{4}', 65 => '[0-9]{2}\\/[0-9]{4}', 66 => '[0-9]{2}\\.[0-9]{4}', 67 => '[0-9]{2}\\\\[0-9]{4}', 68 => '[0-9]{4}\\-[0-9]{2}', 69 => '[0-9]{4}\\/[0-9]{2}', 70 => '[0-9]{4}\\.[0-9]{2}', 71 => '[0-9]{4}\\\\[0-9]{2}', 72 => '[0-9]{2}\\-[0-9]{2}', 73 => '[0-9]{2}\\/[0-9]{2}', 74 => '[0-9]{2}\\.[0-9]{2}', 75 => '[0-9]{2}\\\\[0-9]{2}', )
ESMatchPercentBoostFactor '0.5'
ESMatchPercentTitleField ''
ESOfferOperatorSuggestion true
ESRecentBoostFactor '0.5'
ESSearchCenterDefaultFilters array ( 0 => 'namespace_text', 1 => 'categories', )
ESSearchInRawWikitext true
ESSharedUploadsIndexPrefix false
ESSourceConfig array ( 'wikipage' => array ( 'skip_namespaces' => array ( 0 => 8, 1 => 9, ), ), 'repofile' => array ( 'extension_blacklist' => array ( 0 => 'mp4', ), 'max_size' => 20000000, ), 'externalfile' => array ( 'extension_blacklist' => array ( 0 => 'mp4', ), 'max_size' => 20000000, ), )
ESSubpageMasterFilterPatterns array ( )
ESSubpageMasterFilterUseRootOnly true
ESUseSharedUploads false
ESWildcardingOperators array ( 0 => '+', 1 => '|', 2 => '*', 3 => '(', 4 => ')', 5 => '~', )
ESWildcardingSeparators array ( 0 => ',', 1 => '.', 2 => ';', 3 => '-', 4 => '_', )
ExtendedSearchExternalFilePathsExcludes array ( )
TagSearchSearchFieldTemplatePath '/resources/templates'
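
The entries of ESLookupModifierRegExPatterns in the table above are patterns for date-like strings with the separators "-", "/", "." and "\". The setting name suggests that search terms matching one of them are treated specially during lookup; how exactly is not described here. The sketch below only shows which strings a few of the listed patterns match. It assumes that the doubled backslashes in the listing are escaping from the export, so each becomes a single backslash in the regular expression; the helper name `find_date_like` is hypothetical.

```python
import re

# Four entries taken from the listing (indexes 0, 1, 3 and 6),
# with the export escaping reduced to a single backslash.
DATE_PATTERNS = [
    r"[0-9]{2}\-[0-9]{2}\-[0-9]{4}",  # e.g. 10-01-2024
    r"[0-9]{4}\-[0-9]{2}\-[0-9]{2}",  # e.g. 2024-01-10
    r"[0-9]{2}\/[0-9]{2}\/[0-9]{4}",  # e.g. 10/01/2024
    r"[0-9]{2}\.[0-9]{2}\.[0-9]{4}",  # e.g. 10.01.2024
]

def find_date_like(term):
    """Return the substrings of term matched by any pattern, in pattern order."""
    found = []
    for pattern in DATE_PATTERNS:
        found.extend(re.findall(pattern, term))
    return found

print(find_date_like("meeting 2024-01-10"))  # ['2024-01-10']
print(find_date_like("notes 10.01.2024"))    # ['10.01.2024']
print(find_date_like("plain text"))          # []
```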

API Modules

  • bs-extendedsearch-autocomplete
  • bs-extendedsearch-query
  • bs-extendedsearch-resultrelevance
  • bs-extendedsearch-stats
  • bs-extendedsearch-triggerupdate
  • bs-extendedsearch-type-store

Hooks

Accessibility

Test status: 2-testing complete
Checked for: Web
Last test date: 2022-08-08
WCAG level: AA
WCAG support: partially supports (workaround: )
Comments:
  • filter pills, buttons: don't indicate focus, so they cannot easily be opened with keyboard although technically it would already work.
  • all buttons announce “blank” and cannot be distinguished.

erm:29483

Extension type: core
Extension focus: reader
