Difference between revisions of "Searching"

From NewMarsWiki

Latest revision as of 04:02, 21 January 2009

Search facility of Wikipedia

Wikipedia's native search, reached through the search box that appears on every Wikipedia page, is not available during peak hours or other periods of high server activity (which may last for days). Here are some hints for using it effectively (see also Wikipedia:User preferences help, "Search result settings" section, and Wikipedia:Go button).

Limiting results

Wikipedia's default search mode returns results containing any of the words in your query. For instance, the query search engine turns up many results containing only "search" but not "engine", or only "engine" but not "search", in addition to the ones you probably wanted, which contain both words.

To limit to results that include all words, put a "+" at the beginning of each word: +search +engine returns only pages containing both words, like Google's default mode.

You can also do a phrase search by enclosing words in quotes: "search engine" turns up a smaller set of results, which not only have both words but have them in order.

To exclude results that include some word, put a "-" at the beginning of that word: search -engine returns pages that contain "search" but not "engine".
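The three operators described above can be illustrated with a small Python sketch. It is not the code Wikipedia runs; the function names parse_query, contains_phrase and matches are hypothetical, and the sketch simply assumes that a page is a plain string split into words.

```python
import re


def parse_query(query):
    """Split a query into required (+), excluded (-), plain words and quoted phrases."""
    phrases = [re.findall(r"\w+", p.lower()) for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]*"', " ", query)  # drop quoted parts before reading single words
    required, excluded, plain = [], [], []
    for token in rest.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            plain.append(token.lower())
    return required, excluded, plain, phrases


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens occurs in tokens as a contiguous, ordered run."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens for i in range(len(tokens) - n + 1))


def matches(query, text):
    """Illustrative matching rule: assumed behaviour, not the real search engine."""
    required, excluded, plain, phrases = parse_query(query)
    tokens = re.findall(r"\w+", text.lower())
    words = set(tokens)
    if any(w in words for w in excluded):
        return False
    if not all(w in words for w in required):
        return False
    if not all(contains_phrase(tokens, p) for p in phrases):
        return False
    if required or phrases:
        return True
    # Default mode: any one of the plain words is enough.
    return any(w in words for w in plain)
```

With this sketch, matches("search engine", "a diesel engine") is true because the default mode accepts either word, while matches("+search +engine", "a diesel engine") is false.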

Avoid short and common words

If your search terms include a common "stop word" (such as "the", "one", "your", "more", "right", "while", "when", "who", "which", "such", "every", "about") it will be ignored by the search system. If you're trying to do a phrase search or all-words-only search, this may result in returning nothing at all. Short numbers, and words that appear in half of all articles, will also not be found. In this case, drop those words and rerun the search.
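To check in advance which words of a query are likely to survive, one can filter it by hand. The following Python sketch is only an approximation: the stop-word set contains just the examples quoted above (the real list is longer), and the minimum word length of 4 is an assumption based on the usual MySQL full-text default, not something this page states. The name usable_terms is hypothetical.

```python
# Only the stop words quoted in the text; the real list is longer.
STOP_WORDS = {
    "the", "one", "your", "more", "right", "while",
    "when", "who", "which", "such", "every", "about",
}


def usable_terms(query, min_length=4):
    """Return the query words that a stop-word-filtering index could still match.

    min_length=4 is an assumed threshold, not a documented Wikipedia setting.
    """
    return [
        word
        for word in query.lower().split()
        if word not in STOP_WORDS and len(word) >= min_length
    ]
```

For example, usable_terms("the right stuff") keeps only "stuff", which suggests rerunning the search with that word alone.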

For the stop words filtered out by the database, see Wikipedia:Common words, searching for which is not possible. From that page one can at least go to an article with a stop word as its title. Searching for the combination of one or more words and the common word "not" gives a database query syntax error due to a bug in the software.

Search is case-insensitive only for the first word of the entry

The searches for "fortran", "Fortran" and "FORTRAN" all return the same results. If an article has a name including a mixture of capitalized and uncapitalized letters, and is neither all initial caps nor all lower case following the first word, searching is not case-insensitive. For example, consider the article French and Indian Wars. A search for 'french and indian wars' will not find this article. However, a search for 'French and Indian Wars' will find it, as will a search for 'french and Indian Wars'. If the article name were 'French And Indian Wars' or 'French and indian wars', a search with any capitalization variant would find it. Redirects can be used to work around this problem. For example, searches for any capitalization variant of 'Isle of Wight' match Isle of wight, which redirects the user to the actual article named Isle of Wight.
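One way to picture the behaviour described above is to assume that the software tries a few capitalization variants of the query against article titles. This is a guess reconstructed from the examples in this section, not the documented algorithm, and title_candidates is a hypothetical name.

```python
def title_candidates(query):
    """Capitalization variants a title lookup might try (assumed, for illustration)."""

    def ucfirst(text):
        # Capitalize only the first character and leave the rest unchanged.
        return text[:1].upper() + text[1:]

    variants = [
        ucfirst(query),                                           # as typed, first letter capitalized
        ucfirst(query.lower()),                                   # all lower case after the first letter
        " ".join(ucfirst(w) for w in query.lower().split(" ")),   # every word capitalized
    ]
    # Remove duplicates while keeping the order in which variants are tried.
    return list(dict.fromkeys(variants))
```

Under this assumption, 'french and indian wars' yields only 'French and indian wars' and 'French And Indian Wars', so the article French and Indian Wars is missed, whereas 'french and Indian Wars' yields the exact title as its first variant.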

Wildcards

You can use some limited wildcards if you really want to. In MySQL's boolean full-text mode, an asterisk appended to a word (as in searc*) matches words that begin with that prefix. Look up "fulltext search" on http://www.mysql.com/ and look down under 'boolean search' for the details. However, wildcard searches are slower, so go easy on the poor server.

Words with special characters

In a search for a word with a diaeresis, such as Sint Odiliënberg, it depends whether this ë is stored as one character or as the entity "&euml;". In the first case one can simply search for Odilienberg (or Odiliënberg); in the second case it can only be found by searching for Odili, euml and/or nberg. This is actually a bug that should be fixed -- the entities should be folded into their raw character equivalents so all searches on them are equivalent. See also Wikipedia:Special characters.
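The folding suggested above could look roughly like the following Python sketch. It decodes HTML entities with the standard html.unescape function and then, as an additional assumption made here so that Odilienberg and Odiliënberg compare equal, strips diacritics and lowercases the result. The function name fold is hypothetical.

```python
import html
import unicodedata


def fold(text):
    """Decode HTML entities, then strip diacritics and lowercase the result."""
    decoded = html.unescape(text)                       # "&euml;" becomes "ë"
    decomposed = unicodedata.normalize("NFD", decoded)  # "ë" becomes "e" plus a combining mark
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()
```

If both the indexed text and the query were passed through such a function, the entity form, the raw character form and the unaccented form would all lead to the same search key.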

Words in single quotes

If a word appears in an article with single quotes, you can only find it if you search for the word with quotes. Since this is rarely desirable it is better to use double quotes in articles, for which this problem does not arise. See the manual of style for more info.

An apostrophe is identical to a single quote, so Mu'ammar can be found by searching for exactly that (and not otherwise). A word ending in apostrophe-s is an exception, in that it can also be found by searching for the word without the apostrophe and the s.

Namespaces searched by default

The search applies only to the namespaces selected in the user's preferences. To search other namespaces, check or uncheck the tickboxes in the "Search in namespaces" box found at the bottom of a search results page. Depending on the browser, a box may still appear checked from a previous search without being effective any longer. To make sure, uncheck and recheck it.

Searching the image namespace means searching the image descriptions, i.e. the first parts of the image description pages.

Redirects can be excluded

Check or uncheck the tickbox "List redirects" in the "Search in namespaces" box found at the bottom of a search results page.

The source text is searched

The source text (what one sees in the edit box, also called wiki text) is searched. This distinction is relevant for piped links, for Wikipedia:interlanguage links (to find links to Chinese articles, search for zh, not for Zhongwen), for special characters (if ê is coded as "&ecirc;", it is found by searching for ecirc), etc.

Delay in updating the search index

For reasons of efficiency and priority, very recent changes are not always immediately taken into account in searches.

At the moment, the search engine uses an index that isn't updated at all. This is temporary.