Our search robot is registered among the world's top search engines. It is a sophisticated software "engine" that runs on our primary Internet server, reaches out across the Web, and fetches content for our database (SQL) index.
Our programmers have automated it to traverse the World Wide Web's hypertext structure systematically: it retrieves a document, then recursively retrieves every document linked from within it. "Recursive" here does not limit the definition to any specific traversal algorithm. A robot may be programmed to apply some heuristic rule to the selection and order of the documents it visits, and may space out its requests over a long span of time; it is still a "robot". Normal Web browsers, by contrast, are not robots, because they are operated by a person and don't automatically retrieve referenced documents. Robots are sometimes referred to as Wanderers, Crawlers, or Spiders. Although arguably apt, these names can mislead the lay person if they give the impression that the software itself moves between sites like a virus; this is not the case. The robot is software permanently resident on its own computer. From that computer it sends requests for website documents to the other computers (the document servers) on which the target site resides.
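The traversal described above can be sketched in a few lines of Python. This is only an illustration and not Spider Monkey's actual code: the crawl function, the fetch_links callback and the sample link table are all made up for the example, and breadth-first order is just one of many possible traversal orders.

```python
from collections import deque


def crawl(start_url, fetch_links, max_documents=100):
    """Visit start_url, then every document reachable from it through links.

    fetch_links is a hypothetical callback that retrieves one document and
    returns the URLs it links to. A real robot would make a network request
    here and pause between requests.
    """
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(visited) < max_documents:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never request the same document twice
                seen.add(link)
                queue.append(link)
    return visited


# A made-up three-page site stands in for the Web (assumption for the demo).
SAMPLE_LINKS = {
    "/index.html": ["/about.html", "/products.html"],
    "/about.html": ["/index.html"],
    "/products.html": ["/about.html"],
}

order = crawl("/index.html", lambda url: SAMPLE_LINKS.get(url, []))
print(order)  # ['/index.html', '/about.html', '/products.html']
```

Note that the program never leaves its own computer: it only asks other computers for documents, one request at a time.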

Fig. 1: Searching For the Right Words

A search engine is a software programme resident on a computer that searches through a (usually massive) database. In the context of the World Wide Web, the term "search engine" is most often used for search forms that search through databases of HTML documents gathered by a robot.
Like most search engine service providers, we store URLs submitted directly by our visitors in a temporary database, for both quality and security reasons, before they are finally crawled and entered into the main search engine's index. We allow interested visitors to view the temporary database. To use the "Pre-Index" engine here, enter key words or your site name; or leave the search field blank and press "Pre-Index", and the engine will show you the entire list of recent submissions. You can see how others describe their sites and get some ideas for your own. If you have submitted your site using our Add URL form, you can check here to see how it looks. If you don't like it, remember that the final index entry will be derived from your web page, so spend your time working on your web page and its meta tags instead of resubmitting.

Our crawler (Spider Monkey) visits and checks URLs during off-peak server load times and feeds the results to the index. All realms of the main database are refreshed at least every 30 days. The temporary database is crawled at least twice monthly. Although each URL is fetched from the actual site, its entry here remains for roughly 60 days so that we can verify when and how it was submitted. Note: URLs submitted to our own Site Submit Service, or submitted remotely by other authorized servers, do not appear in the temporary database but can be found using the Mouse House Search Engine.

Spider Monkey abides by the Robots Exclusion Standard (RES), specifically the 1994 standard. Where the 1996 proposed standard supersedes the 1994 standard, the proposed standard is followed.

Spider Monkey will obey the first record in the robots.txt file with a User-Agent containing "Spider_ Monkey". If there is no such record, it will obey the first record with a User-Agent of "*".
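The record selection rule above can be sketched as follows. This is a simplified illustration under stated assumptions, not our crawler's code: select_record is a hypothetical helper, the sample file is invented, only User-agent and Disallow lines are read, and the token "Spider_Monkey" is an assumed spelling used for the demonstration.

```python
def select_record(robots_txt, agent_token):
    """Return the Disallow paths of the record a robot should obey.

    Records are separated by blank lines. The first record whose User-agent
    contains agent_token (ignoring case) wins; otherwise the first record
    with a User-agent of "*" is used. Returns None if neither exists.
    """
    records = []
    agents, disallows = [], []
    for raw_line in robots_txt.splitlines() + [""]:
        if not raw_line.strip():
            # A blank line ends the current record.
            if agents:
                records.append((agents, disallows))
            agents, disallows = [], []
            continue
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only lines do not end a record
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow":
            disallows.append(value)

    token = agent_token.lower()
    for agents, disallows in records:
        if any(token in agent.lower() for agent in agents):
            return disallows
    for agents, disallows in records:
        if "*" in agents:
            return disallows
    return None


# Invented robots.txt; the agent token spelling is an assumption.
SAMPLE = """\
User-agent: *
Disallow: /private/

User-agent: Spider_Monkey
Disallow: /drafts/
"""

print(select_record(SAMPLE, "Spider_Monkey"))  # ['/drafts/']
print(select_record(SAMPLE, "OtherRobot"))     # ['/private/']
```

The named record takes priority even though the "*" record appears first in the file.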

Before you submit your site for inclusion in our database (index), are there pages you don't want indexed? If so, put the following in the head of each web page you want excluded. Our crawler (Spider Monkey) will obey this instruction and skip the document.

<META NAME="robots" CONTENT="noindex">
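As an illustration of how a robot might honour this tag, here is a minimal Python sketch using the standard html.parser module. The RobotsMetaParser class and the should_skip function are hypothetical names invented for this example, and for simplicity the sketch does not check that the tag sits inside the head of the page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for item in attributes.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def should_skip(html_text):
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


page = '<html><head><META NAME="robots" CONTENT="noindex"></head></html>'
print(should_skip(page))                           # True
print(should_skip("<html><head></head></html>"))   # False
```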

Do you use meta tags? You should at least include a description that sets out the content of the page as succinctly as possible. If present, this description will become the introduction to your page in the search results our visitors see. An example follows:

<meta name="Description" content="Learn, laugh and enjoy at the same time. International Information Technology firm has superb entertainment website for clients, employees and guests.">
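To show how an indexer could pick up this text, here is a small Python sketch built on the standard html.parser module. The names DescriptionParser and result_introduction are invented for the example, and the shortened sample page is an assumption; our own indexing code may work differently.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Remember the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = {name: (value or "") for name, value in attrs}
        # The name is compared without regard to case ("Description").
        if attributes.get("name", "").lower() == "description":
            self.description = attributes.get("content", "").strip()


def result_introduction(html_text):
    """Return the description text, or None if the page has none."""
    parser = DescriptionParser()
    parser.feed(html_text)
    return parser.description


page = '<head><meta name="Description" content="Learn, laugh and enjoy."></head>'
print(result_introduction(page))  # Learn, laugh and enjoy.
```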

 

Here is a sample of what you see when you visit SpiderMonkey:

[SpiderMonkey search form: Terms | Select Index | Advanced | Dictionary | Home | Contact | Search where?]

Searched for dns: 1-10 of 73 (901155 pages searched)

In this example the word (acronym) "dns" was searched. As the line above indicates, the search found 73 qualifying URLs, displayed 10 per page. Try a search from here: use your mouse to select "dns" in the "Terms" box and enter as many words as you like, separated by spaces. Use the "Terms" menu to tell us whether you'd like to find pages containing any term or all the words. You can also select a "Global" or specialized search index using the "Index" drop-down menu. For more help, click here.
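The difference between "any term" and "all the words" can be illustrated with a short Python sketch. The matches function and the two sample pages are invented for this example; a real engine consults its index instead of scanning page text.

```python
def matches(page_text, terms, mode="any"):
    """Return True if page_text satisfies the query terms.

    mode="any" accepts a page containing at least one term;
    mode="all" requires every term to be present.
    """
    words = set(page_text.lower().split())
    hits = [term.lower() in words for term in terms]
    return all(hits) if mode == "all" else any(hits)


# Two made-up pages (assumption for the demo).
pages = {
    "a.html": "dns server setup guide",
    "b.html": "mail server setup",
}
terms = ["dns", "server"]

any_hits = [url for url, text in pages.items() if matches(text, terms, "any")]
all_hits = [url for url, text in pages.items() if matches(text, terms, "all")]
print(any_hits)  # ['a.html', 'b.html']
print(all_hits)  # ['a.html']
```

Searching for "any term" gives the longer list; "all the words" narrows it to pages that contain every word you typed.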

Search Engine

Our search engine finds documents at Mouse House and throughout the World Wide Web. Here's how it works: you tell our search engine what you're looking for by typing keywords, phrases, or questions in the search box. Our search engine responds by giving you a list of all the Web pages in our crawler's index that relate to those topics. (We call the crawler SpiderMonkey, and you can read its technical details in the WWW robot registry by clicking here.) The most relevant content will appear at the top of your results.

Most foul language is ignored by our search engine, so it is not a tool for seeking porn sites.

Spider Monkey's index is a large, growing, organized collection of data comprising Web pages, their content and location, and discussion group pages from around the world. The index becomes larger every day as people send us the addresses of new Web pages and as our systems administrators search for new material. We own sophisticated technology that crawls the Web daily during periods of lower server load, looking for links to new pages. When you use the Mouse House search engine, you search the entire collection using keywords or phrases, just as you would with other search engines such as Yahoo or Alta Vista.

When searching, think of a word as a combination of letters and numbers. The search engine needs to know how to separate your words and numbers to seek out exactly what you want on the Internet. You can separate words using spaces or tabs.

You can link words and numbers together into phrases if you want specific words or numbers to appear together in your result pages. If you want to find an exact phrase, use "double quotation marks" around the phrase when you enter words in the search box.
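The rules above (words separated by spaces or tabs, exact phrases in double quotation marks) can be sketched in Python with a regular expression. The parse_query function and its pattern are an illustration only, not the parser our search engine actually uses.

```python
import re

# Either a double-quoted phrase or a run of non-whitespace characters.
TOKEN_PATTERN = re.compile(r'"([^"]*)"|(\S+)')


def parse_query(query):
    """Split a query into single words and quoted phrases."""
    tokens = []
    for phrase, word in TOKEN_PATTERN.findall(query):
        if phrase:
            tokens.append(phrase)  # quoted text stays together as one term
        elif word:
            tokens.append(word)
    return tokens


print(parse_query('dns "name server" 1996'))
# ['dns', 'name server', '1996']
```

The quoted phrase comes back as a single term, so the engine can look for those words together instead of separately.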

Some Terminology Related To Search Engines

Boolean search: A search allowing the inclusion or exclusion of documents containing certain words through the use of operators such as AND, NOT and OR.

Concept search: A search for documents related conceptually to a word, rather than specifically containing the word itself.

Full-text index: An index containing every word of every document cataloged, including stop words (defined below).

Fuzzy search: A search that will find matches even when words are only partially spelled or misspelled.

Index: The searchable catalog of documents created by search engine software. Also called "catalog." Index is often used as a synonym for search engine.

Keyword search: A search for documents containing one or more words that are specified by a user.

Phrase search: A search for documents containing an exact sentence or phrase specified by a user.

Precision: The degree to which the documents a search engine lists actually match a query. The larger the share of listed documents that match, the higher the precision. For example, if a search engine lists 80 documents for a query but only 20 of them contain the search words, then the precision would be 25%.

Proximity search: A search where users specify that the documents returned should have the words near each other.

Query-By-Example: A search where a user instructs an engine to find more documents that are similar to a particular document. Also called "find similar."

Recall: Related to precision, this is the degree to which a search engine returns all the matching documents in a collection. There may be 100 matching documents, but a search engine may find only 80 of them. It would then list these 80 and have a recall of 80%.

Relevancy: How well a document provides the information a user is looking for, as measured by the user.

Spider: The software that scans documents and adds them to an index by following links. Spider is often used as a synonym for search engine.

Stemming: The ability for a search to include the "stem" of words. For example, stemming allows a user to enter "swimming" and get back results also for the stem word "swim."

Stop words: Conjunctions, prepositions, articles and other words such as AND, TO and A that appear often in documents yet carry little meaning on their own.

Thesaurus: A list of synonyms a search engine can use to find matches for particular words if the words themselves don't appear in documents.
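The precision and recall figures in the glossary entries above can be checked with a little arithmetic. The Python sketch below uses the same numbers as those entries; the function names are ours, chosen for the example.

```python
def precision(listed, matching_listed):
    """Share of the listed documents that actually match the query."""
    return matching_listed / listed


def recall(matching_in_collection, matching_listed):
    """Share of all matching documents that the engine listed."""
    return matching_listed / matching_in_collection


# 80 documents listed, only 20 of them contain the search words.
print(f"{precision(80, 20):.0%}")  # 25%

# 100 matching documents exist, the engine finds and lists 80 of them.
print(f"{recall(100, 80):.0%}")    # 80%
```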

 


Mouse House is the Creative Technology and ISP Division of MPRM Group Limited