CSC 379 SUM2008:Week 2, Group 3: Difference between revisions
Revision as of 19:56, 18 July 2008
Search Engines
Search engines fill an important role in our lives, helping us locate information within a wide array of multimedia. However, many ethical considerations are involved in their operation: the ordering of rankings, the range of content indexed (or not), and how advertisements are incorporated are a few. Broadly examine the ethics of search engine operation and use.
- http://www.i-r-i-e.net/inhalt/003/003_hinman.pdf
- http://www.i-r-i-e.net/inhalt/003/003_editorial.pdf
- http://www.boingboing.net/2008/04/04/usfunded-health-sear.html
- http://www.scu.edu/ethics/publications/submitted/search-engine-panel.html
Function
A search engine is an information retrieval system that matches queries against an index it creates. Search engines consist of four essential modules:
- Document Processor - this prepares, processes, and inputs the documents, pages, or sites that users search against.
- Query Processor - this consists of up to seven steps:
  - Tokenizing, usually by breaking the input into strings separated by white space.
  - Parsing operators, such as reserved punctuation or reserved terms in a specialized format (e.g., AND, OR). These may include Boolean, adjacency, or proximity operators.
  - Applying a stop list and stemming. The stop list might contain words from commonly occurring query phrases. Engines may drop these two steps.
  - Creating the query, which depends on the method used to do the matching.
  - Query expansion, which employs synonyms to improve the search results.
  - Query term weighting, which judges the importance of each term in the query.
- Search and Matching Function - this depends on which theoretical model of information retrieval underlies the system's design.
- Ranking Capability - this can be done in several ways, based on:
  - Term frequency
  - Location of terms
  - Link analysis
  - Popularity
  - Date of publication
  - Length
  - Proximity of query terms
  - Proper nouns
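
To make a few of the steps above concrete, here is a minimal sketch in Python of tokenizing, operator parsing, stop-list filtering, and ranking by term frequency. It is only an illustration, not how any particular search engine works: the stop list, the operator set, the function names, and the sample documents are all assumptions made up for this example, and the parsed operators are collected but not applied during matching.

```python
import re
from collections import Counter

# Hypothetical stop list and operator set, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "of", "in", "about"}
OPERATORS = {"AND", "OR"}


def tokenize(text):
    """Break the input into strings separated by white space."""
    return text.split()


def process_query(query):
    """Tokenize, pull out reserved operators, and drop stop words.

    Returns a (terms, operators) pair.
    """
    terms, operators = [], []
    for token in tokenize(query):
        if token in OPERATORS:  # reserved terms must be upper case here
            operators.append(token)
            continue
        word = re.sub(r"\W+", "", token).lower()  # strip punctuation
        if word and word not in STOP_WORDS:
            terms.append(word)
    return terms, operators


def rank_by_term_frequency(terms, documents):
    """Order document ids by how often the query terms occur in each.

    Documents that contain none of the terms are left out.
    Ties are broken by document id so the order is deterministic.
    """
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(w.lower() for w in re.findall(r"\w+", text))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up sample documents standing in for an index.
documents = {
    "d1": "Search engines index the web. Engines rank pages.",
    "d2": "Ethics of advertising in search results.",
    "d3": "Cooking pasta at home.",
}

terms, operators = process_query("the search AND engines")
ranking = rank_by_term_frequency(terms, documents)
```

In this sketch, term frequency is the only ranking signal. The other factors in the list (location of terms, link analysis, popularity, and so on) would each need their own scoring and some way of combining the scores, which is left out here.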