2. Search engine crawl system
As we have already seen, several steps are necessary for a search engine to function properly:
- First, software (the spider) crawls the Web from link to link, retrieving information from the pages it visits;
- this information is then processed by indexing engines, the indexed terms enriching a regularly updated database of the words contained in the pages;
- finally, a search interface returns results to users, ranked in order of relevance.
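The three phases above can be sketched in a few lines. The snippet below is a deliberately minimal illustration, not a real engine: the "crawled" pages are hypothetical in-memory documents, the index is a plain inverted index, and relevance is approximated by counting matched query terms (production engines use far richer signals, such as the link analysis of Brin and Page).

```python
from collections import defaultdict

# Phase 1: stand-ins for documents retrieved by the spider (hypothetical data).
pages = {
    "a.html": "search engines crawl the web",
    "b.html": "spiders follow links across the web",
    "c.html": "an index maps words to pages",
}

# Phase 2: build an inverted index mapping each term to the pages containing it.
index = defaultdict(set)
for url, text in pages.items():
    for term in text.split():
        index[term].add(url)

# Phase 3: answer a query, ranking pages by how many query terms they contain.
def search(query):
    scores = defaultdict(int)
    for term in query.split():
        for url in index.get(term, ()):
            scores[url] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("crawl the web"))
```

Here the query "crawl the web" ranks `a.html` first, since it contains all three terms, ahead of `b.html`, which contains two.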
2.1 Spiders / Robots
Spiders (also known as agents, crawlers, robots or bots) are programs that continually browse web pages, following their links, in order to index...
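The link-following behaviour of a spider can be sketched as a graph traversal. The example below walks a hypothetical in-memory link graph (the page names are illustrative, not real URLs); a real crawler would instead fetch pages over HTTP, respect robots.txt, and throttle its requests. A breadth-first traversal with a visited set prevents the spider from looping on cyclic links.

```python
from collections import deque

# Hypothetical link graph: page -> outgoing links (illustrative data only).
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post1", "post2"],
    "post1": ["blog"],
    "post2": [],
}

def crawl(start):
    seen, queue, order = {start}, deque([start]), []
    while queue:
        page = queue.popleft()
        order.append(page)             # a real spider would index the page here
        for link in links.get(page, []):
            if link not in seen:       # skip pages already queued (avoids loops)
                seen.add(link)
                queue.append(link)
    return order

print(crawl("home"))
```

Starting from `home`, the spider discovers every reachable page exactly once, even though `about` and `post1` link back to pages it has already seen.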