Answer» A web crawler is a service used by search engines such as Google and DuckDuckGo to index website content across the Internet so that it can be surfaced in search results.
- What are some of the Required Features?
- Design and develop a scalable service for collecting information from across the entire web and fetching millions of web documents (a minimal fetch loop is sketched after this list).
- Fresh data has to be fetched for every search query.
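
To make the first feature concrete, here is a minimal, single-threaded crawl loop in Python. It is only a sketch: a production crawler would shard the frontier across many workers, respect robots.txt, and persist its state. The names here (LinkExtractor, crawl) are illustrative, not part of the original answer.

```python
# Minimal single-threaded crawl loop (a sketch only; a real crawler
# distributes this work across many machines).
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=100):
    """Breadth-first crawl starting from seed_urls."""
    frontier = deque(seed_urls)   # URLs waiting to be fetched
    seen = set(seed_urls)         # dedup so each URL is enqueued once
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue  # skip unreachable pages
        fetched += 1
        # ... hand `html` to the indexer here ...
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
```

The breadth-first queue plus a seen-set is the core of any crawler; scaling it up mostly means replacing these in-memory structures with a distributed frontier and a shared dedup store.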
- What are some of the common problems encountered?
- How to handle the updates when users are typing very fast?
- How to prioritize dynamically changing web pages? (One possible recrawl policy is sketched below.)
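
One common way to prioritize dynamically changing pages is adaptive recrawl scheduling: shorten a page's revisit interval when its content changed since the last fetch, and lengthen it when it did not. The sketch below assumes that policy; the class and method names (RecrawlScheduler, record_fetch) and the interval bounds are hypothetical.

```python
# Sketch of adaptive recrawl scheduling (illustrative; the original
# answer does not prescribe a specific policy).
import heapq
import time


class RecrawlScheduler:
    """Schedules pages so frequently-changing ones are revisited sooner."""
    def __init__(self):
        self._heap = []      # (next_due_timestamp, url), ordered by due time
        self._interval = {}  # url -> current recrawl interval in seconds

    def add(self, url, interval=3600):
        self._interval[url] = interval
        heapq.heappush(self._heap, (time.time() + interval, url))

    def record_fetch(self, url, changed):
        # Adaptive policy: halve the interval when content changed,
        # double it when it did not (bounded on both ends).
        interval = self._interval[url]
        interval = max(60, interval / 2) if changed else min(86400, interval * 2)
        self._interval[url] = interval
        heapq.heappush(self._heap, (time.time() + interval, url))

    def next_due(self):
        # Peek at the page that should be refetched soonest.
        return self._heap[0] if self._heap else None
```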
- Possible tips for consideration:
- Look into the URL frontier architecture for implementing this system (a simplified frontier is sketched after this list).
- Know how crawling (discovering and indexing pages by following links) differs from scraping (extracting specific data from known pages).
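
For reference, here is a simplified URL-frontier sketch in the style of the Mercator design: front queues order URLs by priority, and per-host back queues enforce politeness so no single host is fetched too often. The names and the fixed crawl delay are assumptions for illustration.

```python
# Simplified URL frontier: priority front queues feeding per-host
# back queues that enforce a politeness delay.
import time
from collections import defaultdict, deque
from urllib.parse import urlparse


class URLFrontier:
    def __init__(self, num_priorities=3, crawl_delay=1.0):
        self.front = [deque() for _ in range(num_priorities)]  # by priority
        self.back = defaultdict(deque)   # host -> queue of URLs
        self.next_allowed = {}           # host -> earliest next fetch time
        self.crawl_delay = crawl_delay

    def put(self, url, priority=0):
        self.front[priority].append(url)

    def get(self):
        # Drain front queues (highest priority first) into per-host
        # back queues. O(pending URLs) per call; fine for a sketch.
        for queue in self.front:
            while queue:
                url = queue.popleft()
                self.back[urlparse(url).netloc].append(url)
        # Pick a host whose politeness delay has elapsed.
        now = time.time()
        for host, queue in self.back.items():
            if queue and self.next_allowed.get(host, 0) <= now:
                self.next_allowed[host] = now + self.crawl_delay
                return queue.popleft()
        return None  # nothing eligible to fetch right now
```

The per-host back queues are the key design choice: without them, a burst of URLs from one popular site would cause many workers to hammer the same server at once.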