Web Search Engines: Part 1 and Part 2 (Hawking, D.)
- I would be very interested to learn what kinds of crawling algorithms and
indexing methods search engines use to organize information on the Web.
- I found the part about spam rejection very
interesting—especially how some spam sites will give crawlers different
information than they give to site visitors.
- I found both of these articles completely fascinating and
have determined that I want to be a crawler when I grow up.
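The spam trick mentioned above, where a site serves crawlers different content than it serves visitors, can be sketched roughly in code. This is only a minimal illustration, not how any real search engine detects it: the User-Agent strings, the `fetch` callable, and the 0.5 threshold are all assumptions made up for the example. The idea is to request the same page twice, once as a crawler and once as a browser, and flag the page if the two responses share too few words.

```python
from typing import Callable, Set

# Hypothetical User-Agent strings, chosen for illustration only.
CRAWLER_UA = "ExampleBot/1.0"
BROWSER_UA = "Mozilla/5.0 (ExampleBrowser)"


def word_set(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of distinct words."""
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two texts' word sets (1.0 = same vocabulary)."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0  # two empty pages are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def looks_cloaked(
    fetch: Callable[[str, str], str],  # assumed signature: fetch(url, user_agent) -> page text
    url: str,
    threshold: float = 0.5,  # arbitrary cutoff, an assumption for this sketch
) -> bool:
    """Fetch the page as a crawler and as a visitor and compare the results."""
    as_crawler = fetch(url, CRAWLER_UA)
    as_visitor = fetch(url, BROWSER_UA)
    return similarity(as_crawler, as_visitor) < threshold
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library. A real check would also have to allow for pages that legitimately vary between requests, such as rotating ads or personalized content.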
Current Developments and Future Trends for the OAI Protocol
for Metadata Harvesting (Shreeves, S., et al.)
- This article was written in 2005. How has OAI changed since
then?
- How was it decided that Dublin Core would become a sort of
baseline metadata standard for OAI?
- For the Sheet Music Consortium, how effective is it to have
users annotate the metadata records?
White Paper: The Deep Web: Surfacing Hidden Value (Bergman,
M.)
- What search engines, if any, use BrightPlanet’s search
technology that retrieves both deep and surface content?
- How have the sizes of the surface Web and the deep Web
changed since 2000?
- Does having full access to the deep Web mean that we will be
able to retrieve more accurate and relevant information or that we will simply
retrieve more information that is perhaps irrelevant?
- Are people not finding what they want on the Web because of
limited access or because they lack sufficient searching skills, such as using
synonyms for search terms? Or perhaps there’s already too much information that
people have to sift through?