Search Engine

From BC$ MobileTV Wiki

A Search Engine is a tool for querying a given set of data for information that matches a specified set of criteria.


Specifications

Robots.txt

OpenSearch

Sitemap

The Sitemap protocol specifies how to announce your website's URL layout and specific content entries to Search Engines such as Google, Yahoo! and Bing.
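
As a rough illustration of what such an announcement looks like, the sketch below builds a minimal sitemap document in Python. The function name build_sitemap and the example.com URLs are hypothetical; the urlset, url, loc and lastmod elements and the namespace come from the Sitemap protocol. Treat it as a starting point under those assumptions rather than a complete generator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap XML string.

    entries: iterable of (loc, lastmod) pairs; lastmod may be None.
    (build_sitemap is a hypothetical helper, not part of any library.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod is optional in the protocol, so emit it only when given.
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Example pages are placeholders.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2012-01-01"),
    ("http://www.example.com/about", None),
])
print(sitemap_xml)
```

The resulting file is typically saved as sitemap.xml at the site root and then submitted through each Search Engine's webmaster tools.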

Structure

Corpus

A corpus is an index, database, word set or link list: the initial dataset upon which to search, rank and build query responses.
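
One common way to make such a dataset searchable is an inverted index, which maps each word to the documents that contain it. The sketch below is a minimal, assumed example (the names tokenize, build_index and search, and the sample documents, are all hypothetical); it does no ranking and simply returns documents containing every query term.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing all query terms."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


# A tiny hypothetical corpus.
documents = {
    "a": "Search engines rank web pages",
    "b": "A crawler fetches web pages",
    "c": "Sitemaps list pages",
}
index = build_index(documents)
print(search(index, "web pages"))
```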

Crawler

To build a corpus when starting from scratch, you have a few choices of where to get the data. The first and easiest one (which comes at a steep price) is to buy a large database within a particular domain that has already been manually curated and possibly ranked, sorted and categorized.

The more labor-intensive yet cost-effective way to build a corpus is to generate it yourself from a publicly available dataset (just as Google did with the web in March 1998[16]).
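
Generating a corpus from the web usually means following links outward from a seed page. The sketch below shows a breadth-first crawl in Python under some loud assumptions: the fetch function is supplied by the caller (here a dictionary of fake pages stands in for real HTTP requests), and the names LinkExtractor, crawl and PAGES are hypothetical. It is not a description of how any particular Search Engine crawls; a real crawler would also need to honor robots.txt, rate-limit its requests and handle errors.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; returns {url: html}.

    fetch(url) must return the page HTML, or None if unavailable.
    """
    corpus = {}
    queue = deque([seed])
    seen = {seed}
    while queue and len(corpus) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        corpus[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return corpus


# Fake pages standing in for the network (assumption for illustration).
PAGES = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a> <a href="/missing">x</a>',
    "http://example.com/b": "no links here",
}
corpus = crawl("http://example.com/", PAGES.get)
print(sorted(corpus))
```

Passing the fetch function in as a parameter keeps the traversal logic separate from networking, so the same loop can be exercised against canned pages before it is pointed at real sites.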



Services

Google

The best current example of a traditional search engine is the industry leader Google, which pairs a very basic search interface with a rich set of search results.

Yahoo!

Yahoo! is an Internet Portal which also provides its own Search Engine; however, the majority of its search functionality has been outsourced to Microsoft's Bing.

Bing

Bing is Microsoft's latest foray into the search market, and the successor to MSN Search.

Twitter

Thanks to its widespread use, the micro-blogging service Twitter is quickly becoming the leading search engine for up-to-the-minute relevance on current events and sentiments. It also recently acquired Summize[20], a real-time "twitter mind" search engine.

Baidu

QQ

QQ is China's #1 portal and #2 search site. It is the 3rd largest internet company by revenue and the 4th largest search engine by volume.

Yandex

DuckDuckGo

[25] [26] [27] [28] [29] [30] [31]

JRank

  • JRank: http://www.jrank.org/ (focused on offering customized search for your website, picking up where Google left off when it deprecated/deactivated its Google Site Search & Google Search API)

Blekko

Hakia

Hakia uses semantic approaches to building results for its search engine.

IndexTank

YaCy

Carrot2

eTools

Omgili

Omgili (Oh My God I Love It) is a search engine for forum discussions: http://omgili.com/

Oamos

ThumbShots

ThumbShots lets you search what it describes as the world's largest human-powered thumbnail/website directory.

SearchCube

SearchCube is a service from Symmetri which presents search results in a 3D visual cube layout. It uses the ThumbShots API to create the thumbnails of all the pages in the cube.

SearchMe

  • SearchMe - Visual Thumbnail browsing with Protoflow (iPod Coverflow) type search capabilities: http://www.searchme.com/

Quintura

EyePlorer

Kartoo

Clusty

Clusty is a clustering-based search engine which clusters (groups together) related data sources.

Swoogle

Search the Semantic Web on Swoogle to find existing ontologies (structures of data, but not the data itself).

DogPile

Aggregate several search engines' results with DogPile.
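
To give a feel for what aggregating results can involve, the sketch below merges several ranked result lists by summing a reciprocal-rank score (1/rank) for each URL. This is only one simple merging heuristic chosen for illustration, not DogPile's actual method; the function name merge_results and the example result lists are hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge ranked URL lists into a single ranking.

    Each URL scores 1/rank in every list it appears in; scores are summed,
    so URLs ranked highly by several engines rise to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical results from two engines for the same query.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "w.example", "x.example"]
merged = merge_results([engine_a, engine_b])
print(merged)
```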

Grokker

Ask

Ask used to offer API access to its search capabilities, but has since removed it.

Cuil

  • Cuil - Arguably the most over-hyped, and ultimately disappointing, search engine ever (from former Google employees; claimed to have the world's largest index): http://cuil.com

WolframAlpha

WolframAlpha is a Knowledge Engine with one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.

FeedMil

Dorthy

DeepDyve

  • DeepDyve delivers fast, easy access to the vast amounts of expert information hidden in the Deep Web: http://www.deepdyve.com/

FindLinks

MyWebSearch

API-specific (by content type) search tool: http://home.mywebsearch.com/

StartPage


People

Pipl

Spokeo

CV Gadget

Thomson Reuters


Products

Octoparts



Tools


Resources

Lucene

[35] [36]


Solr

Tutorials


External Links


References

  1. OpenSearch 1.1 spec (draft 6): https://github.com/dewitt/opensearch/blob/master/opensearch-1-1-draft-6.md
  2. OpenSearch 1.1 spec (draft 5): https://web.archive.org/web/20120510160608/http://www.opensearch.org/Specifications/OpenSearch/1.1#OpenSearch_response_elements
  3. OpenSearch XSD: http://weblogs.asp.net/wkriebel/archive/2008/02/04/opensearch-xsd.aspx
  4. Submit your OpenSearch provider to Amazon A9 client (used in Alexa): http://opensearch.a9.com/
  5. Developer how to guide: http://www.opensearch.org/Documentation/Developer_how_to_guide
  6. OpenSearch v1.1 Cheat Sheet: http://www.scribd.com/doc/6114752/OpenSearch-Cheat-Sheet-15
  7. Introducing OpenSearch: http://www.xml.com/pub/a/2007/07/20/introducing-opensearch.html
  8. OpenSearch Google in Windows 7: http://www.mzzt.net/2009/01/14/opensearch-google-in-windows-7/
  9. Windows 7 Federated Search Providers: http://www.sevenforums.com/tutorials/742-windows-7-federated-search-providers.html
  10. Sitemap Generators: http://code.google.com/p/sitemap-generators/wiki/SitemapGenerators
  11. C# Sitemap Generator: http://sourceforge.net/projects/sitemapgen/
  12. Google Sitemap Generator in PHP: http://www.idealog.us/2006/09/google_sitemap_.html
  13. PHP Sitemap Generator: http://www.phpclasses.org/package/5838-PHP-Generate-sitemaps-and-notify-updates.html
  14. How To Use Google Video XML Sitemaps For Video SEO: http://www.reelseo.com/how-video-sitemaps/
  15. How to create video sitemap to drive more traffic: http://fourblogger.com/how-to-create-video-sitemap-more-traffic/
  16. Founding of Google: http://www.wired.com/wired/archive/13.08/battelle.html
  17. Google Sitemap/Webmaster Tools: http://www.google.com/webmasters/tools/
  18. Yahoo! Site Explorer: http://siteexplorer.search.yahoo.com/submit
  19. Bing Webmasters' Tools: http://www.bing.com/toolbox/webmasters/
  20. Confirmed -- Twitter Acquires Summize Search Engine: http://techcrunch.com/2008/07/15/confirmed-twitter-acquires-summize-search-engine/
  21. Yandex Tries to Solidify Search Dominance, Keep Google Down in Russia: http://searchenginewatch.com/article/2157877/Yandex-Tries-to-Solidify-Search-Dominance-Keep-Google-Down-in-Russia
  22. A Duck & a Wiki Team Up Against the Content Farms: http://www.readwriteweb.com/archives/a_duck_a_wiki_team_up_against_the_content_farms.php
  23. Escape your search engine Filter Bubble! - An illustrated guide by DuckDuckGo.com: http://dontbubble.us/
  24. The Trouble With the Echo Chamber Online: http://www.nytimes.com/2011/05/29/technology/29stream.html
  25. Ideas for DuckDuckGo Instant Answer Plugins and Data sources: https://duckduckhack.uservoice.com/forums/5168-ideas-for-duckduckgo-instant-answer-plugins/status/904946
  26. DuckDuckGo FAQ - where do the search results come from?: http://help.dukgo.com/customer/portal/articles/216399
  27. How can I use DuckDuckGo in my application? Is there any API for the same?: https://www.quora.com/How-can-I-use-DuckDuckGo-in-my-application-Is-there-any-API-for-the-same
  28. DuckDuckGo API - PHP client: https://www.phpclasses.org/package/10443-PHP-Search-for-data-and-related-topics-from-DuckDuckGo.html
  29. DuckDuckGo slams Google following EU antitrust decision: https://www.theverge.com/2018/7/20/17595612/google-antitrust-eu-duckduckgo-chrome
  30. Twitter -- DuckDuckGo account: https://twitter.com/DuckDuckGo
  31. Fetch DuckDuckGo Web Search Results in 20 lines of Java code: https://medium.com/@sethsubr/fetch-duckduckgo-web-search-results-in-20-lines-of-java-code-3a34ea9da085
  32. No Joke? Blekko is 63rd Largest Pure Search Entity in the World: http://blog.searchenginewatch.com/101111-094700
  33. The Secrets Behind Blekko's Search Technology: http://www.readwriteweb.com/hack/2010/12/the-secrets-behind-blekkos-search-technology.php
  34. Blekko Search Engine Slashes Through the Web: http://tech.ca.msn.com/pcworld-article.aspx?cp-documentid=26184155
  35. DZone -- Refcard #137 - Understanding Lucene: https://dzone.com/refcardz/lucene
  36. Understanding Lucene - Powering Better Search Results: http://www.scribd.com/doc/51555360/DZone-Refcard-137-Understanding-Lucene-Powering-Better-Search-Results

See Also

Recommendation Engine | Advertising | Google | Yahoo | Bing | News | Websites | Multimedia | Local Business | Maps