About e-Species
This is a pure Python, CGI-based implementation of a taxonomically intelligent species search engine. It searches biological databases for a taxonomic name. The search is done "on the fly" using web services (SOAP/XML) or URL APIs.
Classification
Synonyms and higher taxa for a taxon name are retrieved using the Catalogue of Life Web Services.
Tags
Automated tagging uses Yahoo! Term Extraction to retrieve a list of significant words or phrases extracted from the Wikipedia article snippets.
Description
Display of snippets from Wikipedia articles makes use of the XML export format provided by Wikipedia. A link to the original article is also displayed.
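As an illustration of the idea, the following is a minimal sketch in Python 3 (e-Species itself targets Python 2.5) of how an article could be located through Wikipedia's Special:Export page and reduced to a plain-text snippet. The function names, the sample XML, and the very crude markup stripping are assumptions made for this example and are not taken from the e-Species source code. No request is sent; the parser runs on a hand-written sample.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote


def export_url(title, lang="en"):
    """Build the Special:Export URL for an article title (hypothetical helper)."""
    return "https://%s.wikipedia.org/wiki/Special:Export/%s" % (
        lang, quote(title.replace(" ", "_")))


def first_text(xml_text):
    """Return the wikitext of the first <text> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Tags look like "{namespace}text"; compare only the local name.
        if elem.tag.rsplit("}", 1)[-1] == "text":
            return elem.text or ""
    return ""


def snippet(wikitext, max_chars=200):
    """Very rough markup stripping: tags, [[links]] and bold/italic quotes."""
    text = re.sub(r"<[^>]+>", "", wikitext)
    # [[target|label]] becomes label, [[target]] becomes target
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    text = text.replace("'''", "").replace("''", "")
    return text[:max_chars].strip()


# Hand-written, heavily abbreviated sample of an export document (assumption).
SAMPLE = (
    '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    "<page><title>Cougar</title><revision>"
    "<text>The '''cougar''' is a large [[Felidae|cat]] of the [[Americas]].</text>"
    "</revision></page></mediawiki>"
)

url = export_url("Puma concolor")
summary = snippet(first_text(SAMPLE))
```

Real articles contain templates, references, and tables that this sketch does not handle, so a production version would need a more thorough cleaner.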
Genomics
Queries to NCBI are performed using the Entrez Programming Utilities. The ESearch tool is used to look up a taxon name and, if the name is found, the ESummary tool is called to get basic statistics on what NCBI holds for that taxon. Links to external information resources for the taxon are retrieved using the ELink tool.
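The sequence described above can be sketched as follows, in Python 3 rather than the Python 2.5 that e-Species targets. The helper names, the choice of the taxonomy database, and the abbreviated sample response are assumptions for illustration only; this is not the NCBISearch class from the e-Species source. The sketch only builds request URLs and parses a canned ESearch response, so nothing is sent over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def eutils_url(tool, **params):
    """Build a request URL for one E-utility (esearch, esummary or elink)."""
    # Parameters are sorted only to make the resulting URL deterministic.
    return EUTILS + tool + ".fcgi?" + urlencode(sorted(params.items()))


def parse_esearch_ids(xml_text):
    """Return the record IDs listed in an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall("./IdList/Id")]


# Abbreviated, hand-written ESearch response (assumption for the example).
SAMPLE = (
    "<eSearchResult><Count>1</Count>"
    "<IdList><Id>9606</Id></IdList></eSearchResult>"
)

# Step 1: look up the taxon name.
search_url = eutils_url("esearch", db="taxonomy", term="Homo sapiens")
ids = parse_esearch_ids(SAMPLE)

# Steps 2 and 3 run only if the name was found.
summary_url = links_url = None
if ids:
    summary_url = eutils_url("esummary", db="taxonomy", id=ids[0])
    # cmd=llinks asks ELink for links to external (LinkOut) resources.
    links_url = eutils_url("elink", dbfrom="taxonomy", id=ids[0], cmd="llinks")
```

Fetching each URL (for example with `urllib.request.urlopen`) and parsing the ESummary and ELink responses would follow the same pattern as `parse_esearch_ids`.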
Maps
Distribution maps for a taxon are retrieved from GBIF using code inspired by the Species Distribution Widget written by Tim Robertson and Dave Martin.
Images
The Yahoo! Image Search web service is used to find up to five images for the query term.
Documents
A Python script written by Yusdi Santoso is used to search Google Scholar. The script extracts references by screen scraping, since Google has not released any API for Google Scholar.
Related Projects
Rod Page wrote the original iSpecies taxonomically based search engine, which also uses web services. David Shorthouse wrote an iSpecies Clone, which uses JSON.
Source Code
The e-Species search engine was developed on an IBM-PC compatible machine running Ubuntu Linux 8.04 Hardy Heron and Python 2.5. The Google Scholar script needs the BeautifulSoup module. The e-Species source code is released under the terms of the GNU General Public License, and is available from Google Code.
News
- Version 1.00, 29th Jun 08: Initial public release
- Version 1.01, 6th Jul 08: Added spelling correction for a given name using the Yahoo! Spelling Suggestion service.
- Version 1.02, 10th Jul 08: Improved handling of synonym status; fixed a bug in spelling suggestion.
- Version 1.03, 11th Jul 08: Added a method to class COLSearch to check for the existence of a taxon name.
- Version 1.04, 31st Jul 08: Added automated tagging from Yahoo! Term Extraction for Wikipedia snippet.
- Version 1.05, 1st Aug 08: Added a method to class NCBISearch to return a list of external information resources for the search name.
- Version 1.06, 11th Aug 08: Added a function to strip out markup tags from Wikipedia snippet.
- Version 1.07, 5th Sep 08: Fixed a bug in handling Unicode characters in the author of a taxon name returned from CoL.
- Version 1.08, 9th Sep 08: Renamed class YahooSearchImage to YahooSearch and added functions spellingSuggestion (renamed to spellCheck) and termExtraction (renamed to termExtract) as new methods.
- Version 1.09, 21st Oct 08: Removed dependency on the Set module, using tuple instead, and fixed a problem with the display of image thumbnails from Yahoo search.
Todo
- Make use of synonyms in the searches, merging the results from searches using different names and presenting them together (as suggested by Rod Page in the iPhylo Blog).
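One possible shape for this planned feature is sketched below in Python 3. Since the feature is not implemented, everything here is hypothetical: the function name, the assumption that each search returns a list of hashable result identifiers, and the placeholder result values are all made up for illustration.

```python
def merge_results(results_by_name):
    """Merge result lists obtained under different names for the same taxon.

    Duplicates are collapsed, first-seen order is kept, and each result
    remembers which names returned it so they can be shown together.
    """
    merged = {}
    for name, results in results_by_name.items():
        for item in results:
            merged.setdefault(item, []).append(name)
    return merged


# Placeholder identifiers stand in for real search hits (assumption).
hits = {
    "Puma concolor": ["doc-a", "doc-b"],
    "Felis concolor": ["doc-b", "doc-c"],
}
merged = merge_results(hits)
```

Keeping the list of contributing names per result would let the page show, for each hit, whether it was found under the accepted name, a synonym, or both.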
Acknowledgements
Thanks to Rod Page for his implementation tips on the iSpecies Blog and overall inspiration, to Eduardo Dalcin for crash testing and pointing out several flaws, to Flavio Coelho and other members of the PyScience-Brasil discussion list for support and constructive comments, and to Douglas Soares de Andrade for providing patches and creating an svn trunk for the e-Species source code. I am also very grateful to Artur Bracet, owner and manager of the AtivaHost hosting service (where e-Species and several of my other websites are hosted), who for years has kindly supported my efforts.
Contact
Send comments and suggestions to Mauro J. Cavalcanti.