When to use and not to use
The ES comes as an integrated package with software and operating system. It supports extracting text from many file formats, can automatically generate thumbnails from many image types, and offers many other functions. For example, here is what a search hit for an image can look like out of the box.
No pun intended. We just happened to have that logo in the test set and think the search result looks good.
However, all this comes with some overhead, so the ES may not be right for your project. At least the following points are worth considering.
Overkill for small sites
If you have a relatively small, publicly available website, you may want to consider a simple PHP or CGI script, or a fully hosted solution, instead of running another (virtual) server.
The ES is not based on tables and fields like an SQL database, so it can't mimic an SQL table. Instead, the ES uses a flexible concept of documents and document attributes. This document system is great for indexing real documents and websites, but not for maintaining table relations.
If you need to mimic the fields in a database, you may want to look into Apache Solr instead.
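To make the difference between documents with attributes and table rows more concrete, here is a minimal Python sketch. It is only an illustration of the general idea: the dictionaries, the attribute names and the `search` helper are all hypothetical and do not reflect how the ES stores or queries data internally.

```python
# Hypothetical illustration only: these dicts are NOT the ES's storage format.
# Each document carries its text plus a free-form set of attributes.
# Unlike rows in an SQL table, documents need not share the same attributes.
documents = [
    {"text": "Quarterly report for the sales team",
     "attributes": {"filetype": "pdf", "author": "alice"}},
    {"text": "Company logo",
     "attributes": {"filetype": "png", "width": "640"}},
    {"text": "Welcome to our website",
     "attributes": {"filetype": "html"}},
]


def search(docs, word, **required_attributes):
    """Return documents whose text contains `word` and whose attributes match."""
    hits = []
    for doc in docs:
        if word.lower() not in doc["text"].lower():
            continue
        attrs = doc["attributes"]
        # A document lacking a required attribute simply does not match.
        if all(attrs.get(key) == value
               for key, value in required_attributes.items()):
            hits.append(doc)
    return hits


pdf_reports = search(documents, "report", filetype="pdf")
```

Note what is missing from this model: there is no fixed schema and no way to join one document against another, which is why it suits files and web pages better than relational data.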
Automatic distribution and replication
While the ES does support distribution and clustering, setting it up is something of an undocumented black art. If you need a system where the indexed data is split up across many servers, you may want to look into ElasticSearch instead. However, be aware that ElasticSearch doesn't support any crawlers or data converters, so you have to convert all data to text yourself and push it to ElasticSearch as JSON.
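As a rough idea of what "pushing JSON" involves, the Python sketch below builds an HTTP request that would index one already-converted document. The server address, the index name `intranet`, the field names `title` and `content`, and the `build_index_request` helper are assumptions made up for this example. The `/<index>/_doc/<id>` URL layout matches recent ElasticSearch releases; older versions use a different path, so check the documentation for your version.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, title, text):
    """Build (but do not send) an HTTP request that indexes one document.

    The caller is responsible for having extracted `text` from the
    original file, since no converter does that step here.
    """
    body = json.dumps({"title": title, "content": text}).encode("utf-8")
    # Assumed URL layout: PUT /<index>/_doc/<id>
    url = f"{base_url.rstrip('/')}/{index}/_doc/{doc_id}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_index_request(
    "http://localhost:9200",  # assumed local test server
    "intranet",               # hypothetical index name
    "1",
    "Holiday policy",
    "Plain text extracted from the original file.",
)

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Everything before the `json.dumps` call, such as crawling the source and turning PDF or Word files into plain text, is work you would have to do yourself.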
Automatic clustering support is something we are working on.