Introduction

This part of the manual describes functions that are expected to be released soon. Normally these functions will not work on your ES yet, but we sometimes post proposed documentation here for reference and feedback.

Add documents with REST api

Collection: http://example.com/documents/collection
    DELETE  Delete the entire collection

Document: http://example.com/documents/collection/item
    POST    Add or update a document
    PUT     Same as POST
    DELETE  Delete the addressed document

 

Add or update a document - POST

Uploads the local file test.png to the ES as test.png in the httpup collection.

	curl --data-binary @test.png http://example.com/documents/httpup/test.png

 

Download the addressed document - GET

	# curl -XGET http://example.com/documents/httpup/test.png

 

Delete the addressed document - DELETE

Delete the test.png from the httpup collection.

	curl -XDELETE http://example.com/documents/httpup/test.png

 

Delete a collection - DELETE

Delete the httpup collection.

	curl -XDELETE http://example.com/documents/httpup

Add delayed. A faster way to add by batch

When you add a document by just posting it, the document gets indexed immediately. Indexing is a costly process that is best done in batches. If you are going to add several documents to the same collection, it is better to add them using the add delayed function.

The add delayed function writes the document to disk, but does not do the expensive indexing until you close the collection.

	curl --data-binary @test.png -X ADDDELAYED http://example.com/documents/httpup/test.png
	curl --data-binary @other.png -X ADDDELAYED http://example.com/documents/httpup/other.png

	curl -XCLOSE http://example.com/documents/httpup
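
On a Unix-like system the same add delayed pattern can be scripted for a whole folder. The sketch below is only an illustration: the helper names run and add_delayed_folder and the DRY_RUN switch are made up for this example, and it assumes that file names contain only URL-safe characters (no spaces) and that only the top level of the folder is uploaded.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ES: push every regular file in a
# folder with ADDDELAYED, then close the collection once so that the
# expensive indexing runs a single time.
# Set DRY_RUN=1 to print the curl commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# add_delayed_folder FOLDER COLLECTION SERVER
add_delayed_folder() {
    folder=$1 collection=$2 server=$3
    for path in "$folder"/*; do
        [ -f "$path" ] || continue      # skip subfolders and unmatched globs
        name=${path##*/}                # file name without the folder part
        run curl --data-binary "@$path" -X ADDDELAYED \
            "http://$server/documents/$collection/$name" || return 1
    done
    # Closing the collection triggers indexing of everything added above.
    run curl -X CLOSE "http://$server/documents/$collection"
}
```

For example, DRY_RUN=1 add_delayed_folder ./images httpup example.com prints one ADDDELAYED command per file followed by a single CLOSE command, so you can review the requests before sending them.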

 

Example: Add a folder with a Windows bat file using Curl

It is easy to make a Windows bat file that uploads a folder to the ES. Copy the code into a file named pushit.bat and run it from the command line with the desired folder, collection and ES server as arguments:

pushit.bat example.

Usage: pushit.bat folder collection server

 

@echo off
setlocal enableDelayedExpansion


for /f "usebackq tokens=*" %%f in (`dir /b/s /a:-D %1`) do (
    set "url=%%f"
    set "url=!url: =%%20!"

    curl --data-binary "@%%f" "http://%3/documents/%2/!url!"
)

Requires curl for Windows, which is available for free here (you probably want the "Win32 - Generic" or "Win64 - Generic" version).

 

Example: Uploading a file using libcurl and C

 

#define _GNU_SOURCE /* For asprintf */
#include <stdio.h>
#include <stdlib.h> /* For free */
#include <curl/curl.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
  CURL *curl;
  CURLcode res;
  struct stat file_info;
  double speed_upload, total_time;
  FILE *fd;

  char *file;
  char *url;

  if(argc < 4) {
        fprintf(stderr,"Usage: httpput file server collection\n");
        return 1;
  }
  file= argv[1];
  if(asprintf(&url,"http://%s/documents/%s/%s",argv[2],argv[3],argv[1]) < 0) {
    perror("Building url");
    return 1; /* url is undefined on failure, can't continue */
  }


  fd = fopen(file, "rb"); /* open file to upload */
  if(!fd) {
    perror(file);
    return 1; /* can't continue */
  }

  /* to get the file size */
  if(fstat(fileno(fd), &file_info) != 0) {
    perror("fstat");
    return 1; /* can't continue */
  }

  curl = curl_easy_init();
  if(curl) {
    /* upload to this place */
    curl_easy_setopt(curl, CURLOPT_URL,url);

    /* tell it to "upload" to the URL */
    curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);

    /* set where to read from (on Windows you need to use READFUNCTION too) */
    curl_easy_setopt(curl, CURLOPT_READDATA, fd);

    /* and give the size of the upload (optional) */
    curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE,
                     (curl_off_t)file_info.st_size);

    /* enable verbose for easier tracing */
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L);

    res = curl_easy_perform(curl);
    /* Check for errors */
    if(res != CURLE_OK) {
      fprintf(stderr, "curl_easy_perform() failed: %s\n",
              curl_easy_strerror(res));

    }
    else {
      /* now extract transfer info */
      curl_easy_getinfo(curl, CURLINFO_SPEED_UPLOAD, &speed_upload);
      curl_easy_getinfo(curl, CURLINFO_TOTAL_TIME, &total_time);

      fprintf(stderr, "Speed: %.3f bytes/sec during %.3f seconds\n",
              speed_upload, total_time);

    }
    /* always cleanup */
    curl_easy_cleanup(curl);
  }

  fclose(fd); /* close the uploaded file */
  free(url);
  return 0;
}


Download the source here.

Compiling

Probably something like this:

gcc fileupload.c -o fileupload -lcurl -lssl -lcrypto -ldl

Please refer to the libcurl manual for more info.

Additional add parameters

Basic parameters

Parameter       Description
title           The document's title
acl_allow       Comma-separated list of users and groups that have access to the document
acl_denied      Comma-separated list of users and groups that shall not have access to the document
documenttype
documentformat

Example:

 

Will upload the local file test.png to the ES and set the title to "Test image".
 

	curl --data-binary @test.png "http://example.com/documents/httpup/test.png?title=Test%20image"

 

Attributes

Attributes are meta-information about a file, for example which email folder a particular email is stored in, or which project a file belongs to. You can use attributes to display meta-information and to filter the search results.


Attributes are key-value pairs. The key and the value are separated with =, and the pairs are separated with commas. The basic format is:

Key=value,Key2=some value2

You then have to URL encode this separately, leaving the commas between the pairs as they are. So the data you give to curl will be:

Key%3Dvalue,Key2%3Dsome%20value2

Example:

Will upload the local file test.png to the ES as test.png into the httpup collection and set the attribute “project” to “test” and “author” to “Runar Buvik”.
 

	curl --data-binary @test.png "http://example.com/documents/httpup/test.png?attributes=project%3Dtest,author%3DRunar%20Buvik"
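
Doing the encoding by hand is error-prone, so a small shell helper can build the value for you. This is a minimal sketch under stated assumptions: the function names urlencode and attributes_param are made up for this example, the input is assumed to be plain ASCII, and a comma inside a value gets encoded as %2C, which the ES may or may not accept.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ES.

# urlencode STRING
# Percent-encode everything except RFC 3986 unreserved characters.
# The body is a subshell, so LC_ALL and the variables do not leak out.
urlencode() (
    LC_ALL=C
    s=$1 out=
    while [ -n "$s" ]; do
        rest=${s#?}             # everything after the first character
        c=${s%"$rest"}          # the first character itself
        case "$c" in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# attributes_param KEY=VALUE [KEY=VALUE ...]
# Encode each key and value, join them with %3D, and separate the pairs
# with literal commas, as in the example above.
attributes_param() {
    out=
    for pair in "$@"; do
        key=${pair%%=*}         # text before the first =
        value=${pair#*=}        # text after the first =
        out=${out:+$out,}$(urlencode "$key")%3D$(urlencode "$value")
    done
    printf '%s\n' "$out"
}
```

The result can be used directly in the upload url, for example "http://example.com/documents/httpup/test.png?attributes=$(attributes_param project=test 'author=Runar Buvik')". The same urlencode helper can be used for the title parameter.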

 

Searching with the REST api

The ES comes with a well-designed API for returning search results so you can design your own user interfaces. For searching there are 3 different modes, depending on how you want to authenticate your users. All the different modes can be used on the same ES at the same time.

Anonymous

No authentication. You will only be able to search data stored in collections set to anonymous. This is the default if you don't have a user system.

Forward authentication

Users supply both a username and a password to your application. Your application forwards these to the ES, and the ES then handles authentication. For example, to make an iPhone search app you would prompt the user for their username and password, and then forward them to the ES. The ES then checks against the user system, for example Microsoft Active Directory, to verify that the username and password are correct.

Pre authenticated

You handle authentication yourself and only send the username to the ES. For example in a CRM system, different users have access to different documents, but the CRM system has its own user system with information about each user. There is no need to send passwords to the ES. Instead you send a special key that tells the ES that it can trust that authentication has already been performed. The ES will still handle document level security based on users and groups.

 

Api url

In the administrator interface there is a page named “Api info” that can help you generate the correct url for api calls.

Basic format

Replace query=example with query=word to search for other words.

Anonymous

Url: http://{hostname}/webclient2/api/anonymous/sd/2.1/search?query=example

Forward authentication

Url: http://{hostname}/webclient2/api/sd/2.1/search?query=example

Pre authenticated

Url: http://{hostname}/cgi-bin/dispatcher_allbb?query=example&user=Everyone&bbkey={secret key}
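
The three url formats can be wrapped in a small shell helper so that the rest of a script only has to name the mode. This is a sketch only: the function names search_url and es_search and the mode names anonymous, forward and preauth are made up for this example. It relies on curl's -G and --data-urlencode options to URL encode the parameter values, and it does not cover how credentials are passed in forward authentication mode.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ES.

# search_url MODE HOST
# Print the base search url for one of the three modes.
search_url() {
    mode=$1 host=$2
    case "$mode" in
        anonymous) printf 'http://%s/webclient2/api/anonymous/sd/2.1/search\n' "$host" ;;
        forward)   printf 'http://%s/webclient2/api/sd/2.1/search\n' "$host" ;;
        preauth)   printf 'http://%s/cgi-bin/dispatcher_allbb\n' "$host" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# es_search MODE HOST QUERY [NAME=VALUE ...]
# Run a search. Extra NAME=VALUE arguments are sent as url parameters.
es_search() {
    mode=$1 host=$2 query=$3
    shift 3
    url=$(search_url "$mode" "$host") || return 1
    # Rewrite each remaining NAME=VALUE argument as "--data-urlencode NAME=VALUE".
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" --data-urlencode "$1"
        shift
        n=$((n - 1))
    done
    # -G makes curl append the encoded data to the url as a query string.
    curl -G --data-urlencode "query=$query" "$@" "$url"
}
```

For example, es_search anonymous example.com "annual report" sends an anonymous search, and es_search preauth example.com example user=Everyone "bbkey=$BBKEY" sends a pre authenticated one, where BBKEY is a variable you set to your secret key.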


Additional url parameters

navmenucfg

Base 64 encoded config.

collection

Limit hits to a specific collection.

outformat

Set to "opensearch" to get output xml in the Open Search format or "json" to get JSON output.

maxhits

Maximum number of hits to return in a single response. If there are more hits than can be returned in one response, use “page” to get the next set. Also see results paging below.

Set to 0 to only get the result_info header, without any results or navigation menu.

page

See results paging below.

Results paging

Results are returned as pages of maxhits results each. For example, if a query has 49 hits and you use a maxhits of 20, a basic api call will give you the first 20 hits. You can then set page=2 to get results 21-40, and page=3 to get results 41-49.
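
The paging arithmetic can be sketched in shell as follows. The function names page_count and page_range are made up for this example. The total would normally be taken from the TOTAL element of a response, and maxhits must be greater than 0 here, since a maxhits of 0 has the special meaning described above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ES.

# page_count TOTAL MAXHITS
# Number of pages needed to show TOTAL hits (ceiling division).
page_count() {
    [ "$2" -gt 0 ] || return 1
    echo $(( ($1 + $2 - 1) / $2 ))
}

# page_range TOTAL MAXHITS PAGE
# Print "first-last", the result numbers found on the given page.
# Fails if the page is outside the result set.
page_range() {
    total=$1 maxhits=$2 page=$3
    [ "$maxhits" -gt 0 ] || return 1
    [ "$page" -ge 1 ] || return 1
    first=$(( ($page - 1) * $maxhits + 1 ))
    [ "$first" -le "$total" ] || return 1
    last=$(( $page * $maxhits ))
    if [ "$last" -gt "$total" ]; then
        last=$total             # the last page may be shorter
    fi
    echo "$first-$last"
}
```

With the numbers from the example, page_count 49 20 gives 3, and page_range 49 20 3 gives 41-49.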

XML result

The results from an API call are returned as XML.

Basic elements

RESULT_INFO

Info about the results.

Element             Description
TOTAL               Total number of results found
SPELLCHECKEDQUERY   If the query was misspelled, a suggestion may be given here
QUERY               Query as typed by the user. Can be used to show the user what they searched for
TIME                Total time used
FILTERED            Number of results that were removed by filters
SHOWABAL            Number of results returned. May be maxhits or less
CASHE               1 if the result came from the internal cache, 0 otherwise
NROFSEARCHNODES     Number of backend nodes that were involved in answering your query
XMLVERSION          The version number of the XML. This is not the same as the API version

RESULT

A single result.

Element             Description
TITLE               Title of the result
URL                 Uniform Resource Locator
URI                 Uniform Resource Identifier. Do not use
FULLURI             Uniform Resource Identifier. Do not use
Attributes          List of attributes. See below
VID                 Virtual id. A unique identifier
DOCUMENTLANGUAGE    Written languages of the underlying document. Currently not in use
DOCUMENTTYPE        Type of the underlying document
POSISJON            Position in the result set
filetype            Filetype of the underlying document
icon                What icon to display
THUMBNAIL           Link to the thumbnail. You must prefix it with the address of the responding server
THUMBNAILWIDTH      Thumbnail width
THUMBNAILHEIGHT     Thumbnail height
DESCRIPTION_LENGTH  Length of the description
DESCRIPTION         Description to show the user. May be plain text or an html table
CRC32               CRC32 of the underlying document
TERMRANK            Dynamic ranking describing how well this query matches this result
POPRANK             Static ranking
ALLRANK             Merge of dynamic and static ranking
NROFHITS            Number of times the query occurred in the result
RESULT_COLLECTION   Name of the collection where the result is stored
TIME_UNIX           UNIX timestamp of last change
TIME_ISO            ISO time of last change
CACHE               Info needed to retrieve a cached version of the document

 

Copyright © Searchdaimon AS. All rights reserved.