VMware Player installation

This is a guide to installing the free version of Searchdaimon ES on the free VMware® Player.

Step 1: Check system requirements

Searchdaimon ES runs on a 64-bit operating system. To run 64-bit operating systems under VMware, a 64-bit CPU with virtualization enabled (Intel VT or AMD-V) is required. Some PC vendors require you to enable virtualization in the BIOS. Some older 64-bit processors don't support virtualization.
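
On a Linux host, one way to check these requirements is to read the CPU flags from /proc/cpuinfo. The following is a minimal sketch and not part of Searchdaimon ES; the function name cpu_capabilities is made up for illustration. The flags only show what the CPU supports, so virtualization may still have to be enabled in the BIOS.

```python
def cpu_capabilities(cpuinfo_text):
    """Return (is_64bit, has_virtualization) from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Each logical CPU has a line such as "flags : fpu vme ... lm ... vmx"
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    is_64bit = "lm" in flags                          # lm = long mode (64-bit)
    has_virtualization = bool(flags & {"vmx", "svm"})  # vmx = Intel VT, svm = AMD-V
    return is_64bit, has_virtualization
```

On the host itself this could be called as cpu_capabilities(open("/proc/cpuinfo").read()).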

Step 2: Short download and installation instructions

  • Go to the main download area.
  • Download Searchdaimon_ES_wmvare.zip, and unpack it. You will see the files searchdaimon_es.ovf, searchdaimon_es-disk1.vmdk and searchdaimon_es-disk2.vmdk in the same folder.
  • Open VMware Player and select “Open a Virtual Machine”.
  • In the dialog box, browse to the folder where you stored the .ovf and .vmdk files. Select “Open Virtual Machine Format Images (*.ovf, *.ova)”. Open the .ovf file.
  • VMware Player will start to import the virtual machine.
  • Select the new Searchdaimon ES machine from the menu on the left. Click “Edit virtual machine settings”.
  • On the Hardware tab, select the network adapter. Disable “Connect at power on”.
  • The ES needs two network cards. To add the second, click the Add button and select “Network Adapter”.
  • Click Next.
  • Set the connection type to bridged. Also verify that the device status is set to “Connect at power on”. Click Finish to go back to the virtual machine settings.
  • Optional: Select Memory to edit the amount of RAM the ES can use.
  • Click OK to save.
  • Start the Searchdaimon ES.
  • If you get an error message about updating VMware Tools, just click “Remind Me Later” for now.

Next: setup and configuration of Searchdaimon ES.

Step 2: Download and installation with screenshots

  • Go to the main download area.
  • Download Searchdaimon_ES_wmvare.zip, and unpack it. You will see the files searchdaimon_es.ovf, searchdaimon_es-disk1.vmdk and searchdaimon_es-disk2.vmdk in the same folder.
  • Open VMware Player and select “Open a Virtual Machine”.
  • In the dialog box, browse to the folder where you stored the .ovf and .vmdk files. Select “Open Virtual Machine Format Images (*.ovf, *.ova)”. Open the .ovf file.
  • Create a full clone.
  • VMware Player will start to import the virtual machine.
  • Select the new Searchdaimon ES machine from the menu on the left. Click “Edit virtual machine settings”.
  • On the Hardware tab, select the network adapter. Disable “Connect at power on”.
  • The ES needs two network cards. To add the second, click the Add button and select “Network Adapter”. Then click Next.
  • Set the connection type to bridged. Also verify that the device status is set to “Connect at power on”. Click Finish to go back to the virtual machine settings.
  • Optional: Select Memory to edit the amount of RAM the ES can use.
  • Click OK to save.
  • Start the Searchdaimon ES.
  • If you get an error message about updating VMware Tools, just click “Remind Me Later” for now.

Next: setup and configuration of Searchdaimon ES.

From console

How to manually configure your IP-address from the console

Log in as user 'setup'. You don't need a password.

Select "Network configuration".

Select eth1.

Uncheck DHCP and enter your IP address, netmask and gateway.
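
Before typing the values into the console, it can help to confirm that the gateway is reachable from the address and netmask you plan to use. The sketch below uses Python's standard ipaddress module; the function name gateway_in_subnet and the sample addresses are illustrative assumptions, not values from the ES.

```python
import ipaddress

def gateway_in_subnet(ip, netmask, gateway):
    """Return True if the gateway lies inside the network defined by ip/netmask."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return ipaddress.ip_address(gateway) in interface.network
```

For example, gateway_in_subnet("192.168.1.10", "255.255.255.0", "192.168.1.1") is true, while a gateway of 192.168.2.1 with the same netmask is not.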

Increasing the size of the virtual disk in VMware

You may increase the size of the ES virtual data disk.

If you have ES 2.3 or newer, all you have to do is increase the size of the second hard drive in VMware. See VMware KB article http://kb.vmware.com/kb/1004047 for how this is done on your platform. Then reboot, and the ES will detect and handle the rest automatically.

If you don't see two disks please contact support for instructions.

Exchange with Outlook 2007 or 2010 in Windows

When you click on a URL from an Exchange connector on the ES search result page, the email should open in Outlook. This is done by crafting a special URL in the format "outlook:000000003eb852348...". Sometimes Outlook doesn't register the "outlook:" URL handler correctly. If this is the case, nothing will happen when you click on the link (or the browser will try to open a web page).

We have created a small program that fixes this and does nothing else. It can be downloaded below. Please read the README.txt for instructions.

Download fix for URL scheme in Windows: Outlook2007Scheme.zip

The program creates and sets the registry key HKEY_CLASSES_ROOT\outlook\shell\open\command to the correct path for outlook.exe. Full source code is included.

Microsoft Exchange

Preparing the Exchange server

To crawl Exchange, you normally have to change two things on the Exchange server.

  • WebDAV. The ES uses WebDAV to connect to Exchange, so WebDAV has to be installed. This is the same access method that Entourage on Mac uses. A good guide on how to set it up is available here: Accessing Exchange 2007 from your Apple Macintosh.
  • The ES needs the user right "Receive As" on all mailboxes it is to crawl. You can either set this up for each user, or set the permission on the mailbox store to affect all users.

Setting up the ES part

Go to add manually in the Collections/Resources menu and select the Exchange connector.

Add exchange collection

  • Select or add a username for the crawler to use when accessing the Exchange server.
  • Set the Exchange server address. Normally this is the hostname or IP address of the Exchange server.
  • Select which users' mailboxes to crawl. This list is populated from your Active Directory, and thus requires a working user system setup.


Overview

Collections/Resources -> Overview

Overview is where you check status and configure your collections. Collections are grouped by type of crawler (SMB, Exchange etc.).

To recrawl a collection, select Crawl now. The collection will immediately start to recrawl if the Collection manager is allowed to crawl in this time interval (see Collection manager). To configure the collection, select Manage. You can also see some statistics there. Test-collections are managed from Crawler extensions (under Connectors). Remote installed crawlers generate Pushed collections, and should be managed from the remote host.

Manage

Edit collection



In the Edit collection tab you can edit your collection settings in the same way you configured it when creating the collection. Details here.

Advanced management



Under Advanced management you'll find the following functions:

  • Force full crawl now: Deletes downloaded content and starts all over again.
  • Test crawl: You can choose to download only the first documents. This is useful when testing a crawler without running a full crawl.
  • Enable anonymous search: Enables search for everyone. Visit
    http://<your Searchdaimon ES address>/public
  • Share alias: For Windows shared folders. Displays this alias in the address field of the documents instead of an IP address or a hostname.

Result customization



  • Summary: The short summary presented for each document in the search results is called a snippet. When "Generate snippet" is selected, our software extracts an abstract of the text according to the query.
  • Cache: Choose whether the results should show a link to our cached version of the document.
  • Ranking (advanced): You can experiment with how a document is ranked by adjusting scores according to where in the document the query words are matched.

Statistics



Shows how many documents are crawled every second.

 

VMware ESX installation

This is a guide to downloading and installing the free version of Searchdaimon ES on a VMware® ESX Server. There is also a video tutorial showing the complete installation process.

Step 1: Check system requirements

  • VMware ESX Server
  • Support for 64-bit OS
  • 30 GB free disk space
  • We recommend 2048 MB free memory for best performance

This version of Searchdaimon ES comes as a Virtual Appliance, and has been tested on VMware ESX Server. It may also work on other virtualization platforms supporting the Open Virtualization Format. VMware ESX Server and VMware Infrastructure Client can be downloaded from www.vmware.com.

Searchdaimon ES runs on a 64-bit operating system. To run 64-bit operating systems under VMware, a 64-bit CPU with virtualization enabled (Intel VT or AMD-V) is required. Some PC vendors require you to enable virtualization in the BIOS. Some older 64-bit processors don't support virtualization.

Step 2: Download and installation

  • Connect to your ESX-server with VMware Infrastructure Client
  • Select File->Virtual Appliance->Import
  • Select "Import from URL", and copy and paste the URL to Searchdaimon_ES_wmvare.ovf from the main download area at searchdaimon.com
  • Select Next and follow the instructions from the Import Virtual Appliance Wizard
  • Searchdaimon ES will now appear in the list of virtual machines
  • Select Searchdaimon ES, and select "Power on the virtual machine"

Next: setup and configuration of Searchdaimon ES.

Find the IP-address

  • If you have DHCP, Searchdaimon ES will already have been allocated an IP address. You can always see the current IP address on the operating system console.
  • If you don't have DHCP, you can configure a static IP address using the operating system console. Log in as user 'setup' (no password), and set the IP address for eth1. (detailed guide)

Tip: If the console screen is black, the screensaver has been activated. Click your mouse pointer in the middle of the screen, and move it around. Press Ctrl-Alt to exit the console.

The ES console visible from a VirtualBox session. The IP-address is visible on the third line.

Opensearch v1.1

Method search

Parameter    Description
query        User search query
start        Result offset (start at document n)

Example:

http://demo.searchdaimon.com/webclient2/api/opensearch/1.1/search?query=test&start=1

Username: demo
Password: 1234Asd
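
A client can assemble this URL from the two documented parameters instead of concatenating strings by hand, which also takes care of URL encoding. The following is a minimal sketch; the function name opensearch_url is an illustrative assumption, and since this page does not describe how the credentials are transmitted, the sketch only builds the URL.

```python
from urllib.parse import urlencode

def opensearch_url(base, query, start=1):
    """Build a search URL for the OpenSearch 1.1 method described above.

    base is the address of the web client, for example
    "http://demo.searchdaimon.com/webclient2".
    """
    params = urlencode({"query": query, "start": start})
    return f"{base.rstrip('/')}/api/opensearch/1.1/search?{params}"
```

Calling opensearch_url("http://demo.searchdaimon.com/webclient2", "test") reproduces the example URL above.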

Tip:

You can let your users subscribe to search results as RSS feeds by adding a link to api/opensearch/1.1/search?query=test in the head section of your result page.
Example:

<link rel="alternate"
        type="application/rss+xml"
        title="test - Searchdaimon search"
        href="api/opensearch/1.1/search?query=test" />  
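
If the result page is generated by a script, the link element has to be built from the user's query, and the query must be escaped both in the URL and in the HTML attributes. A minimal sketch, assuming a hypothetical helper named rss_link_tag:

```python
from html import escape
from urllib.parse import urlencode

def rss_link_tag(query):
    """Return a <link> element pointing at the RSS feed for one query."""
    href = "api/opensearch/1.1/search?" + urlencode({"query": query})
    return (
        '<link rel="alternate" type="application/rss+xml" '
        f'title="{escape(query)} - Searchdaimon search" '
        f'href="{escape(href)}" />'
    )
```

For the query "test" this yields the same attributes as the example above, written on one line.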

Getting started

To log on, enter the IP-address of Searchdaimon ES (http://{IP-address}/) in your favorite browser. The address can be obtained from your system administrator.


Your username and password should be the same as when logging on to your computer.

Tip: Adding this IP to trusted sites in Internet Explorer enables you to open MS Office documents in MS Office when clicking on them in your browser.

Introduction

Searchdaimon ES is a free enterprise search solution, suitable for corporate use, adding search to web pages, OEM, or all of the above. If you haven’t tested ES yet, you should try the demo at /pages/demo/ and watch a short video tutorial to get a feel for what it is about: /pages/demo/introduction_video/.

Introduction

It is very easy to write your own server-side connector. One of the strengths of the ES is the ability to write your own connectors in Perl, which run directly on the ES server. These connectors only need to download the data from the source; all data conversion is then handled by the ES.

Scan

Collections/Resources -> Scan

SMB supports scanning for Windows shared folders. This can be a good alternative if you don't want to add your shares manually.



Enter the IP address of the server to be scanned, or a range to scan more than one computer (e.g. "192.168.1.0/24"). Ping scanning is much faster if you are scanning a network, but some computers may not answer these types of ping requests.
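
To see how many machines a range such as "192.168.1.0/24" covers before starting a scan, the range can be expanded with Python's standard ipaddress module. This is only an illustration of the CIDR notation; how the ES itself expands the range is not documented here, and the function name scan_targets is made up.

```python
import ipaddress

def scan_targets(spec):
    """Expand a single address or a CIDR range into a list of host addresses."""
    network = ipaddress.ip_network(spec, strict=False)
    if network.num_addresses == 1:
        # A single address such as "192.168.1.5"
        return [str(network.network_address)]
    # hosts() leaves out the network and broadcast addresses
    return [str(host) for host in network.hosts()]
```

A /24 range expands to 254 host addresses, from .1 to .254.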

After the scan has completed, it's easy to add new collections.

 

VirtualBox installation

This is a guide to installing the free version of Searchdaimon ES on the free Oracle VirtualBox®.

Step 1: Check system requirements

  • VirtualBox (free to download from VirtualBox.org)
  • 5 GB free disk space
  • Windows, OS X, Linux or Solaris operating system
  • Should run on most operating systems and CPU platforms. Both 32-bit and 64-bit versions of Windows, OS X, Linux and Solaris are supported.

Searchdaimon ES is a 64-bit system. If you have a 64-bit CPU you must enable virtualization (Intel VT or AMD-V). Most PC vendors require you to enable virtualization in the BIOS. Some older 64-bit processors don't support virtualization.

If you have a 32-bit CPU you can skip this.
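
The disk space requirement can be checked from a script before you download and import the appliance. A minimal sketch using Python's standard shutil module; the function name has_free_space is illustrative, and the path should point at the folder where VirtualBox stores its virtual machines on your host.

```python
import shutil

def has_free_space(path, required_gb=5):
    """Return True if the filesystem holding `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3
```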

Step 2: Short download and installation instructions

  • Go to the main download area.
  • Download Searchdaimon_ES_vbox.zip, and unpack it. You will see the files searchdaimon_es.ovf, searchdaimon_es-disk1.vmdk and searchdaimon_es-disk2.vmdk in the same folder.
  • Open VirtualBox and select Import Appliance from the File menu.
  • Click the Choose button and use the dialog box to browse to the folder where you stored the searchdaimon_es.ovf and searchdaimon_es-disk1.vmdk files. Open the searchdaimon_es.ovf file.
  • Click Next.
  • Optional: Select RAM to edit the amount of RAM the ES can use.
  • Click Finish.
  • VirtualBox will start to import the virtual appliance.
  • Select Searchdaimon ES and start it.
  • Optional: If you get an error about Auto capture, read it and click OK.
  • Optional: If you get an error about VT-x/AMD-V, you must enable virtualization in your BIOS.

Next: setup and configuration of Searchdaimon ES.

Step 2: Download and installation with screenshots

  • Go to the main download area.
  • Download Searchdaimon_ES_vbox.zip, and unpack it. You will see the files searchdaimon_es.ovf, searchdaimon_es-disk1.vmdk and searchdaimon_es-disk2.vmdk in the same folder.
  • Open VirtualBox.
  • Select Import Appliance from the File menu.
  • Click the Choose button.
  • Use the dialog box to browse to the folder where you stored the searchdaimon_es.ovf and searchdaimon_es-disk1.vmdk files. Open the searchdaimon_es.ovf file.
  • Click Next.
  • Optional: Select RAM to edit the amount of RAM the ES can use.
  • Click Finish.
  • VirtualBox will start to import the virtual appliance.
  • Select Searchdaimon ES and start it.
  • Optional: If you get an error about Auto capture, read it and click OK.
  • Optional: If you get an error about VT-x/AMD-V, you must enable virtualization in your BIOS.

Next: setup and configuration of Searchdaimon ES.

From webadmin

Configuration -> Settings -> Network configuration

You can change the server's network settings from Configuration -> Settings -> Network configuration.

Searchdaimon v2.1

Method search

Parameter    Description
query        User search query
start        Result offset (start at document n)

Example:

http://demo.searchdaimon.com/webclient2/api/sd/2.1/search?query=test&start=1

Username: demo
Password: 1234Asd

Searching

Searching is of course the most important aspect of ES. Start by writing one or more words that define what you are looking for. You can then drill down with filters and sorting.

  • Typing '-' in front of a word excludes all documents containing this word.
  • Inflections are added automatically.
  • Adding '"' before and after a word or a sentence will search for the exact sentence.
  • Adding '|' before a word means OR. For example, apple |oranges returns documents containing either apples or oranges.
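
When queries are generated by a program, for instance before being sent to the search API, the operators above can be combined with a small helper. This is a sketch under the assumption that the operators are simply written into the query string as described; the function name build_query and its parameters are illustrative.

```python
def build_query(words=(), phrase=None, any_of=(), exclude=()):
    """Compose a query string using the '-', '"' and '|' operators."""
    parts = list(words)
    if phrase:
        parts.append(f'"{phrase}"')            # exact sentence
    parts += [f"|{word}" for word in any_of]   # OR terms
    parts += [f"-{word}" for word in exclude]  # excluded terms
    return " ".join(parts)
```

build_query(words=["apple"], any_of=["oranges"]) gives the example query apple |oranges.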

 

Filtering and sorting

You can restrict your search to type of document, data source, date and meta-information such as contacts, customers, sales and projects. You can also sort on date or relevancy.

Filter menu

In the above picture, you can see the results of the query "enterprise search". The search has been further broken down to only include documents from the "Sales" collection. You can also filter the search to only include documents from a file type like Excel or PowerPoint, or from a date interval like this year or older than two years.

Collections

Collections are sources of documents. This might be shared files, your e-mail, or a CRM-system. Collections will appear as tabs in the search result. Clicking on a tab will filter out all other collections.

Suggest

Searchdaimon ES suggests query words while you are writing. The words proposed are fetched from documents the user has access to, so that domain and product names, which you can't find in traditional dictionaries, are included.


Spell checking

The ES can propose correctly spelled words if you have misspelled a word. As for Suggest, the dictionary is built from indexed documents.


Inflections and stemming

Searching for "car" also shows documents containing "cars", etc.


Next: Also see the demo for more examples.

Different versions, virtual appliance

ES comes in four different versions: one for installing on VMware ESX Server, one for installing on the standalone VMware Player, one for installing on Sun/Oracle VirtualBox, and one general version for all OVF-compatible platforms. The VMware versions require a 64-bit CPU with virtualization enabled. The VirtualBox version can be run on a 32-bit CPU.

All are available from the download section: Download ES

The ES crawler API

The ES connector API requires you to make a Perl package that exports at least the subroutine crawl_update().

crawl_update() is called at regular intervals to see if any new data is available. It should inspect its source data and determine whether new data has arrived. If so, it uses add_document() to add it to the search index.

The data added to the ES is always referred to as a “document”, regardless of type and source.
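
The control flow of crawl_update() can be modelled outside the ES to make the contract concrete. Real connectors are written in Perl and run on the ES server; the Python sketch below only mimics the pattern of checking for new data and adding it. FakeIndex and the item layout are hypothetical stand-ins, not part of the ES API.

```python
class FakeIndex:
    """Hypothetical in-memory stand-in for the ES search index."""

    def __init__(self):
        self.docs = {}

    def document_exists(self, url):
        return url in self.docs

    def add_document(self, **doc):
        self.docs[doc["url"]] = doc


def crawl_update(index, source_items):
    """Add only the items that are not indexed yet; return how many were added."""
    added = 0
    for item in source_items:
        if index.document_exists(item["url"]):
            continue  # nothing new for this item
        index.add_document(url=item["url"], title=item["title"],
                           content=item["content"])
        added += 1
    return added
```

Calling crawl_update() twice with the same items adds them once; the second call finds nothing new, which is the behaviour a periodically called connector needs.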

Open the administration panel

Use a web browser to navigate to your administration panel at http://{IP-address}/admin. The username is 'admin' and the password is 'water66'. Follow the instructions from the First time wizard. This will allow you to change network settings and add the primary user system.

Add manually

Collections/Resources -> Add manually

 

Add new collection.


When adding a new collection, you need to specify which crawler to use. Crawlers included in the standard version:

  • SMB: Microsoft's file sharing protocol. Use this for shared folders in Windows.
  • Exchange: Microsoft Exchange e-mail server.
  • Intranet: Web servers located on the intranet.
  • SuperOffice: SuperOffice CRM.
  • ExchangePublic: Public Folders in Microsoft Exchange.
  • Sharepoint: Microsoft Sharepoint.

 



 

Specifying detailed information

The details needed can vary between crawlers. For instance the SMB crawler:

  • User system: The crawler fetches data about user access, and has to know what user system the collection belongs to. For most companies, this will be Microsoft Active Directory. Some systems, like SuperOffice, use their own user systems. In such cases, the user system has to be added before the collection, under User systems.
  • Authentication: Specify username and password for a user who has access to read all documents you want to search.
  • User prefix: Sometimes an installation of the system we are connecting to needs us to add a prefix to our usernames. For instance a domain name: "DOMAIN\".
  • What to crawl: Address of the resource. The format varies between crawlers.
  • Crawling behaviour: Specify how often we should look for document updates. For some crawlers, every file has to be checked. You should therefore avoid crawling too often. Once per day is usually good.

When everything is filled in, click "Add collection". The collection will now appear in Overview, and will immediately start to fetch documents if we are allowed to crawl at this time of day (see Collection manager).

 

Add primary usersystem

The ES can be integrated with Microsoft Active Directory for authentication and authorization of end users. We currently do not support any other user systems, or the possibility to add users to the ES directly. But you can use the ES without a user system to provide public search functions, like search on a website.

For legacy reasons you must set up an Active Directory as the primary user system, even if you don't have one. If you don't have an Active Directory, please see below for how to get through this step.

Plain v1.0

Method suggest

Parameter    Description
prefix       User search query prefix

Example:

http://demo.searchdaimon.com/webclient2/api/plain/1.0/suggest?prefix=s

Username: demo
Password: 1234Asd

Method cache

Returns local copy of requested document.

Parameter    Description
document     Internal document ID
collection   The collection the document is in.
host         Hostname of cache server
time         Unix time for when the signature was generated.
signature    Signature, used to check that all parameters are valid.

Example:

http://demo.searchdaimon.com/webclient2/api/plain/1.0/cache/Administration2/2195/?signature=580553829&time=1247055462

Where Administration2 is the collection parameter, and 2195 is the document parameter.
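
The layout of the example can be captured in a small helper that places the collection and document in the path and the remaining values in the query string. This is a sketch based only on the example URL above: how the signature is computed is not described here, so it is passed in as given, and the host parameter, which does not appear in the example, is left out. The function name cache_url is illustrative.

```python
from urllib.parse import quote, urlencode

def cache_url(base, collection, document, signature, time):
    """Build a cache URL laid out like the example above."""
    path = (f"{base.rstrip('/')}/api/plain/1.0/cache/"
            f"{quote(str(collection))}/{int(document)}/")
    return path + "?" + urlencode({"signature": signature, "time": time})
```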

Searching without logging in (anonymous / public search)

Searching without logging in (anonymous user)
When you want documents to be available to the public, you can enable the anonymous user. This way you don't have to log in to access the search interface.

In the administration panel, go to Overview. Select "Manage" on the collection you want to be publicly available. Go to the "Advanced management" tab, and scroll down to the "Edit settings for Administration" section. Under "Anonymous collection", select "Enable anonymous search", and click "Submit changes".

The changes will take effect immediately. To access the public interface, add "/public" to your url: http://<ip-address>/public

32-bit cpu issues

Searchdaimon ES is a 64-bit system. If you have a 64-bit CPU you must enable virtualization (Intel VT or AMD-V). Most PC vendors require you to enable virtualization in the BIOS. Some older 64-bit processors don’t support virtualization.

If you only have a 32-bit CPU, your only option is to use VirtualBox.

Example: A Twitter connector using the Twitter json API and Perl

This example will show you how to make a custom connector for the ES. We will be crawling Twitter, a public data source, so we don’t have to worry about authentication and data permissions.

Twitter has an HTTP API where you can see the latest tweets for a user. This is done by crafting a special URL in the format http://twitter.com/statuses/user_timeline/{USER}.{FORMAT}. For example, CNN Breaking News has the Twitter page http://twitter.com/cnnbrk, so its RSS and JSON feeds are available from URLs of this format with cnnbrk as {USER}.

Getting started

Start by selecting the “Connectors” section in the ES admin. Then create a new connector by clicking on the “Create a new connector” button.

The new connector will be issued a default name. So our first step is to change this to something reasonable. At the settings and parameters tab, set name to “MyTwitter” and click the “Update” button.

To make this connector as general as possible, we are going to take the Twitter screen name to index as a parameter. To do so, we must first go to the settings and parameters tab and add a parameter called “screen name”.

At the configure test collection tab, set screen name to the twitter screen name you want to crawl. In this case “cnnbrk”.


The code

Then go to the edit source tab, where we will write the actual source code. The ES will have filled in some example code, but we don’t need that now. So start by removing all source code in crawl_update() so you get a clean routine like this:

sub crawl_update {
    my (undef, $self, $opt) = @_;

};

The $opt variable is a hash reference containing all input options. For example the screen name we configured above will be at $opt->{'screen name'} . You can see the content in $opt by adding the following line to crawl_update().

warn "Options received: ", Dumper($opt), "\n"; 

At this point it's smart to test that the framework is working as expected. Update crawl_update() so you get:

sub crawl_update {
    my (undef, $self, $opt) = @_;

    warn "Options received: ", Dumper($opt), "\n"; 

};

Then click the save and run button below the code window.

The errors about mysql and bbdn can safely be ignored. You are not using threads or a persistent bbdn connection.


Implementing

Back at the edit source window we can start to implement the Twitter connector.

We will be using the CPAN modules JSON::XS, Date::Parse and LWP::Simple in this connector. So first we add references to them at the top of the source, just below the other "use" and "our" statements. We get:

use Crawler;
our @ISA = qw(Crawler);

use LWP::Simple qw(get);
use JSON::XS qw(from_json);
use Date::Parse;

Then we will modify crawl_update() to crawl Twitter.

We build the URL to the JSON feed, then use get() and from_json() to download and decode it.

my $jurl = "http://twitter.com/statuses/user_timeline/" . $opt->{'screen name'} . ".json";

my $t = from_json(get($jurl));

Finally we loop through the JSON data, format it correctly, and submit it to the ES.

    for my $usr (@{$t}) {
        my $content = $usr->{text};
        my $url = "http://twitter.com/" . "$usr->{user}{screen_name}/statuses/$usr->{id}";

        next if $self->document_exists($url, 0);

        my $substr = substr($content, 0, 50);
        my $title = "$usr->{user}{name}: $substr ..";
        my $created_at = str2time($usr->{created_at});

        
        warn "Adding $title";
        $self->add_document((
            content   => $content,
            title     => $title,
            url       => $url,
            type      => "tapp",
            acl_allow => "Everyone",
            last_modified => $created_at,
       ));
    }
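
The mapping from one feed entry to a document can be prototyped and unit tested outside the ES before it is pasted into the connector. The Python sketch below mirrors the Perl loop body above; it assumes the created_at field has the form "Wed Aug 27 13:08:45 +0000 2008", which should be checked against the actual feed, and the function name tweet_to_document is illustrative.

```python
from datetime import datetime

def tweet_to_document(tweet):
    """Map one decoded feed entry to the fields passed to add_document()."""
    user = tweet["user"]
    url = f"http://twitter.com/{user['screen_name']}/statuses/{tweet['id']}"
    # Title: user name plus the first 50 characters of the text
    title = f"{user['name']}: {tweet['text'][:50]} .."
    # Assumed date format, e.g. "Wed Aug 27 13:08:45 +0000 2008"
    created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    return {
        "content": tweet["text"],
        "title": title,
        "url": url,
        "last_modified": int(created.timestamp()),  # Unix time, like str2time()
    }
```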

Click Save and Run. Hopefully you will see something like this.

Finally, all we have to do is enable anonymous search of this collection. Go to Settings and parameters and select accesslevel as an input field. Then at the Configure test collection tab, set accesslevel to "Anonymous".

Click on the Public search page button in the top left corner and you will see the search page. Search for something.


Full code

package Perlcrawl;
use Carp;
use Data::Dumper;
use strict;
use warnings;

use Crawler;
our @ISA = qw(Crawler);

use LWP::Simple qw(get);
use JSON::XS qw(from_json);
use Date::Parse;

##
# Main loop for a crawl update.
# This is where a resource is crawled, and documents added.
sub crawl_update {
    my (undef, $self, $opt) = @_;

    warn "Options received: ", Dumper($opt), "\n"; 

    my $jurl = "http://twitter.com/statuses/user_timeline/" . $opt->{'screen name'} . ".json";
    my $t = from_json(get($jurl));

    for my $usr (@{$t}) {
        my $content = $usr->{text};
        my $url = "http://twitter.com/" . "$usr->{user}{screen_name}/statuses/$usr->{id}";

        next if $self->document_exists($url, 0);

        my $substr = substr($content, 0, 50);
        my $title = "$usr->{user}{name}: $substr ..";
        my $created_at = str2time($usr->{created_at});

        
        print "Adding $title\n";
        $self->add_document((
            content   => $content,
            title     => $title,
            url       => $url,
            type      => "tapp",
            acl_allow => "Everyone",
            last_modified => $created_at,
       ));
    }
};

sub path_access {
    my ($undef, $self, $opt) = @_;
    
    # During a user search, `path access' is called against the search results 
    # before they are shown to the user. This is to check if the user still has
    # access to the results.
    #
    # If this is irrelevant to you, just return 1.

    # You'll want to return 0 when:
    # * The document doesn't exist anymore
    # * The user has lost privileges to read the document
    # * .. when you want the document to be filtered from a user search in general.

    return 1;
}

1;

Download the full source code at: http://www.searchdaimon.com/files/code%20examples/Simple%20Twitter%20connector.txt

End users

Users -> End users

 

To make it possible to search at all, you have to activate those users who should have access. Under End users, all users in the primary user system are listed, so it is easy to click on those users you want to activate. Just remember to click "Update user access" when done.

 

User system: Setup Microsoft Active Directory


Tip: The ES needs a user account that can access Microsoft Active Directory and the resources you want to crawl. We recommend that you set up a separate user account for the ES. You can then tighten security later by giving this account only read-only access to the different systems.

The ES uses LDAP to connect to Microsoft Active Directory. LDAP is enabled by default in Windows Server. If you have a standard setup of Active Directory you will only need to specify:

  • Domain: Your Active Directory domain.
  • Ip: IP address or hostname of one of your Active Directory servers.
  • User and Password: A user account the ES can use to connect to the directory. Normally all users can access Active Directory via LDAP, so the user account can be any account.

Example

Setting      Example
Domain:      sdtest.local
Ip:          213.179.59.97
User:        sdes
Password:    *********
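
If you want to test the same account with an external LDAP tool before entering it in the ES, you will usually need the LDAP base DN that corresponds to the domain. By convention, Active Directory maps each label of the DNS domain name to a DC component. The sketch below shows that mapping; the ES does not ask for a base DN, and the function name domain_to_base_dn is illustrative.

```python
def domain_to_base_dn(domain):
    """Convert an AD DNS domain such as "sdtest.local" to "DC=sdtest,DC=local"."""
    labels = [label for label in domain.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    return ",".join(f"DC={label}" for label in labels)
```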


Verify the user system

If you are using Microsoft Active Directory, go to “End users” and verify that you can list users. If you can’t, you may have to go over your Active Directory settings again. You will find these settings under “User systems” in the main menu.

Enable users to login

Select End users, and select which users should have search access.

You have to activate those users who should have access. Under End users, all users in the primary user system are listed. Click on those users you want to activate, and remember to click "Update user access" when done.

Interacting directly with the ES

While the webgui is nice for doing small tasks and adding a line or two of code, you will probably need more direct access to the ES to do real work. First we will need ssh access to log in.

User systems

Users -> User systems

To run Searchdaimon ES, you need at least a primary user system. Many companies already run Microsoft Active Directory. It is also possible to install secondary user systems, and map their users to the primary system.

 

User system: Don’t use a user system or single sign-on

If you don’t have a Microsoft Active Directory, you can just name it "Fake ad" and use the following values to get through this step.

Setting      Example
Domain:      localhost.local
Ip:          127.0.0.1
User:        Admin
Password:    12345678

Using these values, no end users can log in, but you can allow everyone search access, as described in the FAQ at Searching without logging in (anonymous user).

Getting ssh access for the “boitho” user

Step 1. Setting a password for the boitho user

Log on to the ES as root and execute:

	passwd boitho

Follow the instructions on the screen to set up a password.

 

Step 2. Configure ssh

Open the file /etc/ssh/sshd_config, find where it says "PasswordAuthentication no" and change it to "PasswordAuthentication yes".

	nano /etc/ssh/sshd_config

 

Restart ssh:

	/etc/init.d/sshd restart

 

You should now be able to log in using the password you provided in step 1.
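
If you manage several installations, the edit in step 2 can be scripted instead of done by hand in nano. The sketch below only transforms the text of the configuration file; reading and writing /etc/ssh/sshd_config and restarting sshd are left to the caller. The function name enable_password_auth is illustrative and not part of the ES.

```python
import re

# Matches an existing PasswordAuthentication line, commented out or not
_PASSWORD_AUTH = re.compile(
    r"^[ \t]*#?[ \t]*PasswordAuthentication[ \t]+\S+[ \t]*$", re.MULTILINE
)

def enable_password_auth(config_text):
    """Return the sshd_config text with PasswordAuthentication set to yes."""
    if _PASSWORD_AUTH.search(config_text):
        return _PASSWORD_AUTH.sub("PasswordAuthentication yes", config_text)
    # No such line yet: append one at the end of the file
    return config_text.rstrip("\n") + "\nPasswordAuthentication yes\n"
```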

Settings

Configuration -> Settings

Main settings



In the Main settings tab, you can change license or administrator password. It is recommended to run the production version of Searchdaimon ES, but it is also possible to run the testing version. The development version is available by agreement only.

Collection manager



Crawling and recrawling can take up a lot of resources. Every collection is scanned for new content. This means that every file on your server is checked for changes, new e-mail gets downloaded, and other content is scanned for updates. This can slow down performance of your content servers while crawling is active. If this is an issue for you, we recommend limiting crawling to nighttime.

Advanced settings



These are values used internally in the search engine. Do not make changes here unless you know what you are doing.

Network configuration



In this tab you can change the server's network settings.

 

Running the crawler from the console

First stop the crawler by executing:

	/etc/init.d/crawlManager stop

To execute the crawler correctly, you need to set the BOITHOHOME environment variable and be in the correct folder:

	export BOITHOHOME=/home/boitho/boithoTools
	cd /home/boitho/boithoTools/

You can then run it with:

	bin/crawlManager2

You can then send commands to the crawler to recrawl, crawl, delete etc. from the web-based administration interface. Be aware that this only works with crawlers that you run from the main "Overview" part of the administration interface. Crawling jobs you run from "Connectors->Modify" will redirect output to the administration interface, and show nothing on the console.

Perl-based crawlers are located in a file called main.pm in the folder /home/boitho/boithoTools/crawlers/[crawler name]/. For example, your MyTwitter crawler from the "Example: A Twitter connector" article above should be in /home/boitho/boithoTools/crawlers/MyTwitter/main.pm. If you set the file permissions to 777, you can edit it using your favorite text editor from the console.

	su -
	chmod 777 /home/boitho/boithoTools/crawlers/MyTwitter/main.pm
	exit

Statistics & logs

System -> Statistics & Logs

You can see which users are the most active, what the most popular queries are, and how many searches are performed every day.



These are log files for running processes, and are meant for debugging.



The query log shows the last 50 searches.

 

Add collections

You can add collections manually, or by scanning.

Tip: If your Active Directory domain is sdtest.local and your username is sdes, most servers need your username to be sdtest\sdes. Note that it is "\", not "/".
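
The same rule can be written down as a small helper, for example when generating collection settings for many accounts. The sketch assumes that the short domain name is the first label of the Active Directory domain, as in the example above; this is common but not guaranteed, so check it against your own setup. The function name qualified_username is illustrative.

```python
def qualified_username(ad_domain, username):
    """Return DOMAIN\\user, e.g. ("sdtest.local", "sdes") gives sdtest\\sdes."""
    short_domain = ad_domain.split(".", 1)[0]  # "sdtest.local" becomes "sdtest"
    return short_domain + "\\" + username      # one backslash, not a slash
```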

Phone home

Help -> Phone home

 

If you need help, our job may become easier if you activate Phone home. Then it will be possible for us to log in and perform maintenance if necessary. Contact us before you activate it.

 

Use a web browser to log in as a normal user at http://{IP-address}/, or at http://{IP-address}/public if you have enabled public search in any collections.

Copyright © Searchdaimon AS. All rights reserved.