Tomcat 6 and Solr 4.3

This example shows how to connect a MySQL / MariaDB database to a Solr server.

1 – Download the latest version of Solr

from: http://lucene.apache.org/solr/

For my installation, I downloaded http://mirror.cogentco.com/pub/apache/lucene/solr/4.3.0/solr-4.3.0.tgz
into /tmp

> tar zxvf /tmp/solr-4.3.0.tgz -C /tmp

 

2 – Install Tomcat 6

As described in http://www.arunchinnachamy.com/apache-solr-installation-and-configuration/

> sudo apt-get install tomcat6 tomcat6-admin
> sudo cp /tmp/solr-4.3.0/dist/solr-4.3.0.war /var/lib/tomcat6/webapps/solr.war
> sudo cp -fr /tmp/solr-4.3.0/example/solr /var/lib/tomcat6/

The .war is copied as solr.war so that Tomcat deploys it under /solr, the path used in the rest of this tutorial.

 

3 – Install the required directories and .jar files into the Tomcat directory

 

 

Download the MySQL JDBC driver (Connector/J) from

http://dev.mysql.com/downloads/connector/j (I downloaded mysql-connector-java-5.1.25-bin.jar)

Lucene and Solr jar files

You may find errors in /var/log/tomcat6/localhost.* like this one: "org.apache.solr.common.SolrException: Could not find necessary SLF4j logging jars". The missing .jar files are in /tmp/solr-4.3.0/dist/solrj-lib and /tmp/solr-4.3.0/example/solr-webapp/webapp/WEB-INF/lib/; copy them into /var/lib/tomcat6/webapps/solr/WEB-INF/lib/

Results

Here is the list of files I now have in /var/lib/tomcat6/webapps/solr/WEB-INF/lib/ (it includes the MySQL connector downloaded above):

commons-cli-1.2.jar
commons-codec-1.7.jar
commons-fileupload-1.2.1.jar
commons-io-2.1.jar
commons-lang-2.6.jar
guava-13.0.1.jar
httpclient-4.2.3.jar
httpcore-4.2.2.jar
httpmime-4.2.3.jar
jcl-over-slf4j-1.6.6.jar
jul-to-slf4j-1.6.6.jar
log4j-1.2.16.jar
lucene-analyzers-common-4.3.0.jar
lucene-analyzers-kuromoji-4.3.0.jar
lucene-analyzers-phonetic-4.3.0.jar
lucene-codecs-4.3.0.jar
lucene-core-4.3.0.jar
lucene-grouping-4.3.0.jar
lucene-highlighter-4.3.0.jar
lucene-memory-4.3.0.jar
lucene-misc-4.3.0.jar
lucene-queries-4.3.0.jar
lucene-queryparser-4.3.0.jar
lucene-spatial-4.3.0.jar
lucene-suggest-4.3.0.jar
mysql-connector-java-5.1.25-bin.jar
noggit-0.5.jar
org.restlet-2.1.1.jar
org.restlet.ext.servlet-2.1.1.jar
slf4j-api-1.6.6.jar
slf4j-log4j12-1.6.6.jar
solr-analysis-extras-4.3.0.jar
solr-cell-4.3.0.jar
solr-clustering-4.3.0.jar
solr-core-4.3.0.jar
solr-dataimporthandler-4.3.0.jar
solr-dataimporthandler-extras-4.3.0.jar
solr-langid-4.3.0.jar
solr-solrj-4.3.0.jar
solr-test-framework-4.3.0.jar
solr-uima-4.3.0.jar
solr-velocity-4.3.0.jar
spatial4j-0.3.jar
wstx-asl-3.2.7.jar
zookeeper-3.4.5.jar
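
The copy step described above can be sketched as a small shell helper. This is only a sketch under assumptions: copy_jars is a hypothetical function name (nothing shipped with Solr or Tomcat), it takes the destination first and then any number of source directories, and it leaves a jar alone when a file with the same name is already in the destination.

```shell
# copy_jars DEST SRC_DIR...
# Hypothetical helper: copies every .jar found directly in each SRC_DIR
# into DEST, leaving jars that already exist in DEST untouched.
copy_jars() {
  dest=$1
  shift
  mkdir -p "$dest" || return 1
  for src in "$@"; do
    if [ ! -d "$src" ]; then
      echo "skipping missing directory: $src" >&2
      continue
    fi
    for jar in "$src"/*.jar; do
      # An unmatched glob stays as the literal pattern; skip it.
      [ -f "$jar" ] || continue
      # ${jar##*/} is the file name without its directory part.
      if [ ! -e "$dest/${jar##*/}" ]; then
        cp "$jar" "$dest/" || return 1
      fi
    done
  done
  return 0
}
```

With the paths from this section, the call would look like this:

> copy_jars /var/lib/tomcat6/webapps/solr/WEB-INF/lib /tmp/solr-4.3.0/dist/solrj-lib /tmp/solr-4.3.0/example/solr-webapp/webapp/WEB-INF/lib

Run it from a shell that can write to the Tomcat directory (a root shell, for example), since a shell function cannot be called through sudo.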

 

4 – Configure Solr for multicore

I chose a multicore Solr setup so that I can add more collections whenever I want.

 

/var/lib/tomcat6/solr/solr.xml
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/cores" defaultCoreName="core1">
   <core name="core1" instanceDir="core1" />
   <core name="core2" instanceDir="core2" />
  </cores>
</solr>

We also need to create the core1 and core2 directories by copying the collection1 example directory provided with Solr,
which is now in /var/lib/tomcat6/solr

> cd /var/lib/tomcat6/solr
> sudo cp -fr collection1 core1
> sudo cp -fr collection1 core2
> sudo chown -R tomcat6:tomcat6 /var/lib/tomcat6/solr
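
One thing to watch with the cp commands above: if core1 already exists, "cp -fr collection1 core1" copies the template inside it (core1/collection1) instead of replacing it. A minimal sketch that guards against this, assuming a hypothetical helper named make_cores that takes the Solr home, the template directory name and the core names:

```shell
# make_cores SOLR_HOME TEMPLATE CORE...
# Hypothetical helper: copies SOLR_HOME/TEMPLATE to SOLR_HOME/CORE for
# each core name, and leaves any core directory that already exists alone.
make_cores() {
  solr_home=$1
  template=$2
  shift 2
  if [ ! -d "$solr_home/$template" ]; then
    echo "template core not found: $solr_home/$template" >&2
    return 1
  fi
  for core in "$@"; do
    if [ -e "$solr_home/$core" ]; then
      echo "already exists, left untouched: $solr_home/$core" >&2
      continue
    fi
    cp -R "$solr_home/$template" "$solr_home/$core" || return 1
  done
  return 0
}
```

For this tutorial the call would be:

> make_cores /var/lib/tomcat6/solr collection1 core1 core2

It needs a shell with write access to /var/lib/tomcat6/solr, and the chown to tomcat6 is still required afterwards.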

 

5 – Solr should now run with these 2 collections

> sudo service tomcat6 restart

Go to

http://localhost:8080/solr

and verify that core1 and core2 are listed.
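
The cores can also be checked from the command line through Solr's CoreAdmin STATUS action, which is served under the adminPath set in solr.xml (/admin/cores). A minimal sketch, assuming curl is installed; core_status_url and check_core are hypothetical helper names, not Solr commands:

```shell
# core_status_url BASE CORE
# Prints the CoreAdmin STATUS URL for one core (JSON output).
core_status_url() {
  # ${1%/} drops a trailing slash from the base URL, if any.
  printf '%s/admin/cores?action=STATUS&core=%s&wt=json\n' "${1%/}" "$2"
}

# check_core BASE CORE
# Fetches the status with curl. It fails when the server is unreachable
# or answers with an HTTP error; it does not parse the response, so read
# the output to confirm the core is really loaded.
check_core() {
  curl -fsS "$(core_status_url "$1" "$2")"
}
```

Usage, once Tomcat has restarted:

> check_core http://localhost:8080/solr core1
> check_core http://localhost:8080/solr core2

A loaded core should report details such as its instanceDir in the output.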

 

6 – MySQL part of the installation

In this tutorial, we will import data from a MySQL database into core1.

> cd /var/lib/tomcat6/solr/core1/conf

data-config.xml

Create a data-config.xml file like this:

<dataConfig>
 <dataSource type="JdbcDataSource"
   driver="com.mysql.jdbc.Driver"
   url="jdbc:mysql://yourmysqlserver/yourdatabase"
   user="youruser"
   password="yourpassword"
   batchSize="-1"
 />
 <document>
   <entity name="id"
    query="select uniq_id, title, description, url from yourtable;">
    <field column="uniq_id" name="id"/>
    <field column="title" name="title"/>
    <field column="description" name="description"/>
    <field column="url" name="url"/>
   </entity>
 </document>
</dataConfig>

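Did you notice batchSize="-1"? It is VERY important when importing a huge database: with this value the MySQL driver streams the rows instead of loading the whole result set into memory.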

schema.xml

Here you need to specify which fields will be indexed.

The only fields I have in schema.xml are

...
<fields>
  <field name="_version_" type="long" indexed="true" stored="true"/>
   <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

   <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
   <field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
   <field name="description" type="text_general" indexed="true" stored="true"/>
   <field name="url" type="text_general" indexed="true" stored="true"/>

   <!-- Text fields from SolrCell to search by default in our catch-all field -->
   <copyField source="title" dest="text"/>
   <copyField source="description" dest="text"/>
   <copyField source="url" dest="text"/>
 </fields>
 <uniqueKey>id</uniqueKey>

If you want the same results whether or not the searched words contain accented characters, add solr.ASCIIFoldingFilterFactory to both analyzers of the text_general field type:

 <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
      </analyzer>
    </fieldType>

stopwords.txt

le
la
les
de
des
a
pour
avec
sans
en
du
au
et
ou
aux
un
votre
une
à

solrconfig.xml

 

Between 2 other <requestHandler> nodes, I added
  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </requestHandler>

Restart Tomcat:
> sudo service tomcat6 restart

 

Import data from MySQL to Solr

Go to http://yoursolrserver:8080/solr/#/core1/dataimport//dataimport

You can clean, import and optimize your data from there; it takes some time.
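
The import can also be started without the admin page, by calling the /dataimport handler declared in solrconfig.xml over HTTP. A minimal sketch, assuming curl is installed; dataimport_url is a hypothetical helper that only builds the URL, and it only knows the full-import, status and abort commands:

```shell
# dataimport_url BASE CORE COMMAND
# Prints the URL of the /dataimport handler for the given core and
# command. full-import is sent with clean=true and commit=true, so the
# existing index is emptied before the import and committed after it.
dataimport_url() {
  base=${1%/}   # drop a trailing slash from the base URL, if any
  core=$2
  command=$3
  case $command in
    full-import)
      printf '%s/%s/dataimport?command=full-import&clean=true&commit=true\n' "$base" "$core"
      ;;
    status|abort)
      printf '%s/%s/dataimport?command=%s\n' "$base" "$core" "$command"
      ;;
    *)
      echo "unsupported command: $command" >&2
      return 1
      ;;
  esac
}
```

Usage: start the import, then poll the status until it reports that the import has finished.

> curl "$(dataimport_url http://yoursolrserver:8080/solr core1 full-import)"
> curl "$(dataimport_url http://yoursolrserver:8080/solr core1 status)"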