comparison sites/all/modules/custom/solrconnect/README.txt @ 0:015d06b10d37 default tip

initial
author dwinter
date Wed, 31 Jul 2013 13:49:13 +0200
-1:000000000000 0:015d06b10d37
1
2 This module integrates Drupal with the Apache Solr search platform. Solr search
3 can be used as a replacement for core content search and boasts both extra
4 features and better performance. Among the extra features is the ability to have
5 faceted search on facets ranging from content author to taxonomy to arbitrary
6 Field API fields.
7
8 The module comes with schema.xml, solrconfig.xml, and protwords.txt files, which
9 must be used in your Solr installation.
10
11 This module depends on the search framework in core. When used in combination
12 with the core search module, Apache Solr is not the default search. Access it via a
13 new tab on the default search page, called "Site". You may configure it
14 to be default at ?q=admin/config/search/settings
15
16 Updating from 6.x
17 -----------------
18
19 IMPORTANT: there is no upgrade path from 6.x-1.x or 6.x-2.x. If you previously
20 installed those modules you must disable and uninstall them prior to
21 installing 7.x-1.x.
22
23 You will have to install the new schema.xml and solrconfig.xml files, restart
24 the Solr server (or core), delete your index, and reindex all content.
25
26 Installation
27 ------------
28
29 Prerequisite: Java 5 or higher (a.k.a. 1.5.x). PHP 5.2.4 or higher.
30
31 Install the Apache Solr Drupal module as you would any Drupal module. Note
32 that the Drupal 7.x-1.x branch does not require the SolrPhpClient to
33 be installed. All necessary code is now included with this module.
34
35 Before enabling the module, you must have a working Solr server, or be
36 subscribed to a service like Acquia Search.
37
38 The Debian/Ubuntu packages for Solr should NOT be used to install Solr.
39 For example, do NOT install the solr or solr-jetty packages.
40
41 Download the latest Solr 1.4.x or 3.x release (e.g. 1.4.1 or 3.6.1) from:
42 http://www.apache.org/dyn/closer.cgi/lucene/solr/
43
44 Apache Lucene 3.1, 3.2, and 3.3 have a possible index corruption bug on
45 server crash or power loss (LUCENE-3418) and have bugs that interfere
46 with the Drupal admin reports. Solr 3.4 has a problem with
47 SortMissingLast, so Solr 3.5.0 or later is strongly preferred.
48
49 Unpack the Solr tarball somewhere not visible to the web (not in your
50 webserver docroot and not inside of your Drupal directory).
51
52 The Solr download comes with an example application that you can use for
53 testing, development, and even for smaller production sites. This
54 application is found at apache-solr-1.4.1/example.
55
56 You must use the three Solr configuration files that come with the Drupal
57 module or the integration will not work correctly.
58
59 For Solr 1.4 use the ones found in:
60 solr-conf/solr-1.4/
61
62 For Solr 3.5.0 or 3.6.1 use:
63 solr-conf/solr-3.x/
64
65 While the Solr 1.4 files will work for Solr 3.5+, they are not optimal
66 and you will be missing important new features.
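The version-to-directory mapping above can be sketched as a small shell helper. This is only an illustration: the function name solr_conf_dir_for is hypothetical and not part of the module, and it maps only the versions this README names explicitly (1.4.x, 3.5.x, 3.6.x), refusing anything else.

```shell
# Print the module's configuration directory for a given Solr version.
# Only the versions named in this README are mapped; anything else fails.
# (solr_conf_dir_for is a hypothetical helper, not shipped with the module.)
solr_conf_dir_for() {
  case "$1" in
    1.4|1.4.*)
      echo "solr-conf/solr-1.4" ;;
    3.5|3.5.*|3.6|3.6.*)
      echo "solr-conf/solr-3.x" ;;
    *)
      echo "no configuration mapped for Solr $1" >&2
      return 1 ;;
  esac
}
```

For example, solr_conf_dir_for 3.6.1 prints solr-conf/solr-3.x, while solr_conf_dir_for 3.4.0 prints an error and returns a non-zero status.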
67
68 For example, when deploying Solr 1.4:
69
70 Move apache-solr-1.4.1/example/solr/conf/schema.xml and rename it to
71 something like schema.bak. Then move the solr-conf/solr-1.4/schema.xml
72 that comes with this Drupal module to take its place.
73
74 Similarly, move apache-solr-1.4.1/example/solr/conf/solrconfig.xml and rename
75 it to something like solrconfig.bak. Then move the solr-conf/solr-1.4/solrconfig.xml
76 that comes with this module to take its place.
77
78 Finally, move apache-solr-1.4.1/example/solr/conf/protwords.txt and rename it
79 protwords.bak. Then move the solr-conf/solr-1.4/protwords.txt that comes
80 with this module to take its place.
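The three backup-and-replace steps above could be scripted roughly as follows. This is a sketch under assumptions: the function name install_module_solr_conf is hypothetical, both directories are passed in by the caller, and the module's files are copied rather than moved so that the module directory stays intact.

```shell
# Back up the three stock files and copy the module's versions into place.
#   $1: Solr conf directory, e.g. apache-solr-1.4.1/example/solr/conf
#   $2: module conf directory, e.g. solr-conf/solr-1.4
# (install_module_solr_conf is a hypothetical helper name.)
install_module_solr_conf() {
  solr_conf=$1
  module_conf=$2
  for name in schema.xml solrconfig.xml protwords.txt; do
    # Fail before touching the stock file if the module copy is missing.
    if [ ! -f "$module_conf/$name" ]; then
      echo "missing $module_conf/$name" >&2
      return 1
    fi
    # schema.xml -> schema.bak, solrconfig.xml -> solrconfig.bak, and so on.
    if [ -f "$solr_conf/$name" ]; then
      mv "$solr_conf/$name" "$solr_conf/${name%.*}.bak"
    fi
    # Copy (not move) so the module directory keeps its own files.
    cp "$module_conf/$name" "$solr_conf/$name"
  done
}
```

A typical call, run from a directory where both paths resolve, would be:
install_module_solr_conf apache-solr-1.4.1/example/solr/conf solr-conf/solr-1.4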
81
82 Make sure that the conf directory includes the following files - the Solr core
83 may not load if you don't have at least an empty file present:
84 solrconfig.xml
85 schema.xml
86 elevate.xml
87 mapping-ISOLatin1Accent.txt
88 protwords.txt
89 stopwords.txt
90 synonyms.txt
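A quick way to confirm that all seven files are present is a check like the one below. It is a minimal sketch: check_solr_conf_files is a hypothetical name, and the function only reports missing files without creating anything.

```shell
# List required files that are missing from a Solr conf directory.
# Prints one missing file name per line; returns non-zero if any is missing.
# (check_solr_conf_files is a hypothetical helper name.)
check_solr_conf_files() {
  conf_dir=$1
  missing=0
  for name in solrconfig.xml schema.xml elevate.xml \
      mapping-ISOLatin1Accent.txt protwords.txt stopwords.txt synonyms.txt; do
    if [ ! -f "$conf_dir/$name" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_solr_conf_files apache-solr-1.4.1/example/solr/conf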
91
92 Now start the Solr application by opening a shell, changing directory to
93 apache-solr-1.4.1/example, and executing the command java -jar start.jar
94
95 Test that your Solr server is now available by visiting
96 http://localhost:8983/solr/admin/
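If no browser is available on the server, the same availability check can be approximated from the shell. The sketch below assumes curl is installed (this README does not require it), and solr_is_up is a hypothetical helper name.

```shell
# Return success if the Solr admin page answers with an HTTP success status.
#   $1: admin URL (defaults to the address used in this README)
# Assumes curl is installed; solr_is_up is a hypothetical helper name.
solr_is_up() {
  url=${1:-http://localhost:8983/solr/admin/}
  # -f: treat HTTP errors as failure, -s: no progress output,
  # --max-time 5: give up after five seconds instead of hanging.
  curl -fs --max-time 5 -o /dev/null "$url"
}
```

For example: solr_is_up && echo "Solr is reachable"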
97
98 Now, you should enable the "Apache Solr framework" and "Apache Solr search"
99 modules. Check that you can connect to Solr at ?q=admin/settings/apachesolr
100 Now run cron on your Drupal site until your content is indexed. You
101 can monitor the index at ?q=admin/settings/apachesolr/index
102
103 The solrconfig.xml that comes with this module defines auto-commit, so
104 it may take a few minutes between running cron and when the new content
105 is visible in search.
106
107 To use facets, download the Facet API module (http://drupal.org/project/facetapi).
108 Facet API will allow you to define and set facets next to your search pages.
109 Once this module is enabled, enable blocks for facets first at
110 Administer > Site configuration > Apache Solr > Enabled filters
111 then position them as you like at Administer > Site building > Blocks.
112
113 Settings.php
114 ------------
115 You can override environment settings using the following syntax in your
116 settings.php:
117
118 $conf['apachesolr_environments']['my_env_id']['url'] = 'http://localhost:8983';
119
120 Configuration variables
121 -----------------------
122
123 The module provides some (hidden) variables that can be used to tweak its
124 behavior:
125
126 - apachesolr_luke_limit: the limit (in terms of number of documents in the
127 index) above which the module will not retrieve the number of terms per field
128 when performing LUKE queries (for performance reasons).
129
130 - apachesolr_tags_to_index: the list of HTML tags that the module will index
131 (see apachesolr_add_tags_to_document()).
132
133 - apachesolr_exclude_nodeapi_types: an array of node types each of which is
134 an array of one or more module names, such as 'comment'. Any type listed
135 will have any listed modules' hook_node_update_index() implementation skipped
136 when indexing. This can be useful for excluding comments or taxonomy links.
137
138 - apachesolr_ping_timeout: the timeout (in seconds) after which the module will
139 consider the Apache Solr server unavailable.
140
141 - apachesolr_optimize_interval: the interval (in seconds) between automatic
142 optimizations of the Apache Solr index. Set to 0 to disable.
143
144 - apachesolr_cache_delay: the delay (in seconds) following an update after which
145 the module will requery Apache Solr for the index structure. Set it to
146 your autocommit delay plus a few seconds.
147
148 - apachesolr_query_class: the default query class to use.
149
150 - apachesolr_index_comments_with_node: TRUE | FALSE. Whether to index comments
151 along with each node.
152
153 - apachesolr_cron_mass_limit: update or delete at most this many documents in
154 each Solr request, such as when making {apachesolr_search_node} consistent
155 with {node}.
156
157 - apachesolr_index_user: the user account under which the indexing process
158 runs.
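Because these variables have no admin form, one way to set them is from the command line. The sketch below assumes Drush is installed (this README does not mention it) and uses its variable-set command (alias vset). The function name and every value are hypothetical examples, and the cache delay assumes a 120-second autocommit, which you should check against your own solrconfig.xml.

```shell
# Set a few of the hidden variables with Drush (assumed to be installed).
#   $1: command used to invoke Drush; pass "echo drush" to print the
#       commands instead of running them.
# All values are hypothetical examples, not recommendations.
set_solr_variables() {
  runner=${1:-drush}
  # $runner is left unquoted on purpose so "echo drush" splits into two words.
  $runner vset apachesolr_ping_timeout 4         # seconds
  $runner vset apachesolr_optimize_interval 86400  # once a day; 0 disables
  $runner vset apachesolr_cache_delay 125        # assumed 120 s autocommit + 5 s
}
```

Running set_solr_variables "echo drush" first shows the three commands without changing anything on the site.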
159
160 Troubleshooting
161 ---------------
162 Problem:
163 You use HTTP basic auth to limit access to your Solr server.
164
165 Solution:
166 Set the Server URL to include the username and password, like:
167 http://username:password@example.com:8080/solr
168
169 Problem:
170 Links to nodes appear in the search results with a different host name or
171 subdomain than is preferred. e.g. sometimes at http://example.com
172 and sometimes at http://www.example.com
173
174 Solution:
175 Set $base_url in settings.php to ensure that an identical absolute URL is
176 generated at all times when nodes are indexed. Alternatively, set up a redirect
177 in .htaccess to prevent site visitors from accessing the site via more than one
178 site address.
179
180 Problem:
181 The 'Solr Index Queries' test fails with file permission errors.
182
183 Solution:
184 When running this test, run Tomcat/Jetty as the same user under which PHP
185 runs (often the same as the webserver). This is
186 important because of the on-the-fly folder creation within PHP.
187
188
189 Themers
190 -------
191
192 See the inline docs in the apachesolr_theme and apachesolr_search_theme functions
193 within apachesolr.module and apachesolr_search.module.
194