Cool stuff about Drupal search I learned from Robert Douglass
I've been a Robert Douglass fan for a long time. My first Drupal book was penned by Robert. I learned from him then and continue to pick up bits of wisdom from his ongoing stream of articles and presentations.
Here's a list of ways to improve Drupal search that I've compiled from things I've learned from Robert over the past two years:
Install the Porter-Stemmer Module [1]
From the project page: The Porter-Stemmer module reduces each word in the index to its basic root or stem (e.g. 'blogging' to 'blog') so that variations on a word ('blogs', 'blogger', 'blogging', 'blog') are considered equivalent when searching. This generally results in more relevant results.
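To make the idea concrete, here is a toy Python sketch of how stemming lets word variants match each other. It is not the Porter algorithm and not the module's code: the suffix list and the `toy_stem` name are made up for illustration, and the rules are only just enough to handle the 'blog' example above.

```python
# Toy suffix stripper: an illustration only, NOT the real Porter algorithm.
VOWELS = "aeiou"
SUFFIXES = ("ing", "ers", "er", "s")  # hypothetical, minimal rule set


def toy_stem(word):
    """Reduce a word to a crude stem so that variants compare equal."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three letters of stem would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a doubled final consonant left behind, e.g. 'blogg' -> 'blog'.
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in VOWELS:
        word = word[:-1]
    return word


variants = ["blogs", "blogger", "blogging", "blog"]
stems = {toy_stem(w) for w in variants}
print(stems)  # every variant lands on the same stem
```

Because the index and the search query are both stemmed, a search for any one of the variants can match stories containing the others.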
Adjust the Content Ranking Weights [2]
Out of the box, Drupal gives every ranking factor the same weight when it orders search results. You can change what's important to you (and your users) by tweaking the weights found at the bottom of the /admin/settings/search/ page.
| Factor | The default settings | Where I like to set them (YMMV) |
|---|---|---|
| Keyword relevance | 5 | 7 |
| Recently Posted | 5 | 5 |
| Number of comments | 5 | 4 |
| Number of views | 5 | 6 |
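As a rough illustration of why the weights matter, here is a minimal Python sketch. It assumes each factor has already been normalized to a value between 0 and 1 and that the final score is a weighted average of the factors; that is a simplifying assumption, not a copy of Drupal's code. The function name and the two sample nodes are invented for the example; only the weights come from the table above.

```python
# Weights from the table above.
DEFAULT_WEIGHTS = {"relevance": 5, "recent": 5, "comments": 5, "views": 5}
TUNED_WEIGHTS = {"relevance": 7, "recent": 5, "comments": 4, "views": 6}


def weighted_score(factors, weights):
    """Weighted average of factor values (each assumed to be in 0..1)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(weights[name] * factors.get(name, 0.0) for name in weights) / total


# Two hypothetical nodes: A matches the keywords well, B is heavily commented.
node_a = {"relevance": 0.9, "recent": 0.2, "comments": 0.1, "views": 0.3}
node_b = {"relevance": 0.4, "recent": 0.4, "comments": 0.8, "views": 0.2}

for label, weights in (("default", DEFAULT_WEIGHTS), ("tuned", TUNED_WEIGHTS)):
    a = weighted_score(node_a, weights)
    b = weighted_score(node_b, weights)
    winner = "A" if a > b else "B"
    print(f"{label}: A={a:.3f} B={b:.3f} -> {winner} ranks first")
```

With these made-up numbers, the chatty but less relevant node B wins under the default weights, while the better keyword match A wins once relevance is weighted up and comments are weighted down.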
Get to Know Views Fast Search [3]
Use Views Fast Search to build blindingly fast, incredibly flexible search pages. In addition to being a really fast search replacement, Fast Search allows you to build user-friendly and highly functional search forms. On Symantec's "Juice" site, we use this form to drive the content finder in the "Juice Cellar".
Use H1 Tags Sparingly [4]
When you mark up a story (using HTML) on a Drupal site, the indexing engine reads your markup and applies a "multiplier" (a measure of importance) to each word. When a user does a search, words you've marked as important play a greater role in whether or not a story shows up high in the search results. Here's a list of tags and their multipliers.
- h1 -> 25
- h2 -> 18
- h3 -> 15
- h4 -> 12
- h5 -> 9
- h6 -> 6
- u -> 3
- b -> 3
- i -> 3
- strong -> 3
- em -> 3
- a -> 10
Each time you wrap a heading in <h1> tags, you're telling the search indexer that each word in the heading is 25 times more important than the normal words in your story. Note to self, "engage brain before attempting markup".
Drupal already wraps the page title in h1 tags (in most of the themes I'm aware of). Maybe one phrase holding very important words is enough -- use h1 tags in moderation.
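Here is a small Python sketch of the tag-multiplier idea, using the values from the list above. It is a simplification: it assumes plain text counts as 1 and that a word takes the multiplier of its innermost listed tag, whereas Drupal's real indexer may combine nested tags differently. The `WordScorer` class and `score_words` function are names made up for this example.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Multipliers from the list above; text outside these tags is assumed to count as 1.
TAG_MULTIPLIERS = {
    "h1": 25, "h2": 18, "h3": 15, "h4": 12, "h5": 9, "h6": 6,
    "u": 3, "b": 3, "i": 3, "strong": 3, "em": 3, "a": 10,
}


class WordScorer(HTMLParser):
    """Adds up a per-word score based on the innermost weighted tag."""

    def __init__(self):
        super().__init__()
        self.stack = []                 # currently open weighted tags
        self.scores = defaultdict(int)  # word -> accumulated score

    def handle_starttag(self, tag, attrs):
        if tag in TAG_MULTIPLIERS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        multiplier = TAG_MULTIPLIERS[self.stack[-1]] if self.stack else 1
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.scores[word] += multiplier


def score_words(html):
    scorer = WordScorer()
    scorer.feed(html)
    scorer.close()
    return dict(scorer.scores)


story = "<h1>Drupal search</h1><p>Tuning Drupal <b>search</b> weights</p>"
print(score_words(story))
```

In this sample the two words in the h1 heading end up with scores far above the body-only words, which is exactly why a second or third h1 in a story dilutes the signal.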
Use Search 404 Module to Help the Lost Find Their Way [5]
At the risk of looking (even more) like a stalker, here's one more tip. I was reading a Planet Drupal post that linked to Robert's site. When I attempted to follow the link, I ran into a 404 page but this one was fairly cool (as far as 404 pages go). It actually used words from the URL to perform a search and help me find the missing page by displaying a list of search results instead of the less-than-helpful "404 page not found".
When I emailed Robert to see what kind of custom magic he was working, he responded with "Search 404 Module".
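The underlying trick is easy to picture. Below is a hedged Python sketch of turning a missing page's URL into search keywords; it is not the Search 404 module's code, and the function name, the ignored-word list, and the example URL are all assumptions made for illustration.

```python
import re
from urllib.parse import unquote, urlparse

# Hypothetical list of words too generic to be worth searching for.
IGNORED_WORDS = {"the", "and", "for"}


def keywords_from_url(url):
    """Extract candidate search keywords from the path of a URL."""
    path = unquote(urlparse(url).path)
    # Drop a trailing file extension such as .html or .php.
    path = re.sub(r"\.(html?|php|aspx?)$", "", path)
    words = re.findall(r"[A-Za-z0-9]+", path)
    return [w.lower() for w in words
            if len(w) > 2 and w.lower() not in IGNORED_WORDS]


missing = "http://example.com/blog/drupal-search-tips.html"
print(" ".join(keywords_from_url(missing)))
```

The resulting keywords would then be handed to the site's normal search, and the results shown in place of the bare "page not found" message.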
+++++++++++++++++++++++++
[1] From the 2007 OSCMS Summit talk, "Drupal Search: Core API; CCK and Views".
[2] Drupal's search module and scoring factors
[3] Custom search forms with Views and Fastsearch
[4] Lullabot Training, Portland
[5] robshouse.net
- Kevin Millecam's blog

Comments
We can only expect the best
We can only expect the best from Drupal; even some major search engines could make search improvements based on Drupal search. I realize now that there are a lot of things I don't know yet about CMSs. Drupal is a little more specific as a web content management system, and that's why you guys write books about it.
Nice job and a brilliant
Nice job and a brilliant post!!! I did some stuff with the views_fastsearch module last night and was amazed. It saved me! I will definitely look through this later this afternoon. Good stuff here. I should pick up that book.
Help for all
This was very helpful. I am so pumped that Drupal has search weights; I have NEVER seen that in ANY CMS before... I've been reading so much about this the last few nights. It's incredible; we were considering developing our own overseas... Whichever way we go, search weights are phenomenal! I wonder if it's a spinoff of the site search from Google, or the Google hardware appliance. I've never seen it, but it sounds like something only they would have (EXCEPT DRUPAL lol).
I think it is better to use
I think it is better to use a google custom search or similar. However I will give this approach a try too.
Good article.
reply
I installed the weight module, assigned weight to a node-type and some of its nodes, but in the search results the nodes still turn up lower than other nodes. Should I disable the weighting options on the search settings page? Or are there other things I should change for this to work?
Great post
Hey,
This is a great post for those who are new to the system.
One thing I have to add is that I would give 'Number of comments' a weight of 3. The number of comments doesn't say much about the quality or importance of a post.
reply
I am wondering: is there a way to control which parts of the content get indexed? It looks like Drupal's search.module always does a full-text index on all content types. I'd like to be able to, e.g.:
1. Prevent indexing of certain content types.
2. Index only the node title or teaser, not the full text.
indexing Drupal
Hi, if you are using Drupal 5.x or a newer version, it has a built-in robots.txt file which you can use to control what gets indexed by the search engines. It's particularly good for avoiding duplicate content. Please note that if you are running multiple sites, you will need a separate robots.txt file for each site.
Thanks!
Thanks for this -- I had some notes from a presentation Rob gave that I promptly lost, and I think you covered a lot of what I was trying to remember... saved by the internet! ;-)
Live Search
I've given in to the temptation of doing some shameless advertising. Live Search integrates nicely with Drupal search should anyone want any kind of hip, Ajaxy live search on their site.
Nice Job
Great list, Kevin. I did some stuff with the views_fastsearch module last night and was amazed. It saved me! I will definitely look through this later this afternoon. Good stuff here. I should pick up that book.
See you at the next meet-up.
Andrew
Nice list =)
One more that I hope you'll add in the future:
The new ApacheSolr search module is a great beginning to a wonderful relationship between Drupal and Solr. If you watch the video of my ApacheSolr FOSDEM presentation, you'll note that I repeatedly call for help. The goodies that Solr offers are worth it, now I just need the help of a bunch of smart Drupal developers to help whip it into shape and make it shine.
Here's a video about Acquia Search/Apache Solr in action
For anyone who's interested in how the Apache Solr module enhances searching by providing faceted searching, I've posted a Drupal search video on GotDrupal.com.