The SpamFilter is a JSPWiki filter that can be used to block questionable edits.  It has been available since JSPWiki 2.1.117 as a [CoreFilter].  [http://doc.jspwiki.org/2.4/wiki/SpamFilter] contains the most up-to-date instructions.

!Parameters

;__wordlist__: The name of the WikiPage on which the word list resides.  Default is "~SpamFilterWordList".
;__errorpage__: The name of the page to which the user is redirected, if the edit contains a matched word.  On that page, the variable [[{$msg}] is available, telling the reason.  Default is [RejectedMessage].
;__blacklist__: Name of the attachment that contains a blacklist, where each line is interpreted as a pattern to check against.  Any lines starting with "#" are ignored as comments. (Since 2.3.98)
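
As a rough illustration of the blacklist format described above, the following sketch reads the attachment content line by line, skips "#" comments and compiles every other line as a pattern.  The class and method names are hypothetical, skipping blank lines is an assumption, and {{java.util.regex}} stands in for whatever regular expression engine your JSPWiki version actually uses, so treat this as a sketch and not as the filter's real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JSPWiki: shows one way to read a blacklist.
final class BlacklistSketch {

    // Turns the text of a blacklist attachment into a list of compiled patterns.
    static List<Pattern> parseBlacklist(String content) {
        List<Pattern> patterns = new ArrayList<>();
        for (String line : content.split("\\R")) {   // split on any line break
            String trimmed = line.trim();
            // Lines starting with "#" are comments; blank lines are skipped (assumption).
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            patterns.add(Pattern.compile(trimmed));
        }
        return patterns;
    }

    // Returns true if any blacklist pattern occurs somewhere in the text.
    static boolean matchesAny(List<Pattern> patterns, String text) {
        for (Pattern p : patterns) {
            if (p.matcher(text).find()) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, a blacklist consisting of one comment line and two pattern lines yields two patterns, and {{matchesAny}} reports whether an edit would be caught by any of them.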

!The word list

The SpamFilter looks at the [WikiVariable] called 'spamwords' on the ''wordlist'' page.  This must contain a space-separated list of words not allowed in a page.  In fact, each word is a full Perl5 regular expression, so you can do pretty complex matches as well.

Of course, it is a good idea to allow only trusted users to edit the ''wordlist'' page.  Otherwise a spammer can remove the list...

!Example

Put the following in your filters.xml file (See [PageFilter Configuration] for more information):
{{{
    <filter>
      <class>com.ecyrd.jspwiki.filters.SpamFilter</class>
    </filter>
}}}

to start the filter.  Create a page called "~SpamFilterWordList" and put the following on it:
{{{
[{SET spamwords='vaigra money'}]
}}}
to prevent anyone from saving a page that contains either the word "vaigra" or "money".  A slightly more complicated setting such as
{{{
[{SET spamwords='[vV][aA][iI][gG][rR][aA]'}]
}}}
would block the words "vaigra", "Vaigra", "vAIGra" and so on.

(The word "vaigra" is misspelled on purpose, because otherwise it would be caught in the spam trap...)
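
To make the matching behaviour in the examples above more concrete, here is a minimal sketch that splits a ''spamwords'' value on whitespace and treats every entry as a regular expression.  The class and method names are hypothetical, and {{java.util.regex}} is used as a close stand-in for Perl5 syntax; the actual SpamFilter implementation may differ in detail.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of JSPWiki: mimics the word list check.
final class SpamWordSketch {

    // Returns true if any space-separated pattern in spamwords occurs in pageText.
    static boolean containsSpam(String spamwords, String pageText) {
        for (String word : spamwords.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;   // an empty spamwords value yields one empty entry
            }
            // Each entry is a regular expression; matching is case-sensitive by default.
            if (Pattern.compile(word).matcher(pageText).find()) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch the plain entry "vaigra" does not catch "Vaigra", because the match is case-sensitive, whereas the character-class version from the second example catches every mix of upper and lower case.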

----
Q. Would it be possible to remove the changes in ~SpamFilterWordList from the RecentChanges page and RSS feeds? \\
Not to be picky, but by including those pages you are doing "spamming by side-effect"... :-)
 -- NascifAbousalhNeto

----
Nascif, we might look into including an {{exclude}} parameter on the RecentChangesPlugin. You might submit that as an idea to keep it visible.

-- MurrayAltheim

Q.  How do you set up the Captcha that is available in 2.6?  I have searched around and cannot find much that helps with this.

--Elrond

A: The Captcha is fully automated.

----

Q. Why does the SpamFilter for this site reject gmaildotcom (with a "." instead of "dot")? I can't even register an account with the site using my gmail email address.

-- JonHanson

A. Because gmail became a spammer haven after someone cracked their captcha.  We got a few thousand bot registrations here one day...  But it's enabled again now, since it seems that they've fixed the captcha.


----

There is a conceptual problem with [SpamFilterWordList] and the attached [SpamFilterWordList/blacklist.txt]:

__Spammers can easily locate, download and analyze the blacklist definitions and then adjust their spamming strategies.__

Based on what I can observe on my own JSPWiki site [http://km-works.eu/mathel-wiki], they do indeed take advantage of this regularly.

I have found a simple and effective solution for this problem: rename the blacklist definition file to some nonsense name with an image MIME type, e.g. {{hjg451234gkl.jpg}}, which is of course wrong and misleading to JSPWiki.

The wrong MIME type effectively hides the attachment from being indexed and viewed through the wiki system, because JSPWiki can no longer recognize it as a valid image file (and Lucene simply won't index image files).

Now adjust your spam filter definition to reflect the new name for the blacklist, like:

{{{
    <filter>
      <class>com.ecyrd.jspwiki.filters.SpamFilter</class>
      <param>
         <name>blacklist</name>
         <value>hjg451234gkl.jpg</value>
      </param>
    </filter>

Note that, although the file extension is misleading, the SpamFilter can still access the content of the blacklist file.


--ChristianLerch, 09-Apr-2011 08:18