From 2b9d4f86cd1155b548a92b4429f05ed36a7a7a79 Mon Sep 17 00:00:00 2001
From: Daniel Black <grooverdan@users.sourceforge.net>
Date: Sun, 29 Dec 2013 07:26:41 +0000
Subject: [PATCH] DOC: filter documentation was bigger than DEVELOP so
 separated it out. Hopefully it may get read more

---
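Reviewer note: the greedy-match pitfall described under "Over greedy pattern
matching" in the moved text can be reproduced outside Fail2Ban with Python's re
module. The sketch below is illustrative only. HOST is a simplified IPv4-only
stand-in for the <HOST> tag (an assumption, not Fail2Ban's real expansion), and
matched_host is a hypothetical helper. The log line has its date/time and
syslog prefix removed, as in the documented fail2ban-regex test.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (assumption: IPv4 only).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Log line with the date/time and syslog prefix already removed; the attacker
# controls the text after "ruser".
LINE = (" Failed password for user from 127.0.0.1 port 20000 ssh1:"
        " ruser from 1.2.3.4")

TAIL = r"( port \d*)?( ssh\d+)?(: ruser .*)?$"
GREEDY = r"^ Failed \S+ for .* from " + HOST + TAIL
NON_GREEDY = r"^ Failed \S+ for .*? from " + HOST + TAIL


def matched_host(pattern, line):
    """Return the captured host, or None if the pattern does not match."""
    match = re.search(pattern, line)
    return match.group("host") if match else None


greedy_host = matched_host(GREEDY, LINE)
non_greedy_host = matched_host(NON_GREEDY, LINE)
print(greedy_host, non_greedy_host)
```

With the greedy .* the attacker-supplied address is captured. Making the first
.* non-greedy captures the true origin instead, which agrees with the
fail2ban-regex output quoted in the documentation.
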
 DEVELOP  | 460 +----------------------------------------------------
 FILTERS  | 469 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 MANIFEST |   1 +
 3 files changed, 471 insertions(+), 459 deletions(-)
 create mode 100644 FILTERS

diff --git a/DEVELOP b/DEVELOP
index 3939fe47..18d29ad4 100644
--- a/DEVELOP
+++ b/DEVELOP
@@ -34,465 +34,7 @@ When submitting pull requests on GitHub we ask you to:
 * Include a change to the relevant section of the ChangeLog; and
 * Include yourself in THANKS if not already there.
 
-Filters
-=======
-
-Filters are tricky. They need to:
-* work with a variety of the versions of the software that generates the logs;
-* work with the range of logging configuration options available in the
-  software;
-* work with multiple operating systems;
-* not make assumptions about the log format in excess of the software
-  (e.g. do not assume a username doesn't contain spaces and use \S+ unless
-  you've checked the source code);
-* account for how future versions of the software will log messages
-  (e.g. guess what would happen to the log message if different authentication
-  types are added);
-* not be susceptible to DoS vulnerabilities (see Filter Security below); and
-* match intended log lines only.
-
-Please follow the steps from Filter Test Cases to Developing Filter Regular
-Expressions and submit a GitHub pull request (PR) afterwards. If you get stuck,
-you can push your unfinished changes and still submit a PR -- describe
-what you have done, what is the hurdle, and we'll attempt to help (PR
-will be automagically updated with future commits you would push to
-complete it).
-
-Filter test cases
------------------
-
-Purpose:
-
-Start by finding the log messages that the application generates related to
-some form of authentication failure. If you are adding to an existing filter
-think about whether the log messages are of a similar importance and purpose
-to the existing filter. If you were a user of Fail2Ban, and did a package
-update of Fail2Ban that started matching new log messages, would anything
-unexpected happen?  Would the bantime/findtime for the jail be appropriate for
-the new log messages?  If it doesn't, perhaps it needs to be in a separate
-filter definition, for example like exim filter aims at authentication failures
-and exim-spam at log messages related to spam.
-
-Even if it is a new filter you may consider separating the log messages into
-different filters based on purpose.
-
-Cause:
-
-Are some of the log lines a result of the same action? For example, is a PAM
-failure log message, followed by an application specific failure message the
-result of the same user/script action?  If you add regular expressions for
-both you would end up with two failures for a single action.
-Therefore, select the most appropriate log message and document the other log
-message) with a test case not to match it and a description as to why you chose
-one over another.
-
-With the selected log lines consider what action has caused those log
-messages and whether they could have been generated by accident? Could
-the log message be occurring due to the first step towards the application
-asking for authentication? Could the log messages occur often? If some of
-these are true make a note of this in the jail.conf example that you provide.
-
-Samples:
-
-It is important to include log file samples so any future change in the regular
-expression will still work with the log lines you have identified.
-
-The sample log messages are provided in a file under testcases/files/logs/
-named identically as the corresponding filter (but without .conf extension).
-Each log line should be preceded by a line with failJSON metadata (so the logs
-lines are tested in the test suite) directly above the log line. If there is
-any specific information about the log message, such as version or an
-application configuration option that is needed for the message to occur,
-include this in a comment (line beginning with #) above the failJSON metadata.
-
-Log samples should include only one, definitely not more than 3, examples of
-log messages of the same form. If log messages are different in different
-versions of the application log messages that show this are encouraged.
-
-Also attempt to inject an IP into the application (e.g. by specifying
-it as a username) so that Fail2Ban possibly detects the IP
-from user input rather than the true origin. See the Filter Security section
-and the top example in testcases/files/logs/apache-auth as to how to do this.
-One you have discovered that this is possible, correct the regex so it doesn't
-match and provide this as a test case with "match": false (see failJSON below).
-
-If the mechanism to create the log message isn't obvious provide a
-configuration and/or sample scripts testcases/files/config/{filtername} and
-reference these in the comments above the log line.
-
-FailJSON metadata:
-
-A failJSON metadata is a comment immediately above the log message. It will
-look like:
-
-# failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" }
-
-Time should match the time of the log message. It is in a specific format of
-Year-Month-Day'T'Hour:minute:Second.  If your log message does not include a
-year, like the example below, the year should be listed as 2005, if before Sun
-Aug 14 10am UTC, and 2004 if afterwards.  Here is an example failJSON
-line preceding a sample log line:
-
-# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
-Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543
-
-The "host" in failJSON should contain the IP or domain that should be blocked.
-
-For long lines that you do not want to be matched (e.g. from log injection
-attacks) and any log lines to be excluded (see "Cause" section above), set
-"match": false in the failJSON and describe the reason in the comment above.
-
-After developing regexes, the following command will test all failJSON metadata
-against the log lines in all sample log files
-
-./fail2ban-testcases testSampleRegex
-
-Developing Filter Regular Expressions
--------------------------------------
-
-Date/Time:
-
-At the moment, Fail2Ban depends on log lines to have time stamps.  That is why
-before starting to develop failregex, check if your log line format known to
-Fail2Ban.  Copy the time component from the log line and append an IP address to
-test with following command:
-
-./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>"
-
-Output of such command should contain something like:
-
-Date template hits:
-|- [# of hits] date format
-|  [1] Year-Month-Day Hour:Minute:Second
-
-Ensure that the template description matches time/date elements in your log line
-time stamp.  If there is no matched format then date template needs to be added
-to server/datedetector.py.  Ensure that a new template is added in the order
-that more specific matches occur first and that there is no confusion between a
-Day and a Month.
-
-Filter file:
-
-The filter is specified in a config/filter.d/{filtername}.conf file. Filter file
-can have sections INCLUDES (optional) and Definition as follows:
-
-[INCLUDES]
-
-before = common.conf
-
-after = filtername.local
-
-[Definition]
-
-failregex = ....
-
-ignoreregex = ....
-
-This is also documented in the man page jail.conf (section 5). Other definitions
-can be added to make failregex's more readable and maintainable to be used
-through string Interpolations (see http://docs.python.org/2.7/library/configparser.html)
-
-
-General rules:
-
-Use "before" if you need to include a common set of rules, like syslog or if
-there is a common set of regexes for multiple filters.
-
-Use "after" if you wish to allow the user to overwrite a set of customisations
-of the current filter. This file doesn't need to exist.
-
-Try to avoid using ignoreregex mainly for performance reasons. The case when you
-would use it is if in trying to avoid using it, you end up with an unreadable
-failregex.
-
-Syslog:
-
-If your application logs to syslog you can take advantage of log line prefix
-definitions present in common.conf.  So as a base use:
-
-[INCLUDES]
-
-before = common.conf
-
-[Definition]
-
-_daemon = app
-
-failregex = ^%(__prefix_line)s
-
-In this example common.conf defines __prefix_line which also contains the
-_daemon name (in syslog terms the service) you have just specified. _daemon
-can also be a regex.
-
-For example, to capture following line _daemon should be set to "dovecot"
-
-Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193
-
-and then ^%(__prefix_line)s would match "Dec 12 11:19:11 dunnart dovecot:
-". Note it matches the trailing space(s) as well.
-
-Substitutions (AKA string interpolations):
-
-We have used string interpolations in above examples.  They are useful for
-making the regexes more readable, reuse generic patterns in multiple failregex
-lines, and also to refer definition of regex parts to specific filters or even
-to the user.  General principle is that value of a _name variable replaces
-occurrences of %(_name)s within the same section or anywhere in the config file
-if defined in [DEFAULT] section.
-
-Regular Expressions:
-
-Regular expressions (failregex, ignoreregex) assume that the date/time has been
-removed from the log line (this is just how fail2ban works internally ATM).
-
-If the format is like '<date...> error 1.2.3.4 is evil' then you need to match
-the < at the start so regex should be similar to '^<> <HOST> is evil$' using
-<HOST> where the IP/domain name appears in the log line.
-
-The following general rules apply to regular expressions:
-
-* ensure regexes start with a ^ and are as restrictive as possible. E.g. do not
-  use .* if \d+ is sufficient;
-* use functionality of Python regexes defined in the standard Python re library
-  http://docs.python.org/2/library/re.html;
-* make regular expressions readable (as much as possible). E.g.
-  (?:...) represents a non-capturing regex but (...) is more readable, thus
-  preferred.
-
-If you have only a basic knowledge of regular repressions we advise to read
-http://docs.python.org/2/library/re.html first.  It doesn't take long and would
-remind you e.g. which characters you need to escape and which you don't.
-
-Developing/testing a regex:
-
-You can develop a regex in a file or using command line depending on your
-preference. You can also use samples you have already created in the test cases
-or test them one at a time.
-
-The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to
-use it run:
-
-./fail2ban-regex --help
-
-Take note of  -l heavydebug  / -l debug  and -v as they might be very useful.
-
-TIP: Take a look at the source code of the application you are developing
-     failregex for. You may see optional or extra log messages, or parts there
-     of, that need to form part of your regex.  It may also reveal how some
-     parts are constrained and different formats depending on configuration or
-     less common usages.
-
-TIP: For looking through source code - http://sourcecodebrowser.com/ . It has
-     call graphs and can browse different versions.
-
-TIP: Some applications log spaces at the end. If you are not sure add \s*$ as
-     the end part of the regex.
-
-If your regex is not matching, http://www.debuggex.com/?flavor=python can help
-to tune it.  fail2ban-regex -D ...  will present Debuggex URLs for the regexs
-and sample log files that you pass into it.
-
-In general use when using regex debuggers for generating fail2ban filters:
-* use regex from the ./fail2ban-regex output (to ensure all substitutions are
-done)
-* replace <HOST> with (?&.ipv4)
-* make sure that regex type set to Python
-* for the test data put your log output with the date/time removed
-
-When you have fixed the regex put it back into your filter file.
-
-Please spread the good word about Debuggex - Serge Toarca is kindly continuing
-its free availability to Open Source developers.
-
-Finishing up:
-
-If you've added a new filter, add a new entry in config/jail.conf. The theory
-here is that a user will create a jail.local with [filtername]\nenable=true to
-enable your jail.
-
-So more specifically in the [filter] section in jail.conf:
-* ensure that you have "enabled = false" (users will enable as needed);
-* use "filter =" set to your filter name;
-* use a typical action to disable ports associated with the application;
-* set "logpath" to the usual location of application log file;
-* if the default findtime or bantime isn't appropriate to the filter, specify
-  more appropriate choices (possibly with a brief comment line).
-
-Submit github pull request (See "Pull Requests" above) for
-github.com/fail2ban/fail2ban containing your great work.
-
-Filter Security
----------------
-
-Poor filter regular expressions are susceptible to DoS attacks.
-
-When a remote user has the ability to introduce text that would match filter's
-failregex, while matching inserted text to the <HOST> part, they have the
-ability to deny any host they choose.
-
-So the <HOST> part must be anchored on text generated by the application, and
-not the user, to an extent sufficient to prevent user inserting the entire text
-matching this or any other failregex.
-
-Ideally filter regex should anchor at the beginning and at the end of log line.
-However as more applications log at the beginning than the end, anchoring the
-beginning is more important. If the log file used by the application is shared
-with other applications, like system logs, ensure the other application that use
-that log file do not log user generated text at the beginning of the line, or,
-if they do, ensure the regexes of the filter are sufficient to mitigate the risk
-of insertion.
-
-
-Examples of poor filters
-------------------------
-
-1. Too restrictive
-
-We find a log message:
-
-    Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4
-
-We make a failregex
-
-    ^Invalid command \S+ from <HOST>
-
-Now think evil. The user does the command 'blah from 1.2.3.44'
-
-The program diligently logs:
-
-    Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4
-
-And fail2ban matches 1.2.3.44 as the IP that it ban. A DoS attack was successful.
-
-The fix here is that the command can be anything so .* is appropriate.
-
-    ^Invalid command .* from <HOST>
-
-Here the .* will match until the end of the string. Then realise it has more to
-match, i.e. "from <HOST>" and go back until it find this. Then it will ban
-1.2.3.4 correctly. Since the <HOST> is always at the end, end the regex with a $.
-
-    ^Invalid command .* from <HOST>$
-
-Note if we'd just had the expression:
-
-    ^Invalid command \S+ from <HOST>$
-
-Then provided the user put a space in their command they would have never been
-banned.
-
-2. Unanchored regex can match other user injected data
-
-From the Apache vulnerability CVE-2013-2178
-( original ref: https://vndh.net/note:fail2ban-089-denial-service ).
-
-An example bad regex for Apache:
-
-    failregex = [[]client <HOST>[]] user .* not found
-
-Since the user can do a get request on:
-
-    GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0
-Host: remote.site
-
-Now the log line will be:
-
-    [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found
-
-As this log line doesn't match other expressions hence it matches the above
-regex and blocks 192.168.33.1 as a denial of service from the HTTP requester.
-
-3.  Over greedy pattern matching
-
-From: https://github.com/fail2ban/fail2ban/pull/426
-
-An example ssh log (simplified)
-
-    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser
-
-As we assume username can include anything including spaces its prudent to put
-.* here. The remote user can also exist as anything so lets not make assumptions again.
-
-    failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
-
-So this works. The problem is if the .* after remote user is injected by the
-user to be 'from 1.2.3.4'. The resultant log line is.
-
-    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4
-
-Testing with:
-
-    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
-
-TIP: I've removed the bit that matches __prefix_line from the regex and log.
-
-Shows:
-
-    1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
-       1.2.3.4  Sun Sep 29 17:15:02 2013
-
-It should of matched 127.0.0.1. So the first greedy part of the greedy regex
-matched until the end of the string. The was no "from <HOST>" so the regex
-engine worked backwards from the end of the string until this was matched.
-
-The result was that 1.2.3.4 was matched, injected by the user, and the wrong IP
-was banned.
-
-The solution here is to make the first .* non-greedy with .*?. Here it matches
-as little as required and the fail2ban-regex tool shows the output:
-
-    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
-
-    1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
-       127.0.0.1  Sun Sep 29 17:15:02 2013
-
-So the general case here is a log line that contains:
-
-    (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data)
-
-Where the regex that matches fixed_data_1 is gready and matches the entire
-string, before moving backwards and user_injectable_data can match the entire
-string.
-
-Another case:
-
-ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0
-
-A webserver logs the following without URL escaping:
-
-    [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com"
-
-regex:
-
-    failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"
-
-The .* matches to the end of the string. Finds that it can't continue to match
-", client ... so it moves from the back and find that the user injected web URL:
-
-    ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host
-
-In this case there is a fixed host: "www.myhost.com" at the end so the solution
-is to anchor the regex at the end with a $.
-
-If this wasn't the case then first .* needed to be made so it didn't capture
-beyond <HOST>.
-
-4. Application generates two identical log messages with different meanings
-
-If the application generates the following two messages under different
-circumstances:
-
-    client <IP>: authentication failed
-    client <USER>: authentication failed
-
-
-Then it's obvious that a regex of "^client <HOST>: authentication
-failed$" will still cause problems if the user can trigger the second
-log message with a <USER> of 123.1.1.1.
-
-Here there's nothing to do except request/change the application so it logs
-messages differently.
-
+If you are developing filters, see the FILTERS file for documentation.
 
 Code Testing
 ============
diff --git a/FILTERS b/FILTERS
new file mode 100644
index 00000000..fd441e58
--- /dev/null
+++ b/FILTERS
@@ -0,0 +1,469 @@
+                         __      _ _ ___ _
+                        / _|__ _(_) |_  ) |__  __ _ _ _
+                       |  _/ _` | | |/ /| '_ \/ _` | ' \
+                       |_| \__,_|_|_/___|_.__/\__,_|_||_|
+
+================================================================================
+Developing Filters
+================================================================================
+
+Filters
+=======
+
+Filters are tricky. They need to:
+* work with a variety of versions of the software that generates the logs;
+* work with the range of logging configuration options available in the
+  software;
+* work with multiple operating systems;
+* not make assumptions about the log format in excess of the software
+  (e.g. do not assume a username doesn't contain spaces and use \S+ unless
+  you've checked the source code);
+* account for how future versions of the software will log messages
+  (e.g. guess what would happen to the log message if different authentication
+  types are added);
+* not be susceptible to DoS vulnerabilities (see Filter Security below); and
+* match intended log lines only.
+
+Please follow the steps from Filter Test Cases to Developing Filter Regular
+Expressions and submit a GitHub pull request (PR) afterwards. If you get stuck,
+you can push your unfinished changes and still submit a PR -- describe
+what you have done and what the hurdle is, and we'll attempt to help (the PR
+will be automagically updated with any further commits you push to
+complete it).
+
+Filter test cases
+-----------------
+
+Purpose:
+
+Start by finding the log messages that the application generates related to
+some form of authentication failure. If you are adding to an existing filter
+think about whether the log messages are of a similar importance and purpose
+to the existing filter. If you were a user of Fail2Ban, and did a package
+update of Fail2Ban that started matching new log messages, would anything
+unexpected happen?  Would the bantime/findtime for the jail be appropriate for
+the new log messages?  If not, perhaps they need to be in a separate filter
+definition; for example, the exim filter aims at authentication failures
+and exim-spam at log messages related to spam.
+
+Even if it is a new filter you may consider separating the log messages into
+different filters based on purpose.
+
+Cause:
+
+Are some of the log lines a result of the same action? For example, is a PAM
+failure log message, followed by an application specific failure message the
+result of the same user/script action?  If you add regular expressions for
+both you would end up with two failures for a single action.
+Therefore, select the most appropriate log message and document the other log
+message(s) with a test case that must not match and a description of why you
+chose one over another.
+
+With the selected log lines, consider what action has caused those log
+messages and whether they could have been generated by accident. Could
+the log message be occurring due to the first step towards the application
+asking for authentication? Could the log messages occur often? If any of
+these are true, make a note of this in the jail.conf example that you provide.
+
+Samples:
+
+It is important to include log file samples so any future change in the regular
+expression will still work with the log lines you have identified.
+
+The sample log messages are provided in a file under testcases/files/logs/
+named identically to the corresponding filter (without the .conf extension).
+Each log line should have a line of failJSON metadata directly above it (so the
+log lines are tested in the test suite). If there is
+any specific information about the log message, such as version or an
+application configuration option that is needed for the message to occur,
+include this in a comment (line beginning with #) above the failJSON metadata.
+
+Log samples should include ideally one, and definitely no more than three,
+examples of log messages of the same form. If log messages differ between
+versions of the application, log messages that show this are encouraged.
+
+Also attempt to inject an IP into the application (e.g. by specifying
+it as a username) to see whether Fail2Ban could detect the IP
+from user input rather than the true origin. See the Filter Security section
+and the top example in testcases/files/logs/apache-auth as to how to do this.
+Once you have discovered that this is possible, correct the regex so it doesn't
+match and provide this as a test case with "match": false (see failJSON below).
+
+If the mechanism to create the log message isn't obvious, provide a
+configuration and/or sample scripts in testcases/files/config/{filtername} and
+reference these in the comments above the log line.
+
+FailJSON metadata:
+
+failJSON metadata is a comment placed immediately above the log message. It
+looks like:
+
+# failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" }
+
+Time should match the time of the log message. It is in the specific format
+Year-Month-Day'T'Hour:Minute:Second.  If your log message does not include a
+year, like the example below, the year should be listed as 2005 if before Sun
+Aug 14 10am UTC, and 2004 if afterwards.  Here is an example failJSON
+line preceding a sample log line:
+
+# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
+Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543
+
+The "host" in failJSON should contain the IP or domain that should be blocked.
+
+For long lines that you do not want to be matched (e.g. from log injection
+attacks) and any log lines to be excluded (see "Cause" section above), set
+"match": false in the failJSON and describe the reason in the comment above.
+
+After developing regexes, the following command will test all failJSON metadata
+against the log lines in all sample log files:
+
+./fail2ban-testcases testSampleRegex
+
+Developing Filter Regular Expressions
+-------------------------------------
+
+Date/Time:
+
+At the moment, Fail2Ban depends on log lines having time stamps.  That is why,
+before starting to develop a failregex, you should check that your log line
+format is known to Fail2Ban.  Copy the time component from the log line and
+append an IP address to test with the following command:
+
+./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>"
+
+The output of this command should contain something like:
+
+Date template hits:
+|- [# of hits] date format
+|  [1] Year-Month-Day Hour:Minute:Second
+
+Ensure that the template description matches time/date elements in your log line
+time stamp.  If there is no matching format, a date template needs to be added
+to server/datedetector.py.  Ensure that a new template is added in such an order
+that more specific matches occur first and that there is no confusion between a
+Day and a Month.
+
+Filter file:
+
+The filter is specified in a config/filter.d/{filtername}.conf file. A filter
+file can have the sections INCLUDES (optional) and Definition as follows:
+
+[INCLUDES]
+
+before = common.conf
+
+after = filtername.local
+
+[Definition]
+
+failregex = ....
+
+ignoreregex = ....
+
+This is also documented in the jail.conf man page (section 5). Other definitions
+can be added to make failregexes more readable and maintainable; they are used
+through string interpolation (see http://docs.python.org/2.7/library/configparser.html).
+
+
+General rules:
+
+Use "before" if you need to include a common set of rules, like syslog or if
+there is a common set of regexes for multiple filters.
+
+Use "after" if you wish to allow the user to override a set of customisations
+of the current filter. This file doesn't need to exist.
+
+Try to avoid using ignoreregex, mainly for performance reasons. The one case
+for using it is when avoiding it would leave you with an unreadable
+failregex.
+
+Syslog:
+
+If your application logs to syslog you can take advantage of log line prefix
+definitions present in common.conf.  So as a base use:
+
+[INCLUDES]
+
+before = common.conf
+
+[Definition]
+
+_daemon = app
+
+failregex = ^%(__prefix_line)s
+
+In this example common.conf defines __prefix_line which also contains the
+_daemon name (in syslog terms the service) you have just specified. _daemon
+can also be a regex.
+
+For example, to capture the following line, _daemon should be set to "dovecot":
+
+Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193
+
+and then ^%(__prefix_line)s would match "Dec 12 11:19:11 dunnart dovecot:
+". Note it matches the trailing space(s) as well.
+
+Substitutions (AKA string interpolations):
+
+We have used string interpolations in the above examples.  They are useful for
+making the regexes more readable, reusing generic patterns in multiple failregex
+lines, and deferring the definition of regex parts to specific filters or even
+to the user.  The general principle is that the value of a _name variable
+replaces occurrences of %(_name)s within the same section, or anywhere in the
+config file if defined in the [DEFAULT] section.
+
+Regular Expressions:
+
+Regular expressions (failregex, ignoreregex) assume that the date/time has been
+removed from the log line (this is just how fail2ban works internally ATM).
+
+If the format is like '<date...> error 1.2.3.4 is evil' then you need to match
+the < at the start, so the regex should be similar to
+'^<> error <HOST> is evil$', using <HOST> where the IP/domain name appears.
+
+The following general rules apply to regular expressions:
+
+* ensure regexes start with a ^ and are as restrictive as possible. E.g. do not
+  use .* if \d+ is sufficient;
+* use functionality of Python regexes defined in the standard Python re library
+  http://docs.python.org/2/library/re.html;
+* make regular expressions readable (as much as possible). E.g.
+  (?:...) represents a non-capturing group but (...) is more readable, thus
+  preferred.
+
+If you have only a basic knowledge of regular expressions we advise reading
+http://docs.python.org/2/library/re.html first.  It doesn't take long and will
+remind you e.g. which characters you need to escape and which you don't.
+
+Developing/testing a regex:
+
+You can develop a regex in a file or using command line depending on your
+preference. You can also use samples you have already created in the test cases
+or test them one at a time.
+
+The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to
+use it run:
+
+./fail2ban-regex --help
+
+Take note of  -l heavydebug  / -l debug  and -v as they might be very useful.
+
+TIP: Take a look at the source code of the application you are developing
+     failregex for. You may see optional or extra log messages, or parts
+     thereof, that need to form part of your regex.  It may also reveal how some
+     parts are constrained and different formats depending on configuration or
+     less common usages.
+
+TIP: For looking through source code - http://sourcecodebrowser.com/ . It has
+     call graphs and can browse different versions.
+
+TIP: Some applications log spaces at the end. If you are not sure add \s*$ as
+     the end part of the regex.
+
+If your regex is not matching, http://www.debuggex.com/?flavor=python can help
+to tune it.  fail2ban-regex -D ...  will present Debuggex URLs for the regexes
+and sample log files that you pass into it.
+
+In general, when using regex debuggers to develop fail2ban filters:
+* use the regex from the ./fail2ban-regex output (to ensure all substitutions
+  are done);
+* replace <HOST> with (?&.ipv4);
+* make sure that the regex type is set to Python;
+* for the test data, use your log output with the date/time removed.
+
+When you have fixed the regex put it back into your filter file.
+
+Please spread the good word about Debuggex - Serge Toarca is kindly continuing
+its free availability to Open Source developers.
+
+Finishing up:
+
+If you've added a new filter, add a new entry in config/jail.conf. The theory
+here is that a user will create a jail.local with [filtername]\nenabled=true to
+enable your jail.
+
+More specifically, in the [filtername] section of jail.conf:
+* ensure that you have "enabled = false" (users will enable as needed);
+* use "filter =" set to your filter name;
+* use a typical action to block the ports associated with the application;
+* set "logpath" to the usual location of the application's log file;
+* if the default findtime or bantime isn't appropriate to the filter, specify
+  more appropriate choices (possibly with a brief comment line).
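+
+As a minimal sketch of the checklist above, an entry might look like the
+following. The jail name, filter name, port, log path and times are
+hypothetical placeholders rather than values taken from a real filter, and the
+iptables action is only one possible choice:
+
+    [myapp-iptables]
+    enabled  = false
+    filter   = myapp
+    action   = iptables[name=myapp, port=http, protocol=tcp]
+    logpath  = /var/log/myapp/error.log
+    # hypothetical overrides; omit these if the defaults are appropriate
+    findtime = 600
+    bantime  = 3600
+
+A user would then enable it from jail.local with:
+
+    [myapp-iptables]
+    enabled = true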
+
+Submit a GitHub pull request (see "Pull Requests" above) to
+github.com/fail2ban/fail2ban containing your great work.
+
+Filter Security
+---------------
+
+Poor filter regular expressions are susceptible to DoS attacks.
+
+When a remote user has the ability to introduce text that would match a
+filter's failregex, with the inserted text matching the <HOST> part, they have
+the ability to get any host they choose banned.
+
+So the <HOST> part must be anchored on text generated by the application, and
+not the user, to an extent sufficient to prevent the user from inserting the
+entire text matching this or any other failregex.
+
+Ideally a filter regex should be anchored at both the beginning and the end of
+the log line. However, as more applications log at the beginning than the end,
+anchoring the beginning is more important. If the log file used by the
+application is shared with other applications, like system logs, ensure the
+other applications that use that log file do not log user generated text at the
+beginning of the line, or, if they do, ensure the regexes of the filter are
+sufficient to mitigate the risk of insertion.
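+
+As an illustration only, compare an unanchored and an anchored failregex for a
+hypothetical log message. "Authentication failure for <user> from <ip>" is
+invented for this sketch, and __prefix_line is assumed to be available to the
+filter in the same way as in the sshd example further down:
+
+    # unanchored: may match inside text supplied by the remote user
+    failregex = Authentication failure for .* from <HOST>
+
+    # anchored at both ends on text generated by the application
+    failregex = ^%(__prefix_line)sAuthentication failure for .* from <HOST>$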
+
+
+Examples of poor filters
+------------------------
+
+1. Too restrictive
+
+We find a log message:
+
+    Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4
+
+We make a failregex
+
+    ^Invalid command \S+ from <HOST>
+
+Now think evil. The user issues the command 'blah from 1.2.3.44'.
+
+The program diligently logs:
+
+    Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4
+
+And fail2ban matches 1.2.3.44 as the IP to ban. A DoS attack was successful.
+
+The fix here is that the command can be anything so .* is appropriate.
+
+    ^Invalid command .* from <HOST>
+
+Here the .* will first match to the end of the string. The regex engine then
+realises it has more to match, i.e. "from <HOST>", and backtracks until it
+finds this. It will then ban 1.2.3.4 correctly. Since the <HOST> is always at
+the end, end the regex with a $.
+
+    ^Invalid command .* from <HOST>$
+
+Note if we'd just had the expression:
+
+    ^Invalid command \S+ from <HOST>$
+
+Then provided the user put a space in their command they would have never been
+banned.
+
+2. Unanchored regex can match other user injected data
+
+From the Apache vulnerability CVE-2013-2178
+( original ref: https://vndh.net/note:fail2ban-089-denial-service ).
+
+An example bad regex for Apache:
+
+    failregex = [[]client <HOST>[]] user .* not found
+
+The user can make a GET request for:
+
+    GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0
+    Host: remote.site
+
+Now the log line will be:
+
+    [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found
+
+As this log line doesn't match other expressions, it matches the above regex
+at the injected "[client 192.168.0.1] user root not found" text and blocks
+192.168.0.1, an address chosen by the HTTP requester: a denial of service.
+
+3. Over-greedy pattern matching
+
+From: https://github.com/fail2ban/fail2ban/pull/426
+
+An example ssh log (simplified)
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser
+
+As we assume the username can include anything, including spaces, it's prudent
+to put .* here. The remote user can also be anything, so let's not make
+assumptions there either.
+
+    failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+
+So this works. The problem arises if the remote user, matched by the final .*,
+is set by the attacker to 'from 1.2.3.4'. The resultant log line is:
+
+    Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4
+
+Testing with:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+TIP: I've removed the bit that matches __prefix_line from the regex and log.
+
+Shows:
+
+    1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       1.2.3.4  Sun Sep 29 17:15:02 2013
+
+It should have matched 127.0.0.1. The first greedy .* matched until the end of
+the string. There was no "from <HOST>" after it, so the regex engine worked
+backwards from the end of the string until this was matched.
+
+The result was that 1.2.3.4, injected by the user, was matched and the wrong IP
+was banned.
+
+The solution here is to make the first .* non-greedy with .*?. Here it matches
+as little as required and the fail2ban-regex tool shows the output:
+
+    fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'
+
+    1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
+       127.0.0.1  Sun Sep 29 17:15:02 2013
+
+So the general case here is a log line that contains:
+
+    (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data)
+
+Where the regex that matches fixed_data_1 is greedy and first matches the
+entire string before moving backwards, and user_injectable_data can contain
+text that the regex matches in full.
+
+Another case:
+
+ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0
+
+A webserver logs the following without URL escaping:
+
+    [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com"
+
+regex:
+
+    failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"
+
+The .* matches to the end of the string, finds that it can't continue to match
+", client ...", and so moves backwards from the end until it finds the user
+injected web URL:
+
+    ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host
+
+In this case there is a fixed host: "www.myhost.com" at the end so the solution
+is to anchor the regex at the end with a $.
+
+If this wasn't the case then the first .* would need to be restricted so that
+it cannot match beyond <HOST>.
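+
+A sketch of that alternative, using the same non-greedy approach as the ssh
+example above. It is shown as an illustration, not as the regex shipped in any
+filter, and whether it is sufficient depends on what else the application can
+log:
+
+    failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*?"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"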
+
+4. Application generates two identical log messages with different meanings
+
+If the application generates the following two messages under different
+circumstances:
+
+    client <IP>: authentication failed
+    client <USER>: authentication failed
+
+
+Then it's obvious that a regex of "^client <HOST>: authentication
+failed$" will still cause problems if the user can trigger the second
+log message with a <USER> of 123.1.1.1.
+
+Here there's nothing to do except request/change the application so it logs
+messages differently.
+
+
diff --git a/MANIFEST b/MANIFEST
index 977a9ec5..83cab61e 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -5,6 +5,7 @@ TODO
 THANKS
 COPYING
 DEVELOP
+FILTERS
 fail2ban-client
 fail2ban-server
 fail2ban-testcases