diff --git a/website/source/api/index.html.md b/website/source/api/index.html.md
index 0740fe3364..aea0c3e750 100644
--- a/website/source/api/index.html.md
+++ b/website/source/api/index.html.md
@@ -77,6 +77,51 @@ to the supplied maximum `wait` time to spread out the wake up time of any
 concurrent requests. This adds up to `wait / 16` additional time to the maximum
 duration.
 
+### Implementation Details
+
+While the mechanism is relatively simple to work with, there are a few edge 
+cases that must be handled correctly.
+
+ * **Reset the index if it goes backwards**. While indexes in general are 
+   monotonically increasing(i.e. they should only ever increase as time passes), 
+   there are several real-world scenarios in 
+   which they can go backwards for a given query. Implementations must check 
+   to see if a returned index is lower than the previous value, 
+   and if it is, should reset index to `0` - effectively restarting their blocking loop. 
+   Failure to do so may cause the client to miss future updates for an unbounded 
+   time, or to use an invalid index value that causes no blocking and increases 
+   load on the servers. Cases where this can occur include:
+   * If a raft snapshot is restored on the servers with older version of the data.
+   * KV list operations where an item with the highest index is removed.
+   * A Consul upgrade changes the way watches work to optimize them with more 
+   granular indexes.
+
+ * **Sanity check index is greater than zero**. After the initial request (or a
+   reset as above) the `X-Consul-Index` returned _should_ always be greater than zero. It
+   is a bug in Consul if it is not, however this has happened a few times and can
+   still be triggered on some older Consul versions. It's especially bad because it
+   causes blocking clients that are not aware to enter a busy loop, using excessive 
+   client CPU and causing high load on servers. It is _always_ safe to use an 
+   index of `1` to wait for updates when the data being requested doesn't exist
+   yet, so clients _should_ sanity check that their index is at least 1 after 
+   each blocking response is handled to be sure they actually block on the next 
+   request.
+
+ * **Rate limit**. The blocking query mechanism is reasonably efficient when updates 
+   are relatively rare (order of tens of seconds to minutes between updates). In cases 
+   where a result gets updated very fast however - possibly during an outage or incident 
+   with a badly behaved client - blocking query loops degrade into busy loops that 
+   consume excessive client CPU and cause high server load. While it's possible to just add a sleep 
+   to every iteration of the loop, this is **not** recommended since it causes update 
+   delivery to be delayed in the happy case, and it can exacerbate the problem since 
+   it increases the chance that the index has changed on the next request. Clients 
+   _should_ instead rate limit the loop so that in the happy case they proceed without 
+   waiting, but when values start to churn quickly they degrade into polling at a 
+   reasonable rate (say every 15 seconds). Ideally this is done with an algorithm that 
+   allows a couple of quick successive deliveries before it starts to limit rate - a 
+   [token bucket](https://en.wikipedia.org/wiki/Token_bucket) with burst of 2 is a simple
+   way to achieve this.
+
 ### Hash-based Blocking Queries
 
 A limited number of agent endpoints also support blocking however because the