As far as I know, nobody uses it. It was replaced by PublicIPs. If I were
being very polite I would leave it in internal, but since I am 99.99% sure
nobody uses it, I am cutting it. Let's argue about it.
It was an ABA problem where the proxy loop might see its own service as
"existing" when it had been destroyed and recreated (as in an update).
To prove this I added a counter of running ProxyLoop goroutines and check that
in tests. If I undo my main change, the tests fail. This makes the
proxier_test significantly slower (3 seconds vs 0.5 seconds). Sorry.
After this DNS is resolvable from the host, if the DNS server is targetted
explicitly. This does NOT add the cluster DNS to the host's resolv.conf. That
is a larger problem, with distro-specific tie-ins and circular deps.
- Added process to cleanup stale session affinity records
- Automatically set cloud provided load balancer for sticky session if the service requires it - Note, this only works on GCE right now.
- Changed sessionAffinityMap a map to pointers instead of structs to improve performance
- Commented out cookie and protocol from sessionAffinityDetail to avoid confusion as it is not yet implemented.
Don't log an error when Accept failed because the interface (portal)
was just removed.
Don't pass around a pointer to a serviceInfo since another thread
deletes those. Instead, just check if service name is still in the
service map.
Delete the locking on the serviceInfo object since it is only used
by the "main" proxier thread.
The iptables args list needs to include all fields as they are eventually spit
out by iptables-save. This is because some systems do not support the
'iptables -C' arg, and so fall back on parsing iptables-save output. If this
does not match, it will not pass the check. For example: adding the /32 on
the destination IP arg is not strictly required, but causes this list to not
match the final iptables-save output. This is fragile and I hope one day we
can stop supporting such old iptables versions.
This allows the proxier to portal Public IPs even if the
createExternalLoadBalancer flag is not set.
This also fixes what appears to be a bug in the createExternalLoadBalancer path
wherein multiple PublicIPs would get truncated.
Move a lot of common error logging into better buckets:
glog.Errorf() - Always an error
glog.Warningf() - Something unexpected, but probably not an error
glog.V(0) - Generally useful for this to ALWAYS be visible
to an operator
* Programmer errors
* Logging extra info about a panic
* CLI argument handling
glog.V(1) - A reasonable default log level if you don't want
verbosity
* Information about config (listening on X, watching Y)
* Errors that repeat frequently that relate to conditions
that can be corrected (pod detected as unhealthy)
glog.V(2) - Useful steady state information about the service
* Logging HTTP requests and their exit code
* System state changing (killing pod)
* Controller state change events (starting pods)
* Scheduler log messages
glog.V(3) - Extended information about changes
* More info about system state changes
glog.V(4) - Debug level verbosity (for now)
* Logging in particularly thorny parts of code where
you may want to come back later and check it
The bug was that the proxier is passed as value on method decleration.
This caused a copy of Proxier to be created when the method was invoked.
The copy being a shallow copy turned out to have them both reference
a same map instance, but their mutexes were different instances.
This turned out to use different mutexes before operating on a same map
instance, which didn't make sense.
This change includes minor refactoring and cleanup of the proxy
package including the following items:
* Rename source files with misspelling of round robin
* Remove unnecessary and redundant comments
* Update comments for clarity
* Add locking when updating the round-robin index
* Improve method receiver names
* Rename the LoadBalance method to NextEndpoint to add clarity
No changes in behaviour have been introduced.