Merge pull request #7259 from a-robinson/fluentd

Make a Fluentd sidecar image and example for sending logs from within a container to Elasticsearch
David Oppenheimer 2015-04-23 17:10:24 -07:00
commit 2d69f03183
9 changed files with 175 additions and 6 deletions

contrib/logging/fluentd-sidecar-es/Dockerfile

@ -0,0 +1,43 @@
# This Dockerfile builds an image that is configured to use Fluentd to
# collect container log files from the specified paths and send them to
# Elasticsearch.
# The environment variable that controls which log files are collected is
# FILES_TO_COLLECT. Files specified in the environment variable should be
# separated by whitespace, as in "/var/log/syslog /var/log/nginx/access.log".
# This configuration assumes that the cluster this pod is running in has an
# Elasticsearch instance reachable via a service named elasticsearch-logging.

FROM ubuntu:14.04
MAINTAINER Alex Robinson "arob@google.com"

# Disable prompts from apt.
ENV DEBIAN_FRONTEND noninteractive
ENV OPTS_APT -y --force-yes --no-install-recommends

# Install the standard, official Fluentd agent (called td-agent, for the
# project's parent company, Treasure Data).
RUN apt-get -q update && \
    apt-get install -y -q curl libcurl4-openssl-dev make && \
    apt-get clean
RUN /usr/bin/curl -L http://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh

# Change the default user and group to root. Needed to ensure Fluentd can
# access files anywhere in the filesystem that it's requested to.
RUN sed -i -e "s/USER=td-agent/USER=root/" -e "s/GROUP=td-agent/GROUP=root/" /etc/init.d/td-agent

# Install the Elasticsearch Fluentd plug-in.
RUN /usr/sbin/td-agent-gem install fluent-plugin-elasticsearch

# Copy the configuration file generator for creating input configurations for
# each file specified in the FILES_TO_COLLECT environment variable.
COPY config_generator.sh /usr/local/sbin/config_generator.sh

# Copy the Fluentd configuration file for collecting from all the inputs
# generated by the config generator and sending them to Elasticsearch.
COPY td-agent.conf /etc/td-agent/td-agent.conf

# Run the config generator to get the config files in place and start Fluentd.
# We have to run the config generator at runtime rather than at build time so
# that it can incorporate the files named in the environment variable into its
# config.
CMD /usr/local/sbin/config_generator.sh && /usr/sbin/td-agent -qq --use-v1-config --suppress-repeated-stacktrace > /var/log/td-agent/td-agent.log
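
Since the input configs are generated when the container starts, one rough way to smoke-test the image outside Kubernetes is to override the CMD and print what the generator writes; a sketch, with an illustrative log path:

```
docker run --rm \
  -e FILES_TO_COLLECT="/mnt/log/app.log" \
  gcr.io/google_containers/fluentd-sidecar-es:1.0 \
  bash -c '/usr/local/sbin/config_generator.sh && cat /etc/td-agent/files/*'
```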

contrib/logging/fluentd-sidecar-es/Makefile

@ -0,0 +1,9 @@
.PHONY: build push

TAG = 1.0

build:
	docker build -t gcr.io/google_containers/fluentd-sidecar-es:$(TAG) .

push:
	gcloud preview docker push gcr.io/google_containers/fluentd-sidecar-es:$(TAG)
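
Both targets can be invoked together, and `TAG` can be overridden on the command line; for example (the alternate tag is hypothetical):

```
make build push
make build push TAG=1.1
```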

contrib/logging/fluentd-sidecar-es/README.md

@ -0,0 +1,25 @@
# Collecting log files from within containers with Fluentd and sending them to Elasticsearch.

*Note that this only works for clusters with an Elasticsearch service. If your cluster is logging to Google Cloud Logging instead (e.g. if you're using Container Engine), see [this guide](/contrib/logging/fluentd-sidecar-gcp/).*
This directory contains the source files needed to make a Docker image that collects logs from arbitrary files within a container using [Fluentd](http://www.fluentd.org/) and sends them to the cluster's Elasticsearch service.
The image is designed to be used as a sidecar container as part of a pod.
It lives in the Google Container Registry under the name `gcr.io/google_containers/fluentd-sidecar-es`.

This shouldn't be necessary if your container writes its logs to stdout or stderr, since the Kubernetes cluster's default logging infrastructure collects those automatically. It is useful, however, if your application logs to a specific file in its filesystem and can't easily be changed to do otherwise.
In order to make this work, you have to add a few things to your pod config (see `logging-sidecar-pod.yaml` below for a complete example):

1. A second container, using the `gcr.io/google_containers/fluentd-sidecar-es:1.0` image, to send the logs to Elasticsearch.
2. A volume for the two containers to share. The `emptyDir` volume type is a good choice because the volume only needs to exist for the lifetime of the pod.
3. Mount paths for the volume in each container. In your primary container, this should be the path that the application's log files are written to. In the secondary container, this can be just about anything, so we put it under `/mnt/log` to keep it out of the way of the rest of the filesystem.
4. The `FILES_TO_COLLECT` environment variable in the sidecar container, telling it which files to collect logs from. These paths should always be in the mounted volume.
To try it out, make sure that your cluster was set up to log to Elasticsearch when it was created (i.e. you set `LOGGING_DESTINATION=elasticsearch`), then simply run
```
kubectl create -f logging-sidecar-pod.yaml
```
You should see the logs show up in the cluster's Kibana log viewer shortly after creating the pod. To clean up after yourself, simply run
```
kubectl delete -f logging-sidecar-pod.yaml
```
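
If you'd rather check the raw documents than the Kibana view, you can query Elasticsearch directly; a sketch, assuming the standard `elasticsearch-logging` service in the `default` namespace and the `logstash-*` index naming that `logstash_format true` produces (run from inside the cluster, e.g. via `kubectl exec`):

```
# Each document carries a "tag" field (include_tag_key true), e.g.
# file.synthetic-count.log, making it easy to tell the sources apart.
curl "http://elasticsearch-logging.default:9200/logstash-*/_search?size=3&pretty"
```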

contrib/logging/fluentd-sidecar-es/config_generator.sh

@ -0,0 +1,37 @@
#!/bin/bash
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
mkdir -p /etc/td-agent/files

if [ -z "$FILES_TO_COLLECT" ]; then
  exit 0
fi

# Generate one tail input config for each whitespace-separated path in
# FILES_TO_COLLECT. Each tail source needs its own pos_file, so derive one
# from the log file's name.
for filepath in $FILES_TO_COLLECT
do
  filename=$(basename "$filepath")
  cat > "/etc/td-agent/files/${filename}" << EndOfMessage
<source>
  type tail
  format none
  time_key time
  path ${filepath}
  pos_file /etc/td-agent/${filename}.pos
  time_format %Y-%m-%dT%H:%M:%S
  tag file.${filename}
  read_from_head true
</source>
EndOfMessage
done
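
For reference, here is roughly how the generator behaves when run by hand; the log paths are illustrative:

```
export FILES_TO_COLLECT="/mnt/log/app.log /mnt/log/error.log"
/usr/local/sbin/config_generator.sh
ls /etc/td-agent/files    # app.log  error.log
# Each generated file holds one <source> stanza tailing its path, tagged
# file.app.log / file.error.log, which the td-agent.conf match block routes
# to Elasticsearch.
```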

contrib/logging/fluentd-sidecar-es/logging-sidecar-pod.yaml

@ -0,0 +1,26 @@
apiVersion: v1beta3
kind: Pod
metadata:
  labels:
    example: logging-sidecar
  name: logging-sidecar-example
spec:
  containers:
  - name: synthetic-logger
    image: ubuntu:14.04
    command: ["bash", "-c", "i=\"0\"; while true; do echo \"`hostname`: $i \" >> /var/log/synthetic-count.log; date --rfc-3339 ns >> /var/log/synthetic-dates.log; sleep 4; i=$[$i+1]; done"]
    volumeMounts:
    - name: log-storage
      mountPath: /var/log
  - name: sidecar-log-collector
    image: gcr.io/google_containers/fluentd-sidecar-es:1.0
    env:
    - name: FILES_TO_COLLECT
      value: "/mnt/log/synthetic-count.log /mnt/log/synthetic-dates.log"
    volumeMounts:
    - name: log-storage
      readOnly: true
      mountPath: /mnt/log
  volumes:
  - name: log-storage
    emptyDir: {}
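
Once the pod is running, one quick sanity check (a sketch; the pod and container names are taken from the spec above) is to read the synthetic logs through the sidecar's mount to confirm the shared volume is wired up:

```
kubectl exec logging-sidecar-example -c sidecar-log-collector -- \
  tail -n 2 /mnt/log/synthetic-count.log
```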

contrib/logging/fluentd-sidecar-es/td-agent.conf

@ -0,0 +1,27 @@
# This Fluentd configuration file enables the collection of log files that
# are specified in an environment variable when the container is created,
# relying on the config_generator.sh script to generate an input
# configuration file for each log file to collect.
# Logs collected will be sent to the cluster's Elasticsearch service.
#
# Currently the collector treats each file as unstructured text rather than
# allowing the user to specify how to parse it.

# Pick up all the auto-generated input config files, one for each file
# specified in the FILES_TO_COLLECT environment variable.
@include files/*

# All the auto-generated files should use the tag "file.<filename>".
<match file.**>
  type elasticsearch
  log_level info
  include_tag_key true
  host elasticsearch-logging.default
  port 9200
  logstash_format true
  flush_interval 5s
  # Never wait longer than 5 minutes between retries.
  max_retry_wait 300
  # Disable the limit on the number of retries (retry forever).
  disable_retry_limit
</match>

contrib/logging/fluentd-sidecar-gcp/Dockerfile

@ -4,7 +4,7 @@
 # The environment variable that controls which log files are collected is
 # FILES_TO_COLLECT. Files specified in the environment variable should be
 # separated by whitespace, as in "/var/log/syslog /var/log/nginx/access.log".
-# This configuration assumes that the host performning the collection is a VM
+# This configuration assumes that the host performing the collection is a VM
 # that has been created with a logging.write scope and that the Logging API
 # has been enabled for the project in the Google Developer Console.

contrib/logging/fluentd-sidecar-gcp/README.md

@ -1,4 +1,6 @@
-# Collecting log files from within containers with Fluentd and sending to the Google Cloud Logging service.
+# Collecting log files from within containers with Fluentd and sending them to the Google Cloud Logging service.
+
+*Note that this only works for clusters running on GCE and whose VMs have the cloud-logging.write scope. If your cluster is logging to Elasticsearch instead, see [this guide](/contrib/logging/fluentd-sidecar-es/).*
 This directory contains the source files needed to make a Docker image that collects log files from arbitrary files within a container using [Fluentd](http://www.fluentd.org/) and sends them to GCP.
 The image is designed to be used as a sidecar container as part of a pod.
 It lives in the Google Container Registry under the name `gcr.io/google_containers/fluentd-sidecar-gcp`.