Application Log Monitoring using Fluentd as a Daemon Set in goPaddle
To comply with the Payment Card Industry Data Security Standard (PCI DSS), application access, transaction and system logs need to be monitored. Fluentd is an open source, distributed logging and aggregation framework. It offers a unified logging layer to collate and filter different kinds of logs, such as access logs, application logs and system logs. Fluentd also helps to aggregate, analyse and archive these logs. Kubernetes uses the glog library for its own logging and Fluentd for application log monitoring.
For application (microservice) log monitoring, the suggested pattern is to deploy Fluentd as a sidecar container within the same pod as the application. A shared volume between the application container and the Fluentd container holds the containers' log files. Fluentd uses the shared volume as its data input and the Google Cloud Logging service as its data output, provided the cluster runs on Google Compute Engine and the VM has the cloud-logging.write scope.
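The sidecar layout described above can be sketched as a pod manifest. This is a minimal illustration, not what goPaddle generates: the helper name `sidecar_pod`, the volume name `app-logs`, the image names and the log path are all assumptions chosen for the example. It builds the manifest as a Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML.

```python
import json


def sidecar_pod(app_name, app_image, log_dir):
    """Build a Pod manifest with an app container and a Fluentd sidecar
    that share one emptyDir volume for log files.

    All names and images here are illustrative placeholders.
    """
    mount = {"name": "app-logs", "mountPath": log_dir}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": app_name, "labels": {"app": app_name}},
        "spec": {
            "containers": [
                # The application writes its log files into the shared volume.
                {"name": app_name, "image": app_image, "volumeMounts": [mount]},
                # The Fluentd sidecar only reads from the same volume.
                {
                    "name": "fluentd",
                    "image": "fluent/fluentd",
                    "volumeMounts": [dict(mount, readOnly=True)],
                },
            ],
            # emptyDir lives as long as the pod and is visible to both containers.
            "volumes": [{"name": "app-logs", "emptyDir": {}}],
        },
    }


if __name__ == "__main__":
    pod = sidecar_pod("tomcat", "tomcat:9", "/usr/local/tomcat/logs")
    print(json.dumps(pod, indent=2))
```

The Fluentd configuration that tails the shared folder is left out here; the point is only that both containers mount the same volume.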
The Daemon Set feature, available as a beta feature in Kubernetes 1.2, allows you to deploy Fluentd as a pod across all the nodes in your cluster, instead of deploying it separately for each application.
In Kubernetes 1.2, Daemon Set is a beta feature in the extensions API group. Typical use cases are log aggregation and monitoring services that need to run on a selected set of nodes within a cluster. Daemon Sets are headless services and are very different from replication controllers. Replication controllers run stateless services that can be scaled up or down and scheduled across different nodes, whereas Daemon Sets perform specific tasks on selected nodes.
Benefits of Daemon Sets
- Daemon Sets can be deployed the same way as deploying an application on Kubernetes
- Daemon Sets can be deployed on a selected set of nodes with specific tags. Whenever a new node is added with a similar tag, Daemons get deployed automatically.
- Daemon Sets with different resource requirements like CPU, memory & storage can be grouped and deployed on selected nodes
- Running Daemon Sets as separate PODs & containers within a node gives better resource isolation rather than running them as simple daemons within the node
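To make the node selection and resource isolation points above concrete, here is a minimal sketch of a Daemon Set manifest with a node selector, plus a small helper that mimics which nodes would receive a daemon pod. The label `logging: enabled`, the resource limits, the image and the function names are assumptions made for illustration. Kubernetes 1.2 served Daemon Sets from `extensions/v1beta1`; current clusters use `apps/v1`, which is the default below.

```python
def fluentd_daemon_set(node_labels, api_version="apps/v1"):
    """Build a DaemonSet manifest that runs Fluentd only on nodes
    carrying every label in node_labels (illustrative values throughout)."""
    labels = {"app": "fluentd"}
    return {
        "apiVersion": api_version,
        "kind": "DaemonSet",
        "metadata": {"name": "fluentd", "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Only nodes with these labels get a daemon pod.
                    "nodeSelector": dict(node_labels),
                    "containers": [{
                        "name": "fluentd",
                        "image": "fluent/fluentd",
                        # Limits keep the daemon isolated from application pods.
                        "resources": {"limits": {"cpu": "200m", "memory": "256Mi"}},
                        "volumeMounts": [
                            {"name": "varlog", "mountPath": "/var/log", "readOnly": True}
                        ],
                    }],
                    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
                },
            },
        },
    }


def eligible_nodes(daemon_set, nodes):
    """Return the names of nodes whose labels include every nodeSelector pair.

    This only imitates the scheduling rule locally; it does not talk to a cluster.
    """
    selector = daemon_set["spec"]["template"]["spec"]["nodeSelector"]
    return [
        name for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


if __name__ == "__main__":
    ds = fluentd_daemon_set({"logging": "enabled"})
    cluster = {
        "node-1": {"logging": "enabled"},
        "node-2": {},
        "node-3": {"logging": "enabled", "disk": "ssd"},
    }
    print(eligible_nodes(ds, cluster))
```

Adding a fourth node with the `logging: enabled` label to `cluster` makes it appear in the result, which mirrors how a Daemon Set picks up newly labelled nodes automatically.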
Fluentd Use Case
Let us consider a use case with a 2-tier application deployment using goPaddle. The application design consists of nginx and Tomcat services. Using goPaddle, each of these services can be deployed as a scalable microservice using the pod-per-container architecture shown below. Each service has its own replication controller and can scale independently.
The services are deployed across a 3-node Kubernetes 1.2 cluster.
Fluentd is available as an add-on component in goPaddle designer. It can be configured to accept log folder(s) and write logs to an external MongoDB service. Fluentd can be created as a single service design and can be deployed the same way as any other service. At the time of deployment, goPaddle provides an option to select the nodes where it needs to be deployed.
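As a rough idea of what such a configuration could look like, the sketch below renders a Fluentd config with one `tail` source per log folder and a single MongoDB output. It is not the configuration goPaddle produces. The helper name `fluentd_conf`, the tag scheme, the position-file path, and the database and collection names are assumptions, and the `mongo` output type requires the separate fluent-plugin-mongo plugin to be installed.

```python
def fluentd_conf(log_dirs, mongo_host, mongo_port=27017,
                 database="applogs", collection="entries"):
    """Render a Fluentd config: one tail source per log folder and
    one MongoDB output. Paths and names are illustrative placeholders."""
    blocks = []
    for index, log_dir in enumerate(log_dirs):
        blocks.append("\n".join([
            "<source>",
            "  @type tail",
            # Follow every .log file in the folder.
            f"  path {log_dir.rstrip('/')}/*.log",
            # Fluentd records its read position here so restarts do not re-read.
            f"  pos_file /var/log/fluentd/source-{index}.pos",
            f"  tag app.{index}",
            "  <parse>",
            # Keep each line as-is instead of parsing a specific log format.
            "    @type none",
            "  </parse>",
            "</source>",
        ]))
    blocks.append("\n".join([
        "<match app.**>",
        "  @type mongo",
        f"  host {mongo_host}",
        f"  port {mongo_port}",
        f"  database {database}",
        f"  collection {collection}",
        "</match>",
    ]))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(fluentd_conf(
        ["/var/log/nginx", "/usr/local/tomcat/logs"],
        mongo_host="mongo.example.internal",
    ))
```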
In all the selected nodes, Fluentd gets deployed as a Daemon Set and gets configured automatically to collect the logs from the specified folders. goPaddle deploys the services and the daemons using the Sidecar Pods pattern (similar to Sidecar containers pattern) where the pods share the volumes for log read/write.
As a best practice, if the application loggers are configured to prepend the host IP, each line in the aggregated logs will start with the Pod IP, which can later be used to filter the logs from a specific service instance.
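A minimal sketch of this practice, using Python's standard `logging` module: the logger prefixes every line with an IP address, and a small helper filters aggregated output by that prefix. The function names, the log line layout and the sample IPs are assumptions for illustration; inside a pod, resolving the host name normally yields the Pod IP, but that depends on the cluster's networking setup.

```python
import io
import logging
import socket


def host_ip():
    """Best-effort IP of this host; falls back to loopback if lookup fails."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return "127.0.0.1"


def make_logger(name, stream, ip=None):
    """Create a logger whose every line starts with the given (or detected) IP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(f"{ip or host_ip()} %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def lines_from(aggregated, ip):
    """Keep only the aggregated log lines whose first field is the given IP."""
    return [
        line for line in aggregated.splitlines()
        if line.split(" ", 1)[0] == ip
    ]


if __name__ == "__main__":
    # One buffer stands in for the aggregated log store.
    aggregated = io.StringIO()
    make_logger("tomcat-1", aggregated, ip="10.0.0.5").info("order placed")
    make_logger("tomcat-2", aggregated, ip="10.0.0.6").info("cart updated")
    print(lines_from(aggregated.getvalue(), "10.0.0.5"))
```

Matching on the whole first field, rather than on a string prefix, avoids confusing `10.0.0.5` with `10.0.0.50`.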
When Fluentd is deployed as a Daemon Set using goPaddle:
- It can be customized to monitor specific or all applications running within the same node
- It can collect logs from specific log files across all applications within the same node
- It can collect system logs across all the containers within the same node
- Since it runs as a separate pod, it runs outside the application scaling rules
- It can use custom log data outputs other than Google Cloud Logging
- Since it runs as a Daemon Set, it is always kept in a running state
Using goPaddle, it is also possible to deploy a Daemon Set in the ambassador pod pattern (similar to the ambassador container pattern), where the application pod communicates with the Daemon Set pod using a localhost reference. A typical use case is to deploy etcd services within a pod so that applications can use etcd for key/value storage and retrieval.