Sending CSV Events using Fluentd

This tutorial explains how to send events from CSV files using Fluentd.

Prerequisites

Verify that the following components are installed so that your system is configured correctly:

Component              Recommendation
Docker                 Minimum version 19.03.4
docker-compose         Minimum version 1.24.1
Operating System       Ubuntu, Linux (Amazon, RHEL)
Available RAM          Minimum 4 GB
Available Disk Space   10 to 12 GB for the file-based buffering mechanism
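
To double-check the Docker and docker-compose versions on the host, you can run the following (assuming both tools are already installed and on your PATH):

docker --version
docker-compose --version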


Step 1: Create the Fluentd configuration file


Create a file called qsensei_fluentd.conf.

Paste the following configuration into the file to declare system-wide parameters. In this example, the number of workers is set to 1, but to achieve higher throughput it can be increased depending on the number of CPU cores available on the system.

<system>
   workers 1
   log_level info
</system>
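
If you decide to raise the worker count, it helps to know how many CPU cores the host has. On the Linux distributions listed in the prerequisites, this can be checked with:

nproc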

Paste the following line to include the configuration file that reports connector status.

@include qsensei_fluentd_connector_status.conf


Step 2: Configure the Input CSV Plugin


In this example, the CSV file stores Apache web access logs. Below are the first four lines of the sample CSV file: the first line contains the CSV headers, and the following lines are data records.

host,logname,time,method,url,response,bytes,referer,useragent

199.72.81.55,-,2021-04-14T00:57:59+00:00,GET,/history/apollo/,200,6245,,

unicomp6.unicomp.net,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/,200,3985,,

burger.letters.com,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/liftoff.html,304,0,,

Paste the following contents into the Fluentd configuration file. The tag apache_log associated with the input events will be used as the topic name in Q-Sensei Logs.

<source>
   @type tail
   path /var/log/access_logs.csv
   pos_file /var/log/access_logs.csv.pos
   tag apache_log
   format csv
   keys host, logname, time, method, url, response, bytes, referer, useragent
   time_key time
   @label @QSENSEI_LOGS
</source>
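
To try the configuration with the sample records shown above, one option is to create the CSV file at the path the source block tails (a sketch; the path and records are the example values from this tutorial, and with the Docker setup in Step 4 the file would also need to be mounted into the container):

cat > /var/log/access_logs.csv <<'EOF'
host,logname,time,method,url,response,bytes,referer,useragent
199.72.81.55,-,2021-04-14T00:57:59+00:00,GET,/history/apollo/,200,6245,,
unicomp6.unicomp.net,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/,200,3985,,
burger.letters.com,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/liftoff.html,304,0,,
EOF

Note that the tail input only picks up lines appended after Fluentd starts; to read existing lines as well, you can add read_from_head true to the source block.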


Step 3: Configure the Output plugin


The output plugin configuration defines where to forward the events labelled @QSENSEI_LOGS. In this example, the output plugin is an HTTPS API Gateway endpoint of your Q-Sensei Logs deployment. To configure the output plugin, paste the following contents into the Fluentd configuration file.

<label @QSENSEI_LOGS>

# Enrich and transform the events
<filter **>
   @type record_transformer
   renew_record true
   enable_ruby true
   <record>
      value ${record}
      topic ${tag}
      serialization_format JSON/OBJECT
      fuse:action replace_or_create
      fuse:type message
      sent-time ${time}
   </record>
</filter>

# Forward the events to Q-Sensei’s HTTPS endpoint
<match **>
   @type http
   http_method post
   endpoint https://uxxx.execute-api.us-east-1.amazonaws.com/api/upload/events
   content_type application/json
   json_array true
   headers {"x-api-key": "xxxxx","x-deployment-id": "xxxxx"}
   <format>
      @type json
   </format>
</match>

</label>

In the above output plugin configuration, notice the parameters endpoint, x-api-key and x-deployment-id. These parameters are specific to your deployment and can be downloaded from the Manager UI.
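
To sanity-check connectivity outside of Fluentd, you can post a single event with curl in the same shape the filter above produces (a sketch; the endpoint, x-api-key and x-deployment-id values are the placeholders from the configuration and must be replaced with your deployment's values, and the value object is an abbreviated example of a parsed CSV record):

curl -X POST "https://uxxx.execute-api.us-east-1.amazonaws.com/api/upload/events" \
  -H "Content-Type: application/json" \
  -H "x-api-key: xxxxx" \
  -H "x-deployment-id: xxxxx" \
  -d '[{"value": {"host": "199.72.81.55", "method": "GET", "url": "/history/apollo/", "response": "200"}, "topic": "apache_log", "serialization_format": "JSON/OBJECT", "fuse:action": "replace_or_create", "fuse:type": "message", "sent-time": "2021-04-14T00:57:59+00:00"}]'

Whether the gateway accepts a hand-crafted request like this depends on your deployment, so treat it only as a connectivity test.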


Step 4: Define docker services


Create a file called docker-compose.yaml and paste the following contents into it. The Docker image for Fluentd is the official image maintained by Treasure Data, Inc.

version: "3"

services:
  aggregator:
    image: fluent/fluentd:v1.12
    volumes:
      - ./qsensei_fluentd.conf:/fluentd/etc/qsensei_fluentd.conf
      - qsensei_fluentd_data:/fluentd
    environment:
      FLUENTD_CONF: qsensei_fluentd.conf

volumes:
  qsensei_fluentd_data: {}
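
Before starting the stack, docker-compose can parse and validate the file; run the following from the directory that contains docker-compose.yaml:

docker-compose config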


Step 5: Start Fluentd


Run the following command to start Fluentd. It will pull the Fluentd Docker image, create a Docker volume to persist Fluentd data, and start sending events to your Q-Sensei Logs deployment.

docker-compose up -d
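
To confirm the container is running and to watch Fluentd's output for parsing or delivery errors, you can use the following commands (aggregator is the service name from Step 4):

docker-compose ps
docker-compose logs -f aggregator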
