This tutorial explains how to send events from CSV files to Q-Sensei Logs using Fluentd.
Prerequisites
Verify that the following components are installed and that your system meets these recommendations:
Component | Recommendation |
---|---|
Docker | Minimum version 19.03.4 |
docker-compose | Minimum version 1.24.1 |
Operating System | Ubuntu, Linux (Amazon, RHEL) |
Available RAM | Minimum 4 GB |
Available Disk Space | 10 to 12 GB for the file-based buffering mechanism |
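The version minimums in the table can be checked with a few lines of code. The following is a minimal sketch, not part of any Q-Sensei tooling: the helper names `parse_version` and `meets_minimum` are hypothetical, and it assumes the version appears in the text as dot-separated numbers, as in the output of `docker --version`.

```python
import re


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions taken from the prerequisites table.
MINIMUMS = {"docker": "19.03.4", "docker-compose": "1.24.1"}

if __name__ == "__main__":
    # Illustrative input only; substitute the text printed on your own system.
    print(meets_minimum("Docker version 20.10.7", MINIMUMS["docker"]))
```

Comparing tuples of integers rather than raw strings avoids the pitfall where "19.10" would sort before "19.3" as text.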
Step 1: Create the Fluentd configuration file
Create a file called qsensei_fluentd.conf.
Paste the following configuration into the file to declare system-wide parameters. This example sets the number of workers to 1. To achieve higher throughput, you can increase this value depending on the number of CPU cores available on the system.
<system>
workers 1
log_level info
</system>
Paste the following line to include the configuration file that reports connector status.
@include qsensei_fluentd_connector_status.conf
Step 2: Configure the Input CSV Plugin
In this example, the CSV file stores Apache web access logs. Below are the first 4 lines of the sample CSV file. The first line contains the CSV headers and the following lines are data records.
host,logname,time,method,url,response,bytes,referer,useragent
199.72.81.55,-,2021-04-14T00:57:59+00:00,GET,/history/apollo/,200,6245,,
unicomp6.unicomp.net,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/,200,3985,,
burger.letters.com,-,2021-04-14T00:57:59+00:00,GET,/shuttle/countdown/liftoff.html,304,0,,
Paste the following contents into the Fluentd configuration file. The tag apache_log associated with the input events will be used as the topic name in Q-Sensei Logs.
<source>
@type tail
path /var/log/access_logs.csv
pos_file /var/log/access_logs.csv.pos
tag apache_log
format csv
keys host, logname, time, method, url, response, bytes, referer, useragent
time_key time
@label @QSENSEI_LOGS
</source>
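To make the mapping concrete, the sketch below mimics what a CSV parser does with a list of keys: each line is split on commas and the values are paired, in order, with the key names. It is an illustration that uses Python's standard `csv` module, not Fluentd's actual implementation. The function name `parse_line` is hypothetical, and `KEYS` mirrors the header row of the sample file.

```python
import csv

# Field names in column order, mirroring the header row of the sample CSV file.
KEYS = ["host", "logname", "time", "method", "url",
        "response", "bytes", "referer", "useragent"]


def parse_line(line):
    """Pair the comma-separated values of one CSV line with KEYS."""
    values = next(csv.reader([line]))
    return dict(zip(KEYS, values))


# First data record from the sample file.
record = parse_line(
    "199.72.81.55,-,2021-04-14T00:57:59+00:00,GET,/history/apollo/,200,6245,,"
)
print(record["host"], record["url"], record["response"])
```

In this sketch every value stays a string and empty columns become empty strings. Fluentd additionally uses the field named by `time_key` as the event timestamp, which the sketch does not model.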
Step 3: Configure the Output plugin
The output plugin configuration defines where to forward the events labelled @QSENSEI_LOGS. In this example, the output destination is the HTTPS API Gateway endpoint of your Q-Sensei Logs deployment. To configure the output plugin, paste the following contents into the Fluentd configuration file.
<label @QSENSEI_LOGS>
# Enrich and transform the events
<filter **>
@type record_transformer
renew_record true
enable_ruby true
<record>
value ${record}
topic ${tag}
serialization_format JSON/OBJECT
fuse:action replace_or_create
fuse:type message
sent-time ${time}
</record>
</filter>
# Forward the events to Q-Sensei’s HTTPS endpoint
<match **>
@type http
http_method post
endpoint https://uxxx.execute-api.us-east-1.amazonaws.com/api/upload/events
content_type application/json
json_array true
headers {"x-api-key": "xxxxx","x-deployment-id": "xxxxx"}
<format>
@type json
</format>
</match>
</label>
In the above output plugin configuration, notice the parameters endpoint, x-api-key and x-deployment-id. These parameters are specific to your deployment and can be downloaded from the Manager UI.
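The shape of the data sent to the endpoint can be sketched in Python. This is an approximation based only on the configuration above, not a description of Fluentd internals or of the Q-Sensei API: the function names `wrap_event` and `build_body` are hypothetical, the `sent-time` value is a placeholder string because the exact rendering of `${time}` is not specified here, and the header values remain the `xxxxx` placeholders from the configuration. Nothing is sent over the network.

```python
import json


def wrap_event(record, tag, sent_time):
    """Build the envelope described by the <record> section of the filter."""
    return {
        "value": record,                      # original parsed record
        "topic": tag,                         # Fluentd tag, used as topic name
        "serialization_format": "JSON/OBJECT",
        "fuse:action": "replace_or_create",
        "fuse:type": "message",
        "sent-time": sent_time,               # placeholder format (assumption)
    }


def build_body(events):
    """Serialize a batch of events as a single JSON array (json_array true)."""
    return json.dumps(events)


# Request headers implied by content_type and headers in the <match> section.
HEADERS = {
    "Content-Type": "application/json",
    "x-api-key": "xxxxx",
    "x-deployment-id": "xxxxx",
}

record = {"host": "199.72.81.55", "method": "GET",
          "url": "/history/apollo/", "response": "200"}
event = wrap_event(record, "apache_log", "2021-04-14T00:57:59+00:00")
body = build_body([event])
print(body)
```

Because `renew_record true` starts from an empty record, the original fields survive only inside `value`; the top level of each event holds just the six envelope keys.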
Step 4: Define docker services
Create a file called docker-compose.yaml and paste the following contents into it. The Fluentd image used here is the official Docker image maintained by Treasure Data, Inc.
version: "3"
services:
  aggregator:
    image: fluent/fluentd:v1.12
    volumes:
      - ./qsensei_fluentd.conf:/fluentd/etc/qsensei_fluentd.conf
      - qsensei_fluentd_data:/fluentd
    environment:
      FLUENTD_CONF: qsensei_fluentd.conf
volumes:
  qsensei_fluentd_data: {}
Step 5: Start Fluentd
Run the following command to start Fluentd. This command pulls the Fluentd Docker image specified in docker-compose.yaml if it is not already available locally, creates a Docker volume to persist Fluentd data, and starts sending events to your Q-Sensei Logs deployment.
docker-compose up -d