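Shipping logs to Google BigQuery and CloudWatch Logs with AWS Fargate × FireLens (Fluentd)
Key points
- Ship application logs from containers running on Fargate to both BigQuery and CloudWatch Logs
- Use Fluentd rather than Fluent Bit, because tables need to be created automatically
- Implement a Dockerfile and custom config files for Fluentd
Implementation
We implement log shipping to BigQuery and CloudWatch Logs with Fluentd. Shipping alone could be done with Fluent Bit, but as of this writing (2020/05/30) the Fluent Bit BigQuery plugin does not appear to support creating tables, so Fluentd is used instead.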
Fluentd
Dockerfile
FROM fluent/fluentd:v1.10.4-1.0

USER root

# file copy
COPY conf/extra.conf /fluentd/etc/extra.conf
COPY conf/schema.json /fluentd/etc/schema.json
COPY conf/conf.out /fluentd/etc/conf.out
COPY extra_entrypoint.sh /bin/

# install the plugins this setup needs
# you may customize the plugin list as you wish
RUN apk add --no-cache --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && chmod +x /bin/extra_entrypoint.sh \
 && sudo gem install fluent-plugin-bigquery -v "~> 2.2.0" \
 && sudo gem install fluent-plugin-record-reformer -v "~> 0.9.1" \
 && sudo gem install fluent-plugin-cloudwatch-logs -v "~> 0.9.4" \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem \
 && mkdir -p /fluentd/etc/.keys \
 && chown -R fluent:fluent /fluentd/etc

USER fluent

# HACK: redefine entrypoint
ENTRYPOINT ["/bin/extra_entrypoint.sh"]
CMD ["fluentd"]

To place the json_key used for BigQuery authentication inside the container, the entrypoint is wrapped by a small script. The reason for this extra step is that, for some reason, the BigQuery plugin would not read the environment variables directly; the configuration that did not work is shown at the end of the article. (If anyone knows why, please tell me 🤗)
extra_entrypoint.sh
#!/bin/sh

###### start extra #####
# create key file
cat << EOS > /fluentd/etc/.keys/my-jsonkey.json
{
  "client_email": "${BQ_CLIENT_EMAIL}",
  "private_key": "${BQ_PRIVATE_KEY}"
}
EOS
###### end extra #####

# start default entrypoint
# https://github.com/fluent/fluentd-docker-image/blob/master/v1.10/alpine/entrypoint.sh
/bin/entrypoint.sh "$@"
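The key-file step of the entrypoint script can be tried locally before building the image. The following is a minimal sketch, not the author's code: it writes to a temporary directory instead of /fluentd/etc/.keys, the helper name write_key_file is hypothetical, and both values are dummy placeholders. It also assumes the private key is stored with its newlines written as the two characters backslash and n, which the heredoc passes through unchanged.

```shell
#!/bin/sh
# Sketch: reproduce the key-file generation from extra_entrypoint.sh locally.
set -eu

# Hypothetical helper: $1 is the output path; the two BQ_* variables are
# read from the environment, exactly as the entrypoint script does.
write_key_file() {
  cat << EOS > "$1"
{
  "client_email": "${BQ_CLIENT_EMAIL}",
  "private_key": "${BQ_PRIVATE_KEY}"
}
EOS
}

# Dummy placeholder values, not real credentials.
BQ_CLIENT_EMAIL='dummy@example.iam.gserviceaccount.com'
# Assumption: newlines are stored as a literal backslash followed by n.
BQ_PRIVATE_KEY='-----BEGIN PRIVATE KEY-----\nDUMMY\n-----END PRIVATE KEY-----\n'
export BQ_CLIENT_EMAIL BQ_PRIVATE_KEY

key_dir=$(mktemp -d)
key_file="$key_dir/my-jsonkey.json"
write_key_file "$key_file"

# Show the generated file; the backslash-n sequences should appear verbatim.
cat "$key_file"
```

If the stored key contained real line breaks instead, the generated file would no longer be valid JSON, so it is worth checking the output once with the actual secret format.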
/fluentd/etc/extra.conf
# optional Fluentd config file passed to FireLens
# the source directive is defined in FireLens's default fluent.conf, so it is not written here

# fluentTagDockerFormat is the format for the log tag, which is "containerName-firelens-taskID"
# https://github.com/aws/amazon-ecs-agent/blob/master/agent/engine/docker_task_engine.go#L91
<match "#{ENV['CONTAINER_NAME']}-firelens-**">
  @type relabel
  @label @firelens_log
  @id forward_all
</match>

# copy record for bigQuery and cloudwatch logs
<label @firelens_log>
  <match **>
    @type copy
    <store>
      @type relabel
      @label @out_bigquery
    </store>
    <store>
      @type relabel
      @label @out_cloudwatch_logs
    </store>
  </match>
</label>

# include bigquery, cloudwatch config file
@include conf.out/*.conf

The source directive is written in the fluent.conf that AWS generates, so it is not configured here. Also, because the name fluent.conf is reserved by AWS, this file is named extra.conf.
conf.out/bigquery.conf
<label @out_bigquery>
  <filter **>
    @type record_transformer
    <record>
      tag ${tag}
    </record>
  </filter>

  # grep log
  <filter **>
    @type grep
    <regexp>
      key log
      pattern /^\[(?:[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{6})\] \[API\] INFO \- log \- (?:.*) \- \[\]/
    </regexp>
  </filter>

  # extract message, timestamp
  <filter **>
    @type parser
    key_name log
    <parse>
      @type regexp
      expression /^\[(?<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{6})\] \[API\] INFO \- log \- (?<message>.*) \- \[\]/
    </parse>
  </filter>

  # tag rewrite - add ymd
  <match **>
    @type record_reformer
    @label @insert_bigquery
    enable_ruby true
    # HACK: take the ymd for the BigQuery table suffix from the application log's
    # timestamp (timezone: Asia/Tokyo), because Fluentd itself runs in UTC
    tag #{ENV['ENV']}.${record["timestamp"].slice(0, 10).gsub('-', '')}
  </match>
</label>

<label @insert_bigquery>
  <match **>
    @type bigquery_insert
    auth_method json_key
    json_key /fluentd/etc/.keys/my-jsonkey.json

    # buffer - set chunk keys tag
    # https://docs.fluentd.org/configuration/buffer-section#chunk-keys
    <buffer tag>
      flush_interval 1
    </buffer>

    project my-project
    # mytable_dev / mytable_prod
    dataset mytable_${tag[0]}
    # log20200101
    table log${tag[1]}
    auto_create_table true
    schema_path /fluentd/etc/schema.json
  </match>
</label>
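To make the tag rewrite easier to follow, here is a rough shell sketch of the same steps performed outside Fluentd: extract timestamp and message from one log line, build the "<env>.<yyyymmdd>" tag, and split it into the dataset and table names. The sample log line, the value dev, and the names pattern, field, and app_env are assumptions made for illustration; the sed expression is close to, but not identical to, the parser expression in the config.

```shell
#!/bin/sh
# Sketch: derive the BigQuery dataset and table names from one log line.
set -eu

# Hypothetical sample line in the format the grep/parser filters expect.
# 08:30 in Asia/Tokyo is still the previous day (2020-05-29) in UTC.
line='[2020-05-30 08:30:00.000000] [API] INFO - log - user signed in - []'

# ERE close to the parser expression: group 1 = timestamp, group 2 = message.
pattern='^\[([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6})\] \[API\] INFO - log - (.*) - \[\].*$'

# field LINE N prints capture group N of the pattern for LINE.
field() {
  printf '%s\n' "$1" | sed -E "s/${pattern}/\\$2/"
}

ts=$(field "$line" 1)
msg=$(field "$line" 2)

# Same idea as slice(0, 10).gsub('-', '') in the record_reformer tag.
ymd=$(printf '%s\n' "$ts" | cut -c1-10 | sed 's/-//g')

# Stand-in for the ENV environment variable of the log router container.
app_env=dev
tag="${app_env}.${ymd}"

# ${tag[0]} and ${tag[1]} in the bigquery_insert section are the tag parts.
dataset="mytable_${tag%%.*}"
table="log${tag#*.}"

printf 'timestamp=%s\nmessage=%s\ntag=%s\ndataset=%s\ntable=%s\n' \
  "$ts" "$msg" "$tag" "$dataset" "$table"
```

Because the date comes from the log's own timestamp, the record lands in the table for 2020-05-30 even though the UTC date at that moment is one day earlier.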
schema.json
[
  {
    "name": "timestamp",
    "type": "TIMESTAMP",
    "mode": "REQUIRED"
  },
  {
    "name": "message",
    "type": "STRING"
  }
]
conf.out/cloudwatch.conf
<label @out_cloudwatch_logs>
  <match **>
    @type cloudwatch_logs
    log_group_name "#{ENV['LOG_GROUP_NAME']}"
    region "#{ENV['AWS_REGION']}"
    use_tag_as_stream true
    auto_create_stream true
  </match>
</label>
Fargate
task_definition
Only the key parts are excerpted.
{
  "name": "${local.myapp_name}",
  "image": "${local.myapp_image}",
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "region": "${data.aws_region.current.name}",
      "auto_create_stream": "true",
      "log_group_name": "${aws_cloudwatch_log_group.myapp.name}",
      "use_tag_as_stream": "true",
      "@type": "cloudwatch_logs"
    }
  }
},
{
  "name": "${local.log_router_name}",
  "image": "${local.log_router_image}",
  "essential": true,
  "firelensConfiguration": {
    "type": "fluentd",
    "options": {
      "config-file-type": "file",
      "config-file-value": "/fluentd/etc/extra.conf"
    }
  },
  "linuxParameters": {
    "initProcessEnabled": true
  },
  "environment": [
    { "name": "ENV", "value": "${local.environment}" },
    { "name": "CONTAINER_NAME", "value": "${local.myapp_name}" },
    { "name": "LOG_GROUP_NAME", "value": "${aws_cloudwatch_log_group.myapp.name}" },
    { "name": "AWS_REGION", "value": "${data.aws_region.current.name}" }
  ],
  "secrets": [
    { "name": "BQ_CLIENT_EMAIL", "valueFrom": "${aws_ssm_parameter.big_query_client_email.name}" },
    { "name": "BQ_PRIVATE_KEY", "valueFrom": "${aws_ssm_parameter.big_query_private_key.name}" }
  ],
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-region": "${data.aws_region.current.name}",
      "awslogs-group": "${aws_cloudwatch_log_group.log_router.name}",
      "awslogs-stream-prefix": "firelens"
    }
  }
}
Key points
- config-file-type accepts file or s3, but s3 cannot be used with FireLens on Fargate as of 1.4.0
- The environment variables used for json_key are pulled from SSM through secrets
- The entrypoint of the official Fluentd Docker image runs under tini; since the ENTRYPOINT is overridden here, initProcessEnabled is set instead
- extra.conf matches all logs, but logConfiguration#options is still required so that the fluent.conf generated by AWS is initialized
Bonus
Ideally, json_key would have been specified like this (but it did not work) 😔
<match dummy>
  @type bigquery_insert
  auth_method json_key
  json_key {"private_key": "#{ENV['BQ_PRIVATE_KEY']}", "client_email": "#{ENV['BQ_CLIENT_EMAIL']}"}
</match>
References
GitHub - fluent-plugins-nursery/fluent-plugin-bigquery
BigQuery - Fluent Bit: Official Manual
GitHub - fluent/fluentd-docker-image: Docker image for Fluentd