
.NET Kafka Tracing with OpenTelemetry

Manual configuration

This integration includes:

  • Installing the [Confluent.Kafka.Extensions.OpenTelemetry](https://github.com/vhatsura/confluent-kafka-extensions-opentelemetry) package.
  • Configuring OpenTelemetry for your .NET application.
  • Setting up the OpenTelemetry collector to forward traces to Logz.io.

Before you begin, you'll need:

  • A .NET application without instrumentation.
  • An active account with Logz.io.
  • Port 4317 available on your host system.
  • A name defined for your tracing service to identify traces in Logz.io.

Install the necessary packages via NuGet

dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package Confluent.Kafka.Extensions.OpenTelemetry
dotnet add package OpenTelemetry
dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol

or

Install-Package OpenTelemetry.Extensions.Hosting
Install-Package OpenTelemetry.Instrumentation.AspNetCore
Install-Package Confluent.Kafka.Extensions.OpenTelemetry
Install-Package OpenTelemetry
Install-Package OpenTelemetry.Exporter.OpenTelemetryProtocol

Kafka OpenTelemetry .NET Application

You can download an [example](https://logzio-aws-integrations-us-east-1.s3.amazonaws.com/dotnet-kafka.zip) of an instrumented .NET application with a Kafka producer and consumer.

Configure OpenTelemetry in .NET

Add the following OpenTelemetry configuration to your .NET application's Program.cs:

using OpenTelemetry.Trace;
using OpenTelemetry.Resources;
using OpenTelemetry.Exporter;
using Confluent.Kafka.Extensions.OpenTelemetry;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddOpenTelemetry()
    .WithTracing(traceBuilder =>
    {
        traceBuilder
            .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("kafka-api"))
            .AddAspNetCoreInstrumentation()
            .AddConfluentKafkaInstrumentation()
            .AddOtlpExporter();
    });

var app = builder.Build();
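
In this configuration, "kafka-api" is the tracing service name that will identify your traces in Logz.io. If you prefer not to hard-code it, you can read it from configuration instead. A minimal sketch, assuming an illustrative ServiceName key in appsettings.json (the key name is hypothetical):

// Illustrative: read the tracing service name from configuration, falling back to "kafka-api".
var serviceName = builder.Configuration["ServiceName"] ?? "kafka-api";

// ...then pass it to the resource builder instead of the literal string:
// .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(serviceName))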

You can configure the OTLP endpoint and protocol using environment variables:

  • OTEL_EXPORTER_OTLP_ENDPOINT (the destination of your OpenTelemetry collector, for example `http://localhost:4318`)
  • OTEL_EXPORTER_OTLP_PROTOCOL (grpc or http/protobuf)
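
If you prefer to set the exporter in code rather than through environment variables, the same settings can be passed to AddOtlpExporter via its options. This is a minimal sketch; the endpoint below assumes a local collector listening on the default gRPC port and should be adjusted to your environment:

using OpenTelemetry.Exporter;

// Inside WithTracing(...), replacing the bare .AddOtlpExporter() call:
traceBuilder.AddOtlpExporter(options =>
{
    // Assumed local collector endpoint (gRPC). For http/protobuf, use http://localhost:4318/v1/traces instead.
    options.Endpoint = new Uri("http://localhost:4317");
    options.Protocol = OtlpExportProtocol.Grpc;
});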

Instrument Kafka Producer

Build your producer with the .BuildWithInstrumentation() method to add instrumentation to your Kafka producer:

using Confluent.Kafka;
using Confluent.Kafka.Extensions.Diagnostics;

using var producer =
    new ProducerBuilder<Null, string>(new ProducerConfig(new ClientConfig { BootstrapServers = "localhost:9092" }))
        .SetKeySerializer(Serializers.Null)
        .SetValueSerializer(Serializers.Utf8)
        .BuildWithInstrumentation();

await producer.ProduceAsync("topic", new Message<Null, string> { Value = "Hello World!" });
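
To verify deliveries and surface produce failures, you can inspect the delivery result and catch ProduceException. A short sketch based on the producer built above:

try
{
    var delivery = await producer.ProduceAsync("topic", new Message<Null, string> { Value = "Hello World!" });

    // TopicPartitionOffset shows where the message was written.
    Console.WriteLine($"Delivered to {delivery.TopicPartitionOffset}");
}
catch (ProduceException<Null, string> ex)
{
    Console.WriteLine($"Delivery failed: {ex.Error.Reason}");
}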

Instrument Kafka Consumer

Use the ConsumeWithInstrumentation() method to add instrumentation to your Kafka consumer:

using Confluent.Kafka;
using Confluent.Kafka.Extensions.Diagnostics;

using var consumer = new ConsumerBuilder<Ignore, string>(
        new ConsumerConfig(new ClientConfig { BootstrapServers = "localhost:9092" })
        {
            GroupId = "group",
            AutoOffsetReset = AutoOffsetReset.Earliest
        })
    .SetValueDeserializer(Deserializers.Utf8)
    .Build();

consumer.Subscribe("topic");

consumer.ConsumeWithInstrumentation((result) =>
{
    Console.WriteLine(result.Message.Value);
});

Pull the Docker image for the OpenTelemetry collector

docker pull otel/opentelemetry-collector-contrib:0.78.0

Create a configuration file

Create a file config.yaml with the following content:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4317"
      http:
        endpoint: "0.0.0.0:4318"

exporters:
  logzio/traces:
    account_token: "<<TRACING-SHIPPING-TOKEN>>"
    region: "<<LOGZIO_ACCOUNT_REGION_CODE>>"
    headers:
      user-agent: logzio-opentelemetry-traces

  logging:

processors:
  batch:
  tail_sampling:
    policies:
      [
        {
          name: policy-errors,
          type: status_code,
          status_code: {status_codes: [ERROR]}
        },
        {
          name: policy-slow,
          type: latency,
          latency: {threshold_ms: 1000}
        },
        {
          name: policy-random-ok,
          type: probabilistic,
          probabilistic: {sampling_percentage: 10}
        }
      ]

extensions:
  pprof:
    endpoint: :1777
  zpages:
    endpoint: :55679
  health_check:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [logging, logzio/traces]

Replace <<TRACING-SHIPPING-TOKEN>> with the token of the account you want to ship to.

Replace <<LOGZIO_ACCOUNT_REGION_CODE>> with the applicable region code.

Tail Sampling

The tail_sampling processor makes the sampling decision for a trace only after all of the spans in a request have completed. By default, this configuration collects all traces that contain a span completed with an error, all traces slower than 1000 ms, and 10% of the remaining traces.

You can add more policy configurations to the processor. For more on this, refer to OpenTelemetry Documentation.

The configurable parameters in the Logz.io default configuration are:

| Parameter | Description | Default |
|---|---|---|
| threshold_ms | Threshold for the span latency - all traces slower than this value will be filtered in. | 1000 |
| sampling_percentage | Sampling percentage for the probabilistic policy. | 10 |

If you already have an OpenTelemetry installation, add the following parameters to the configuration file of your existing OpenTelemetry collector:

  • Under the exporters list:

    logzio/traces:
      account_token: <<TRACING-SHIPPING-TOKEN>>
      region: <<LOGZIO_ACCOUNT_REGION_CODE>>
      headers:
        user-agent: logzio-opentelemetry-traces

  • Under the service list:

    extensions: [health_check, pprof, zpages]
    pipelines:
      traces:
        receivers: [otlp]
        processors: [tail_sampling, batch]
        exporters: [logzio/traces]

Replace <<TRACING-SHIPPING-TOKEN>> with the token of the account you want to ship to.

Replace <<LOGZIO_ACCOUNT_REGION_CODE>> with the applicable region code.

An example configuration file looks as follows:

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  logzio/traces:
    account_token: "<<TRACING-SHIPPING-TOKEN>>"
    region: "<<LOGZIO_ACCOUNT_REGION_CODE>>"
    headers:
      user-agent: logzio-opentelemetry-traces

processors:
  batch:
  tail_sampling:
    policies:
      [
        {
          name: policy-errors,
          type: status_code,
          status_code: {status_codes: [ERROR]}
        },
        {
          name: policy-slow,
          type: latency,
          latency: {threshold_ms: 1000}
        },
        {
          name: policy-random-ok,
          type: probabilistic,
          probabilistic: {sampling_percentage: 10}
        }
      ]

extensions:
  pprof:
    endpoint: :1777
  zpages:
    endpoint: :55679
  health_check:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [logzio/traces]

Replace <<TRACING-SHIPPING-TOKEN>> with the token of the account you want to ship to.

Replace <<LOGZIO_ACCOUNT_REGION_CODE>> with the applicable region code.

The tail_sampling processor makes the sampling decision for a trace only after all of the spans in a request have completed. By default, this configuration collects all traces that contain a span completed with an error, all traces slower than 1000 ms, and 10% of the remaining traces.

You can add more policy configurations to the processor. For more on this, refer to OpenTelemetry Documentation.

The configurable parameters in the Logz.io default configuration are:

| Parameter | Description | Default |
|---|---|---|
| threshold_ms | Threshold for the span latency - all traces slower than this value will be filtered in. | 1000 |
| sampling_percentage | Sampling percentage for the probabilistic policy. | 10 |

Run the container

Mount config.yaml as a volume in the docker run command and run the container as follows.

Linux
docker run  \
--network host \
-v <PATH-TO>/config.yaml:/etc/otelcol-contrib/config.yaml \
otel/opentelemetry-collector-contrib:0.78.0

Replace <PATH-TO> with the path to the config.yaml file on your system.

Windows
docker run  \
-v <PATH-TO>/config.yaml:/etc/otelcol-contrib/config.yaml \
-p 55678-55680:55678-55680 \
-p 1777:1777 \
-p 9411:9411 \
-p 9943:9943 \
-p 6831:6831 \
-p 6832:6832 \
-p 14250:14250 \
-p 14268:14268 \
-p 4317:4317 \
-p 55681:55681 \
otel/opentelemetry-collector-contrib:0.78.0

Replace <<TRACING-SHIPPING-TOKEN>> with the token of the account you want to ship to.

Replace <<LOGZIO_ACCOUNT_REGION_CODE>> with the applicable region code.


Check Logz.io for your traces

Give your traces some time to get from your system to ours, and then open Tracing.

Configuration via Helm

You can use a Helm chart to ship traces to Logz.io via the OpenTelemetry collector. The Helm tool is used to manage packages of preconfigured Kubernetes resources that use charts.

logzio-monitoring allows you to ship traces from your Kubernetes cluster to Logz.io with the OpenTelemetry collector.

Deploy the Helm chart

Add logzio-helm repo as follows:

helm repo add logzio-helm https://logzio.github.io/logzio-helm
helm repo update

Run the Helm deployment code

helm install -n monitoring \
--set metricsOrTraces.enabled=true \
--set logzio-k8s-telemetry.traces.enabled=true \
--set logzio-k8s-telemetry.secrets.TracesToken="<<TRACES-SHIPPING-TOKEN>>" \
--set logzio-k8s-telemetry.secrets.LogzioRegion="<<LOGZIO-REGION>>" \
--set logzio-k8s-telemetry.secrets.env_id="<<CLUSTER-NAME>>" \
logzio-monitoring logzio-helm/logzio-monitoring

Replace `<<TRACES-SHIPPING-TOKEN>>` with the [token](https://app.logz.io/#/dashboard/settings/manage-tokens/data-shipping?product=tracing) of the account you want to ship to.

Replace `<<LOGZIO-REGION>>` with the applicable [region code](https://docs.logz.io/docs/user-guide/admin/hosting-regions/account-region/#available-regions), e.g., us or eu.

| Parameter | Description |
|---|---|
| <<TRACES-SHIPPING-TOKEN>> | Your traces shipping token. |
| <<CLUSTER-NAME>> | The cluster's name, to easily identify the telemetry data for each environment. |
| <<LISTENER-HOST>> | Your account's listener host. |
| <<LOGZIO-REGION>> | Name of your Logz.io traces region, e.g., us, eu. |

Define the logzio-monitoring DNS name

In most cases, the DNS name will be logzio-k8s-telemetry.<<namespace>>.svc.cluster.local, where <<namespace>> is the namespace where you deployed the Helm chart and svc.cluster.local is your cluster domain name.

If you are not sure what your cluster domain name is, you can run the following command to look it up:

kubectl run -it --image=k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3 --restart=Never shell -- \
sh -c 'nslookup kubernetes.<<namespace>> | grep Name | sed "s/Name:\skubernetes.<<namespace>>//"'

It will deploy a small pod that extracts your cluster domain name from your Kubernetes environment. You can remove this pod after it has returned the cluster domain name.

Configure your .NET Kafka application to send spans to logzio-monitoring

You can configure the OTLP endpoint and protocol using environment variables:

  • OTEL_EXPORTER_OTLP_ENDPOINT (the destination of your OpenTelemetry collector, `<<logzio-monitoring-service-dns>>`)
  • OTEL_EXPORTER_OTLP_PROTOCOL (grpc or http/protobuf)
  • Replace <<logzio-monitoring-service-dns>> with the OpenTelemetry collector service DNS obtained previously (a service IP is also allowed here).

Check Logz.io for your traces

Give your traces some time to get from your system to ours, then open Logz.io.

Customizing Helm chart parameters

Configure customization options

You can use the following options to update the Helm chart parameters:

  • Specify parameters using the --set key=value[,key=value] argument to helm install.

  • Edit the values.yaml.

  • Override default values with your own my_values.yaml and apply it in the helm install command.

If required, you can add the following optional parameters as environment variables:

| Parameter | Description |
|---|---|
| secrets.SamplingLatency | Threshold for the span latency - all traces slower than this value will be filtered in. Default 500. |
| secrets.SamplingProbability | Sampling percentage for the probabilistic policy. Default 10. |

Example

You can run the logzio-monitoring chart with your custom configuration file that takes precedence over the values.yaml of the chart.

For example:

note

With this example configuration, the collector will sample ALL traces that contain at least one span with an error.

logzio-k8s-telemetry:
  tracesConfig:
    processors:
      tail_sampling:
        policies:
          [
            {
              name: error-in-policy,
              type: status_code,
              status_code: {status_codes: [ERROR]}
            },
            {
              name: slow-traces-policy,
              type: latency,
              latency: {threshold_ms: 400}
            },
            {
              name: health-traces,
              type: and,
              and: {
                and_sub_policy:
                [
                  {
                    name: ping-operation,
                    type: string_attribute,
                    string_attribute: { key: http.url, values: [ /health ] }
                  },
                  {
                    name: main-service,
                    type: string_attribute,
                    string_attribute: { key: service.name, values: [ main-service ] }
                  },
                  {
                    name: probability-policy-1,
                    type: probabilistic,
                    probabilistic: {sampling_percentage: 1}
                  }
                ]
              }
            },
            {
              name: probability-policy,
              type: probabilistic,
              probabilistic: {sampling_percentage: 20}
            }
          ]

helm install -f <PATH-TO>/my_values.yaml -n monitoring \
--set logzio.region=<<LOGZIO_ACCOUNT_REGION_CODE>> \
--set logzio.tracing_token=<<TRACING-SHIPPING-TOKEN>> \
--set traces.enabled=true \
logzio-monitoring logzio-helm/logzio-monitoring

Replace `<<TRACING-SHIPPING-TOKEN>>` with the [token](https://app.logz.io/#/dashboard/settings/manage-tokens/data-shipping?product=tracing) of the account you want to ship to.

Replace `<<LOGZIO_ACCOUNT_REGION_CODE>>` with the applicable [region code](https://docs.logz.io/docs/user-guide/admin/hosting-regions/account-region/#available-regions).

Replace <PATH-TO> with the path to your custom values.yaml file.

Uninstalling the Chart

The uninstall command is used to remove all the Kubernetes components associated with the chart and to delete the release.

To uninstall the logzio-monitoring deployment, use the following command:

helm uninstall logzio-monitoring

Troubleshooting

This section contains guidelines for handling errors that you may encounter when trying to collect traces with OpenTelemetry.

Problem: No traces are sent

The code has been instrumented, but the traces are not being sent.

Possible cause - Collector not installed

The OpenTelemetry collector may not be installed on your system.

Suggested remedy

Check if you have an OpenTelemetry collector installed and configured to receive traces from your hosts.

Possible cause - Collector path not configured

If the collector is installed, it may not have the correct endpoint configured for the receiver.

Suggested remedy
  1. Check that the configuration file of the collector lists the following endpoints:

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:4317"
          http:
            endpoint: "0.0.0.0:4318"
  2. In the instrumentation code, make sure that the endpoint is specified correctly. You can use Logz.io's integrations hub to ship your data.

Possible cause - Traces not generated

If the collector is installed and the endpoints are properly configured, the instrumentation code may be incorrect.

Suggested remedy
  1. Check if the instrumentation can output traces to a console exporter (see the sketch after this list).
  2. Use a web-hook to check if the traces are going to the output.
  3. Use the metrics endpoint of the collector (http://<<COLLECTOR-HOST>>:8888/metrics) to see the number of spans received per receiver and the number of spans sent to the Logz.io exporter.
  • Replace <<COLLECTOR-HOST>> with the address of your collector host, e.g. localhost, if the collector is hosted locally.
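
A quick way to run the console-exporter check from step 1 is to temporarily add the console exporter to your tracing setup. This requires the OpenTelemetry.Exporter.Console NuGet package; a minimal sketch:

// dotnet add package OpenTelemetry.Exporter.Console
builder.Services.AddOpenTelemetry()
    .WithTracing(traceBuilder =>
    {
        traceBuilder
            .AddAspNetCoreInstrumentation()
            .AddConfluentKafkaInstrumentation()
            .AddConsoleExporter(); // spans are written to stdout so you can confirm they are generated
    });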

If the above steps do not work, refer to Logz.io's integrations hub and re-instrument the application.

Possible cause - Wrong exporter/protocol/endpoint

If traces are generated but not sent, the collector may be using an incorrect exporter, protocol, and/or endpoint.

The correct endpoints are:

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "<<COLLECTOR-URL>>:4317"
          http:
            endpoint: "<<COLLECTOR-URL>>:4318/v1/traces"

Suggested remedy
  1. Activate debug logs in the configuration file of the collector as follows:

    service:
      telemetry:
        logs:
          level: "debug"

Debug logs indicate the status code of the http/https post request.

If the post request is not successful, check if the collector is configured to use the correct exporter, protocol, and/or endpoint.

If the post request is successful, there will be an additional log with the status code 200. If the post request fails, there will be another log with the reason for the failure.

Possible cause - Collector failure

If the debug logs are sent, but the traces are still not generated, the collector logs need to be investigated.

Suggested remedy
  1. On Linux and MacOS, see the logs for the collector:

    journalctl | grep otelcol

    To only see errors:

    journalctl | grep otelcol | grep Error
  2. Otherwise, navigate to the following URL - http://localhost:8888/metrics

This is the endpoint to access the collector metrics in order to see different events that might happen within the collector - receiving spans, sending spans as well as other errors.
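
If you prefer to inspect these counters from code, here is a small hedged sketch that fetches the collector's metrics endpoint and prints the span-related counters. It assumes the collector's default telemetry port 8888 and a locally reachable collector:

using System.Net.Http;

// Fetch the collector's Prometheus-style metrics and print the receiver/exporter span counters.
using var http = new HttpClient();
var metrics = await http.GetStringAsync("http://localhost:8888/metrics");

foreach (var line in metrics.Split('\n'))
{
    if (line.StartsWith("otelcol_receiver_accepted_spans") ||
        line.StartsWith("otelcol_receiver_refused_spans") ||
        line.StartsWith("otelcol_exporter_"))
    {
        Console.WriteLine(line);
    }
}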

Possible cause - Exporter failure

Traces may not be generated if the exporter is not configured properly.

Suggested remedy

If you are unable to export traces to a destination, this may be caused by the following:

  • There is a network configuration issue
  • The exporter configuration is incorrect
  • The destination is unavailable

To investigate this issue:

  1. Make sure that the exporters and service: pipelines are configured correctly.
  2. Check the collector logs as well as zpages for potential issues.
  3. Check your network configuration, such as firewall, DNS, or proxy.

For example, these metrics can provide information about the exporter:

# HELP otelcol_exporter_enqueue_failed_metric_points Number of metric points failed to be added to the sending queue.
# TYPE otelcol_exporter_enqueue_failed_metric_points counter
otelcol_exporter_enqueue_failed_metric_points{exporter="logging",service_instance_id="0582dab5-efb8-4061-94a7-60abdc9867e1",service_version="latest"} 0

Possible cause - Receiver failure

Traces may not be generated if the receiver is not configured properly.

Suggested remedy

If you are unable to receive data, this may be caused by the following:

  • There is a network configuration issue
  • The receiver configuration is incorrect
  • The receiver is defined in the receivers section, but not enabled in any pipelines
  • The client configuration is incorrect

These metrics can provide information about the receiver:

# HELP otelcol_receiver_accepted_spans Number of spans successfully pushed into the pipeline.
# TYPE otelcol_receiver_accepted_spans counter
otelcol_receiver_accepted_spans{receiver="otlp",service_instance_id="0582dab5-efb8-4061-94a7-60abdc9867e1",service_version="latest",transport="grpc"} 34

# HELP otelcol_receiver_refused_spans Number of spans that could not be pushed into the pipeline.
# TYPE otelcol_receiver_refused_spans counter
otelcol_receiver_refused_spans{receiver="otlp",service_instance_id="0582dab5-efb8-4061-94a7-60abdc9867e1",service_version="latest",transport="grpc"} 0