
SNMP Quickstart

Ian edited this page Apr 12, 2024 · 43 revisions

Pre-reqs

Validate you have docker or podman installed and running (docs)

docker version --format '{{.Server.Version}}'

Validate you have a non-root user available in the docker group (docs)

grep -e "docker" /etc/group

Note: podman should also work with ktranslate.
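Since either runtime should work, a small helper can pick whichever one is installed. This is a minimal sketch; the function name pick_runtime is hypothetical and not part of ktranslate.

```shell
# Print the first available container runtime: docker, then podman.
# Returns non-zero (and prints nothing) if neither is on PATH.
pick_runtime() {
  for candidate in docker podman; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example: RUNTIME=$(pick_runtime) && "$RUNTIME" version
```

The commands in the steps below are written for docker; this page only states that podman should also work, so treat any substitution as untested.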

Step 1

Download the ktranslate image from dockerhub:

docker pull kentik/ktranslate:v2

Step 2

Copy the SNMP config file out of the image into the $HOME directory of your docker user and discard the temporary container. Afterwards, update the permissions on the file if that user cannot read and write it.

cd "$HOME"
id=$(docker create kentik/ktranslate:v2)
docker cp "$id":/etc/ktranslate/snmp-base.yaml .
docker rm -v "$id"
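Discovery in Step 4 runs as your own user and adds the devices it finds to snmp-base.yaml, so the copied file needs to be readable and writable by that user. A minimal check might look like this; config_is_writable is a hypothetical helper name.

```shell
# Succeed only if the given path is a regular file that the
# current user can both read and write.
config_is_writable() {
  [ -f "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Example:
#   config_is_writable snmp-base.yaml || echo "fix permissions on snmp-base.yaml"
```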

Step 3

Edit the SNMP config file (snmp-base.yaml) with your preferred text editor, setting cidrs: and default_communities: to values appropriate for your network.
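After editing, it can help to print just those two settings to confirm what discovery will use. The sketch below assumes each key is followed by a YAML list with one "- value" item per line; show_discovery_settings is a hypothetical helper name.

```shell
# Print the cidrs: and default_communities: keys together with the
# "- item" lines directly beneath each of them.
# Usage: show_discovery_settings snmp-base.yaml
show_discovery_settings() {
  awk '
    /^[ \t]*(cidrs|default_communities):/ { show = 1; print; next }
    show && /^[ \t]*-/                    { print; next }
                                          { show = 0 }
  ' "$1"
}
```

If the file uses a different list layout (for example an inline [a, b] list), the item lines will not be printed and the file should be inspected by hand instead.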

Step 4

Run a discovery on SNMP devices based on the CIDR and Community String values you've configured:

You need your New Relic Insights Insert Key and Account ID. Substitute them for $NR_INSIGHTS_INSERT_KEY and $NR_ACCOUNT_ID in this command, respectively.

Note! This will not work with a Free Tier Account. Upgrade to a paid account to proceed.
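If you export the two values as shell variables instead of pasting them into the command, a small guard can catch an unset variable before the container starts. This is a sketch under that assumption; require_nr_env is a hypothetical helper name.

```shell
# Return non-zero and name the missing variable if either
# New Relic credential is unset or empty.
require_nr_env() {
  if [ -z "${NR_INSIGHTS_INSERT_KEY:-}" ]; then
    echo "NR_INSIGHTS_INSERT_KEY is not set" >&2
    return 1
  fi
  if [ -z "${NR_ACCOUNT_ID:-}" ]; then
    echo "NR_ACCOUNT_ID is not set" >&2
    return 1
  fi
}

# Example: require_nr_env && echo "credentials present, safe to run discovery"
```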

docker run -ti --name ktranslate-discovery --rm --net=host \
  --user `id -u`:`id -g` \
  -e NEW_RELIC_API_KEY=$NR_INSIGHTS_INSERT_KEY  \
  -v `pwd`/snmp-base.yaml:/snmp-base.yaml \
  kentik/ktranslate:v2 \
    -snmp /snmp-base.yaml \
    -tee_logs=true \
    -nr_account_id=$NR_ACCOUNT_ID \
    -snmp_discovery=true

Towards the end of the discovery process, you should see a log line similar to:

[Info] KTranslate Adding 3 new snmp devices to the config, 0 replaced from 3

The above example indicates that discovery found 3 new devices.
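To pull that count out of saved output in a script, the summary line can be matched with sed. The sketch assumes the log wording shown above; new_device_count and the file name discovery.log are hypothetical.

```shell
# Read ktranslate log lines on stdin and print the "new devices" count
# from the last discovery summary line; print nothing if none is found.
new_device_count() {
  sed -nE 's/.*Adding ([0-9]+) new snmp devices.*/\1/p' | tail -n 1
}

# Example, assuming the discovery output was saved to discovery.log:
#   new_device_count < discovery.log
```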

If nothing is discovered, increase the timeout_ms: value in the config file and re-run discovery.

If still nothing is discovered, set the cidrs: blocks to contain only /32 entries. This forces the discovery process into a more in-depth mode, which should work as long as the device is reachable via SNMP.
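When there are many devices, the /32 entries can be generated from a list of addresses rather than typed by hand. A minimal sketch, assuming cidrs: takes one "- value" list item per line; to_host_cidrs and the addresses are hypothetical.

```shell
# Print one "- <ip>/32" YAML list item for each address given.
to_host_cidrs() {
  for ip in "$@"; do
    printf -- '- %s/32\n' "$ip"
  done
}

# Example: to_host_cidrs 10.0.0.1 10.0.0.2
```

Paste the output under cidrs: in snmp-base.yaml, indented to match the existing entries.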

After a successful discovery, the devices are listed in the snmp-base.yaml file. By default, only the IF-MIB MIB is polled. Add other MIBs to that file as your devices support them.

Step 5

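Lastly, start ktranslate to poll the target devices. The first example runs in the foreground and outputs JSON; the second runs detached in the background and sends data to New Relic.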

Example: Output to JSON

docker run -ti --name ktranslate-snmp --rm --net=host \
  -v `pwd`/snmp-base.yaml:/snmp-base.yaml \
  kentik/ktranslate:v2 \
    -snmp=/snmp-base.yaml \
    -log_level=info \
    -format=json 

Example: Output to New Relic sink

You need your New Relic Insights Insert Key and Account ID. Substitute them for $NR_INSIGHTS_INSERT_KEY and $NR_ACCOUNT_ID in this command, respectively.

Note! This will not work with a Free Tier Account. Upgrade to a paid account to proceed.

docker run -d --name ktranslate-snmp --restart unless-stopped --net=host \
  -v `pwd`/snmp-base.yaml:/snmp-base.yaml \
  -e NEW_RELIC_API_KEY=$NR_INSIGHTS_INSERT_KEY  \
  kentik/ktranslate:v2 \
    -snmp /snmp-base.yaml \
    -nr_account_id=$NR_ACCOUNT_ID \
    -log_level=info \
    -metrics=jchf \
    -tee_logs=true \
    nr1.snmp 

Troubleshooting

EU Region Accounts

To send data to the EU region, the following flag needs to be added to the docker command: -nr_region=EU
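If the same script serves accounts in both regions, the flag can be added conditionally. This is a sketch only: NR_REGION and region_flag are hypothetical names, and placing the flag alongside the other ktranslate flags (after the image name) is an assumption based on its matching style.

```shell
# Print the EU region flag when NR_REGION is "EU" (or "eu");
# print nothing for any other value or when it is unset.
region_flag() {
  case "${NR_REGION:-}" in
    EU|eu) printf '%s\n' '-nr_region=EU' ;;
  esac
}

# Example: include $(region_flag) among the ktranslate flags, e.g.
#   ... kentik/ktranslate:v2 -snmp /snmp-base.yaml $(region_flag) ... nr1.snmp
```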

Logs

Get logs with docker logs ktranslate-snmp

Logs are also sent to New Relic when the container is started with the -tee_logs=true argument. You can find them in the New Relic Logs UI with this search:

collector.name:"ktranslate"

And filter out the [Info] messages with this:

collector.name:"ktranslate" message:-*\[Info\]*
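The same filtering can be done locally on the container logs. A minimal sketch; drop_info is a hypothetical helper name.

```shell
# Drop every line containing the literal text "[Info]" from stdin.
drop_info() {
  grep -vF '[Info]'
}

# Example: docker logs ktranslate-snmp 2>&1 | drop_info
```

The -F option makes grep treat the brackets literally, so no escaping is needed as it is in the New Relic search.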

Metrics

Additionally, when the container is started with the -metrics=jchf argument, ktranslate reports the following performance metrics about itself:

| Metric Name | Description |
| --- | --- |
| baseserver_healthcheck_execution_total | Rate of internal health checks; mostly shows that things are not deadlocked |
| delivery_metrics_nr | Rate of metrics sent to New Relic |
| delivery_logs_nr | Rate of logs sent to New Relic |
| delivery_wins_nr | Rate of 200 HTTP codes received from sending metrics and events to New Relic |
| device_metrics | Rate of SNMP polling of device-level metrics |
| inputq | Rate (msg/sec) of messages received over the last 60 sec from inputs (SNMP, VPC, Flow) |
| interface_metrics | Rate of SNMP polling of interface-level metrics |
| jchfq | Gauge with the number of available pre-allocated buffers; should be ~8,000 |
| snmp_fail | Gauge showing whether SNMP is working: 1 == GOOD, 2 == BAD. Facet by device_name |

You can query these in the New Relic One UI with the following NRQL:

FROM Metric
SELECT
latest(kentik.ktranslate.chf.kkc.baseserver_healthcheck_execution_total) AS 'baseserver_healthcheck_execution_total',
latest(kentik.ktranslate.chf.kkc.delivery_metrics_nr) AS 'delivery_metrics_nr',
latest(kentik.ktranslate.chf.kkc.delivery_logs_nr) AS 'delivery_logs_nr',
latest(kentik.ktranslate.chf.kkc.delivery_wins_nr) AS 'delivery_wins_nr',
latest(kentik.ktranslate.chf.kkc.device_metrics) AS 'device_metrics',
latest(kentik.ktranslate.chf.kkc.inputq) AS 'inputq',
latest(kentik.ktranslate.chf.kkc.snmp_fail) AS 'snmp_fail',
latest(kentik.ktranslate.chf.kkc.interface_metrics) AS 'interface_metrics',
latest(kentik.ktranslate.chf.kkc.jchfq) AS 'jchfq'
WHERE provider = 'kentik-agent'
AND instrumentation.name = 'heartbeat'
LIMIT MAX
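For a per-device view of a single metric, such as snmp_fail, the query can be narrowed and faceted. The sketch below only builds the NRQL text, reusing the metric prefix and filters from the query above; metric_by_device is a hypothetical helper name, and it assumes the metric carries a device_name attribute as the table suggests.

```shell
# Print an NRQL query for one ktranslate health metric, faceted by device_name.
# Usage: metric_by_device snmp_fail
metric_by_device() {
  printf "FROM Metric SELECT latest(kentik.ktranslate.chf.kkc.%s) WHERE provider = 'kentik-agent' AND instrumentation.name = 'heartbeat' FACET device_name\n" "$1"
}
```

Paste the printed query into the New Relic query builder to run it.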