F5 Trap messages not having event varbind metrics converted from OID to string name #740
Comments
I'm having a hard time reproducing this bug. When I use the provided Python script I get:
[
{
"instrumentation.name": "netflow-events",
"TrapOID": ".1.3.6.1.4.1.3375.2.4.0.10",
"src_addr": "127.0.0.1",
"collector.name": "ktranslate",
"node_name": "/Common/100.84.2.109",
"service_port": "80",
"eventType": "KSnmpTrap",
"provider": "kentik-trap-device",
"instrumentation.provider": "kentik",
"message": "Pool /Common/http_las.fml.prod.eleadcrm.com_80_pool member /Common/100.84.2.109:80 monitor status down. [ /Common/http_head_evo2-releaseinfo_200: down, /Common/http_release_updown: checking; last error: /Common/http_head_evo2-releaseinfo_200: Response Code: 200 (OK) @2024/08/14 09:10:15. ] [ was up for 0hr:3mins:36sec ]",
"TrapName": "bigipServiceDown",
"device_name": "127.0.0.1"
}
]
Which I think is correct? Can you tell me what's missing here? One thing that comes to mind is that you might have an older ktranslate container that didn't get the correct F5 YAML. |
The primary issue (at least to my understanding of Ktranslate) is that the varbinds that are collected as
but the metric names for the varbinds are the actual OID instead of looking like this:
The example that you provided does have that translation between varbind OID and metric name present. How did you create the JSON output you provided? |
Can you try this and see if you get the same output:
Should see [{"src_geo":"Private IP","src_as_name":"Private IP","node_name":"/Common/100.84.2.109","TrapName":"bigipServiceDown","provider":"kentik-trap-device","eventType":"KSnmpTrap","service_port":"80","message":"Pool /Common/http_las.fml.prod.eleadcrm.com_80_pool member /Common/100.84.2.109:80 monitor status down. [ /Common/http_head_evo2-releaseinfo_200: down, /Common/http_release_updown: checking; last error: /Common/http_head_evo2-releaseinfo_200: Response Code: 200 (OK) @2024/08/14 09:10:15. ] [ was up for 0hr:3mins:36sec ]","sysUpTimeInstance":257116481,"device_name":"192.168.0.100","collector.name":"ktranslate","instrumentation.provider":"kentik","TrapOID":".1.3.6.1.4.1.3375.2.4.0.10","instrumentation.name":"netflow-events","src_addr":"192.168.0.100"}] This is using the default json format which just dumps what is given to stdout. |
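One way to check that JSON output for the symptom described in this issue is to scan each event for keys that are still raw OIDs. The following is a minimal sketch, not part of ktranslate: the function name `untranslated_keys` and the sample payload are made up for illustration, and it assumes the output is a JSON array of flat objects like the ones shown in this thread.

```python
import json
import re

# Matches keys that are purely numeric dotted OIDs, with or without a leading dot.
OID_KEY = re.compile(r"^\.?\d+(\.\d+)+$")


def untranslated_keys(payload: str) -> list[str]:
    """Return event keys that still look like raw OIDs (hypothetical helper)."""
    events = json.loads(payload)
    found = []
    for event in events:
        for key in event:
            if OID_KEY.match(key) and key not in found:
                found.append(key)
    return found


# Illustrative payload only: one translated field and one untranslated varbind.
sample = (
    '[{"TrapName": "bigipServiceDown", "message": "...", '
    '".1.3.6.1.4.1.3375.2.4.1.1": "..."}]'
)
print(untranslated_keys(sample))
```

An empty list means every varbind key was translated to a name. Fields such as TrapOID are not flagged, because only the key is checked, not the value.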
Running the container with the command provided does show a successful launch:
but after that, no logs are written from the container. I ran the script again to generate the Trap message, and I can see it on
but it never appears in the Docker logs for some reason. Does the container typically output logs when "successfully" ingesting data? |
That sounds like a firewall in the way. Remember that tcpdump captures packets before iptables filters them, so it shows all packets, including any that are then dropped. Can you go back to how you were originally running the Docker container and run it that way? Whatever that setup was, it was getting the packets through. Just make sure to pull a new image and take out all the flags except for |
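Since tcpdump cannot tell whether a packet survives the firewall, a quick check is to bind a plain UDP socket on the trap port and see whether anything reaches userspace. This is a rough sketch under assumptions: port 1620 is only a placeholder for whatever trap listen port is configured, the helper names are invented here, and the listener has to run on the host with the container stopped so the port is free. It does not exercise Docker's own port-forwarding rules.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 1620) -> socket.socket:
    """Bind a UDP socket on the given address (port 1620 is an assumed placeholder)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def wait_for_datagram(sock: socket.socket, timeout: float = 30.0):
    """Wait for one datagram; return (size, sender) or None if the timeout expires."""
    sock.settimeout(timeout)
    try:
        data, sender = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return len(data), sender
```

To use it, call `wait_for_datagram(open_listener())` and resend the trap within the timeout. If tcpdump shows the packet but this returns None, something between the interface and the socket is dropping it.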
Command to launch the container:
Logs from Docker container:
Even with debug logs supposedly enabled, I'm not seeing that Ktranslate is picking up the trap messages. I stripped out the stuff related to SNMP processing and ingesting data to New Relic, but can re-enable it if needed to see what gets sent to NRDB. This is what the
Sorry for the confusion on my end, but how are you checking the JSON that gets created by Ktranslate? |
I'm just removing From this, Thanks! |
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
@ASchneider-GitHub were you able to try Ian's suggestion to get the JSON output? |
Problem Summary:
The customer is ingesting Trap messages from their Ktranslate container, and the data is making it into the platform as shown here: [Click]
The issue is that the column headers for the event varbinds defined in the profile are not being translated from their OIDs to names. For example, the column header .1.3.6.1.4.1.3375.2.4.1.1 should be named bigipServiceDown, but also have the name message in NRDB as defined here: [Click]. That name translation doesn't appear to be working despite the profile being valid YAML and the packets arriving in a healthy state.
To test this I added additional logging to the code that handles trap translation: https://github.com/kentik/ktranslate/compare/main...ASchneider-GitHub:ktranslate:main#diff-4e0190318b944f9edc49501e72cf7697e3b2a8bce0c93dfa22bb2488eafbab94
After building and running the custom image, I ran the following Python script to send a Trap that matches the customer's 1:1:
The Docker container was run with a custom-mounted profile traps.yml that only contained the OID for the expected trap message, and the additional varbind metric values that are collected as well:
After sending the Trap, I got the following logs:
Based on the output we can see that the Trap was received:
It tried to look up .1.3.6.1.2.1.1.3.0 but failed (because it's not defined in my custom profile). It then tried to look up .1.3.6.1.4.1.3375.2.4.1.1, .1.3.6.1.4.1.3375.2.4.1.2, and .1.3.6.1.4.1.3375.2.4.1.3 (all of which ARE defined in the profile) and failed for all of them as well. That said, we can still see that the values of the varbinds were collected:
The trap is then sent to New Relic without any of the OID <-> name translation occurring. The Trap profile is formatted as documented by the template here: https://github.com/kentik/snmp-profiles/blob/main/profiles/kentik_snmp/_trap_template.yml
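For readers unfamiliar with what the OID <-> name step is expected to do, here is a small Python sketch of the general idea. It is not ktranslate's implementation (ktranslate is written in Go), and `normalize_oid`, `translate_varbinds`, and the one-entry profile are hypothetical. The leading-dot normalization is included only to show one way two spellings of the same OID can fail to match; it is not a claim about the cause of this bug.

```python
def normalize_oid(oid: str) -> str:
    """Strip whitespace and any leading dot so both spellings compare equal."""
    return oid.strip().lstrip(".")


def translate_varbinds(varbinds: dict[str, str], profile: dict[str, str]) -> dict[str, str]:
    """Rename varbind keys using the profile; keep the raw OID when nothing matches."""
    names = {normalize_oid(oid): name for oid, name in profile.items()}
    translated = {}
    for oid, value in varbinds.items():
        translated[names.get(normalize_oid(oid), oid)] = value
    return translated


# Hypothetical profile: only the name "message" comes from the issue text.
profile = {"1.3.6.1.4.1.3375.2.4.1.1": "message"}
varbinds = {
    ".1.3.6.1.4.1.3375.2.4.1.1": "Pool member monitor status down",
    ".1.3.6.1.2.1.1.3.0": "257116481",
}
print(translate_varbinds(varbinds, profile))
```

In this sketch the first varbind comes out under the key message, while the sysUpTime OID, which has no profile entry, keeps its raw OID as the key. The second case is the same shape as the untranslated columns described above.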