-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 35 #45
Issue 35 #45
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall made some minor comments
packages/detector.go
Outdated
if len(r) == 0 { | ||
data, err := readReleaseFile() | ||
if err != nil { | ||
return packageManager, err | ||
} | ||
raw := bytes.Split(data, []byte("\n")) | ||
|
||
id, err = parseField(raw, "ID") | ||
if err != nil { | ||
return packageManager, err | ||
} | ||
|
||
version, err = parseField(raw, "VERSION_ID") | ||
if err != nil { | ||
return packageManager, err | ||
} | ||
} else { | ||
id = r[0].ID | ||
version = r[0].Version | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this feels off to me. If we are going to leave it this way and pass in Release
feels like there should be a different method that calculates the Release
and only has one argument of logger and passes the calculated Release into this method.
That would be used by the actual production code. And you could still use this method for the unit tests. But if we are being honest I think we have almost all the tests we should need here.
https://github.com/sonatype-nexus-community/ahab/tree/master/docker
We are probably missing a few versions of redhat, debian, mint, fedora, etc so we could add those to make sure this is covered for the parts of the case statement below. And if we go that way we would at least not need the extra "test code" in the production code ¯_(ツ)_/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the sanity check! Agreed, the docker based coverage makes this redundant + it just felt wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was going too far with the matching -- the rhel/redhat bits in particular probably don't make sense, because they are commercial (not sure how heavily used they would be as cloud images, as a former redhat user i just used centos for that). going to clean that up which will simplify the case.
i'll add more tests to cover the additional debian distros. i'm not sure if anyone would run a rolling release in the cloud, but arch also now has a docker image (finally)... honestly i'm not sure why anyone would run something other than alpine for most things, but a future PR might be adding support for arch. since that'll need a new purl format maybe that can be part of #44.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to think this through after more coffee tomorrow. when adding more docker based tests the first thing i hit was what looked like timeouts. boosting -timeout, i then got rate limited by OSSI. i feel like i worked around that in the past with a token, but need to remember how to get it injected into all the containers...and possibly rethink having so many parallel invocations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've poked at this awhile, and the issues seems to be rate limiting -- although i might have uncovered something worth chasing.
if i run the original set of docker tests (8) all pass. if i add variants for ubuntu and old/new centos/fedora (11) each test can be ran individually but running them all errors out. playing with -timeout and -parallel doesn't help, just makes it take longer to fail. if the total number of ossi requests goes past the limit it obviously causes issues.
however, in the past when i'd hit that myself i'd see "you've been rate limited" type messages. i've been able to consistently cause the failure when running more tests, and each time i see the same message -- and it's failing (due to order) at the same test each time. the message is:
Uh oh, an error occurred, if this persists try rerunning with -v, -vv, or -vvv to get more information in the logs
Error: runtime error: index out of range [1] with length 1
full output: https://gist.github.com/deadlysyn/ff3c13c6acb6e9f4182e180b4ff8a6cf
it's not a lot to go on, but unless this turns out to be something silly from the changes i've made, it feels like this might be an opportunity for more defensive programming / better error message.
i'm going to try and figure out where that's coming from...but wanted to pose some questions while i'm spelunking:
-
are there time limits to worry about in the current CI setup? even w/o a rate limit, more tests obviously add tiime. aside from causing longer builds, not sure if there are hard limits we need to keep in mind. it's all running with t.Parallel() but also not sure how parallelism is limited in pipelines. 🤔
-
do the CI tests currently run in such a way that they use an account/token for OSSI access? even if this odd error is chased down, i think the additional tests will still require token access to ever succeed. i was thinking we could refactor the dockerfiles to use env vars to feed that to ahab, but that would mean the CI system needs configured house secrets and devs need .envrc or some way to consistently set the same. is this worthwhile, or should i just pair back the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry to spam, wanted to brain dump before i sleep and forget everything.
per gist above, i noticed it fails consistently with a dockerfile that forces centos:latest + specifying --os fedora
(which should be using dnf). since it complains about a [1]
index, i'm trying to figure out if it's somehow package format related since we have some stuff like this:
parse/yum.go
50: parsedProject.Version = doParseYumVersionIntoPurl(splitPackage[1])
58: parsedProject.Version = doParseYumVersionIntoPurl(splitPackage[1])
that code is easy to read and seems solid. it's just using strings.Fields
, yum and dnf both return the same list output, and manually running a centos:latest container, copying in an ahab binary, and running it a) with auto detect b) specifying --os and c) specifying --package manager (yum or dnf) all work fine. there is some nuance about how the tests/docker are working i don't yet understand.
note: in my original manual tests i was just using base centos image, skipping this step from the test dockerfiles:
RUN yum -y install epel-release python3-pip
if i do that in my image, i then get consistent failures with the same error message whether using dnf or yum and either --os, --package-maanger or auto detect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow this was an edge case, i need sleep + coffee to trust the finding, but current thought is it looks like meta packages don't have versions which throws the yum parsing off. just diffing dnf list installed output before and after adding the packages which caused failures and see stuff like this after:
...
+dracut-network.x86_64 049-70.git20200228.el8 @System
+dracut-squash.x86_64 049-70.git20200228.el8 @System
+dracut.x86_64 049-70.git20200228.el8 @System
+elfutils-default-yama-scope.noarch
+elfutils-libelf.x86_64 0.178-7.el8 @System
+elfutils-libs.x86_64 0.178-7.el8 @System
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've confirmed this is a legitimate issue the extended test coverage uncovered. steps to reproduce:
- docker pull centos:latest (or fedora:latest)
- docker run -it --rm -v
pwd
:pwd
-wpwd
centos:latest (make sure freshly built ahab binary is in pwd) - yum (or dnf) -y install epel-release python3-pip
- yum (or dnf) list installed | ./ahab chase (or --os or --package-manager)
also verified by checking diff output and doing simple shell for/while prefixing each line -- yum/dnf list output can generate lines which miss some fields so breaks current parsing.
proposal: while working around this in various ways it dawned on me that we might avoid this issue + simplify yum vs dnf detection by simply using rpm --queryformat. rpm exists in all centos/fedora versions so simplifies detection/testing. this approach would be similar to using dpkg-query vs apt in debian distros -- apt/dnf/yum are really just dependency management tools atop the actual package manager.
any objections to reworking this to use rpm vs dnf/yum? it should ensure consistent input data by being able to control the format. for example:
[root@976a05fd817a yum]# rpm -qa --queryformat "%{NAME}.%{ARCH} %{VERSION}\n" | grep elf
elfutils-libelf.x86_64 0.178
elfutils-libs.x86_64 0.178
elfutils-default-yama-scope.noarch 0.178
packages/detector.go
Outdated
var execCommand = exec.Command | ||
// SupportedPackageManagers represents the standard error string used | ||
// when OS package maanger can not be identified. | ||
const SupportedPackageManagers = "No supported package manager found; apk, dpkg, dnf or yum installed?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A question here feels a little off I think. Maybe something like this??
const SupportedPackageManagers = "No supported package manager found; apk, dpkg, dnf or yum installed?" | |
const SupportedPackageManagers = "No supported package manager could be auto detected. Supported versions are apk, dpkg, dnf or yum." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
packages/detector.go
Outdated
// Package packages implements package manager detection and package | ||
// manager specific output formatting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment reads a little funny to me. #youGetAPackageAndYouGetAPackage :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Getting too OCD with linting warnings :-D
* run nancy to check for dependency vulns
* deprecate os, add package-manager
Looks like some of the prior PR's I merged caused some conflicts. I will try to resolve the conflicts. Apologies in advance if I make a bigger mess. |
@bhamail no worries, you are not alone in merge conflict purgatory :-D i could also help sort it out if you want -- let me know how it goes. i might be able to jump through the hoops faster in this specific case if it relates to change i made recently. should i just pull in master on my side and sort it out? aside from that, this has turned into a bigger issue than i first imagined...detecting os based on /etc/os-release was easy enough, but the inconsistent output format of dnf and yum (newer versions some times seem to split package names/versions across multiple lines in rare cases, which breaks parsing...we could piece those lines together, but it feels brittle and makes me think GIGO so we should possibly change how we inject the package data). because that is a giant can of worms i am tempted to just help you get the os-release parsing/detection conflicts sorted, back out the test changes, and then maybe we do another PR that is about adding back the extended tests (which uncovered the package version parsing issue that's been lurking all along) + work through an appropriate fix. there's different ways we can do it, and while rpm queryformat feels technically right changes usage so probably needs larger discussion/review. |
:) Well, I don't pretend to know why the PR I created against your branch doesn't even show changes to the files listed as conflicting: I'm guessing ignoring that PR and fixing it directly is no worse. ;) |
Took care of the merge conflicts, and added what I think is the minimal workaround for the dnf/yum parsing issue. It's not ideal, because it just skips the bad lines (that package won't be scanned), but the current behavior is a crash from indexing out of bounds. I see the last commit as a bandaid to prevent the crashes described above, but we should create a new PR to discuss and agree on the real solution. |
So I agree with all that you've mentioned around the issue. If you wouldn't mind opening an issue to deal with the actual issue. I think the workaround is good enough for now but moving over to rpm sounds reasonable and really maybe we just make that a 1.0 level change.
Ummmm..... not sure how that would be. We do the same
Yah as I alluded to above I dont believe we have anything like that setup from what I know. Maybe @bhamail or @DarthHater can go poking and looking at the circleCI config magic for us?? But that could be a good addition if we dont have anything to get around this. |
CircleCI has a nice mechanism for passing secrets via environment variables. We use it often. Related to Issue #27:
|
I'll see if I can make some progress with adding |
I merged the Viper/config stuff into |
Update package manager detection to use /etc/os-release.
This pull request makes the following changes:
API is unchanged so behavior should be the same. Docker functional tests still pass as expected.
NOTE: Since #29 has not been merged and there is overlap, I've pulled those changes into this branch to try and avoid merge conflict fun.
detector.go
anddetector_test.go
are all that contain new changes specific to this issue.It relates to the following issue #s:
cc @bhamail / @DarthHater / @zendern