Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 35 #45

Merged
merged 21 commits into from
Dec 18, 2020
Merged

Issue 35 #45

merged 21 commits into from
Dec 18, 2020

Conversation

deadlysyn
Copy link
Contributor

Update package manager detection to use /etc/os-release.

This pull request makes the following changes:

  • Parse /etc/os-release to determine appropriate package manager
  • Remove dependency on which (or any external shell/command) for detection
  • Update unit test to mock os-release content vs which return code

API is unchanged so behavior should be the same. Docker functional tests still pass as expected.

NOTE: Since #29 has not been merged and there is overlap, I've pulled those changes into this branch to try and avoid merge conflict fun. detector.go and detector_test.go are all that contain new changes specific to this issue.

It relates to the following issue #s:

cc @bhamail / @DarthHater / @zendern

Copy link
Contributor

@zendern zendern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall made some minor comments

Comment on lines 51 to 70
if len(r) == 0 {
data, err := readReleaseFile()
if err != nil {
return packageManager, err
}
raw := bytes.Split(data, []byte("\n"))

id, err = parseField(raw, "ID")
if err != nil {
return packageManager, err
}

version, err = parseField(raw, "VERSION_ID")
if err != nil {
return packageManager, err
}
} else {
id = r[0].ID
version = r[0].Version
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this feels off to me. If we are going to leave it this way and pass in Release feels like there should be a different method that calculates the Release and only has one argument of logger and passes the calculated Release into this method.

That would be used by the actual production code. And you could still use this method for the unit tests. But if we are being honest I think we have almost all the tests we should need here.

https://github.com/sonatype-nexus-community/ahab/tree/master/docker

We are probably missing a few versions of redhat, debian, mint, fedora, etc so we could add those to make sure this is covered for the parts of the case statement below. And if we go that way we would at least not need the extra "test code" in the production code ¯_(ツ)_/¯

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the sanity check! Agreed, the docker based coverage makes this redundant + it just felt wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was going too far with the matching -- the rhel/redhat bits in particular probably don't make sense, because they are commercial (not sure how heavily used they would be as cloud images, as a former redhat user i just used centos for that). going to clean that up which will simplify the case.

i'll add more tests to cover the additional debian distros. i'm not sure if anyone would run a rolling release in the cloud, but arch also now has a docker image (finally)... honestly i'm not sure why anyone would run something other than alpine for most things, but a future PR might be adding support for arch. since that'll need a new purl format maybe that can be part of #44.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to think this through after more coffee tomorrow. when adding more docker based tests the first thing i hit was what looked like timeouts. boosting -timeout, i then got rate limited by OSSI. i feel like i worked around that in the past with a token, but need to remember how to get it injected into all the containers...and possibly rethink having so many parallel invocations.

Copy link
Contributor Author

@deadlysyn deadlysyn Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've poked at this awhile, and the issues seems to be rate limiting -- although i might have uncovered something worth chasing.

if i run the original set of docker tests (8) all pass. if i add variants for ubuntu and old/new centos/fedora (11) each test can be ran individually but running them all errors out. playing with -timeout and -parallel doesn't help, just makes it take longer to fail. if the total number of ossi requests goes past the limit it obviously causes issues.

however, in the past when i'd hit that myself i'd see "you've been rate limited" type messages. i've been able to consistently cause the failure when running more tests, and each time i see the same message -- and it's failing (due to order) at the same test each time. the message is:

Uh oh, an error occurred, if this persists try rerunning with -v, -vv, or -vvv to get more information in the logs
                Error: runtime error: index out of range [1] with length 1

full output: https://gist.github.com/deadlysyn/ff3c13c6acb6e9f4182e180b4ff8a6cf

it's not a lot to go on, but unless this turns out to be something silly from the changes i've made, it feels like this might be an opportunity for more defensive programming / better error message.

i'm going to try and figure out where that's coming from...but wanted to pose some questions while i'm spelunking:

  • are there time limits to worry about in the current CI setup? even w/o a rate limit, more tests obviously add tiime. aside from causing longer builds, not sure if there are hard limits we need to keep in mind. it's all running with t.Parallel() but also not sure how parallelism is limited in pipelines. 🤔

  • do the CI tests currently run in such a way that they use an account/token for OSSI access? even if this odd error is chased down, i think the additional tests will still require token access to ever succeed. i was thinking we could refactor the dockerfiles to use env vars to feed that to ahab, but that would mean the CI system needs configured house secrets and devs need .envrc or some way to consistently set the same. is this worthwhile, or should i just pair back the tests?

Copy link
Contributor Author

@deadlysyn deadlysyn Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry to spam, wanted to brain dump before i sleep and forget everything.

per gist above, i noticed it fails consistently with a dockerfile that forces centos:latest + specifying --os fedora (which should be using dnf). since it complains about a [1] index, i'm trying to figure out if it's somehow package format related since we have some stuff like this:

parse/yum.go
50:     parsedProject.Version = doParseYumVersionIntoPurl(splitPackage[1])
58:     parsedProject.Version = doParseYumVersionIntoPurl(splitPackage[1])

that code is easy to read and seems solid. it's just using strings.Fields, yum and dnf both return the same list output, and manually running a centos:latest container, copying in an ahab binary, and running it a) with auto detect b) specifying --os and c) specifying --package manager (yum or dnf) all work fine. there is some nuance about how the tests/docker are working i don't yet understand.

note: in my original manual tests i was just using base centos image, skipping this step from the test dockerfiles:

RUN yum -y install epel-release python3-pip

if i do that in my image, i then get consistent failures with the same error message whether using dnf or yum and either --os, --package-maanger or auto detect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow this was an edge case, i need sleep + coffee to trust the finding, but current thought is it looks like meta packages don't have versions which throws the yum parsing off. just diffing dnf list installed output before and after adding the packages which caused failures and see stuff like this after:

...
+dracut-network.x86_64             049-70.git20200228.el8              @System
+dracut-squash.x86_64              049-70.git20200228.el8              @System
+dracut.x86_64                     049-70.git20200228.el8              @System
+elfutils-default-yama-scope.noarch
+elfutils-libelf.x86_64            0.178-7.el8                         @System
+elfutils-libs.x86_64              0.178-7.el8                         @System
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've confirmed this is a legitimate issue the extended test coverage uncovered. steps to reproduce:

  • docker pull centos:latest (or fedora:latest)
  • docker run -it --rm -v pwd:pwd -w pwd centos:latest (make sure freshly built ahab binary is in pwd)
  • yum (or dnf) -y install epel-release python3-pip
  • yum (or dnf) list installed | ./ahab chase (or --os or --package-manager)

also verified by checking diff output and doing simple shell for/while prefixing each line -- yum/dnf list output can generate lines which miss some fields so breaks current parsing.

proposal: while working around this in various ways it dawned on me that we might avoid this issue + simplify yum vs dnf detection by simply using rpm --queryformat. rpm exists in all centos/fedora versions so simplifies detection/testing. this approach would be similar to using dpkg-query vs apt in debian distros -- apt/dnf/yum are really just dependency management tools atop the actual package manager.

any objections to reworking this to use rpm vs dnf/yum? it should ensure consistent input data by being able to control the format. for example:

[root@976a05fd817a yum]# rpm -qa --queryformat "%{NAME}.%{ARCH} %{VERSION}\n" | grep elf
elfutils-libelf.x86_64 0.178
elfutils-libs.x86_64 0.178
elfutils-default-yama-scope.noarch 0.178

var execCommand = exec.Command
// SupportedPackageManagers represents the standard error string used
// when OS package maanger can not be identified.
const SupportedPackageManagers = "No supported package manager found; apk, dpkg, dnf or yum installed?"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question here feels a little off I think. Maybe something like this??

Suggested change
const SupportedPackageManagers = "No supported package manager found; apk, dpkg, dnf or yum installed?"
const SupportedPackageManagers = "No supported package manager could be auto detected. Supported versions are apk, dpkg, dnf or yum."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 15 to 16
// Package packages implements package manager detection and package
// manager specific output formatting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment reads a little funny to me. #youGetAPackageAndYouGetAPackage :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. Getting too OCD with linting warnings :-D

@bhamail
Copy link
Contributor

bhamail commented Oct 14, 2020

Looks like some of the prior PR's I merged caused some conflicts. I will try to resolve the conflicts. Apologies in advance if I make a bigger mess.

@deadlysyn
Copy link
Contributor Author

@bhamail no worries, you are not alone in merge conflict purgatory :-D i could also help sort it out if you want -- let me know how it goes. i might be able to jump through the hoops faster in this specific case if it relates to change i made recently. should i just pull in master on my side and sort it out?

aside from that, this has turned into a bigger issue than i first imagined...detecting os based on /etc/os-release was easy enough, but the inconsistent output format of dnf and yum (newer versions some times seem to split package names/versions across multiple lines in rare cases, which breaks parsing...we could piece those lines together, but it feels brittle and makes me think GIGO so we should possibly change how we inject the package data).

because that is a giant can of worms i am tempted to just help you get the os-release parsing/detection conflicts sorted, back out the test changes, and then maybe we do another PR that is about adding back the extended tests (which uncovered the package version parsing issue that's been lurking all along) + work through an appropriate fix. there's different ways we can do it, and while rpm queryformat feels technically right changes usage so probably needs larger discussion/review.

@bhamail
Copy link
Contributor

bhamail commented Oct 14, 2020

:) Well, I don't pretend to know why the PR I created against your branch doesn't even show changes to the files listed as conflicting: detector.go and detectory_test.go, but here is it in case it is of any help: deadlysyn#1

I'm guessing ignoring that PR and fixing it directly is no worse. ;)

@deadlysyn
Copy link
Contributor Author

Took care of the merge conflicts, and added what I think is the minimal workaround for the dnf/yum parsing issue. It's not ideal, because it just skips the bad lines (that package won't be scanned), but the current behavior is a crash from indexing out of bounds. I see the last commit as a bandaid to prevent the crashes described above, but we should create a new PR to discuss and agree on the real solution.

@zendern
Copy link
Contributor

zendern commented Oct 15, 2020

and added what I think is the minimal workaround for the dnf/yum parsing issue

So I agree with all that you've mentioned around the issue. If you wouldn't mind opening an issue to deal with the actual issue. I think the workaround is good enough for now but moving over to rpm sounds reasonable and really maybe we just make that a 1.0 level change.

are there time limits to worry about in the current CI setup? even w/o a rate limit, more tests obviously add tiime. aside from causing longer builds, not sure if there are hard limits we need to keep in mind. it's all running with t.Parallel() but also not sure how parallelism is limited in pipelines. 🤔

Ummmm..... not sure how that would be. We do the same make test locally where you were getting rate limited and CI. You have to explicitly pass in the username and password. So I'm not sure why in this PR the tests did in fact pass.

do the CI tests currently run in such a way that they use an account/token for OSSI access? even if this odd error is chased down, i think the additional tests will still require token access to ever succeed. i was thinking we could refactor the dockerfiles to use env vars to feed that to ahab, but that would mean the CI system needs configured house secrets and devs need .envrc or some way to consistently set the same. is this worthwhile, or should i just pair back the tests?

Yah as I alluded to above I dont believe we have anything like that setup from what I know. Maybe @bhamail or @DarthHater can go poking and looking at the circleCI config magic for us?? But that could be a good addition if we dont have anything to get around this.

@bhamail
Copy link
Contributor

bhamail commented Oct 23, 2020

do the CI tests currently run in such a way that they use an account/token for OSSI access? even if this odd error is chased down, i think the additional tests will still require token access to ever succeed. i was thinking we could refactor the dockerfiles to use env vars to feed that to ahab, but that would mean the CI system needs configured house secrets and devs need .envrc or some way to consistently set the same. is this worthwhile, or should i just pair back the tests?

CircleCI has a nice mechanism for passing secrets via environment variables. We use it often.

Related to Issue #27:
Also, if we decide to, we could leverage Viper to read the ossi credentials (e.g. by default, the same file used by nancy (see nancy link below). Can override in tests to some other file, or env var, etc). We already use Cobra commands, and Viper integrates nicely with Cobra. Here's some of what nancy does to read credentials via Viper. Viper supports the following precedence for reading values:

  1. explicit call to Set
  2. flag
  3. env
  4. config
  5. key/value store
  6. default

@bhamail
Copy link
Contributor

bhamail commented Oct 26, 2020

I'll see if I can make some progress with adding Viper to ahab

@bhamail
Copy link
Contributor

bhamail commented Nov 3, 2020

I merged the Viper/config stuff into master.

@bhamail bhamail merged commit 147c794 into sonatype-nexus-community:master Dec 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Auto Detect v2
3 participants