-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTA-1177: Gather OSUS data #416
Merged
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
#!/bin/bash | ||
|
||
source $(dirname "$0")/common.sh | ||
BASE_COLLECTION_PATH="must-gather" | ||
OSUS_OPERATOR_NAME="update-service-operator" | ||
get_log_collection_args | ||
|
||
HAS_OSUS=$(oc get csv -A --no-headers -o custom-columns=NS:.metadata.namespace,OPERATOR:.metadata.name --ignore-not-found=true | awk '/'${OSUS_OPERATOR_NAME}'/ {print $1}') | ||
if [ -z "$HAS_OSUS" ]; then | ||
exit 0 | ||
fi | ||
|
||
function gather_osus() { | ||
echo "INFO: Collecting OSUS Data" | ||
declare -a OSUS_NAMESPACES=(${HAS_OSUS}) | ||
|
||
for OSUS_NAMESPACE in "${OSUS_NAMESPACES[@]}"; do | ||
oc adm inspect ${log_collection_args} --dest-dir "${BASE_COLLECTION_PATH}" "ns/${OSUS_NAMESPACE}" | ||
oc adm inspect --dest-dir "${BASE_COLLECTION_PATH}" updateservices --namespace "${OSUS_NAMESPACE}" | ||
done | ||
} | ||
|
||
gather_osus | ||
|
||
# force disk flush to ensure that all data gathered are written | ||
sync |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a hard direction for an OCP-central must-gather to support. I thought the OLM-installed operator pattern was for each operator to ship a separate image with the must-gather-for-them logic (as described in the KCS you'd linked from OTA-1177, and also in these docs), so the central must-gather maintainers didn't have to be bothered reviewing gather logic for all the many, many, OLM-installed operators that could possibly be present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking , while I agree with your comment for most of the operators that can be installed in OpenShift, I think creating a full image for this operator is excessive.
Other operators provide extra capacities to the cluster, and usually involve several different CRDs and even several namespaces. This operator helps with one of the core capacities of OpenShift, which is the upgrade (in this case, for disconnected clusters), and lots of cases already includes a must-gather proactively. The info for this operator (when installed) shouldn't increase a lot the size of the must-gather, and will avoid asking for a new different must-gather.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wking , any thoughts on the above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oarribas / @wking the only thing this collects are the details from a given namespace[s]. I think this is fine (given its related to update); but @soltysh or @ingvagabund should have the final say if this is provided by this image or an independent image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@oarribas how big are the extra manifests? Can you share an example of running both commands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ingvagabund , checking data from some cases, it depends a lot on the volume of the logs from the pods.
Largest inspect of the namespace I have seen, without compression is 30MB, and compressed between 2-3MB. The size of the
updateservice
resource is few KB.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ingvagabund , @soltysh , any thoughts based on the above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some time we have been merging extra collection with a promise of collecting extra "small" data that help to avoid asking customers to run yet another must gather image. More relevant in the disconnected environment. Yet, we don't track how much of these extra collections increase the overall must-gather size on average. Before going further I'd like to see when all the extra collections are collected. E.g. what's the estimated worst case extra bump in the collected data. A new section under https://github.com/openshift/must-gather/blob/master/README.md will do. E.g.
Extra collections
ns/openshift-azure-operator
orns/openshift-azure-logging
presentThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ingvagabund , the estimated size for this one (for the OSUS operator), when compressed, is around 3MB (usually less as per several inspect of the
openshift-update-service
namespace I have reviewed). And it's only collected if OSUS is installed.Regarding the above for the estimated "worst" case, things like ARO and vSphere cannot be collected at the same time if the conditions are OK.