Speed up validate-json.sh script #417

Open
luceleaftea opened this issue Jun 20, 2024 · 1 comment
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@luceleaftea
Contributor

The validate-json.sh script is slow, and can likely be sped up by parallelizing the various subscripts it runs.
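As a rough illustration of the parallelization idea, the sketch below starts several commands as background jobs and waits for all of them, reporting failure if any job fails. `run_parallel` is a hypothetical helper, not something in the repository, and the `echo` commands are placeholders for the real validation calls.

```shell
#!/bin/bash
# Hypothetical helper: run each argument as a separate background job,
# then wait for all of them and return 1 if any job failed.
run_parallel() {
    local pids=() status=0 cmd pid
    for cmd in "$@"; do
        bash -c "$cmd" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        # wait returns the exit status of the given job
        wait "$pid" || status=1
    done
    return "$status"
}

# Placeholder commands standing in for the individual validation steps.
run_parallel "echo validating abilities" "echo validating cards"
```

Output from parallel jobs can interleave, so a real script would probably want to collect each job's output separately before printing it.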

@luceleaftea luceleaftea added enhancement New feature or request help wanted Extra attention is needed labels Jun 20, 2024
@RRunner1337

RRunner1337 commented Aug 4, 2024

There is a simple solution that will probably cut the processing time roughly in half.

The pajv library can process multiple files in one call, as long as they all MUST conform to a single schema file. This does not parallelize anything, but it reduces the number of calls to pajv.

The library accepts glob patterns in the -d argument of the call. Hence, change helper-scripts/json-validation/validate-json.sh to:

#!/bin/bash

[ -x ./node_modules/pajv/index.js ] || npm i

# Validate every file matching the glob in $2 against the schema in $1.
# The glob is passed quoted so that pajv, not the shell, expands it;
# unquoted, the shell would expand it and $2 would hold only the first match.
function validate_json {
    ./node_modules/pajv/index.js validate -s "$1" -d "$2" || exit $?
}

SECONDS=0

validate_json ../../json-schema/ability-schema.json "../../json/*/ability.json"
validate_json ../../json-schema/art-variation-schema.json "../../json/*/art-variation.json"
validate_json ../../json-schema/artist-schema.json "../../json/*/artist.json"
validate_json ../../json-schema/card-schema.json "../../json/*/card.json"
validate_json ../../json-schema/card-flattened-schema.json "../../json/*/card-flattened.json"
validate_json ../../json-schema/card-face-association-schema.json "../../json/*/card-face-association.json"
validate_json ../../json-schema/card-reference-schema.json "../../json/*/card-reference.json"
validate_json ../../json-schema/edition-schema.json "../../json/*/edition.json"
validate_json ../../json-schema/foiling-schema.json "../../json/*/foiling.json"
validate_json ../../json-schema/icon-schema.json "../../json/*/icon.json"
validate_json ../../json-schema/keyword-schema.json "../../json/*/keyword.json"
validate_json ../../json-schema/legality-schema.json "../../json/*/@(banned|living-legend|suspended|restricted)-*.json"
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-commoner.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-upf.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/living-legend-blitz.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/living-legend-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-blitz.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-commoner.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/restricted-ll.json
validate_json ../../json-schema/rarity-schema.json "../../json/*/rarity.json"
validate_json ../../json-schema/set-schema.json "../../json/*/set.json"
validate_json ../../json-schema/type-schema.json "../../json/*/type.json"

#validate_json ../../json-schema/ability-schema.json ../../json/french/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/french/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/french/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/french/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/french/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/french/set.json
#validate_json ../../json-schema/type-schema.json ../../json/french/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/german/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/german/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/german/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/german/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/german/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/german/set.json
#validate_json ../../json-schema/type-schema.json ../../json/german/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/italian/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/italian/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/italian/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/italian/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/italian/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/italian/set.json
#validate_json ../../json-schema/type-schema.json ../../json/italian/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/spanish/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/spanish/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/spanish/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/spanish/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/spanish/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/spanish/set.json
#validate_json ../../json-schema/type-schema.json ../../json/spanish/type.json

echo "JSON validation took: $SECONDS seconds"
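Since the point is for pajv to receive the glob pattern itself, it matters whether the shell expands the pattern first. The sketch below shows the difference using a throwaway directory; `count_args` is a hypothetical helper used only for illustration. With an unquoted glob, a function that reads only `$2` would see just the first matching file.

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

# Build a throwaway layout with one card.json per language directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/english" "$tmp/french"
touch "$tmp/english/card.json" "$tmp/french/card.json"

# Unquoted: the shell expands the glob, so each match is its own argument.
count_args "$tmp"/*/card.json     # prints 2

# Quoted: the pattern is passed through as a single argument.
count_args "$tmp/*/card.json"     # prints 1

rm -rf "$tmp"
```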

This should help considerably. A further optimization would be a dedicated Docker Node.js image for running the JSON validation, built for example like this:

FROM node:16.4.2-alpine as base

WORKDIR /usr/local/app

COPY package.json /usr/local/app/package.json
COPY package-lock.json /usr/local/app/package-lock.json

RUN npm i

ENTRYPOINT [ "./node_modules/pajv/index.js" ]

This file should reside in helper-scripts/json-validation/Dockerfile and be built using:

docker build --no-cache -f Dockerfile -t <docker-image-name> .

from the same directory.

The image would then be used instead of installing the required development tools locally, and the validation would run through the command line (maybe even fixing issue #407 in the process):

docker run --rm -i -v "./:/data" <docker-image-name-from-build-step> validate -s <path-to-schema-file-in-docker> -d <glob-to-json-files-in-docker>

The validate-json.sh script would need to be adapted so that validation runs against the Docker volume (different paths, ...).
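One way the adapted helper might look is sketched below. It assumes the repository root is mounted at /data and that schema and data paths are given relative to that root. `IMAGE`, `REPO_ROOT`, and `DRY_RUN` are hypothetical names introduced for this sketch; with `DRY_RUN=1` the command is only printed, which makes it easy to check the paths before running anything.

```shell
#!/bin/bash
# Assumptions (not from the repository): IMAGE is a placeholder for the image
# name chosen at build time, and REPO_ROOT is the directory mounted at /data.
IMAGE="${IMAGE:-json-validator}"
REPO_ROOT="${REPO_ROOT:-$(pwd)}"

# $1: schema path relative to the repository root
# $2: quoted glob of data files relative to the repository root
validate_json() {
    local cmd=(docker run --rm -i -v "$REPO_ROOT:/data" "$IMAGE"
               validate -s "/data/$1" -d "/data/$2")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}" || exit $?
    fi
}

DRY_RUN=1 validate_json json-schema/card-schema.json "json/*/card.json"
```

Using an absolute host path for the volume avoids depending on relative bind-mount support, which varies between Docker versions.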

In my local environment, validation time dropped from 59 seconds to 19 seconds (about a 68% reduction) using a script that ran the local Docker image built from the Dockerfile above, so I would assume that a reduction of roughly 60% is a realistic estimate.
