Config lefthook #25

Merged
merged 22 commits into from
Aug 24, 2024

Conversation

@SalihuDickson SalihuDickson commented Aug 24, 2024

Summary by Sourcery

Refactor the TES and WES converters to use Pydantic for data validation, improving data integrity and error handling. Introduce new Pydantic models for TES and WES data. Enhance the CI workflow to run tests conditionally and trigger on all branches. Add unit tests for WRROC models and validators. Configure pre-push hooks using Lefthook for code linting with Ruff.

New Features:

  • Introduce Pydantic models for TES and WES data validation, enhancing data integrity and error handling in the conversion process.
  • Add a new lefthook.yml configuration file to set up pre-push hooks for code linting with Ruff.

Enhancements:

  • Refactor TES and WES converters to use Pydantic for data validation, improving code reliability and maintainability.
  • Update the CLI to use double quotes for consistency in option definitions and improve readability.
  • Enhance the CI workflow by adding a condition so that tests run only if the previous steps succeed.

CI:

  • Modify the CI workflow to trigger on all branches instead of just the main branch, allowing for more comprehensive testing across different development branches.

Tests:

  • Add unit tests for WRROC models and validators to ensure correct functionality and data validation.

sourcery-ai bot commented Aug 24, 2024

Reviewer's Guide by Sourcery

This pull request introduces significant changes to the project structure and functionality, focusing on improving data validation, error handling, and code organization. The changes include the addition of new models for WRROC, TES, and WES data structures, implementation of validators, updates to existing converters, and the introduction of a pre-push hook for code linting.

File-Level Changes

Change: Implemented Pydantic models for data validation
Details:
  • Created WRROC models (WRROCProcess, WRROCWorkflow, WRROCProvenance)
  • Added TES models for task execution data
  • Introduced WES models for workflow execution data
Files:
  • crategen/models/wrroc_models.py
  • crategen/models/tes_models.py
  • crategen/models/wes_models.py

Change: Updated converters to use new Pydantic models
Details:
  • Refactored TESConverter to use TESData model for validation
  • Updated WESConverter to use WESData model for validation
  • Improved error handling in converters
Files:
  • crategen/converters/tes_converter.py
  • crategen/converters/wes_converter.py

Change: Introduced validators for WRROC data
Details:
  • Implemented validate_wrroc function to determine WRROC profile
  • Added validate_wrroc_tes for TES-specific validation
  • Created validate_wrroc_wes for WES-specific validation
Files:
  • crategen/validators.py

Change: Updated CI workflow and added pre-push hook
Details:
  • Modified CI workflow to run on all branches
  • Added Ruff linter check to CI process
  • Implemented pre-push hook using Lefthook for Ruff linting
Files:
  • .github/workflows/ci.yml
  • lefthook.yml

Change: Refactored and improved existing code
Details:
  • Updated CLI implementation for better error handling
  • Improved utility functions for date/time conversions
  • Refactored AbstractConverter for consistency
Files:
  • crategen/cli.py
  • crategen/utils.py
  • crategen/converters/abstract_converter.py

Change: Added unit tests for new functionality
Details:
  • Created unit tests for WRROC models and validators
Files:
  • tests/unit/test_wrroc_models.py

@sourcery-ai sourcery-ai bot left a comment

Hey @SalihuDickson - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟡 Testing: 1 issue found
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

volumes,
logs,
tags,
) = validated_tes_data.dict().values()

suggestion: Use a more robust method for extracting validated data

Using .dict().values() is potentially fragile if the order of fields in the Pydantic model changes. Consider using named attributes instead, e.g., validated_tes_data.id, validated_tes_data.name, etc. This would make the code more robust and easier to understand.

Suggested change
) = validated_tes_data.dict().values()
id = validated_tes_data.id
name = validated_tes_data.name
description = validated_tes_data.description
executors = validated_tes_data.executors
inputs = validated_tes_data.inputs
outputs = validated_tes_data.outputs
volumes = validated_tes_data.volumes
logs = validated_tes_data.logs
tags = validated_tes_data.tags
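
To illustrate why positional unpacking is fragile, here is a minimal sketch. It uses standard-library dataclasses as a stand-in for the Pydantic model so that it runs without extra dependencies; `TaskV1`, `TaskV2`, and both helper functions are hypothetical names that do not appear in the PR.

```python
from dataclasses import dataclass, asdict


@dataclass
class TaskV1:
    # Hypothetical stand-in for a validated model with three fields.
    id: str
    name: str
    description: str


@dataclass
class TaskV2:
    # Same data, but `description` is now declared ahead of `name`.
    id: str
    description: str
    name: str


def unpack_by_position(task):
    # Fragile: the result depends on the declaration order of the fields.
    _task_id, name, _description = asdict(task).values()
    return name


def read_by_attribute(task):
    # Robust: the result is independent of declaration order.
    return task.name


v1 = TaskV1(id="t1", name="align", description="run aligner")
v2 = TaskV2(id="t1", description="run aligner", name="align")

print(unpack_by_position(v1), "|", read_by_attribute(v1))
print(unpack_by_position(v2), "|", read_by_attribute(v2))
```

With the reordered class, the positional version silently returns the description in place of the name, while attribute access is unaffected.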

Comment on lines 194 to 200
def test_validate_wrroc_tes_missing_fields(self):
    """
    Test that validate_wrroc_tes raises a ValueError if required fields for TES conversion are missing.
    """
    data = {"id": "process-id", "name": "Test Process"}
    with self.assertRaises(ValueError):
        validate_wrroc_tes(data)

suggestion (testing): Consider adding more specific test cases for TES validation

While this test checks for missing fields, it might be beneficial to add more specific test cases that check each required field individually. This could help pinpoint exactly which field validations might fail in the future.

Suggested change
def test_validate_wrroc_tes_missing_fields(self):
    """
    Test that validate_wrroc_tes raises a ValueError if required fields for TES conversion are missing.
    """
    data = {"id": "process-id", "name": "Test Process"}
    with self.assertRaises(ValueError):
        validate_wrroc_tes(data)
def test_validate_wrroc_tes_missing_fields(self):
    """Test validate_wrroc_tes raises ValueError for missing required fields."""
    required_fields = ['id', 'name', 'description', 'executors']
    for field in required_fields:
        data = {f: "value" for f in required_fields if f != field}
        with self.subTest(f"Missing {field}"):
            with self.assertRaises(ValueError):
                validate_wrroc_tes(data)
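
The per-field approach can be tried in isolation with the self-contained sketch below. `validate_required` is a hypothetical stand-in for `validate_wrroc_tes`, and the error message format is an assumption. The sketch also checks that the error names the omitted field, which is one way to confirm the failure comes from the field under test.

```python
import unittest

REQUIRED_FIELDS = ["id", "name", "description", "executors"]


def validate_required(data):
    # Hypothetical stand-in for validate_wrroc_tes: rejects missing fields.
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return data


class TestRequiredFields(unittest.TestCase):
    def test_each_missing_field(self):
        for field in REQUIRED_FIELDS:
            # Build a payload that lacks exactly one required field.
            data = {f: "value" for f in REQUIRED_FIELDS if f != field}
            with self.subTest(missing=field):
                with self.assertRaises(ValueError) as ctx:
                    validate_required(data)
                # The error should name the field that was left out.
                self.assertIn(field, str(ctx.exception))


def run_suite():
    # Run the test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRequiredFields)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Because each iteration runs inside `subTest`, a failure for one field is reported separately and does not stop the remaining fields from being checked.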

logs,
tags,
) = validated_tes_data.dict().values()
end_time = validated_tes_data.logs[0].end_time

# Convert to WRROC
wrroc_data = {

issue (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)
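
As a rough illustration of what this rule asks for, the sketch below shows the same function before and after inlining. The function names and the dictionary keys are hypothetical and are not taken from the converter code.

```python
def build_record_verbose(run_id, state):
    # Binds the dict to a name only to return it on the next line.
    record = {"run_id": run_id, "state": state}
    return record


def build_record_inline(run_id, state):
    # Same behaviour with the temporary name removed.
    return {"run_id": run_id, "state": state}


print(build_record_verbose("r1", "COMPLETE") == build_record_inline("r1", "COMPLETE"))
```

The temporary name can still be worth keeping when it documents what the value is or when it helps while debugging.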

start_time = wrroc_data.get("startTime", "")
end_time = wrroc_data.get("endTime", "")
def convert_from_wrroc(self, data):
# Validate WRROC data

issue (code-quality): We've found these issues:


Explanation
Python has a number of builtin variables: functions and constants that
form a part of the language, such as list, getattr, and type
(See https://docs.python.org/3/library/functions.html).
It is valid, in the language, to re-bind such variables:

list = [1, 2, 3]

However, this is considered poor practice.

  • It will confuse other developers.
  • It will confuse syntax highlighters and linters.
  • It means you can no longer use that builtin for its original purpose.

How can you solve this?

Rename the variable to something more specific, such as integers.
In a pinch, my_list and similar names are colloquially-recognized
placeholders.
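
A small sketch of the third point, using `id` purely as an illustrative builtin (the review text above does not show which name was flagged). Both function names are hypothetical.

```python
def describe_shadowed(id):
    # The parameter `id` hides the builtin id() inside this function,
    # so id(...) here tries to call the string argument instead.
    try:
        return id("x")
    except TypeError as exc:
        return f"TypeError: {exc}"


def describe_renamed(task_id):
    # With a more specific name the builtin stays usable.
    return f"{task_id} has object id {id(task_id)}"


print(describe_shadowed("process-1"))
print(describe_renamed("process-1"))
```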

Comment on lines 54 to 59
wes_data = {
"run_id": run_id,
"run_log": {
"name": name,
"start_time": start_time,
"end_time": end_time,
},
"run_log": {"name": name, "start_time": start_time, "end_time": end_time},
"state": state,
"outputs": [{"location": res.get("@id", ""), "name": res.get("name", "")} for res in result_data],
"outputs": [{"location": res.id, "name": res.name} for res in result_data],
}

issue (code-quality): Inline variable that is immediately returned (inline-immediately-returned-variable)


if content_is_set:
    values["url"] = None
elif not content_is_set and not url_is_set:

suggestion (code-quality): Remove redundant conditional (remove-redundant-if)

Suggested change
elif not content_is_set and not url_is_set:
elif not url_is_set:
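
The simplification is safe because the `elif` branch is only reached when `content_is_set` is already false. The sketch below checks this over all four input combinations; the return labels and the trailing fallback branch are hypothetical, since the snippet above does not show the branch bodies.

```python
from itertools import product


def original(content_is_set, url_is_set):
    if content_is_set:
        return "clear-url"
    elif not content_is_set and not url_is_set:
        return "neither-set"
    return "url-only"


def simplified(content_is_set, url_is_set):
    if content_is_set:
        return "clear-url"
    elif not url_is_set:
        # content_is_set is known to be False here, so no need to re-test it.
        return "neither-set"
    return "url-only"


# Compare both versions across every combination of the two flags.
all_equal = all(
    original(c, u) == simplified(c, u)
    for c, u in product([False, True], repeat=2)
)
print(all_equal)
```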

Comment on lines 68 to 72
missing_fields = [
    field for field in required_fields if getattr(validated_data, field) is None
]

if missing_fields:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
missing_fields = [
    field for field in required_fields if getattr(validated_data, field) is None
]
if missing_fields:
if missing_fields := [
    field
    for field in required_fields
    if getattr(validated_data, field) is None
]:
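
For comparison, here is a runnable sketch of both forms side by side. The assignment expression (`:=`) requires Python 3.8 or later. The field list, the `SimpleNamespace` objects standing in for the validated model, and the error message are all assumptions made for illustration.

```python
from types import SimpleNamespace

required_fields = ["id", "name", "executors"]


def check_two_step(obj):
    # Assign first, then test the list for truthiness.
    missing_fields = [f for f in required_fields if getattr(obj, f) is None]
    if missing_fields:
        raise ValueError(f"Missing required fields: {', '.join(missing_fields)}")


def check_walrus(obj):
    # Assign and test in one expression; the name stays bound in the body.
    if missing_fields := [f for f in required_fields if getattr(obj, f) is None]:
        raise ValueError(f"Missing required fields: {', '.join(missing_fields)}")


def message(check, obj):
    # Helper: return the error text, or None if validation passes.
    try:
        check(obj)
    except ValueError as exc:
        return str(exc)
    return None


incomplete = SimpleNamespace(id="t1", name=None, executors=None)
complete = SimpleNamespace(id="t1", name="align", executors=["bash"])

print(message(check_two_step, incomplete))
print(message(check_walrus, incomplete))
print(message(check_walrus, complete))
```

Both versions behave the same; the choice is mostly stylistic, and the two-step form may read more easily when the comprehension spans several lines.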

Comment on lines 103 to 107
missing_fields = [
    field for field in required_fields if getattr(validated_data, field) is None
]

if missing_fields:

suggestion (code-quality): Use named expression to simplify assignment and conditional (use-named-expression)

Suggested change
missing_fields = [
    field for field in required_fields if getattr(validated_data, field) is None
]
if missing_fields:
if missing_fields := [
    field
    for field in required_fields
    if getattr(validated_data, field) is None
]:

@SalihuDickson SalihuDickson changed the base branch from main to models August 24, 2024 22:58
@SalihuDickson SalihuDickson merged commit 3dc009a into models Aug 24, 2024
2 checks passed
@SalihuDickson SalihuDickson deleted the config-lefthook branch August 24, 2024 23:34