
Added feature to export the evaluation summary table in csv format #229

Open
wants to merge 4 commits into master

Conversation


@Drita-ai Drita-ai commented Oct 6, 2024

This PR implements the feature to export the evaluation summary table in csv format to the outputs folder.

Usage: Run the code with python3 main.py -i ./samples/sample4 and the CSV files will be written inside outputs/Evaluation.

Fixes #221

@Drita-ai Drita-ai changed the title Feat/evaluation table to csv [Feature] Export the evaluation summary table in csv format Oct 6, 2024
@Drita-ai Drita-ai changed the title [Feature] Export the evaluation summary table in csv format Added feature to export the evaluation summary table in csv format Oct 6, 2024

@Udayraj123 Udayraj123 left a comment


Thanks for the PR @Drita-ai! Please check my comments.

Comment on lines 366 to 368
evaluation_json = open_evaluation_with_validation(self.path)

if evaluation_json["options"].get("enable_evaluation_table_to_csv", False):
Owner

We should already have access to options in the constructor; no need to load the JSON file again.

self.should_explain_scoring = options.get("should_explain_scoring", False)
self.enable_evaluation_table_to_csv = options.get("enable_evaluation_table_to_csv", False)

Comment on lines 357 to 362
def conditionally_print_explanation(self, file_id):
if self.should_explain_scoring:
console.print(self.explanation_table, justify="center")

self.explanation_to_csv(file_id)
self.explanation_table_data_for_csv = []
Owner

Instead, can we directly call a function conditionally_save_explanation_csv() from the parent, wherever conditionally_print_explanation() is currently called?
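A minimal sketch of the call-site shape this points at, assuming the parent is evaluate_concatenated_response (the parameter names are hypothetical at this stage and only settle later in the review):

# Hypothetical call-site sketch: keep printing and CSV export as two separate,
# explicitly invoked steps in the parent instead of nesting the CSV write
# inside conditionally_print_explanation().
evaluation_config.conditionally_print_explanation()
evaluation_config.conditionally_save_explanation_csv(file_path, evaluation_output_dir)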


output_dir = os.path.join(
os.path.dirname(os.getcwd()),
f"OMRChecker/outputs/Evaluation/{processed_img_name}.csv",
Owner

OMRChecker actually works on multiple directories, so we can't use hardcoded paths. You need to access outputs_namespace.paths (create an evaluation path there) and pass it into this function from the parent (evaluate_concatenated_response).
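A rough sketch of that direction, with names (evaluation_dir, file_path) that mirror the later revision of the PR and should be read as assumptions:

# Hypothetical sketch: resolve the output location from the run's
# outputs_namespace.paths instead of hardcoding the repository layout,
# then pass it down from the parent (evaluate_concatenated_response).
evaluation_output_dir = outputs_namespace.paths.evaluation_dir
output_path = os.path.join(evaluation_output_dir, f"{file_path.stem}_evaluation.csv")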

@@ -505,9 +538,10 @@ def conditionally_add_explanation(
if item is not None
]
self.explanation_table.add_row(*row)
self.explanation_table_data_for_csv.append(row)
Owner

Instead of appending the row here, can you check if self.explanation_table.rows can be looped over later? It would reduce the coupling.
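A minimal sketch of what reading the data back out of the rich table could look like (Column._cells is a private attribute of rich; the PR's later revision relies on it, so treat this as an assumption rather than a stable API):

# Hypothetical sketch: reconstruct header names and rows from the table's
# own columns instead of keeping a parallel explanation_table_data_for_csv list.
headers = [col.header for col in self.explanation_table.columns]
rows = list(zip(*(col._cells for col in self.explanation_table.columns)))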

@@ -517,6 +551,6 @@ def evaluate_concatenated_response(concatenated_response, evaluation_config):
)
current_score += delta

evaluation_config.conditionally_print_explanation()
evaluation_config.conditionally_print_explanation(file_id)
Owner

let's call a separate function instead of coupling this logic with the dedicated small functions.

@@ -44,6 +46,11 @@ def setup_dirs_for_paths(paths):
logger.info(f"Created : {save_output_dir}")
os.makedirs(save_output_dir)

for save_output_dir in [paths.evaluation_dir]:
if os.path.exists(save_output_dir):
shutil.rmtree(save_output_dir)
Owner

let's not remove any of the user's output. Conditionally create the directory instead, similar to the other lines here.

if self.should_explain_scoring:
console.print(self.explanation_table, justify="center")

self.explanation_to_csv(file_id)
self.explanation_table_data_for_csv = []
Owner

We can also populate explanation_table_data_for_csv directly by looping over self.explanation_table.rows if that's available. That would make the code even cleaner.


Drita-ai commented Oct 7, 2024

Thanks @Udayraj123 for the review and for pointing out the areas for improvement. I'll make the necessary changes as you've suggested.


Drita-ai commented Oct 9, 2024

Hey @Udayraj123, I've made the changes you asked for. Please consider reviewing them.


@Udayraj123 Udayraj123 left a comment


Good improvement on previous comments. Please check these comments as well.

Comment on lines 48 to 54
for save_output_dir in [paths.evaluation_dir]:
if not os.path.exists(save_output_dir):
logger.info(f"Created : {save_output_dir}")
os.makedirs(save_output_dir)

for save_output_dir in [paths.multi_marked_dir, paths.errors_dir]:
if not os.path.exists(save_output_dir):
Owner

Suggested change
-    for save_output_dir in [paths.evaluation_dir]:
-        if not os.path.exists(save_output_dir):
-            logger.info(f"Created : {save_output_dir}")
-            os.makedirs(save_output_dir)
-
-    for save_output_dir in [paths.multi_marked_dir, paths.errors_dir]:
-        if not os.path.exists(save_output_dir):
+    for save_output_dir in [paths.multi_marked_dir, paths.errors_dir, paths.evaluation_dir]:
+        if not os.path.exists(save_output_dir):

# Explanation Table to CSV
def conditionally_save_explanation_csv(self, evaluation_path):
if self.enable_evaluation_table_to_csv:
data = {col.header: col._cells for col in self.explanation_table.columns}
Owner

Nice. Can you add a screenshot of what the output csv looks like now?

Author

Screenshot from 2024-10-13 17-10-38
Screenshot from 2024-10-13 17-11-53

Comment on lines 364 to 371
def conditionally_save_explanation_csv(self, evaluation_path):
if self.enable_evaluation_table_to_csv:
data = {col.header: col._cells for col in self.explanation_table.columns}

output_dir = os.path.join(
os.getcwd(),
f"{evaluation_path}.csv",
)
Owner

let's use evaluation_output_dir here

Suggested change
-    def conditionally_save_explanation_csv(self, evaluation_path):
-        if self.enable_evaluation_table_to_csv:
-            data = {col.header: col._cells for col in self.explanation_table.columns}
-
-            output_dir = os.path.join(
-                os.getcwd(),
-                f"{evaluation_path}.csv",
-            )
+    def conditionally_save_explanation_csv(self, file_path, evaluation_output_dir):
+        if self.enable_evaluation_table_to_csv:
+            data = {col.header: col._cells for col in self.explanation_table.columns}
+
+            output_path = os.path.join(
+                evaluation_output_dir,
+                f"{file_path.stem}_evaluation.csv",
+            )
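The suggestion stops before the actual write; a hedged sketch of how the collected dict could be flushed to disk, assuming pandas is available (the dict-of-columns shape fits it, but the PR does not show the write call here):

import pandas as pd

# Hypothetical completion: `data` maps column headers to lists of cell values,
# which is the shape pandas.DataFrame accepts directly.
pd.DataFrame(data).to_csv(output_path, index=False)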

src/entry.py Outdated
Comment on lines 279 to 281
score = evaluate_concatenated_response(
omr_response, evaluation_config, evaluation_path
)
Owner

Suggested change
-score = evaluate_concatenated_response(
-    omr_response, evaluation_config, evaluation_path
-)
+score = evaluate_concatenated_response(
+    omr_response, evaluation_config, file_path, evaluation_output_dir
+)

src/entry.py Outdated
@@ -209,6 +209,9 @@ def process_files(
for file_path in omr_files:
files_counter += 1
file_name = file_path.name
evaluation_path = os.path.join(
outputs_namespace.paths.evaluation_dir, file_path.stem
Owner

let's directly pass evaluation_output_dir (= outputs_namespace.paths.evaluation_dir) into the function below
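A small sketch of what the simplified call in entry.py would look like under that suggestion (hypothetical, mirroring the suggested signature above):

# Hypothetical sketch: no per-file evaluation_path is built in entry.py;
# the evaluation output directory and the file path are passed down instead.
score = evaluate_concatenated_response(
    omr_response,
    evaluation_config,
    file_path,
    outputs_namespace.paths.evaluation_dir,
)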


Successfully merging this pull request may close these issues.

[Feature] Export the evaluation summary table in csv format to the outputs folder
2 participants