Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multisession modeling functionality for AutoLFADS #152

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

claytonwashington
Copy link

No description provided.

…r update at 2023-10-03 16:52:49.268505. Purpose: test. AMI: ami-05ddab33dd696e7d0
…r update at 2023-10-04 11:00:56.540733. Purpose: setup before first single run. AMI: ami-076b9ecc29502b344
…r update at 2023-10-09 17:07:12.925453. Purpose: upgraded CUDA. AMI: ami-0295578a40774d483
…r update at 2023-10-09 17:40:31.179587. Purpose: updated cuda and got sklearn. AMI: ami-0163b17cdfa51a9f5
…r update at 2023-10-10 17:08:18.822096. Purpose: final test on local single and multi. AMI: ami-0ccfff108ae8de464
@cellistigs
Copy link
Collaborator

Confirmed unit tests passed on local machine. Moving on to tests with AWS SAM.

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Sam package built/Template validated.

$ cd ~/neurocaas/ncap_iac/ncap_blueprints/iac_utils/
$ bash fullbuild.sh ../autolfads-torch
$ sam validate --template-file 

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Built AWS sam package successfully (docker failing before- updated fullbuild.sh to use appropriate context.)

@cellistigs
Copy link
Collaborator

Need multisession parameter for this analysis to pass in config file.

@cellistigs
Copy link
Collaborator

cellistigs commented Oct 27, 2023

Tested standard implementation. Across various errors (different environment variables, misspecified arns) preserved behavior of shutting down instances as expected.

@cellistigs
Copy link
Collaborator

At the moment, ncap_iac/protocols/submit_start.py has both process_upload_dev and process_upload_multisession. It appears these are identical save for the submission class passed to them. It would be good to refactor this so we can pass a submission object- since these functions take care of error handling having a single function wherever possible will simplify testing in the future.

@cellistigs
Copy link
Collaborator

The following functionality has been tested for reliability of cloud management systems, validated through sam local invoke calls. Note that we only check MainLambda here, as the behavior of the postprocessing lambda file does not have to change.

  • failing to pass a multisession parameter
  • passing multisession:False with a single dataset in the inputs folder.
  • passing multisession:True with a single dataset in the inputs folder.
  • passing multisession:True with datasets in multisession_test (directory passed to submit file) (launches one instance)
  • passing multisession:False with datasets in multisession_test (directory passed to submit file) (launches one instance)
  • passing multisession:True with datasets in multisession_test (individual files passed to submit file) (launches two instances)
  • passing multisession:False with datasets in multisession_test (individual files passed to submit file) (launches two instance)

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

We have some issues with logging- at least on AWS-SAM, we only ever see messages from logging start of the first instance. Check this in deployment. Note we have only checked the above functions for cloud reliability, so the above formats should be tested for expected functionality in all combinations of inputs listed above.

We should also refactor: At the moment, ncap_iac/protocols/submit_start.py has both process_upload_dev and process_upload_multisession. It appears these are identical save for the submission class passed to them. It would be good to refactor this so we can pass a submission object- since these functions take care of error handling having a single function wherever possible will simplify testing in the future.

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

1 similar comment
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-10-31 14:18:09.494446. Purpose: multisession entry no longer needed in config. AMI: ami-0699e307bc81482db
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-10 15:49:37.826420. Purpose: fixed lfads-torch import. AMI: ami-0893ac69ba200a546
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-16 13:55:30.184299. Purpose: fixed issues with access for root/SSM user. AMI: ami-017465330c3757649
…r update at 2023-11-16 14:25:10.865461. Purpose: root path fix fix?. AMI: ami-0754772739da3a57d
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

…r update at 2023-11-28 17:12:13.330582. Purpose: fixed single dataset run; command broken for multisession. AMI: ami-024a73c9849ee99e4
…r update at 2023-12-01 21:09:51.895815. Purpose: new multisession post-tutorial. AMI: ami-0825772e0ac2e67c8
…r update at 2023-12-05 22:16:10.678782. Purpose: successful ssm test on multisession via nc_contrib hotfix 51. AMI: ami-0cebcfea203997fcf
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

Copy link
Collaborator

@cellistigs cellistigs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes here look good- however I have some suggestions regarding code structure and logging.

ncap_iac/protocols/submit_start.py Outdated Show resolved Hide resolved
ncap_iac/protocols/submit_start.py Outdated Show resolved Hide resolved
ncap_iac/protocols/submit_start.py Outdated Show resolved Hide resolved
ncap_iac/protocols/submit_start.py Show resolved Hide resolved
@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

@cellistigs
Copy link
Collaborator

#deploy:autolfads-torch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants