0.26.4

@determined-ci determined-ci released this 17 Nov 21:45
· 1376 commits to main since this release

Release Notes

0.26.4

Changelog

  • bf665ae chore: bump version: 0.26.4-rc4 -> 0.26.4
  • 2f86950 docs: add release notes for 0.26.4 (#8451)
  • f2ef0fe chore: bump version: 0.26.4-rc3 -> 0.26.4-rc4
  • f0a37a9 fix: Calculate allocation bar stats same as overview [WEB-1822] (#8431)
  • 9acfbf2 chore: bump version: 0.26.4-rc2 -> 0.26.4-rc3
  • 9dd0211 fix: k8s autoscaling nodes not counted towards RP (#8439)
  • 3bf6647 chore: bump version: 0.26.4-rc1 -> 0.26.4-rc2
  • 47397a4 fix: new experiment list tooltip styling (#8433)
  • 680ac02 ci: fix linting with responses==0.24.1 (#8436)
  • d4200d2 chore: add version dropdown url for previous release (#8437)
  • 0a4b6bc test: fix model registry rbac wrong user regression (#8420)
  • 242ff97 fix: Wrap older modals in theme class [WEB-1824] (#8432)
  • e3c109a chore: bump version: 0.26.4-rc0 -> 0.26.4-rc1
  • 22e18ae fix: replace antd select with hew select (#8424)
  • 00d349a feat: add workspace/project creation/deletion (#8430)
  • 9f727fe feat: client gets list_models, too. (#8425)
  • b8c1be7 chore: update Column and Row from Hew (#8412)
  • 4e6fd52 chore: bump version: 0.26.4-dev0 -> 0.26.4-rc0
  • e9a457d chore: lock published urls to preserve redirects
  • 2fae9ba chore: add docs dropdown link for new version
  • 6c3bf84 chore: make insert-dropdown-url.sh executable (#8418)
  • b5ca7f4 chore: fail deployment if launching part of the service fails (#8409)
  • 8498674 fix: allow --json in det master config CLI command (#8413)
  • d123932 fix: Place modal inside of ResourcePoolCard (#8414)
  • ff19924 chore: Add eslint rule for ?? operator (#8410)
  • d56b3ae chore: convert DOS line endings to Unix (#8411)
  • c1219eb fix: Hide stats card when 0 on cluster page (#8359)
  • da77efb fix: added permission check on GetAllocation (#8281)
  • 3b0550c chore: Bumpenvs 0.26.4 (#8407)
  • e48d03d fix: user flag to prompt for password during user requests (#8158)
  • 513e6d7 fix: Project and Workspace cards wrap modal divs (#8378)
  • 2497d84 chore: export AddUserTx (#8403)
  • ad764f0 refactor: implement Glossary component from Hew (#8385)
  • 52326d1 feat: change cli command for patch master log config DET[9720] (#8054)
  • 1e9155d chore(type): stricter tsconfig (#8349)
  • 16f18cc chore: revert task obfuscation lint failures (#8406)
  • dde3156 chore: Implement Theming updates in Determined [WEB-1726] (#8388)
  • 4edfc3c ci: move packaging test to test-e2e-longrunning (#8381)
  • d3c208a ci: cache go modules deps and build cache (#8383)
  • 8924996 chore: temporarily disable CI upload job (#8399)
  • 356f651 Revert "chore: temporarily disable upload_test_results job step"
  • 6dd9701 chore: temporarily disable upload_test_results job step
  • ba49dbd ci: up parallelism for slowest test_e2e premerge tests (#8374)
  • 5f3e556 ci: finish removing growforest (#8389)
  • 62084e2 fix: NTSC task and slot viewing obscured for RBAC users with no Viewer Permissions (#8311)
  • 0254f7d chore: fix nil ptr on allocation.Proto() (#8372)
  • 119e759 chore: fix profiler test in CI (#8382)
  • b428d5e feat: add hide column header menu item to explist (#8342)
  • 7ae0501 chore: update the lore service port (#8375)
  • 052cf8d feat: Cluster historical usage charts move to UI Kit LineChart [WEB-1786] [WEB-1764] (#8327)
  • 819948d feat: clear filter from experiment table header (#8376)
  • a590999 test: fix slow delete_checkpoint test (#8377)
  • b0505db chore: Job/task displays Running instead of Scheduled (#8335)
  • 1d64941 chore: short dsat e2e tests (#8288)
  • 6afa836 chore: fix CI mnist_pytorch (#8364)
  • 4d3eaab chore: Update Horovod Cycle Time (#8362)
  • d3b01cb docs: Add det pach tutorial (#8082)
  • 7cebc30 fix: adjust card size on workspaces page (#8370)
  • 5c93cb0 chore: enable more Go linters (#8333)
  • a279967 fix: aws deployment can deploy priority scheduler (#8345)
  • 3d9293c fix: fixed bug in error handling in experiment.go (#8339)
  • 194bfd5 fix: Cell can be undefined in experiment list table (#8360)
  • 1da92aa chore: bump environment images to ubuntu 18.04 [MLG-1194] (#8356)
  • 990c56f chore: add list_experiments to experimental.client (#8361)
  • 3a7d9ea fix(tests): lower e2e_gpu_quarantine parallelism (#8363)
  • 4c48458 fix: patched remote users were able to login with password (#8337)
  • baf5c96 chore: port over PyTorch example to use Trainer API [MLG-1181] (#8292)
  • 235bd8f feat: delete TB files from the SDK (#8329)
  • 2fe3d99 chore: update Typography from UI kit (#8323)
  • 2b23674 fix: prevent carriage return in env from crashing deepspeed launcher (#8321)
  • 461c307 chore: Remove DesignKit since it's now maintained in Hew [WEB-1790] (#8338)
  • 5ee87ec fix: Set group name and number columns to handle Safari [DET-9948] [DET-9949] (#8355)
  • 10deef9 fix(experiments): transient errors shouldn't leave trial hung (#8352)
  • 512b9f3 chore: remove accidental mock commit (#8354)
  • 9d17dbf feat: Show "-" for null values in data cells for experiment list (#8343)
  • ea50987 fix: properly interpret flag values (#8326)
  • 8b6fc68 fix: Allow SAML and OIDC logins to work differently [WEB-1797] (#8308)
  • 274288e docs: fix linting failure (#8351)
  • 73bf0e8 docs: log policies (#8302)
  • 8418029 chore: ft slot capacity check for each trial [DET-9897] (#8213)
  • 494ca57 fix: replace TODO with ctx for deleteTensorboard (#8332)
  • cfde2f6 docs: Docs Version Dropdown Automation (#8340)
  • 8e69941 chore: Remove examples/legacy (#8153)
  • af995ba fix: cli is not a library! (#7891)
  • bf0a03d test: fix ray.air.session import. (#8344)
  • 9bb10cc ci: mypy fix for responses>=0.24.0 (#8341)
  • b924b25 fix: add pin icon in dropdown (#8324)
  • 62b7f3b chore: remove fit-content from TimeAgoc (#8328)
  • 86d6962 chore: update determined-ui to hew (#8334)
  • f580385 fix: metric group charts have more than one color (#8304)
  • 1966373 feat: Add tensorboard delete command to CLI (#8227)
  • 656c8b2 chore: bump version: 0.26.3-dev0 -> 0.26.4-dev0
  • af43248 docs: add release notes for 0.26.3 (#8322)
  • b262a3d chore: Update lore.yaml to use the new version
  • d64a0ac chore: use a single .golangci.yml file (#8320)
  • ad94d20 chore: Add progress bar from UI Kit [WEB-1675] (#8181)
  • b3b5be0 feat: implement CodeSample from UI Kit [WEB-1677] (#8270)
  • d723b7f docs: fix typo in user edit release note (#8319)
  • 6e5d840 chore: initial experiment actor refactor (#8229)
  • 8a1ff58 chore: use a single root level go mod (#8285)
  • 3511abf chore: delete dead code (#8313)
  • d0e6375 chore: add a new deployment type for aws (#8279)
  • 3929e8c chore(actors): remove ctx usage in agent_state.go (#8267)
  • 5bf1b87 ci: delete broken wait_for helper (#8312)
  • 50535f1 test: quarantine GPU execution of test_task_logs (#8261)
  • d5b8e80 chore: deployment's --dry-run option doesn't print template (#8303)
  • ac89d44 fix: allow experiments with directory checkpoint storage to parse (#8310)
  • 306c0c3 fix: Project info not presists when forking (#8307)
  • dc1b131 chore: sort out issues after bringing EE e2e_tests into OSS (#8084)
  • d182abe chore: slurm support for blocklist (#1111)
  • efdf62b fix: return correct location URL for /Users SCIM API endpoint (#1115)
  • 37a84d1 fix: ruamel.yaml fixes for EE
  • 0ce925a chore: Update nightly tests that use legacy cifar10_pytorch (#1102)
  • 6cad296 fix: update for error message change in product (#1098)
  • 1e302b5 chore: update e2e tests affected by examples_pruning (#1100)
  • ad3dcda chore: cleanup model registry rbac test
  • a3ffb5d test: enable command run tests for PBS (#1073)
  • 9dd0e42 test: enable command and deepspeed tests run on slurm/pbs (#1044)
  • ea4f4c4 chore(templates): ee fixes for template rbac
  • c48e48d fix: Test test_slurm_verify_home fails with podman and it shouldn't [FE-136] (#1028)
  • 760a738 test: Add pytorch2 distributed e2e tests on slurm [FE-168] (#1007)
  • b5aee79 chore: use longer running no op experiment when seeding workspace (#994)
  • facbda9 test: run test_hpc_job_pending_reason only on gcp vm (#996)
  • 393d0b5 ci: FE-133 Configure non agent slurm/pbs tests to skip without explicitly listing test names in circleci. (#977)
  • fd15535 ci: add ee-only files to the import-restrictions linter exclusions.
  • 9cf2a26 test: slurm/pbs test for pending reason (FE-90) (#960)
  • b3c2ca3 chore(actors): allocation.go, ee side
  • eb7d1a1 test: [ALLGCP] Add e2e test for HPC that verifies that user HOME is preserved (#972)
  • f3a8b0e test: fix test_slurm.py lint error (#949)
  • 71896f3 chore: FE-91: Update base images (slurm/pbs) to include a populated singularity_image_cache (#943)
  • 5891567 feat: add rbac to api/v1/master/config [DET-9633] (#931)
  • 0a5c32e ci: FE-72: Add test-e2e-pbs-*-gcp tests (#941)
  • 4c233c0 feat: add rbac for strict job queue control (#927)
  • 6e23aa2 chore: removed admin dependency from delete model/version (#912)
  • 7c6c59e feat: rbac for templates (#909)
  • e89cc08 ci: DET 9622: (ee) test_slurm.py::test_cifar10_pytorch_distributed failures (#919)
  • c6ee094 fix: test_rbac goes to wrong url (#918)
  • 6b34e0d fix: DET-9483 successfully run e2e_slurm_preemption tests as part of nightly workflow (#903)
  • 4f6277d ci: FE-14 Migrate test-e2e-slurm to GCP slurmcluster (#879)
  • f4507f7 tests: fix a miss indentation leading to missing project err (#878)
  • 5cad9bb chore: fix a missing check for global permissions in jq (#874)
  • bca3848 feat: add rbac support for reading job queue (#871)
  • 5c79474 chore: update how we wait for tasks to be ready (#863)
  • b292862 test: fix test_master_host [DET-9482]. (#851)
  • 5375a08 ci: quarantine flaky slurm tests (#850)
  • 8e44c6c fix: Patch groups test [DET-9473] (#845)
  • 49d2e08 fix: fix bug with launching tensorboards on trials (#842)
  • d4dcbe5 test: Fix and add e2e_slurm_preemption tests to nightly workflow [DET-9237] (#806)
  • 1529b9c style: update py binding references for ee
  • 06f5554 feat: implement rbac for master logs and cluster usage (#745)
  • a399707 chore: fix api usage after oss update.
  • e47c22a ci: checkpoint loading is for unit tests (#754)
  • 7b13cc7 feat: add rbac agents/slots enable/disable [DET-9156] (#751)
  • 0546ae9 fix: rbac e2e test (#738)
  • e44f464 chore: add e2e test for preemption on hpc cluster (FOUNDENG-497) (#726)
  • 81b7bf3 fix: Fail attempts to mount under /run/determined on HPC [FOUNDENG-482] (#710)
  • 483705c chore: use ported test code (#701)
  • e588dba fix: user can only list models with correct permissions + small fixes in workspace filtering in get models (#681)
  • c55777d chore: fix ntsc fix authz order tb [DET-8885] (#667)
  • f111780 fix: websocket upgrade failed in tensorboard [DET-8903] (#672)
  • 7adb55b fix: tensorboard list not showing tensorboards [DET-8904] (#669)
  • d4ce809 fix: ntsc endpoints should return 404 on unauthorized workspace id [det-8911] (#671)
  • ecd3d18 test: add dtrain test to e2e_slurm (#664)
  • 72c9f93 chore: repair tensorboard e2e tests (#666)
  • af77d76 feat: rbac ntsc ee (#662)
  • 9a7c8b9 feat: RBAC Model Registry EE features [DET-8704] (#644)
  • 42ef7d0 chore: integration tests to verify that slurm jobs are restored on master restart (FOUNDENG-216) (#600)
  • 8ce0fbc chore: Fix lint errors in Slurm integration tests (#647)
  • 7cb0cb3 chore: rbac ntsc supporting changes (#641)
  • 44257db fix: FOUNDENG-336 test_noop_hpc continues to fail periodically (#632)
  • cc0b450 chore: Add test-e2e-slurm-podman (#543)
  • bf7be54 fix: Improve test_noop_hpc reliability [FOUNDENG-361] (#590)
  • bc8c46f fix: Reduce verbosity of failure messages [FOUNDENG-370] (#583)
  • 89e296a fix: stop printing incorrect (exit code 1) for failed command. (#588)
  • 037ff4d test: adding tests for Oauth in EE (#582)
  • ee60ddf chore: Add e2e slurm test for env var quoting [FOUNDENG-366] (#579)
  • aebd7ed chore: Add test-e2e-slurm suite using enroot containers (#574)
  • fe07d9a fix: FOUNDENG-310 test_noop_pause_hpc needs timeout increase to avoid random failures (#539)
  • ab5fafe ci: Addition of znode runners (#417)
  • b00a7f3 fix: FOUNDENG-303 Pausing, then resuming an experiment fails (#533)
  • 80fabf0 test: disable restart on expected failure case. (#528)
  • cd286c9 chore: Fix test_node_not_available test [FOUNDENG-304] (#517)
  • 9f27bed test: update expected error messages. (#526)
  • 71bd2c6 chore: Disable test_node_not_available [FOUNDENG-304] (#512)
  • 51d4427 chore: Disable test_node_not_available [FOUNDENG-304] (#510)
  • 109cf24 chore: consume experiment PBS & Slurm batch args (#472)
  • b4a6223 chore: generalize message for Slurm/PBS. (#463)
  • 21ffa89 test: enable test_launch for e2e_slurm. (#389)
  • d0e8021 test: Enable test case on slurm (FOUNDENG-171) (#385)
  • a8cbeca test: Enable logging tests on slurm (#367)
  • 4a29f69 fix: e2e_test test_slurm.py test_node_not_available fails on CPU based cluster (Mosaic) due to different Error output (FOUNDENG-132) (#364)
  • b71c672 ci: slurm ci (#342)
  • 4fc1908 chore: Provide Slurm job submission failure test cases (FOUNDENG-86) (#321)
  • 564a714 feat: add support for SCIM provisioning
  • 143ff3c fix: adjust width size in group table (#8309)
  • d8eb61b chore: fix job service panic when workspace does not exist (#8306)
  • bf177da feat: new user management filters (#8002)
  • 30c7681 chore: stop using root logger (#8294)
  • 9d4c8e4 fix: check externalConfig is enabled before setting det_jwt as auth header (#8298)
  • 79a324b fix: undefined handling in CreateGroupModal (#8301)
  • d265530 feat: added a new cli command to recover hp search experiments (#8149)
  • c5efc30 docs: quick fix for version dropdown (#8300)
  • 5a8a912 fix: prevent settings store from triggering rerenders on poll (#8295)
  • 7772fb3 ci: update wrapper config to run on tags (#8296)