Releases: determined-ai/determined
0.37.0
Release Notes
Changelog
- c415087 chore: bump version: 0.37.0-rc4 -> 0.37.0
- 736fba6 docs: add release notes for 0.37.0 (#9995)
- 73dee98 docs: fix broken links (#9996)
- ecf8ac7 chore: bump version: 0.37.0-rc3 -> 0.37.0-rc4
- 1b50305 fix: fix default id search for runs (#9988)
- 0990c11 chore: bump version: 0.37.0-rc2 -> 0.37.0-rc3
- a78b190 fix: fix hf on_save raise exception (#9977)
- 0560939 fix: bring in handleEmptyCell from #9963 (#9984)
- 7caf18a chore: bump version: 0.37.0-rc1 -> 0.37.0-rc2
- 08d782a fix: show search progress in run table (#9976)
- 478c78f fix: Cluster page height (#9975)
- 2772a3c fix: correct `dataPath` for hyperparameters (#9971)
- 94f2d95 chore: bump version: 0.37.0-rc0 -> 0.37.0-rc1
- 63e7df0 chore: 0.37.0 environment images (#9967)
- b2267d1 chore: bump version: 0.37.0-dev0 -> 0.37.0-rc0
- f758303 chore: lock published urls to preserve redirects
- 2a8e7dd chore: lock api state for backward compatibility check
- 3f54d07 chore: bump version: 0.36.1-dev0 -> 0.37.0-dev0
- baf451f chore: do not log error for resource pools with zero agents (#9960)
- 6a8606e docs: Add hpc installation guide (#9945)
- 3241edb fix: fix flaky generic task pause test (#9962)
- 43556e9 fix: Remove CSS rule for hiding the Form.Item error message (#9872)
- 5906001 perf: improve the initial page load speed (#9939)
- eb1b0de docs: Add workload alerting (#9938)
- cedfcfe chore: refactor and test RBAC config policies work [CM-530] (#9943)
- 2d884b9 docs: Add cluster overview (#9936)
- e17d12c feat: release notes and improvements for workload alerting (#9944)
- 0db2e3b ci: deflake make slurmcluster, hopefully (#9957)
- 95f079d feat: add GET global config policies API (#9952)
- d943d85 chore: fix global PUT for task config policies (#9941)
- 410edf6 fix: broken MNIST download in e2e tests (#9937)
- 004c194 ci: fix flaky test_allocation_csv tests (#9953)
- 88a4c67 feat: add Config Policies GET API and modify CRUD functions to accept both Workload types (#9946)
- a73c8db test: debug auth [TESTENG-95] (#9942)
- 13db674 test: experiment list show archived filter [ET-753] (#9932)
- 02e302f chore: remove unused languages from code editor (#9898)
- f6d874d docs: Replace slack links (#9919)
- 26b0954 chore: implement Delete config policies API handlers (#9927)
- 2d12be1 test: add projects tests [CM-467] (#9928)
- 062cb52 fix: use different modules for Trial and Cluster topology (#9917)
- 0928958 chore: change log level for log retention policies (#9935)
- b559467 chore: bump coverage target (#9920)
- 3a2ea56 fix: do not filter slots for mixed-slot-type pools (#9902)
- a58ed7c chore: reassign RM code to CM in CODEOWNERS (#9926)
- cb3515e fix: update LogRetentionDays from master config when master starts/upgrades (#9930)
- 13b7b3f ci: increase timeout for k8s intg tests (#9929)
- 6f36969 fix: flaky workspace test (#9931)
- 867eb31 fix: update huggingface example (#9925)
- 5b2275f fix: Refactor sorting logic in WorkspaceProjects for filtering projects (#9903)
- fd7f77a fix: move validation dataloader check in PyTorchTrial [MD-515] (#9923)
- db2881f chore: fix config policy unmarshal tests (#9924)
- 3900742 chore: update test log pattern webhook cache (#9922)
- f44687d chore: create config policies table and add NTSC CRUD operations (#9915)
- de89f68 feat: support updating web hook url [MD-482] (#9890)
- 02fbdbb fix: huggingface callback raise process preempted exception (#9913)
- 8c799b8 chore: prune cruft out of no_op fixture (#9912)
- 11de119 chore(deps): bump path-to-regexp and express in /webui/react (#9909)
- 03961b5 test: add workspace tests (#9905)
- c877383 fix: GetTrialRemainingLogRetentionDays should take global log retention days into account [CM-518] (#9914)
- fb0d5f9 fix: change workspace name and set resource quota simultaneously (#9847)
- 8fb9f6b docs: Update ROCM support (#9893)
- 481bddb chore(deps): bump github.com/docker/docker from 24.0.9+incompatible to 25.0.6+incompatible (#9780)
- c1499ac chore: removing model_hub references from Makefile (#9901)
- c961dbd feat: new run object for Run Centric API (#9897)
- bfeb418 feat: Implement custom trigger for webhooks (#9879)
- b6eb05e chore: Remove model hub (#9869)
- 4a28c10 chore: add unmarshal functions for task config policies (#9896)
- d842383 fix: timezone handling error in queued allocation time update (#9892)
- 55b3f9b test: cover project id filtering on bulk actions [ET-138] (#9870)
- 036477b chore: stub new APIs for task config policies [CM-485] (#9880)
- be2622a test: Delete workspace after webhook test (#9891)
- a30bc25 feat: Add rbac for config policies (#9873)
- 8c83d31 chore: create WorkloadType enum and Go config + constraints structs (#9885)
- 0a18c5a fix: add backwards compatibility for Pods to Jobs for k8s <v1.27 [CM-461] (#9878)
- 8e6bba8 ci: fix master-config syntax (#9889)
- d5d647a fix: inconsistent timezone handling in daily allocation aggregation (#9888)
- b4209ef test: login redirect with nested route (#9881)
- 8cacba6 ci: add e2e bulk kill test (#9868)
- 590c362 fix: Hf callback metric naming (#9887)
- 61fd26b fix: reset Model Registry page number on pageload [ET-640] (#9876)
- ce27f81 fix: show `-` for empty data in run table (#9871)
- b1c0814 fix: prevent `hyperparameter search modal` submitting the same request multiple times (#9883)
- d54713c fix: use new ruamel yaml APIs (#9882)
- ad5fe5a fix: prevent out of bounds navigation on new list views (#9875)
- a605f00 fix: reject reconnecting agents with different resource pool configuration (#9815)
- db92bad feat: Support RBAC in webhook (#9859)
- 0ef81aa fix: sorting by arbitrary metadata (#9874)
- c1b7767 feat: Auto-Populate POSIX Information on sign in using SSO [CM-399] (#9755)
- 54b6165 feat: Logic of different modes for webhook (#9865)
- a773551 fix: allow for objects inside array metadata to be typed properly (#9864)
- ee269c8 test: successful login with weak or strong password (#9858)
- e21fc6f ci: pin chromadb version to avoid incompatibility (#9849)
- a1234a1 chore: bump version: 0.36.0-dev0 -> 0.36.1-dev0
- d79c90d chore: add docs dropdown link for new version
- ce6da74 docs: add release notes for 0.36.0 (#9854)
- a55af74 fix: use task sessions in Core API [MD-509] (#9860)
- 3ee88bb fix: replace tree with code mirror for metadata view (#9853)
- 8dd46d5 chore: Improve CompareTrials perfomance (#9807)
- 6e08303 fix: fix error toast popping up in Workpace Creator view (#9855)
- fb95df8 chore: add backport github action (#9835)
- a37e6e7 fix: prevent loading issues with ipynb files (#9850)
- 9de4f72 feat: configurable preemption timeout [MD-500] (#9833)
- 640126b feat: Add workspaceId, mode, name to webhook (#9820)
- d436c23 fix: reset pinned column state when resetting columns (#9852)
- 3a91552 fix: fix fallback logic for partially provided custom logos (#9842)
- 707ad07 Revert "chore: add tracing info to some backend APIs" (#9843)
- 73a756a fix: update broken tensorflow & certbot links (#9846)
- 771bbe4 ci: sequential metric count sweep test [Scale-35] (#9791)
- 32fafdd perf: remove duplicate ids in `ExpMetricNames` api (#9848)
- a8fa015 docs: Fix broken links (#9845)
- 2b1856a fix: model version name overflow on mobile [ET-384] (#9827)
- e13de20 docs: Document rbac editorprojectrestricted role (#9844)
- 2838af4 chore: add tracing info to some backend APIs (#9841)
- e3dfb0a fix: change filter form to say "Show runs" in flat runs view [ET-740] (#9840)
- 52f2b9f chore: add release notes for PR 9822 (#9837)
- a37d482 fix: experiment single trial tabs don't scroll on load (#9831)
- aff486c feat: Rocm bumpenvs (#9830)
- 13622ad feat: Add `report_progress` to `TrainContext` (#9826)
- d831461 fix: replace rawsource attribute with node directly, due to removal of rawsource in Docutil 2.0 (#9838)
- 7ed9e83 feat: add EOL notice regarding Aurora V1 & Postgres 12 along with Master Log warnings for Postgres <=12 [CM-413] [CM-416] (#9832)
- 5c5f107 docs: Minor docs enhancements (#9836)
0.36.0
Release Notes
Changelog
- c349314 chore: bump version: 0.36.0-rc7 -> 0.36.0
- 39db2a8 docs: add release notes for 0.36.0 (#9854)
- 61538a2 chore: bump version: 0.36.0-rc6 -> 0.36.0-rc7
- 9494823 fix: fix error toast popping up in Workpace Creator view (#9855)
- bd33228 chore: bump version: 0.36.0-rc5 -> 0.36.0-rc6
- fa155de chore: bump version: 0.36.0-rc4 -> 0.36.0-rc5
- 9332ab9 chore: 0.36.0 environment images (#9851)
- 838cafe Revert "chore: add tracing info to some backend APIs" (#9843)
- 1e2447d chore: bump version: 0.36.0-rc3 -> 0.36.0-rc4
- f70a03d fix: update broken tensorflow & certbot links (#9846)
- e3695a9 perf: remove duplicate ids in `ExpMetricNames` api (#9848)
- 101441d docs: Fix broken links (#9845)
- 8e28493 docs: Document rbac editorprojectrestricted role (#9844)
- 9e73cd3 chore: bump version: 0.36.0-rc2 -> 0.36.0-rc3
- 8acaee5 chore: add tracing info to some backend APIs (#9841)
- 46a400e fix: change filter form to say "Show runs" in flat runs view [ET-740] (#9840)
- 119d544 chore: bump version: 0.36.0-rc1 -> 0.36.0-rc2
- 5affb09 chore: add release notes for PR 9822 (#9837)
- 21bc083 feat: Rocm bumpenvs (#9830)
- 26f8ed2 chore: bump version: 0.36.0-rc0 -> 0.36.0-rc1
- 89d5ddb fix: replace rawsource attribute with node directly, due to removal of rawsource in Docutil 2.0 (#9838)
- d58ff68 feat: add EOL notice regarding Aurora V1 & Postgres 12 along with Master Log warnings for Postgres <=12 [CM-413] [CM-416] (#9832)
- 4be07af docs: Minor docs enhancements (#9836)
- 34b567e chore: bump version: 0.36.0-dev0 -> 0.36.0-rc0
- e11629b chore: lock published urls to preserve redirects
- 6e0b9d1 chore: lock api state for backward compatibility check
- e1a2273 chore: bump version: 0.35.1-dev0 -> 0.36.0-dev0
- 42c2efa docs: Docs cleanup (#9834)
- 3ed0a39 docs: Make docs consistent with run centric ux (#9824)
- a367cd0 chore: deprecate Custom Searcher [MD-504] (#9829)
- f7846cb feat: allow users with role Viewer and above to view resource quotas (#9822)
- 97353c9 fix: Group and User management (CM-436) (#9825)
- 358ed28 fix: hide metadata section if there's no metadata (#9823)
- 287f3be chore: unskip flaky test (#9819)
- e85ac89 Clarify basic data lineage to mldm (#9828)
- c0ca659 fix: checkpoint table action menu shouldn't vanish on polling [ET-277] (#9812)
- 740b0e7 docs: Describe basic lineage steps (#9813)
- e5d4b7f chore: initial k8s rocm support [CM-367] (#9794)
- 9548790 chore: fix torch version to 2.2.2 for intel mac (#9821)
- b2a82e8 chore: deprecate kubernetes priority w/ preemption scheduler (#9763)
- 2002bf0 docs: Getting a list of files in a checkpoint (#9818)
- 91d0b67 docs: Fix broken links (#9816)
- e357849 fix: don't ignore failures during experiment shutdown (#9693)
- 9b96416 test: add go unit tests for experiment bulk actions [ET-138] (#9658)
- 92a7ff5 feat: support filter by metadata with string type (#9810)
- 9da5620 feat: exclude `Array` type columns (#9808)
- 79ffa52 chore: bump version: 0.35.0-dev0 -> 0.35.1-dev0
- 9949ab0 chore: add docs dropdown link for new version
- 261e2e7 docs: add release notes for 0.35.0 (#9786)
- a11e9e8 chore(deps): bump torch from 1.11.0 to 2.3.0 (#9726)
- bebaf17 fix: make navigation sidebar scrollable [ET-633] (#9803)
- f7e18fc fix: prevent multiple calls to time-series on compare view select (#9805)
- db98c4f ci: Add a portable testing framework and scalability tests [SCALE-29] (#9762)
- 9702d22 fix: prevent extra initial calls to search endpoints (#9782)
- 4e47a1e chore: change the comment for defaultNamespace in values.yaml (#9793)
- d3f3e76 test: datagrid action pause flake (#9802)
- 1f7473c fix: return proper error message when moving a project with a matching names (#9795)
- 15d1a60 ci: fix scripting for `make slurmcluster` job (#9801)
- 8173cab fix: forked from link (#9798)
- c3400df feat: add editor project restricted role and testing [DET-10428] (#9796)
- 2cb1022 test: base model package dependency update [TESTENG-59] (#9777)
- 4f31942 test: omnibar tree-extension tests [ET-203] (#9783)
- cdbbedd fix: don't filter single runs in the comparison view (#9789)
- 80822eb ci: label `make slurmcluster` instances for cloud spend [CM-405] (#9792)
- ea589d8 chore: fix readme typo (#9797)
- 7b4f01c fix: Add loading indicator when creating HP search (#9774)
- a4d74af chore: readme should include codecov (#9787)
- 786f258 fix: uncomment helm values (#9790)
- a034964 fix: fixed helm chart values and master-config.yaml (#9788)
- fe14062 feat: show metadata in run table (#9776)
- 2b589c4 feat: add array column type for abitrary metadata (#9759)
- 094c58b test: skip flaky test (#9784)
- 49c3fa0 chore: add a utility for connecting devcluster to remote k8s clusters (#9739)
- 13ebf47 chore: add Cluster Name title and change helm value (#9775)
- 61aad78 fix: fix contains filter for hyperparameters and metadata (#9779)
- 15226b7 feat: add master config option to provide custom logo (#9664)
- f42daca feat: make groups scope optional to support azure with OIDC (#9773)
- 6105b3f docs: fix insecure link to systemd docs (#9772)
- 068b959 feat: checkpoint view for flat runs [ET-658] (#9769)
- dab6978 feat: add code tab to run page [ET-657] (#9771)
- 2c91098 test: use previously created experiment for pause test (#9727)
- 935799d fix: use run checkpoint data instead of experiment for run table filter (#9767)
- 30d6e79 fix: extract searcher metric from experiment payload (#9768)
- b8c6773 fix: fix missing task_stats start_time on restored allocation (#9745)
- a094ea1 chore: pin numpy version and upgrade sphinx [MD-468] (#9736)
- 0806597 feat: add Metadata section to TrialDetailsOverview (ET-224) (#9639)
- 287faf7 chore: bumpenv pin numpy to 1.x [MD-470] (#9748)
- becd8b6 chore: remove RM Name from RP descriptions (#9758)
- fc8ac0b chore: undo test skip after fix was merged (#9754)
- de898c9 Revert "chore: add configurable posix claims fields to master config [RM-398]" (#9753)
- 623c945 fix: load trial data for single run searches in search view #9742 (#9752)
- 41a512e fix: debounce searches column width settings #9700 (#9751)
- bc721bf refactor: change 'close' to 'save' on button in ManageJob modal [DET-10446] (#9750)
- 0ce2ff1 fix: change external_run_id to string type in FlatRun proto (#9749)
- 20ed126 fix: reduce the number of api calls from Workspace Create Modal (#9735)
- 61bc7bb chore: add configurable posix claims fields to master config [RM-398] (#9690)
- 2cdfdf9 fix: change external_run_id to string type in FlatRun proto (#9744)
- 36aaed7 chore: fine-tune error and help messages of CLI commands for slot caps (#9743)
- 0df7ad3 test: workspace and project tests [TESTENG-60] (#9740)
- e00d9f4 chore: add release note for ComparisonView bugfix (#9741)
- 35ec077 chore: add 'masterService.annotations' to Helm (#9697)
- 5f8dae3 chore: fix exp delete log msg (#9716)
- 9dc0afa fix: deadlock issue (#9728)
- f8067ba chore: skip failing Deactivate and reactivate user test (#9723)
- 9efb216 feat: CLI command to list the members of a Workspace [RM-388] (#9686)
- dc12336 chore: lengthen abbreviation to avoid ambiguity (#9733)
- e3524b7 chore: add release notes for metrics fetching UI bug (#9737)
- 719f8be chore: update copy when f_flat_runs is on (#9642)
- 6c4f69b test: workspace and project api [TESTENG-46] (#9731)
- 4aa6ffa docs: Add release docs for continue trial, edit hp search, resource a… (#9729)
- d46d776 fix: use before/after search params for historic allocation CSV download endpoint [DET-10442] (#9730)
- a32b010 fix: show selections in ComparisonView on any page (ET-189) (#9694)
- 7260f04 chore: default flat runs to on (#9709)
- 202ab62 fix: Endless fetching for cancelled experiment without metrics (#9714)
- 4466c33 feat: change search-experiments from GET to POST [ET-602] (#9717)
- 787a2f3 docs: Fix workspace cli doc (#9720)
- c3ca1d4 docs: Describe link to mldm data (#9718)
0.35.0
Release Notes
Changelog
- 7d1b0df chore: bump version: 0.35.0-rc20 -> 0.35.0
- e770ee5 docs: add release notes for 0.35.0 (#9786)
- 7f03a87 chore: bump version: 0.35.0-rc19 -> 0.35.0-rc20
- 3c9a188 fix: prevent multiple calls to time-series on compare view select (#9805)
- c65c6cd chore: bump version: 0.35.0-rc18 -> 0.35.0-rc19
- da58c92 fix: prevent extra initial calls to search endpoints (#9782)
- 8074fd9 chore: bump version: 0.35.0-rc17 -> 0.35.0-rc18
- 6fed766 chore: change the comment for defaultNamespace in values.yaml (#9793)
- 6d9b780 chore: bump version: 0.35.0-rc16 -> 0.35.0-rc17
- f02b6b5 fix: forked from link (#9798)
- 7928af1 chore: bump version: 0.35.0-rc15 -> 0.35.0-rc16
- 1841a8e fix: don't filter single runs in the comparison view (#9789)
- 0b89fad chore: bump version: 0.35.0-rc14 -> 0.35.0-rc15
- c451482 fix: uncomment helm values (#9790)
- f041440 chore: bump version: 0.35.0-rc13 -> 0.35.0-rc14
- f144957 fix: fixed helm chart values and master-config.yaml (#9788)
- bfe7912 chore: bump version: 0.35.0-rc12 -> 0.35.0-rc13
- 2794cdc chore: add Cluster Name title and change helm value (#9775)
- 720dcbb chore: bump version: 0.35.0-rc11 -> 0.35.0-rc12
- 94f916d fix: fix contains filter for hyperparameters and metadata (#9779)
- 501d45c chore: bump version: 0.35.0-rc10 -> 0.35.0-rc11
- 44e4786 feat: checkpoint view for flat runs [ET-658] (#9769)
- e1ff8bb chore: bump version: 0.35.0-rc9 -> 0.35.0-rc10
- 5314c58 feat: add code tab to run page [ET-657] (#9771)
- f23ca4c chore: bump version: 0.35.0-rc8 -> 0.35.0-rc9
- 2b1f0e7 fix: use run checkpoint data instead of experiment for run table filter (#9767)
- b750c30 fix: extract searcher metric from experiment payload (#9768)
- 461434e chore: bump version: 0.35.0-rc7 -> 0.35.0-rc8
- 5cb9a32 fix: fix missing task_stats start_time on restored allocation (#9745)
- ca4df77 chore: bump current environment image versions to 0.35.0 (#9760)
- b0b9d84 chore: bumpenv pin numpy to 1.x [MD-470] (#9748)
- 375244d Revert "chore: 0.35.0 images (#9732)"
- bd4af9d chore: remove RM Name from RP descriptions (#9758)
- 22c5ae9 chore: bump version: 0.35.0-rc6 -> 0.35.0-rc7
- 2b76ac8 refactor: change 'close' to 'save' on button in ManageJob modal [DET-10446] (#9746)
- f739956 fix: load trial data for single run searches in search view (#9742)
- d1520a4 chore: bump version: 0.35.0-rc5 -> 0.35.0-rc6
- ed47fb0 fix: reduce the number of api calls from Workspace Create Modal (#9735)
- e345871 fix: change external_run_id to string type in FlatRun proto (#9744)
- 8ef93f8 chore: fine-tune error and help messages of CLI commands for slot caps (#9743)
- c17fc72 chore: add release note for ComparisonView bugfix (#9741)
- 1dbdf00 chore: bump version: 0.35.0-rc4 -> 0.35.0-rc5
- 27b8dbd fix: deadlock issue (#9728)
- f939bc4 chore: bump version: 0.35.0-rc3 -> 0.35.0-rc4
- c554aec chore: lengthen abbreviation to avoid ambiguity (#9733)
- e173bdb chore: add release notes for metrics fetching UI bug (#9737)
- b6051af chore: update copy when f_flat_runs is on (#9642)
- 1dddfa3 chore: bump version: 0.35.0-rc2 -> 0.35.0-rc3
- 40e1f56 chore: 0.35.0 images (#9732)
- f68e36e docs: Add release docs for continue trial, edit hp search, resource a… (#9729)
- a341f9e chore: bump version: 0.35.0-rc1 -> 0.35.0-rc2
- 9f5fefb fix: use before/after search params for historic allocation CSV download endpoint [DET-10442] (#9730)
- 9947eba fix: show selections in ComparisonView on any page (ET-189) (#9694)
- 76dc02f chore: bump version: 0.35.0-rc0 -> 0.35.0-rc1
- 5e22807 chore: default flat runs to on (#9709)
- f84d4eb fix: Endless fetching for cancelled experiment without metrics (#9714)
- 556de9c feat: change search-experiments from GET to POST [ET-602] (#9717)
- 8be3e85 docs: Fix workspace cli doc (#9720)
- ca375ad docs: Describe link to mldm data (#9718)
- d375364 chore: bump version: 0.35.0-dev0 -> 0.35.0-rc0
- 408a609 chore: bump version: 0.34.1-dev0 -> 0.35.0-dev0
- 24feb35 chore: add release notes for workspace slot caps (#9706)
- 3fbc8c0 chore: lock published urls to preserve redirects
- 248a1ba chore: lock api state for backward compatibility check
- fbb5d24 test: skip flaky test until after release (#9719)
- 12fc71c feat: add workspace namespace bindings and resource quotas to workspaces (#9180)
- 769d600 docs: basic lineage release notes (#9708)
- 0703e8a chore: saas authz changes for rbac (#9657)
- bd59ced fix: disallow slots in exp config [MD-454] (#9698)
- 3995b72 test: create Pause experiment action test [ET-644] (#9699)
- 004d9d0 chore: fix makeslurm workload option reference (#9645)
- 4f81548 fix: assign only run in a single run experiment as best_trial_id (#9051)
- 543380d fix: show the correct empty page in flat run table when filters are applied (#9702)
- d5b6181 docs: Describe workspace slot level caps (#9687)
- 1255d07 feat: auto-redirect to SSO provider when expired remote session detected (DET-10392) (#9613)
- 95c636c test: move wait statement to shared code (#9701)
- def5e34 feat: obfuscate data.secrets if present in experiment config [DET-10232] (#9635)
- 8776e35 fix: unmanaged experiment checkpoint storage path (#9625)
- e000e70 docs: use postgres 14 in code snippets. (#9691)
- 4a0eb5f fix: Remove experiments immediately after deleting by filtering out deleting experiments (#9688)
- f672a88 fix: use existed templates when launching notebook (#9681)
- f721751 test: experiment list sort [INFENG-766] (#9675)
- ca90a63 fix: highlight `Data Input` link (#9676)
- 160445a fix: require full object for debounced settings updates (#9682)
- 88b93a2 fix: links for Recent Submissions on Dashboard (#9651)
- 1c7f4b5 docs: Link to cluster info (#9633)
- 5eb0bda ci: disable telemetry for ci tests (#9671)
- 086de84 chore: fix e2e test Hf trainer searcher lengths (#9683)
- c70dd8c fix: resolve indefinitely queued (STOPPING_COMPLETED) trials (#9605)
- f4ecd91 chore: remove ContainerState from ResourcesStateChanged (#9680)
- 3baece2 fix: default value in filter form (#9678)
- 4bbde85 test: workspace spec rearrange + one more test case for now [TESTENG-24] (#9674)
- 1500f39 docs: announce deprecation of Kubernetes priority scheduler (#9667)
- ccac2c1 chore: add `determined_master_scheme` for K8s multirm (#9673)
- c36705b chore: raise exception in HF trainer for mismatched train units [MD-456] (#9669)
- e956f28 feat: Add selection label to FlatRuns page (ET-309) (#9670)
- bbd6f8a test: collect detailed logs for tests in datadog[infeng-752] (#9637)
- 274d763 fix: update sort settings (#9665)
- 4b3a100 chore: Deprecate model hub (#9628)
- c3e5211 fix: SearchDetails Pivot bug [ET-632] (#9656)
- 0786e08 chore: bumpenvs for jupyter upgrades [MD-242] (#9660)
- 6d1f778 docs: expand det abbreviation in docs (#9652)
- 6528500 chore: remove singularity agent znode nightly test (#9659)
- 6cb6a90 fix: user IDs instead of user session IDs for notebook sessions [MD-453] (#9627)
- e4a9ae3 feat: add Metadata to project columns in Run Table (#9629)
- 3663c5b fix: upload non-conflicting files for sharded checkpointing [MD-298] (#9598)
- 4ece949 test: add test cases for filter group (#9647)
- 02da2a2 chore: Improve core v2 init API [MD-441] (#9560)
- 0494cdf fix: ensure historic usage charts assume the correct timezone (DET-10407) (#9650)
- 9a8591f feat: Add resources allocation-csv to det cli [DFR-519] (#9649)
- 3c310c7 ci: environment in e2e-slurm-enroot-znode [MD-451] (#9617)
- 71a9c4b docs: Update historical csv release note (#9654)
- c3e0a41 chore: deprecating container_runtime config, agentrm supporting singularity,podman, and apptainer (#9516)
- 4eeb4db feat: add metadata filtering to SearchRuns (#9611)
- 0709cab chore: hide the docker password in cloudformation stacks (#9641)
- a85fd6d docs: Describe flat runs view (#9644)
- e630bfb feat: support image pull secrets in genai (#9653)
- 8379b13 feat: Add/remove HPs when creating experiment through HP search (#9610)
- e9e4458 fix: allocation csv: gpu_hours -> slot_hours, add resource_pool [DET-10408] (#9616)
- 6299dcd feat: add basic lineage MDLM link (#9482)
- a498008 chore: send alerts can wait forever and fail for broken workflows (#9638)
- 4ac569d ci: remove deprecated label from agent config (#9648)
- 2aa54e1 chore(deps): bump anchore/scan-action from 3 to 4 (#9634)
- 7fab87b test: chore: fix some typos (#9646)
- 53aa974 feat: switch `det deploy aws` and CI from `m5.large` to `m6i.large` (#9636)
- 13aa327 fix: overflow in hyperparameter modal (#9626)
- d4c50b5 chore: set cli pwd warning go to stderr (#9536)
- d11c3ee docs: Update remote users auto redirect (#9623)
- c0fc4c4 feat: add Search actions [ET-603] (#9622)
- 1ec5d01 chore: deprecate job move within priority group (#9624)
- d257b89 feat: add getMetadataValues for projects (#9618)
- 55b6d25 chore: remove deprecated labels config option (#9609)
- 3a8c042 feat: pause unpause UX (#9615)
- 58fbf68 feat: continue trial from WebUI for multi-trial experiment (#9589)
- 95d1d2f test: chore: add gratitude to test page model readme [INFENG-767] (#9621)
- 547a4c4 test: Add e2e test for experiment edit (#9619)
- 576e244 fix: column picker should effect pinned columns in compare view [ET-605] (#9608)
- 262ad5a test: collect det job ci logs only in case of failure (#9537)
- 3f7cad6 Revert "Docs/improve sample master yaml" (#9620)
- c28545d fix: docker version bump to unpin requests (#9614)
- 0a57cde Docs/improve sample master yaml (#9607)
- 0e7b3ab docs: Describe supported k8s versions (#9277)
- 2958d42 docs: Add link to FSDP example (#9606)
- 000c679 ci: extend experiment timeout for slurm test (#9601)
- a6a79b8 fix: comparison view parall...
0.34.0
Release Notes
Changelog
- ede2396 chore: bump version: 0.34.0-rc12 -> 0.34.0
- f0d825d chore: bump version: 0.34.0-rc11 -> 0.34.0-rc12
- 1556c18 fix: Pause/Resume run test flake (#9592)
- a74e389 docs: add release notes for 0.34.0 (#9561)
- e5fc5f1 chore: bump version: 0.34.0-rc10 -> 0.34.0-rc11
- a51a640 fix: edit/move modals for projects in workspaces unexpectedly closes [DET-10388] (#9588)
- ce3ea17 chore: bump version: 0.34.0-rc9 -> 0.34.0-rc10
- 5b40a5c chore: remove shared cluster test for circle ci (#9579)
- 0b4dec4 chore: bump version: 0.34.0-rc8 -> 0.34.0-rc9
- 01baf33 chore: Release 0.34.0 bumpenvs (#9578)
- bad22b2 chore: add Nvidia drivers version matching test and bump env [MD-413] (#9567)
- 9adbe7c Revert "chore: 0.34.0 bumpenvs (#9565)"
- 60ada0c chore: bump version: 0.34.0-rc7 -> 0.34.0-rc8
- cde8a18 fix: wrong notebook idleness payload [MD-447] (#9571)
- 3f292f5 chore: bump version: 0.34.0-rc6 -> 0.34.0-rc7
- f36b110 fix: correct workspace_id column type on allocation_workspace_info (#9574)
- f0f45f8 chore: bump version: 0.34.0-rc5 -> 0.34.0-rc6
- a6c7918 fix: persist workspace id/name & experiment id for historic allocations [DET-10378] (#9550)
- f66e816 chore: bump version: 0.34.0-rc4 -> 0.34.0-rc5
- eaabab1 fix: add validation to patching project key (ET-305)
- d8b80ad fix: do not modify cached GetAgentsResponse (#9569)
- 25804d7 chore: bump version: 0.34.0-rc3 -> 0.34.0-rc4
- ead2232 fix: return workspace name for breadcrumb in Project Details page (#9564)
- 8da67d2 chore: 0.34.0 bumpenvs (#9565)
- 2677dc2 chore: bump version: 0.34.0-rc2 -> 0.34.0-rc3
- 42bea1a chore: fix boto3 requirement syntax (#9551)
- b2c7e22 chore: bump version: 0.34.0-rc1 -> 0.34.0-rc2
- 2f1283d fix: hide warning for weak password unless it actually applies [DET-10216] (#9538)
- ca208b9 chore: bump version: 0.34.0-rc0 -> 0.34.0-rc1
- f15bda8 feat: det deploy local generates a password for you [DET-10197] (#9518)
- abaf2e3 chore: bump version: 0.34.0-dev0 -> 0.34.0-rc0
- 0cf7aba chore: lock published urls to preserve redirects
- cd85b44 chore: lock api state for backward compatibility check
- 25b6299 chore: bump version: 0.33.1-dev0 -> 0.34.0-dev0
- 83b9a8b feat: add connect modal for notebook and shell tasks [MD-404] (#9545)
- b9ea173 chore: Bumpenvs 8c90e80 (#9544)
- f9a5dd5 fix: update getProjectColumns calls (ET-270) (#9509)
- 325d47e pre-commit lint check fix (#9543)
- 553521e feat: enable token auth for Jupyter notebooks [MD-404] (#9452)
- ea929fc test: det framework supports "nth" component [testeng-1] (#9540)
- 7568129 docs: address two link check failures (#9539)
- 3641bfc feat: support proxied Determined tasks on remote k8s clusters (#9469)
- 44f446c fix: Huggingface Trust Remote Repo (#9535)
- 3320107 chore: allow empty run metadata requests to delete existing metadata (#9524)
- 8006e2e fix: localize debounced settings updates (#9513)
- fec31a1 chore: handle empty nested structs in run metadata as nil leaf nodes (#9526)
- 88b01c6 refactor: remove DataGrid pagination code (ET-259) (#9520)
- edbeee9 test: increase timeouts for running experiments on k8s after env split (#9530)
- 0f6eb24 fix: webui page height (#9527)
- 1630c45 docs: Clarify startuphook (#9517)
- 63a4163 feat: support node selectors & affinities for Kubernetes resource pools (#9428)
- 6cd7d06 chore: Improving SearchRuns performance when doing hyperparameter filtering (#9489)
- 4321143 ci: add new feature signoff checkbox [INFENG-710][skip ci] (#9410)
- 10667f1 feat: remove round robin scheduler for agentrm (#9493)
- 735fb2c chore: remove hyperparameters from projects table (#9504)
- 66ec006 feat: warn users to change their passwords [DET-10216] (#9519)
- 2bce8b6 fix: historical allocations not appearing (#9522)
- f9ba7f4 fix: skip webhook regex matching for exp config (#9511)
- b51bc93 docs: Fix broken links (#9523)
- 9adc092 fix: partially scheduled k8s jobs should display as queued (#9468)
- 32585ad feat: flat runs comparison view [ET-190] (#9477)
- d44013c feat: add arbitrary metadata GET/POST endpoints (#9130)
- 21ecda5 test: preparing a homework assignment [TESTENG-3] (#9510)
- ee66d15 fix: allow doesnotcontains filters on hyperparameter column (#8842)
- 382995c fix: historical allocations not showing task allocation workspace (#9496)
- 8e9067b feat: Framework Splitting and Bumpenvs (#9457)
- f0d26db ci: fix some failing long-running tests related to password requirements (#9421)
- d0d30cf test: collect det task logs as artifacts for ci jobs (#9459)
- e3d01c1 chore: remove debug logs that were accidentally committed (#9503)
- 3857b7a test: upload unit and intg tests to datadog[infeng-501] (#9505)
- 3afe5df chore: check non-multiples of slots per pod for kubernetes rm [MD-403] (#9393)
- 2ca7733 fix: ensure number of project keys possible for testing is not exceeded (#9501)
- 0f30189 docs: Update the URL to the genai docs (#9507)
- 86e6b68 feat: add cluster-wide message (#9261)
- e138267 fix: Searches view fixes (ET-297) (#9487)
- aa6521b fix: run columns mismatching sort/filter columns for run table (#9479)
- de03909 fix: use num pods in k8 job summary (#9497)
- 439734b chore: avoid payload limitation (#9164)
- dde6362 fix: Use experiment config to determine is_multi_trial in api_runs queries (#9475)
- f87214b test: preparing a homework assignment [TESTENG-4] (#9495)
- 27e7307 feat: add custom key to projects table, backfilling based on current project name, and API support (#9134)
- a5cf959 fix: pin huggingface version to <0.23.0 (#9483)
- 97667c5 tc: Test format (#9490)
- ffee34f test: update playwright [TESTENG-4] (#9484)
- 869b96a test: readme and test name revisions [TESTENG-5] (#9463)
- 698ab6c test: docstring revisions [TESTENG-5] (#9478)
- 96c061b ci: lower hf trainer accuracy target + improve failure messages (#9322)
- 84299a6 chore: upgrade golangci linter to 1.57.2 (#9279)
- 2588eea feat: Pause & Resume run (#9129)
- df3919c docs: remove all references to PowerPC/PPC64 (#9476)
- ca03da1 chore: switch to mockery config file (#9473)
- 9160ae9 docs: correct release note for deprecating PPC64/POWER builds (#9470)
- 418525e fix: convert invalid hparam types to json string (#9449)
- 934aeb6 fix: job state shows as scheduled when resources are allocated (#9466)
- 8d64508 feat: remove genai from experimental feature list and enable via `/master` feature switches [GAS-1016] (#9435)
- 4d8596c chore: deprecate PPC64/POWER builds (#9467)
- 13a5142 fix: Revert to get_checkpoints.sql call to enable NaN & Infinity values in searcher metric (#9440)
- d50433d chore: no longer store ee artifacts in circleCI (#9426)
- a45aa1e feat: add SearchDetails page (ET-53) (#9436)
- 4c821c3 docs: clarify data collected by telemetry (#9445)
- 57bece4 fix: job queue's allocated slots should be correct after restarts (#9461)
- c49eeea test: datagrid tests [INFENG-687] (#9400)
- 8a9839a feat: add option for Checkpoint_GC pod spec in task container defaults (#9406)
- d960f29 chore: only connect to the database once (#9456)
- 0fdb822 feat(rm): convert Kubernetes submissions from pods to jobs (#9438)
- f54fb7c test: react test datadog integration [infeng-497] (#9455)
- cc4ad2b docs: fix observability README docs link (#9453)
- ca60325 chore: bump version: 0.33.0-dev0 -> 0.33.1-dev0
- 4936847 chore: add docs dropdown link for new version
- 7b81df7 docs: add release notes for 0.33.0 (#9444)
- da2f943 feat: add heatmap to runs table [ET-230] (#9429)
- 0599d0e test: create test users through the API [INFENG-673] (#9431)
- ac459f7 docs: Add historical cluster usage warning (#9439)
- cf22597 docs: update broken nvidia anchor link (#9441)
- d94e299 fix: notify master for core checkpoint deletes [MD-325] (#9415)
- b96ccba fix: dont utilize the default efs mount on normal aws deploys (#9437)
- a0f2e33 fix: redirect on sso login (#9369)
- 9abde37 chore: remove stdlib errors package from lint blocklist (#9381)
- 515c135 fix: add Admin Settings to NavigationTabbar (ET-194) (#9423)
- 00bbda6 fix: set the defaults for shared_fs mount in genai correctly (#9433)
- 9d54093 chore: skip TestSchedule until flake is fixed (#9434)
- 9524dd4 ci: use priority scheduler in e2e tests (#9430)
- 58b31e6 docs: terraforming an EKS cluster with autoscaling and EFS. (#9427)
- 8a6f571 docs: ignore anchor for observability links (#9412)
- 684c38b fix: add feature gate for checking for blank admin/determined password [DET-10197] (#9425)
- 6ad9d73 fix: Keep template modal open when config is invalid (#9424)
- 3dfb9ec test: remove confusing, unused slurm-related ci code (#9417)
- cdd7a82 test: ensure make unslurmcluster always runs in CI (#9420)
- ba31f03 fix: reset InteractiveTable pagination when filters applied [ET-183] [ET-121] (#9413)
- 3cbe805 fix: master checks db newness before migrating [DET-10312] (#9414)
- da46208 fix: bulk action bug in the old experiment table that cannot trigger bulk actions across pages (#9404)
- 657286c feat: Add Run columns to GetProjectColumns (#9146)
0.33.0
Release Notes
Changelog
- 0c2d3cf chore: bump version: 0.33.0-rc5 -> 0.33.0
- e165541 docs: add release notes for 0.33.0 (#9444)
- 8c69d8b chore: bump version: 0.33.0-rc4 -> 0.33.0-rc5
- e1a40b1 fix: dont utilize the default efs mount on normal aws deploys (#9437)
- ebe2698 chore: bump version: 0.33.0-rc3 -> 0.33.0-rc4
- b85b8b3 fix: set the defaults for shared_fs mount in genai correctly (#9433)
- 52c7d95 chore: bump version: 0.33.0-rc2 -> 0.33.0-rc3
- 9968dce fix: add feature gate for checking for blank admin/determined password [DET-10197] (#9425)
- 9c4fd74 chore: bump version: 0.33.0-rc1 -> 0.33.0-rc2
- 274b152 fix: Keep template modal open when config is invalid (#9424)
- f4d6f54 chore: bump version: 0.33.0-rc0 -> 0.33.0-rc1
- 2661ae0 chore: bump ngc image versions for release (#9418)
- cbc15db fix: master checks db newness before migrating [DET-10312] (#9414)
- d1b3343 chore: bump version: 0.33.0-dev0 -> 0.33.0-rc0
- ca45198 chore: lock published urls to preserve redirects
- f2cd018 chore: lock api state for backward compatibility check
- 6184f6f chore: bump version: 0.32.1-dev0 -> 0.33.0-dev0
- 4af9bfc revert: Framework splitting (#9405)
- 6fa1420 test: project create and delete react e2e [INFENG-456] (#9244)
- 860f6a8 docs: Describe config templates WebUI (#9399)
- 6ff8eb7 chore: Add slurm codeowners (#9403)
- 68b36c6 feat: require initial passwords on new cluster-up [DET-10197] (#9314)
- 0ef3e10 test: datagrid scrolling [INFENG-687] (#9379)
- 18ee0e3 chore: Update docker retag scripts (#9401)
- 6ed2976 pin setuptools in model hub tests (#9402)
- c4ebe5e feat: Release WebUI templates with notes (#9383)
- 3bbb51a feat: Display Log retention days and Remaining log retention days in Logs Tab (#9305)
- 047580c feat: update default scheduler to priority for agentrm (#9385)
- ce70c00 docs: Add more info helm install password (#9388)
- b84ee1f docs: cluster observability documentation and dashboard improvements (#9391)
- c3b3ae6 feat: helm install checks password complexity [DET-10293] (#9360)
- 5c51164 fix: Skip resource checking for unmanaged exp (#9372)
- 107e108 feat: add Sort menu to Flat Runs view (#9396)
- cb81a44 feat: Add charts to Comparison View (ET-99) (#9215)
- cd33c13 test: put flaky fix back in [INFENG-694] (#9394)
- d3e89b1 docs: add exp config for unmanaged example #2. (#9397)
- d4e23f4 chore: pin requests version < 2.32.0 so docker works (#9395)
- 5480c57 chore: don't use a seperate schema for views_and_triggers (#9392)
- 893f7f5 chore: add resource_pools intg test (#9356)
- de21593 chore: push oss images per commit (#9386)
- 95c70d4 docs: Add nav to genai docs (#9387)
- 0c42ced feat: SDK methods to fetch pachyderm configs [MD-406] (#9348)
- 0ff09e0 docs: Describe pwd requirements WebUI (#9378)
- 31bc08a refactor: rename multiRM to more intuitive name (#9350)
- df7a2af docs: Update release note (#9375)
- 2c9b9b9 feat: add pod labels with proper validation (#9364)
- 0a59c63 docs: Remove long metrics rn (#9374)
- 7e4b431 feat: add columns menu to Runs view (#9323)
- c10ae99 test: Remove flakiness of KillRun test (#9370)
- 653a0de chore: store database code as code [DET-9180] (#9302)
- d38e2e0 test: report individual test results from python tests (#9366)
- 7bce6ff chore: report ntsc names via cli at launch (#9228)
- 93c8d81 ci: keep waiting on failing workloads for sending slack alerts (#9371)
- 53edec9 test: More Page Models for Experiment Tracking [INFENG-694] (#9367)
- a96cafd feat: Framework splitting (#9318)
- 3b1d0df chore: remove test suite whose marks match no tests (#9363)
- 566b6af test: page model refactor for dropdown and select components [INFENG-694] (#9362)
- 68b7116 docs: deprecation notice for agentrm features (#9344)
- 2092943 docs: Add FSDP to deepspeed (#9182)
- 4f180db chore: update npm libraries (#9331)
- c7b78fa feat: edit/delete template from WebUI (#9353)
- 8c5fce7 test: provide GKE tests with a Helm value for initialUserPassword [DET-10196] (#9361)
- 0941fc4 feat: helm requires bootstrap password [DET-10196] (#9359)
- f91c2a3 docs: revert a doc format change to reenable slurm tests (#9358)
- 989341c feat: add options in flat run (#9341)
- 16a3f3b revert: resource pool intg tests (#9357)
- 54fb10a chore: add intg tests to resource_pool.go (#9199)
- c3901c8 fix: det ckpt download from s3. (#9332)
- 7c26fe1 refactor: columnpicker remove hard coded value (#9342)
- feb8a7b fix: remove pod labels with potentially incompatible names (#9349)
- eab4981 docs: Reformat grid tables (#9321)
- 758ffd7 chore: add retries to check-doc-links ci job (#9335)
- 80fac3d chore: update release notes date (#9334)
- efbcdee chore: update codecov to ignore e2e react [INFENG-689] (#9346)
- 2445d39 fix: Revert "feat: helm requires bootstrap password (DET-10196)" (#9345)
- df0b7f9 feat: Implement `/template/rename` to patch template name (#9320)
- 0a0b3c3 feat: helm requires bootstrap password [DET-10196] (#9274)
- 86aa319 chore: Bunify and add test coverage for `ExperimentTotalStepTime` and `ExperimentNumSteps` (#9333)
- 3c0eac6 test: experiement list tests [INFENG-457] (#9299)
- cc82cc9 chore: add missing setuptools to win cli tests (#9336)
- c0fdaa9 chore: remove step for authenticated master session check and use standard script (#9339)
- b868230 test: wait for background logout (#9340)
- ead928e ci: add missing var overide in ee release [skip ci] (#9338)
- cab9ac5 test: log in with the api rather than through the UI for most react tests (#9307)
- 349d2a5 feat: View templates from WebUI (#9304)
- 9d46f49 chore: update codecov to ignore e2e react [INFENG-689] (#9337)
- 6e10465 ci: send job level failure slack alerts (#9315)
- 2abacb9 docs: update "install on k8s" guide to use helm repo instead of tarball. (#9293)
- 492ef57 chore: bump version: 0.32.0-dev0 -> 0.32.1-dev0
- f74988c chore: add docs dropdown link for new version
- a1b6912 docs: add release notes for 0.32.0 (#9301)
- dab4946 feat: add integration config for pachyderm input datasets (ET-12) (#8933)
- 3d4e283 test: refactor nav spec to use sidebar pagemodels [INFENG-683] (#9326)
- 1779060 test: skip a flaky test [ET-233, ET-178] (#9324)
- 3b167c7 fix: filter action experiments, old ExperimentList (#9325)
- 5b73dc4 fix: filter batch action experiments (#9316)
- 6fb62ad feat: support for configuring the shared_fs mount path in genai (#9317)
- 5f4cbbf Revert "docs: Reformat tables with image names" (#9319)
- c9f5e8a docs: Reformat tables with image names (#9312)
- fdaa015 feat: support filter in flat run table (#9250)
- a76c549 ci: don't run test-e2e-longrunning tests on main (#9313)
- ebf19a6 chore: bumpenvs for efs-utils (#9309)
- f9a35d9 ci: run e2e-react manually [INFENG-676] (#9310)
- 9cab46d chore: drop unused postgres function experiments_best_validation_history (#9306)
- ee4f04e chore: stop writing database down migrations [RM-242] (#9289)
- 24aaff4 ci: store npm log (#9311)
- e90bd0d chore: improve messaging for e2e tests (#9286)
- 0aef4c7 fix: tensorboard metric overwrites and sync throttle [MD-328] [MD-291] (#9282)
- f9b96fe ci: don't run requests-hpc-tests on main (#9308)
- 4c314a2 chore: update efs-utils install for v2.0 (#9297)
- f6181ab test: revert runner size test-e2e-cpu (#9303)
- b0a008e feat: Update workspace for templates server side (#9272)
- 9357391 ci: circleci slack alerts should go to #ci-bots (#9300)
- 49ab75d docs: Update Chart.yaml [ci skip] (#9298)
- e31135a chore(deps): bump google.golang.org/grpc from 1.58.0 to 1.58.3 (#9292)
- f6e42cd fix: Bulk Action bug (#9255)
- a8d05fa test: skip a `useTypedParams` test case due to flakiness (#9287)
- 2164912 chore: dependabot upgrade grpc/go-jose/net [RM-66] (#9280)
- f1aa92e chore: log health check failures in master logs (#9291)
- 7496445 fix: proto build shouldn't run if source files are unchanged (#9290)
- 21f76e9 fix: slots being filled returned out of order on k8s [RM-42] (#9276)
- a9c8700 test: e2e no floating promises [INFENG-668] (#9283)
- 7a296fa test: flaky user test fix [INFENG-663] (#9281)
- e4c6afe chore(deps): bump golang.org/x/net from 0.21.0 to 0.23.0 (#9202)
- b8eba3a ci: revert ee rebase changes to dependabot.yaml (#9278)
- 9ee9270 ci: gate hpc by request (#9198)
- 670ac40 chore: make command's run startup-hook.sh [RM-159] (#9275)
- b78020d feat: Create template through WebUI (#9263)
- abcc7b4 fix: Hide runs in archived experiments (#9270)
- b602ff2 docs: fix master config doc typo (#9256)
- 6bd2a8c ci: try to fix slurm podman tests by not building agent binary (#9273)
- 9c068d2 feat: webui create user prompts for password [DET-10221] (#9240)
- ae91042 feat: reuse HTTP sessions (#9116)
- a3f0fcf fix: show non det pods in other namespaces than 'default' [RM-141] (#9268)
- a611cf0 chore: stop publishing helm charts to NGC. (#9271)
- 2905180 test: increase runner size for react e2e (#9269)
- f8f8672 ci: try to fix podman tests by building proto once (#9267)
- 95b5164 feat: timeout change and package dedupe [ET-243] (#9265)
- 55b7fd9 chore: Image rename bumpenvs (#9253)
- 4d87127 test: some react tests are flaky [INFENG-663] (#9264)
- 86328cb fix: users can be removed from all groups in Web UI (#9259)
- aea83df chore: enable genai to connect to db over TLS (#9260)
- 703e6bd feat: Archive & Unarchive run (#9143)
- 8794e42 fix: historical-usage date calculation bug (#9257)
- cda4363 test: increase the timeout on a new users test [INFENG-455] (#9258)
- bd7b5ef test: user tests continued [INFENG-455] (#9214)
- dd4d0f9 ci:...
0.32.1
Release Notes
Changelog
- 7d0b38a chore: bump version: 0.32.1-rc0 -> 0.32.1
- 351826c docs: add release notes for 0.32.1 (#9351)
- 947585f chore: bump version: 0.32.1-dev0 -> 0.32.1-rc0
- f9da12f chore: lock api state for backward compatibility check
- 1e8f8de fix: remove pod labels with potentially incompatible names (#9349)
- 6995ca6 chore: bump version: 0.32.0 -> 0.32.1-dev0
0.32.0
Release Notes
Changelog
- a1b7242 chore: bump version: 0.32.0-rc8 -> 0.32.0
- d8580c2 docs: add release notes for 0.32.0 (#9301)
- 2244f71 chore: bump version: 0.32.0-rc7 -> 0.32.0-rc8
- 0322dc7 fix: filter action experiments, old ExperimentList (#9325)
- 5ebb008 chore: bump version: 0.32.0-rc6 -> 0.32.0-rc7
- b208794 fix: filter batch action experiments (#9316)
- 991818b chore: bump version: 0.32.0-rc5 -> 0.32.0-rc6
- e277782 fix: Bulk Action bug (#9255)
- b2663af chore: bump version: 0.32.0-rc4 -> 0.32.0-rc5
- ee63b67 fix: users can be removed from all groups in Web UI (#9259)
- 00b95c3 chore: bump version: 0.32.0-rc3 -> 0.32.0-rc4
- 642e323 fix: historical-usage date calculation bug (#9257)
- f506989 chore: bump version: 0.32.0-rc2 -> 0.32.0-rc3
- 1047e78 fix: hew update for select bug in log viewer (#9249)
- 4c59c9c chore: bump version: 0.32.0-rc1 -> 0.32.0-rc2
- f8ad009 fix: undo default log retention in values.yaml (#9245)
- 4b3adb9 docs: add a release note for aurora issue. (#9241)
- 004fe70 fix: allow genai deployments with agent GIDs set to share data properly (#9243)
- be231d9 chore: bump version: 0.32.0-rc0 -> 0.32.0-rc1
- 714264e chore: bump version: 0.32.0-dev0 -> 0.32.0-rc0
- dc88b9f chore: bump version: 0.31.1-dev0 -> 0.32.0-dev0
- 7ffdadf ci: add determined-ee context to python ee publish (#9234)
- c18ac83 fix: properly merge resource configs (#9233)
- 3b39d3c chore: add log retention to help charts (#9216)
- 3646395 chore: lock published urls to preserve redirects
- 80d8909 chore: lock api state for backward compatibility check
- 39b948c feat: add genai user role to rbac (#9206)
- 43289e9 test: ee and oss have separate handling (#9218)
- 1ca3613 fix: debounce `userSettings` update (#9220)
- ab382b4 chore: update the license date (#9225)
- ff10ac0 docs: Fix broken links (#9219)
- ac68df8 chore: default observability.enable_prometheus to true (#9222)
- 26c1940 chore: upgrade protoc used in CI (#7935)
- 9f6bbc9 chore: Add streaming updates feature flag [MD-371] (#9190)
- f8b3736 ci: Exclude deploy/README.md from build (#9211)
- 3bfc212 fix: hew update for chart scroll bug (#9210)
- da8a040 feat: CLI allows and requires creating a user with a password DET-10184 (#9112)
- fbccaf1 chore: clean up rm module [RM-202] (#9191)
- 8caf3cb test: user tests [INFENG-455] (#9152)
- 3568f27 fix: Skip expected error from web socket (#9194)
- 1b212ae feat: add kill run endpoint (#9061)
- e7d870e test: use devcluster for react tests [INFENG-449] (#9185)
- bd4a54e fix: shared cluster test to work in OSS again (#9195)
- b874acb docs: fix another instance of broken docs link (#9208)
- 86be18a ci: pass ee into args to prevent latest main deploying as ee (#9207)
- f74ab9c docs: Describe multi rm k8s (#9025)
- 6fb1c52 ci: deploy awscli to system (#9188)
- 9cfbb59 docs: fix nvidia device plugin link for EKS (#9204)
- 3e865c6 test: skip flakey user provision tests (#9203)
- 598784d chore: make multi-RM an EE-only feature [RM-166] (#9192)
- 6d2be52 ci: fix test-det-deploy-local (#9196)
- 5f312ed test: can't launch NSC test assert 404 instead of 403 (#9197)
- 4b1c937 test: fix a test util issue with master config schema assumptions (#9193)
- 0bc13d8 feat: non-blocking metrics reports [MD-144] (#9107)
- 2ced9b9 ci: do dry runs of `publish-docs` for RCs (#9186)
- 72344e0 feat: Use feature flag for streaming updates - manually update project store (#9170)
- dd7f4b5 docs: add profiling section for trainer API UG [MD-373] (#9177)
- 06586f0 fix: better exception handling in detached mode (#9183)
- 283daab feat: Unfork Enterprise Edition (EE) and require license key for EE features (#9168)
- f233c95 docs: FAQ for python SDK ckpt download, k8s deprecation labels. (#9187)
- 6fcefac chore: bump version: 0.31.0-dev0 -> 0.31.1-dev0
- 19688a9 chore: add docs dropdown link for new version
- 2b2e96a docs: add release notes for 0.31.0 (#9159)
- b194686 chore: style fix for helm initialUserPassword (#9158)
- a5e9f0c chore: add option to auto pick the only matching name on partial hits (#9108)
- 371c90b fix: louden server errors coming from deleteCheckpoints (#9184)
- 0765e38 chore: pass correct master scheme to genai (#9181)
- 26f5e0b fix: report errors from deletecheckpoints endpoint + improve feedback (#9178)
- 1037d83 chore: bumpenv update NGC base images version to 24.03 (#9132)
- 1cc9cd7 fix: count determined-system pods as det pods [RM-148] (#9148)
- 0fc247c fix: single-searcher MNIST example runs for multiple epochs (#9160)
- d41c4a7 fix: fix docs and wording (#9179)
- 5541e54 feat: RM-130 add determined info as pod labels (#9140)
- ee15da0 test: Djanicek/infeng 456/workspaces and projects (#9117)
- e6c0c99 chore: add typing annotations for zmq (#9176)
- 4ceaed0 docs: Add readme to toc (#9175)
- 3105407 chore: make the data_dir consistent to other advertised devc configs (#9157)
- d38fc3c fix: Reset table offset when filtering for models (#9167)
- 338d5d3 docs: remove max supported k8s version. (#9171)
- 35d249f chore: add flake8 relative-import rule (#9169)
- ffed598 feat: support for mounting a hostPath for the shared file system in genai (#9161)
- 2f874b9 test: experiment list page models and sample test [INFENG-451] (#9139)
- fd45ed8 ci: merge EE and OSS doc deploy together [INFENG-625] (#9162)
- 0b2eab0 docs: Copy debug to exp config (#9120)
- 3f70a46 chore: style fix for helm tls (#9163)
- 8a94574 chore: new image publishing (#9090)
- 8b83122 fix: TensorBoard visualization from batch actions. (#9156)
- 384e5c0 fix: fix disable button condition in launch jupyter notebook modal (#9155)
- b109108 feat: add helm master level config for tcd startup hooks (#9135)
- 0228a95 ci: publish-docs installs awscli into user space (#9153)
- 746ba26 chore: add alert metric for Prometheus and add Grafana alert docs [RM-118] (#9150)
- 291565b fix: keras and tensorflow import errors in new versions (#9141)
- 831df43 feat: create flat runs view [ET-24] (#9023)
- 5854b8b chore: add a devcluster config to run Determined across multiple Kubernetes clusters locally (#9151)
- d0497da fix: fix docs for log retention (#9149)
- bd29f1f fix: cli gives misleading error message when logging in with a bad password [MD-277] (#8990)
- 95f87d7 fix: ensure all columns have widths (#9136)
- 3f7a396 test: fix test_logging typehint syntax error (#9142)
- 93e7bdf test: ignore e2e test cases in vitest (#9128)
- 4d1b8ae docs: revert helm values change for multirm (#9145)
- c3d13b6 docs: revert-multiRM-mc-doc (#9144)
0.31.0
Release Notes
Changelog
- 583e0c3 chore: bump version: 0.31.0-rc7 -> 0.31.0
- 29574c4 docs: add release notes for 0.31.0 (#9159)
- 40c34cb chore: bump version: 0.31.0-rc6 -> 0.31.0-rc7
- 75b7e43 fix: louden server errors coming from deleteCheckpoints (#9184)
- 44503bb chore: bump version: 0.31.0-rc5 -> 0.31.0-rc6
- 956df40 fix: fix docs and wording (#9179)
- dae548d fix: report errors from deletecheckpoints endpoint + improve feedback (#9178)
- 592280d chore: style fix for helm tls (#9163)
- 7565447 chore: bump version: 0.31.0-rc4 -> 0.31.0-rc5
- 2daa1fc fix: TensorBoard visualization from batch actions. (#9156)
- 4bbc20d fix: fix disable button condition in launch jupyter notebook modal (#9155)
- ac15a86 chore: bump version: 0.31.0-rc3 -> 0.31.0-rc4
- 990fbfb feat: add helm master level config for tcd startup hooks (#9135)
- a6ae2aa ci: publish-docs installs awscli into user space (#9153)
- d5deffb chore: bump version: 0.31.0-rc2 -> 0.31.0-rc3
- 7f7d2bf fix: fix docs for log retention (#9149)
- 776c5c3 fix: ensure all columns have widths (#9136)
- e8b4fd7 chore: bump version: 0.31.0-rc1 -> 0.31.0-rc2
- 691d190 test: fix test_logging typehint syntax error (#9142)
- 61999f4 docs: revert helm values change for multirm (#9145)
- be36ecd chore: bump version: 0.31.0-rc0 -> 0.31.0-rc1
- f78ccf8 docs: revert-multiRM-mc-doc (#9144)
- 3014dba chore: bump version: 0.31.0-dev0 -> 0.31.0-rc0
- 828532a chore: lock api state for backward compatibility check
- 0547f7f chore: bump version: 0.30.1-dev0 -> 0.31.0-dev0
- 55ef649 chore: change multirm log messages to trace level [RM-151] (#9138)
- 1bb2fe4 feat: expose `hyperparameters` in experiments api to avoid using deprecated `config` property for experiment (#9012)
- 5a588e0 chore: lock published urls to preserve redirects
- f46bc69 chore: lock api state for backward compatibility check
- 28b3aff feat: add cluster wide startup hook for tasks (#9124)
- fe2b616 docs: Describe pwds default accounts (#9137)
- d1c268b fix: down migrations (#9133)
- f7a5260 chore: update PyPi metadata (#8971)
- 2c3ce29 chore: set the default db storage as docker volume instead of a mount (#9127)
- 8b11e3a ci: publish docs without installing awscli (#9126)
- 133f838 fix: prevent table breaking on null columnWidths [ET-161] (#9131)
- ec43809 fix: det gcp down doesn't have a det_version argument (#9121)
- c89b3df fix: reduce time and increase reliability of tests (#9125)
- d5f807d feat: helm deploys with a password (#9113)
- 8a7832a fix: unlock mutex for experiment ResourcePool() [RM-152] (#9119)
- e70d38e docs: add a python sdk example for log following. (#8981)
- 3028efb docs: add helm doc updates (#9122)
- cf2f2be fix: fix regression caused by join on trials view (#9091)
- bdab9e4 feat: create Searches view (#9089)
- 65339d2 chore: PR template again [INFENG-600] (#9118)
- 0c6985b chore: update github PR template [INFENG-600] (#9098)
- f4b0471 docs: add instructions on deploying determined via HPE MLDES [SAAS-1877] (#9105)
- c32ac6f chore: add test to CODEOWNERS [INFENG-605] (#9115)
- 25767b9 fix: helm value for gke tests (#9114)
- a0847b8 fix: match GetJobQueueStats behavior in k8s RM to agent RM [RM-136] (#9097)
- 2ef5ab9 chore: better k8s testing with shared gke cluster (#9074)
- 7fc8d7a chore: add nightly gke cluster cleanup job (#9031)
- 5cb7927 chore: bump version: 0.30.0-dev0 -> 0.30.1-dev0
- 7f65779 chore: add docs dropdown link for new version
- 4ae4075 docs: add release notes for 0.30.0 (#9103)
- 9dce6f0 feat: Add model version streaming (#9029)
- 2c6fec7 test: user-page-models (#9084)
- 75b1ff4 feat: det deploy aws adds tags to dynamic agents [RM-140] (#9106)
- d6059e9 feat: Create MoveRun endpoint (#9001)
- 91d7e08 feat: Pre-select ws when launching notebook (#9109)
- f03a8a8 fix: add missing k8s job submission times to allocations (#9028)
- b8bf396 chore: upgrade Bun to fix race condition in tests [DET-10193] (#9082)
- bc8c31c fix: make sure that the genai helm chart services work across namespaces (#9102)
- 58cd22b ci: INFENG-600 remove single commit legacy validation (#9104)
- 943b2cb fix: prevent checkpoint modals from closing on their own [ET-116] [ET-120] (#9094)
- d4eed0e chore: change RM log message back to Debug level (#9093)
- 2af21ee chore: unshadow more builtins (#9092)
- 519d702 docs: update multiRM docs (#9050)
- 3688c3f fix: job queue panic for multirm [RM-123] (#9079)
- f78b9aa fix: add change in master config to devcluster.yaml (#9087)
- 8f02a7f fix: fix master config and experiment config for log retention (#9075)
- fe1a6bb fix: no more shadowing "license" (#9085)
- 3f2d6ab fix: spacing issue with exp list pagination (#9067)
- 37abc6c fix: stop showing loading indicator in `queued` state (#9081)
- 4050eda chore: bumpenvs for jupyter security update. (#9077)
- fe66b86 chore: CODEOWNERS deploy owned by RM not MD (#9064)
- 94e5d21 ci: more printing about state of master (#9058)
- a3834ac chore: change log level for multirm messages [RM-125] (#9080)
- 0099f4e chore: update old e-mail address (#8944)
- b726cf9 feat: initialize genai shared_fs permissions to agent group in helm deployment (#9065)
- 5031807 chore: api level check if agent slotstats are pre computed (#9073)
- 1a38f0c feat: add /health endpoint [RM-114] (#9062)
- 8217508 style: update harness to eliminate I2041 flake8 errors (#8960)
- 2c7d2a1 feat: add new endpoints to change log retention for experiments and trials (#8982)
- 6c4bc44 fix: slot stats are not filled in everywhere (#9070)
- 53bf20e fix: fix TestScheduleRetention (#9069)
- 1e45918 fix: remove parent_id from create_experiment (#9068)
- 17287f5 fix: API migration to improve performance in resource pool page (#9056)
- 85bb3c8 feat: add log retention for database logger (#8622)
- 8f5de35 fix: remove calls to Pytorch Dataset len (#8647)
- d9e1088 feat: webui nav sidebar dropdown text changes (#9063)
- 06c86ee chore: Remove /lore redirect from deployment template (#9057)
- 18dd29e docs: Update release notes (#9044)
- 8d2a763 refactor: change GlideTable into a reusable component (ET-25) (#8956)
- 93bca2a fix: loading experiments without filterset (#9059)
- a07f0fb faster migrations (#9060)
- e8dba6d feat: add slot stats to /agents endpoints (#9048)
0.30.0
Release Notes
Changelog
- 5a63518 chore: bump version: 0.30.0-rc5 -> 0.30.0
- 97aaa02 docs: add release notes for 0.30.0 (#9103)
- c108443 chore: bump version: 0.30.0-rc4 -> 0.30.0-rc5
- 4ce78b2 fix: prevent checkpoint modals from closing on their own [ET-116] [ET-120] (#9094)
- 8bcdcc8 chore: bump version: 0.30.0-rc3 -> 0.30.0-rc4
- e90238a chore: bump version: 0.30.0-rc2 -> 0.30.0-rc3
- b8db2e6 fix: slot stats are not filled in everywhere (#9070)
- d2e3a5c fix: remove parent_id from create_experiment (#9068)
- 61958ef fix: API migration to improve performance in resource pool page (#9056)
- 62d102b chore: bump version: 0.30.0-rc1 -> 0.30.0-rc2
- bc241b6 docs: Update release notes (#9044)
- 2e31ece fix: loading experiments without filterset (#9059)
- 4efaede chore: bump version: 0.30.0-rc0 -> 0.30.0-rc1
- d2949d3 faster migrations (#9060)
- 4c6e35c feat: add slot stats to /agents endpoints (#9048)
- f32dc82 chore: bump version: 0.30.0-dev0 -> 0.30.0-rc0
- 10030a6 chore: lock published urls to preserve redirects
- 220f820 chore: bump version: 0.29.2-dev0 -> 0.30.0-dev0
- 1e6f0f7 feat: Use filtered resource pools when creating notebook (#9045)
- 74fe16b feat: profiling v2 [MD-27] (#9032)
- 133d127 docs: revert multirm docs changes #9016
- 1992c97 chore: optional DB migrations (#9047)
- 84ba688 fix: docs lint (#9052)
- 848b216 feat: add command det model delete (#9039)
- 1202d5c refactor: DET-9976 remove agentID type from agentrm (#9040)
- 0710c58 docs: Describe editorrestricted (#9049)
- 02da36f chore: mark db-dependent tests as needing to run in integration (#9041)
- 6c88e8d fix: move experiment SQL error (#9042)
- 3fa0df1 Revert "docs: add EditorRestricted role release note (#9007)" (#9046)
- 60cb003 test: Jcom/infeng 454/sign in tests (#9013)
- f08b406 ci: tag CI-deployed resources (#9043)
- 1868723 build(deps): bump google.golang.org/protobuf from 1.28.0 to 1.33.0 (#8996)
- d4ab20b build(deps): bump github.com/docker/docker (#9026)
- e4bc377 test: playwright config and browser usability (#9024)
- f6b9ac8 build(deps): bump github.com/jackc/pgx/v4 from 4.12.0 to 4.18.2 (#8987)
- c811947 chore: helm for multirm kubeconfig_path (#9033)
- 4441d6d feat: Add template to py sdk `create_experiment` (#8927)
- 5ac1b85 chore: revert helm for multirm kubeconfig_path (#9030)
- 6fec24d chore: helm for multirm kubeconfig_path (#9015)
- 0518785 feat: streaming update code generation for typescript (#8988)
- 39afa3c docs: add documentation for multirm (#9016)
- 7e37c22 chore: add grpc based auth fallback to proxied requests (#8980)
- 5e1f2af fix: Experiment.await_first_trial exits when Experiment is terminal (#9022)
- a603f4c chore: logins return Sessions (#8883)
- 93b6aa2 feat: SearchFlatRuns api call for flat runs table support (#8852)
- fa43bff ci: test-perf uses determined version from github (#9019)
- 137bfcd feat: add model streaming (#8973)
- 8bf280d refactor: consolidate experiment list selection state (#8860)
- 674cd73 ci: DRY skip logic and clarity on step name (#9002)
- 00d145f chore: bump version: 0.29.1-dev0 -> 0.29.2-dev0
- a3ba9e9 chore: add docs dropdown link for new version
- e922a41 docs: add release notes for 0.29.1 (#9014)
- dfed63d chore: reassign ml-sys CODEOWNERship to model-dev (#9000)
- eac7ddf test: document ui e2e with backend test instructions for local (#9005)
- bc1b431 docs: add EditorRestricted role release note (#9007)
- f52f43b chore: warn about det deploy det-version mistmach (#8994)
- 5b17df3 chore: limit code coverage report to files in src; omit generated files (#9003)
- f73fd09 fix: escape regex in `ProjectDeleteModal` (#8998)
- 73fd1cd feat: Add multi RM name to K8s (#8993)
- 978a02e ci: Djanicek/infraeng 487/circle test runner (#8977)
- 4730d76 chore: ban http.Transport & http.Client; add cleanhttp (#8991)
- 52572d4 fix: improved textcell performance for novels (#8986)
- 89d4708 docs: add EditorRestricted role to rbac docs (#8984)
0.29.1
Release Notes
Changelog
- 6f0810b chore: bump version: 0.29.1-rc2 -> 0.29.1
- d13dfac docs: add release notes for 0.29.1 (#9014)
- 8093bee chore: bump version: 0.29.1-rc1 -> 0.29.1-rc2
- cce4e6b chore: warn about det deploy det-version mistmach (#8994)
- a2576be chore: bump version: 0.29.1-rc0 -> 0.29.1-rc1
- 05a75b3 fix: escape regex in `ProjectDeleteModal` (#8998)
- de8d02d chore: bump version: 0.29.1-dev0 -> 0.29.1-rc0
- 0a2fd28 chore: change GKE version (#8989)
- 055dd83 docs: Update Deploy on GCP (#8985)
- 47cb6fd fix: remove error text in continue trial modal (#8923)
- 2f40476 chore: bump version: 0.29.0-dev0 -> 0.29.1-dev0
- 84a846e chore: add docs dropdown link for new version
- 0fd6b61 docs: add release notes for 0.29.0 (#8955)
- 115bf13 fix: remove duplicate permissions in rbac CLI output (#8972)
- cc2e9b4 chore: Bumpenvs for NGC+ images (#8975)
- 1a35e5d test: add e2e_tests for multirm k8s [RM-11] (#8926)
- 18154f6 chore: add type ResourcePoolName string (#8978)
- a22656d chore: remove panics from rm initialization (#8983)
- 4ae0987 chore: amend contributing doc to point to correct make rule, as of #2892 (#8947)
- 26c985c fix: Check auth validity before setting isAuthenticated (#8967)
- f67c473 fix: nil deref in ReadPreemptionStatus (#8979)
- 6e1acf4 chore: multirm unique resource pool config changes [RM-74] (#8974)
- ca29879 chore: add multirm router layer to rm module (#8963)
- 2395dcb fix: stopping states are not handled in restore properly [RM-69] (#8958)
- b06c923 feat: allow k8srm to connect with a kubeconfig (#8953)
- f8f860d chore: react-virtuoso LogViewer companion (#8862)
- bf21896 chore: revert multirm refactors (#8962)
- 4309f7f feat: Display resource managers information (#8951)
- 4d538ae test: remove last quarantined test (#8922)
- 68017dd ci: update performance test script for breaking Determined change (#8961)
- b2b85d7 chore: [RM-68] improve readability for unit test (#8950)
- e1ca242 feat: Connect ProjectStore with streaming updates (#8834)
- 191a144 fix: don't access agentState when it may be nil (#8921)
- dcaa893 fix: update default aux container limits and instance types (#8959)
- c7e5d43 docs: fix pre_publish check (#8957)
- 6ecd81e chore: update AMIs - Nvidia minor version bump (#8945)
- e108ed7 chore: set CGO_ENABLED=0 (#8941)
- 54aa739 chore: fix multirm unit test flake (#8949)
- e8b0165 chore: add resource manager name/metadata to resourcepoolv1 proto (#8948)
- a5b425a test: Add e2e test for streaming updates python client (#8901)
- 77d1ede fix: `no data plot` in chart with data (#8935)
- f416354 test: refactor usage within test_local (#8913)
- c3012ff chore: add multirm module to ResourceManager (#8857)
- d507edd test: CLI workflows in CI use new Python images (#8943)
- fa856ab chore: remove support for Python 3.7; prefer 3.8 (#7329)
- c404c8e fix: [RM-6] remove global max-slots-per-pod default when multiple RMs… (#8938)
- 0d61d15 build: bump up ci setup_remote_docker version (#8942)
- 60436ea chore: pin pandas and ray versions for ray tests (#8932)
- f997cd8 fix: malformed config with gcp up with --initial-user-password (#8936)
- 967e41f build: bump ci cpu image to latest ubuntu 2004 (#8940)
- 592a566 feat: streaming updates python client [MD-246] (#8778)
- 2dfc4f2 chore: remove unused constant (#8934)
- b99ad9f fix: det deploy gcp down shouldn't check quotas (#8931)
- 21fb6d1 fix: `det dev curl` support for URLs with curly brackets. (#8930)
- 225dba3 fix: specify go1.22.0 (#8929)
- 63adae5 fix: cli fails when listing providers [DET-10127] (#8903)
- 9e8cd68 fix: slurm launcher authenticates preemption notification (#8928)
- acded32 tc: Add release note 8851 (#8864)
- dffda27 chore: cover and bunify project functions in postgres_experiments.go (#8912)
- beac348 fix: SSO button link target (#8925)
- 5367f4f chore: add codeowners for resource-mgmt team files (#8879)
- feb73de tc: Remove broken link (#8924)
- cd88bb5 chore: revert pod spec and test changes (#8920)
- 1dfd6d9 chore: bump up ebs size to 400gb for genai deployments
- 3a3b668 fix: canonicalize master urls shim code (#8919)
- ef49195 test: fix failing Go TestResourceCreationFailed test (#8918)
- 94c7bfe chore: minor tweaks as modev takes over streaming updates (#8909)
- 392f054 ci: fix failing nightlies after auth PR (#8904)
- ad7d260 chore: fix mp.pool test_streaming_metrics_api (#8917)
- 8af4148 test: upload test results to datadog (#8910)
- aab9b42 test: remove redundant (and brittle) assertion (#8894)
- 59385a0 feat: log podspec [DET-9861] (#8899)
- 6857ecf chore: refactor ResourceManager interface for multirm (#8847)
- 9f40603 test: skip tests that need to get scheduler type (#8911)
- 4c50601 chore: upgrade Go from 1.21 -> 1.22 (#8914)