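Determined - Deep Learning Training Platform
Determined helps deep learning teams train models faster, share GPU resources easily, and collaborate effectively. It lets deep learning engineers focus on building and training models at scale, without worrying about DevOps or writing custom code for common tasks such as fault tolerance or experiment tracking.
determined-ci released this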
Changelog
6f6280e chore: bump version: 0.13.12rc2 -> 0.13.12
335057a chore: bump version: 0.13.12rc1 -> 0.13.12rc2
690c725 docs: Minor grammatical change for 0.13.12 release notes.
27758a4 chore: bump version: 0.13.12rc0 -> 0.13.12rc1
dcc5949 docs: Release notes for 0.13.12.
3ce5a95 chore: bump version: 0.13.12.dev0 -> 0.13.12rc0
9ef6fc6 chore: bump version: 0.13.11 -> 0.13.12.dev0
a19eb5f fix: relax type expectations for hparam values [DET-4742 DET-4744] (#1763)
798fa4f fix: add endTime to metric workloads [DET-4743] (#1772)
783c810 fix: use cookie token from sso if applicable (#1765)
Docker images
docker pull determinedai/determined-master:0.13.12
docker pull determinedai/determined-master:6f6280e7
docker pull determinedai/determined-master:6f6280e7807996eebe64df4e3503d1b08fc63c57
docker pull determinedai/determined-dev:determined-master-6f6280e7
docker pull determinedai/determined-dev:determined-master-6f6280e7807996eebe64df4e3503d1b08fc63c57
determined-ci released this
Changelog
16860f3 chore: bump version: 0.13.11rc6 -> 0.13.11
68015b7 chore: bump version: 0.13.11rc5 -> 0.13.11rc6
8fdc4bc Revert "chore: add priority scheduling"
42631e4 chore: bump version: 0.13.11rc4 -> 0.13.11rc5
03e6406 chore: add priority scheduling
2c299b2 chore: bump version: 0.13.11rc3 -> 0.13.11rc4
2f7207c chore: bump version: 0.13.11rc2 -> 0.13.11rc3
de6b50d fix: check for a cookie token and verify auth with it [DET-4732] (#1758)
a743bf4 fix: add boolean to accepted hparam types [DET-4727] (#1757)
75cec07 Release notes for 0.13.11 (#1759)
81b5780 chore: bump version: 0.13.11rc1 -> 0.13.11rc2
8098ee4 fix: don't request GPUs for Fluent Bit container (#1755)
73d7c62 chore: bump version: 0.13.11rc0 -> 0.13.11rc1
cb24990 chore: bump version: 0.13.11.dev0 -> 0.13.11rc0
bf60ee4 chore: bump version: 0.13.10.dev0 -> 0.13.11.dev0
31bb660 chore: lock api state for backward compatibility check
2242c2c chore: add F1 score example of pytorch custom reducers [DET-4724] (#1752)
b9a9408 chore: hack tensorboard support to include custom metrics (#1750)
ddf8ee9 feat: support custom reducers for PyTorch (#1647)
c7fdddf feat: support agent label for "det-deploy local agent-up" [DET-4713] (#1748)
0e240a4 feat: learning curve [DET-4445] (#1731)
6a9e731 style: sort interface keys and type literals (#1699)
1c818d8 test: fix race condition in test-intg-agent (#1741)
bab2337 chore: fix mnist_data_layer convergence test (#1744)
ef59afa ci: split CircleCI tests more carefully (#1740)
1947408 feat: enable configuring trial logs backend on Kubernetes (#1737)
f8b89bc ci: upgrade version of CircleCI Helm orb (#1739)
d1ece88 feat: rebase onto horovod 0.21.0 [DET-4668] (#1720)
883bc5d fix: command priority not respected [DET-4674] (#1735)
3f87b24 test: use updated GKE version (#1736)
c68b32b docs: add topic guide for priority scheduler [DET-4670] (#1703)
15794fa chore: log through Fluent Bit on Kubernetes [DET-4622] (#1712)
bc52313 chore: migrate getInfo endpoint [DET-4406] (#1713)
8ccb40d fix: webui table horizontal scroll [DET-4660] (#1722)
f991a04 feat: support multiple backward call per train_batch in pytorch [DET-4667] (#1732)
ecc7853 build: allow parallel runs of js and css checks (#1727)
f5e58c3 chore: update to Go 1.15 (#1716)
ac52ec8 fix: fix asha max concurrent trials (#1719)
eb76a2c fix: fix missing field in function call from bad merge (#1726)
73799c5 test: integration test fluent with postgres and elasticsearch backends (#1705)
1a1f756 fix: accept None-type hyperparameters with --local (#1704)
01cde0f chore: webui MultiSelect storybook [DET-4040] (#1714)
5cc385a docs: clarify some Kubernetes-related docs (#1708)
374e747 fix: BERT SQuAD example works with latest stable transformers [DET-4680] (#1718)
8aa6cd3 feat: support order by in trial logs api [DET-4647] (#1706)
96ca862 chore: update command APIs to return resource pool info [DET-4568 DET-4569 DET-4570 DET-4571] (#1710)
d71dfc4 style: fix mobile steps table [DET-4669] (#1701)
491b062 chore: provide telemetry information in new api [DET-4642] (#1672)
2faade7 chore: priority scheduler unit tests [DET-4513] (#1658)
cc491f3 chore: migrate trial details endpoint [DET-4021] (#1674)
b3d70e2 chore: set MinCapacity for RDS for secure det-deploy to 2 (#1535)
7127d26 chore: Updates to 0.13.10.dev0. (#1702)
aa1389e feat: hp viz skeleton [DET-4494, DET-4495, DET-4545] (#1618)
8b35c38 test: integration test elastic-backend trial logs APIs (#1675)
d5a913d chore: restart fluentbit on failures [DET-4665] (#1696)
194af45 chore: add resource pools mock api [DET-4639] (#1662)
b1c2930 chore: add dev hgi cluster page and stat overview [DET-4633] (#1676)
7ed21d4 style: correct mobile viewport [DET-4664] (#1695)
d9add13 feat: port of DETR (#1470)
39b6d12 chore: minor copy update to trial log datetime filters (#1692)
f0817a8 chore: hide the trial logs filters when there are no filter options (#1690)
f013ea3 chore: camelcase required api attr names [DET-4648] (#1681)
c4c6694 chore: fix a shadow var declaration (#1691)
4b0cf95 fix: prevent mobile tabbar from opening new window for Master Logs [DET-4654] (#1688)
72cbe09 docs: document setting priorities in experiment config (#1687)
919d9cf chore: sort trial log's filter options (#1686)
a29ccd6 fix: add abort controller to trial log endpoints (#1689)
db1b7ca chore: move dev dependencies out of dependencies (#1679)
5c69af3 fix: add K8s disclaimer for mmdetection example (#1683)
27a67b6 ci: fix windows cli test (#1685)
Docker images
docker pull determinedai/determined-master:0.13.11
docker pull determinedai/determined-master:16860f3f
docker pull determinedai/determined-master:16860f3fd2495af53913a9a62d7330898004b671
docker pull determinedai/determined-dev:determined-master-16860f3f
docker pull determinedai/determined-dev:determined-master-16860f3fd2495af53913a9a62d7330898004b671
determined-ci released this
Changelog
3f0ec0c chore: bump version: 0.13.9rc4 -> 0.13.9
149a461 docs: Release notes for 0.13.9 (#1623)
06f56dd chore: bump version: 0.13.9rc3 -> 0.13.9rc4
0b35cea chore: bump version: 0.13.9rc2 -> 0.13.9rc3
bb8be81 docs: More changes for release notes for 0.13.9.
cbcc7b1 feat: support per command shmSize settings [DET-4577] (#1620)
5490862 chore: bump version: 0.13.9rc1 -> 0.13.9rc2
d7b20a3 fix: tensorboard to load from experiment list via table batch (#1617)
a76e0cf chore: bump version: 0.13.9rc0 -> 0.13.9rc1
d1992c3 chore: bump version: 0.13.9.dev0 -> 0.13.9rc0
fa24de5 docs: Release notes for 0.13.9.
c96d738 chore: include offset in trial log IDs returned to webui [DET-4561] (#1608)
7fac747 chore: bump version: 0.13.8 -> 0.13.9.dev0
5264be3 docs: Release notes for 0.13.8. (#1603)
Docker images
docker pull determinedai/determined-master:0.13.9
docker pull determinedai/determined-master:3f0ec0ce
docker pull determinedai/determined-master:3f0ec0ce8d9dbe4256a630156480661f3a8c2ff1
docker pull determinedai/determined-dev:determined-master-3f0ec0ce
docker pull determinedai/determined-dev:determined-master-3f0ec0ce8d9dbe4256a630156480661f3a8c2ff1
determined-ci released this
Changelog
35cd77a chore: bump version: 0.13.6rc1 -> 0.13.6
84c3513 chore: bump version: 0.13.6rc0 -> 0.13.6rc1
7b3ade0 docs: Clear out release note candidates.
186e348 fix: show more help text and version info in det-deploy [DET-4373] (#1443)
4b1cdd0 docs: release notes for 0.13.6. (#1444)
88c9cc6 chore: bump version: 0.13.6.dev0 -> 0.13.6rc0
68bdba5 fix: correct humanReadableFloat error on Experiment Detail page [DET-4354] (#1431)
de4ec0a feat: add opentracing to actor system [DET-4212] (#1327)
c9bb9b2 docs: add docs for TLS usage and configuration [DET-4364] (#1419)
f6e36fe feat: support storageClass configuration in Helm Chart [DET-4357] (#1434)
5f34b58 fix: webui metric chart not displaying log scale properly [DET-4246] (#1418)
588cb70 chore: add telemetry for k8s vs. agents [DET-4234] (#1411)
689b06f chore: update horovod version (#1413)
e145d89 make AWS and GCP agent image optional (#1417)
c608fd1 ci: update gke version (#1424)
f0289b5 fix: webui preserve colors on metrics chart [DET-4247] (#1400)
70d3ae1 fix: make webui chart legend transparent to show data behind [DET-4218] (#1420)
a6cb9be docs: update to new version of custom Sphinx theme (#1414)
153131f fix: don't fail master if restoring non-terminal exp from DB [DET-4074] (#1397)
69afa95 chore: webui add experiment label editing on experiment detail page [DET-3972] (#1356)
e23d113 chore: make make -C proto build idempotent (#1390)
bec697f fix: support for --local-state-path with det-deploy gcp [DET-4277] (#1402)
eb07743 fix doc for agent starting period and idle timeout (#1416)
bfcb874 feat: support configuring CPU and Mem reqs for DB in helm chart [DET-4032] (#1412)
2564555 chore: handle failed build in update bumpenvs script (#1395)
92de5cd chore: remove mixed mode TLS workaround from the agent (#1410)
dbce507 fix: always load default system TLS certificates in the harness (#1409)
1dd949f chore: add a formatter for protobufs (#1405)
2bd7e3a fix: webui trial info checkpoint size label update [DET-4250] (#1401)
a867df3 fix: use the correct target cancel state for canceling experiment [DET-4257] (#1404)
ed51d5a docs: bundle static swagger-ui with docs [DET-4210] (#1376)
53d974e chore: fix dependency issue for windows tests (#1406)
6c9e941 chore: add swagger authentication spec [DET-4272] (#1396)
4450b45 chore: include workloads in trial endpoint [DET-4036] (#1342)
8c8037f chore: add post experiment swagger spec (#1363)
46643b2 chore: rebase onto horovod 0.20.0 [DET-4225] (#1388)
9028be2 chore: bump version: 0.13.5.dev0 -> 0.13.6.dev0
3361c3b docs: get rid of staged release notes for 0.13.5, in preparation for the next release.
876833d docs: Release notes for 0.13.5. (#1392)
6701cd7 chore: set default startup in det-deploy to 20m (#1394)
c20deab chore: bump tf test versions (#1378)
5bf7eea chore: introduce resource pool and resource manager [DET-4131,DET-4136] (#1365)
c767f02 ci: update from deprecated remote docker versions [DET-4262] (#1393)
89a5906 fix: update agents context polling to block before next poll [DET-4264] (#1385)
5d2d4bc fix: det-deploy deprovisions GCP agents despite long master names [DET-4271] (#1391)
198d64e fix: experiment chart legend labelling line as 'trace 0' (#1389)
1926250 feat: increase max disconnected and idle period [DET-4267] (#1386)
38b0c95 fix: commands (TensorBoards, notebooks, etc.) should not be preempted [DET-4157] (#1346)
20a7bc7 docs: fix broken links (#1387)
68f3568 docs: add a tf.layers-in-Estimator example (#1383)
4d0ba2f feat: don't log through agent 0 [DET-4180] (#1344)
f1ff54e chore: fix typo in a docstring (#1384)
915fb50 fix: update percent utility to handle out of range numbers (#1381)
c8ee63e chore: fix possible syntax error when parsing experiment labels request [DET-4265] (#1382)
346bcc4 chore: fix typo in helm chart (#1379)
8d64cf9 fix: don't accept stale socket connections [DET-4203] (#1367)
5f4c490 fix: webui tweak select for better layout [DET-4123] (#1351)
693098e fix: webui trial chart render metrics with same name [DET-4169] (#1350)
Docker images
docker pull determinedai/determined-master:0.13.6
docker pull determinedai/determined-master:35cd77a2
docker pull determinedai/determined-master:35cd77a202dfe084a5c9655e8291f14c1a1c14a8
docker pull determinedai/determined-dev:determined-master-35cd77a2
docker pull determinedai/determined-dev:determined-master-35cd77a202dfe084a5c9655e8291f14c1a1c14a8
determined-ci released this
Changelog
df113a3 chore: bump version: 0.13.3rc1 -> 0.13.3
1a76cad docs: More release notes for 0.13.3.
edafbd8 chore: bump version: 0.13.3rc0 -> 0.13.3rc1
fc046f1 docs: More release notes for 0.13.3.
ee14ebf fix: filter out non numeric metric values in WebUI [DET-4078] (#1258)
3838584 fix: match column key for experiment list name column for taglist renderer (#1263)
7b1a357 docs: Release notes for 0.13.3.
a636057 chore: bump version: 0.13.3.dev0 -> 0.13.3rc0
afaf7c9 chore: bump version: 0.13.2 -> 0.13.3.dev0
1cffc44 ci: add mypy and ci coverage to det-deploy local [DET-4089] (#1251)
6c7483a fix: add missing cluster_name for det-deploy local (#1249)
Docker images
docker pull determinedai/determined-master:0.13.3
docker pull determinedai/determined-master:df113a37
docker pull determinedai/determined-master:df113a3709f4ec3fc563a82f2a5c18692e7b107f
docker pull determinedai/determined-dev:determined-master-df113a37
docker pull determinedai/determined-dev:determined-master-df113a3709f4ec3fc563a82f2a5c18692e7b107f
determined-ci released this
Changelog
44461a7 chore: bump version: 0.13.1rc1 -> 0.13.1
8c97f68 docs: release notes for 0.13.1 (#1210)
dfb6686 chore: bump version: 0.13.1rc0 -> 0.13.1rc1
ff1eb49 chore: upgrade and apply black version
bf2c24d chore: bump version: 0.13.1.dev0 -> 0.13.1rc0
40eeea3 chore: bump version: 0.13.0 -> 0.13.1.dev0
c7b3e45 fix: fix loading tensorboard from prior versions [DET-4008] (#1201)
d30f3ff fix: fix prior_batches_processed backfill migration [DET-4006] (#1200)
1483598 fix: fix task openability check and a trial detail decoder mismatch [DET-3924 DET-3937] (#1148)
Docker images
docker pull determinedai/determined-master:0.13.1
docker pull determinedai/determined-master:44461a75
docker pull determinedai/determined-master:44461a75b0a5b2b4d66c009271dcb2da90a20481
docker pull determinedai/determined-dev:determined-master-44461a75
docker pull determinedai/determined-dev:determined-master-44461a75b0a5b2b4d66c009271dcb2da90a20481
determined-ci released this
Changelog
5993e2e chore: bump version: 0.12.11rc2 -> 0.12.11
ba62016 chore: bump version: 0.12.11rc1 -> 0.12.11rc2
b03e21c fix: update examples link (#845)
cd2e4dd chore: add response headers to bust cache for elm and react index.html (#847)
b309be9 chore: bump version: 0.12.11rc0 -> 0.12.11rc1
1746c44 chore: link react trial logs for improved rendering performance [DET-3530] (#834)
a7b4c25 feat: react trial logs [DET-3128] (#830)
e1171b6 chore: bump version: 0.12.11.dev0 -> 0.12.11rc0
dad64cc ci: update webui e2e tests to kill experiment instead of cancel (#835)
6920dbd feat: add allgather_metrics to EstimatorContext (#826)
8f45512 test: add nightly test for pytorch flexible primitive example [DET-3534] (#827)
e501ea0 feat: experiment list filter [DET-2999, DET-3000] (#796)
a1b494e feat: model versions endpoints [DET-3478] (#822)
21fb956 fix: fix an issue with cluster resource computation [DET-3509] (#832)
1c5d151 feat: added cli logging to native (#833)
6843c8f feat: add unets tf.keras example [DET-3397] (#825)
a9d7007 feat: clean up swagger spec (#823)
afc6e3f revert: added cli logging to native. (#821)
3d42608 docs: add example for Pytorch flexible primitives [DET-3202] (#778)
0c99a0c feat: added cli logging to native [DET-3316] (#788)
9918ae1 refactor: dissolve experiment table and task table (#791)
f40b75e docs: improve docs for graceful trial termination (#809)
e9721d0 fix: correct the active task counter on dashboard [DET-3510] (#804)
3f599df docs: add warning for max_slots [DET-3145] (#814)
d9ed73f feat: add preview search to new API (#813)
523b6b8 style: update master logs [DET-3471] (#793)
be5e99f feat: add experiments details page and endpoint [DET-3003] (#795)
6c59f30 Revert "feat: add preview search to new API (#777)" (#812)
d7d9176 fix: update the comment reference. (#802)
f44b527 feat: add preview search to new API (#777)
5fbb4a3 feat: support follow flag in trial logs (#810)
0c1dcd3 chore: bump version: 0.12.10.dev0 -> 0.12.11.dev0
d44261e fix: don't set Segment key to quotes (#803)
ee08c46 docs: update docs for estimator callbacks [DET-3461] (#800)
1836016 feat: support Pytorch multiple optimizers and LR schedulers [DET-3194, DET-3195, DET-3196, DET-3197, DET-3198] (#807)
ef1406c ci: ensure all release jobs have the proper filters (#805)
fad2ffd revert: support Pytorch multiple optimizers and LR schedulers (#806)
b860646 feat: support Pytorch multiple optimizers and LR schedulers [DET-3194, DET-3195, DET-3196, DET-3197, DET-3198] (#707)
2ae78e1 docs: release notes for 0.12.10 (#786)
7c51a47 docs: improve shared fs checkpoint exporting documentation. [DET-3392] (#797)
ad61ccd fix: retry if upload fails with requests.exceptions.ConnectionError [DET-3358] (#792)
7aea74c chore: log failed trial's trial logs when experiment succeeds [DET-3501]
6131fc8 fix: check for analytics library (#794)
a6f4114 feat: model registry create CLI (#787)
2e10c64 feat: task list batch [DET-3224] (#780)
d3f27c3 chore: refactor master to send batches in RUN_STEP [DET-3253] (#704)
5e48ae8 ci: remove cypress logs (#763)
498044c chore: point cluster and master logs routes to react (#757)
14396a1 fix: fix broken docs examples link [DET-3462] (#785)
e18fc6a feat: model registry describe and list CLI (#781)
89e8fb0 feat: task list search [DET-3222] (#768)
fcbeec4 fix: add missing sort-fix eslint plugin (#775)
ce31e56 style: update task table styles (#773)
d965527 build: swap wget for curl and add it as a dependency (#784)
e629043 build: add a missing dependency step (#783)
06d0850 feat: generate and use swagger typescript client [DET-3249 DET-3324 DET-3355] (#691)
Docker images
docker pull determinedai/determined-master:0.12.11
docker pull determinedai/determined-master:5993e2e
docker pull determinedai/determined-master:5993e2e0b866d8b4123bc8361d29fd5baa212756
docker pull determinedai/determined-dev:determined-master-5993e2e
docker pull determinedai/determined-dev:determined-master-5993e2e0b866d8b4123bc8361d29fd5baa212756
determined-ci released this
Changelog
c8497c6 chore: bump version: 0.12.8rc0 -> 0.12.8
cd5a66e chore: bump version: 0.12.8.dev0 -> 0.12.8rc0
60cc187 chore: bump version: 0.12.7 -> 0.12.8.dev0
5909230 chore: bump task environments version (#695)
01e56a5 docs: add explanation of det-nobody user (#686)
c7533a0 fix: Fix typo in terraform files for max_agent_starting_period (#685)
97a26a2 docs: document on_trial_close estimator hook (#683)
Docker images
docker pull determinedai/determined-master:0.12.8
docker pull determinedai/determined-master:c8497c6
docker pull determinedai/determined-master:c8497c6bde3bdc7121d3a2071e88814153a61555
docker pull determinedai/determined-dev:determined-master-c8497c6
docker pull determinedai/determined-dev:determined-master-c8497c6bde3bdc7121d3a2071e88814153a61555
determined-ci released this
Changelog
d9a2cdc fix: avoid saving pytorch model architecture (#594)
071b3eb docs: release notes for 0.12.5 (#595)
be835fd feat: use str instead of pathlib.Path in checkpoint callbacks
4b1c8ca fix: enable logging with --local --test mode (#589)
0978df0 docs: add documentation for EstimatorTrial callbacks
2e1eb39 feat: add callbacks to EstimatorTrial
52d50c9 fix: fix off-by-one error in master logs
fc27be0 feat: auto focus the username field on page load (#580)
9475664 docs: tweak advice on mounting file systems with cloud deployments
49ae381 docs: use "distributed training" to mean any kind of multi-GPU training
9e5506d feat: support hooks in EvalSpec when using TF Estimators
f975690 docs: add documentation for pytorch callbacks
9ab8984 feat: add timeout to tensorboard startup
354c744 fix: fix startup-hook.sh for tensorboard-entrypoint.sh
69999cd docs: evaluation functions should return JSON-serializable metrics
e0ec9f1 feat: support PyTorch on_validation_end callbacks with multi-GPU
bd569b6 feat: update to nccl 2.6.4 and fix multi-machine dtrain (#564)
1b79ec1 feat: add meta-learning example using protonets (#527)
b003bc7 docs: re-organize model definitions docs
7b82073 docs: package-based install documentation improvements
b0f075d feat: bump YogaDL version to 0.1.1
ae8e192 feat: make TF 2.2 the default TF2 version
253fd74 feat: support TF 2.2
4f99bb9 feat: add multiple lr schedulers example
085114e fix: pass s3 endpoint url to tensorboard process
2cf584c fix: set default http method name to GET (#516)
a7a3221 feat: add on_checkpoint_end PyTorch callback
5fa952e feat: add user with password and user without password to tests
5c8bf4c fix: enable OSX interactive session during det user create
9c08cc6 feat: support arbitrary login redirect routes (#522)
47ec6cc feat: checkpoint load uses trial to retrieve checkpoints
f14f769 feat: add model code to checkpoints
8623fea feat: add ability to load a trial class locally
f1b30a4 feat: support --csv, --json to slot list and agent list in CLI
9a707ab docs: add CONTRIBUTING guidelines
bdf83f8 fix: don't support AMP w/ aggregation_frequency > 1
13078b7 docs: update Users docs for auto-login removal [DET-2992] (#532)
25ffffe fix: re-sync package-lock.json to package.json for react
87dc544 docs: various fixes
208ee11 feat: add imagenet NAS architecture (#378)
6c185a2 feat: add container count to "det agent list"
ebf3628 feat: make Link component to be based on HTML anchor tag
2300788 docs: release notes for v0.12.4 (#520)
7668874 fix: update login docs link
74b0fb4 feat: add support for presenting the authentication token
b659f72 docs: update reference docs to include PyTorchTrialContext
4c55a90 feat: add a PyTorchContext with ability to access model, optimizer, lr scheduler
f2726c6 feat: add generic callback support to PyTorch trials
Docker images
docker pull determinedai/determined-master:0.12.5
docker pull determinedai/determined-master:35e75b5
docker pull determinedai/determined-master:35e75b5c2fa2241f2ecdccbdc58634d107234377
determined-ci released this
Changelog
65b1e51 chore: bump version: 0.12.3rc3 -> 0.12.3
4ddfab4 fix: add logic in agent to invoke docker credentials helpers
dcd4d86 docs: add 0.12.3 release notes
c6ca487 fix: fix loading logic for tensorpacktrial when backbone is set
f2e5702 chore: bump version: 0.12.3rc2 -> 0.12.3rc3
17818fc fix: remove cd from startup-hook.sh in bert_glue_pytorch
d6f429e fix: fix make -C agent clean
ef74363 chore: bump version: 0.12.3rc1 -> 0.12.3rc2
ffc4544 feat: remove container_path from shared_fs configuration documentation
dab6de7 fix: wrong postgres query in restart
22e7be9 chore: better logs --tail default
87bcc70 fix: fix FasterRCNN example
4aeeca1 fix: install determined before using it
f502935 fix: update imports in nas
9d5df9d fix: add specificity to css rules to avoid override sub styles
85c6c1e fix: add missing order for steps in det t describe
a8ee76e chore: fix determined-deploy publish
52723b0 feat: reintroduce trial endpoints
6deafa9 fix: add script to run inside agent's container
a352a21 docs: reorganize main install page
2743056 chore: bump version: 0.12.3rc0 -> 0.12.3rc1
50f452c docs: add a topic guide for effective distributed training
2e6c826 docs: cleanup docstrings
e94e865 chore: bump version: 0.12.3.dev0 -> 0.12.3rc0
e3de26c docs: improve docs around context_dir argument
42b3223 feat: improve experiment configuration handling for LOCAL mode
6e29449 feat: add --local support to cli
5873205 chore: update company name in license files
66da4e9 docs: fix typo in quickstart
7aea9c8 docs: document Docker daemon socket bind mount requirement for agents
fc591c1 fix: display progress bar for image pulls
04edcca docs: add HP and dtrain to quick-start
154e3af chore: change mode of trial entrypoint to 744
a900609 chore: make PyTorch checkpoint code path open to all
97d127e fix: properly restore stopping experiments
defe9ec fix: incorrect shared_fs path for tensorboard if storage_path is set
cbfe85c docs: do some miscellaneous copy editing
ff6f7aa docs: use better code block markup in CLI reference page
1f67d8f docs: fix link to docs on Chinese AWS site
67c6668 docs: edit doc page on users
df2b3b8 docs: remove outdated reference to single-file model definitions
4175ea9 docs: fix unintentionally separated lists
edad1a7 docs: fix unintentional definition lists
23c444f docs: add attributions for master and agent
2203869 feat: add restart policy for fault-tolerance
673007a feat: re-revert add Determined object for consolidating authentication
daf6244 feat: usability messages for det-deploy aws
e7ece2a fix: correctly initialize LRSchedulers for multi-gpu training
c35cb1b fix: replace get_lr() with get_last_lr() in LRScheduler
62beacf fix: correctly call step() for LRScheduler when using epochs
c3cadc3 fix: revert add Determined object for consolidating authentication
b8ddc3d feat: add Determined object for consolidating authentication
1d5e3fe feat: make pending commands killable
c82925b feat: exit log tailing based on state and time rather than polls
18ae0e5 docs: add experimental warnings to native tutorial
bbd15ad chore: delegate auth failure handling to error handler
1daa6e6 docs: release notes for version 0.12.2
b953623 fix: clean up logic around commands
2f97381 docs: add a Native API tutorial
d84de8d chore: use pypi to install yogadl
54a3f48 chore: bump version: 0.12.2 -> 0.12.3.dev0
f8a81c0 feat: introduce REST API builder and refactor existing APIs
f0b996c fix: update modal to support flexible height and auto scrolling
e530ad7 chore: add PyTorch object detection example
89c530f feat: sort user selection option to keep the authenticated user on top
2979949 docs: update docs to move native apis to experimental
e9ee930 chore: update native examples to use experimental namespace
df3e0ea chore: fix experiment config comment typos in examples
6f99e07 chore: fix broken and outdated reference links in TF Keras example
bc9ce53 feat: recover notebook state on agent failure
52fb275 chore: shift naming of test and submit to local and cluster
b38bdef fix: grab back control after loading native implementation
69af009 chore: support .dev and rc tags in determined version
1c4be08 chore: bump version: 0.12.1 -> 0.12.2
be3b295 chore: disambiguate logs in workload_sequencer
961798c chore: standardize official example experiment configurations
dfd54ce docs: update README.md for recent docs reorg
515afe4 chore: bump environment images
6667a87 fix: respect shared_fs storage_path configuration in tensorboard
65c3b2e docs: minor fixes for quick start
acabe33 docs: minor fixes for tutorials
bbab37d docs: revert quick-start to use tarball
8a42748 feat: expose container failure reason in trial logs
d5792ad docs: tweak tutorial text around downloading model code
26a1e0b fix: fix data config for MNIST PyTorch distributed example
0c02269 chore: update environments
1eee210 fix: add missing arguments to GraphQL schema update command
d12ceb6 fix: don't automatically import tf with determined
8798035 chore: update generated GraphQL files for function rename
6ecb84e fix: move a new SQL migration after all previous ones numerically
1d014b8 fix: use per-slot batch size in tf keras Native
514a2c6 chore: move calculating batch sizes to EnvContext
c1c5f0b fix: use a default session when initializing TFKerasTrialController
64c9a32 chore: add more logging to harness
9a70a74 feat: add experiment class for querying top checkpoints
2af7f2a feat: add experiment level best checkpoint by metric
4435d7d chore: mount master config file rather etc root
8fb768d fix: add det command to PATH
dcea476 fix: update old references to "mnist_tf_keras"
b44abe6 docs: update CONTRIBUTING.md
dbefe18 chore: add JetBrains IDE config folder to .gitignore
6573bac chore: bump and limit Cypress version to 4.3.x
d1c3e2b fix: fixture-down before fixture-up
4e52149 docs: update local deploy with new commands
60c108e fix: use correct hvd size function
775f24e docs: add documentation for experimental contexts
5c26ca7 docs: add examples using data layer
65c09ff feat: support data layer in harness
9fcefee feat: bind mount data layer paths
cc1a5eb feat: add data layer config
c21aab8 docs: add command to speed up experiments
17ccd19 docs: fix tutorials typos
Watchers: 45 | Stars: 1155 | Forks: 136 | Created: 2020-04-08 00:12:29 | Last commit: yesterday | License: Apache-2.0
Changelog
875429b chore: bump version: 0.14.1rc2 -> 0.14.1
cbf814f chore: bump version: 0.14.1rc1 -> 0.14.1rc2
0c229de fix: make trial log timestamp filters backwards compatible (#1944)
461288d chore: bump version: 0.14.1rc0 -> 0.14.1rc1
a695609 chore: bump version: 0.14.1.dev0 -> 0.14.1rc0
e9d51cd chore: bump version: 0.14.0 -> 0.14.1.dev0
6a90217 docs: Release notes for 0.14.1.
3e00128 fix: add backwards compatibility for logs before 0.13.8 (#1942)
db67b27 docs: More changes to release notes for 0.14.0. (#1927)
Docker images
docker pull determinedai/determined-master:0.14.1
docker pull determinedai/determined-master:875429b1
docker pull determinedai/determined-master:875429b1b96bedcdd0a15bbb5f40a1957e00ee6e
docker pull determinedai/determined-dev:determined-master-875429b1
docker pull determinedai/determined-dev:determined-master-875429b1b96bedcdd0a15bbb5f40a1957e00ee6e