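Cortex: a machine learning platform for developers, for building machine learning applications faster
Cortex is an open source platform for building, deploying, and managing machine learning applications in production. It is designed for any developer who wants to build machine-learning-powered services without having to worry about infrastructure challenges such as configuring data pipelines, continuous deployment, and dependency management.
v0.32.0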
New features
- Add gRPC support to realtime APIs (docs) #1997 #1056 (RobertLucian)
- Add support for ONNX and TensorFlow predictor types in async APIs (docs) #1996 #1980 (miguelvr)
- Support using ECR images from other AWS accounts and regions #2011 #1988 (vishalbollu)
Breaking changes
- GCP support has been removed so that we can focus our efforts on improving the scalability, reliability, and security of Cortex on AWS. Cortex on GCP will still be available in v0.31. If you are currently using Cortex on GCP, our team will be happy to help you migrate to AWS or work with you to find alternative solutions. If you're interested, please reach out to us on Slack or email us at hello@cortex.dev.
Bug fixes
- Fix memory plots on Grafana dashboards for realtime and batch APIs #2024 #2014 #1970 (RobertLucian)
Docs
- Misc docs improvements #1994 (ospillinger)
Misc
v0.31.1
Bug fixes
- Preemptible node pools on GCP aren't autoscaling #1981 (vishalbollu)
- Replica autoscaler targets incorrect deployments on operator restart #1982 (miguelvr)
- Replica autoscaler is not reinitialized for running APIs on operator restart on GCP #1984 (vishalbollu)
vishalbollu released this
v0.31.0
New features
- Add support for AsyncAPI (experimental) (docs) #1935 #1610 (miguelvr)
- Add support for multi-instance-type clusters to AWS/GCP providers (experimental) (aws/gcp docs) #1951 (RobertLucian)
- Allow users to duplicate/mirror traffic using shadow pipelines #1948 #1889 (docs) (vishalbollu)
Breaking changes
- on_demand_backup in cluster configuration has been removed in favour of using a cluster with a mixture of spot and on-demand nodegroups. See the multi-instance documentation for AWS and GCP for more details.
Bug fixes
- Fix Python client not respecting CORTEX_CLI_CONFIG_DIR environment variable for client-id.txt #1953 (jackmpcollins)
- Prevent threads from being stuck in DynamicBatcher #1915 (cbensimon)
- Fix unexpected cortex logs termination by increasing buffer size #1939 (vishalbollu)
- Decouple cluster deletion from EBS volume deletion for cortex cluster down #1954 (deliahu)
- Fix spot/on-demand GPU instances not joining the cluster by upgrading to eksctl 0.40.0 #1955 (vishalbollu)
- Prevent premature queue not found errors by preserving the SQS queue for some minutes after the job has completed #1952 (vishalbollu)
Docs
- Update docs #1949 (ospillinger)
Misc
- Configure a default cortex client to manage APIs from within cortex workloads #1942 #1644 (RobertLucian)
- Save batch metrics to cloud to preserve job metrics history #1940 (vishalbollu)
v0.30.0
New features
- Record custom metrics from predictors and view them in Grafana (docs) #1910 #1897 (miguelvr)
- Add granular pod metrics to the Grafana dashboards #1905 (RobertLucian)
- Add node metrics to Grafana dashboards #1900 (miguelvr)
Breaking changes
- Remove support for installing Cortex on your own Kubernetes Cluster #1921 (RobertLucian)
Bug fixes
- Fix bug where successfully completed jobs were marked as completed with errors #1913 (vishalbollu)
- Fix bug where batch jobs were being terminated unnecessarily #1917 (vishalbollu)
- Prevent cluster autoscaler from reallocating job pods #1919 (vishalbollu)
- Address AWS cluster up quota issues such as not enough NAT Gateways or EIPs #1912 (RobertLucian)
- Delete unused prometheus volume on cluster down #1863 (miguelvr)
- Create .cortex dir if not present #1909 (RobertLucian)
Docs
Misc
- Allow specifying paths for requirements.txt, conda-packages.txt & dependencies.sh (docs) #1896 #1927 #1777 (miguelvr)
- Log relevant kubernetes events to API specific log streams #1906 #833 (miguelvr)
- Support credentials using AWS_SESSION_TOKEN with the CLI/Client (docs) #1908 #1920 #1134 #1865 (vishalbollu)
- Provide auth to Operator and APIs by attaching IAM policies to the cluster (docs) #1908 #1858 (vishalbollu)
v0.29.0
New features
- Add Grafana dashboard for APIs (docs) #1867 #1885 #1890 #1887 (miguelvr)
- Support API autoscaling in GCP clusters (docs) #1814 #1879 #1601 (miguelvr)
- Support traffic splitting in GCP clusters (docs) #1892 #1660 (miguelvr)
Breaking changes
- The default Docker images for APIs have been slimmed down to not include packages other than what Cortex requires to function. Therefore, when deploying APIs, it is now necessary to include the dependencies that your predictor needs in requirements.txt (docs) and/or dependencies.sh (docs).
Bug fixes
- Disable dynamic batcher for TensorFlow predictor type #1888 (miguelvr)
- Support empty directory objects for models saved in S3/GCS #1830 #1829 (RobertLucian)
- Fix bug which prevented Task APIs on GCP from being cleaned up after completion #1871 (RobertLucian)
Docs
- Add documentation for using a version of Python other than the default via dependencies.sh (docs) or custom images (docs) #1862 #1779 (RobertLucian)
Misc
- Support deploying predictor Python classes from more environments (e.g. from separate Python files, AWS Lambda) #1883 3a1b777 #1824 #1826 (vishalbollu)
- Improve error logging for Batch and Task APIs #1866 #1833 (RobertLucian)
v0.28.0
New features
- Support installing Cortex on an existing Kubernetes cluster (on AWS or GCP) (docs) #1837 #1808 (vishalbollu)
Breaking changes
- The cloudwatch dashboard has been removed as a result of our switch to Prometheus for metrics aggregation. The dashboard will be replaced with an alternative in an upcoming release.
Bug fixes
- Fix bug which can cause requests to APIs from a Python client to time out during cluster autoscaling #1841 #1840 (RobertLucian)
- Fix bug which can cause downscale_stabilization_period to be disregarded during downscaling #1847 #1846 (RobertLucian)
Misc
- AWS credentials are no longer required to connect the CLI to the cluster operator. If you need to restrict access to your cluster operator, configure the operator's load balancer to be private by setting operator_load_balancer_scheme: internal in your cluster configuration file, and set up VPC Peering. We plan to support a new auth strategy in an upcoming release.
- Improve S6 error code/signal handling #1825 #1703 (RobertLucian)
v0.27.0
New features
- Add new API type TaskAPI for running arbitrary Python jobs (docs) #1717 #253 (miguelvr, RobertLucian)
- Write Cortex's logs as structured logs, and allow use of Cortex's structured logger in predictors (supports adding extra fields) (aws docs, gcp docs) #1778 #1803 #1804 #1732 #1563 (vishalbollu)
- Support preemptible instances on GCP (docs) #1791 #1631 (RobertLucian)
- Support private load balancers on GCP (docs) #1786 #1621 (deliahu)
- Support GCP instances with multiple GPUs (docs) #1789 #1784 (deliahu)
Breaking changes
- cortex logs now streams logs from a single replica at random when there are multiple replicas for an API. The recommended way to analyze production logs is via a dedicated logging tool (by default, logs are sent to CloudWatch on AWS and StackDriver on GCP).
Bug fixes
- Misc Python client fixes #1798 #1782 #1772 (vishalbollu, RobertLucian)
Docs
- Document the shared /mnt directory for TensorFlow predictors #1802 #1792 (deliahu)
- Misc GCP docs improvements #1799 (deliahu)
Misc
- Improve out-of-memory status reporting (RobertLucian)
- Improve batch job cleanup process #1797 #1796 (vishalbollu)
- Remove grpc msg send/receive limit #1769 #1740 (RobertLucian)
v0.26.0
New features
- Support configuring the log level for APIs (docs) #1741 #1484 (RobertLucian)
- Support creating a cluster in an existing AWS VPC (docs) #1759 #1142 (deliahu)
- Support specifying the GCP network and subnet for the Cortex cluster (docs) #1752 #1738 (deliahu)
- Support configuring shared memory size (shm) for inter-process communication (docs) #1756 #1638 (vishalbollu)
Breaking changes
- The local provider has been removed. The best way to test your predictor implementation locally is to import it in a separate Python file and call your __init__() and predict() functions directly. The best way to test your API is to deploy it to a dev/test cluster.
- Built-in support for API Gateway has been removed. If you need to create an https endpoint with valid certs, some options are to set up a custom domain or to manually create an API Gateway.
- Prediction monitoring has been removed. We are exploring how to build a more powerful and customizable solution for this.
- The predict CLI command has been deleted. curl, requests, etc. are the best tools for testing APIs.
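The local testing approach recommended above (import the predictor and call __init__() and predict() directly) could look roughly like the sketch below. The PythonPredictor class, its threshold config key, and the payload shape are all hypothetical stand-ins; in a real project the class would be imported from your own predictor file rather than defined inline.

```python
# Hypothetical predictor used only for illustration. In a real project this
# class would live in its own file and be imported here instead.
class PythonPredictor:
    def __init__(self, config):
        # A fixed threshold stands in for loading a real model.
        self.threshold = config.get("threshold", 0.5)

    def predict(self, payload):
        # Classify a single score against the configured threshold.
        label = "positive" if payload["score"] >= self.threshold else "negative"
        return {"label": label}


def run_local_check():
    # Call __init__() and predict() directly, with no cluster involved.
    predictor = PythonPredictor(config={"threshold": 0.7})
    return predictor.predict({"score": 0.9})


if __name__ == "__main__":
    print(run_local_check())
```

A check like this only exercises the predictor logic; it does not cover request parsing, dependencies installed in the API image, or networking, which is why deploying to a dev/test cluster is still suggested for testing the API itself.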
Bug fixes
- For multi-model APIs, allow model names to share a prefix #1745 #1699 (RobertLucian)
Docs
- Misc docs improvements (ospillinger)
v0.25.0
New features
- Support server-side micro batching for the Python predictor (docs) #1653 #1382 (miguelvr)
- Add timeout configuration for batch jobs (docs) #1712 #1324 (vishalbollu)
- Support batch retries (docs) #1713 #1540 (lapaniku, vishalbollu)
- Support sending failed batches to a dead-letter queue (docs) #1713 #1541 (lapaniku, vishalbollu)
- Support installing the cortex Python client in predictors #1709 #1670 #1206 (RobertLucian)
Breaking changes
- The predictor.model_path field of the realtime API configuration has been moved to predictor.models.path. In addition, for the Python predictor type, predictor.models has been renamed to predictor.multi_model_reloading. Here is the entire API configuration schema.
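As a rough illustration of the field moves described in this breaking change, the sketch below rewrites an API configuration that has already been loaded into a Python dictionary. The helper name migrate_predictor_config is hypothetical, the "python" and "tensorflow" type values are assumptions about how the predictor type is spelled in the configuration, and the ordering of the two steps is a guess; check the API configuration schema before relying on it.

```python
import copy


def migrate_predictor_config(api):
    """Return a copy of an API config dict using the renamed predictor fields."""
    api = copy.deepcopy(api)
    predictor = api.get("predictor", {})

    # For the Python predictor type, the old `models` field is renamed
    # to `multi_model_reloading`.
    if predictor.get("type") == "python" and "models" in predictor:
        predictor["multi_model_reloading"] = predictor.pop("models")

    # `model_path` moves under `models.path`.
    if "model_path" in predictor:
        predictor.setdefault("models", {})["path"] = predictor.pop("model_path")

    return api


if __name__ == "__main__":
    old = {"name": "classifier", "predictor": {"type": "tensorflow", "model_path": "s3://my-bucket/model"}}
    print(migrate_predictor_config(old))
```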
Bug fixes
- Misc batch reliability improvements #1705 #1718 #1729 (vishalbollu)
Docs
- Reorganize the docs structure #1696 #1701 #1704 #1719 #1675 (ospillinger)
- Add GCP to the contributing guide #1720 #1654 (deliahu)
- Add docs for setting up kubectl on GCP 759b4b1 (deliahu)
Misc
- Parse the request body as a string when content type text/plain is specified #1714 (deliahu)
- Support paths to single ONNX files in API configuration #1711 #1686 (RobertLucian)
- Support deploying public S3 models on GCP, and public GCS models on AWS #1694 #1684 (RobertLucian)
- Pre-download docker images when creating GCP clusters #1721 #1658 (deliahu)
- Speed up the validation processes for multi-model APIs #1690 #1663 (RobertLucian)
v0.24.1
Bug fixes
- Propagate the exit code from the predictor's initialization so that the API status is set to "error" when initialization fails #1680 #1691 (RobertLucian)
v0.24.0
New features
- Add GCP support: our initial release supports all three predictor types (Python, TensorFlow, ONNX), on CPU or GPU, with live reloading, multi-model caching, and cluster autoscaling #1655 #1672 #1667 #1661 #114 #1600 #1602 #1616 #1624 (RobertLucian, deliahu, vishalbollu)
- Add the patch command to the CLI and Python client, which can be used to update an API using only the API configuration (without needing to provide the predictor's Python implementation) #1651 #1666 #1329 (vishalbollu)
- Support deploying predictor Python classes from the Python client #1587 #1617 (see the tutorial for an example) (vishalbollu)
Breaking changes
- The Python client's deploy() function has been renamed to create_api(), and some of the argument names have changed (docs)
Bug fixes
- Enable CORS for APIs accessed via API Gateway or load balancer #1649 #1234 (RobertLucian, deliahu)
- Fix local TensorFlow models when live reloading is enabled #1668 #1554 (RobertLucian)
- Prevent TensorFlow multi-model caching from attempting to download local models from S3 #1669 #1598 (RobertLucian)
Docs
- Miscellaneous docs improvements (vishalbollu, ospillinger)
Misc
- Improve Python client cross Python version compatibility #1640 (vishalbollu)
- Reinstall TensorFlow and ONNX dependencies when the Python version is overridden #1652 (vishalbollu)
- Terminate container when bootloader script fails #1639 (vishalbollu)
v0.23.0
New features
- Update Python client deploy() to accept a Python dictionary for API configuration (previously, only a file path was supported) (docs) #1587 (vishalbollu)
- Show API deployment history in cortex get API_NAME command #1544 #1496 (deliahu)
- Add cortex export API_NAME and cortex export API_NAME API_ID commands to export specific and historical API deployments #1544 #1497 (deliahu)
- Build and push python-predictor-gpu-slim image with different combinations of cuda and cudnn (cuda10.0-cudnn7, cuda10.1-cudnn7, cuda10.1-cudnn8, cuda10.2-cudnn7, cuda10.2-cudnn8, cuda11.0-cudnn8, cuda11.1-cudnn8) (docs) #1575 #1574 (deliahu)
Bug fixes
- Allow local deployments of public S3 models without requiring AWS credentials #1589 #1588 (RobertLucian)
Docs
- Add guide for avoiding Docker Hub rate limits #1576 (RobertLucian, deliahu)
- Add guide for self-hosting Cortex's Docker images #1579 (RobertLucian, deliahu)
Misc
- Remove API request maximum payload size limit #1583 (deliahu)
- Switch to Quay docker container registry #1578 (deliahu, RobertLucian)
v0.22.1
Bug fixes
- Set the predictor's working directory to the root Cortex project directory #1573 #1572 (deliahu)
- Allow max_instances to be updated via cortex cluster configure #1568 #1567 (deliahu)
- Gracefully stop the serving container when a multi-processed cron throws exception #1560 #1552 (RobertLucian)
Docs
- Demonstrate how to make API requests with various payload types (binary, form fields, etc), and show how to access them in predict() #1566 (docs)
- Misc docs improvements #1551 #1556 c3dab40 #1557 (deliahu, RobertLucian)
Misc
- Build and upload the Python package/CLI to a public S3 bucket #1562 (vishalbollu)
v0.22.0
New features
- Multi-model caching: serve a collection of models that is collectively bigger than what will fit in memory (via LRU cache eviction) (docs) #1428 #619 (RobertLucian)
- Live reloading: support updating models in running APIs by adding new versions to the model's S3 directory (docs) #1428 #1252 (RobertLucian)
- Inter-process fairness: distribute requests within an API replica evenly across all processes #1526 #839 #1298 (RobertLucian)
- Support requests between APIs within the same cluster (docs) #1503 #1241 (deliahu)
- Allow overriding of CLI install path and config directory (via $CORTEX_INSTALL_PATH and $CORTEX_CLI_CONFIG_DIR) (docs) #1521 #1222 (deliahu)
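To make the LRU cache eviction idea behind multi-model caching more concrete, here is a minimal, generic sketch. It is not Cortex's implementation: the LRUModelCache class, its capacity counted in number of models (rather than memory), and the load_fn loader are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUModelCache:
    """Keep at most `capacity` models in memory, evicting the least recently used."""

    def __init__(self, capacity, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn        # hypothetical loader, e.g. reads from disk or S3
        self._models = OrderedDict()  # model name -> loaded model, oldest first

    def get(self, name):
        if name in self._models:
            # Cache hit: mark the model as most recently used.
            self._models.move_to_end(name)
            return self._models[name]
        # Cache miss: load the model, then evict the oldest entry if over capacity.
        model = self.load_fn(name)
        self._models[name] = model
        if len(self._models) > self.capacity:
            self._models.popitem(last=False)
        return model

    def cached_names(self):
        # Names currently held in memory, least recently used first.
        return list(self._models)


if __name__ == "__main__":
    cache = LRUModelCache(capacity=2, load_fn=lambda name: f"<model {name}>")
    for name in ["a", "b", "a", "c"]:
        cache.get(name)
    print(cache.cached_names())
```

In this run, model "b" is the least recently used entry when "c" is requested, so it is the one evicted.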
Breaking changes
- ONNX model paths in API configuration files must now point to a directory containing a single ONNX file, rather than the ONNX file itself. For example, model_path: s3://cortex-examples/onnx/yolov5-youtube/yolov5s.onnx becomes model_path: s3://cortex-examples/onnx/yolov5-youtube.
- The --env/-e flag in all cortex cluster commands has been renamed to --configure-env/-e, and if not provided, the environment named aws will no longer be configured in the cortex cluster info command
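For the ONNX model path change above, a small helper along these lines could be used to rewrite old-style values when updating configuration files in bulk. The function name onnx_model_dir is hypothetical, and the sketch assumes that old paths can be recognized by a .onnx suffix.

```python
import posixpath


def onnx_model_dir(model_path):
    """Return the containing directory if model_path points at a .onnx file."""
    if model_path.endswith(".onnx"):
        # posixpath keeps forward slashes intact, which suits S3-style paths.
        return posixpath.dirname(model_path)
    # Already a directory-style path: leave it unchanged.
    return model_path


if __name__ == "__main__":
    print(onnx_model_dir("s3://cortex-examples/onnx/yolov5-youtube/yolov5s.onnx"))
```

Note that the directory is expected to contain a single ONNX file, so models that previously shared one directory would need to be moved apart first.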
Bug fixes
- Fix intermittent failed requests during rolling updates #1526 #814 (RobertLucian)
- Prevent CLI environments from getting overwritten when multiple cortex cluster commands are run concurrently #1520 #1410 (deliahu)
Docs
- Add Python client docs #1519 #1502 (deliahu)
- Add guide for running in production #1513 #1464 #1257 (deliahu)
- Add guide for low-cost clusters #1514 #1425 (deliahu)
- Add guide for using a REST API Gateway #1505 #1228 (deliahu)
- Add guide for troubleshooting cortex cluster down failures #1515 #1319 (deliahu)
Misc
- Stagger Predictor __init__() calls to reduce peak memory consumption #1543 #1450 (RobertLucian)
- Add --name/-n and --region/-r flags to cortex cluster info, cortex cluster export, and cortex cluster down commands #1492 #1363 (RobertLucian)
- Rename --env/-e flag to --configure-env/-e in cortex cluster commands and update its behavior #1533 #1412 (deliahu)
- Disallow ARM-based instances, which are not currently supported #1536 (deliahu)
- Validate AWS vCPU quota is sufficient for up to max_instances instances when running cortex cluster up and cortex cluster configure #1537 #1461 (deliahu)
v0.20.0
New features
- Add cortex cluster export command to export all APIs running in a cluster (docs) #1368 #1255 (vishalbollu)
- Enable users to specify CIDR ranges for the cluster's VPC (docs) #1388 (vishalbollu)
- Support json output for CLI commands (via -o/--output json) #1365 #1161 (vishalbollu)
- Support the nvidia device driver (nvidia-container-toolkit) when running locally #1366 #1223 (vishalbollu)
Breaking changes
- The valid values for api_gateway in the cluster configuration file have been changed from enabled/disabled to public/none (to match the values for networking.api_gateway in the API configuration file).
Bug fixes
- Support AWS tags with spaces and valid special characters #1374 #1355 #1380 #1385 #1373 (deliahu)
- Fix tensor shape validation for the TensorFlow predictor #1311 #1310 (RobertLucian)
- Allow cortex cluster * commands to be run from within a docker container #1370 #1361 #1325 (deliahu)
New examples
- pytorch/question-generator to generate questions given text and the correct answer (uses transformers and spacy) #1308 (ismaelc)
Docs
- Add documentation for how to install a specific version of the CLI #1386 #1244 (vishalbollu)
- Add sections for overprovisioning and responsiveness to autoscaling docs #1397 (deliahu)
- Add documentation for how to allow IAM users who did not create the cortex cluster to run cortex cluster * commands #1392 #1391 (deliahu)
- Add guide for setting up kubectl to access the cluster #1344 #1343 (RobertLucian)
Misc
- Update sources of AWS credentials for cortex cluster * commands, and improve transparency (docs) #1378 #1229 (vishalbollu)
- Rename cluster api_gateway config values to match API config #1335 #1334 (deliahu)
- Set the default value for networking.api_gateway in the API configuration to none if api gateway is disabled cluster-wide #1337 #1336 (deliahu)
- Support c6g and r6g instances #1332 #809 (deliahu)
- Display autoscaling group activity history when cortex cluster up fails #1342 #1340 (deliahu)
- Print debug info if cortex cluster up times out #1396 (deliahu)
- Add Inferentia compute statistics to cortex cluster info command #1354 #1304 (RobertLucian)
- Disable prompts in get-cli.sh if not running interactively #1372 #1371 (deliahu)
- Update cortex help output #1398 (deliahu)
vishalbollu released this
New features
- Support batch APIs (docs) #1203 #523 (vishalbollu)
- Support traffic splitting (enables A/B testing, multi-armed bandit, etc) (docs) #1213 #1270 #1132 #275 #1089 (tthebst)
- Support server-side request batching for the TensorFlow Predictor (docs) #1193 #1060 (RobertLucian)
- Add post_predict() method to Predictor interface (runs after the response has been sent) (docs) #1237 #954 (RobertLucian)
- Support disabling API Gateway cluster-wide (docs) #1259 #1198 (deliahu)
- Support different CUDA versions for the slim Python Predictor image (docs) #1263 #923 #1254 (RobertLucian)
- Add additional widgets to the CloudWatch Dashboard (avg in-flight requests per replica, active replicas) (docs) #1181 (RobertLucian)
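The idea behind traffic splitting for A/B testing can be sketched as weighted random routing. This is a generic illustration, not how Cortex routes requests: the pick_api and simulate helpers, the API names, and the use of integer weights are all assumptions.

```python
import random


def pick_api(weights, rng=random):
    """Pick one API name at random, proportionally to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights, requests, seed=0):
    """Route `requests` simulated requests and count how many each API receives."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = {name: 0 for name in weights}
    for _ in range(requests):
        counts[pick_api(weights, rng)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical split: roughly 90% of traffic to model-a, 10% to model-b.
    print(simulate({"model-a": 90, "model-b": 10}, requests=1000))
```

With a split like this, a new model version can be exposed to a small share of traffic and compared against the existing one before the weights are shifted.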
Breaking changes
- kind is now a required top-level field for all API configurations. Existing APIs should add kind: RealtimeAPI. This release adds support for kind: BatchAPI and kind: TrafficSplitter.
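A possible way to apply the kind requirement to existing configurations, assuming they have already been parsed into a list of Python dictionaries, is sketched below. The add_default_kind helper and the example API names are hypothetical.

```python
def add_default_kind(api_configs):
    """Return copies of the configs, adding kind: RealtimeAPI where kind is missing."""
    migrated = []
    for api in api_configs:
        if "kind" in api:
            # Already explicit (e.g. BatchAPI or TrafficSplitter): keep as is.
            migrated.append(dict(api))
        else:
            # Put `kind` first so it reads as a top-level field in dumped output.
            migrated.append({"kind": "RealtimeAPI", **api})
    return migrated


if __name__ == "__main__":
    configs = [{"name": "text-generator"}, {"name": "nightly-job", "kind": "BatchAPI"}]
    print(add_default_kind(configs))
```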
Bug fixes
- Fix python_path config field #1202 (deliahu)
- Fix local TensorFlow deploy from parent directory #1274 (deliahu)
- Improve error response for invalid payloads #1212 #1208 (RobertLucian)
New examples
- onnx/yolov5-youtube #1201 (dsuess)
- Update PyTorch text generator example to use Hugging Face transformers GPT-2 model #1177 (ospillinger)
Docs
- Update tutorial to use the pytorch text-generator example #1278 #1256 (deliahu)
- Improve instructions for updating cluster without downtime #1261 (deliahu)
- Mention API Gateway timeout in 404/503 API responses guide #1264 #1225 (deliahu)
Misc
- Set tags on log groups #1164 #1078 (tthebst)
- Display API metrics in the CLI by API ID (rather than by API name) #1216 (vishalbollu)
- Fix recursive error message for deploy/delete CLI commands #1247 #1218 (RobertLucian)
- Add shell completion to .zshrc file during CLI installation #1265 #1221 (deliahu)
- Handle OOM error when project files are too large #1217 (RobertLucian)
- Display image pull errors #1167 #955 (deliahu)
- Display local Docker image pull error when out of space #1238 #1236 (zouyee)
Assets
2
Bug fixes
- Fix dynamic axes for ONNX models #1187 #1186 (RobertLucian)
- Fix memory node capacity calculation for multi-api configuration files #1185 (deliahu)
- Check cluster-name tag when choosing load balancer for VPC Link integration #1173 (deliahu)
New guides
- Troubleshooting: API request errors (deliahu)
- Troubleshooting: TensorFlow session in predict() (RobertLucian)
Misc
- Delete API Gateway if cluster up fails #1172 (deliahu)
- Move image version verification from serve.py to run.sh #1180 #1183 (vishalbollu)
- Add retries for resource tagging during cluster up #1188 (deliahu)
- Use info log level when TensorFlow model is being loaded #1171 (RobertLucian)
- Increase max number of processes per API replica to 100 #1166 (RobertLucian)
- Allow empty cluster config #1179 (deliahu)
New features
- Support Inferentia instances #1119 #654 (RobertLucian)
- Automatically provision HTTPS API Gateway endpoints for Cortex APIs #1108 #1077 (tthebst)
- Support multi-model endpoints for TensorFlow and ONNX predictors #1107 #890 (RobertLucian)
- Support local Docker images in the local environment #1114 #1094 (RobertLucian)
- Support replica parallelism fields (processes_per_replica and threads_per_process) in the local environment #1158 #960 #1090 (RobertLucian)
- Support a .env file to export environment variables in the API container #1154 #1147 (RobertLucian, spentaur)
Breaking changes
- autoscaling.workers_per_replica and autoscaling.threads_per_worker have been moved/renamed to predictor.processes_per_replica and predictor.threads_per_process (see API configuration docs)
- endpoint and local_port have been moved to a new sub-field called networking (see API configuration docs)
- model has been renamed to model_path in TensorFlow and ONNX predictors (see API configuration docs)
Bug fixes
- Prevent GPU overprovisioning during autoscaling #1111 #1085 (vishalbollu)
New examples
- tensorflow/image-classifier-resnet50 and pytorch/image-classifier-resnet50 to demonstrate using Inferentia #1119 (RobertLucian)
- pytorch/multi-model-text-analyzer, tensorflow/multi-model-classifier, and onnx/multi-model-classifier to demonstrate multi-model APIs #1107 (RobertLucian)
New guides
- Multi-model endpoints using TensorFlow and ONNX predictors #1107 (RobertLucian)
Docs
- Add API architecture diagram #1126 (deliahu)
- Add documentation for configuring the CLI on a new machine #1127 (deliahu, javithe7)
Misc
- Call predictor __init__() from the request's threadpool to avoid multithreading issues in some ML frameworks when using 1 thread #1146 (deliahu)
- Allow changing an API's endpoint without triggering a rolling replica update #1155 #1116 (deliahu)
- Set the default shell to bash in Predictor Dockerfiles (which simplifies using them as base images for custom-built images) #1104 #1086 (RobertLucian)
- Move endpoint and local_port to networking API config #1151 #1091 (deliahu)
- Rename model to model_path in API config #1150 #1115 (deliahu)
- Use cluster name for cloudwatch metrics namespace #1138 (deliahu)
- Misc UI improvements #1159 #1084 #1152 #1136 #1128 a52f0a0 #1096 (deliahu)
New features
- Support arbitrary API request payload content types, including raw bytes and form fields (not just JSON) #1062, #332, #917 (deliahu)
- Support custom SSL certificates for the API load balancer #1069, #326, #1066 (vishalbollu)
- Add a cloudwatch dashboard to show metrics for each running API #1054, #855 (tthebst)
- Allow for custom tagging of AWS resources created by cortex (and add the cortex.dev/cluster-name tag by default) #1031, #854, #856 (vishalbollu)
- Expose request query parameters to the predictor's predict() function #1062, #546 (deliahu)
- Expose request headers to the predictor's predict() function #1062 (deliahu)
- Allow users to change the Python version via conda-packages.txt #1052, #1051 (RobertLucian)
Bug fixes
- Fix bug which caused a validation error when running TensorFlow or ONNX locally with locally saved models #1075 (RobertLucian)
- Enable tty on local docker API containers to avoid strange characters showing up in cortex logs #1067 (deliahu)
- Allow cluster's min_instances and max_instances to be updated at the same time #1050, #840 (deliahu)
New examples
New guides
- Viewing API metrics on the CloudWatch dashboard (deliahu)
Docs
- Add architecture diagram #1042, #1013 (deliahu)
- Document how to install packages from private PyPI indexes #1072 (RobertLucian)
Misc
- Add zsh completion #1024, #1020 (deliahu)
- Rename cortex cluster update to cortex cluster configure #1035, #887 (zouyee)
- Add instance and pricing information to cortex cluster info output #1053, #835, #935 (deliahu)
- Rename tracker to monitoring #1041, #869 (deliahu)
- Change the default cortex region to us-east-1 #1063 (deliahu)
- Disable cluster logging #1029, #888 (deliahu)
- Add git to API images #1068 (RobertLucian)
- Add upper bound validation for max_replica_concurrency #1025 (zouyee)
- Misc UI/UX improvements #1032, #1023, #1033, #1036, #1045, #1047, #1049, #1044, 93032e2, #1065, #726, #1048, #894, #1043, #1028, #933, #1027, #934, #1026 (deliahu, vishalbollu)
Breaking changes
- cortex cluster update has been renamed to cortex cluster configure (so it won't be misinterpreted as updating the Cortex version of the cluster)
- The tracker field in API configuration has been renamed to monitoring (to be consistent with the other field names)
Assets
2
Bug fixes
- Read and validate TensorFlow and ONNX models from buckets in any region #1059 (vishalbollu)
New features
- Support deploying APIs locally #973 #109 (vishalbollu)
- Enable private networking: private subnets for instances, internal API load balancer, and internal operator load balancer #978 #965 #832 #964 (deliahu)
- Support installing system packages via dependencies.sh #880 #852 (RobertLucian)
- Support installing conda packages via conda-packages.txt #880 #844 (RobertLucian)
- Allow for spot instances to be used with a single instance type #979 #886 (RobertLucian)
- Support specifying serving images in API configuration (on a per-API basis) #948 #900 (RobertLucian)
- Add cortex commands to list and remove CLI environments #973 #730 (deliahu)
- Support bytes and starlette.responses.Response response types from predict() #915 #913 (RobertLucian)
- Add slim predictor base images #992 #781 (deliahu)
- Support configuring instance volume type and provisioned IOPS #982 #592 (tthebst)
- Support highly available NAT Gateway #978 #963 (deliahu)
- Add --yes flag to skip prompts on cluster CLI commands #980 #929 (deliahu)
New Examples
- Bart summarizer (using PyTorch) #907 (ismaelc)
- Named entity recognizer (using spacy) #924 (aced125)
- Lite version of the license plate reader #994 (RobertLucian)
New Guides
- Set up AWS API gateway (deliahu)
- Plot response code counts (deliahu)
- Plot API request time (deliahu)
- Plot in-flight requests (deliahu)
- Set up VPC peering (deliahu)
- SSH into AWS instance (RobertLucian)
Docs
Misc
- Use rolling updates for daemonsets when running cortex cluster update #972 #630 (tthebst)
- Switch from Classic Elastic Load Balancer to Network Load Balancer #978 #966 (deliahu)
- Show the original error message when encountering "invalid AWS credentials" #918 (deliahu)
- Log unexpected TensorFlow Serving gRPC errors #949 (deliahu)
Breaking Changes
- Previously, custom serving images were configured in the cluster configuration file (e.g. cluster.yaml would have a line for image_python_serve: my-repo/python-serve:latest). Now, custom images are specified inside the API configuration (e.g. cortex.yaml should have image: my-repo/python-serve:latest in the predictor section of your API configuration). Here's the full documentation for API configuration.
- The names of the serving base images have been updated to be more descriptive, and "slim" images have been added (they are more appropriate to use as base images when building custom images). Here is the full documentation for custom Docker images.
- The debug query parameter to APIs (which caused the input and output to predict() to be logged) has been removed (#985)
deliahu released this
Bug fixes
- Improve availability zone selection and validation #885 #891 (deliahu)
- Validate predictor implementation using getfullargspec() #902 (vishalbollu)
- Do not remove cluster configuration cache if refresh fails #893 (vishalbollu)
New Examples
- Keras autoencoder for filtering out the noise from text documents #834 (RobertLucian)
Misc
- Check cluster status before executing cluster commands #881 #879 #892 (vishalbollu)
- Check for AWS Administrator IAM access in cluster up and cluster down commands #878 (deliahu)
- Wait for cloudformation stacks to delete during cluster down #876 (vishalbollu)
- Verify cortex operator url during cortex configure #877 (vishalbollu)
- Allow blank bucket values in cluster.yaml #875 (vishalbollu)
- Improve various error messages #895 #896 #897 #899 #905 (deliahu)
- Improve documentation #861 #853 #851 #868 #870 #871, #872, ac481b9 3ad3903 (deliahu, vishalbollu, RobertLucian)
Breaking changes
- Remove json_tricks for encoding API responses (responses from predict() must now be json serializable) #908 (vishalbollu)
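Since responses from predict() must now be JSON serializable, a quick local check with the standard library's json module can catch problem values before deploying. The is_json_serializable helper below is a hypothetical convenience, not part of Cortex.

```python
import json


def is_json_serializable(value):
    """Return True if the standard json module can encode `value`."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        # TypeError: unsupported type (e.g. set, bytes, custom objects).
        # ValueError: e.g. circular references.
        return False


if __name__ == "__main__":
    print(is_json_serializable({"label": "cat", "scores": [0.9, 0.1]}))
    print(is_json_serializable({"labels": {"cat", "dog"}}))
```

Values that fail this check, such as sets or framework-specific array types, would need to be converted to plain lists, dicts, numbers, or strings before being returned.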
vishalbollu released this
Bug fixes
- Fix JSON parsing before it gets passed to predict() #865 (vishalbollu)
- Support setup.py packages in requirements.txt #864 (deliahu)
- Run TensorFlow Predictor's model validations in the region that contains the bucket #866 (deliahu)
Misc
New features
- Support request-based autoscaling #815 #838 #573 (vishalbollu, deliahu)
- Support fine-grained configuration for autoscaling algorithm behavior #815 (deliahu)
- Support configurable in-replica parallelism (i.e. workers, threads) #838 #590 (vishalbollu, deliahu)
- Support configurable request queue length #838 #646 (vishalbollu)
- Support .cortexignore file to exclude files/directories from Cortex project zip #800 #723 (wingkwong)
Bug fixes
- Ensure previous logs are never shown after showing newer ones #792 (deliahu)
- Skip service quota validation in unsupported regions #825 (deliahu)
- Fix prediction metrics when specifying tracker.key #793 (deliahu)
New Examples
- Real-Time License Plate Detector Example Project (YOLOv3, CRAFT, CRNN) #803 (RobertLucian)
Misc
- Show a warning if AWS session token is detected #842 (chrisranderson, vishalbollu)
- Disable NAT gateway #808 (deliahu)
- Add debug information to cluster error messages 5a1a2bc #850 (deliahu)
- Add cluster costs to README #807 #806 (bcjordan, deliahu)
- Document which system packages are installed in Docker images #847 #822 (deliahu)
- Update pytorch examples to use GPU #849 (vishalbollu)
- Install libsndfile1 in API Docker images #826 (deliahu)
Breaking API Changes
- min_replicas, max_replicas, and init_replicas have been moved from the compute configuration key to autoscaling
- max_surge and max_unavailable have been moved from the compute configuration key to update_strategy
- target_cpu_utilization has been removed in favor of the request-based scaling configuration parameters (see the autoscaling docs for a detailed explanation of the new parameters)
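The field moves listed in these breaking changes could be applied to a parsed API configuration roughly as follows. This is a sketch under assumptions: the migrate_compute_fields helper is hypothetical, the configuration is assumed to be a plain dictionary, and target_cpu_utilization is simply dropped because no one-to-one replacement value is given here.

```python
import copy

AUTOSCALING_FIELDS = ("min_replicas", "max_replicas", "init_replicas")
UPDATE_STRATEGY_FIELDS = ("max_surge", "max_unavailable")


def migrate_compute_fields(api):
    """Return a copy of an API config with replica fields moved out of `compute`."""
    api = copy.deepcopy(api)
    compute = api.get("compute", {})

    for field in AUTOSCALING_FIELDS:
        if field in compute:
            api.setdefault("autoscaling", {})[field] = compute.pop(field)

    for field in UPDATE_STRATEGY_FIELDS:
        if field in compute:
            api.setdefault("update_strategy", {})[field] = compute.pop(field)

    # Removed outright, so it is dropped rather than moved.
    compute.pop("target_cpu_utilization", None)
    return api


if __name__ == "__main__":
    old = {
        "name": "classifier",
        "compute": {"cpu": 1, "min_replicas": 1, "max_replicas": 5,
                    "max_surge": "25%", "target_cpu_utilization": 80},
    }
    print(migrate_compute_fields(old))
```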
Assets
2
Bug fixes
- Fix cortex cluster update when using spot instances with no on-demand backup nodegroup #787 (vishalbollu)
Misc
- Set locale to en_US.UTF-8 #784 (deliahu, RobertLucian)
- Cause replica to error when pip install fails 394862b (deliahu)
- Query EKS price from AWS Pricing API #783 (deliahu)
- Assert API version before inspecting args in python #789 (vishalbollu)
- Improve healthcheck #788 (vishalbollu)
New features
- Support on-demand instance backup when spot instances are not available #745 #629 (vishalbollu)
- Remove kind: deployment from API configuration #759 (deliahu, vishalbollu)
- Add cortex refresh <api_name> command #759 #758 (deliahu)
- Update cortex delete <api_name> command #759 (deliahu)
- Add configuration for rolling update strategy (max_surge and max_unavailable) #763 (deliahu)
- Support programmatic CLI configuration via command line flags #764 #729 (deliahu)
- Support small instance types #720 (deliahu)
- Add env flag to cluster up and cluster update commands #731 (deliahu)
Bug fixes
- Limit cluster growth rate to avoid Kubernetes API server crashes #769 (vishalbollu)
- Use configured max price for filtering spot instance distribution #746 #719 (vishalbollu)
- Disallow nano and micro instances 84f0937 #755 (deliahu)
- Fix pod status calculation to classify successfully recovered replicas as ready c5d97eb #738 (deliahu)
New Examples
- Object detection in images with R-CNN #754 (ArkinDharawat)
- Fastai #725 (caleb-kaiser)
Misc
- Add total cluster price to installation confirmation message #714 #775 #713 (deliahu, vishalbollu)
- Prompt before attempting to zip large files, many files, or large total folder size #752 #767 #721 #722 (vishalbollu, deliahu)
- Reduce Cortex operator Kubernetes API calls #759 #672 (deliahu)
- Reduce fluentd Kubernetes API calls #759 #672 (vishalbollu)
- Add EKS control plane logging #753 #717 (vishalbollu)
- Enforce that bucket and cluster regions match #777 (deliahu)
- Update ONNX runtime to 1.1.0 1e74ab7 64f95b7 #571 (deliahu)
- Direct users to check auto scaling group activity history if cluster up fails #757 #740 (vishalbollu)
- Pre-install opencv system packages #772 (vishalbollu)
- Improve config validations #751 c0a89a2 #732 #742 (deliahu, vishalbollu)
- Increase metrics server memory request/limit 60f00b0#diff-d62cba9784a96fc0a7471ca4d8b38e96 #748 (deliahu)
- Disable operator autoscaling #743 (vishalbollu)
- Add kubectl top to cortex cluster info --debug output #756 #716 (vishalbollu)
New features
- Support new instance types (e.g. g3 and g4 instances) #655 (deliahu)
- Support batched TensorFlow and ONNX predictions #666 #562 (vishalbollu)
- Allow users to configure availability zones #681 #677 (vishalbollu)
- Support multiple cortex clusters in the same region #661 #664 #660 (deliahu)
- Add AWS resource pricing to cortex cluster up confirmation message #647 #690 #641 (deliahu)
- Autofill instance distribution based on spot price #670 #603 (vishalbollu)
- Add support for passing environment variables through to containers #694 #688 (vishalbollu)
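One way to picture choosing an instance distribution from spot prices is to keep only the instance types priced at or below a configured maximum, cheapest first. The sketch below is a simplification under that assumption, not Cortex's actual selection logic; `filter_spot_distribution` is a hypothetical helper and the prices are made up for illustration.

```python
from typing import Dict, List


def filter_spot_distribution(spot_prices: Dict[str, float], max_price: float) -> List[str]:
    """Return instance types priced at or below max_price, cheapest first."""
    eligible = [(price, name) for name, price in spot_prices.items() if price <= max_price]
    return [name for _, name in sorted(eligible)]


# Illustrative prices only (USD per hour), not real AWS quotes.
prices = {"g4dn.xlarge": 0.16, "g3s.xlarge": 0.23, "p2.xlarge": 0.27}
distribution = filter_spot_distribution(prices, max_price=0.25)
```

With these sample numbers, `distribution` holds the two instance types under the 0.25 cap.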
Bug fixes
- Surface operator connection error messages in CLI #659 #658 (deliahu)
- Fix occasional logs stream errors #689 (vishalbollu)
- Install pip packages with --no-cache-dir #623 (vishalbollu)
Misc
- Convert predictor APIs into Python classes #636 #666 #589 (vishalbollu)
- Rename sample to payload in Python APIs #626 (vishalbollu)
- Confirm before deleting a deployment #692 #674 (vishalbollu)
- Check for unsupported instance types 952a1f7 (deliahu)
- Check user EC2 limits before spinning up instances #638 #653 #584 (vishalbollu, deliahu)
- Add cortex cluster info --debug command #691 #657 (deliahu)
- Remove upper limit on CPU target utilization #635 (deliahu)
- Improve cortex deploy response message #650 #642 (deliahu)
- Improve API status output #656 #652 (deliahu)
- Improve spot config documentation #670 #627 (vishalbollu)
- Pre-download Docker images on cluster installation #662 #569 (deliahu)
- Remove cortex support command #683 #668 (vishalbollu)
- Stream logs from all pods to CloudWatch #671 #586 (vishalbollu)
- Support running cortex deploy from subdirectories #675 #673 (deliahu)
- Set log group and bucket name defaults to cluster name #693 #680 (vishalbollu)
New features
- Support spot instances #585 #597 #469 (vishalbollu)
Examples
- Add MLflow example #566 #553 (ospillinger)
- Add language identification example (fastText) 0173bc4 (ospillinger)
- Add answer generation example #580 (ospillinger)
- Add reading comprehension example #581 (ospillinger)
- Add text summarization example be42b7c (ospillinger)
Misc
- Create separate nodegroup for Cortex operator containers #577 #500 (vishalbollu)
- Improve API logging #596 #587 (deliahu)
- Improve CLI output #570 #567 #568 #574 (deliahu)
- Update API info endpoint route and response #594 #593 (deliahu)
- Add quickstart / tutorial #595 (ospillinger)
- Create CONTRIBUTING.md #555 #310 (ospillinger)
Bug fixes:
- Refresh logger after loading user modules #563 (vishalbollu)
- Remove extra parameters that may be sent to JSON tricks encoder initialization #565 (vishalbollu)
- Set all TensorFlow version directory names to "1" #560 #354 (deliahu)
- Convert TensorFlow model prefix to a directory 10b62b4 (deliahu)
Misc
Bug fixes
- Fix bug in multi-input ONNX models a6bdb5f (vishalbollu)
- Don't update API metrics on non-POST requests f1bc223 (deliahu)
Misc
New features
- Add Cortex Python client #488 #467 (vishalbollu)
- Add Cortex support CLI command #491 #336 (vishalbollu)
- Add configure --print CLI command 52ceae3 (deliahu)
Bug fixes:
- Prevent load balancer from timing out requests #490 adcf18c #487 (vishalbollu)
- Remove unnecessary lock in operator init 411bac6 (deliahu)
- Silence stale API saved status not found errors aeac492 (deliahu)
- Remove availability zone configuration 2e8913b #494 (deliahu)
- Show correct URL upon failed HTTP request from CLI #504 (vishalbollu)
Examples
Misc
New features
- Add prediction response tracking #322 #360 #378 #419 #481 81718b4 #225 (vishalbollu)
- Add networking metrics (latency, error codes) #278 #420 #475 #472 #187 (vishalbollu, 1vn)
- Support importing local python files in handlers #398 #452 (1vn, vishalbollu)
- Support TensorFlow model directories on S3 #323 #373 #215 #366 (1vn)
- Support user-specified TensorFlow signature def keys #365 #471 #459 #343 (1vn, vishalbollu, deliahu)
- Improve signature def detection #460 28dc989 #451 (vishalbollu)
- Add debug mode to API requests #369 #328 (1vn)
- Support print statements in handlers #406 #377 #339 (vishalbollu, 1vn)
- Automatically configure operator URL when installing Cortex #401 #334 (1vn)
Bug fixes:
- Evict pods that consume too much memory #426 #424 (deliahu)
- Show logs from init containers #393 #324 (vishalbollu)
- Support "None" dims in model signatures #465 (deliahu)
- Fix line wrapping with CLI --watch flag b4f7257 (deliahu)
Examples
- Convert example model code to notebooks #480 (deliahu)
- Add GPT-2 text generation example #353 (1vn)
- Add BERT sentiment analysis example #295 (1vn)
- Add AlexNet PyTorch example #477 (vishalbollu)
- Add ImageNet Inception example #344 #318 (1vn)
- Add normalization to iris sklearn example #337 (deliahu)
Misc
- Remove sample key from prediction API and prediction key from prediction response #399 #389 (vishalbollu)
- Remove response key from TensorFlow prediction response #478 (vishalbollu)
- Pass ONNX model output directly to post_inference request handler #476 (vishalbollu)
- Use HTTP endpoints by default #350 #327 (1vn)
- Remove verbose flag from logs command #400 #391 (vishalbollu)
- Add operator AWS credentials #349 (ospillinger)
- Replace non-ready APIs without rolling update #448 #407 (deliahu)
- Autocast numpy objects to appropriate type #384 #338 (vishalbollu)
- Add out-of-memory error #418 #372 (deliahu)
- Add more fine-grained status messages #440 #408 (deliahu)
- Don't require --force once min replicas are met #449 #359 (deliahu)
- Validate that requested resources can fit in a node before deploying #379 #306 (1vn)
- Validate that request handlers exist before deploying #438 #427 #428 (vishalbollu)
- Enforce zip file size limit #457 #437 (vishalbollu)
- Support numeric CPU values in API configuration #413 #395 (deliahu)
- Improve handler loading error messages #382 #352 #479 #292 #341 (vishalbollu)
- Improve cortex.sh configuration and logging d4e7738 577b31e 5a96fa7 08fde5b 994a49b 4c196a6 (deliahu, ospillinger)
- Add timestamp to logs #402 #390 (vishalbollu)
- Only read deployment configuration from cortex.yaml #396 #387 (deliahu)
- Rename default environment to "default" 6051dcd (deliahu)
- Set Python version to 3.6 #461 (deliahu)
- Update TensorFlow version to 1.14 ba0b541 (deliahu)
- Use Istio for networking #237 #374 #201 (1vn, deliahu)
- Stream logs from CloudWatch #447 #466 (vishalbollu)
New features:
- Add GPU support for serving ONNX models #232 #233 #220 (vishalbollu)
- Set model format based on path if not explicitly specified #251 #206 (vishalbollu)
- Improve get command output for APIs #263 #177 #257 #256 (vishalbollu)
- Aggregate API logs in cortex logs command #227 #214 (vishalbollu)
- Aggregate API logs in CloudWatch #259 #226 (vishalbollu)
- Add CLI command to list active Cortex deployments #268 #117 (vishalbollu)
Misc:
- Improve API ready timestamp #244 (deliahu)
- Prevent scaling immediately after API creation #255 #222 (deliahu)
- Add Cortex Namespace to python modules #230 #205 (vishalbollu)
- Allow ctrl+c to kill the manager process #252 #246 (vishalbollu)
- Improve logging around request handlers #240 #207 #204 (vishalbollu)
- Improve error message for prediction API mismatch #249 #176 (vishalbollu)
- Make all logs one line #216 (vishalbollu)
- Document how to add system packages to docker containers #250 #245 (vishalbollu)
- Replace Argo with in-operator DAG manager #235 #218 (deliahu)
- Improve uninstall process #711017f (deliahu)
- Restructure iris example #270 #266 (vishalbollu)
New features:
- Add cluster autoscaler #194 #189 (ospillinger)
- Add pod autoscaler #196 #188 (deliahu)
- Automate/improve installation process, add manager image #193 #192 (ospillinger)
- Add support for serving ONNX models #182 #181 #164 (vishalbollu)
- Support Python pre- and post-processing for inference #182 #178 (vishalbollu)
Misc:
New features:
- Remove status command, fold into get and logs #171 #166 #165 (deliahu)
- Remove region for external data fa227d1 #174 (deliahu)
- Return expected input schema for prediction request errors febc293 (deliahu)
Bug fixes
- Show previous logs for failed API pods f620125 #179 (deliahu)
- Fix external constants ec96d80 (deliahu)
Misc:
- Hide end-to-end components if only using serving ff4910a b0666fb #180 #167 (deliahu)
- Rename app to deployment #175 #180 (deliahu)
- Rename app.yaml to cortex.yaml 355fdfc #168 (deliahu)
- Improve get command resource printing 5e301c6 (deliahu)
- Improve python error message if external data doesn't exist #183 (deliahu)
- Remove init CLI command a044d81 #172 (deliahu)
- Update to Go 1.12 #170 #169 (deliahu)
New features:
Bug fixes:
New features:
- Input redesign #72 #154 (deliahu)
- Add estimators #72 #154 (deliahu)
- Support deploying external TensorFlow models #124 #154 (1vn)
- Make raw columns optional #103 #111 (1vn)
- Make aggregators and transformers optional #90 #100 (1vn)
- Respond to prediction request with transformed columns #97 #153 (1vn)
- Support bucket regions for data ingestion #115 #155 (vishalbollu)
- Support not using an ingested column as a raw_column #69 #92 (vishalbollu)
- Update to TensorFlow 1.13 #95 #116 (1vn)
- Update to Spark 2.4.2 #87 (vishalbollu)
- Validate app name does not have underscore #59 #112 (1vn)
Bug fixes:
- Resolve Spark Context file added warnings #79 #137 (1vn)
- Improve built-in index_string data format #68 #127 (1vn)
- Address TF Serving gRPC Warning #61 #128 (1vn)
- Ingestion of Parquet containing int or double columns throws validation errors #91 #92 (vishalbollu)
- Update Argo version #74 #125 (1vn)
- API is sometimes temporarily unavailable when updating #71 #85 (deliahu)
- Resources not allocated to Spark workloads to generate training datasets #56 #86 (vishalbollu)
Misc:
Merged pull requests:
- Rename transformed_column parameter in transform_spark() #49 (deliahu)
- OOM (Out of memory) status #40 (1vn)
- Change status to ingesting only after enough resources have been allocated #39 (vishalbollu)
- Change default TensorFlow log level to DEBUG #37 (1vn)
- Transformer model sentiment analysis example #36 (1vn)
- Add integration test to spark workloads #35 (vishalbollu)
- Tensor2Tensor Example and transform_tensorflow feature #29 (1vn)
docs.cortexlabs.com/cortex/v/0.2
Merged pull requests:
- Allow specifying ranges in cortex requirements.txt #32 (vishalbollu)
- Prevent users from installing conflicting packages #30 (vishalbollu)
- Add additional config path error wrapping and index to embeds #15 (deliahu)
- Show config path in config errors #14 (1vn)
- Add ability to sample dataset #12 (vishalbollu)
- Expose additional csv parsing options #10 (vishalbollu)
- GPU support #6 (1vn)
- Bring your own package #5 (vishalbollu)
See docs.cortexlabs.com for documentation
Watchers:159 |
Star:7444 |
Fork:571 |
Created: 2019-01-24 12:43:14 |
Last commit: two days ago |
v0.33.0
New features
Breaking changes
Bug fixes
Misc
cortex cluster configure command to cortex cluster scale #2040 #1972 (RobertLucian)
async_api to avoid name collision with the reserved keyword in Python 3.7+ #2066 #2052 (vishalbollu)
cluster up failures #2080 #2027 (vishalbollu)