Serving Python Packages

ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

60.7M 43K 8K

torch-model-archiver

Serve, optimize and scale PyTorch models in production

214K 4K 882

litserve

A minimal Python framework for building custom AI inference servers with full control over logic, batching, and scaling.

71K 4K 292

torchserve

Serve, optimize and scale PyTorch models in production

61K 4K 882

ant-ray-cpp-nightly

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

31K 43K 8K

seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

22K 5K 866

torch-workflow-archiver

Serve, optimize and scale PyTorch models in production

19K 4K 882

ray-cpp

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

18K 43K 8K

torch-model-archiver-nightly

Serve, optimize and scale PyTorch models in production

17K 4K 882

ovmsclient

A scalable inference server for models optimized with OpenVINO™

9K 896 260

ant-ray-nightly

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

9K 43K 8K

haupt

Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon

5K 452 207

friendli-client

Friendli Suite Client

5K 50 7

clearml-serving

ClearML - Model-Serving Orchestration and Repository Solution

4K 167 50

ant-ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

4K 43K 8K

torchserve-nightly

Serve, optimize and scale PyTorch models in production

3K 4K 882

torch-workflow-archiver-nightly

Serve, optimize and scale PyTorch models in production

3K 4K 882

secretflow-serving-lib

SecretFlow-Serving is a serving system for privacy-preserving machine learning models.

3K 16 6

fastdeploy

Deploy DL/ML inference pipelines with minimal extra code.

3K 105 17

omniback

Serving Inside PyTorch

2K 169 12

evadb

Database system for AI-powered apps

2K 3K 261

fastdeploy-python

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

1K 4K 754

bodywork

ML pipeline orchestration and model deployments on Kubernetes.

1K 436 23

paddle-serving-client

A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)

1K 922 247