Machine Learning and AI Engineer open source stack

Selected repositories matched to this profile using language, topics, activity and real usage patterns.

deeplake
⭐ 8913 Python Score 107 Updated 2025-11-30

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time...

ai computer-vision cv data-science datalake datasets
transformers
⭐ 153201 Python Score 97 Updated 2025-11-29

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference...

audio deep-learning deepseek gemma glm hacktoberfest
annotated_deep_learning_paper_implementations
⭐ 64613 Python Score 97 Updated 2025-11-11

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optim...

attention deep-learning deep-learning-tutorial gan literate-programming lora
keras
⭐ 63617 Python Score 97 Updated 2025-11-27

Deep Learning for humans

data-science deep-learning jax machine-learning neural-networks python
yolov5
⭐ 56202 Python Score 97 Updated 2025-11-25

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

coreml deep-learning ios machine-learning ml object-detection
ray
⭐ 40079 Python Score 97 Updated 2025-12-01

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

data-science deep-learning deployment distributed hyperparameter-optimization hyperparameter-search
DocsGPT
⭐ 17450 Python Score 97 Updated 2025-11-26

Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API conn...

agent-builder agents ai chatgpt docsgpt hacktoberfest
txtai
⭐ 11873 Python Score 97 Updated 2025-11-30

💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows

ai artificial-intelligence embeddings information-retrieval language-model large-language-models
ragflow
⭐ 68583 Python Score 89 Updated 2025-12-01

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context...

agent agentic agentic-ai agentic-workflow ai ai-search
ml-engineering
⭐ 15880 Python Score 89 Updated 2025-11-21

Machine Learning Engineering Open Book

ai debugging gpus inference large-language-models llm
wandb
⭐ 10595 Python Score 89 Updated 2025-11-27

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

ai collaboration data-science data-versioning deep-learning experiment-track
LLMs-from-scratch
⭐ 80199 Jupyter Notebook Score 85 Updated 2025-11-25

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

ai artificial-intelligence chatbot chatgpt deep-learning from-scratch
Real-Time-Voice-Cloning
⭐ 58940 Python Score 82 Updated 2025-09-23

Clone a voice in 5 seconds to generate arbitrary speech in real-time

deep-learning python pytorch tensorflow tts voice-cloning
memvid
⭐ 10441 Python Score 82 Updated 2025-10-12

Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.

ai context embedded faiss knowledge-base knowledge-graph
pytorch
⭐ 95502 Python Score 81 Updated 2025-12-01

Tensors and Dynamic neural networks in Python with strong GPU acceleration

autograd deep-learning gpu machine-learning neural-network numpy
faceswap
⭐ 54760 Python Score 81 Updated 2025-11-24

Deepfakes Software For All

deep-face-swap deep-learning deep-neural-networks deepface deepfakes deeplearning
ultralytics
⭐ 49350 Python Score 81 Updated 2025-12-01

Ultralytics YOLO 🚀

cli computer-vision deep-learning hub image-classification instance-segmentation
DeepSpeed
⭐ 40871 Python Score 81 Updated 2025-11-26

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

billion-parameters compression data-parallelism deep-learning gpu inference
stable-diffusion-webui
⭐ 158657 Python Score 81 Updated 2025-11-07

Stable Diffusion web UI

ai ai-art deep-learning diffusion gradio image-generation
langchain
⭐ 120826 Python Score 81 Updated 2025-11-28

🦜🔗 The platform for reliable agents.

agents ai ai-agents ai-agents-framework aiagentframework anthropic
vllm
⭐ 64307 Python Score 81 Updated 2025-12-01

A high-throughput and memory-efficient inference and serving engine for LLMs

amd blackwell cuda deepseek deepseek-v3 gpt
LLaMA-Factory
⭐ 63329 Python Score 81 Updated 2025-11-30

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

agent ai deepseek fine-tuning gemma gpt
llama_index
⭐ 45546 Python Score 81 Updated 2025-11-28

LlamaIndex is the leading framework for building LLM-powered agents over your data.

agents application data fine-tuning framework llamaindex
peft
⭐ 20167 Python Score 81 Updated 2025-11-21

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

adapter diffusion fine-tuning llm lora parameter-efficient-learning
RWKV-LM
⭐ 14181 Python Score 81 Updated 2025-11-14

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "...

attention-mechanism chatgpt deep-learning gpt gpt-2 gpt-3
speechbrain
⭐ 10858 Python Score 81 Updated 2025-11-30

A PyTorch-based Speech Toolkit

asr audio audio-processing deep-learning huggingface language-model
tensorflow
⭐ 192633 C++ Score 77 Updated 2025-12-01

An Open Source Machine Learning Framework for Everyone

deep-learning deep-neural-networks distributed machine-learning ml neural-network
haystack
⭐ 23498 MDX Score 77 Updated 2025-11-28

AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or...

agent agents ai gemini generative-ai gpt-4
metaflow
⭐ 9653 Python Score 75 Updated 2025-11-27

Build, Manage and Deploy AI/ML Systems

agents ai aws azure cost-optimization datascience
BentoML
⭐ 8268 Python Score 75 Updated 2025-11-28

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

ai-inference deep-learning generative-ai inference-platform llm llm-inference
deep-searcher
⭐ 7199 Python Score 75 Updated 2025-11-19

Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.

agent agentic-rag claude deep-research deepseek deepseek-r1
airweave
⭐ 5287 Python Score 75 Updated 2025-11-29

Context retrieval for AI agents across apps and databases

agents knowledge-graph llm llm-agent rag search
LEANN
⭐ 4843 Python Score 75 Updated 2025-11-30

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

ai faiss gpt-oss langchain llama-index llm
scikit-learn
⭐ 64163 Python Score 73 Updated 2025-12-01

scikit-learn: machine learning in Python

data-analysis data-science machine-learning python statistics
OpenBB
⭐ 55077 Python Score 73 Updated 2025-11-29

Financial data platform for analysts, quants and AI agents.

ai crypto derivatives economics equity finance
streamlit
⭐ 42448 Python Score 73 Updated 2025-11-30

Streamlit — A faster way to build and share data apps.

data-analysis data-science data-visualization deep-learning developer-tools machine-learning
gradio
⭐ 40744 Python Score 73 Updated 2025-11-28

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

data-analysis data-science data-visualization deep-learning deploy gradio
MockingBird
⭐ 36794 Python Score 73 Updated 2025-11-13

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

ai deep-learning pytorch speech text-to-speech tts
mlflow
⭐ 23123 Python Score 73 Updated 2025-12-01

The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and e...

agentops agents ai ai-governance apache-spark evaluation
browser-use
⭐ 73123 Python Score 73 Updated 2025-11-30

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

ai-agents ai-tools browser-automation browser-use llm playwright
OpenHands
⭐ 65328 Python Score 73 Updated 2025-12-01

🙌 OpenHands: Code Less, Make More

agent artificial-intelligence chatgpt claude-ai cli developer-tools
mem0
⭐ 43726 Python Score 73 Updated 2025-11-27

Universal memory layer for AI Agents

agents ai ai-agents application chatbots chatgpt
Langchain-Chatchat
⭐ 36690 Python Score 73 Updated 2025-11-10

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local kno...

chatbot chatchat chatglm chatgpt embedding faiss
PaddleNLP
⭐ 12862 Python Score 73 Updated 2025-11-28

Easy-to-use and powerful LLM and SLM library with awesome model zoo.

bert compression distributed-training document-intelligence embedding ernie
segmentation_models.pytorch
⭐ 11111 Python Score 73 Updated 2025-11-26

Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.

computer-vision deeplab-v3-plus deeplabv3 dpt fpn image-processing
ComfyUI
⭐ 95084 Python Score 73 Updated 2025-11-30

The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.

ai comfy comfyui python pytorch stable-diffusion
nni
⭐ 14305 Python Score 72 Updated 2024-07-03

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper...

automated-machine-learning automl bayesian-optimization data-science deep-learning deep-neural-network
tensorzero
⭐ 10615 Rust Score 69 Updated 2025-12-01

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

ai ai-engineering anthropic artificial-intelligence deep-learning genai
anything-llm
⭐ 51699 JavaScript Score 69 Updated 2025-11-27

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.

ai-agents custom-ai-agents deepseek kimi llama3 llm
ipex-llm
⭐ 8495 Python Score 68 Updated 2025-10-14

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU ...

gpu llm pytorch transformers