O Open Source Frameworks medium

Text-Embeddings-Inference

by Community

A blazing fast inference solution for text embeddings models

OSS

Added 1 June 2026

#ai #embeddings #huggingface #llm #ml

Overview

Text-Embeddings-Inference is a framework for serving text embedding models at high throughput. Built in Rust, it exposes a REST API that generates embeddings from a range of transformer models. It is designed for low-latency inference, which makes it suitable for production embedding pipelines.
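
As a rough illustration of the REST API mentioned above, here is a minimal Python sketch that posts text to the server's `/embed` route using only the standard library. The base URL `http://localhost:8080` is an assumption about where a server might be running, and the helper names `build_embed_request` and `embed` are hypothetical, not part of the project.

```python
import json
import urllib.request


def build_embed_request(base_url, texts):
    """Build a POST request for the /embed route (hypothetical helper)."""
    payload = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(base_url, texts, timeout=30.0):
    """Send the request and return the parsed JSON body.

    Assumes the response is a JSON list with one embedding vector per input.
    """
    request = build_embed_request(base_url, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call, assuming a server is reachable at this address:
# vectors = embed("http://localhost:8080", ["What is deep learning?"])
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running server.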

Best for
Developers who need fast, scalable embedding serving for search or NLP pipelines

Use cases

Generate embeddings for semantic search
Compute embeddings for text classification
Serve embeddings for clustering workflows
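
To make the semantic search use case concrete, the sketch below ranks documents by cosine similarity between a query embedding and document embeddings. It assumes the vectors have already been generated by an embedding server; the function names are hypothetical and the short vectors in the usage example are toy values, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, document_vectors):
    """Return (document_index, score) pairs, most similar first."""
    scores = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(document_vectors)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (illustrative only).
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_documents(query, documents)
```

For larger collections, a vector database such as the ones listed under "Pairs with" would typically replace this brute-force loop.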

Notes

4,829 stars on GitHub. Last updated 2026-05-26. Licensed Apache-2.0.

Pros

High throughput due to Rust implementation
Supports a wide range of embedding models from Hugging Face
Low latency inference

Cons

Does not serve generative models; it is limited to text embedding workloads
Needs suitable hardware (a GPU) for best performance
Listed as a community project, so support may be limited

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Pairs with: 4 entries

O OSS Obs medium

Qdrant

Community

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

★ 31,735 updated 1mo ago

O OSS Obs medium

Milvus

Community

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

★ 44,579 updated 1mo ago

O OSS Obs medium

Chroma

Community

Search infrastructure for AI

★ 28,173 updated 1mo ago

O OSS Framework medium

LangChain

Community

The agent engineering platform.

★ 138,234 updated 1mo ago

Alternative to: 1 entry

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago
