FasterTransformer

by Community

Transformer-related optimization, including BERT and GPT

Added 1 June 2026

#bert #gpt #pytorch #transformer

Overview

FasterTransformer is an open-source framework that accelerates transformer model inference. It implements optimized kernels and memory management for models such as BERT and GPT. Written in C++ and CUDA, it targets high-performance execution on NVIDIA GPUs.

Best for

Developers seeking maximum inference performance for transformer models on NVIDIA hardware

Use cases

Deploying large BERT models for low-latency inference
Running GPT-based text generation with higher throughput
Optimizing transformer inference on NVIDIA GPUs
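
Latency and throughput claims like the ones above are easiest to judge by measuring them on your own workload. The following is a minimal sketch of such a measurement, not part of FasterTransformer: `benchmark`, `run_inference`, and `fake_model` are hypothetical names, and `fake_model` is a stand-in that only sleeps. It uses the Python standard library only.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(run_inference: Callable[[List[str]], object],
              batch: List[str],
              warmup: int = 3,
              iters: int = 20) -> Dict[str, float]:
    """Time a (hypothetical) inference callable on one batch.

    Returns mean latency, 95th-percentile latency, and items per second.
    """
    if iters < 2:
        raise ValueError("iters must be at least 2 to compute percentiles")

    # Warm-up runs are discarded so one-time setup cost does not skew results.
    for _ in range(warmup):
        run_inference(batch)

    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference(batch)
        latencies.append(time.perf_counter() - start)

    return {
        "mean_ms": statistics.mean(latencies) * 1000,
        # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile.
        "p95_ms": statistics.quantiles(latencies, n=20)[-1] * 1000,
        "throughput_items_per_s": len(batch) * iters / sum(latencies),
    }


def fake_model(batch: List[str]) -> List[int]:
    # Assumption: placeholder for a real backend call; it only sleeps briefly.
    time.sleep(0.001)
    return [len(text) for text in batch]


if __name__ == "__main__":
    print(benchmark(fake_model, ["hello world"] * 8))
```

To compare backends, pass each one's inference function in place of `fake_model` with the same batch. With a GPU backend, the callable would need to block until results are ready, otherwise the timer only captures the launch cost.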

Notes

6,418 stars on GitHub. Last updated 2024-03-27. Licensed Apache-2.0.

Pros

Delivers highly optimized inference speed for supported transformers
Strong community adoption (6,418 stars), though the repository was last updated 2024-03-27 and upstream directs new development to TensorRT-LLM
Tuned specifically for NVIDIA GPU architectures

Cons

Limited to NVIDIA GPUs, no CPU or other hardware support
C++ codebase requires compilation and integration effort
Does not offer a high-level API; manual configuration needed
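
Because the framework only runs on NVIDIA GPUs, a quick preflight check can save a failed build or deployment. The sketch below is an assumption about how one might do this, not something FasterTransformer provides: `list_nvidia_gpus` is a hypothetical helper that shells out to the `nvidia-smi` tool shipped with the NVIDIA driver and returns an empty list when no usable GPU is found.

```python
import shutil
import subprocess
from typing import List


def list_nvidia_gpus() -> List[str]:
    """Return GPU names reported by nvidia-smi, or [] if none is usable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The driver tools are not on PATH, so assume no NVIDIA GPU is available.
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed or timed out (e.g. driver not loaded).
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_nvidia_gpus()
    if gpus:
        print("NVIDIA GPUs found:", ", ".join(gpus))
    else:
        print("No NVIDIA GPU detected; a CPU-capable alternative is needed.")
```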

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Pairs with (2 entries)

O OSS Obs medium

PyTorch

Community

Tensors and Dynamic neural networks in Python with strong GPU acceleration

★ 100,318 updated 1mo ago

O OSS Obs medium

TensorFlow

Community

An Open Source Machine Learning Framework for Everyone

★ 195,356 updated 1mo ago

Alternative to (4 entries)

O OSS Framework medium

TensorRT-LLM

Community

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NV

★ 13,781 updated 1mo ago

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago

O OSS Framework medium

DeepSpeed

Community

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

★ 42,436 updated 1mo ago

O OSS Framework medium

Colossal-AI

Community

Making large AI models cheaper, faster and more accessible

★ 41,382 updated 1mo ago
