Copyright © 2026 Simplito sp. z o.o. All rights reserved.


DeepFellow Infra

The open-source foundation for running AI on your own terms

Deploy, manage, and scale AI models across your own hardware - on-premise, private cloud, or hybrid. Host models on your own, with full control over compute and every layer of your AI infrastructure.

Free and open source.


| Setup | Specs | Best for | Estimated cost |
| --- | --- | --- | --- |
| Server Solution | 60GB VRAM, 256GB RAM, 8×RTX 4000 Ada | Large models, high performance | $5,000–$76,000 |
| PC workstation | 32–128GB RAM, 2×Nvidia GPU | Small and medium models at speed | $2,000–$9,000 |
| Mac Studio | 35–512GB RAM, up to 80-core GPU | Large models at moderate speed | $3,000–$25,000 |
| MacBook Pro | 36–128GB RAM, up to 40-core GPU | Medium models, individual use | $2,000–$10,000 |

See full hardware recommendations


What you get

Flexible and efficient model hosting

Install, host, and run multiple LLM, ML, voice, and image generation models across any setup. Works out of the box with vLLM, Ollama, llama.cpp, Hugging Face models, and any custom ML model. Supports embeddings, text-to-speech, speech-to-text, and image generation natively.

Tree-cluster topology with load balancing

Build resilient, multi-node infrastructure with automatic node organization and intelligent load balancing. Delegate entire nodes or single machines to specific models or tasks. Scale vertically and horizontally — no downtime, no manual reshuffling.
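To make the idea concrete, here is a minimal sketch of tree-structured node delegation with least-loaded selection. This is an illustration of the concept only, not DeepFellow's actual scheduling algorithm; all names (`Node`, `pick_node`, the model identifiers) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    models: set[str]          # models this node is delegated to serve
    children: list["Node"] = field(default_factory=list)
    active_requests: int = 0  # a simple load signal

def candidates(root: Node, model: str) -> list[Node]:
    """Walk the tree and collect every node delegated to this model."""
    found = [root] if model in root.models else []
    for child in root.children:
        found.extend(candidates(child, model))
    return found

def pick_node(root: Node, model: str) -> Node:
    """Least-loaded candidate wins; raise if no node serves the model."""
    nodes = candidates(root, model)
    if not nodes:
        raise ValueError(f"no node serves {model!r}")
    return min(nodes, key=lambda n: n.active_requests)

# Two-level tree: a coordinator delegating to two GPU workers.
root = Node("coordinator", models=set())
root.children = [
    Node("gpu-a", {"llama-3.1-8b"}, active_requests=3),
    Node("gpu-b", {"llama-3.1-8b", "whisper"}, active_requests=1),
]
```

Adding a worker is just attaching a child node; removing one shrinks the candidate set without touching the rest of the tree, which is the property that lets a tree topology scale without manual reshuffling.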

OpenAI-compatible API

A single, unified API endpoint for all model interactions. Drop-in compatible with existing OpenAI-based tooling and workflows. No rewrites, no adapters.
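Because the endpoint follows the OpenAI wire format, any client that can POST to `/v1/chat/completions` works unchanged. The sketch below builds such a request with only the standard library; the base URL and model name are placeholders for whatever your own gateway exposes.

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str,
                       messages: list[dict],
                       api_key: str = "none") -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for any compatible endpoint."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode()
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # many local gateways accept any token
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Placeholder address -- point this at wherever your cluster's gateway listens.
req = build_chat_request(
    "http://localhost:8000",
    model="llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello"}],
)
# resp = urllib.request.urlopen(req)  # uncomment against a live server
# print(json.load(resp)["choices"][0]["message"]["content"])
```

The same shape is what existing OpenAI SDKs emit, which is why pointing their base URL at the cluster is the only change required.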

Local, hybrid, and cloud inference

Run fully on-premise, connect to private cloud, or build hybrid setups spanning both. Infra handles routing and distribution across the entire cluster regardless of where compute lives. Runs on AWS and other cloud providers out of the box.
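One common hybrid policy is "prefer on-prem, spill to cloud when saturated." The sketch below shows that policy in isolation; it is an assumption-laden illustration, not DeepFellow's routing logic, and the backend names are invented.

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    location: str   # "on_prem" or "cloud"
    in_flight: int  # requests currently being served
    capacity: int   # concurrent-request budget

def route(backends: list[Backend]) -> Backend:
    """Prefer on-prem capacity; fall back to cloud only when it is saturated."""
    def free(b: Backend) -> int:
        return b.capacity - b.in_flight
    on_prem = [b for b in backends if b.location == "on_prem" and free(b) > 0]
    pool = on_prem or [b for b in backends if free(b) > 0]
    if not pool:
        raise RuntimeError("cluster saturated")
    return max(pool, key=free)

fleet = [
    Backend("rack-1", "on_prem", in_flight=8, capacity=8),   # full
    Backend("rack-2", "on_prem", in_flight=2, capacity=8),
    Backend("aws-g5", "cloud",   in_flight=0, capacity=32),
]
```

The point of routing at this layer is that callers never see the split: the same endpoint answers whether the request landed on a rack downstairs or a cloud instance.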

Integrations

Native LangChain integration, Vector Store support, custom endpoints, and plugin architecture. Connects directly to DeepFellow Server as the compute backbone for the full stack.

CLI-first, GitOps-ready

Full CLI-based management for automation, scripting, and GitOps workflows. Every operation available in a terminal - versionable, automatable, repeatable.

Runs on hardware you already have

DeepFellow Infra adapts to your hardware.

There are no strict minimum requirements - start small and scale when you need to. See the example spec and cost table above.

DeepFellow Infra is free and open source.

You can inspect it, extend it, self-host it and start owning your AI. No licensing fees, no vendor lock-in on the foundation layer.

View on GitHub
Read the docs

DeepFellow Infra is the foundation.

Add Server for orchestration and access control, and Enterprise Plugins for compliance, auditability, governance, and security.

Explore DeepFellow Server
Explore DeepFellow Enterprise

Ready to build?

Book a demo and we'll help you start building.

Explore resources about licensing, use cases, and technical documentation.

Schedule a demo
Start building