For Developers


DeepFellow provides all the essential components for deploying production-grade AI infrastructure on your terms. It is built and tested by a team of data-privacy experts, ships production-grade code with predefined integration workflows, and is designed for data-sensitive, compliance-driven environments.

DeepFellow Server

The Control Center for Your AI Infrastructure

DeepFellow Server is the main orchestration layer, responsible for all user interactions, access management, and workflow coordination. Built on FastAPI, it provides an OpenAI-compatible API that makes migration seamless.
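
Because the API is OpenAI-compatible, existing client code usually only needs a new base URL. Below is a minimal sketch using the official OpenAI Python client; the endpoint URL, API key, and model name are illustrative assumptions, not fixed DeepFellow defaults.

# Minimal sketch: calling DeepFellow Server through the standard OpenAI Python client.
# The base URL, API key, and model name below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # assumed local DeepFellow Server endpoint
    api_key="YOUR_DEEPFELLOW_API_KEY",     # key issued by your access management
)

response = client.chat.completions.create(
    model="gemma3:1b",  # any model served by your infra nodes
    messages=[{"role": "user", "content": "Summarize our data retention policy."}],
)
print(response.choices[0].message.content)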


Extensions & Plugins

Modular Security and Compliance

DeepFellow's plugin architecture allows fine-grained control over every aspect of your AI system. Add filtering, anonymization, evaluation, and logging capabilities through simple, configurable plugins.

Plugins are flexible and provide numerous integrations, such as (but not limited to):

  • Prompt Sensitive Data Detection: Control what data reaches your models through configurable filtering pipelines with automatic detection and handling of sensitive data
  • Corporate Filtering Plugin: Predefined sets of corporate data rules enforcing organizational data policies
  • Prompt Anonymization: Protect privacy by anonymizing sensitive information before processing, using the tools and custom rules of your choice (see the sketch after this list)
  • Response Evaluation: Validate model outputs before delivery to ensure quality and safety, with industry-standard quality metrics and factual validation
  • Content Safety Validation: Verify content safety against a set of custom rules
  • Audit Logging: Complete transparency and accountability for regulatory compliance, via OpenTelemetry
  • Performance Analysis: Complete AI environment analysis, with Grafana and Prometheus support for load and performance verification
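
To make the anonymization idea concrete, the sketch below shows the kind of transformation a Prompt Anonymization plugin applies before a prompt is processed. It is conceptual only: the rules and function names are hypothetical and do not represent DeepFellow's actual plugin interface.

# Conceptual example of rule-based prompt anonymization (hypothetical, not the
# DeepFellow plugin API): sensitive fragments are masked before processing.
import re

ANONYMIZATION_RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),   # e-mail addresses
    (re.compile(r"\+?\d(?:[ -]?\d){8,}"), "<PHONE>"),       # phone-like digit runs
]

def anonymize_prompt(prompt: str) -> str:
    """Replace sensitive fragments with placeholders using custom rules."""
    for pattern, placeholder in ANONYMIZATION_RULES:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(anonymize_prompt("Mail jan.kowalski@example.com or call +48 600 100 200"))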

Key Capabilities:

  • OpenAI API Compatibility: Drop-in replacement for OpenAI, Claude, and other LLM APIs
  • Access Management: Role-based access control (RBAC) for models, data, and toolboxes
  • Request Routing: Intelligent routing to appropriate infrastructure nodes
  • Workflow Orchestration: Pre-built pipelines and plugins for filtering, validation, and response evaluation
  • Load Balancing: Automatic distribution across multiple infra nodes
  • Logging & Monitoring: OpenTelemetry integration for complete observability

Installation:

# Download and run the DeepFellow installer
curl deepfellow.ai/install.sh | bash
# Install and start the server component
deepfellow server install
deepfellow server start

Deep dive into installation details

DeepFellow Infra

Where Your AI Models Run

DeepFellow Infra is the infrastructure component that hosts and serves your AI models. It supports multiple inference engines, model formats, and deployment topologies – all while maintaining security and performance.

Infrastructure Features:

  • Multi-Node Clusters: Connect multiple infrastructure nodes for scalability
  • Load Balancing: Automatic selection of available nodes with matching models
  • Model Auto-Discovery: Infra nodes automatically register available models
  • Easy Setup: Infra speeds up configuration and manages the whole installation process
  • Multiple Inference Engines: Support for vLLM, Ollama, llama.cpp, and more
  • Model Catalog: Maintained catalog of tested and compatible models
  • Custom Models: Configuration-based support for your own models
  • Hybrid Deployment: Mix on-premises and cloud infrastructure
  • Widest Hardware Support: Infra supports macOS, Linux, and Windows machines, allowing you to use all of your existing infrastructure to build powerful local AI systems
  • NightShift Mode: Extend the capabilities of your models with fine-tuning performed on your Infra during idle time, so they know your domain better every day

Supported Model Providers:

  • Hugging Face: Direct integration with thousands of open-source models
  • Ollama: Easy deployment of quantized models
  • vLLM: High-throughput inference for production workloads
  • llama.cpp: Efficient CPU-based inference
  • Custom Models: PyTorch, TensorFlow, and custom frameworks

Installation

# Install the infra component
deepfellow infra install
# Add an inference engine and a model, then start the node
deepfellow infra service install ollama
deepfellow infra model install ollama gemma3:1b
deepfellow infra start

Deployment Topologies

Perfect for development and testing.


  • Components: Single server + single infra node
  • Setup Time: 5 minutes
  • Use Case: Development, testing, proof-of-concept

A gaming laptop will be enough.

Scalable production deployment.


  • Components: Server cluster + multiple infra nodes
  • Features: Load balancing, high availability, fault tolerance
  • Use Case: Production deployments, enterprise applications

Use existing infrastructure for lighter models and dedicated nodes (like NVIDIA Spark) for heavy computations.

Best of both worlds – local + cloud.


  • Components: On-premises server + mixed infra (local + cloud)
  • Strategy: Sensitive data local, general tasks cloud
  • Use Case: Gradual migration, compliance + performance

Scale demanding models in the cloud; protect your data on-premises.

Maximum scale and resilience.


  • Components: Multiple server instances + distributed infra across regions
  • Features: Geographic distribution, disaster recovery, global scale
  • Use Case: Large enterprises, multi-region deployments

Model Support

DeepFellow supports the latest models from Hugging Face, Ollama, vLLM, and more, including Llama, Mistral, Gemma, DeepSeek, and custom models. Deploy text, audio, and vision models quickly and securely, with hardware recommendations for every setup. Flexible architecture lets you use pre-trained, open-source, or in-house models – all with full API compatibility.
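
For example, the same OpenAI-compatible interface can be used to discover which models your infra nodes currently expose; the endpoint URL and key below are assumptions for illustration.

# Sketch: listing models exposed by DeepFellow Server via the standard
# OpenAI-compatible models endpoint (URL and key are illustrative assumptions).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="YOUR_DEEPFELLOW_API_KEY")
for model in client.models.list():
    print(model.id)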

See the models documentation

Integration Ecosystem:

MCP (Model Context Protocol)

Secure Toolbox Integration


DeepFellow provides secure integration with MCP servers, enabling your AI to use corporate tools and data sources while maintaining data privacy (a minimal toolbox sketch follows the list below).

  • Custom Toolboxes: Create your own tool collections with well-managed access
  • Secure Connectors: Ensure compatibility without exposing internal data
  • Marketplace Ready: Compatible with growing MCP ecosystem
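
As an illustration, a custom toolbox can be a small MCP server. The sketch below uses the MCP Python SDK's FastMCP helper; the tool name and lookup logic are hypothetical, and how the toolbox is registered with DeepFellow Server is not shown here.

# Minimal sketch of a custom MCP toolbox exposing one internal tool.
# Built with the MCP Python SDK (FastMCP); tool name and logic are hypothetical,
# and registration with DeepFellow Server is not shown.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("corporate-toolbox")

@mcp.tool()
def lookup_customer(customer_id: str) -> dict:
    """Return basic, non-sensitive details about a customer from an internal system."""
    # Placeholder: in practice this would call an internal API behind a secure connector.
    return {"id": customer_id, "status": "active"}

if __name__ == "__main__":
    mcp.run()  # serves the toolbox over stdio by default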

Vector Databases (VDB)

RAG and Knowledge Base Integration


Integrate vector databases for retrieval-augmented generation (RAG) without compromising security; a conceptual flow is sketched after the capabilities below.

RAG Capabilities:

  • Real-time document retrieval
  • Similarity search on embeddings
  • Source tracking for factual validation
  • Secure internal knowledge base access
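
The sketch below shows this flow conceptually: retrieve relevant chunks from a vector database, then pass them as context through DeepFellow's OpenAI-compatible API. The retrieval helper, endpoint, and model name are placeholders, not a specific DeepFellow integration.

# Conceptual RAG flow through the OpenAI-compatible API. retrieve_context() stands
# in for any vector-database similarity search; endpoint, key, and model are
# illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="YOUR_DEEPFELLOW_API_KEY")

def retrieve_context(question: str, top_k: int = 3) -> list[str]:
    """Hypothetical similarity search against your internal knowledge base."""
    return ["<relevant document chunk>"] * top_k

question = "What is our incident response procedure?"
context = "\n".join(retrieve_context(question))

response = client.chat.completions.create(
    model="gemma3:1b",  # any model served by your infra nodes
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)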

Data Sources & Tools

Secure Access to Internal Data


AI can operate on your organization's data through secure, server-side integrations provided by custom plugins.

  • Databases: PostgreSQL, MySQL, MongoDB, etc.
  • File Storage: S3, MinIO, NAS, local filesystems
  • APIs: Internal REST/GraphQL APIs
  • Enterprise Systems: CRM, ERP, HRIS integrations
  • Workflow Tools: n8n integration for complex workflows

Platform Features: Multi-Architecture Support

[Platform logos: Windows, Linux, Kubernetes, NVIDIA, Apple, AMD]

Feature Matrix
Core features

Infrastructure:


  • DeepFellow Server
  • DeepFellow Infra
  • Model hosting automation
  • Automatic model updates
  • Model catalog access
  • Custom model support
  • Load balancing
  • Secure server-infra communication

Security:


  • Rights management & RBAC
  • Attack-proof architecture
  • Filtering interfaces
  • Anonymization interfaces
  • Response evaluation interfaces
  • Content safety check interfaces
  • Log system interface
  • EU AI Act compliance

Integrations:


  • OpenAI API compatibility
  • MCP server support
  • Vector database integration
  • Hugging Face models
  • Multiple inference engines

Enterprise Features & Add-Ons


  • Infra Manager (unified management portal)
  • Kubernetes auto-deployment
  • Custom model catalog server
  • Corporate filtering plugins
  • Corporate credentials (OpenLDAP, etc.)
  • Confidential computing integrations
  • Remote computing integrations
  • Priority support & SLA
  • Custom plugin development
  • Dedicated training & onboarding
  • Architecture consulting

Ready to deploy trustworthy AI?

Book a demo of DeepFellow – we will show you how it works in action.

Explore resources about licensing, use cases, and technical documentation.

  • Schedule a demo
  • See documentation
  • Explore use cases
  • Read the license