Eigen Labs Research
We present Darkbloom, a decentralized inference network. AI compute today flows through three layers of markup: from GPU manufacturers to hyperscalers, to API providers, to end users. Meanwhile, over 100 million Apple Silicon machines sit idle for most of each day. We built a network that connects them directly to demand. Operators cannot observe inference data. The API is OpenAI-compatible. Our measurements show 50% lower per-token costs than centralized alternatives, and operators retain 100% of revenue.
For users
Idle hardware has near-zero marginal cost, and that saving passes through to price. The OpenAI-compatible API covers chat, image generation, and speech-to-text. Every request is end-to-end encrypted.
For hardware owners
Your Mac already has the hardware. Operators keep 100% of inference revenue. Electricity cost on Apple Silicon runs $0.01–0.03 per hour depending on workload. The rest is profit.
02 — Motivation
The AI compute market has three layers of margin.
Current supply chain
GPU manufacturers → Hyperscalers → API providers → End users
This concentrates both wealth and access. A small number of companies control the supply. Everyone else rents.
100M+ Apple Silicon machines shipped since 2020
3x+ markup from silicon to end-user API price
18hrs average daily idle time per machine
100% of revenue goes to the hardware owner
03 — The Challenge
Other decentralized compute networks connect buyers and sellers. That is the easy part. The hard part is trust: the operator owns the machine, and without further safeguards could read every prompt and response that runs on it.
04 — Our Approach
We eliminate every software path through which an operator could observe inference data. Four layers, each independently verifiable.
Encryption
Requests are encrypted on the user's device before transmission. The coordinator routes ciphertext. Only the target node's hardware-bound key can decrypt.
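To illustrate the shape of this layer, here is a toy sealed-envelope sketch in Python using only the standard library. It is not the production cipher: a real deployment would use an authenticated cipher under the node's hardware-bound asymmetric key, whereas this stand-in derives a keystream from SHA-256 and authenticates with HMAC under a shared key. The point it demonstrates is that the coordinator only ever handles an opaque blob.

```python
import hashlib
import hmac
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: hash-counter construction over SHA-256.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, plaintext: bytes) -> bytes:
    # Encrypt-then-MAC: the coordinator routes this blob but cannot read it.
    nonce = secrets.token_bytes(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def unseal(key: bytes, blob: bytes) -> bytes:
    # Reject any blob whose authentication tag does not verify.
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("ciphertext was tampered with in transit")
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))
```

In the real network, the shared key in this sketch is replaced by a key agreement against the node's hardware-bound public key, so only the target node can open the envelope.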
Hardware
Each node holds a key generated inside Apple's tamper-resistant secure hardware. The attestation chain traces back to Apple's root certificate authority.
Runtime
The inference process is locked at the OS level. Debugger attachment is blocked. Memory inspection is blocked. The operator cannot extract data from a running process.
Output
Every response is signed by the specific machine that produced it. The full attestation chain is published. Anyone can verify it independently.
Prompts are encrypted before they leave your machine. The coordinator routes traffic it cannot read. The provider decrypts inside a hardened process it cannot inspect. The attestation chain is public.
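The verification step above can be sketched as a toy chain-of-custody check. This is not Apple's actual attestation format: real verification walks X.509 certificates with public-key signatures, while this sketch substitutes HMAC (so the toy verifier holds each issuer's key) purely to show the walk from a trusted root down to a leaf node key.

```python
import hashlib
import hmac

def sign(issuer_key: bytes, payload: bytes) -> bytes:
    # Toy stand-in for an asymmetric signature: HMAC under the issuer's key.
    return hmac.new(issuer_key, payload, hashlib.sha256).digest()

def issue(issuer_key: bytes, subject: str, subject_key: bytes) -> dict:
    # The issuer vouches for the (subject, subject_key) binding.
    payload = subject.encode() + subject_key
    return {"subject": subject, "key": subject_key, "sig": sign(issuer_key, payload)}

def verify_chain(root_key: bytes, chain: list) -> bool:
    # Walk from the trusted root to the leaf, checking each link's signature.
    issuer_key = root_key
    for cert in chain:
        payload = cert["subject"].encode() + cert["key"]
        if not hmac.compare_digest(cert["sig"], sign(issuer_key, payload)):
            return False
        issuer_key = cert["key"]
    return True

# Toy chain: root -> intermediate -> node. In the real network the root is
# Apple's certificate authority and the leaf key lives in secure hardware.
root_key = b"toy-root-key"
intermediate = issue(root_key, "intermediate-ca", b"toy-intermediate-key")
node = issue(b"toy-intermediate-key", "node-123", b"toy-node-key")
assert verify_chain(root_key, [intermediate, node])
```

Any verifier holding only the published chain and the trusted root can replay this walk, which is what makes the attestation independently checkable.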
05 — Implementation
Change the base URL. Everything else works. Streaming, function calling, all existing SDKs.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.darkbloom.dev/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="mlx-community/gemma-4-26b-a4b-it-8bit",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    # The final streamed chunk may carry no delta content.
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Streaming — SSE, OpenAI format
Image generation — FLUX.2 on Metal
Speech-to-text — Cohere Transcribe
Large MoE — up to 239B params
06 — Results
Idle hardware has near-zero marginal cost, so the savings pass through. No subscriptions or minimums. Per-token pricing compared against OpenRouter equivalents.
| Model | Notes | Input | Output | OpenRouter (output) | Savings |
|---|---|---|---|---|---|
| Gemma 4 26B | 4B active, fast multimodal MoE | $0.03 | $0.20 | $0.40 | 50% |
| Qwen3.5 27B | Dense, frontier reasoning | $0.10 | $0.78 | $1.56 | 50% |
| Qwen3.5 122B MoE | 10B active, best quality | $0.13 | $1.04 | $2.08 | 50% |
| MiniMax M2.5 239B | 11B active, SOTA coding | $0.06 | $0.50 | $1.00 | 50% |
Prices per million tokens
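To make the table concrete, a quick sketch of the per-request arithmetic. The rates are the Gemma 4 26B row above; the token counts are hypothetical.

```python
# Per-million-token prices for Gemma 4 26B, from the pricing table (USD).
GEMMA = {"input": 0.03, "output": 0.20}
OPENROUTER_OUTPUT = 0.40  # comparison output price from the same table

def request_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    # Cost in USD for one request at per-million-token rates.
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000

# Hypothetical request: a 2,000-token prompt producing a 500-token reply.
cost = request_cost(GEMMA, 2_000, 500)          # $0.00016
savings = 1 - GEMMA["output"] / OPENROUTER_OUTPUT  # 0.5, i.e. the 50% in the table
```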
$0.0015 per image (Together.ai: $0.003)
$0.001 per audio minute (AssemblyAI: $0.002)
0% platform fee: operators keep 100%, with transparent pricing
07 — Operator Economics
Operators contribute idle Apple Silicon and earn USD. 100% of inference revenue goes to the operator. The only variable cost is electricity.
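A back-of-the-envelope sketch of those economics, assuming sustained full utilization, which actual network demand may not deliver. The 100 tok/s throughput is the MiniMax M2.5 figure quoted in the model catalog, priced at its $0.50 per million output tokens; $0.02/hr is a mid-range electricity cost from the stated $0.01–0.03 band.

```python
def hourly_profit(tok_per_s: float, price_per_m_tokens: float,
                  electricity_per_hr: float, revenue_share: float = 1.0) -> float:
    # Tokens generated in one hour, priced per million, minus electricity.
    # revenue_share is 1.0 because operators keep 100% of inference revenue.
    revenue = tok_per_s * 3600 / 1_000_000 * price_per_m_tokens * revenue_share
    return revenue - electricity_per_hr

# Assumed scenario: MiniMax M2.5 on a Mac Studio at full utilization.
profit = hourly_profit(100, 0.50, electricity_per_hr=0.02)  # about $0.16/hr
```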
The installer downloads the provider binary and configures a launchd service.
```shell
$ curl -fsSL https://api.darkbloom.dev/install.sh | bash
```
No dependencies. Auto-updates. Runs as a launchd service.
The full specification covers the architecture, threat model, security analysis, and economic model for hardware-verified private inference on distributed Apple Silicon.
Model Catalog
Curated for quality. Only models worth paying for.
Gemma 4 26B (text): Google's latest fast multimodal MoE, 4B active params
Qwen3.5 27B (text): dense, frontier-quality reasoning (Claude Opus distilled)
Qwen3.5 122B MoE (text): 10B active, best quality per token
MiniMax M2.5 239B (text): SOTA coding, 11B active, 100 tok/s on Mac Studio
Cohere Transcribe (audio): 2B conformer, best-in-class speech-to-text