
Flux Motion
Deforum was the go-to animation tool for Stable Diffusion — powerful, flexible, but tightly coupled to A1111. When AnimateDiff arrived, the community shifted to ComfyUI's node-based workflows. Deforum's complexity became a barrier. This project brings Deforum's power to FLUX with an abstracted UI.
The Thesis
Deforum's motion techniques (camera transforms, prompt morphing, temporal feedback) are still powerful; the tooling just hasn't kept up. This project is three things:
- Research: adapting Deforum workflows to FLUX's 16/128-channel latent architecture.
- Product: an abstracted generator UI that makes FLUX animation accessible without mastering legacy tooling.
- Platform: deployed for public use, with a generation library to view, compare, and iterate on outputs.
Classic Deforum assumes 4-channel SD latents; FLUX.1 uses 16 channels, and FLUX.2/Klein uses 128-dimensional tokens. The maths doesn't transfer, and that is where the research lives. The UI abstracts that complexity; the library enables systematic research, discovery, and output capture.
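A minimal sketch of that mismatch in PyTorch. The shapes assume a 1024×1024 frame and the 8× VAE downsample implied by the 16×16-pixels-per-token figure given under Research Decisions below; the patchify step is illustrative, not the project's actual code:

```python
import torch

# Latents for a 1024x1024 frame, assuming an 8x spatial VAE downsample.
sd_latent = torch.randn(1, 4, 128, 128)      # SD1.5/SDXL: 4 channels
flux1_latent = torch.randn(1, 16, 128, 128)  # FLUX.1: 16 channels

# FLUX.2/Klein: 32-channel VAE output, 2x2 patchified into a token sequence.
flux2_vae_out = torch.randn(1, 32, 128, 128)
tokens = flux2_vae_out.unfold(2, 2, 2).unfold(3, 2, 2)  # (1, 32, 64, 64, 2, 2)
tokens = tokens.permute(0, 2, 3, 1, 4, 5).reshape(1, 64 * 64, 128)
print(tokens.shape)  # torch.Size([1, 4096, 128]): 128 dims per token

# Deforum's camera maths is a 2D warp over a (C, H, W) grid. It applies
# directly to sd_latent and flux1_latent, but the flat token sequence must
# be un-patchified back to a spatial grid before any such warp makes sense.
```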
The Challenge
FLUX.2 is fundamentally different from the SD models Deforum was built on: a different latent space, different inference patterns, a different architecture. Making it work means rebuilding core assumptions, not just swapping models.
Research Decisions
- FLUX-Native Stack: animation pipeline built directly on FLUX's latent space. FLUX.1 uses 16 channels; FLUX.2/Klein produces a 32-channel VAE output, 2×2 patchified to 128 dimensions per token, each token covering 16×16 pixels.
- Distilled Model Tradeoffs: Klein 4B is a 4-step distilled model. It enables fast iteration but lacks the self-correction depth of 50-step models, requiring explicit anti-collapse techniques.
- Anti-Drift Corrections: LAB color coherence, pre-sharpening, and blue noise dithering. Parameters are tuned per model rather than adapted at runtime.
- Recursive Collapse Problem: without enough denoising steps to self-correct, distilled models rapidly amplify feedback errors; frames drift to abstract forms by frames 30-40.
- Latent Reservoir: periodic injection of fresh latent entropy prevents collapse while maintaining temporal coherence (see the sketch after this list).
- Hybrid V2V Mode: input video guides structure while the AI adds style, balancing motion preservation with creative generation. Tuned blend ratios prevent hallucination while maintaining temporal coherence.
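A minimal sketch of the latent reservoir idea; the injection period and blend ratio are illustrative, not the project's tuned values:

```python
import torch

def inject_reservoir(latent: torch.Tensor, frame_idx: int,
                     every: int = 8, amount: float = 0.15,
                     generator: torch.Generator | None = None) -> torch.Tensor:
    """Periodically blend fresh Gaussian noise into the feedback latent.

    Distilled 4-step models lack the denoising depth to absorb accumulated
    feedback error, so without an entropy source the sequence collapses.
    `every` and `amount` are illustrative; the project tunes them per model.
    """
    if frame_idx == 0 or frame_idx % every != 0:
        return latent
    fresh = torch.randn(latent.shape, generator=generator,
                        device=latent.device, dtype=latent.dtype)
    # Variance-preserving blend keeps the latent on roughly the same scale,
    # so the refresh reads as texture renewal rather than a visible cut.
    return (1.0 - amount) ** 0.5 * latent + amount ** 0.5 * fresh
```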
Deployment
Deployed via Cloudflare Workers edge routing with multi-provider GPU backend (Freepik API, RunPod for custom pipelines). Automated init scripts handle Tailscale networking, model downloads, and GPU warmup. Remote development via Claude Code over Tailscale to GPU instances.
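A hypothetical sketch of the routing and failover decision, written here in Python for illustration; the production logic runs in a Cloudflare Worker, and the endpoints and names below are placeholders:

```python
import httpx

# Placeholder endpoints; the real gateway and providers differ.
BACKENDS = {
    "freepik": "https://example.com/freepik-proxy",  # standard FLUX models
    "runpod": "https://example.com/runpod-proxy",    # custom pipelines
}
CUSTOM_PIPELINES = {"deforum", "ltx", "controlnet"}

def generate(pipeline: str, payload: dict) -> bytes:
    """Route custom pipelines to RunPod Serverless and standard FLUX
    generation to the Freepik API, falling back if the primary fails."""
    primary = "runpod" if pipeline in CUSTOM_PIPELINES else "freepik"
    order = [primary] + [b for b in BACKENDS if b != primary]
    for backend in order:
        try:
            resp = httpx.post(BACKENDS[backend], json=payload, timeout=120)
            resp.raise_for_status()
            return resp.content
        except httpx.HTTPError:
            continue  # try the next provider
    raise RuntimeError("all generation backends failed")
```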
Conceptual Flow
Deforum-style feedback loop adapted for FLUX.2's rectified flow architecture. Classic noise injection replaced with edit-mode refinement — Klein is an editor, not traditional img2img. Pre-sharpening, LAB color matching, and blue noise dithering compensate for the different denoising behavior. Tested on FLUX.1 Dev, Klein 4B (distilled, 4-step), and Klein 9B base — the distilled model enabled fast iteration cycles.
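The pixel-space side of those corrections, sketched with skimage under illustrative values; plain Gaussian noise stands in here for the project's blue noise pattern:

```python
import numpy as np
from skimage import color, filters

def precondition_frame(frame: np.ndarray, reference: np.ndarray,
                       sharpen: float = 0.5, dither: float = 1.5) -> np.ndarray:
    """Corrections applied to frame N-1 before it is re-encoded.

    frame, reference: uint8 RGB arrays; the reference is typically frame 0.
    All values are illustrative; the project tunes them per model.
    """
    # 1) LAB color coherence: match per-channel mean/std to the reference
    #    so hue and brightness drift cannot compound through the loop.
    lab, ref = color.rgb2lab(frame), color.rgb2lab(reference)
    for c in range(3):
        lab[..., c] = (lab[..., c] - lab[..., c].mean()) \
            / (lab[..., c].std() + 1e-6) * ref[..., c].std() + ref[..., c].mean()
    out = color.lab2rgb(lab) * 255.0

    # 2) Pre-sharpening (unsharp mask) to counter the softening each
    #    encode -> refine -> decode round trip introduces.
    blurred = filters.gaussian(out / 255.0, sigma=1.0, channel_axis=-1) * 255.0
    out = out + sharpen * (out - blurred)

    # 3) Dithering breaks up banding before 8-bit quantization; Gaussian
    #    noise here is a stand-in for the blue noise the project uses.
    out = out + np.random.normal(0.0, dither, out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)
```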
Animation Pipeline
Input (frame N-1) → Encode (VAE → latent) → Transform (channel motion) → Denoise (FLUX sampling) → Decode (latent → frame N)
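A minimal sketch of the Transform step, assuming classic Deforum-style camera motion as a per-channel 2D affine warp; function names and motion values are illustrative:

```python
import math
import torch
import torch.nn.functional as F

def warp_latent(latent: torch.Tensor, dx: float = 0.0, dy: float = 0.0,
                zoom: float = 1.0, angle_deg: float = 0.0) -> torch.Tensor:
    """Deforum-style 2D camera motion on a (B, C, H, W) latent.

    dx/dy translate in latent pixels; zoom > 1 zooms in. The warp treats
    every channel identically, which is why the technique carries over from
    4-channel SD latents to FLUX.1's 16 channels unchanged.
    """
    b, _, h, w = latent.shape
    c = math.cos(math.radians(angle_deg)) / zoom
    s = math.sin(math.radians(angle_deg)) / zoom
    # affine_grid maps output coords back to input coords in [-1, 1] space,
    # so this matrix is the inverse of the desired camera motion.
    theta = torch.tensor([[c, -s, -2.0 * dx / w],
                          [s, c, -2.0 * dy / h]],
                         dtype=latent.dtype, device=latent.device)
    grid = F.affine_grid(theta.expand(b, -1, -1), latent.shape,
                         align_corners=False)
    # Reflection padding hides the border the motion would otherwise reveal.
    return F.grid_sample(latent, grid, padding_mode="reflection",
                         align_corners=False)

# Per-frame use (flux_refine is a placeholder for the edit-mode FLUX call):
#   latent = warp_latent(latent, dx=1.5, zoom=1.02)
#   frame = flux_refine(prompt, latents=latent, strength=0.3)
```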
Animation Research
This is ongoing research — multiple approaches to FLUX animation, none fully solved. Each reveals different tradeoffs in the latent space:
- Fixed Seed: a consistent seed across frames maintains visual coherence. The same latent starting point produces stable aesthetics while allowing controlled variation.
- Img2Img Feedback: traditional img2img feedback using the diffusers pipeline. Achieves smooth temporal transitions but accumulates noise over time as the model repeatedly processes its own output.
- Hybrid V2V: input video guides structure while the AI adds style. Balances motion preservation with creative generation; tuned blend ratios prevent hallucination while maintaining temporal coherence.
- Keyframed Strength: a keyframed strength ramp (0.15 → 0.4 → 0.25) with a mid-sequence prompt transition. The first half stays subtle, jumps to strong style at 50%, and eases back in the final quarter, enabling controlled aesthetic shifts without hard cuts (see the sketch below).
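A minimal sketch of that schedule; the breakpoints mirror the ramp described above, while the prompt strings are placeholders:

```python
def schedule(frame_idx: int, total_frames: int) -> tuple[float, str]:
    """Denoise strength and prompt for a frame: subtle first half, hard
    jump to the strong style at 50%, linear ease back over the final
    quarter (0.15 -> 0.4 -> 0.25)."""
    t = frame_idx / max(total_frames - 1, 1)
    if t < 0.5:
        strength = 0.15                               # subtle first half
    elif t < 0.75:
        strength = 0.40                               # strong style
    else:
        strength = 0.40 - (t - 0.75) / 0.25 * 0.15    # ease back to 0.25
    prompt = "subtle style prompt" if t < 0.5 else "strong style prompt"
    return strength, prompt
```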
System Architecture
| Layer | Role | Stack |
| --- | --- | --- |
| Animation | FLUX-native motion pipeline | Python, PyTorch, diffusers |
| Edge | Request routing, API gateway | Cloudflare Workers |
| Frontend | Generation UI, gallery | Next.js, React |
| Fast Inference | Standard FLUX models | Freepik API |
| Custom Pipelines | Deforum, LTX, ControlNets | RunPod Serverless |
| Storage | Asset persistence, CDN | Cloudflare R2 |
How It Fits Together
1. A motion-aware animation engine that operates in FLUX's latent space.
2. A deployment platform that makes generation provider-agnostic.
3. An edge layer that handles routing, failover, and storage automatically.
4. Research and production in the same system: new models get benchmarked here.
Stack
Python • PyTorch • diffusers • ComfyUI • Next.js 15 • Cloudflare Workers • RunPod • Freepik • Tailscale