PORTFOLIO // 2026
Koshi Mazaki

Product Architect

Founder, Glitch Candies Studio.

Building at the intersection of Creative Coding and Agentic AI.

I design and deploy unique aesthetics for immersive, AI-driven experiences.

Flux Motion

Deforum was the go-to animation tool for Stable Diffusion — powerful, flexible, but tightly coupled to A1111. When AnimateDiff arrived, the community shifted to ComfyUI's node-based workflows. Deforum's complexity became a barrier. This project brings Deforum's power to FLUX with an abstracted UI.

The Thesis

Deforum's motion techniques — camera transforms, prompt morphing, temporal feedback — are still powerful. The tooling just hasn't kept up. This project is three things:

  • Research — adapting Deforum workflows to FLUX's 16/128-channel latent architecture.
  • Product — an abstracted generator UI that makes FLUX animation accessible without mastering legacy tooling.
  • Platform — deployed for public use with a generation library to view, compare, and iterate on outputs.

Classic Deforum assumes 4-channel SD latents; FLUX.1 uses 16 channels, and FLUX.2/Klein uses 128-dimensional tokens. The maths doesn't transfer — this is where the research lives. The UI abstracts that complexity, and the library enables systematic research, discovery, and output capture.

The Challenge

FLUX.2's architecture is fundamentally different from the SD models Deforum was built on: a different latent space, different inference patterns, and a rectified-flow sampler in place of SD's denoising schedule. Making it work means rebuilding core assumptions, not just swapping models.

Research Decisions

  • FLUX-Native Stack: Animation pipeline built on FLUX's latent space — 16 channels (FLUX.1). FLUX.2/Klein: 32-channel VAE output, 2×2 patchified to 128 dimensions per token, each covering 16×16 pixels.
  • Distilled Model Tradeoffs: Klein 4B is a 4-step distilled model — fast iteration but lacks the self-correction depth of 50-step models, requiring explicit anti-collapse techniques.
  • Anti-Drift Corrections: LAB color coherence, pre-sharpening, blue noise dithering. Parameters tuned per-model rather than runtime adaptive.
  • Recursive Collapse Problem: Distilled models rapidly amplify feedback errors without denoising steps to self-correct — frames drift to abstract forms by frame 30-40.
  • Latent Reservoir: Periodic injection of fresh latent entropy prevents collapse while maintaining temporal coherence (sketched after this list).
  • Hybrid V2V Mode: Input video guides structure while AI adds style — balances motion preservation with creative generation. Tuned blend ratios prevent hallucination while maintaining temporal coherence.
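
A minimal sketch of the latent-reservoir idea referenced above, assuming PyTorch latents; the injection period, mix weight, and rescaling choice are illustrative, not the tuned per-model values.

```python
import torch

def inject_latent_entropy(latent: torch.Tensor, frame_idx: int,
                          every_n: int = 8, mix: float = 0.15,
                          generator=None) -> torch.Tensor:
    """Periodically blend fresh Gaussian noise into the recycled latent.

    Distilled (few-step) models have little capacity to self-correct, so the
    feedback loop slowly amplifies its own artefacts. Re-introducing a small
    amount of fresh latent entropy every few frames keeps the distribution
    close to what the model expects without breaking temporal coherence.
    """
    if frame_idx == 0 or frame_idx % every_n != 0:
        return latent
    fresh = torch.randn(latent.shape, generator=generator,
                        device=latent.device, dtype=latent.dtype)
    mixed = (1.0 - mix) * latent + mix * fresh
    # Rescale so the overall latent magnitude stays stable after mixing.
    return mixed * (latent.std() / (mixed.std() + 1e-6))
```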

Deployment

Deployed via Cloudflare Workers edge routing with multi-provider GPU backend (Freepik API, RunPod for custom pipelines). Automated init scripts handle Tailscale networking, model downloads, and GPU warmup. Remote development via Claude Code over Tailscale to GPU instances.

Conceptual Flow

Deforum-style feedback loop adapted for FLUX.2's rectified flow architecture. Classic noise injection replaced with edit-mode refinement — Klein is an editor, not traditional img2img. Pre-sharpening, LAB color matching, and blue noise dithering compensate for the different denoising behavior. Tested on FLUX.1 Dev, Klein 4B (distilled, 4-step), and Klein 9B base — the distilled model enabled fast iteration cycles.
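
A rough sketch of those per-frame corrections, assuming float RGB frames in [0, 1], scikit-image for the colour-space work, and a precomputed blue-noise texture; amounts and sigmas are illustrative rather than the tuned per-model settings.

```python
import numpy as np
from skimage import color, filters

def lab_color_match(frame: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Match per-channel mean/std of `frame` to `reference` in LAB space.

    Working in LAB keeps luminance and chroma corrections independent, which
    is what stops the slow colour drift and saturation burn of the loop.
    """
    f_lab = color.rgb2lab(frame)
    r_lab = color.rgb2lab(reference)
    for c in range(3):
        f_mean, f_std = f_lab[..., c].mean(), f_lab[..., c].std() + 1e-6
        r_mean, r_std = r_lab[..., c].mean(), r_lab[..., c].std() + 1e-6
        f_lab[..., c] = (f_lab[..., c] - f_mean) / f_std * r_std + r_mean
    return np.clip(color.lab2rgb(f_lab), 0.0, 1.0)

def pre_sharpen(frame: np.ndarray, amount: float = 0.3, sigma: float = 1.0) -> np.ndarray:
    """Unsharp mask before re-encoding, counteracting the VAE round-trip blur."""
    blurred = filters.gaussian(frame, sigma=sigma, channel_axis=-1)
    return np.clip(frame + amount * (frame - blurred), 0.0, 1.0)

def dither(frame: np.ndarray, noise_mask: np.ndarray,
           strength: float = 1.0 / 255.0) -> np.ndarray:
    """Tile a blue-noise mask over the frame to break up banding."""
    h, w = frame.shape[:2]
    tiled = np.tile(noise_mask, (h // noise_mask.shape[0] + 1,
                                 w // noise_mask.shape[1] + 1))[:h, :w]
    return np.clip(frame + strength * (tiled[..., None] - 0.5), 0.0, 1.0)
```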

Animation Pipeline

Per-frame loop: Input (Frame N-1) → Encode (VAE → latent) → Transform (channel motion) → Denoise (FLUX sampling) → Decode (latent → Frame N).
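
The same loop as a skeleton, with placeholder encode / transform / denoise / decode callables standing in for the real pipeline stages; names and signatures here are illustrative, not the project's API.

```python
import torch

def animate(first_frame: torch.Tensor, num_frames: int,
            encode, transform, denoise, decode, prompt_for) -> list:
    """Deforum-style feedback loop: each output frame seeds the next.

    `encode`/`decode` wrap the FLUX VAE, `transform` applies the per-frame
    camera / channel motion in latent space, and `denoise` runs FLUX
    sampling (or Klein's edit-mode refinement) on the motioned latent.
    All five callables are placeholders for the real pipeline stages.
    """
    frames = [first_frame]
    for i in range(1, num_frames):
        latent = encode(frames[-1])                     # Frame N-1 -> latent
        latent = transform(latent, frame_idx=i)         # camera / channel motion
        latent = denoise(latent, prompt=prompt_for(i))  # FLUX sampling
        frames.append(decode(latent))                   # latent -> Frame N
    return frames
```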

FLUX.1 latent space (16 channels):
  • VAE output: z_channels = 16
  • Patchification: 2×2 × 16 = 64 dims per token
  • Motion engine operates pre-patchify on the raw 16 channels

FLUX.2 / Klein token space (128 dimensions per token):
  • VAE output: 32 channels
  • Patchification: 2×2 × 32 = 128 dims per token
  • Token grid: 64×64 = 4,096 tokens
  • Coverage: 16×16 px per token
  • 128 learned dimensions — entangled semantics, not discrete channels

Adaptive corrections (burn, blur, and flicker detection) keep the output frame colour-coherent and anti-burn.
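
The patchify step in code, following the shapes above; this mirrors the standard 2×2 latent packing and is a sketch for orientation, not the project's implementation.

```python
import torch

def patchify(z: torch.Tensor, p: int = 2) -> torch.Tensor:
    """Fold a (B, C, H, W) VAE latent into (B, H/p * W/p, C*p*p) tokens.

    FLUX.1:         C=16 -> 2x2 x 16 =  64 dims per token
    FLUX.2 / Klein: C=32 -> 2x2 x 32 = 128 dims per token
    A 1024x1024 image gives a 128x128 latent, hence a 64x64 = 4,096 token
    grid where each token covers 16x16 pixels (8x VAE + 2x patch).
    """
    b, c, h, w = z.shape
    z = z.reshape(b, c, h // p, p, w // p, p)
    z = z.permute(0, 2, 4, 1, 3, 5)   # (B, H/p, W/p, C, p, p)
    return z.reshape(b, (h // p) * (w // p), c * p * p)

# e.g. a Klein latent for a 1024x1024 frame:
tokens = patchify(torch.randn(1, 32, 128, 128))
print(tokens.shape)  # torch.Size([1, 4096, 128])
```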

Animation Research

This is ongoing research — multiple approaches to FLUX animation, none fully solved. Each reveals different tradeoffs in the latent space:

1. Static Seed Generation

Consistent seed across frames maintains visual coherence. Same latent starting point produces stable aesthetics while allowing controlled variation.
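
A minimal illustration of the idea, assuming a diffusers-style pipeline call; SEED, `pipe`, and `prompt_for` are placeholders.

```python
import torch

SEED = 1234  # illustrative; any fixed value works, what matters is reuse

def fixed_generator(device: str = "cuda") -> torch.Generator:
    """Re-seed a fresh generator to the same value for every frame, so each
    frame starts from the identical initial latent noise."""
    return torch.Generator(device=device).manual_seed(SEED)

# Inside the frame loop (`pipe` and `prompt_for` are placeholders):
#   image = pipe(prompt=prompt_for(i), generator=fixed_generator()).images[0]
```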

2. Diffusers Feedback Loop

Traditional img2img feedback using diffusers pipeline. Achieves smooth temporal transitions but accumulates noise over time as the model repeatedly processes its own output.
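
A sketch of that feedback loop, assuming diffusers' FluxImg2ImgPipeline; the model id, strength, step count, and the `load_first_frame` / `prompt_for` helpers are illustrative assumptions, not the project's code.

```python
import torch
from diffusers import FluxImg2ImgPipeline

# Each generated frame is fed back as the next frame's init image.
pipe = FluxImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

frame = load_first_frame()        # hypothetical helper returning a PIL image
frames = [frame]
for i in range(1, 120):
    frame = pipe(
        prompt=prompt_for(i),     # hypothetical prompt schedule
        image=frame,              # previous output becomes the new init
        strength=0.35,            # low strength = small per-frame change
        num_inference_steps=28,
        guidance_scale=3.5,
    ).images[0]
    frames.append(frame)          # noise accumulates over long runs
```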

3. Hybrid V2V Mode

Input video guides structure while AI adds style. Balances motion preservation with creative generation — tuned blend ratios prevent hallucination while maintaining temporal coherence.
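
One way to express the blend, using PIL; the ratio value and the choice to blend in image space rather than latent space are assumptions of this sketch.

```python
from PIL import Image

def v2v_init(video_frame: Image.Image, prev_generated: Image.Image,
             blend: float = 0.6) -> Image.Image:
    """Blend the source video frame with the previous generated frame.

    Higher `blend` leans on the input video (structure / motion fidelity);
    lower leans on the feedback loop (stylisation). Tuning this ratio is
    what prevents hallucinated motion while keeping temporal coherence.
    """
    return Image.blend(prev_generated, video_frame, alpha=blend)
```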

4. Scheduled Strength + Prompt Morphing

Keyframed strength ramp (0.15 → 0.4 → 0.25) with mid-sequence prompt transition. First half stays subtle, jumps to strong style at 50%, eases back in final quarter. Enables controlled aesthetic shifts without hard cuts.
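
The ramp described above, written as keyframe interpolation; the three strength values come from the text, while the exact keyframe positions and the linear easing are assumptions of this sketch.

```python
def strength_at(i: int, n: int) -> float:
    """Subtle first half, jump to strong style at 50%, ease back in the
    final quarter (0.15 -> 0.4 -> 0.25)."""
    keys = [(0.0, 0.15), (0.5, 0.15), (0.55, 0.4), (0.75, 0.4), (1.0, 0.25)]
    t = i / max(n - 1, 1)
    for (t0, v0), (t1, v1) in zip(keys, keys[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return keys[-1][1]

def prompt_at(i: int, n: int, prompt_a: str, prompt_b: str) -> str:
    """Hard prompt transition at the sequence midpoint."""
    return prompt_a if i / max(n - 1, 1) < 0.5 else prompt_b
```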

System Architecture

  • Animation: FLUX-native motion pipeline (Python, PyTorch, diffusers)
  • Edge: request routing, API gateway (Cloudflare Workers)
  • Frontend: generation UI, gallery (Next.js, React)
  • Fast inference: standard FLUX models (Freepik API)
  • Custom pipelines: Deforum, LTX, ControlNets (RunPod Serverless)
  • Storage: asset persistence, CDN (Cloudflare R2)

How It Fits Together

  • 1. Motion-aware animation engine that operates in FLUX's latent space
  • 2. Deployment platform that makes generation provider-agnostic
  • 3. Edge layer handles routing, failover, and storage automatically (routing sketched below)
  • 4. Research and production in the same system — new models get benchmarked here
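
The routing decision itself is simple. Here it is rendered in Python for illustration only: the real edge layer runs in Cloudflare Workers, and the provider callables and pipeline names below are hypothetical stand-ins for the two backends listed above.

```python
CUSTOM_PIPELINES = {"deforum", "ltx", "controlnet"}  # served from RunPod

def pick_backend(job: dict) -> str:
    """Standard FLUX jobs go to the fast hosted API; custom pipelines go to
    serverless GPU workers."""
    return "runpod" if job.get("pipeline") in CUSTOM_PIPELINES else "freepik"

def run_with_failover(job: dict, backends: dict) -> dict:
    """Try the preferred backend first; fall back to the others on failure."""
    preferred = pick_backend(job)
    order = [preferred] + [name for name in backends if name != preferred]
    last_error = None
    for name in order:
        try:
            return backends[name](job)   # hypothetical per-provider callable
        except Exception as exc:         # real code would narrow this
            last_error = exc
    raise RuntimeError("all backends failed") from last_error
```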

Stack

Python • PyTorch • diffusers • ComfyUI • Next.js 15 • Cloudflare Workers • RunPod • Freepik • Tailscale