Your first video

A guided run through the eight choices that shape a video, from format to voice to thumbnail style.

The brief form has eight choices in it. Most can be left at their defaults the first time; a few are the difference between a video that lands and one that doesn't. This page covers each one.

The brief itself

The single biggest lever. The text field at the top of the form is what Claude reads to generate the outline.

Three pillars work consistently:

Topic + angle. Not just "the Berlin Wall" but "the Berlin Wall through the lens of one family that was split between East and West for twenty-eight years". The narrower the angle, the better the script. AI documentary lives or dies on angle.

Tone. "Sober documentary", "energetic explainer", "deadpan commentary", "true-crime gravitas". Two or three adjectives. The voice picker and the script generator both read this.

Hard constraints. Anything that has to be in the video, and anything that must NOT be. "Don't speculate about motive, only sourced facts." "Include the 1987 Tear Down This Wall speech as a fixed beat near the end."
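One way to keep the three pillars straight is to draft them as separate fields and join them before pasting into the brief field. The sketch below is only a drafting aid: the `Brief` class, its field names, and the "Tone:" / "Constraint:" labels are hypothetical, not something the brief form requires or parses.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container for the three parts of a brief."""
    topic_and_angle: str
    tone: list[str] = field(default_factory=list)
    hard_constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the parts into one block of text to paste into the brief field."""
        lines = [self.topic_and_angle.strip()]
        if self.tone:
            lines.append("Tone: " + ", ".join(self.tone) + ".")
        for constraint in self.hard_constraints:
            lines.append("Constraint: " + constraint.strip())
        return "\n".join(lines)


brief = Brief(
    topic_and_angle=(
        "The Berlin Wall through the lens of one family that was split "
        "between East and West for twenty-eight years."
    ),
    tone=["sober", "documentary"],
    hard_constraints=["Don't speculate about motive, only sourced facts."],
)
print(brief.render())
```

Writing the parts separately makes it obvious when one of them is missing, which is usually the angle.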

If you draw a blank, the brief form has a "Help me write this" affordance. It opens a coach dialog with worked examples for the niche you've picked, a step-by-step builder, and a "polish what I've written" button that runs your draft through Claude with a humanise pass.

Format

Two formats today: Documentary and Listicle.

Documentary is the default. Linear narrative, single subject, sustained tone. Use it for history, science explainers, deep-dive commentary, biographies, anything that benefits from a beginning-middle-end.

Listicle is for ranked sets: "Top 7 mistakes the Allies made at Dunkirk", "5 startups that almost killed Google". The pipeline knows to write a hook, then equal-weight item beats, then a payoff. Item count is configured separately, and the brief form will warn you if you've picked a count that doesn't fit the length.
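To get a feel for whether an item count fits a length, it helps to do the arithmetic yourself. The sketch below assumes a 30-second hook, a 30-second payoff, and a 45-second floor per item. All three numbers are illustrative guesses, not the thresholds the brief form actually uses.

```python
HOOK_SECONDS = 30       # assumed, for illustration only
PAYOFF_SECONDS = 30     # assumed, for illustration only
MIN_ITEM_SECONDS = 45   # assumed floor below which an item feels rushed


def seconds_per_item(length_minutes: int, item_count: int) -> float:
    """Split the runtime left after hook and payoff equally across items."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    body_seconds = length_minutes * 60 - HOOK_SECONDS - PAYOFF_SECONDS
    return body_seconds / item_count


def count_fits(length_minutes: int, item_count: int) -> bool:
    """True if each equal-weight item gets at least the assumed floor."""
    return seconds_per_item(length_minutes, item_count) >= MIN_ITEM_SECONDS
```

Under these assumptions, seven items in an 8-minute video get a minute each, while ten items would drop to 42 seconds apiece and fall under the floor.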

More formats are queued and will land one at a time: Commentary, Compilation, Explainer, Tutorial, and Shorts.

Length

Four rungs: 8, 12, 20, or 30 minutes. The script generator targets the chosen minute count and validates after; you'll see a "duration check" warning on the outline if a scene came back materially over or under.
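The duration check can be pictured as a per-scene comparison between a target and an estimate. This is a rough sketch, not the pipeline's validator: the 150 words-per-minute narration pace and the 15% tolerance standing in for "materially" are both assumptions, and the function names are made up.

```python
WORDS_PER_MINUTE = 150  # assumed narration pace, not a documented value
TOLERANCE = 0.15        # assumed reading of "materially" over or under


def estimate_seconds(scene_text: str) -> float:
    """Estimate narration time from word count at the assumed pace."""
    return len(scene_text.split()) / WORDS_PER_MINUTE * 60


def duration_check(target_seconds: float, scene_text: str) -> str:
    """Return 'over', 'under', or 'ok' for one scene."""
    estimated = estimate_seconds(scene_text)
    if estimated > target_seconds * (1 + TOLERANCE):
        return "over"
    if estimated < target_seconds * (1 - TOLERANCE):
        return "under"
    return "ok"
```

A 60-second scene with roughly 150 words passes under these assumptions; the same scene at 200 words would be flagged as over.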

The 8-minute rung is the sweet spot for new channels: long enough that YouTube reads it as serious content, short enough that you can iterate on cadence and angle quickly. 12 minutes is the workhorse for established channels and is what most full-time long-form operators land on. 20 and 30 minutes are where AI-assisted production starts to compete with manual production on absolute quality, and where the upside is biggest if you nail the angle.

Niche

Thirty-one niches today. Each one carries:

  • A prompting fragment that nudges the script toward conventions of that genre (true crime knows about defamation; science knows about overclaim).
  • A footage routing chain (history starts with archival, gaming with Pexels, science with NASA / Wikimedia).
  • A voice bias: narrators are tagged by genre fit, and the recommender uses your niche.
  • A theme bias: the visual treatment defaults to one that suits the niche.
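The four things a niche carries can be pictured as one record per niche. The sketch below is illustrative only: the `NicheProfile` type, the prompt wording, the fallback sources after "archival", and the pairing of history with the archival-warm theme are all assumptions, not the pipeline's actual configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NicheProfile:
    """Hypothetical record of what one niche contributes."""
    prompt_fragment: str
    footage_chain: tuple   # sources tried in order
    voice_tags: tuple      # genre tags used to recommend narrators
    default_theme: str


# Example entry; every value here is an assumption except that
# history starts its footage chain with archival material.
HISTORY = NicheProfile(
    prompt_fragment="Stick to sourced facts and date every claim.",
    footage_chain=("archival", "wikimedia", "pexels"),
    voice_tags=("history",),
    default_theme="archival-warm",
)


def pick_footage_source(profile: NicheProfile, hits_by_source: dict) -> Optional[str]:
    """Return the first source in the chain that has any results."""
    for source in profile.footage_chain:
        if hits_by_source.get(source, 0) > 0:
            return source
    return None
```

The ordering of the chain is the point: a source further down is only used when the ones before it come back empty.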

Pick the closest match. If your video doesn't quite fit any of the thirty-one, pick the nearest and the pipeline will still produce a good video; the niche-specific lifts just won't all fire.

Voice

The voice picker shows recommended voices for your niche first, then your channel's recently used voices (if you've made videos before), then the full catalogue. Each voice has a play button; listen to a few before picking.
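The ordering described above amounts to concatenating three lists. A minimal sketch follows; the function name is made up, and showing each voice only once (at its earliest position) is an assumption about how the picker handles overlap.

```python
def order_voices(recommended: list, recent: list, catalogue: list) -> list:
    """Recommended first, then recently used, then everything else.

    Assumes a voice that appears in more than one group is listed once,
    at its earliest position.
    """
    seen = set()
    ordered = []
    for group in (recommended, recent, catalogue):
        for voice in group:
            if voice not in seen:
                seen.add(voice)
                ordered.append(voice)
    return ordered
```

A brand-new channel has an empty recent list, so the picker would go straight from recommendations to the catalogue.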

You can also paste an ElevenLabs voice ID to import a custom voice. The first time you do this you'll be asked to tick a consent checkbox confirming you have the rights to the voice and have read ElevenLabs' terms. The consent record carries through every import you make from that point.

Theme

The visual treatment. Cinematic-dark, editorial-paper, archival-warm, et cetera. The picker is niche-biased: it recommends a default and lets you override it.

Each theme controls colour palette, title-card typography, lower-third style, and the visual posture of the comparison overlays. Most creators pick once and stick.

Thumbnail style

Thumbnails are generated by Gemini 3 Pro Image. The brief form has a thumbnail-style picker (cinematic, documentary, listicle, commentary, etc.). The pipeline generates three candidates after the script lands; you pick the one you want on the publish surface.

Custom voiceover

Optional. If you'd rather use your own voice, upload an MP3 / WAV / M4A of the narration in the voice picker. The pipeline aligns clips against your audio (rather than the ElevenLabs synthesised audio) and skips the synthesis step entirely.

There's a length-alignment indicator that shows whether your recording is on-target, light, or over for your chosen video length. If you're over by more than 48 seconds the pipeline suggests trimming the script.
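The indicator's logic can be sketched as a comparison against the chosen length. The 48-second trim threshold comes from the paragraph above; the 15-second window that counts as "on-target" is an assumption, as are the function and constant names.

```python
TRIM_THRESHOLD_SECONDS = 48    # from the text above
ON_TARGET_WINDOW_SECONDS = 15  # assumed; the real window is not documented here


def alignment(recording_seconds: float, target_minutes: int) -> tuple:
    """Return (status, suggest_trim) for an uploaded narration."""
    delta = recording_seconds - target_minutes * 60
    if abs(delta) <= ON_TARGET_WINDOW_SECONDS:
        status = "on-target"
    elif delta < 0:
        status = "light"
    else:
        status = "over"
    return status, delta > TRIM_THRESHOLD_SECONDS
```

For an 8-minute video under these assumptions, a recording of 8:30 reads as over without a trim suggestion, while 8:50 reads as over and triggers the suggestion.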

Custom script

Also optional. If you've already written the script, paste it in instead of writing a brief. The pipeline still generates outline metadata, footage, voice (or accepts your custom voiceover), and timeline. The script critic still runs, but it is gentler with hand-authored prose than with AI-generated prose.

What's next

Editor overview covers what to do after the pipeline lands a draft.

Plans and pricing covers credits, which is relevant if you're trying to budget how many videos a month a plan really gets you.

Cheers,
Carl