Editing your video

The working manual: scene-by-scene script edits, finding resolution, per-stage regenerate, search-more, captions, music, save and Magic Reset.

After the pipeline lands a draft, the editor is where most creators spend their hands-on time. This page is the working manual: scene-by-scene, what you can edit, what triggers a regenerate, and the affordances that exist for each kind of fix.

The scene-by-scene loop

The editor is structured around scenes. Each scene is one block of script, one stretch of footage, one chunk of voiceover, one or two timeline frames. Most of your editing time is per-scene, not whole-video.

The canonical loop is:

  1. Read the script for a scene.
  2. Listen to the voiceover preview if anything reads awkwardly.
  3. Look at the chosen footage clip.
  4. Check the finding badges (script critic, vision evaluator) for that scene.
  5. Decide: leave it, regenerate it, edit it.
  6. Move to the next scene with j (or back with k).

The keyboard shortcuts make this fast. See Editor overview for the full shortcut sheet.

Editing the script

Click into a scene's script block. The text becomes editable.

Three kinds of edit work:

In-place text edits (typing, deleting, rewording). Saved as you type, no regenerate needed. The voiceover for that scene is marked stale (the small waveform turns dimmer) and a "Re-synth this scene" affordance appears. You can re-synth one scene at a time or batch them with the timeline's "Re-synth changed scenes" button.

Structural edits (splitting a scene into two, merging two scenes, deleting a scene). The timeline shifts to match. Footage is preserved per surviving scene; the deleted scene's footage candidates are released back to the orchestrator pool so a regen elsewhere can reuse them.

Verbatim line preservation (you want one specific sentence kept untouched across regens). Highlight the line and mark it as a "fixed beat" via the right-click affordance. Future regenerates preserve it. Use sparingly: fixing too many beats fights the model.
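
As a rough mental model of the edit kinds above, the sketch below tracks which scenes have a stale voiceover after in-place edits and which lines are pinned as fixed beats. The Scene class, its field names and scenes_needing_resynth are hypothetical illustrations, not the editor's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Hypothetical scene record; names are illustrative only."""
    script: str
    voice_stale: bool = False
    fixed_beats: set[str] = field(default_factory=set)

    def edit_script(self, new_text: str) -> None:
        # An in-place edit changes the text and marks the voiceover stale.
        # It does not trigger a regenerate.
        if new_text != self.script:
            self.script = new_text
            self.voice_stale = True

    def mark_fixed_beat(self, line: str) -> None:
        # Only a sentence that is actually in the script can be pinned.
        if line not in self.script:
            raise ValueError("fixed beat must be a line in the scene script")
        self.fixed_beats.add(line)


def scenes_needing_resynth(scenes: list[Scene]) -> list[int]:
    """Indices that a batch re-synth of changed scenes would cover."""
    return [i for i, scene in enumerate(scenes) if scene.voice_stale]


scenes = [
    Scene("The bridge opened in 1932."),
    Scene("It cost six million pounds."),
]
scenes[0].mark_fixed_beat("The bridge opened in 1932.")
scenes[1].edit_script("It cost just over six million pounds.")
print(scenes_needing_resynth(scenes))
```

Only the edited scene is reported as needing a re-synth; the pinned scene is untouched.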

Resolving findings

Each scene shows finding badges in the gutter: blockers (red), majors (amber), minors (grey). Click a badge to see the critic's note + the suggested fix.

Three resolution affordances per finding:

  • Looks good dismisses the finding. The critic's view is logged but the script doesn't change. Use when the finding is a false positive (the critic flagged something that's intentional).
  • Apply suggestion replaces the flagged span with the critic's suggested rewrite. The voiceover for that scene is marked stale.
  • Hide diff collapses the finding without dismissing it. Use when you want to read on and come back.

The sticky header at the top of the script panel shows the live count of unresolved findings. The mark-all-blockers-resolved affordance (⌘ + Shift + A) accepts the critic's suggestion for every blocker in one step. Use it carefully: blockers exist for a reason.
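
The resolution flow can be pictured with a small sketch. It assumes a finding carries a severity, the flagged span and a suggested rewrite; the Finding class and every function name here are hypothetical, chosen only to mirror the affordances described above.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BLOCKER = "blocker"  # red badge
    MAJOR = "major"      # amber badge
    MINOR = "minor"      # grey badge


@dataclass
class Finding:
    """Hypothetical finding record; names are illustrative only."""
    severity: Severity
    span: str        # text the critic flagged
    suggestion: str  # the critic's suggested rewrite
    resolved: bool = False


def dismiss(finding: Finding) -> None:
    # "Looks good": resolve the finding without touching the script.
    finding.resolved = True


def apply_suggestion(script: str, finding: Finding) -> str:
    # "Apply suggestion": swap the flagged span for the suggested rewrite.
    finding.resolved = True
    return script.replace(finding.span, finding.suggestion, 1)


def unresolved_count(findings: list[Finding]) -> int:
    # The number a sticky header would display.
    return sum(1 for f in findings if not f.resolved)


def resolve_all_blockers(script: str, findings: list[Finding]) -> str:
    # Accept the suggestion for every unresolved blocker, leave the rest.
    for f in findings:
        if f.severity is Severity.BLOCKER and not f.resolved:
            script = apply_suggestion(script, f)
    return script


script = "The war ended in 1946. It was really very long."
findings = [
    Finding(Severity.BLOCKER, "1946", "1945"),
    Finding(Severity.MINOR, "really very", "very"),
]
script = resolve_all_blockers(script, findings)
print(script, unresolved_count(findings))
```

The minor finding stays open after the bulk action, which is why the unresolved count does not drop to zero.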

Regenerate per stage

Each scene has a regenerate panel with four scopes:

  • Regenerate script rewrites the prose for this scene only. The brief + outline are re-read; the surrounding scenes are unchanged. Costs the script-stage portion of the credit budget. Best when the prose is flat but the scene's premise is right.
  • Regenerate voice re-runs ElevenLabs synthesis for this scene. Same script, fresh synthesis. Costs the voice-stage portion. Best when the synthesis came back with an artefact (cutting, slurring).
  • Regenerate footage re-runs the orchestrator + vision evaluator for this scene. Same script, fresh candidates. Costs the footage-stage portion (which is small). Best when the chosen clip is technically correct but visually off.
  • Regenerate everything does all three in sequence. Costs the full per-scene budget. Best when you want a fresh take.

Each regenerate has an optional hint input, a one-to-three-sentence directive Claude must satisfy. "Less encyclopaedic, more dramatic." "Lead with the date." "Cut the third paragraph." The hint is preserved across the regenerate and recorded in the scene's history.
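
One way to picture the four scopes is as a mapping from scope to the stages that re-run, with the optional hint carried alongside. This is a minimal sketch under that assumption; STAGES_BY_SCOPE and plan_regenerate are hypothetical names and do not describe the product's internal API.

```python
from typing import Optional

# Hypothetical mapping from regenerate scope to the stages it re-runs.
STAGES_BY_SCOPE = {
    "script": ["script"],
    "voice": ["voice"],
    "footage": ["footage"],
    "everything": ["script", "voice", "footage"],  # all three, in sequence
}


def plan_regenerate(scope: str, hint: Optional[str] = None) -> dict:
    """Return the stages to re-run for one scene, plus the cleaned hint."""
    if scope not in STAGES_BY_SCOPE:
        raise ValueError(f"unknown regenerate scope: {scope!r}")
    # Treat an empty or whitespace-only hint as no hint at all.
    cleaned = hint.strip() if hint else ""
    return {"stages": list(STAGES_BY_SCOPE[scope]), "hint": cleaned or None}


plan = plan_regenerate("everything", "Lead with the date. ")
print(plan)
```

Because each stage carries its own share of the credit budget, the narrowest scope that fixes the problem is also the cheapest one.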

A Sharpen with Claude button polishes your draft hint with the same niche + anti-clickbait guardrails the main generation path uses. Useful when your hint reads vague.

Search more footage

Per-scene Search more runs an ad-hoc fresh search with a query of your choice. The candidate grid swaps to show the fresh hits while keeping the original orchestrator results in a "Recent" tab.

Search more has a source-tier toggle: All / Archival / Free stock / Pixabay. The default is All, which walks the niche's default chain. Restricting to one tier is useful when you know the niche should be (e.g.) archival-only but the auto-pick included a stock clip.

If a tier-restricted search comes back empty, the input surfaces a one-click "Try All" affordance that re-runs across the niche default chain. Your query is preserved.
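
The empty-result fallback can be sketched as follows. The search backend is passed in as a plain function so the example stays self-contained; search_more, fake_search and the clip identifiers are hypothetical stand-ins, not the real search API.

```python
from typing import Callable

TIERS = ("All", "Archival", "Free stock", "Pixabay")


def search_more(query: str, tier: str,
                search: Callable[[str, str], list[str]]) -> dict:
    """Run one ad-hoc search and flag whether to offer a 'Try All' retry."""
    if tier not in TIERS:
        raise ValueError(f"unknown source tier: {tier!r}")
    hits = search(query, tier)
    return {
        "query": query,  # preserved so a retry reuses the same text
        "tier": tier,
        "hits": hits,
        # Only a tier-restricted search that came back empty offers the retry.
        "offer_try_all": tier != "All" and not hits,
    }


def fake_search(query: str, tier: str) -> list[str]:
    # Stand-in backend for the example: only the full chain finds anything.
    return ["clip-001", "clip-002"] if tier == "All" else []


result = search_more("1930s bridge construction", "Archival", fake_search)
if result["offer_try_all"]:
    # The one-click retry: same query, widened to the default chain.
    result = search_more(result["query"], "All", fake_search)
print(result["hits"])
```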

Custom voiceover swap

You can swap to a custom voiceover after generation. The voice editor accepts an MP3, WAV or M4A upload and aligns the existing clips against your audio instead of re-synthesising via ElevenLabs. The pipeline's existing alignment indicator (tolerance: 8 percent of target, minimum 15 seconds) surfaces with a waveform preview. See Custom voiceover for upload formats and what runs differently.
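
As a worked example of the tolerance figures, the sketch below assumes they combine as "8 percent of the target duration, but never less than 15 seconds". That reading is an assumption, and both function names are hypothetical.

```python
def alignment_tolerance_s(target_s: float) -> float:
    """Allowed drift in seconds: 8 percent of target, floor of 15 seconds.

    Assumption: the two figures combine as a maximum of the two.
    """
    return max(target_s * 8 / 100, 15.0)


def within_tolerance(target_s: float, uploaded_s: float) -> bool:
    # Compare the uploaded audio length against the target length.
    return abs(uploaded_s - target_s) <= alignment_tolerance_s(target_s)


# Short video: 8 percent of 120 s is 9.6 s, so the 15 s floor applies.
print(alignment_tolerance_s(120))
# Long video: 8 percent of 600 s is 48 s, which exceeds the floor.
print(within_tolerance(600, 640), within_tolerance(600, 660))
```

Under this reading the floor governs any target shorter than 187.5 seconds, and the percentage governs everything longer.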

Per-scene custom-VO is also supported. Upload an MP3 for a single scene; the pipeline aligns that scene to your audio and re-synthesises the rest. Useful for hot-fixes (one scene came back wrong, you re-record just that line).

Captions

Captions are generated from the script with forced-alignment against the voice timing. Edit:

  • Wording inline by clicking into a caption block. The timing recalculates to match the new length.
  • Style (font, size, position, colour, background) via the caption-style panel on the right. Six font choices, six style presets (cinematic-dark, editorial-paper, archival-warm, plain, large-mobile, accessible-high-contrast), letterbox-aware bottom-third positioning. Style choices apply globally to every caption in the video.
  • Timing by dragging the caption block edges on the timeline. Useful for tightening or loosening relative to footage cuts.

Music

The music swap panel surfaces the niche-recommended track + alternatives. Two tabs:

  • Library is the curated CC-BY collection. Recommended track at the top, alternatives in mood/tempo groups below.
  • Upload accepts an MP3 you provide. Must be CC-BY-attributable or your own; the pipeline does not verify rights.

Mix levels (music volume relative to voiceover) live in the audio mixer panel. Defaults are sensible.

Thumbnail

The publish surface shows three Gemini-generated thumbnail candidates after the script lands. Pick one, or upload your own. Custom uploads go through the CSAM scan + safety filters before being attached.

Title and description live on the publish surface too. The title-CTR predictor shows a band-rating for your draft title with one-click "fix this signal" affordances when something hurts CTR.

Save, auto-save, Magic Reset

Edits save as you type. Auto-save fires every 8 seconds and on every significant action (regenerate, search-more selection, caption-style change). Manual save is ⌘ + S.
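
The save cadence can be sketched as a small policy object: a timer flushes pending edits once the interval has passed, and a significant action flushes immediately. The AutoSaver class is a hypothetical illustration with an injected clock value so it can be run deterministically; it is not the editor's implementation.

```python
from typing import Callable


class AutoSaver:
    """Hypothetical auto-save policy; names are illustrative only."""

    INTERVAL_S = 8.0  # the interval quoted in the text

    def __init__(self, save: Callable[[], None]) -> None:
        self._save = save
        self._dirty = False
        self._last_save = 0.0

    def on_edit(self) -> None:
        # Typing marks the document as having unsaved changes.
        self._dirty = True

    def on_significant_action(self, now: float) -> None:
        # Regenerate, search-more selection, caption-style change:
        # save straight away rather than waiting for the timer.
        self._flush(now)

    def tick(self, now: float) -> None:
        # Called periodically; saves only if there is something to save
        # and the interval has elapsed since the last save.
        if self._dirty and now - self._last_save >= self.INTERVAL_S:
            self._flush(now)

    def _flush(self, now: float) -> None:
        self._save()
        self._dirty = False
        self._last_save = now


saves: list[float] = []
clock = {"now": 0.0}
saver = AutoSaver(lambda: saves.append(clock["now"]))

saver.on_edit()
for t in (3.0, 8.0):          # timer ticks at 3 s and 8 s
    clock["now"] = t
    saver.tick(t)
clock["now"] = 9.0
saver.on_significant_action(9.0)
print(saves)
```

The tick at 3 seconds does nothing because the interval has not elapsed, while the significant action at 9 seconds saves regardless of the timer.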

Magic Reset at the foot of the editor reverts the current scene to its last-known-published state. Per-scene, not whole-video. The publish surface has a separate "Revert to draft" affordance for whole-video rollback.

Snapshot history at the foot of every scene block lets you step back through the last few saves. Useful for "I changed something I didn't mean to and now I want it back".
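
A bounded history like this is commonly modelled as a fixed-size queue that drops the oldest entry when a new save arrives. The sketch below assumes a limit of five snapshots purely for illustration (the text only says "the last few"), and SnapshotHistory is a hypothetical name.

```python
from collections import deque


class SnapshotHistory:
    """Hypothetical per-scene history; the limit of 5 is an assumption."""

    def __init__(self, limit: int = 5) -> None:
        # deque(maxlen=...) discards the oldest snapshot automatically.
        self._snapshots: deque[str] = deque(maxlen=limit)

    def record(self, script: str) -> None:
        self._snapshots.append(script)

    def step_back(self, steps: int = 1) -> str:
        # steps=1 is the most recent save, steps=2 the one before it.
        if not 1 <= steps <= len(self._snapshots):
            raise IndexError("no snapshot that far back")
        return self._snapshots[-steps]


history = SnapshotHistory(limit=3)
for version in ("draft one", "draft two", "draft three", "draft four"):
    history.record(version)

print(history.step_back(1))
print(history.step_back(3))
```

With a limit of three, the first draft has already been dropped by the time the fourth is recorded, so stepping back three saves lands on the second draft.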

Preview without re-render

The editor has a preview affordance that plays the current state of the video without triggering a Shotstack re-render. The preview is browser-side (the timeline + the clips + the voiceover stitched together in real time) so it shows pacing and clip choice without spending the render credit.

Use preview to validate edits before committing to a re-render. The render itself is what costs (Shotstack time and bandwidth); the preview is free.

What's next

Rendering and troubleshooting covers the render queue states, what to do when a job is stuck, and the common error messages.

Editor overview covers the three-panel layout, finding-badge meaning, and the full keyboard shortcut sheet.

Custom voiceover is the full picture on bringing your own audio.

Cheers,
Carl