A.I. Video Creation

AI Video Creation with Pollo AI: step by step inside the Google AI Studio Tutor

Below you’ll find a complete, teachable script you can follow while using Google AI Studio’s on-screen Tutor (screen-share + voice). Learners will connect the Tutor, open Pollo.ai in a new tab, and follow along as the Tutor explains and answers questions live.


1) Quick course flow (what learners will do with the Tutor)

  1. Open Google AI Studio → start a new conversation with the Tutor.

  2. Click Share screen and select the browser tab that has Pollo.ai.

  3. Tell the Tutor what you want to make (e.g., “60-second fantasy trailer from text, with a slow dolly-in and cinematic lighting”).

  4. The Tutor will guide you to the right creation mode in Pollo AI:

    • Text to Video · Image to Video · Video to Video · Lip-Sync Video

    • Photo to Video (Avatar) · Mimic Motion

    • Video Tools: Canvas, Video Upscaler, Denoise Video, Face Swap

  5. The Tutor will help you pick a model (from the long model list), tune length / credits / end-frame, and craft prompts.

  6. Generate, review, iterate. Use Upscaler / Denoise at the end for polish.


2) Pollo AI interface primer (what the Tutor will point at)

  • AI Video (top): quick entry to Image→Video, Text→Video.

  • Features: Video→Video (stylize / transform existing footage), Lip Sync.

  • AI Avatar: Photo→Video, Mimic Motion (map motion from a reference clip).

  • Video Tools:

    • Canvas – storyboard, sequence multiple shots, add captions.

    • Video Upscaler – final 2×/4× upscale; best used after you lock an edit.

    • Denoise Video – cleans grain/flicker; especially helpful for cheaper models.

    • Face Swap Video – replace a face in a clip (ensure you have rights/consent).

  • Model picker: each model shows a typical duration, End frame support, and an estimated credit cost. Use this to budget and choose realism vs. speed.

Tip for learners: do fast ideation with low-credit models, then finalize with a premium model + Upscale + Denoise.


3) End-to-end workflows 

A) Text → Video (from nothing to a moving shot)

  1. Open Text to Video.

  2. Paste a structured prompt:

    • Subject: “golden-armored warrior on a rain-soaked bridge”

    • Scene grammar: camera move, lens, lighting, mood, weather, color palette

    • Motion: “slow push-in, cape ripples, rain splashes, subtle head turn”

    • Duration: “8–12 seconds” (or use the length control in the UI)

  3. Choose a model (see model guide below).

  4. Toggle End frame when you need a clean last frame (thumbnails, continuity).

  5. Generate → review → iterate; when happy, Upscale and Denoise.

B) Image → Video (animate a keyframe or poster)

  1. Upload a hero still (concept art / product shot).

  2. Prompt only the micro-motion you want (camera sway, cloth, particles).

  3. Pick a model that preserves appearance well (Pixverse, Luma Ray, Kling lines).

  4. Generate → Upscale → Denoise.

C) Video → Video (stylize real footage)

  1. Upload reference footage (ensure you own the rights).

  2. Prompt the style (“anime ink”, “noir cine grain”, “hyper-real skin”). Keep motion instructions minimal in the prompt so the source motion dominates.

  3. Generate. If faces come out soft, run Face Swap or Upscale as a finishing pass.

D) Lip-Sync Video / AI Avatar

  1. For Lip Sync, upload the talking head + audio track; set language.

  2. For Photo to Video or Mimic Motion, upload the portrait and a motion reference.

  3. Check alignment → regenerate if phonemes drift; consider Denoise to reduce mouth flicker.


4) Model-by-model guide (pros, cons, best use)

Below is a practical read of what the model list in Pollo AI communicates (names, tags, typical credit guidance, and brief behavior). Use it to pick the right engine for the job.

Budget key: lower credit counts suit cheap ideation; higher credit counts generally buy better fidelity and stability.
“End frame” shown in the picker means you can lock a clean final still when needed.

Pollo 1.6: “Better, faster and cheaper” (≈ 60s · End frame · ~5+ credits)

  • Pros: Speedy drafts, low cost, dependable for storyboarding and social loops.

  • Cons: Less micro-detail; occasional temporal jitter on complex scenes.

  • Best for: Quick tests, simple camera moves, captioned promos.

Pollo 1.5: Advanced, versatile (≈ 5 min · End frame · ~10+ credits)

  • Pros: Stable upgrade over 1.6 for looks; still economical.

  • Cons: Not as crisp as top-tier realism engines.

  • Best for: Product shots, iterative R&D before upgrading to Luma/Kling.

Wan 2.2 (New): Smoother motion, enhanced realism (~4+ credits)

  • Pros: Very fluid motion; attractive naturalism; good for people & apparel.

  • Cons: Can soften textures in ultra-busy frames; tweak prompt to preserve detail.

  • Best for: Lifestyle, fashion try-ons, natural handheld looks.

Wanx 2.1: Realistic outputs (≈ 4 min · ~20+ credits)

  • Pros: Photographic flavor with solid coherence.

  • Cons: Longer renders; can be conservative with big camera moves.

  • Best for: Beauty, food, interiors where texture consistency matters.

Hailuo 02 (Hot): Extreme physics simulations (≈ 60s · ~5+ credits)

  • Pros: Water, cloth, particles feel lively; great for action set-pieces.

  • Cons: Styling can skew “effects-y” if over-prompted.

  • Best for: Weather, explosions, cape dynamics, fantasy magic shots.

Hailuo (no version): Highest video quality (≈ 3 min · ~35+ credits)

  • Pros: Premium visual finish with strong coherence.

  • Cons: Costlier; prompt carefully to avoid “over-gloss”.

  • Best for: Final hero shots when you need that polished sheen.

Hailuo Live2D: Good for 2D animation (≈ 3 min · ~35+ credits)

  • Pros: Keeps flat/illustrated looks consistent; great for anime panels.

  • Cons: Not intended for live-action realism.

  • Best for: Comic/anime scenes, animated posters, VTuber-style loops.

Pixverse V5 (New): Smooth, expressive movements (≈ 2 min · End frame · ~5+ credits)

  • Pros: Emotion and gesture read well; preserves faces decently.

  • Cons: Not the sharpest micro-detail; add Upscaler at the end.

  • Best for: Character-centric storytelling, ads with body language.

Pixverse V4.5: Enhanced realism & camera motions (≈ 60s · End frame · ~10+ credits)

  • Pros: Strong camera language; clean motion arcs.

  • Cons: Slightly less expressive faces than V5 in some cases.

  • Best for: Kinetic B-roll, product fly-throughs.

Pixverse V4 / V3.5: Improved motion/coherence

  • Pros: Reliable mid-tier engines for drafts and social posts.

  • Cons: You’ll likely upscale/denoise for finals.

  • Best for: Cost-conscious campaigns, iteration passes.

Kling 2.1 / 2.1 Master: Enhanced realism & motion fluidity

  • 2.1: (≈ 60s · ~20+ credits)

  • 2.1 Master: (≈ 6 min · ~100+ credits)

  • Pros: Crisp surfaces, strong temporal stability, premium feel.

  • Cons: Master is expensive/time-heavy.

  • Best for: High-stakes hero shots, product macro, automotive, architecture.

Kling 2.0 / 1.6 / 1.0

  • 2.0: Better motion dynamics & aesthetics (≈ 8 min · ~100+ credits)

  • 1.6: More realistic motions (≈ 4 min · ~20+ credits)

  • 1.0: Short videos (≈ 6 min · ~10+ credits)

  • Pros: A clear ladder: 1.0 for cheap tryouts → 1.6 solid → 2.0/2.1+ for finals.

  • Cons: Earlier versions may drift on faces or fine text.

  • Best for: Progressive refinement: ideate → refine → master.

Luma Ray 2 / 2 Flash / 1.6

  • Luma Ray 2: Large-scale model for realistic visuals (≈ 3 min · ~60+ credits)

  • Luma Ray 2 Flash: Faster outputs with coherent motion (≈ 3 min · ~20+ credits)

  • Luma Ray 1.6: Realistic and detailed (≈ 60s · End frame · ~60+ credits)

  • Pros: Renowned for cinematic realism; Flash is great for speed.

  • Cons: Can impose a “cinematic look” even when you want raw/gritty.

  • Best for: Films, luxury products, natural light scenes.

Runway Gen-3 / Gen-4 Turbo

  • Gen-3: Multimodal, professional (≈ 60s · End frame · ~40+ credits)

  • Gen-4 Turbo: Efficient, consistent (≈ 3 min · ~40+ credits)

  • Pros: Strong composition; good editor-friendly outputs.

  • Cons: May be less hyper-photoreal than Luma/Kling on closeups.

  • Best for: Branded content, kinetic typography, agency work.

Google Veo 3 (New) / Veo 3 Fast / Veo 2

  • Veo 3: Realistic outputs w/ natural audio (≈ 3 min · ~140+ credits)

  • Veo 3 Fast: ~30% faster (≈ 3 min · ~70+ credits)

  • Veo 2: HD, visually rich (≈ 5 min · ~180+ credits)

  • Pros: Excellent scene understanding; “natural audio” option on Veo 3 adds ambience for drafts.

  • Cons: Highest credit cost; reserve for finals or marquee shots.

  • Best for: Trailers, hero ads, broadcast-level polish.

Pika 2.2 / 2.1

  • 2.2: Better transition & transformation (≈ 100s · ~30+ credits)

  • 2.1: Crystal-clear & immersive (≈ 100s · ~60+ credits)

  • Pros: Strong with morphs, transforms, animated typography.

  • Cons: Photoreal closeups can look stylized; faces may need extra care.

  • Best for: Music videos, logo morphs, surreal edits.

Vidu 2.0 / Vidu Q1

  • 2.0: Enhanced quality & speed (≈ 60s · End frame · ~10+ credits)

  • Q1: Precise control over motion (≈ 4 min · End frame · ~25+ credits)

  • Pros: Q1 gives nice motion control; 2.0 is a cheap/effective generalist.

  • Cons: Texture fidelity trails the premium class.

  • Best for: Social ads, explainers, motion-controlled demos.

Midjourney (New): Aesthetically pleasing visuals (≈ 4 min · ~60+ credits)

  • Pros: Strong art direction and taste; stylized beauty.

  • Cons: Less literal realism; better for stylized than for true live-action.

  • Best for: Fashion looks, art promos, stylized stories.

Hunyuan: Tencent video model (≈ 12 min · ~20+ credits)

  • Pros: Cost-efficient long-form; stable wide shots.

  • Cons: Slower; may need upscaling.

  • Best for: Longer sequences, establishing shots.

Seedance 1.0 Lite / Pro

  • Lite: Accurate motion & camera control (≈ 60s · ~5+ credits)

  • Pro: Fluid, cohesive multi-shot outputs (≈ 2 min · ~15+ credits)

  • Pros: Great when you care about the camera path; budget friendly.

  • Cons: Visual fidelity is mid-tier; plan to Upscale.

  • Best for: Previz, tutorials, camera-move studies.


5) Prompting cheat-sheet 

  • Shot grammar first: “35 mm lens, shallow depth of field, soft key from left, golden hour backlight, slow push-in.”

  • Action second: “Character lifts head, water drips from armor, sparks in the background.”

  • Style last: “Cinematic, moody teal-and-orange, realistic skin, rain FX.”

  • Constraints: “No text on screen, no watermark, keep face consistent, maintain armor logo.”

  • For Image→Video: Add “preserve composition and costume; micro-motion only.”

  • For Video→Video: Add “retain original timing and blocking.”
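
For learners who keep prompts in a notes file or script, the ordering above can be sketched as a small helper. This is only an illustration, not part of Pollo AI: the function name build_prompt, its parameters, and the mode labels are hypothetical, and the output is plain text meant to be pasted into the prompt box by hand.

```python
def build_prompt(shot, action, style, constraints=(), mode="text"):
    """Join prompt parts in the cheat-sheet order:
    shot grammar, then action, then style, then constraints,
    then the mode-specific reminder (hypothetical helper)."""
    mode_suffix = {
        "text": "",
        "image": "Preserve composition and costume; micro-motion only",
        "video": "Retain original timing and blocking",
    }
    parts = [shot, action, style, *constraints, mode_suffix[mode]]
    # Drop empty parts and trailing periods so the pieces join cleanly.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return ". ".join(cleaned) + "."


if __name__ == "__main__":
    print(build_prompt(
        shot="35 mm lens, shallow depth of field, slow push-in",
        action="Character lifts head, water drips from armor",
        style="Cinematic, moody teal-and-orange, rain FX",
        constraints=["No text on screen", "Keep face consistent"],
        mode="image",
    ))
```

Keeping the parts separate makes it easy to swap only the style or only the action between iterations while the shot grammar stays locked.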


6) Quality pipeline

  1. Ideate cheap (Pollo 1.6 / Seedance Lite / Pixverse V4).

  2. Lock framing & motion.

  3. Upgrade model (Kling 2.1, Luma Ray 2, Veo 3) for hero shots.

  4. Upscale (2×/4×) → Denoise → (optional) Face Swap if identity matters.

  5. Export and assemble shots in your editor.
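
To see why drafting cheap matters, the pipeline can be turned into a rough lower-bound credit estimate. The sketch below makes several assumptions: it treats the "~N+ credits" figures from the model guide as exact per-generation costs (they are minimums, so real spend may be higher), it ignores Upscale, Denoise, and Face Swap because this guide lists no credit figures for them, and the function name is hypothetical.

```python
# Minimum credit figures copied from the model guide ("~N+ credits").
# Real costs may be higher depending on length and settings.
MIN_CREDITS = {
    "Pollo 1.6": 5,
    "Seedance 1.0 Lite": 5,
    "Pixverse V4.5": 10,
    "Kling 2.1": 20,
    "Luma Ray 2": 60,
    "Veo 3": 140,
}


def estimate_min_credits(draft_model, drafts, final_model, finals):
    """Lower-bound credit spend for a draft-then-final plan (illustrative only)."""
    return MIN_CREDITS[draft_model] * drafts + MIN_CREDITS[final_model] * finals


if __name__ == "__main__":
    # Six cheap drafts, then two final renders on a premium model.
    cheap_first = estimate_min_credits("Pollo 1.6", 6, "Kling 2.1", 2)
    # The same eight generations, all on the premium model.
    premium_only = estimate_min_credits("Kling 2.1", 6, "Kling 2.1", 2)
    print(cheap_first, premium_only)  # 70 160
```

Under these assumptions the draft-first plan costs at least 70 credits against at least 160 for iterating directly on the premium model.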


7) Troubleshooting 

  • Faces keep changing: use models known for coherence (Kling/Luma/Pixverse V5), add “consistent face/identity” to prompt, shorten duration, and run Denoise.

  • Too stylized / not realistic enough: switch from Midjourney/Pika to Luma/Kling/Veo.

  • Physics look fake: try Hailuo 02, simplify motion prompts, shorten duration.

  • Budget blow-ups: stick to Pollo 1.6 / Vidu 2.0 for drafts; only send finals to Veo/Luma/Kling Master.

  • Lip-sync off: re-align audio start, reduce head motion, regenerate shorter clips.


8) Ethics & rights 

  • Face Swap / Photo→Video: only use media you own or have written consent for.

  • Voice & music: use licensed or original audio; avoid copyrighted tracks.

  • Logos & brands: do not depict protected marks without permission in commercial output.


9) Three ready-to-teach mini-projects

  1. Hero product shot (30s)

    • Draft: Pollo 1.6 → refine in Pixverse V4.5 → Final: Kling 2.1 → Upscale/Denoise.

  2. Stylized music bumper (15s)

    • Pika 2.2 for transitions → Hailuo 02 for particles → Upscale.

  3. Cinematic character teaser (20s)

    • Image→Video with portrait → Pixverse V5 for expression → Luma Ray 2 for final look.


10) Recap: picking the right model fast

  • Fast & cheap: Pollo 1.6, Vidu 2.0, Seedance Lite.

  • Stylized / artistic: Midjourney, Pika 2.2, Hailuo Live2D.

  • Physics / effects: Hailuo 02.

  • People / faces: Pixverse V5, Kling 2.1, Luma Ray 2.

  • Highest realism: Luma Ray 2, Kling 2.1 Master, Google Veo 3.

  • Longer shots / motion control: Seedance Pro, Vidu Q1, Hunyuan.
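
The recap can also be kept as a small lookup table, for example in a class notebook. The sketch below is illustrative: the goal keys and the cheapest_for function are hypothetical names, the model lists are copied from the recap, and the credit numbers are the minimum figures from the model guide, so the "cheapest" pick is only a starting point.

```python
# Recap categories mapped to the models listed above (goal keys are hypothetical).
RECAP = {
    "fast_cheap": ["Pollo 1.6", "Vidu 2.0", "Seedance 1.0 Lite"],
    "stylized": ["Midjourney", "Pika 2.2", "Hailuo Live2D"],
    "physics": ["Hailuo 02"],
    "people_faces": ["Pixverse V5", "Kling 2.1", "Luma Ray 2"],
    "highest_realism": ["Luma Ray 2", "Kling 2.1 Master", "Veo 3"],
    "long_or_motion_control": ["Seedance 1.0 Pro", "Vidu Q1", "Hunyuan"],
}

# Minimum credit figures from the model guide ("~N+ credits").
MIN_CREDITS = {
    "Pollo 1.6": 5, "Vidu 2.0": 10, "Seedance 1.0 Lite": 5,
    "Midjourney": 60, "Pika 2.2": 30, "Hailuo Live2D": 35,
    "Hailuo 02": 5, "Pixverse V5": 5, "Kling 2.1": 20,
    "Luma Ray 2": 60, "Kling 2.1 Master": 100, "Veo 3": 140,
    "Seedance 1.0 Pro": 15, "Vidu Q1": 25, "Hunyuan": 20,
}


def cheapest_for(goal):
    """Return the lowest-minimum-credit model for a recap category.
    On a tie, the model listed first in the recap wins."""
    return min(RECAP[goal], key=MIN_CREDITS.__getitem__)


if __name__ == "__main__":
    for goal in RECAP:
        print(goal, "->", cheapest_for(goal))
```

Cheapest is not the same as best: use the pick for first drafts, then move up within the same category once framing and motion are locked.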