
Maintaining Scene Continuity with Sora-2

We're working on a promo video for our smartphone-based CW practice app. We had great luck last week using the sora-2 app to create B-roll footage for the Gladych Files. This week, I'm hoping to make an entire scripted trailer for the CW app using sora-2.

There are issues, though. The first is that while the Sora app has a storyboard feature (at least the version I can access this week), the API does not. It does, however, allow you to pass in reference images to bridge scenes. That's pretty cool, and it seems to work.
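The bridging idea can be sketched as a simple loop: render a scene, pull a reference image from it, and hand that image to the next scene. This is only a sketch under assumptions: `render_clip` and `grab_bridge_frame` are hypothetical callables standing in for the real API call and the frame extraction, neither of which is shown here.

```python
from typing import Callable, List, Optional, Sequence


def render_sequence(
    prompts: Sequence[str],
    render_clip: Callable[[str, Optional[str]], str],
    grab_bridge_frame: Callable[[str], str],
) -> List[str]:
    """Render scenes in order, feeding each clip's bridge frame to the next.

    render_clip(prompt, reference_image) and grab_bridge_frame(clip) are
    hypothetical stand-ins; wire them to whatever API client you use.
    """
    clips: List[str] = []
    reference: Optional[str] = None  # the first scene has no bridging image
    for prompt in prompts:
        clip = render_clip(prompt, reference)
        clips.append(clip)
        # The frame taken from this clip becomes the next scene's reference.
        reference = grab_bridge_frame(clip)
    return clips
```

Keeping the API call and the frame grab as injected functions means the chaining logic can be tried out with fakes before spending money on real renders.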

I'm working on a Python script to wait for bridging images between clips. That's worked out OK.
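The waiting part of such a script might look something like the polling helper below. It is a sketch, not the script described here: `fetch_status` is a hypothetical callable that asks the API for a job's state, and the status strings `"completed"` and `"failed"` are assumptions for illustration.

```python
import time
from typing import Callable


def wait_for_clip(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status, then return it.

    The terminal status strings below are assumed, not taken from the API docs.
    """
    terminal = {"completed", "failed"}
    waited = 0.0
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"clip still '{status}' after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Once the helper returns `"completed"`, the clip can be downloaded and a bridging image taken from it for the next scene. The `sleep` parameter exists only so the loop can be exercised without real delays.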

Table comparing TL;DR features of Sora web versus Sora API, showing storyboards and scene linkage built into the web app but requiring prompt engineering and referenced video IDs in the API.

The real issue, so far, has been sora-2 moderation.

Profanity and Real-Person Filters

You cannot pass the image of a real person (even one sora-2 invented) between clips. Moderation stops it every time. (Moderation is what sora-2 calls its engine that decides whether it's able to make your video at all.) This is what set off a cascade of moderation issues I've yet to overcome.

Close-up of a man in blue over-ear headphones and an orange jacket, looking into the camera in a studio setting, as if recording or reacting to something on screen.

Geesh. So far, I've tried using a cartoon motif instead, and I've moved to scenes where faces aren't visible. Neither helped. Apparently sora-2 is rather prudish with respect to profanity and violence. Consequently, the scene from this prompt mostly isn't getting created:


Same operator, now imagined as a sci-fi pilot. Semi-stylized digital painting realism,
strong lighting, soft brushwork, still grounded and realistic.

IMPORTANT FACE RULES:
- Pilot always wears a blast helmet.
- Helmet visor and design must completely obscure the eyes and upper face.
- Only the pilot's chin and sometimes the mouth are visible.
- No reflections revealing facial features.

No readable on-screen text.

[SCENE 2 - IMAGINED X-WING / TIE-STYLE COCKPIT, SIDETONE DELAY]

Vertical 6:19 / 9:16.

We transition from the shimmer at the end of Scene 1 directly into the cockpit of a
starfighter, inspired by X-Wing / TIE Fighter designs but generic: angular windows,
metallic ribs, analog switches, retro-futuristic indicator lights, and a flight stick.
Deep space and distant stars are visible outside; no visible enemies, no active combat.

The pilot is the same person as the operator, now imagined in this cockpit.
They wear a bulky blast helmet with a tinted visor and side panels that fully hide the
upper face. Only their chin and mouth area are sometimes visible when they speak.

A reference still from Scene 1 is provided. Use it to match:
- body build,
- basic posture,
- overall lighting mood,
so the operator and pilot feel like the same person.

The pilot keys Morse on a small console paddle. The delayed sidetone problem is the
same as in Scene 1: the audio beeps are slightly late relative to the hand motion.
The pilot clearly knows what's wrong and reacts with authentic frustration.

The pilot yells, visible only from helmet and chin:
"Darn it, I can’t transmit with this sidetone delay!"

They pull off a removable blast visor attachment on the front of the helmet, or tilt
the chin slightly upward in exasperation, but the upper face remains entirely hidden
behind the helmet structure.

Cockpit lighting is dramatic but controlled, emphasizing the polished metal surfaces
and the pilot's gloved hands on the controls.
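A prompt shaped like this one, with a style line, a fixed block of face rules, and a per-scene block, can be assembled from parts so the rules stay identical across scenes. The following is a minimal sketch; `FACE_RULES` and `build_prompt` are illustrative names and are not taken from the script described in this post.

```python
# Shared rules reused verbatim in every scene prompt (text from the prompt above).
FACE_RULES = [
    "Pilot always wears a blast helmet.",
    "Helmet visor and design must completely obscure the eyes and upper face.",
    "Only the pilot's chin and sometimes the mouth are visible.",
    "No reflections revealing facial features.",
]


def build_prompt(style: str, scene_title: str, scene_body: str) -> str:
    """Join a style line, the shared face rules, and one scene description."""
    rules = "\n".join(f"- {rule}" for rule in FACE_RULES)
    return (
        f"{style}\n\n"
        f"IMPORTANT FACE RULES:\n{rules}\n\n"
        f"[{scene_title}]\n\n"
        f"{scene_body}"
    )


# Hypothetical usage with shortened text.
prompt = build_prompt(
    "Same operator, now imagined as a sci-fi pilot.",
    "SCENE 2",
    "The pilot keys Morse on a small console paddle.",
)
print(prompt)
```

Building the string this way keeps the real newlines in one place; serializing it to JSON for a request body is what produces the `\n` escapes.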



Sora-2 Crosstalk

Here's the part that's kind of fun. I asked GPT-5 if sora-2 had access to my chat context. The chatbot assured me that the videobot did not... Yeah, it might though!

Here's an image the kid and I asked GPT-5 to mock up for us a few days ago while planning out how to add math lessons to the Gladych Files. Notice that there are two kids and a dog. sora-2 has never seen this image.
Retro cartoon poster with bold text reading Boys! Girls! Everyone!, showing a smiling red-haired boy, a blonde girl in a blue dress and mortarboard saying Puppies! in a speech bubble, and a happy brown puppy between them.

I clipped this portion of the image and passed it to sora-2 as a reference to start from.
Cropped retro cartoon of a cheerful red-haired boy in yellow overalls and a green-striped shirt raising one hand in greeting against a warm textured background

Here's what sora-2 output. Notice that the blonde kid even has the graduation cap! Kind of cool, I think, and it might point to some interesting usage modes as time progresses.

Vintage-style cartoon scene of two kids and a puppy around a Morse code key on a wooden table, with an old-fashioned radio and meter, echoing the earlier math-class poster style.

The prompt for the blonde kid was:

  * Blonde kid: long ponytail tied with a big ribbon,
    blue dress and tiny academic cap, white socks and shoes.

So I guess it could have guessed the look. Or perhaps both GPT-5 image generation and sora-2 use the same engine deep down, which would also be interesting to know.

In the meantime, I'm starting to see improved results with the sora-2-pro model versus sora-2 with respect to image moderation. sora-2-pro costs three times as much (30 cents per second, as opposed to 10 cents for sora-2). I'll keep you posted.
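To put the price difference in trailer terms, here is a quick back-of-the-envelope calculation. The per-second prices are the ones quoted above; the clip lengths are a made-up example, not the actual trailer plan.

```python
from typing import Sequence

# Prices in cents per second of video, as quoted in this post.
PRICE_CENTS_PER_SECOND = {"sora-2": 10, "sora-2-pro": 30}


def clip_cost_cents(model: str, seconds: int) -> int:
    """Return the cost of one clip in whole cents."""
    return PRICE_CENTS_PER_SECOND[model] * seconds


def trailer_cost_dollars(model: str, clip_seconds: Sequence[int]) -> float:
    """Return the total cost in dollars for a list of clip lengths."""
    total_cents = sum(clip_cost_cents(model, s) for s in clip_seconds)
    return total_cents / 100


# Hypothetical trailer: four clips of 8 seconds each.
clips = [8, 8, 8, 8]
print(trailer_cost_dollars("sora-2", clips))      # 3.2
print(trailer_cost_dollars("sora-2-pro", clips))  # 9.6
```

Every clip that moderation rejects and that has to be re-rendered would add to these totals, so the real spend depends on how many attempts each scene takes.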



