LTX IMAGE to TEXT to VIDEO with STG workflow

v3.1 (model 0.9.1)
tremolo28
10 months ago

V2.0: Introducing STG (Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling).

The feature enhances video quality, as described in more detail here:

https://github.com/logtd/ComfyUI-LTXTricks?tab=readme-ov-file

The GUI includes two new nodes in blue:

STG settings, showing CFG, Scale and Rescale, plus a switch that selects which layer of the model is skipped: 8 or 14 (default). Choose "true" for layer 14 or "false" for layer 8.

I don't fully understand yet what the parameters do, so this version is a bit experimental. I copied a note into the workflow with further info and usable values/limits. Feel free to experiment. In my testing, I kept the values in the STG settings at their defaults and just used the switch. I compared generations with and without STG; the results with STG are more pleasant to me.
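
For readers who find the true/false switch easy to mix up, here is a minimal sketch of the mapping described above. The function name is hypothetical and is not part of the workflow or of ComfyUI-LTXTricks; only the layer indices 8 and 14 and the default come from the description.

```python
# Hypothetical helper illustrating the switch; not an actual ComfyUI node.
def stg_skip_layer(switch: bool = True) -> int:
    """Return the model layer that STG skips for a given switch position."""
    # "true" selects layer 14 (the default), "false" selects layer 8.
    return 14 if switch else 8
```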

--

ComfyUI Workflow: LTX IMAGE-to-TEXT-to-VIDEO Using Florence2 Caption

This workflow transforms static images into videos by leveraging Florence2 for captioning and the LTX Text to Video model for dynamic generation.

--

Workflow Steps

  1. Image Input

    • Drag and drop or load a picture into ComfyUI.

    • The workflow automatically captions the image using Florence2, preparing it for LTX text to video generation.

  2. Captioning Options

    • Use the Florence2 node in the GUI to select one of three captioning detail levels:

      • "Caption": A brief description.

      • "Detailed Caption": More descriptive and nuanced.

      • "More Detailed Caption": In-depth details for richer prompts.

  3. Prompt Refinement

    • The "Replace 'Photo/Image/Picture' with" node allows replacing generic terms like "photo," "image," or "picture" in captions with a more video-centric term:

      • Examples include "video," "animation," or "clip."

    • This pushes the output prompt towards dynamic video generation rather than static imagery.

  4. Parameter Control

    • The GUI provides essential controls such as:

      • Video Length

      • CFG Scale

      • Seed

      • Width and Height

      • Steps

  5. Model Initialization

    • On the first run, the workflow will automatically download any missing models, including LTX and Florence2, ensuring smooth operation.
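
The prompt refinement in step 3 can be sketched in plain Python. This is only an illustration of the idea, assuming a simple whole-word, case-insensitive replacement; the function name `refine_caption` and the handling of plurals and capital letters are my assumptions, and the actual node in the workflow may behave differently.

```python
import re

# Whole-word match for the generic still-image terms, with an optional plural "s".
GENERIC_TERMS = re.compile(r"\b(photo|image|picture)(s?)\b", re.IGNORECASE)


def refine_caption(caption: str, replacement: str = "video") -> str:
    """Swap generic still-image words in a caption for a video-centric term."""

    def swap(match: re.Match) -> str:
        word = replacement + match.group(2)  # keep the plural "s" if present
        # Preserve a leading capital, e.g. "Picture" becomes "Video".
        if word and match.group(1)[0].isupper():
            word = word[0].upper() + word[1:]
        return word

    return GENERIC_TERMS.sub(swap, caption)
```

For example, `refine_caption("The image shows a photo of a cat.")` gives "The video shows a video of a cat.", and passing `replacement="clip"` or `replacement="animation"` pushes the prompt in the same direction with a different term.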

--

This is more of a fun workflow, giving interesting results.

Check my other workflow for pure LTX Image to Video with autocaption and enhanced movement:

https://civitai.com/models/995093?modelVersionId=1118629

Info

Base model: LTXV

About this version: v3.1 (model 0.9.1)

Support for LTX Model 0.9.1
