V2.0: Introducing STG (Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling).

The feature enhances video quality, as described in more detail here:

https://github.com/logtd/ComfyUI-LTXTricks?tab=readme-ov-file

The GUI includes two new nodes, shown in blue:

STG settings, showing CFG, Scale and Rescale, plus a switch that selects which of two model layers is skipped: 14 (default) or 8. Choose "true" for layer 14 or "false" for layer 8.

I don't yet fully understand what the parameters do, so this version is a bit experimental. I added a note to the workflow with further info and usable values/limits. Feel free to experiment. In my testing, I kept the STG settings at their defaults and only used the switch. Comparing generations with and without STG, I find the results with STG more pleasing.
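For orientation, here is a rough Python sketch of what the switch and the two main STG values might be doing. It is not taken from this workflow's nodes: the function names are made up for illustration, and the combination formula is the one commonly given for skip guidance (classifier-free guidance plus an extra term that pushes away from the prediction made with one layer skipped). It assumes "Scale" is the weight of that extra term, and it leaves Rescale out.

```python
from typing import List


def stg_skip_layer(switch: bool) -> int:
    """Map the workflow's switch to a layer index: True -> 14 (default), False -> 8."""
    return 14 if switch else 8


def combine_guidance(
    cond: List[float],
    uncond: List[float],
    skipped: List[float],
    cfg_scale: float,
    stg_scale: float,
) -> List[float]:
    """Combine three predictions element by element (assumed formula).

    cond    : prediction with the prompt
    uncond  : prediction without the prompt (or with the negative prompt)
    skipped : prediction with the prompt, but with one model layer skipped
    """
    return [
        u + cfg_scale * (c - u) + stg_scale * (c - s)
        for c, u, s in zip(cond, uncond, skipped)
    ]


if __name__ == "__main__":
    print(stg_skip_layer(True))   # layer chosen when the switch is "true"
    print(combine_guidance([1.0], [0.0], [0.5], cfg_scale=3.0, stg_scale=1.0))
```

With the STG weight at 0 the extra term disappears and only plain CFG remains, which is one way to compare generations with and without STG.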

ComfyUI Workflow: LTX IMAGE-to-TEXT-to-VIDEO Using Florence2 Caption

This workflow transforms static images into videos by leveraging Florence2 for captioning and the LTX Text to Video model for dynamic generation.

Workflow Steps

Image Input
- Drag and drop or load a picture into ComfyUI.
- The workflow automatically captions the image using Florence2, preparing it for LTX text to video generation.
Captioning Options
- Use the Florence2 node in the GUI to select one of three captioning detail levels:
  - "Caption": A brief description.
  - "Detailed Caption": More descriptive and nuanced.
  - "More Detailed Caption": In-depth details for richer prompts.
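As a small illustration, the three detail levels correspond to Florence-2's captioning task prompts. The sketch below maps the labels above to those prompt tokens. The dictionary and helper name are hypothetical, and the exact option strings used by the Florence2 node in the GUI may be spelled differently.

```python
# Labels as listed above -> Florence-2 task prompt tokens.
FLORENCE2_TASK_TOKENS = {
    "Caption": "<CAPTION>",
    "Detailed Caption": "<DETAILED_CAPTION>",
    "More Detailed Caption": "<MORE_DETAILED_CAPTION>",
}


def task_token(level: str) -> str:
    """Return the task prompt for a caption detail level, or raise ValueError."""
    try:
        return FLORENCE2_TASK_TOKENS[level]
    except KeyError:
        options = ", ".join(FLORENCE2_TASK_TOKENS)
        raise ValueError(f"Unknown level {level!r}; expected one of: {options}") from None


if __name__ == "__main__":
    print(task_token("More Detailed Caption"))
```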
Prompt Refinement
- The "Replace 'Photo/Image/Picture' with" node allows replacing generic terms like "photo," "image," or "picture" in captions with a more video-centric term:
  - Examples include "video," "animation," or "clip."
- This pushes the output prompt towards dynamic video generation rather than static imagery.
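The replacement step can be pictured as a simple whole-word substitution. The following is a minimal sketch of the idea, not the node's actual code. The function name and the handling of plurals and capitalization are assumptions.

```python
import re

# Whole words only, so "photograph" or "imagery" are left untouched.
_GENERIC_TERMS = re.compile(r"\b(photo|image|picture)(s?)\b", re.IGNORECASE)


def make_video_centric(caption: str, replacement: str = "video") -> str:
    """Replace 'photo', 'image' and 'picture' (and their plurals) in a caption."""

    def _swap(match: re.Match) -> str:
        word = replacement + match.group(2)  # keep a trailing plural "s"
        if match.group(1)[0].isupper():      # keep sentence-initial capitals
            word = word[0].upper() + word[1:]
        return word

    return _GENERIC_TERMS.sub(_swap, caption)


if __name__ == "__main__":
    print(make_video_centric("The image shows a dog running on a beach."))
    print(make_video_centric("A picture of a lake at sunset.", "clip"))
```

Articles are not adjusted, so "an image" combined with the replacement "video" would become "an video". Captions that start with "The image shows ..." are unaffected by this.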
Parameter Control
- The GUI provides essential controls such as:
  - Video Length
  - CFG Scale
  - Seed
  - Width and Height
  - Steps
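When changing these values, it can help to keep them in ranges the model accepts. The sketch below assumes the constraints given in the LTX-Video usage notes (width and height divisible by 32, frame count of the form 8n + 1). These constraints are not stated in this workflow, and the class, field names and defaults are purely illustrative.

```python
from dataclasses import dataclass


def snap_dimension(value: int, multiple: int = 32) -> int:
    """Round to the nearest multiple (assumed: LTX wants sizes divisible by 32)."""
    return max(multiple, (value + multiple // 2) // multiple * multiple)


def snap_frame_count(frames: int) -> int:
    """Round to the nearest count of the form 8n + 1 (assumed LTX constraint)."""
    return max(1, (frames - 1 + 4) // 8 * 8 + 1)


@dataclass
class VideoSettings:
    # Illustrative defaults, not the workflow's actual values.
    length: int = 97      # video length in frames
    cfg: float = 3.0
    seed: int = 0
    width: int = 768
    height: int = 512
    steps: int = 30

    def normalized(self) -> "VideoSettings":
        """Return a copy with width, height and length snapped to valid values."""
        return VideoSettings(
            length=snap_frame_count(self.length),
            cfg=self.cfg,
            seed=self.seed,
            width=snap_dimension(self.width),
            height=snap_dimension(self.height),
            steps=self.steps,
        )


if __name__ == "__main__":
    print(VideoSettings(length=100, width=770, height=500).normalized())
```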
Model Initialization
- On the first run, the workflow automatically downloads any missing models, including LTX and Florence2.

This is more of a fun workflow, giving interesting results.

Check my other workflow for pure LTX Image to Video with autocaption and enhanced movement:

https://civitai.com/models/995093?modelVersionId=1118629
