NatViS (Natural Vision) is a photorealistic full-parameter fine-tune of SDXL that uses natural-language prompting to generate high-quality SFW/NSFW images. It was trained on 1M+ image-caption pairs from a dataset that has been expanded and refined for over a year.
Note: NatViS is still being trained. V1 (epoch 68) wrapped up training on July 19th, 2024.
https://ko-fi.com/ndimensional
I’ve never been a fan of e-begging; however, SDXL fine-tunes at this scale are becoming expensive to train. So I will begrudgingly ask: if you like what I do and would like to support my models, consider donating on Ko-Fi 💗
I will begin posting updates, answering questions, taking feedback, and releasing early-access (NOT EXCLUSIVE) models to supporters.
All donations will be used to fund the creation of new Stable Diffusion fine-tunes and open-source AI tools.
10-2-24 NatViS v2.0 Lightning 4step
Uploaded 4step lightning model for v2.0
============
10-1-24 NatViS v2.0 Lightning 8step
Uploaded 8step lightning models for v2.0
============
9-25-24 NatViS v2.0
What's New?
Prompting: This update focuses primarily on the text encoders. Natural-language prompting has been improved to follow less strict formats and rely less on specific tokens.
Ethnicity and Demonym: Increased accuracy of phenotypes for various ethnicities and demonyms. This is not limited to body structure; it also includes clothing, hair, landscapes, etc. See here for small examples.
Camera EXIF: Inclusion of promptable Camera EXIF data for popular modern and analog cameras, including camera name, focal length, f-stop, ISO, shutter speed, and lens type. Also includes attachments such as ND filters and polarizers.
Analog: Improvements to analog and vintage photograph generations.
Lighting and shadow: Prompt how light (or the lack thereof) interacts with objects/subjects in the scene, among other general lighting-related modifiers. More info soon.
Skin Textures: Small improvements to the detail of skin textures, requiring fewer (or no) explicit skin-detail tokens.
Implementation of Pseudo Instruction: This will require a more lengthy write-up.
Better male anatomy.
Lesbians.
What's Next?
Lightning models will be released within the coming days.
Full PDF guide and documentation within the next week.
Info on v3.0 within the next month.
8/4/24 NatViS v1.0 Lightning 4step
Uploaded 4step lightning version of v1.0 (See About this version for more info).
============
8/3/24 NatViS v1.0 Lightning 8step
Uploaded 8step lightning version of v1.0 (See About this version for more info)
============
8/2/24 NatViS v1.0
Initial Release
Note: These are simply recommendations, feel free to experiment.
NatViS leverages SDXL’s bigG text-encoder to allow for Natural Language prompting.
What is Natural Language Prompting?
Since the release of Stable Diffusion v1.4, people have become accustomed to comma-delimited lists of visually descriptive tags/phrases. This was a necessity for early Stable Diffusion models due to their architecture and choice of text encoder. With SDXL’s dual text-encoder/tokenizer architecture, we are able to write more naturally descriptive prompts.
Simply describe the image you want to generate, just as you would describe the image to a person.
For example;
Comma delimited list: a woman, standing, outdoors, sun beams, dappled light, apple tree, wearing denim jeans, flannel shirt, brown hair, long hair, looking at viewer, highest quality, atmospheric, 35mm, masterpiece
Natural Language: A masterpiece, 35mm-style photo of a woman with long brown hair, standing outdoors in dappled sunlight beneath an apple tree. She wears denim jeans and a flannel shirt, gazing directly at the viewer with an atmospheric quality.
Note: This is just an example to highlight how to write a natural language prompt. For better examples, see the sample images.
Will NatViS Understand Everything I tell it?
Absolutely not.
Due to various limitations in both the architecture and the size of the dataset I’m able to fine-tune on as one person, there will be instances where the model simply will not generate what you want. Often you can experiment with different wording, move tokens around (i.e., shift a sentence or individual token closer to the start or end of the prompt), remove potentially conflicting tokens, etc. There really is no definitive solution I can give, as it varies from prompt to prompt. Unfortunately, there will be times when no solution/workaround is successful.
Can I still use Tags?
Short answer: Yes
SDXL’s dual text-encoder/tokenizer architecture processes token sequences with both encoders in parallel, meaning you don’t have to use natural-language prompting.
Note: Since the training data was captioned purely with natural-language descriptions, not all the common descriptive tags people are familiar with will be understood by the model, especially Booru and Booru-style tags.
I found a hybrid system works well, as seen in many of the sample images.
For example;
Say you tried your natural-language prompt but want the results to be a bit more cinematic. Instead of modifying the entire prompt, you can simply append cinematic lighting, harmonious, film still, etc. to the end of your prompt.
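The hybrid approach can be sketched as a tiny helper. This is purely illustrative; the function name and tag list are my own, not part of any NatViS tooling:

```python
# Hybrid prompting sketch: keep the natural-language description,
# then append comma-separated style tags to the end of the prompt.
def hybrid_prompt(description, style_tags):
    """Join a natural-language prompt with trailing comma-separated tags."""
    return description.rstrip(" .") + ". " + ", ".join(style_tags)

prompt = hybrid_prompt(
    "A 35mm-style photo of a woman standing beneath an apple tree",
    ["cinematic lighting", "harmonious", "film still"],
)
# prompt: "A 35mm-style photo of a woman standing beneath an
# apple tree. cinematic lighting, harmonious, film still"
```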
Quality Tags/Classifiers? (score_up_x)
Blasphemy.
You can use quality ranks/classifiers if you want, but they were not part of the training data.
Negative Prompt
Similar to other SDXL models: use tags separated by commas and keep it short. Add/remove tokens from the negative prompt as needed.
CFG:
Recommended: 5-7
7+ to enforce a specific style/medium
Sampler/Sampling Steps:
This can be quite subjective, so I will just share what I typically use instead of giving direct recommendations.
Sampler - DPM++ 2M SDE
Scheduler - Karras
Steps - 55
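For the curious, the "Karras" scheduler option refers to the sigma spacing from Karras et al. (2022). A minimal sketch of that schedule, with illustrative sigma_min/sigma_max values rather than anything NatViS-specific:

```python
# Sketch of the Karras noise schedule (Karras et al., 2022), which is
# what WebUI's "Karras" scheduler option implements: noise levels are
# spaced evenly in sigma^(1/rho) rather than linearly in sigma.
# sigma_min/sigma_max here are illustrative placeholder values.
def karras_sigmas(n_steps, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Return n_steps noise levels, highest to lowest."""
    max_inv = sigma_max ** (1 / rho)
    min_inv = sigma_min ** (1 / rho)
    ramp = [i / (n_steps - 1) for i in range(n_steps)]
    return [(max_inv + t * (min_inv - max_inv)) ** rho for t in ramp]

sigmas = karras_sigmas(55)  # 55 steps, matching the setting above
```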
ADetailer: (Extension)
Link
Again, subjective so I’ll just share my settings.
Model - mediapipe_face_full (use mediapipe for photorealism)
Confidence - 0.45
Everything else is default.
CFG Rescale: (Extension)
Link
I forgot that I had this installed. I’m not quite sure whether it was enforcing zero terminal SNR on the noise schedule or not; since the parameter was null, it shouldn’t have.
Phi - 0
If you struggle to replicate the sample images, even with the exact seed and parameters, it’s likely because of the noise scheduler. I enabled the fix for this in WebUI, but have since reinstalled WebUI and forgot to re-enable it. This only applies to v1 of NatViS.
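For context, here is a rough sketch of the math the CFG Rescale extension applies (per Lin et al., "Common Diffusion Noise Schedules and Sample Steps Are Flawed"), using toy lists in place of real latent tensors. It shows why Phi = 0 leaves plain CFG untouched:

```python
import statistics

# Classifier-free guidance: push the conditional prediction away
# from the unconditional one by the guidance scale.
def cfg(cond, uncond, scale):
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# CFG rescale: shrink the guided output back toward the conditional
# prediction's standard deviation, then blend by phi. phi = 0 returns
# the plain CFG result unchanged.
def cfg_rescale(cond, uncond, scale, phi):
    guided = cfg(cond, uncond, scale)
    factor = statistics.pstdev(cond) / statistics.pstdev(guided)
    rescaled = [g * factor for g in guided]
    return [phi * r + (1 - phi) * g for r, g in zip(rescaled, guided)]

cond = [0.2, -0.5, 0.9, 0.1]    # toy stand-ins for latent values
uncond = [0.1, -0.3, 0.4, 0.0]
plain = cfg(cond, uncond, 6.0)
same = cfg_rescale(cond, uncond, 6.0, phi=0.0)  # identical to plain
```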
TO-DO
This will take a while to write up. So in the meantime:
TL;DR: 1M+ images, processed/cleaned via a personal Dataset Toolkit I’m developing, captioned via a Multimodal Large Language Model (MLLM) with a unified feature space (part of the Dataset Toolkit, not GPT). Training data, configs, and custom scripts will be made available and open-sourced when the final version is released. The Dataset Toolkit has no announced release date.
SDXL Checkpoints: https://civitai.com/collections/966964
SDXL LoRAs: https://civitai.com/collections/966969
40K Series: https://civitai.com/collections/956187
SD1.5 Checkpoints: https://civitai.com/collections/966974
SD1.5 LoRAs: https://civitai.com/collections/966972
Epoch 68
Final training Date - July 19, 2024