Spyke

perchance·Perchance - Create a Random Text GeneratorbyRandomPerchanceUser

Train your own custom image captioning model for Chroma (Perchance T2i model) on Google Colab T4 in 2 hours

Link to image-to-prompt: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/gemma_image_captioner.ipynb

Writing prompts for Chroma is hard and Joycaptions is inaccurate so I assembled the training data I could find for the model , picked 400 image text pairs at random and trained Google Gemma 3 LoRA model as in image to prompt tool that can run on Google Colab.

Its a proof-of-concept. Feel free to train your own LoRa captioning models for use on perchance. The workflow of converting JSON and .parquets into a dataset can be found in this notebook in the repo: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/train_on_parquet.ipynb

For the original unsloth notebook visit: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B)-Vision.ipynb

Other unsloth models: https://docs.unsloth.ai/get-started/unsloth-notebooks

/Cheers!

Train your own custom image captioning model for Chroma (Perchance T2i model) on Google Colab T4 in 2 hours

https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/gemma_image_captioner.ipynbOpen link View original on lemmy.world

0

Comments

perchance·Perchance - Create a Random Text GeneratorbyRandomPerchanceUser

Train your own custom image captioning model for Chroma (Perchance T2i model) on Google Colab T4 in 2 hours

Link to image-to-prompt: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/gemma_image_captioner.ipynb

Writing prompts for Chroma is hard and Joycaptions is inaccurate so I assembled the training data I could find for the model , picked 400 image text pairs at random and trained Google Gemma 3 LoRA model as in image to prompt tool that can run on Google Colab.

Its a proof-of-concept. Feel free to train your own LoRa captioning models for use on perchance. The workflow of converting JSON and .parquets into a dataset can be found in this notebook in the repo: https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/train_on_parquet.ipynb

For the original unsloth notebook visit: https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3_(4B)-Vision.ipynb

Other unsloth models: https://docs.unsloth.ai/get-started/unsloth-notebooks

Also Tensor Art holds a contenst for running the new Qwen model so you might wanna check that out: https://mee6.xyz/i/vSpIL2tvi0

/Cheers!

https://huggingface.co/codeShare/flux_chroma_image_captioner/blob/main/gemma_image_captioner.ipynbOpen link View original on lemmy.world

1

Comments

perchance·Perchance - Create a Random Text GeneratorbyRandomPerchanceUser

All images used to train Chroma (Perchance T2i model) are tagged with the 'aesthetic' tag

Source: https://huggingface.co/lodestones/Chroma/discussions/72

Chroma (Perchance Text to image model) is trained on 5 million images

The tagging system for all these images include the word 'aesthetic' in the training prompt , used in this manner :

'aesthetic 0' , 'aesthetic 1' , .... 'aesthetic 10' , 'aesthetic 11' are labels used to denote the visual style in the training data

Where 'aesthetic 11' denotes (good) AI-images used for training data

Thats all we know.

//----//

This system isn't 100% accurate , but it is highly recommended you use the term 'aesthetic' at least once (preferably often) to mimic the training prompt(s) in the Chroma model.

Check the HF page for Chroma in the future on further info regarding training data / prompts you can use while generating images on the perchance website.

TLDR; use the word 'aesthetic' in your prompt to improve them for perchance text-to-image generation

Cheers!

View original on lemmy.world

6

Comments10

perchance·Perchance - Create a Random Text GeneratorbyRandomPerchanceUser

FLUX Chroma (The Perchance T2i model)

FLUX Chroma: https://huggingface.co/lodestones/Chroma

FLUX Chroma (Tensor Art) : https://tensor.art/models/886764918794154122

Unlike base FLUX Schnell , FLUX Chroma uses NAG (Normalized Attentive Guidence) : https://huggingface.co/spaces/ChenDY/NAG_FLUX.1-dev

TLDR; NAG are negatives added to FLUX model

Paper: https://arxiv.org/abs/2505.21179

See FLUX Chroma Huggingface repo for additional changes from base FLUX Schnell model

To help with creating prompts for FLUX , use Joycaptions:https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one

And Danbooru tags: https://donmai.moe/wiki_pages/help:home

The prompt can be up to 512 tokens long , which can be checked at https://sd-tokenizer.rocker.boo/

https://huggingface.co/lodestones/ChromaOpen link View original on lemmy.world

7

Comments22

Posts

Train your own custom image captioning model for Chroma (Perchance T2i model) on Google Colab T4 in 2 hours

Train your own custom image captioning model for Chroma (Perchance T2i model) on Google Colab T4 in 2 hours

All images used to train Chroma (Perchance T2i model) are tagged with the 'aesthetic' tag

FLUX Chroma (The Perchance T2i model)