Image generator tip

perchance · Perchance - Create a Random Text Generator · by ChewNDew86

I want to say a big thank you to the creator of Perchance, as this site has gotten me to actually understand coding better. Moving from SD to Flux.1 Schnell has also been a major boost. I am working on a side-project image generator that also ties in the AI chat, and I am going to make a game generator where you can play a D&D-type game with the AI chat bot, which generates images of the characters and the scene. I am still not close to getting that done.

Now, the biggest thing I have been seeing in people's posts about the t2i generator is that they do not understand Flux.1 at all. Here are some tips.

Flux does not use negative prompting. You can add No "whatever" at the bottom of the main prompt to help keep Flux from generating something you don't want in the image.
Flux does not use weights, (), or any SD-type prompting. Use the KISS method. Your prompt should be simple, e.g.: A husky dog running in a wooded forest during the middle of winter. That will generate a good image. You can then add: real photograph taken on a Kodak disposable camera, y2k aesthetic, natural lighting and light leaks through the trees. With the technical data added at the end, Flux actually responds to it better. You can mention the type of trees or more background details, just remember the KISS method.
Flux does not like a high SFG (CFG, i.e. the guidance scale). I hope the dev can adjust the main Perchance code to allow half steps, so you can do 2.5, 5.5, etc. instead of only whole numbers. Flux prefers an SFG of 1-5.
I am not sure if Perchance can actually set image steps. I have been messing with that part, and it does seem to adjust the quality of the image. Under the image settings code, add a steps = [input.steps] line and create a simple steps list. I go with 4, 8, and 12 steps. Just something to look into.
Using the old Perchance imports for art styles and such is actually not right. Those are set up for SD, not Flux. Using them without changing the prompt structure will always create bad images.
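
For anyone sitting on a pile of old SD-style prompts, here is a rough Python sketch of the first two tips: it strips weights and emphasis brackets, drops filler tags, and moves negatives to the end as plain "No ..." sentences. The function name sd_to_flux and the FILLER_TAGS list are made up for illustration, and this is just string cleanup, not anything Perchance or Flux provides.

```python
import re
from typing import Sequence

# Hypothetical list of SD-era filler tags; extend it to taste.
FILLER_TAGS = {"masterpiece", "best quality", "high quality", "bad quality"}


def sd_to_flux(prompt: str, negatives: Sequence[str] = ()) -> str:
    """Flatten SD-style weighting into plain text and append "No ..." clauses."""
    # "(wooded forest:1.2)" -> "wooded forest": drop the numeric weight.
    text = re.sub(r"\(([^():]+):\s*[\d.]+\)", r"\1", prompt)
    # "((husky dog))" or "[forest]" -> plain words: drop leftover emphasis brackets.
    text = re.sub(r"[()\[\]]", "", text)
    # Split on commas, trim whitespace, and discard filler tags and empty parts.
    parts = [p.strip() for p in text.split(",")]
    parts = [p for p in parts if p and p.lower() not in FILLER_TAGS]
    flux_prompt = ", ".join(parts)
    # Flux has no negative prompt field, so negatives go at the end as plain text.
    for item in negatives:
        flux_prompt += f". No {item}"
    return flux_prompt


old_prompt = "masterpiece, ((husky dog)) running in a (wooded forest:1.2), best quality"
print(sd_to_flux(old_prompt, ["people", "text"]))
# husky dog running in a wooded forest. No people. No text
```

The cleaned prompt still needs a human pass to read as a natural sentence; the sketch only removes the SD syntax.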

I have seen too many posts here and on Reddit where users keep complaining about the t2i generators, and I wanted to give some insights to help the community. Using Claude, Grok, or ChatGPT to help build prompts is decent, but again they tend to over-describe things. The AI chat bots you can create do a better job once you set up the persona correctly so it fully understands how Flux.1 works and you give the bot good guides and instructions to follow.

Comments (3)

ChewNDew86

You have to make sure your prompts are concise and not too wordy. Leave out generic terms like masterpiece or bad quality, and go for technical camera prompts or "photorealistic image of...". The brackets may not be the reason for the visible difference with Flux; having a good prompt structure is what works. Use Claude or ChatGPT to get better tips on how to set up your specific generator. Those have really made a difference in ensuring a more natural language flow in the prompting. I will test out the brackets too and see whether they help. But right now a guidance scale of 7 with 8 steps is working really well for me for certain photoreal art styles.
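
If you want to try the guidance and step values from this thread as dropdown options, a small Python sketch like this can print them out as indented list text to paste into a generator. The list names guidanceScale and steps and both helper functions are made up for illustration, and whether Perchance actually honors half-step guidance or a steps setting is not something this confirms.

```python
def half_steps(low: float, high: float) -> list:
    """Return values from low to high inclusive in increments of 0.5."""
    count = int(round((high - low) * 2)) + 1
    return [low + i * 0.5 for i in range(count)]


def as_list_block(name: str, values) -> str:
    """Format values as a named list: the name, then one indented item per line."""
    lines = [name] + [f"  {v:g}" for v in values]
    return "\n".join(lines)


guidance = half_steps(1, 5)  # the 1-5 range suggested for Flux in the post
steps = [4, 8, 12]           # the step counts the post experiments with

print(as_list_block("guidanceScale", guidance))
print(as_list_block("steps", steps))
```

The first list runs 1, 1.5, 2 and so on up to 5, so widening the range to cover something like 7 only means changing the arguments to half_steps.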

Here are some good photo prompts I use. I have the [input.artStyle.a] tag at the beginning of my prompt setup, which then feeds into the character build for anime/fictional characters and then into other prompts, and I have the [input.artStyle.b] tag towards the end of the prompt structure. This way Flux sees the key details of what to generate and then sees the aspects of the art style. My biggest issue is with webcam, selfie, and phone-type cameras: I cannot get them to consistently generate a realistic depiction, only illustrations or cartoonish images. I am still tweaking these as I go too.

  // Amateur
    ❗️❗️ Amateur
      a = snapshot photo of
      b = casual amateur photograph taken on a basic consumer camera, natural available light with no professional setup, honest unpolished framing, relaxed snapshot aesthetic with slight exposure imperfection.

    Cheap Snapshot
      a = early 2000s point-and-shoot digicam photo of
      b = shot on a Casio Exilim compact digicam, low-megapixel CCD sensor rendering with flat plastic-lens perspective, blown on-camera flash highlights, washed-out colors with crushed shadow detail, Y2K digicam color rendering with visible JPEG compression.

    Disposable Camera
      a = lo-fi disposable camera photo of
      b = shot on a single-use 35mm disposable camera, harsh direct built-in flash with unpredictable exposure, heavy silver halide grain, color shifts from cheap film emulsion, light leak streaks along frame edges, amateur spontaneous framing with physical film scratches.

    Fujifilm QuickSnap
      a = Fujifilm QuickSnap disposable camera photo of
      b = shot on a Fujifilm QuickSnap loaded with Fujicolor 400 film, cool cyan-shifted shadows with heavy silver halide grain, strong direct on-camera flash with deep crushed blacks and overexposed skin, 1990s amateur snapshot aesthetic.

    Kodak Gold 200
      a = Kodak Gold 200 disposable camera photo of
      b = shot on a 35mm disposable camera loaded with Kodak Gold 200 film stock, warm yellow-amber color cast with boosted reds and warm skin tones, medium silver halide grain, harsh direct on-camera flash with slightly overexposed midtones, 1990s family photo warmth.

    Pentax K1000
      a = amateur 35mm SLR film photograph of
      b = shot on a Pentax K1000 fully manual SLR with a 50mm f/2 lens, honest ungraded natural color reproduction with deep shadow rendering and visible silver grain, tactile film texture with accidental dust specks and a mild light leak from aging door seals, sincere student photography rawness.

    Polaroid
      a = instant Polaroid photograph of
      b = shot on a Polaroid 600 instant camera, square format with heavy white border, slow chemical development producing warm faded whites and milky midtones, slightly underexposed and soft rendering with dreamy uneven color development and barrel distortion from the plastic lens.

    Slumber Party Snap
      a = 2000s snapshot photo of
      b = shot on a cheap point-and-shoot at an indoor party, heavy lo-fi grain with light-leak streaks along frame edges, soft-focus nostalgia with underexposed sensor noise, muted color speckling and crushed shadow noise, candid and spontaneous framing.

  // Cinematic

    ❗️❗️ Cinematic
      a = UHD cinematic film still of
      b = professional color graded image with warm skin tone highlights and cool cyan shadow tones, complementary split-toned color grading, high-production commercial aesthetic.

    Blackmagic
      a = indie film still shot on Blackmagic Pocket Cinema 6K of
      b = shot on Blackmagic Pocket Cinema 6K with Blackmagic RAW color science, Super 35 sensor rendering with rich organic skin tones and 13-stop dynamic range, expressive moody shadow detail with shallow cinematic depth of field through an EF cinema lens.

    Cinema
      a = high-budget Hollywood cinematic film still of
      b = shot on anamorphic cinema glass in CinemaScope aspect ratio, deep professional color grade with selective shallow focus and controlled bokeh, rich textured shadow detail with subtle film grain, dramatic motivated practical lighting and epic production value.

    Cinema Blue Ultra
      a = experimental 35mm film photograph of
      b = shot on Film Photography Project Blue Ultra ISO 3 emulsion, dominant blue-violet tint throughout with popping saturated reds against muted neutrals, ultra-fine grain structure with no anti-halation layer, otherworldly alien tonal palette.

    CineStill Commercial
      a = television commercial 35mm film still of
      b = shot on CineStill 50D emulsion, ISO 50 daylight-balanced cinema stock built on Kodak Vision3 film base, vivid accurate color rendition with ultra-fine grain and broad highlight latitude, characteristic red halation glow around practical light sources.

    Fujifilm GFX 100 II
      a = medium-format cinematic portrait shot on Fujifilm GFX 100 II of
      b = shot on a Fujifilm GFX 100 II with 102MP large-format sensor, extreme subject-to-background compression with razor micro-detail at the focal plane, feathered ultra-shallow depth of field falloff with rich Fujifilm color science and studio-grade tonal gradation.

    IMAX Large Format
      a = IMAX 15-perf 70mm film frame of
      b = captured on IMAX 15-perf 70mm horizontal gate with massive frame area ten times standard 35mm, extraordinary spatial depth and immersive scale with near-grain-free image clarity, precise edge-to-edge sharpness in full 1.43:1 aspect ratio.

    Kodak Vision3 500T
      a = motion picture film still of
      b = shot on Kodak Vision3 500T tungsten-balanced emulsion, 3200K color science with cool blue-shifted daylight rendering, fine grain structure with clean shadow detail and broad highlight latitude, rich tonal gradation with a cinematic motion picture film look.

    Macro Beauty
      a = extreme close-up macro beauty photograph of
      b = shot on a Canon 100mm macro lens at f/2.8, ultra-shallow depth of field with clinical razor sharpness at the focal point, visible skin pores and fine surface texture with smooth creamy background separation, professional high-end beauty photography lighting.

    National Geographic
      a = award-winning National Geographic documentary photograph of
      b = shot on a telephoto prime lens with professional photojournalism composition, natural ambient color accuracy with subtle film grain, authentic on-location documentary photography gravity and moment.

    Nikon F
      a = 1960s photojournalism still shot on Nikon F of
      b = shot on a Nikon F SLR body with a fast Nikkor prime lens, sharp photojournalistic focus with high-contrast rendering, deep shadow density with tactile silver halide grain, raw honest press-photography aesthetic.

    Olympus Street
      a = 1960s half-frame street photograph shot on Olympus Pen F of
      b = shot on an Olympus Pen F half-frame 35mm SLR with a Zuiko 38mm f/1.8 lens, 18x24mm vertical frame with elevated grain from the small negative area, compact intimate street photography feel with 1960s Japanese documentary aesthetic.

    Panasonic Lumix S1H
      a = Panasonic Lumix S1H cinematic still of
      b = shot on a Panasonic Lumix S1H with V-Log L flat color profile graded for cinema, anamorphic lens desqueezed framing with soft digital highlight roll-off, warm controlled cinematic color grade through cinema-grade full-frame sensor rendering.
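
To show how the a and b parts wrap around the subject, here is a rough Python sketch that reads a style block in the same layout as the list above and builds the final prompt as a + subject + b. The names parse_styles and build_prompt are made up for illustration, the b text in the sample is cut short to save space, and this only mimics the prompt order outside of Perchance rather than showing how Perchance itself evaluates the list.

```python
# One style in the same layout as the list above (b is shortened here).
SAMPLE = """
  // Amateur
    Polaroid
      a = instant Polaroid photograph of
      b = shot on a Polaroid 600 instant camera, square format with heavy white border.
"""


def parse_styles(text: str) -> dict:
    """Parse 'Name / a = ... / b = ...' blocks into {name: {"a": ..., "b": ...}}."""
    styles = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and category comments
        key, sep, value = line.partition(" = ")
        if sep and key in ("a", "b") and current is not None:
            styles[current][key] = value.strip()
        else:
            current = line  # any other line starts a new style entry
            styles[current] = {}
    return styles


def build_prompt(style: dict, subject: str) -> str:
    """Put the 'a' opener first, then the subject, then the 'b' technical details."""
    return f"{style['a']} {subject}, {style['b']}"


styles = parse_styles(SAMPLE)
print(build_prompt(styles["Polaroid"], "a husky dog running in a wooded forest"))
```

The point of the order is the same as in the comment: the subject sits near the front and the camera details trail at the end.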

Hunknown

Thank you for your insightful tips. I'm definitely a common user of Perchance t2i and have no real knowledge about Flux or how I should write prompts to better instruct Flux to create what I want. That said, my personal 'trial and error' experience is different from what you just stated. For example, I still notice visible differences today when enclosing something in brackets; I'm not sure whether Perchance treats () and ((())) differently, but there is still a difference, especially if the bracketed element is in the last part of the prompt. I'm not sure what the SFG is 😅. If it's what the default Perchance t2i interface calls 'Guidance Scale', then a value of 1-5 gives horrible results with realistic images; it's not so bad when it comes to rendering a painting or drawing. My prompts are quite complex, and I often use a Guidance Scale ranging from 10 to 20 to balance the adherence of the image to the prompt against the realism of the output. Lastly, I've modified some of the standard Perchance styles to suit my needs, but the evergreen "Cinematic" style (and a few others) is just perfect as it is and still gives outstanding results.

Garth01

Bro is justpassing 2.0