Spyke

Posts

fediadminbr · Admins do Fediverso · by Ademir

I created some WAF Cloudflare rules and now my server is smooth and FAST

cross-posted from: https://lemmy.eco.br/post/18381325

Hi fellow admins,

Just wanted to share a quick tip about reducing server load.

Before implementing these Cloudflare rules, my server load (on a 4-core Ubuntu box) was consistently between 3.5 and 4.0. Now it runs much smoother, with a load of around 0.3.

The best part? It only took 3 rules, and they all work with the Cloudflare free plan.

The order of the rules is important, so please pay attention.
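
To make the ordering concrete, here is a minimal Python sketch of how the three rules interact. It is only a model, not Cloudflare's implementation: it assumes the Skip action is configured to skip all remaining custom rules, and the `Rule` class, the `evaluate` function, the toy predicates, and the sample User Agents are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Request = dict  # hypothetical request shape: {"user_agent": str, "host": str}


@dataclass
class Rule:
    name: str
    action: str  # "skip", "block" or "managed_challenge"
    matches: Callable[[Request], bool]


def evaluate(rules: list, request: Request) -> Optional[str]:
    """Return the action of the first matching rule, or None if none match.

    Assumption: Skip skips all remaining custom rules, so a match on the
    allowlist ends evaluation before Block or Challenge are considered.
    """
    for rule in rules:
        if rule.matches(request):
            return rule.action
    return None


# Toy predicates standing in for the much longer expressions in this post.
RULES = [
    Rule("Allowlist", "skip", lambda r: "FediFetcher" in r["user_agent"]),
    Rule("Blocklist", "block", lambda r: "GPTBot" in r["user_agent"]),
    Rule("Challenge", "managed_challenge", lambda r: "bot" in r["user_agent"]),
]

# Made-up User Agents, for illustration only.
friendly = {"user_agent": "FediFetcher-bot/1.0 (hypothetical)", "host": "lemmy.eco.br"}
ai_crawler = {"user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0)", "host": "lemmy.eco.br"}
unknown_bot = {"user_agent": "somebot/2.0", "host": "lemmy.eco.br"}

print(evaluate(RULES, friendly))     # skip
print(evaluate(RULES, ai_crawler))   # block
print(evaluate(RULES, unknown_bot))  # managed_challenge

# With the order reversed, the friendly crawler hits the Challenge rule first.
print(evaluate(list(reversed(RULES)), friendly))  # managed_challenge
```

The last call shows why the allowlist has to come first: the same request that is skipped in the intended order gets challenged when the Challenge rule is evaluated before it.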

Allowlist

This rule is set first in order to avoid friction with other fediverse servers and good crawlers. Use the action Skip.


(http.user_agent contains "Observatory") or

(http.user_agent contains "FediFetcher") or
(http.user_agent contains "FediDB/") or
(http.user_agent contains "+fediverse.observer") or
(http.user_agent contains "FediList Agent/") or

(starts_with(http.user_agent, "Blackbox Exporter/")) or
(http.user_agent contains "Lestat") or
(http.user_agent contains "Lemmy-Federation-Exporter") or
(http.user_agent contains "lemmy-stats-crawler") or
(http.user_agent contains "lemmy-explorer-crawler/") or

(starts_with(http.user_agent, "Lemmy/")) or
(starts_with(http.user_agent, "PieFed/")) or
(http.user_agent contains "Mlmym") or
(http.user_agent contains "Photon") or
(http.user_agent contains "Boost") or
(starts_with(http.user_agent, "Jerboa")) or
(http.user_agent contains "Thunder") or
(http.user_agent contains "VoyagerApp/") or

(cf.verified_bot_category in {
    "Search Engine Crawler"
    "Search Engine Optimization" 
    "Monitoring & Analytics"
    "Feed Fetcher"
    "Archiver"
    "Page Preview"
    "Academic Research"
    "Security"
    "Accessibility"
    "Webhooks"
  }
  and http.host ne "old.lemmy.eco.br"
  and http.host ne "photon.lemmy.eco.br"
) or

(http.user_agent contains "letsencrypt"
  and http.request.uri.path contains "/.well-known/acme-challenge/"
) or

(starts_with(http.request.full_uri, "https://lemmy.eco.br/pictrs/") and 
  http.request.method eq "GET" and not 
  starts_with(http.user_agent, "Mozilla") and not 
  ip.src.asnum in {
    200373 198571 26496 31815 18450 398101 50673 7393 14061
    205544 199610 21501 16125 51540 264649 39020 30083 35540
    55293 36943 32244 6724 63949 7203 201924 30633 208046 36352
    25264 32475 23033 31898 210920 211252 16276 23470 136907
    12876 210558 132203 61317 212238 37963 13238 2639 20473
    63018 395954 19437 207990 27411 53667 27176 396507 206575
    20454 51167 60781 62240 398493 206092 63023 213230 26347
    20738 45102 24940 57523 8100 8560 6939 14178 46606 197540
    397630 9009 11878 49453 29802
})
  1. The User Agent contains the name of a known Fediverse crawler or monitoring tool (e.g., "Observatory", "FediFetcher", "lemmy-stats-crawler").
  2. The User Agent contains the name of a known Lemmy mobile app or frontend (e.g., "Jerboa", "Boost", "VoyagerApp").
  3. The request comes from a Cloudflare-verified bot in one of the listed categories (like "Search Engine Crawler" or "Monitoring & Analytics") and is not targeting the hosts "old.lemmy.eco.br" or "photon.lemmy.eco.br", where I host alternative frontends.
  4. The request is a Let's Encrypt challenge for the domain (used for SSL certificate renewal).
  5. The request is a GET to the "pictrs" image server that does not come from a standard web browser (a User Agent starting with "Mozilla") and does not originate from the listed Autonomous System Numbers (ASNs). These ASNs all belong to VPS providers, so there is no excuse for a browser User Agent coming from them.
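
The fifth condition combines four tests, so it may help to see it as a plain predicate. This is a rough Python sketch of that one clause only; the function name `skips_for_pictrs` is made up, `VPS_ASNS` is a shortened stand-in for the full ASN list, and ASN 64500 is a documentation-reserved number used here as a placeholder for "some other network".

```python
# Shortened stand-in for the VPS-provider ASN list in the rule.
VPS_ASNS = {14061, 16276, 24940, 63949}


def skips_for_pictrs(full_uri: str, method: str, user_agent: str, asn: int) -> bool:
    """Model of the pictrs clause: True means the Skip action would apply."""
    return (
        full_uri.startswith("https://lemmy.eco.br/pictrs/")
        and method == "GET"
        and not user_agent.startswith("Mozilla")  # not a browser-style UA
        and asn not in VPS_ASNS                   # not from a listed VPS network
    )


image = "https://lemmy.eco.br/pictrs/image/example.webp"  # hypothetical path

# A non-browser client on an unlisted network: skipped (allowed through).
print(skips_for_pictrs(image, "GET", "Lemmy/0.19.10", 64500))            # True
# A browser-style User Agent: not skipped, so later rules still apply.
print(skips_for_pictrs(image, "GET", "Mozilla/5.0 (X11; Linux)", 64500)) # False
# A non-browser client on a listed VPS network: not skipped by this clause.
print(skips_for_pictrs(image, "GET", "curl/8.0", 14061))                 # False
```

A result of False does not mean the request is blocked. It only means this clause does not skip it, so the Blocklist and Challenge rules still get to evaluate it.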

Blocklist

This list blocks the majority of bad crawlers and bots. Use the action Block.

(cf.verified_bot_category in {"AI Crawler"}) or

(ip.src.country in {"T1"}) or 

(starts_with(http.user_agent, "Mozilla/") and 
http.request.version in {"HTTP/1.0" "HTTP/1.1" "HTTP/1.2" "SPDY/3.1"} and 
any(http.request.headers["accept"][*] contains "text/html")) or

(http.user_agent wildcard r"*HeadlessChrome/*") or

(
  http.request.uri.path contains "/xmlrpc.php" or
  http.request.uri.path contains "/wp-config.php" or
  http.request.uri.path contains "/wlwmanifest.xml"
) or

(ip.src.asnum in {
    200373 198571 26496 31815 18450 398101 50673 7393 14061
    205544 199610 21501 16125 51540 264649 39020 30083 35540
    55293 36943 32244 6724 63949 7203 201924 30633 208046 36352
    25264 32475 23033 31898 210920 211252 16276 23470 136907
    12876 210558 132203 61317 212238 37963 13238 2639 20473
    63018 395954 19437 207990 27411 53667 27176 396507 206575
    20454 51167 60781 62240 398493 206092 63023 213230 26347
    20738 45102 24940 57523 8100 8560 6939 14178 46606 197540
    397630 9009 11878 49453 29802
  }
  and http.user_agent wildcard r"Mozilla/*"
) or

(http.request.uri.path ne "/robots.txt") and 
((http.user_agent contains "Amazonbot") or
  (http.user_agent contains "Anchor Browser") or
  (http.user_agent contains "Bytespider") or
  (http.user_agent contains "CCBot") or
  (http.user_agent contains "Claude-SearchBot") or
  (http.user_agent contains "Claude-User") or
  (http.user_agent contains "ClaudeBot") or
  (http.user_agent contains "FacebookBot") or
  (http.user_agent contains "Google-CloudVertexBot") or
  (http.user_agent contains "GPTBot") or
  (http.user_agent contains "meta-externalagent") or
  (http.user_agent contains "Novellum") or
  (http.user_agent contains "PetalBot") or
  (http.user_agent contains "ProRataInc") or
  (http.user_agent contains "Timpibot")
) or

(ip.src.asnum eq 32934)
  1. The request comes from Cloudflare-verified "AI Crawler"s.
  2. The request originates from a Tor exit node (country code "T1"); Tor is a heavy source of unwanted traffic.
  3. The request uses a Mozilla browser User Agent with an older HTTP version (HTTP/1.x or SPDY) and accepts HTML content. In 2025 that is very unusual for a real browser, so these are all bots.
  4. The User Agent contains HeadlessChrome, hence a bot.
  5. The request path targets common WordPress vulnerability endpoints (/xmlrpc.php, /wp-config.php, /wlwmanifest.xml).
  6. The request originates from a specific list of Autonomous System Numbers (ASNs) and uses a Mozilla User Agent. Again, more bots.
  7. The request is not for /robots.txt and the User Agent contains the name of known crawlers or bots (e.g., "GPTBot", "Bytespider", "FacebookBot").
  8. The request originates from Autonomous System Number 32934 (Facebook).
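
Conditions 4 and 6 rely on the `wildcard` operator. The Python sketch below models it under one assumption taken from my reading of Cloudflare's documentation: the operator is case-insensitive and the pattern must match the entire field value, with `*` standing for zero or more characters. The helper name `cf_wildcard` is hypothetical, and the User Agent is a typical headless Chrome string used as sample input.

```python
import re


def cf_wildcard(value: str, pattern: str) -> bool:
    """Model of a whole-value, case-insensitive match where * means 'anything'."""
    # Escape every literal chunk and join the chunks with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


headless_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36"
)

print(cf_wildcard(headless_ua, "Mozilla/*"))          # True: value starts with Mozilla/
print(cf_wildcard(headless_ua, "HeadlessChrome/*"))   # False: token is mid-string
print(cf_wildcard(headless_ua, "*HeadlessChrome/*"))  # True: leading * allows a prefix
```

Under that assumption, a pattern needs a leading `*` to find a token in the middle of the User Agent, while a pattern such as `Mozilla/*` behaves like a prefix test.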

Challenge

This rule protects the frontends. I added some conditions so that logged-in users do not have to verify with Cloudflare; normally a crawler won't have a user account. Set the action to Managed Challenge.

(http.host eq "old.lemmy.eco.br" and not len(http.request.cookies["jwt"]) > 0)

or (http.host eq "photon.lemmy.eco.br" 
  and not len(http.request.headers["authorization"]) > 0 
  and not starts_with(http.cookie, "ph_phc"))

or (http.host wildcard "lemmy.eco.br" 
  and not len(http.request.cookies["jwt"]) > 0 
  and not len(http.request.headers["authorization"]) > 0 
  and starts_with(http.user_agent, "Mozilla") 
  and not http.referer contains "photon.lemmy.eco.br")

or (http.user_agent contains "yandex"
  or http.user_agent contains "sogou"
  or http.user_agent contains "semrush"
  or http.user_agent contains "ahrefs"
  or http.user_agent contains "baidu"
  or http.user_agent contains "python-requests"
  or http.user_agent contains "neevabot"
  or http.user_agent contains "CF-UC"
  or http.user_agent contains "sitelock"
  or http.user_agent contains "mj12bot"
  or http.user_agent contains "zoominfobot"
  or http.user_agent contains "mojeek")

or ((http.user_agent contains "crawl"
  or http.user_agent contains "spider"
  or http.user_agent contains "bot")
  and not cf.client.bot)

or (ip.src.asnum in {135061 23724 4808}
  and http.user_agent contains "siteaudit")
  1. A request to the host "old.lemmy.eco.br" that does not have a "jwt" cookie.
  2. A request to the host "photon.lemmy.eco.br" that lacks an "Authorization" header and whose Cookie header does not start with "ph_phc".
  3. A request to the host "lemmy.eco.br" (the wildcard pattern contains no *, so it matches only that exact host, case-insensitively) that lacks both a "jwt" cookie and an "Authorization" header, uses a Mozilla User Agent, and does not have a referrer from "photon.lemmy.eco.br".
  4. The User Agent contains the name of a specific crawler, bot, or tool (e.g., "yandex", "baidu", "python-requests", "sitelock").
  5. The User Agent contains the words "crawl", "spider", or "bot" but the request does not come from a Cloudflare-verified bot (cf.client.bot is false).
  6. The request originates from specific Autonomous System Numbers (135061, 23724, 4808) and the User Agent contains the word "siteaudit".
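
The first three conditions are the ones that let logged-in users through. Here is a simplified Python sketch of just those three, with several assumptions: cookies and headers are modeled as plain dictionaries with lowercase keys (Cloudflare exposes them as arrays of values), the third condition is treated as matching the exact host "lemmy.eco.br" because its wildcard pattern has no `*`, and the function name `needs_challenge` is invented for the example.

```python
def needs_challenge(host: str, cookies: dict, headers: dict) -> bool:
    """Model of Challenge conditions 1-3: True means a Managed Challenge is shown."""
    has_jwt = len(cookies.get("jwt", "")) > 0
    has_auth = len(headers.get("authorization", "")) > 0

    if host == "old.lemmy.eco.br":
        return not has_jwt
    if host == "photon.lemmy.eco.br":
        # The rule tests the raw Cookie header, not an individual cookie.
        return not has_auth and not headers.get("cookie", "").startswith("ph_phc")
    if host == "lemmy.eco.br":
        return (
            not has_jwt
            and not has_auth
            and headers.get("user-agent", "").startswith("Mozilla")
            and "photon.lemmy.eco.br" not in headers.get("referer", "")
        )
    return False


browser = {"user-agent": "Mozilla/5.0 (X11; Linux x86_64)"}

# Anonymous browser visit to the main site: challenged.
print(needs_challenge("lemmy.eco.br", {}, browser))                     # True
# Same visit with a session cookie: not challenged.
print(needs_challenge("lemmy.eco.br", {"jwt": "token-value"}, browser)) # False
# Anonymous visit to the old frontend: challenged.
print(needs_challenge("old.lemmy.eco.br", {}, browser))                 # True
```

Conditions 4 to 6 are User Agent and ASN checks and are left out of this sketch.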

All these are heavily inspired by this article: https://urielwilson.com/a-practical-guide-to-custom-cloudflare-waf-rules/

Please let me know your thoughts.

brasil · Brasil · by Ademir

Some changes

Good morning to all ("todos, todas e todes")! (Wow, Ademir says "todes" 🤫)

THIS IS MY LAST ANNOUNCEMENT IN THIS COMMUNITY

In response to a long-standing request, the [email protected] community will become a community for general topics.

We will have a new community called [email protected] for internal instance matters, where every user will have their voice heard. It will be a local-only community, meaning you need an account here to participate.

We will hold some discussions on improving the services and on how best to promote the community.

Please use this thread to leave feedback, and create threads in the new community with requests and suggestions for improvement. Let's work together; collectively we can go much further.

For users who need help, we have a new community called ![email protected]. That one can be accessed from any federated ActivityPub platform.

brasil · Brasil · by Ademir

New: XMPP chat rooms on Lemmy Brasil!

🎉 New: XMPP chat rooms on Lemmy Brasil!

Every community now has its own real-time chat room via XMPP!

🔗 Access here:

💬 Example rooms:

No authentication required

With an authenticated XMPP user

ℹ️ Important information:

Join the conversation! 🚀

brasil · Brasil · by Ademir

Lemmy updated: v0.19.10

Changes

  • Fix YouTube thumbnails by raising the metadata fetch limit to 1 MB #5266
  • Remove private messages when banning a user with the "remove content" option (goodbye, Nicole) #5414
  • Ignore the Accept-Language header if no site languages are specified, so that users with English disabled can still see most posts #5485
  • Enable English for users on instances with all languages enabled, resolving the problem above #5489 #5493
  • List only local banned users in /admin #5364
  • Add crawl-delay to robots.txt #3009
  • Optimize migrations included in version 0.19.6 #5301
brasil · Brasil · by Ademir

We updated to Lemmy v0.19.9

Lemmy v0.19.9

Changes

This release fixes a potential security issue by preventing Lemmy from accessing local URLs. There is also a fix for a crash during markdown parsing. Lemmy now uses mimalloc instead of the system allocator (usually glibc), which should improve performance and avoid unbounded memory growth over time.

Lemmy

Lemmy-UI
