Spyke

Replies

Comment on

The end of tt-rss.org

I have copied the latest git revision c67b943aa894b90103c4752ac430958886b996b2 from https://gitlab.tt-rss.org/tt-rss/tt-rss to my gitea instance which is mirrored to https://gitlab.com/nodiscc/tt-rss and https://github.com/nodiscc/tt-rss.

I don't intend to make changes or bugfixes (it's working fine), but I will try to keep it compatible with the PHP version in Debian stable, since I've been using it for years and would really like to keep doing so.

Comment on

A list of Free Software network services and web applications which can be hosted on your own servers

Reply in thread

awesome-selfhosted maintainer here. This critique comes up often (and I sometimes agree...) but it's hard to properly "fix":

Any rule that enforces some kind of "quality" guideline has to be written explicitly into the contribution guidelines, so as not to waste submitters' (and maintainers') time.

As you can see there are already minimal rules in place (software has to be actively maintained, properly documented, first release must be older than 4 months, must of course be fully Free and Open-source...). Anything more is very hard to word objectively or is plain unfair - in the last 7 years (!) maintaining the list I've spent countless hours thinking about it.

For example, rejecting new projects because an existing/already listed one effectively does the same thing would give an unfair advantage to older projects, effectively "locking out" newer ones. Moreover, you will rarely find two projects that have the exact same feature set, workflow, release frequency, technical requirements... and every user has different needs and requirements, so yeah, users of the list are expected to do some research to find the best solution to their particular needs.

This is, of course, less true for some categories (why are there so many pastebins??). But again, it's hard to find clear and objective criteria to determine what deserves to be listed and what does not.

If we started rejecting projects because "I don't have a need for it" or "I already use a somewhat equivalent solution and am not going to switch", that would discard 90% of entries in the list (and not necessarily the worst ones). I do check that projects being added are in a "production-ready" state and ask more questions during reviews if needed. But it's hard to be more selective than we already are without falling into subjective "I like/I don't like" reasoning (let's ban all Nodejs-based projects, npm is horrible and a security liability. Let's also ban all projects that are so convoluted and impossible to build and install properly that Docker is the only installation option. Follow my thoughts?)

Also, Free Software has always been very fragmented, which is both a strength and a weakness. The list simply reflects that.

Another idea I contemplated is linking each project to a "review" thread for the software in question. But I will not host or moderate such a forum/review board, and it would be heavily brigaded by PR departments looking to promote their companies' software.

An HTML version is coming out soon (based on the same data) that will hopefully make the list easier to browse.

I am open to other suggestions, keeping in mind the points above...

250+ self hostable apps

1268 exactly.

You can help clean up the list of unmaintained projects by working on this issue

Comment on

What Self-Hosted Single Sign-On (SSO) do you use?

I tried OpenLDAP but Jesus that was very involved.

OpenLDAP is easy :) Once you understand LDAP concepts.

Check this and read through the tasks/ directory (particularly openldap.yml and populate.yml). It sets up everything needed for an LDAP authentication service. If you don't use ansible you can still read what the tasks do and you should get a pretty good understanding of what's needed; if not, let me know.

In short you need:

  • slapd (the OpenLDAP server)
  • set up a base LDAP directory structure (OUs/Organizational Units, I only use 3 OUs: system, users and groups)
  • an admin user in the LDAP directory (mine is admin directly at the base of the LDAP directory)
  • (optional but recommended) a so-called bind user in the LDAP directory (unprivileged account that can only list/read users/groups) (mine is bind under the system OU)
  • (optional) groups to map users to their roles (e.g. only users in access_jellyfin are allowed to login to jellyfin)
  • actual user accounts, member of one or more groups if needed

When you log in to an application/service configured to use the LDAP authentication backend, it connects to the LDAP directory using the bind user credentials, and checks that the user exists (depending on how you configured the application: by name, uid, email...), that the password you provided matches the hash stored in the LDAP directory, and optionally that the user is part of the required groups. Then it allows or denies access.
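To make the sequence of checks concrete, here is a minimal sketch in Python that simulates them against an in-memory stand-in for the directory. Everything in it is an assumption for illustration: the USERS and GROUPS dictionaries, the sample password and the login function are hypothetical, and the hash scheme shown is salted SHA-1 ({SSHA}), one of the schemes OpenLDAP can store. A real application would usually let the LDAP server verify the password (by binding as the user) rather than comparing hashes itself.

```python
import base64
import hashlib
import hmac
import os


def make_ssha(password: str) -> str:
    """Hash a password with the salted SHA-1 ({SSHA}) scheme."""
    salt = os.urandom(4)
    digest = hashlib.sha1(password.encode() + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode()


def check_ssha(password: str, stored: str) -> bool:
    """Return True if the password matches a stored {SSHA} hash."""
    if not stored.startswith("{SSHA}"):
        return False
    raw = base64.b64decode(stored[len("{SSHA}"):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 bytes, the rest is salt
    candidate = hashlib.sha1(password.encode() + salt).digest()
    return hmac.compare_digest(candidate, digest)


# Hypothetical directory contents, standing in for a real LDAP server.
USERS = {"jane.doe": {"userPassword": make_ssha("s3cret")}}
GROUPS = {"access_jellyfin": {"jane.doe"}}


def login(uid, password, required_group=None):
    """Apply the three checks: user exists, password matches, group membership."""
    user = USERS.get(uid)
    if user is None:
        return False
    if not check_ssha(password, user["userPassword"]):
        return False
    if required_group is not None and uid not in GROUPS.get(required_group, set()):
        return False
    return True
```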

There's not much else to it:

  • you can also do without the bind account but I wouldn't recommend it (either configure your applications to use the admin user in which case they have admin access to the LDAP directory... not good. Or allow anonymous read-only access to the LDAP directory - also not ideal).
  • slapd stores its configuration (admin user/password, log level...) inside the LDAP directory itself as attributes of a special entity (cn=config), so to access or modify it you have to use LDIF files and the ldapadd/ldapmodify commands, or use a convenient wrapper like the ansible modules used above.
  • once this is set up, you can forget LDIF files and use a web interface to manage contents of the LDAP directory.
  • OUs and groups are different and do not serve the same purpose, OUs are just hierarchical levels (like folders) inside your LDAP tree. groups can contain multiple users/users can have multiple groups so they're like "labels" without a notion of hierarchy. You can do without OUs and stash everything at the top level of the directory, but it's messy.
  • users (or other entities) have several attributes (common name, firstname, lastname, email, uid, password, description... it can contain anything really, it's just a directory service)
  • LDAP is hierarchical by nature, so a user with Common Name (CN) jane.doe in OU users in the directory for domain example.org has the Distinguished Name (DN) cn=jane.doe,ou=users,dc=example,dc=org. Think of it like /path/to/file.
  • to look for a particular object you use filters which are just a search syntax to match specific entities (object classes) (users are inetOrgPersons, groups are posixGroups...) and attributes (uid, cn, email, phonenumber...). Usually applications that support LDAP come with predefined filters to look for users in specific groups, etc.
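As a rough illustration of the last two points, the sketch below builds a DN and two search filters as plain strings. The function names are hypothetical, the group filter assumes posixGroup entries that list members in the memberUid attribute, and DN special characters (commas, plus signs...) are not escaped, so treat it as a sketch rather than something to paste into production.

```python
# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_SPECIAL = {"\\", "*", "(", ")", "\0"}


def build_dn(cn: str, ou: str, domain: str) -> str:
    """Build a Distinguished Name such as cn=jane.doe,ou=users,dc=example,dc=org."""
    dcs = ",".join(f"dc={part}" for part in domain.split("."))
    return f"cn={cn},ou={ou},{dcs}"


def escape_filter_value(value: str) -> str:
    """Replace special characters with a backslash and their two-digit hex code."""
    return "".join(
        "\\%02x" % ord(ch) if ch in _FILTER_SPECIAL else ch for ch in value
    )


def user_filter(uid: str) -> str:
    """Filter matching a user entry by uid."""
    return f"(&(objectClass=inetOrgPerson)(uid={escape_filter_value(uid)}))"


def group_member_filter(group: str, uid: str) -> str:
    """Filter matching a posixGroup that lists uid as a member (assumes memberUid)."""
    return (
        "(&(objectClass=posixGroup)"
        f"(cn={escape_filter_value(group)})"
        f"(memberUid={escape_filter_value(uid)}))"
    )


if __name__ == "__main__":
    print(build_dn("jane.doe", "users", "example.org"))
    print(user_filter("jane.doe"))
    print(group_member_filter("access_jellyfin", "jane.doe"))
```

Escaping the filter values matters as soon as the uid comes from a login form, otherwise a user could inject their own filter syntax.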

Comment on

Why use Named volume vs Anonymous volume in Docker?

  • step 1: use named volumes
  • step 2: stop your containers or just wait for them to crash/stop unnoticed for some reason
  • step 3: run docker system prune --all as one should do periodically to clean up the garbage docker leaves on your system. Lose all your data (this will delete even named volumes if they are not in use by a running container)
  • step 4: never use named or anonymous volumes again, use bind mounts

The fact that you absolutely need to run docker system prune --all regularly to get rid of GBs of unused layers, test containers, etc, combined with the fact that it deletes explicitly named volumes makes them too unsafe for my taste. Just use bind mounts.
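For readers unsure which of the three kinds a given -v argument creates, here is a small sketch that classifies the short HOST:CONTAINER[:OPTIONS] syntax. It is a simplification under stated assumptions: Unix-style paths only, bind mount sources written as absolute paths, and the example paths and volume names are made up.

```python
def classify_volume_spec(spec: str) -> str:
    """Classify a `docker run -v` argument (short syntax, Unix paths only)."""
    parts = spec.split(":")
    if len(parts) == 1:
        # Only a container path: Docker creates a volume with a random name.
        return "anonymous volume"
    source = parts[0]
    if source.startswith("/"):
        # The source is a path on the host: the directory is mounted as-is.
        return "bind mount"
    # The source is a bare name: Docker creates and manages a volume with that name.
    return "named volume"


if __name__ == "__main__":
    for spec in ("/var/lib/app", "appdata:/var/lib/app", "/srv/app/data:/var/lib/app:ro"):
        print(f"{spec} -> {classify_volume_spec(spec)}")
```

With a bind mount the data lives in a directory you chose on the host, which is why Docker's own cleanup commands never touch it.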

Comment on

Ollama Server Component Recommendations

I suggest using llama.cpp instead of ollama, you can easily squeeze +10% in inference speed and other memory optimizations from llama.cpp. With hardware prices nowadays I think every % saved on resources matters. Here is a simple ansible role to set up llama.cpp, it should give you a good idea of how to deploy it.

A dedicated inference rig is not gonna be cheap. What I did, since I needed a gaming rig anyway, was get 32GB of DDR5 (this was before the current RAMpocalypse; if I had known, I would have bought 64) and an AMD 9070 (16GB VRAM; again, if I had known how crazy prices would get I'd probably have bought a 24GB VRAM card). The home server runs the usual/non-AI stuff, and llama.cpp runs on the gaming desktop (the home server just has a proxy to it). Yeah, the gaming desktop has to be powered up when I want to run inference, but this is my main desktop so it's powered on most of the time, no big deal.

Comment on

Now that vmware is over, what should I move to?

Reply in thread

/thread

This is my go-to setup.

I try to stick with libvirt/virsh when I don't need any graphical interface (integrates beautifully with ansible [1]), or when I don't need clustering/HA (libvirt does support "clustering" at least in some capability, you can live migrate VMs between hosts, manage remote hypervisors from virsh/virt-manager, etc). On development/lab desktops I bolt virt-manager on top so I have the exact same setup as my production setup, with a nice added GUI. I heard that cockpit could be used as a web interface but have never tried it.

Proxmox on more complex setups (I try to manage it using ansible/the API as much as possible, but the web UI is a nice touch for one-shot operations).

Re incus: I don't know for sure yet. I have an old LXD setup at work that I'd like to migrate to something else, but I figured that since both libvirt and proxmox support management of LXC containers, I might as well consolidate and use one of these instead.

Comment on

Help understanding Reverse Proxies

Reply in thread

This answer says it all. A reverse proxy dispatches HTTP requests to several "backend" services (your applications), depending on what domain name is requested in the HTTP request headers. For example using Apache as a reverse proxy, a config block such as

<VirtualHost *:443>
  ServerName  media.example.org
  ...
  ProxyPass "/" "http://127.0.0.1:8096/"
</VirtualHost>

will forward requests made on port 443 with the HTTP header Host: media.example.org (for example a request to https://media.example.org/my/page) to the "backend" service listening on 127.0.0.1 (local machine), port 8096 (which may be a media server, a wiki, ...). This way you only have to expose ports 80/443 to the outside network, and the reverse proxy will take care of dispatching requests to the correct "backend" service.
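The dispatching logic itself is tiny. The sketch below shows the idea in Python, not how Apache implements it: the Host header is looked up in a table of backends. The wiki.example.org entry and its port are made up for the example, and pick_backend is a hypothetical helper name.

```python
from typing import Dict, Optional

# Map of requested domain name -> backend service URL.
# media.example.org mirrors the Apache example; the wiki entry is hypothetical.
BACKENDS: Dict[str, str] = {
    "media.example.org": "http://127.0.0.1:8096/",
    "wiki.example.org": "http://127.0.0.1:8080/",
}


def pick_backend(headers: Dict[str, str]) -> Optional[str]:
    """Return the backend URL for a request, based on its Host header."""
    host = headers.get("Host", "")
    hostname = host.split(":")[0].lower()  # drop an optional :port suffix
    return BACKENDS.get(hostname)


if __name__ == "__main__":
    print(pick_backend({"Host": "media.example.org"}))
    print(pick_backend({"Host": "unknown.example.org"}))
```

A real reverse proxy then opens a connection to that backend, relays the request and streams the response back to the client.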

Most web servers can be used as reverse proxies.

In addition, since all requests go through the proxy, it is a good place to manage centralized logging, SSL/TLS certificates, access control such as IP whitelisting/blacklisting, automatic redirects...

Comment on

In the mood to self host more stuff and need ideas

https://github.com/awesome-selfhosted/awesome-selfhosted

Seriously though, I think there needs to be a rule against this kind of "What should I host" post (nothing against you personally OP). It comes up almost every day, also used to come up every day on /r/selfhosted... I was talking about this with someone just a few hours ago... https://lemmy.world/comment/780603

Mods, what about a ban on these posts, and redirect people to the "What do (should) I (you) self-host" pinned post where people can go and look for suggestions? Sorry, not trying to be negative - but this is exactly why /r/selfhosted was getting boring (that, and the disguised ads).

OP, sorry to hijack your thread. Here is my recommendation for you: Shaarli

Comment on

How to store backups?

Don't use a synchronized folder as a backup solution (delete a file by mistake on your local replica -> the deletion gets replicated to the server -> you lose both copies).

old pc that has 2x 80gb, 120gb, 320gb, and 500gb hdd

You can make a JBOD array out of that using LVM (add all disks as PVs, create a single VG on top of them, create a single LV on top of that VG, format that LV as an ext4 filesystem, mount this filesystem somewhere, access it over SFTP or another file transfer protocol).
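The steps above can be sketched as a command plan. The Python snippet below only builds and prints the commands, it does not run anything; the device names, the VG/LV names and the mount point are placeholders to replace with your own. Running the printed commands wipes whatever is on those disks, and the mount point has to exist beforehand.

```python
import shlex
from typing import List


def lvm_jbod_commands(
    disks: List[str],
    vg: str = "backupvg",            # hypothetical volume group name
    lv: str = "backuplv",            # hypothetical logical volume name
    mountpoint: str = "/mnt/backup",  # hypothetical mount point
) -> List[List[str]]:
    """Return the commands that pool several disks into one ext4 filesystem."""
    device = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", *disks],                              # mark each disk as a PV
        ["vgcreate", vg, *disks],                          # one VG spanning all PVs
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],      # one LV using all the space
        ["mkfs.ext4", device],                             # format the LV as ext4
        ["mount", device, mountpoint],                     # mount it
    ]


if __name__ == "__main__":
    # Placeholder device names; check yours with lsblk before running anything.
    for command in lvm_jbod_commands(["/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(shlex.join(command))
```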

But if the disks are old, I wouldn't trust them as reliable backup storage. You can use them to store data that will be backed up somewhere else. Or as an expendable TEMP directory (this is what I do with my old disks).

My advice is to get a large disk for this PC and store backups on that. You don't necessarily need RAID (RAID is a high availability mechanism, not a backup). Set up backup software on this old PC to pull automatic daily backups from your server (and possibly other devices/desktops... personally I don't bother with that. Anything that is not on the server is expendable). I use rsnapshot for that: simple config file, basic deduplication, simple filesystem-backed backups so I can access the files without any special software, gets the job done. There are a few threads here about backup software recommendations:

In addition I make regular, manual, offsite copies of the backup server's backups/ directory to removable media (stash the drive somewhere where a disaster that destroys the backup server will not also destroy the offsite backup drive).

Prefer pull-based backup strategies, where hosts being backed up do not have write access to the backup server (else a compromised host could alter previous backups).

Monitor correct execution of backups (my simple solution to that, is to have cron create/update a state file after correct execution, and have the netdata agent check the date of last modification of this file. If it has not been modified in the last 24-25hrs, something is wrong and I get an alert).
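If you don't run netdata, the same check fits in a few lines of Python. This is a standalone sketch of the idea, not what netdata does internally: the function name is made up, and the 25-hour default mirrors the threshold mentioned above.

```python
import os
import time
from typing import Optional


def backup_is_stale(
    state_file: str,
    max_age_hours: float = 25.0,
    now: Optional[float] = None,
) -> bool:
    """Return True if the state file is missing or was last modified too long ago."""
    current = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(state_file)
    except FileNotFoundError:
        # No state file at all: the backup never completed successfully.
        return True
    return (current - mtime) > max_age_hours * 3600


if __name__ == "__main__":
    # Hypothetical path; point this at the file your cron job touches.
    if backup_is_stale("/var/backups/last-success"):
        print("ALERT: last successful backup is too old or missing")
```

The backup job only has to touch the state file as its very last step, after everything else has succeeded.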

Comment on

What type of computer setup would one need to run ai locally?

  • Small 4B models like gemma3 will run on anything (I have it running on a 2020 laptop with integrated graphics). Don't expect superintelligence, but it works for basic classification tasks, writing/reviewing/fixing small scripts and basic chat, writing, etc
  • I use https://github.com/ggml-org/llama.cpp in server mode pointing to a directory of GGUF model files downloaded from huggingface. I access it from the built-in web interface or API (wrote a small assistant script)
  • To load larger models you need more RAM (preferably fast VRAM/GPU but DDR5 on the motherboard will work - it will be noticeably slower). My gaming rig with 16GB AMD 9070 runs 20-30B models at decent speeds. You can grab quantized (lower precision, lower output quality) versions of those larger models if the full-size/unquantized models don't fit. Check out https://whatmodelscanirun.com/
  • For image generation I found https://github.com/vladmandic/sdnext which works extremely well and fast with Z-Image Turbo, FLUX.1-schnell, Stable Diffusion XL and a few other models
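A rough rule of thumb for the RAM/VRAM point above: the weights of a model take about (number of parameters × bits per weight ÷ 8) bytes. The sketch below turns that into a quick estimate. It is a back-of-the-envelope calculation only: real quantized files use slightly more bits per weight than their nominal figure, and the 2 GB allowance for context and runtime overhead is an arbitrary assumption.

```python
def approx_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(
    params_billion: float,
    bits_per_weight: float,
    memory_gb: float,
    overhead_gb: float = 2.0,  # assumed allowance for context and runtime overhead
) -> bool:
    """Return True if the weights plus the assumed overhead fit in memory_gb."""
    return approx_weights_gb(params_billion, bits_per_weight) + overhead_gb <= memory_gb


if __name__ == "__main__":
    for params, bits in ((4, 4), (20, 4), (20, 16), (30, 4)):
        size = approx_weights_gb(params, bits)
        verdict = "fits" if fits_in_memory(params, bits, 16) else "does not fit"
        print(f"{params}B at {bits}-bit: ~{size:.1f} GB, {verdict} in 16 GB")
```

By this estimate a 20B model quantized to 4 bits needs about 10 GB and fits in 16 GB of VRAM, while the same model at 16 bits needs about 40 GB and does not.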

As for the prices... well the rig I bought for ~1500€ in September is now up to ~2200€ (once-in-a-decade investment). It's not a beast but it works. The primary use case was general computing and gaming, and I'm glad it works for local AI, but costs for a dedicated, performant AI rig are ridiculously high right now. It's not economically competitive yet against commercial LLM services for complex tasks, but that's not the point. Check https://old.reddit.com/r/LocalLLaMA/ (yeah reddit I know). 10k€ of hardware to run ~200-300B models, not counting electricity bills.