Watchtower to auto-update the containers while they're still network connected
Transmission daemonized to download and seed the ZIMs or anything else non-pirate related
Use jojo2357's ZIM updater to auto-update ZIMs via cron job while they're still network connected
DietPi-Dashboard as an all-in-one dashboard to monitor and control the RPi from a web interface. (Yeah I know I can do everything SSH'ing in but I'm lazy.)
File Browser just in case I want other people to have access to files but since it's in maintenance mode and I'm unsure I want others to have access, might strip it out
That's a good question (and good idea) that I hadn't really thought about past a collection of ZIMs. The one I built advertises its own AP SSID that anyone can connect to and then access the ZIMs that are served via kiwix-serve on HTTP/80. That is, I wanted a single, low power, headless device that multiple people could use simultaneously via wifi and browser rather than a personal device.
I hadn't really thought about other helpful services past that. I mean, we've got a (wee) server so why not use it? I like the idea of OSM and their website is open source but has a lot of dependencies :
openstreetmap-website is a Ruby on Rails application that uses PostgreSQL as its database, and has a large number of dependencies for installation
A fully-functional openstreetmap-website installation depends on other services, including map tile servers and geocoding services, that are provided by other software. The default installation uses publicly-available services to help with development and testing.
I wonder how hard it would be to host everything it needs locally/offline... and what that would do to power consumption : )
Thanks for the idea - something to look into, for sure.
You find a generator, or solar panels, or wind mill, or water turbine, or a bicycle hooked up to a generator.
If electricity permanently goes out then we're in a scavenger situation and it is time to start taking apart things that are no longer necessary to build the things that are.
start taking apart things that are no longer necessary to build the things that are
Hey finally a good use for all of those cars, grab the alternators out for small generators (since bicycles are the ultimate apocalypse vehicle: simple, small, easy to maintain and don't require complex fuels)
Eink displays are pretty awesome for this sort of thing, I repurposed a kobo ereader as a household info display and it worked nicely. Those PaPiRus screens look easier to interface with, but a little small for reading wikipedia articles. They'd do in a pinch, but the eyestrain would have me looking for a bigger solution.
Pretty much what Sinthesis said; USB power brick and/or solar panels. Both at the ready and tested. Also got a big ass battery backup that will charge off solar panels.
You'd first have to buy a phone that can run postmarketos and these are much rarer than I wish they were. Is there even anything new that can run it? Pine64 stopped making phones and said they'll make a new one when they can make it RISC-V.
Fairphone maybe I guess. 4 is listed as a supported device, but someone has gotten it working on 6 too.
Cause if ya wanna go overboard like I did, 1TB of NVME storage, can add with SD Card if necessary. 16GB RAM. Very little learning curve for my part as I use SBCs often. Plus almost every Docker container and program I want works on RPi without any hassle.
There's also more robust guides and community for RPi.
The official repo is only about 80GB, I have an old copy from when I was running an airgapped system. Not sure about the AUR, it's probably in the TBs range though.
AUR might not be as big as you think. It wouldn't work for offline tho since many AUR packages pull archives from websites during the build process.
Oh yeah, didn't think about that. Having a bunch of PKGBUILD files isn't very useful.
I guess you could compile all the things, if you had a lot of spare processing power. A once a year snapshot would probably be enough for anything short of a Mad Max future.
Hmm, 3tb to mirror all the architectures I have around is very tempting. I have an unused dual bay external, and another one that desperately needs (wants) to have its 4tb drives upgraded.
They are probably using a phone app which allows you to swipe sideways to downvote and also using screen gestures to 'go back'. I've accidentally downvoted things this way.
I have been archiving Linux builds for the last 20 years so I could effectively install Linux on almost any hardware since 1998-ish.
I have been archiving docker images to my locally hosted gitlab server for the past 3-5 years (not sure when I started tbh). I've got around 100gb of images ranging from core images like OS to full app images like Plex, ffmpeg, etc.
I also have been archiving foss projects into my gitlab and have been using pipelines to ensure they remain up-to-date.
the only thing I lack are packages from package managers like pip, bundler, npm, yum/dnf, apt. there's just so much to cache it's nigh impossible to get everything archived.
I have even set up my own local CDN for JS imports on HTML. I use rewrite rules in nginx to redirect them to my local sources.
my goal is to be as self-sustaining on local hosting as possible.
Everyone should have this mindset regarding their data. I always say to my friends and family, "If you like it, download it.". The internet is always changing and that piece of media that you like can be moved, deleted, or blocked at any time.
Neither are that bad honestly. I have jigdo scripts I run with every point release of Debian and have a copy of English Wikipedia on a Kiwix mirror I also host. Wikipedia is a tad over 100 GB. The source, arm64 and amd64 complete repos (DVD images) for Debian Trixie, including the network installer and a couple live boot images, are 353 GB.
Kiwix has copies of a LOT of stuff, including Wikipedia on their website. You can view their zim files with a desktop application or host your own web version. Their website is:
https://kiwix.org/
If you want (or if Wikipedia is censored for you) you can also look at my mirror to see what a web hosted version looks like:
https://kiwix.marcusadams.me/
Note: I use Anubis to help block scrapers. You should have no issues as a human other than you may see a little anime girl for a second on first load, but every once in a while Brave has a disagreement with her and a page won't load correctly. I've only seen it in Brave, and only rarely, but I've seen it once or twice so thought I'd mention it.
I rarely get bounced by Anubis, but oddly enough it has happened to me a couple times in FF, I suspect it’s the fingerprinting resistance settings that cause this to happen? Hasn’t happened in a while though
It was trading CD-R's during my high school days.... good times. Napster was just starting to take off by the time we had a CD-R trading network set up, Napster just increased the amount of CD's that got passed around.
Over a 30-mile (48 km) distance, a single pigeon may be able to carry tens of gigabytes of data in around an hour, which on an average bandwidth basis compared very favorably to early ADSL standards, even when accounting for lost drives.
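To put rough numbers on the pigeon claim above, here is a small sketch of the arithmetic. The payload sizes are illustrative assumptions (the quote only says "tens of gigabytes"), and the function name is made up for this example.

```python
def average_throughput_mbit_s(payload_gigabytes: float, flight_hours: float) -> float:
    """Average throughput in Mbit/s for a payload physically carried over one trip."""
    bits = payload_gigabytes * 1e9 * 8   # decimal gigabytes -> bits
    seconds = flight_hours * 3600        # hours -> seconds
    return bits / seconds / 1e6          # bits per second -> megabits per second


if __name__ == "__main__":
    # Assumed payloads; the one-hour flight time comes from the quoted passage.
    for gigabytes in (16, 32, 64):
        rate = average_throughput_mbit_s(gigabytes, 1.0)
        print(f"{gigabytes} GB in 1 h is about {rate:.1f} Mbit/s on average")
```

Under those assumptions a 32 GB drive delivered in an hour averages roughly 71 Mbit/s, against the single-digit Mbit/s downstream rates of first-generation ADSL. Latency, of course, is a full hour.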
Compared to what I use at home now, this sounds great
A good way to see what the future of places like the U.S are is to look at places like North Korea, where they do exactly this, move files around on flash media to avoid the state censors.
Torrents are often used for installers, but for packages it tends to be more trouble than what it's worth. Is creating a torrent for a 4k library worth it?
but if you want the easier version just get Kiwix on whatever device in front of you right now (yes, even mobile phone assuming you have the space) then get whatever content you need.
The point though is having such a repository takes minutes. If you don't have the space, buy a 512GB microSD for 50EUR then put that on, stuff it in a drawer then move on. If you want to, every 3 months or whenever you feel like it, update it.
TL;DR: takes longer to write such a meme than actually do it.
Watch out for flash data corruption. Lots of cheap flash (USB sticks, SD cards, SSDs) lose data after just a few years of offline storage. Something something quantum tunnel bullshit, iirc.
So either look for media that guarantee long cold storage retention (lots of businesses need to keep shit for 10 years for tax reasons), or occasionally plug it in and let it do the housekeeping.
It's more that flash NAND uses a small electric charge to keep the NAND gates in the correct configuration. Over time, that charge dissipates. If you power the storage device every once in a while, you minimize these chances.
Here's a video explaining why it happens to Wii U's after being powered off for a while. https://youtu.be/JHME4zLs6Qs
Using older flash tech can be useful here. You might not always need the highest density storage if you want to maintain files for a long time. Getting stuff built in a much larger process node makes for a much more stable form of storage.
Thanks but even though it's on a plugged HDD I don't even care for any of that data. What I mean is that none of that data is sensitive. It might be useful, potentially, but it's not unique. What I mean is that if somehow my .zim file for Wikipedia was corrupted I could download it again from https://library.kiwix.org/#lang=eng&category=wikipedia or elsewhere in ~30min (just checked).
What I'm trying to highlight here is more the process than the actual outcome.
TL;DR: yes, if one is actually serious about just getting and storing, they should verify periodically if the data is indeed fine. What I do want to highlight though is to first know how to do it at all. Anyway, you are right that for a proper solution on the long run one must understand how (cold) storage actually works. My heuristic is that it's like canned food (which I don't use much), it might last a while, but not forever.
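One way to do that periodic check, sketched in Python: record a SHA-256 for every file once, then re-hash later and compare. The manifest layout and the function names here are assumptions for illustration, not something any poster in the thread described using.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a 100 GB ZIM never has to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder: Path, manifest: Path) -> None:
    """Record a checksum for every regular file directly inside folder."""
    hashes = {
        entry.name: sha256_of(entry)
        for entry in sorted(folder.iterdir())
        if entry.is_file() and entry != manifest
    }
    manifest.write_text(json.dumps(hashes, indent=2))


def verify_manifest(folder: Path, manifest: Path) -> List[str]:
    """Return the names of files that are missing or no longer match."""
    expected = json.loads(manifest.read_text())
    damaged = []
    for name, digest in expected.items():
        target = folder / name
        if not target.is_file() or sha256_of(target) != digest:
            damaged.append(name)
    return damaged
```

Run `write_manifest` right after downloading, and `verify_manifest` whenever the drive gets plugged in again. Anything it returns is a candidate for re-downloading, which fits the "it's not unique data" attitude above.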
It can be but not to me. To me the point is to test what's actually feasible and usable. It can be Wikipedia on my HDD but it could also be SO on a microSD or a RPi ... or it could be something totally different on another piece of hardware with another piece of storage. It will depend on the context.
So again, sure, having the data itself feels nice but in practice I never really needed it. If tomorrow my HDD would die I would shrug. If tomorrow Kiwix library wouldn't work anymore, I'd be disappointed but I could rely on .zim file elsewhere, e.g. on torrent trackers.
IMHO the point isn't files, the point is usable knowledge.
Edit : to be clear this isn't philosophy, you can see exactly what I mean and even HOW I do it (and even when) with the edits of my public wiki or my git repositories.
-rw-r--r-- 1 fabien fabien 103G Jul 6 2024 wikipedia_en_all_maxi_2024-01.zim
# encyclopedia Wikipedia English with images and more
-rw-r--r-- 1 fabien fabien 81G Apr 22 2023 gutenberg_mul_all_2023-04.zim
# Project Gutenberg, book collection in multiple languages
-rw-r--r-- 1 fabien fabien 75G Jul 7 2024 stackoverflow.com_en_all_2023-11.zim
# StackOverflow, programming questions and answers
-rw-r--r-- 1 fabien fabien 74G Mar 10 2024 planet-240304.osm.pbf
# OpenStreetMap low resolution for the whole World
-rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso
# Debian base ISO
-rw-r--r-- 1 fabien fabien 2.6G May 7 2023 ifixit_en_all_2023-04.zim
# iFixit collection of guides to fix appliances
-rw-r--r-- 1 fabien fabien 1.6G May 7 2023 developer.mozilla.org_en_all_2023-02.zim
# Web development documentation
-rw-r--r-- 1 fabien fabien 931M May 7 2023 diy.stackexchange.com_en_all_2023-03.zim
# Do It Yourself Q&A
-rw-r--r-- 1 fabien fabien 808M Jun 5 2023 wikivoyage_en_all_maxi_2023-05.zim
# WikiVoyage, the version of Wikipedia for traveling
-rw-r--r-- 1 fabien fabien 296M Apr 30 2023 raspberrypi.stackexchange.com_en_all_2022-11.zim
# Raspberry Pi Q&A
-rw-r--r-- 1 fabien fabien 131M May 7 2023 rapsberry_pi_docs_2023-01.zim
# Raspberry Pi documentation
-rw-r--r-- 1 fabien fabien 100M May 7 2023 100r-off-the-grid_en_2022-06.zim
# Off the grid documents
-rw-r--r-- 1 fabien fabien 61M May 7 2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim
# Quantum computer Q&A
-rw-r--r-- 1 fabien fabien 45M May 7 2023 computergraphics.stackexchange.com_en_all_2022-11.zim
# Computer graphics Q&A
-rw-r--r-- 1 fabien fabien 37M May 7 2023 wordnet_en_all_2023-04.zim
# Graph of words in English
-rw-r--r-- 1 fabien fabien 23M Jul 17 2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz
# Kiwix to read .zim files
-rw-r--r-- 1 fabien fabien 16M Oct 6 21:32 be-stib-gtfs.zip
# public transport database in Brussels, Belgium
-rw-r--r-- 1 fabien fabien 3.8M Oct 6 21:32 be-sncb-gtfs.zip
# train transport database in Belgium
-rw-r--r-- 1 fabien fabien 2.3M May 7 2023 termux_en_all_maxi_2022-12.zim
# Termux, Linux tooling on Android, documentation in English
-rw-r--r-- 1 fabien fabien 1.9M May 7 2023 kiwix-firefox_3.8.0.xpi
# Kiwix Web Extension for the Firefox browser
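Most of the .zim names in the listing above end in a `YYYY-MM` stamp, so a script can tell which local copies are older than a newer snapshot. A minimal sketch, assuming that naming pattern holds; the function names and the 2025 candidate filename in the usage note are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: <book name>_<YYYY>-<MM>.zim, as seen in the listing above.
ZIM_NAME = re.compile(r"^(?P<book>.+)_(?P<year>\d{4})-(?P<month>\d{2})\.zim$")


def parse_zim_name(filename: str) -> Optional[Tuple[str, int, int]]:
    """Split a ZIM filename into (book, year, month); None if it does not match."""
    match = ZIM_NAME.match(filename)
    if match is None:
        return None
    return match["book"], int(match["year"]), int(match["month"])


def is_stale(local: str, candidate: str) -> bool:
    """True when candidate is the same book as local with a later date stamp."""
    have, offered = parse_zim_name(local), parse_zim_name(candidate)
    if have is None or offered is None or have[0] != offered[0]:
        return False
    return (offered[1], offered[2]) > (have[1], have[2])
```

For example, `is_stale("wikipedia_en_all_maxi_2024-01.zim", "wikipedia_en_all_maxi_2025-08.zim")` is true, while non-ZIM entries such as `planet-240304.osm.pbf` are simply ignored.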
By the way, there's now a Wikipedia 2025 snapshot.
I am currently trying to fit that on my phone somehow. I wish I could just omit the index database at the end that can't be split it seems. I have to keep it, but when it's split up, it doesn't work anyway (search is broken that way) (https://github.com/openzim/zim-tools/issues/295).
My phone can only do FAT32 for SD cards...
For 2024 Wikipedia, that seems to be around 18GiB of wasted space.
I've got Ulefone Armor 24. It can take a 1TB Micro SD, but only FAT32. Why a Linux-based OS can only do FAT32, despite supporting other FSs on internal storage goes beyond me.
Unfortunately, this is rather dependent on manufacturer (or rather how much they can fuck up).
Android 14, but without exFAT support.
I tried multiple, exFAT, ext4, f2fs, NTFS, nothing else works.
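For a sense of what the FAT32 limit means in practice: a single FAT32 file tops out at 4 GiB minus one byte, so a large ZIM has to be cut into many pieces. A rough sketch of the count, assuming the 103 GiB figure for the 2024-01 English Wikipedia from the listing earlier in the thread; the function name is made up for this example.

```python
FAT32_MAX_FILE_BYTES = 2**32 - 1  # largest single file FAT32 can hold


def chunks_needed(file_bytes: int, chunk_bytes: int = FAT32_MAX_FILE_BYTES) -> int:
    """Number of pieces needed so that no piece exceeds chunk_bytes."""
    if file_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be non-negative, chunk size must be positive")
    # Integer ceiling division avoids float rounding on very large files.
    return -(-file_bytes // chunk_bytes)


if __name__ == "__main__":
    wikipedia_bytes = 103 * 2**30  # assumed size, taken from the ls output above
    print(chunks_needed(wikipedia_bytes), "pieces at the FAT32 maximum")
```

This only covers the arithmetic. It does nothing for the problem in the linked issue, where the search index at the end of the file cannot be split cleanly.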
Yeah not gonna lie, i think i heard someone in a youtube video a while back talk about how the entirety of wikipedia takes up like 200 gigs or something like that, and it got me seriously considering to actually make that offline backup. Shit is scary when countries like the uk are basically blocking you from having easy access to knowledge.
Yeah, it’s surprisingly small when it’s compressed if you exclude things like images and media. It’s just text, after all. But the high level of compression requires special software to actually read without uncompressing the entire archive. There are dedicated devices you can get, which pretty much only do that. Like there are literal Wikipedia readers, where you just give it an archive file and it’ll allow you to search for and read articles.
If my experience with mashing the random article button is any indicator, you could reduce the size by 30% just by removing articles on sports players. I doubt I'll need those
UKGOV haven't started on things like Wikipedia yet. They know kids use it for school and blinded by ideology though they are, even they can see there'd be an enormous backlash if they blocked it any time soon.
If that's going to happen at all, I doubt it would be before the next election. That's whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they've taken a Tory-drafted policy and made it their own.
Ironically, the up and coming third option fascist party, have said they're going to repeal the Online Safety Act. They have other fish to fry if they get in, and they'll want to keep their preferred demographic(s) happy while they do it.
I assume that eventually something like the OSA would come back to "protect the children". They love the current US President.
None of this is hopeful. Take this as more of a rant.
Removing books about sucking cum out of anuses from public schools isn't really "burning books." You can still buy them whenever you want, just not putting them in taxpayer funded schools with children.
EDIT: Had to add some details of the "books being burned [but really just removed from public school]":
During public comment, one woman read a passage from “Yolo” by Lauren Miracle which is found in Freedom High School.
“I climbed onto of him and started kissing him in a way that said very clearly here I am, I’m ready to have sex,” the speaker read.
Another title, “Anatomy of a Single Girl” by Daria Snadowsky, was also read by a speaker.
“Guy tries rubbing my clitoris with his fingers, he wiggles his pelvis back and forth,” another woman read from the book.
“This is ridiculous that this school – any school – has this book,” the woman said to the board.
Julie Gebhards, the woman seen in the first video of our story, is a Hillsborough County mom of six children.
Gebhards read an excerpt from the book “Invisible Monsters Remix” by Chuck Palahniuk. According to the district’s online book library, the title is found in Steinbrenner High School.
“He shoots his load, and then plants his mouth on your anus and sucks out his own warm sperm, plus whatever lubricant and feces are present. That’s felching. It may or may not, I add, include kissing you to pass the sperm and fecal matter into your mouth,” Gebhards said.
We should ban mention of Christianity in public. We should also make it illegal for anyone to teach their children Christianity. Practicing Christians should be declared mentally ill, and if they practice their faith in front of children, they should be put on the sex offender registry.
These freaks actually put giant statues of a naked bleeding man up on full public display in buildings. And they believe the most holy book in the world is one that features incest, murder, rape, genocide, and often fully endorses these horrors. Their main ritual is a form of public ritual cannibalism.
Christians are too dangerous to be allowed near children.
Oh no! What have you done! Now, I want to go try felching because I saw a message about it online with no context and I just. Have. To. Try. It.
...
Oh wait, no. No, I don't. Pfew!
Yeah, seems like there's nothing as simple as something similar to a git clone available.
One would probably have to download multiple full copies from different times and then merge them with deduplication, to get that answer.
The broad censorship of government data in the US, combined with the recent political attacks on Wikipedia caused me to download the whole English Wikipedia earlier this year. Guessing OP is similar
Not sure why they'd download Debian with all packages though
Edit: I should mention it's less about a potential loss of Wikipedia as it is a personal source of truth on politically sensitive topics that get censored, or turned to propaganda by bots
For example the Wounded Knee Massacre. Pete Hegseth has recently been calling it the, "Battle of Wounded Knee". I wouldn't be surprised if the current administration went to war with Wikipedia and forced them to 1) Change articles they disagree with, and 2) Hide those changes from history
My rationale with Debian is that distros are kind of like portals to entire compendiums of free and open-source software. With the increasing attacks on vpns in particular right now, I'm concerned there are any number of programs we take for granted that we might not have access to soon.
The internet is already deeply enshittified. There is a real possibility that it will no longer be a free and open web in any capacity soon. So it's past time to make archives, and start setting up meshnets.
I had downloaded the full (no pictures) Wikipedia earlier this year for exactly this reason. This thread told me about kiwix, which is awesome, so I downloaded the "Wikipedia .08" using kiwix, which is the best 45,000 articles from Wikipedia with pictures and it's 7G, very manageable, has most topics anyone would care about.
I saw that post about texas requiring app stores (specifically says mobile devices, so not the typical distro repository... yet) to have age verification. If they expand that, it would mean all linux distros, while maybe leaving the windows .exe downloads (ugh, shudder) alone. Wikipedia is probably more relevant in most folks' minds for having a backup though.
To paraphrase Stan Lee here, comics are like boobs. They look good on the internet, but there is just something special about holding them in your hands.
Girls with Slingshots, it ended over a decade ago, but I still love the characters. I realized if the author dies and stops renewing the website it could disappear. As a foundational part of my early twenties I couldn't accept that.
Can't remember who it was (b3ta? popbitch? penny-arcade?), but I recently saw a comment by someone who's been running a website since the turn of the millennium, and they said that fully 99% of the links they posted two decades ago were no longer valid.
To really put that into perspective, you have to remember that for most sites to get linked to from a popular site like that, meant that it was usually something of value that would have had a lot of work put into it, and that people found interesting or useful.
Years ago I bought a physical encyclopedia. I remember having one as a kid and using it for school reports. Also just looking through it can be cool. Learning about something you never knew existed is just a unique experience and doing it through a physical book just deepens the whole experience.
I also learned the practice of printing a physical encyclopedia is going out of fashion. I think there is only one company that still prints a yearly encyclopedia and it's not Encyclopedia Britannica of all things. Might have changed since I bought my copy but go give some physical media some love if you can.
I would add in some rom collections and book repositories as well. The whole library of Nintendo games is under a gig and would go a long way for entertaining people.
I would love to have a small Wikipedia browser that can survive the apocalypse.
E-ink display, mini keyboard and touchpad, multiple ways/ports to transfer info, All wrapped up in a heavy duty equipment case that's able to survive a building collapses and burns in an earthquake, that's shielded from EMP.
I would love to have a small Wikipedia browser that can survive the apocalypse.
I've got the full 120 GB Wikipedia dump running in Kiwix on a Raspberry Pi Zero. Works great (surprisingly)
E-ink display, mini keyboard
Have been using a Minimal Phone for a few months now which has both of those. Can connect to the Pi easily.
multiple ways/ports to transfer info,
Add a USB-C hub (or add a hub to the Pi) and you're set
All wrapped up in a heavy duty equipment case that's able to survive a building collapses and burns in an earthquake, that's shielded from EMP.
And that's where I'm limited - My 3D printer can only do so much lol. 😆
I've been working on a side project this week with a Orange Pi Zero 2W (Pi Zero "clone" but with better specs). It's got the Kiwix+Wikipedia like my older Pi (described above) plus a bunch of other neat stuff. It's kind of a combination travel router, portable web app server, party box, and extremely over-engineered bluetooth speaker all-in-one. Hoping to put together a show-and-tell post about it when I get the last of it squared away.
Very interested in your setup for that opi2w. I have one that is being retired from pihole duty that I'll be doing similar to. Also want to add an sdr to it so it can pull ghostnet js8call and the like.
Ooh, I haven't tried RTL-SDR on it yet, but I think I'm nearing capacity on what it can do at once lol.
Here's the block diagram for it (in spoiler below). Everything's up and running except the Bluetooth Receiver -> Snapcast (it works on the bench but I don't have the scripting/automation done yet). I'm also adding an SMA connector for an external antenna, but the new base part is still printing. Photo shows it "as is" of this writing.
SSL for the web apps was a PITA since I wanted real certs. Had to make a wildcard domain under my main hobby domain, so all my apps are like "https://{APP_NAME}.mobile.mydomain.xyz/"
As soon as I can get the Bluetooth + Pulseaudio scripting done, I'm gonna try to do a write up and maybe a show/tell post.
So I actually have a dockerized Debian/Ubuntu mirror I think is like 2 versions of Debian and the latest Ubuntu and still less than 1tb in total size. The English wikipedia is 50gb so overall not that much and very doable. However pretty unnecessary
If anyone is interested in philosophy, religion, or just want to archive it for historical reasons, IIRC sacred-texts.com has a USB version of their entire archive. They sell it, but I'm sure someone could find a work around there, if they were opposed to supporting them* for some reason. It's a massive collection of philosophical and religious works, and I believe they even have things like constitutions and legal works, as well.
*I know nothing about the people that run it or their ideology
For wikipedia you'll want to use Kiwix. A full backup of wikipedia is only like 100GB, and I think that includes pictures too.
Last time I updated it was closer to 120GB but if you're not sweating 100 GB then an extra 20 isn't going to bother anyone these days.
Also, thanks for reminding me that I need to check my dates and update.
EDIT: you can also easily configure a SBC like a Raspberry Pi (or any of the clones) that will boot, set the Wi-Fi to access point mode, and serve kiwix as a website that anyone (on the local AP wifi network) can connect to and query... And it'll run off a USB battery pack. I have one kicking around the house somewhere
Just built one of those using Dietpi as the OS and NVME M.2 for the storage. I have many different ZIMs and running different services and only using about 270GB.
Works great for offline use. Probably should add an ISO or 2 as well.
What other services are you running?
@[email protected] asked what else I was running in a sibling comment to yours and I didn't have an answer because I'm not... yet : )
DietPi makes it dead simple to run most of these things as their "software suite" is pretty robust and simple to setup.
For "user facing" applications:
For "admin side" stuff:
I try to use containers from LinuxServer.io whenever possible. Mostly just cause it's what I do on my main server.
I'm still looking at adding/removing things as I get more time to sit down but I'm pretty happy with its current state.
Do you recommend adding anything else to it?
For instance, OSM maps?
I've been thinking about running the Kiwix app + OSMAnd on an old Android phone and auto updating it once a year.
That's a good question (and good idea) that I hadn't really thought about past a collection of ZIMs. The one I built advertises its own AP SSID that anyone can connect to and then access the ZIMs that are served via kiwix-serve on HTTP/80. That is, I wanted a single, low power, headless device that multiple people could use simultaneously via wifi and browser rather than a personal device.
I hadn't really thought about other helpful services past that. I mean, we've got a (wee) server so why not use it? I like the idea of OSM and their website is open source but has a lot of dependencies :
I wonder how hard it would be to host everything it needs locally/offline... and what that would do to power consumption : )
Thanks for the idea - something to look into, for sure.
Saw your comment on mine and finally saw this one.
I'm gonna take a look at openstreetmap-tile-server and see about running that since if all has gone to shit, who knows if GPS will work. Least it's almost like a paper map and can be auto-updated as long as we still have internet. Quick Gist someone wrote here.
Yeah, I feel the same in that it's assuredly doable, but how hard is it?
If you're able to dig into and make some progress, please tag me because I'm interested but don't have much time these days.
I might beat you to it. I've got Kiwix running in docker, just did a PR to the kiwix-zim-updater so it can run in Docker on a cron schedule next to the server, and have spun those up with Karakeep (self-hosted web archive I use for bookmarking).
Right now I'm adding a ZIM list feature to the updater to list available ZIMs by language, and then I'll move on to OSM.
You'll definitely beat me to it : D
Do me a favor and tag me when you post your how to?
I will do my best to remember hah
Yeah also if you make a Zim wiki or convert a website into Zim then you can run that stuff too. If you use Emacs it's easy to convert some pages to wikitext for Zim too
120GB not including Wikimedia 😉
Also, I wish they included OSM maps, not just the wiki.
You can easily download planet.osm, I think it's a couple of TB for the compressed file.
You can also offline the whole of Project Gutenberg with Kiwix, it's about 70GB IIRC.
I wonder if there's any way to edit these files afterwards? They tend to be read only, right? I must confess, I don't have too much experience with this myself.
It's probably hundreds of thousands of HTML files, no? What is the fear about being able to edit or not?
I believe kiwix uses zim files.
Okay, I'm unfamiliar with both. Well, I still don't understand why read-only state matters; are you concerned about tampering?
Well I think it would be cool to be able to fix/edit any inaccurate articles, or pages that may have been messed with by trolls, or to update with more up to date info.
Oh, well, yeah, you can do that with Wikipedia as it is. I would be surprised if you couldn't edit a local file.
The English Language Wikipedia probably wouldn't be hard, or Debian Stable.
All of Debian's packages might be a tad more expensive, though.
This might be a good place to start for Wikipedia;
https://meta.wikimedia.org/wiki/Data_dump_torrents#English_Wikipedia
And the English with no pictures is even smaller
And you can use Kiwix to set up a locally hosted Wikipedia using the data dumps
It depends if you want the images or previous versions of Wikipedia too. The current version is about 25 GB compressed, the dump with all versions is apparently multiple terabytes. They don't say how much media they have, but I'm guessing it's roughly "lots".
"backups"? Pray tell, fine sir and or madam, what is that?
You know there's only two kinds of people, those who do backups and those that haven't lost a hard drive/data before. Also: RAID is not a backup
Still remember the PSU blast taking out my main drive plus my backup drive in like 2001. I thought I was so good because I at least had a backup 😑. Those were the days 🤷🏻♀️
That sounds like an adventure!
Ya, me learning that a dinky psu is your worst enemy, i upgraded my SOs old duron to an athlon for work, which used more energy...
My condolences! That said Athlons were late 90s (?) cool.
I stumbled across this sort of fascinating area of doomsday prepping a few weeks back.
https://prepperpress.com/usb/
A nice addition to that, don't just make it a USB, but a raspberry pi. So you'd have a reasonably low-powered computer you could easily take with you.
Not suggesting this one as it seems a bit expensive to me, but https://www.prepperdisk.com/products/prepper-disk-premium-over-512gb-of-survival-content?view=sl-8978CA41
Just built one of these myself. I went NVME M.2 instead of SD Card to avoid data corruption. I know SD Cards are fine if you don't write to them a lot but if you wanna update or add your own stuff, scares me. Plus NVME is just so much faster.
How would you access the info if electricity permanently goes out?
You find a generator, or solar panels, or wind mill, or water turbine, or a bicycle hooked up to a generator.
If electricity permanently goes out then we're in a scavenger situation and it is time to start taking apart things that are no longer necessary to build the things that are.
Hey finally a good use for all of those cars, grab the alternators out for small generators (since bicycles are the ultimate apocalypse vehicle: simple, small, easy to maintain and don't require complex fuels)
You only need 20 watts of power. One of those dinky fold up solar panels would work. Add a USB power brick for cloudy days.
2W for a RPi Zero with data on a microSD
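For anyone wanting to sanity check those wattage figures against a USB power brick, here's a rough sketch of the arithmetic. The 10,000 mAh pack, the 3.7 V cell voltage and the 85% conversion efficiency are all assumptions for illustration, not measurements, and `runtime_hours` is just a made-up helper name.

```python
def runtime_hours(battery_wh: float, load_w: float, efficiency: float = 0.85) -> float:
    """Estimated hours a battery pack can power a constant load.

    efficiency is an assumed figure for voltage-conversion losses.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return battery_wh * efficiency / load_w


# Hypothetical pack: 10,000 mAh at a nominal 3.7 V cell voltage = 37 Wh.
PACK_WH = 10_000 / 1000 * 3.7

for label, watts in [("20 W budget", 20.0), ("2 W Pi Zero", 2.0)]:
    print(f"{label}: about {runtime_hours(PACK_WH, watts):.1f} h")
```

Swap in your own pack capacity and measured draw; the point is only that a 10x lower load buys roughly 10x the runtime.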
You're going to need a monitor as well.
I have a PaPiRus ePaper eInk e.g. https://media.digikey.com/pdf/Data%20Sheets/Pi%20Supply%20PDFs/PaPiRus_ePaper_Web.pdf and even though I don't know the watts for a refresh, I assume it's one of the lowest-power solutions you can use.
PS: FWIW if you don't refresh, the display can keep the information on for months, if not years.
Eink displays are pretty awesome for this sort of thing, I repurposed a kobo ereader as a household info display and it worked nicely. Those PaPiRus screens look easier to interface with, but a little small for reading wikipedia articles. They'd do in a pinch, but the eyestrain would have me looking for a bigger solution.
Pretty much what Sinthesis said; USB power brick and/or solar panels. Both at the ready and tested. Also got a big ass battery backup that will charge off solar panels.
at this point why not just use a phone running postmarketos?
You'd first have to buy a phone that can run postmarketos and these are much rarer than I wish they were. Is there even anything new that can run it? Pine64 stopped making phones and said they'll make a new one when they can make it RISC-V.
Fairphone maybe I guess. 4 is listed as a supported device, but someone has gotten it working on 6 too.
there's lots of devices it runs on iirc, something like the pixel 3a can be had for less than a new rpi3b+ where I live
But hardware ages and dies. Will you trust a pixel 3a for the next 10 years? I'd rather have a new device for this.
Cause if ya wanna go overboard like I did, 1TB of NVME storage, can add with SD Card if necessary. 16GB RAM. Very little learning curve for my part as I use SBCs often. Plus almost every Docker container and program I want works on RPi without any hassle.
There's also more robust guides and community for RPi.
Just my thoughts.
Last I checked (3 years ago) postmarketOS drained the pinephone battery in record time :(
How would one go about making an offline copy of the repos? Asking for a friend.
Start from here https://wiki.debian.org/DebianRepository/Setup
Arch: https://wiki.archlinux.org/title/DeveloperWiki:NewMirrors
The official repo is only about 80GB, I have an old copy from when I was running an airgapped system. Not sure about the AUR, it's probably in the TBs range though.
AUR might not be as big as you think. It wouldn't work for offline tho since many AUR packages pull archives from websites during the build process.
Oh yeah, didn't think about that. Having a bunch of PKGBUILD files isn't very useful.
I guess you could compile all the things, if you had a lot of spare processing power. A once a year snapshot would probably be enough for anything short of a Mad Max future.
That future might not be far off considering what Trump did today. Balance of power is seriously about to shift.
You ain’t wrong
Wikipedia has torrents of the text, but you'd have to download images separately.
Debian, and its base packages have mirroring instructions here. Third party repos would need mirroring separately.
Hmm, 3tb to mirror all the architectures I have around is very tempting. I have an unused dual bay external, and another one that desperately needs (wants) to have its 4tb drives upgraded.
Curious about the mindset of the one (so far) person who has downvoted this post. What is there to dislike about archiving Linux and Wikipedia? 🤔
They are probably using a phone app which allows you to swipe sideways to downvote and also using screen gestures to 'go back'. I've accidentally downvoted things this way.
I accidentally downvoted this comment
I upvote your downvote.
I was trying to go back, sorry
Kate, we have to go back!
They firmly believe that: Real men don't do backup, they cry instead.
Real men don't do backups, they write the data back to disk by memory.
Directly to disk, using a microscope and an electron gun.
Nah man, real men use butterflies.
The flapping of the wings creating turbulence creating small lenses to flip the bits manually.
Relevant XKCD
When did handwriting become so passe, anyway?
I use the website and it always automatically downvotes the 3/4th post.
I have been archiving Linux builds for the last 20 years so I could effectively install Linux on almost any hardware since 1998-ish.
I have been archiving docker images to my locally hosted gitlab server for the past 3-5 years (not sure when I started tbh). I've got around 100gb of images ranging from core images like OS to full app images like Plex, ffmpeg, etc.
I also have been archiving foss projects into my gitlab and have been using pipelines to ensure they remain up-to-date.
the only thing I lack are packages from package managers like pip, bundler, npm, yum/dnf, apt. there's just so much to cache it's nigh impossible to get everything archived.
I have even set up my own local CDN for JS imports on HTML. I use rewrite rules in nginx to redirect them to my local sources.
my goal is to be as self-sustaining on local hosting as possible.
respectable level of hoarding 🏅
Everyone should have this mindset regarding their data. I always say to my friends and family, "If you like it, download it.". The internet is always changing and that piece of media that you like can be moved, deleted, or blocked at any time.
The pornhub collapse should have taught the average person that.
You're awesome. Keep up the good work.
Neither are that bad honestly. I have jigdo scripts I run with every point release of Debian and have a copy of English Wikipedia on a Kiwix mirror I also host. Wikipedia is a tad over 100 GB. The source, arm64 and amd64 complete repos (DVD images) for Debian Trixie, including the network installer and a couple live boot images, are 353 GB.
Kiwix has copies of a LOT of stuff, including Wikipedia on their website. You can view their zim files with a desktop application or host your own web version. Their website is: https://kiwix.org/
If you want (or if Wikipedia is censored for you) you can also look at my mirror to see what a web hosted version looks like: https://kiwix.marcusadams.me/
Note: I use Anubis to help block scrapers. You should have no issues as a human other than you may see a little anime girl for a second on first load, but every once in a while Brave has a disagreement with her and a page won't load correctly. I've only seen it in Brave, and only rarely, but I've seen it once or twice so thought I'd mention it.
I rarely get bounced by Anubis, but oddly enough it has happened to me a couple times in FF, I suspect it’s the fingerprinting resistance settings that cause this to happen? Hasn’t happened in a while though
I also recommend downloading “Flashpoint archive” to have flash games and animations to stay entertained.
There is a 4gb version and a 2.3TB version.
That's quite the range
When I downloaded it years ago it was 1.8TB. It’s crazy how big the archive is. The smaller one is just so it’s accessible to most people.
Is that Flash exclusive or do they accept other games from that era?
I’m not sure, but I do think it’s just flash
What happens when they just cut the underwater cables? Torrent over carrier pigeon for a linux distro would take ages
Sneakernet to the rescue. Some of you are too young to know about walking around with boxes full of disks.
A wise man once said
It was trading CD-R's during my high school days.... good times. Napster was just starting to take off by the time we had a CD-R trading network set up, Napster just increased the amount of CD's that got passed around.
Pigeon latency is horrible, but the bandwidth is pretty great. You could probably load up an adult pigeon with at least 12TB of media.
https://en.wikipedia.org/wiki/IP_over_Avian_Carriers
Just gonna leave this here for whoever wants to read more on the methodology and potential risks.
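To put a number on "horrible latency, great bandwidth", here's a back-of-the-envelope sketch. The 12TB payload is the figure from the comment above; the 2-hour flight time is purely an assumption, and `effective_throughput_mbps` is a made-up helper name.

```python
def effective_throughput_mbps(payload_tb: float, trip_hours: float) -> float:
    """Average throughput in megabits per second for one physical delivery."""
    if trip_hours <= 0:
        raise ValueError("trip_hours must be positive")
    payload_bits = payload_tb * 1e12 * 8  # decimal terabytes to bits
    return payload_bits / (trip_hours * 3600) / 1e6


# 12 TB comes from the thread; the 2-hour trip is an assumed flight time.
rate = effective_throughput_mbps(12, 2)
print(f"{rate:,.0f} Mbit/s average, with a latency of 2 hours")
```

Under those assumptions a single bird lands in the multi-gigabit range, which is why sneakernet keeps coming back.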
Compared to what I use at home now, this sounds great
A good way to see what the future of places like the U.S are is to look at places like North Korea, where they do exactly this, move files around on flash media to avoid the state censors.
Tiny jump drives on pigeons is low key excellent imo
We need some more community wifi projects
Community Wisps are cool
@Maroon I thought torrent technology to be a godsend for package managers.
Why none of them use it?
I mean, damn.
@AnimalsDream
Turns out hosting a bunch of files is very cheap.
Torrents are often used for installers, but for packages it tends to be more trouble than what it's worth. Is creating a torrent for a 4k library worth it?
git and the lot are a lot better at this than people realize.
magnetlink to any linux package repo torrent if you don't mind?
i don't wanna scrape since it takes forever and burdens
FWIW :
but if you want the easier version just get Kiwix on whatever device in front of you right now (yes, even mobile phone assuming you have the space) then get whatever content you need.
If need a bit of help I recorded TechSovereignty at home, episode 11 - Offline Wikipedia, Kiwix and checksums with a friend just 3 weeks ago.
I also wrote and randomly update https://fabien.benetou.fr/Content/Vademecum and coded https://git.benetou.fr/utopiah/offline-octopus but tbh KDE-Connect is much better now.
The point though is having such a repository takes minutes. If you don't have the space, buy a 512GB microSD for 50EUR then put that on, stuff it in a drawer then move on. If you want to, every 3 months or whenever you feel like it, update it.
TL;DR: takes longer to write such a meme than actually do it.
Watch out for flash data corruption. Lots of cheap flash (USB sticks, SD cards, SSDs) lose data after just a few years of offline storage. Something something quantum tunnel bullshit, iirc.
So either look for media that guarantee long cold storage retention (lots of businesses need to keep shit for 10 years for tax reasons), or occasionally plug it in and let it do the housekeeping.
It's more that flash NAND uses a small electric charge to keep the NAND gates in the correct configuration. Over time, that charge dissipates. If you power the storage device every once in a while, you minimize these chances.
Here's a video explaining why it happens to Wii U's after being powered off for a while. https://youtu.be/JHME4zLs6Qs
Using older flash tech can be useful here. You might not always need the highest density storage if you want to maintain files for a long time. Getting stuff built in a much larger process node makes for a much more stable form of storage.
Or look for industrial / business grade stuff with long retention times. Old flash also means less sophisticated controllers etc
Thanks but even though it's on a plugged HDD I don't even care for any of that data. What I mean is that none of that data is sensitive. It might be useful, potentially, but it's not unique. What I mean is that if somehow my `.zim` file for Wikipedia was corrupted I could download it again from https://library.kiwix.org/#lang=eng&category=wikipedia or elsewhere in ~30min (just checked). What I'm trying to highlight here is more the process than the actual outcome.
TL;DR: yes, if one is actually serious about just getting and storing, they should verify periodically if the data is indeed fine. What I do want to highlight though is to first know how to do it at all. Anyway, you are right that for a proper solution on the long run one must understand how (cold) storage actually works. My heuristic is that it's like canned food (which I don't use much), it might last a while, but not forever.
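For the "verify periodically" part, a minimal sketch of what that could look like with nothing but the standard library. It assumes you saved a known-good SHA-256 digest when you first downloaded the file; where that digest comes from and the function names are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a 100 GB archive never has to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_hex: str) -> bool:
    """Compare the file's current digest against the one recorded earlier."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Run `is_intact(Path("wikipedia.zim"), saved_digest)` (file name hypothetical) from a cron job or whenever you plug the drive in; a `False` means it's time to download again.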
I thought the point of backing stuff up was to have things in case just downloading it again isn't a viable option?
It can be but not to me. To me the point is to test what's actually feasible and usable. It can be Wikipedia on my HDD but it could also be SO on a microSD or a RPi ... or it could be something totally different on another piece of hardware with another piece of storage. It will depend on the context.
So again, sure, having the data itself feels nice but in practice I never really needed it. If tomorrow my HDD would die I would shrug. If tomorrow Kiwix library wouldn't work anymore, I'd be disappointed but I could rely on `.zim` file elsewhere, e.g. on torrent trackers. IMHO the point isn't files, the point is usable knowledge.
Edit : to be clear this isn't philosophy, you can see exactly what I mean and even HOW I do it (and even when) with the edits of my public wiki or my git repositories.
Whoa, what are all those things you have?
Commenting inline :
By the way, there's now a Wikipedia 2025 snapshot.
I am currently trying to fit that on my phone somehow. I wish I could just omit the index database at the end that can't be split it seems. I have to keep it, but when it's split up, it doesn't work anyway (search is broken that way) (https://github.com/openzim/zim-tools/issues/295).
My phone can only do FAT32 for SD cards...
For 2024 Wikipedia, that seems to be around 18GiB of wasted space.
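For context on why FAT32 forces the split in the first place: a single file on FAT32 can be at most 4 GiB minus one byte. Here's a small sketch of the chunk-count arithmetic; the 100 GiB archive size is a hypothetical example, and this does nothing about the unsplittable index problem described above.

```python
FAT32_MAX_FILE_BYTES = 4 * 1024**3 - 1  # largest single file FAT32 can hold


def parts_needed(total_bytes: int, part_bytes: int = FAT32_MAX_FILE_BYTES) -> int:
    """Number of chunks needed so that every chunk fits on FAT32."""
    if not 0 < part_bytes <= FAT32_MAX_FILE_BYTES:
        raise ValueError("part_bytes must be between 1 and the FAT32 limit")
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    return -(-total_bytes // part_bytes)  # integer ceiling division


# Hypothetical 100 GiB archive, purely for illustration.
print(parts_needed(100 * 1024**3))
```

Note that a file of exactly 4 GiB already needs two parts, which trips people up.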
Thanks, updating (~20min) accordingly.
FWIW I have a CMF Nothing 1 and I can put a 500GB microSD in it.
I've got Ulefone Armor 24. It can take a 1TB Micro SD, but only FAT32. Why a Linux-based OS can only do FAT32, despite supporting other FSs on internal storage goes beyond me.
Weird, assuming you have Android 13 it should be usable at least as exFAT and thus can be large enough
Unfortunately, this is rather dependent on manufacturer (or rather how much they can fuck up).
Android 14, but without exFAT support.
I tried multiple, exFAT, ext4, f2fs, NTFS, nothing else works.
Yeah not gonna lie, i think i heard someone in a youtube video a while back talk about how the entirety of wikipedia takes up like 200 gigs or something like that, and it got me seriously considering to actually make that offline backup. Shit is scary when countries like the uk are basically blocking you from having easy access to knowledge.
https://library.kiwix.org/#lang=eng&category=wikipedia
Yeah, it’s surprisingly small when it’s compressed if you exclude things like images and media. It’s just text, after all. But the high level of compression requires special software to actually read without uncompressing the entire archive. There are dedicated devices you can get, which pretty much only do that. Like there are literal Wikipedia readers, where you just give it an archive file and it’ll allow you to search for and read articles.
if you remove topics you are not interested in it can shrink even more
Sure, but removing knowledge kind of goes against what creating a Wikipedia backup is about..
If my experience with mashing the random article button is any indicator, you could reduce the size by 30% just by removing articles on sports players. I doubt I'll need those
Well, i doubt i will ever need to know anything about a football player or a car
"Fellow survivors, oh my God! What are your names?"
"I'm OJ Simpson. This is my friend Aaron Hernandez. And this is his car, Christine."
UKGOV haven't started on things like Wikipedia yet. They know kids use it for school and blinded by ideology though they are, even they can see there'd be an enormous backlash if they blocked it any time soon.
If that's going to happen at all, I doubt it would be before the next election. That's whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they've taken a Tory-drafted policy and made it their own.
Ironically, the up and coming third option fascist party, have said they're going to repeal the Online Safety Act. They have other fish to fry if they get in, and they'll want to keep their preferred demographic(s) happy while they do it.
I assume that eventually something like the OSA would come back to "protect the children". They love the current US President.
None of this is hopeful. Take this as more of a rant.
Every day it seems the entire west is gonna be a fascist hellhole in a decade
I'm certain that when UK forces DigitalID upon the nation it will be a requirement for access to every website
Is there a context to this or just random thought?
You can ignore politics, but politics will not ignore you.
Is there a political movement targeting Debian and Wikipedia?
Conservatives hate knowledge, learning is toxic to them. Also the people who start with burning books usually end up burning people eventually
Removing books about sucking cum out of anuses from public schools isn't really "burning books." You can still buy them whenever you want, just not putting them in taxpayer funded schools with children.
EDIT: Had to add some details of the "books being burned [but really just removed from public school]":
During public comment, one woman read a passage from “Yolo” by Lauren Miracle which is found in Freedom High School.
“I climbed onto of him and started kissing him in a way that said very clearly here I am, I’m ready to have sex,” the speaker read.
Another title, “Anatomy of a Single Girl” by Daria Snadowsky, was also read by a speaker.
“Guy tries rubbing my clitoris with his fingers, he wiggles his pelvis back and forth,” another woman read from the book.
“This is ridiculous that this school – any school – has this book,” the woman said to the board.
Julie Gebhards, the woman seen in the first video of our story, is a Hillsborough County mom of six children.
Gebhards read an excerpt from the book “Invisible Monsters Remix” by Chuck Palahniuk. According to the district’s online book library, the title is found in Steinbrenner High School.
“He shoots his load, and then plants his mouth on your anus and sucks out his own warm sperm, plus whatever lubricant and feces are present. That’s felching. It may or may not, I add, include kissing you to pass the sperm and fecal matter into your mouth,” Gebhards said.
We should ban mention of Christianity in public. We should also make it illegal for anyone to teach their children Christianity. Practicing Christians should be declared mentally ill, and if they practice their faith in front of children, they should be put on the sex offender registry.
These freaks actually put giant statues of a naked bleeding man up on full public display in buildings. And they believe the most holy book in the world is one that features incest, murder, rape, genocide, and often fully endorses these horrors. Their main ritual is a form of public ritual cannibalism.
Christians are too dangerous to be allowed near children.
Oh no sex scenes!!! /s
Prude american?
As if kids aren't finding shit way worse on the internet on a daily basis. Well... maybe not felching that's pretty vile. But still.
Oh no! What have you done! Now, I want to go try felching because I saw a message about it online with no context and I just. Have. To. Try. It.
... Oh wait, no. No, I don't. Pfew!
So what was your point again?
You aren't making the point you think you're making.
https://gizmodo.com/elon-musks-wikipedia-competitor-is-going-to-be-a-disaster-2000665751
Debian? Not that I’m aware of.
Yeah I heard of wikipedia, but not debian.
gestures at everything
If you do this please share your IP so I can use your backup too
You can find me at ::1
Unlike OP, I'm not some hacker trying to get your IP address. I just need your regular address? :)
I can answer one part of your question. Yes, it's not as big as you think it is.
does this include images?
No
With images, it is 111,08 GB
That's still incredibly low, I'd have assumed an enormous increase.
Compressed or uncompressed? Can it be directly read?
Can be read directly, like normal Wikipedia.
That's very nice. Does it also include other languages, or would that take more space?
This is English only. Other languages are downloaded separately, though they typically take less space.
Nice.
How about, when included previous versions of pages? (excluding images)
Not sure, not having that option. Can imagine not much more, if proper version history management is involved.
Yeah, seems like there's nothing as simple as something similar to a `git clone` available. One would probably have to download multiple full copies from different times and then merge them with deduplication, to get that answer.
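The "merge with deduplication" idea could be sketched roughly like this: store each distinct article text once, keyed by its hash, and record per title which snapshot introduced which version. This is a toy under stated assumptions (snapshots already parsed into title-to-text dicts, which real dumps are not), and `merge_snapshots` is a hypothetical name, not an existing tool.

```python
import hashlib


def merge_snapshots(snapshots):
    """Merge (label, {title: text}) snapshots, storing identical texts once.

    Returns (blobs, history): blobs maps digest -> text, history maps
    title -> list of (label, digest) for each snapshot where the text changed.
    """
    blobs = {}
    history = {}
    for label, articles in snapshots:
        for title, text in articles.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            blobs.setdefault(digest, text)  # unchanged text costs nothing extra
            revisions = history.setdefault(title, [])
            if not revisions or revisions[-1][1] != digest:
                revisions.append((label, digest))
    return blobs, history
```

Comparing the total size of `blobs` against the sum of the raw snapshots would give a crude answer to how much extra space version history takes.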
Sorry, I'm out of the loop. Is there something particular that triggered this that I missed?
gestures broadly
The broad censorship of government data in the US, combined with the recent political attacks on Wikipedia caused me to download the whole English Wikipedia earlier this year. Guessing OP is similar
Not sure why they'd download Debian with all packages though
Edit: I should mention it's less about a potential loss of Wikipedia as it is a personal source of truth on politically sensitive topics that get censored, or turned to propaganda by bots
For example the Wounded Knee Massacre. Pete Hegseth has recently been calling it the, "Battle of Wounded Knee". I wouldn't be surprised if the current administration went to war with Wikipedia and forced them to 1) Change articles they disagree with, and 2) Hide those changes from history
My rationale with Debian is that distros are kind of like portals to entire compendiums of free and open-source software. With the increasing attacks on vpns in particular right now, I'm concerned there are any number of programs we take for granted that we might not have access to soon.
The internet is already deeply enshittified. There is a real possibility that it will no longer be a free and open web in any capacity soon. So it's past time to make archives, and start setting up meshnets.
I had downloaded the full (no pictures) Wikipedia earlier this year for exactly this reason. This thread told me about kiwix, which is awesome, so I downloaded the "Wikipedia .08" using kiwix, which is the best 45,000 articles from Wikipedia with pictures and it's 7G, very manageable, has most topics anyone would care about.
Well for starters, teachers have had to start telling students that .gov websites are no longer considered credible sources for research.
Nice!
I saw that post about texas requiring app stores (specifically says mobile devices, so not the typical distro repository... yet) to have age verification. If they expand that, it would mean all linux distros, while maybe leaving the windows .exe downloads (ugh, shudder) alone. Wikipedia is probably more relevant in most folks' minds for having a backup though.
Nothing in particular that I'm aware of, just a growing recognition that things are very much not well in the US these days.
Yeah I wonder too.
Last year I bought a hard copy of my favorite webcomic in case the website goes down.
To paraphrase Stan Lee here, comics are like boobs. They look good on the internet, but there is just something special about holding them in your hands.
Which webcomic?
Girls with Slingshots, it ended over a decade ago, but I still love the characters. I realized if the author dies and stops renewing the website it could disappear. As a foundational part of my early twenties I couldn't accept that.
I'll have to check it out. Thanks for the recommendation.
Wait, isn't there an offline copy of a part of Wikipedia? The article Just buy yourself a nice printer with enough ink and do it yourself ;)
It could cost a bit if you wanted to keep it up to date.
https://what-if.xkcd.com/59/
we need all repos to be stored offline, and documentations to troubleshoot.
the 1st i have no idea how much space we will need. Most linux packages are pretty light, no? But there is A LOT of them...
the 2nd is easy. Heard someone say the entirety of wikipedia is 200GB, should be doable. Dont forget the technical wikis too: Debian, Gentoo, Arch.
Can't remember who it was (b3ta? popbitch? penny-arcade?), but I recently saw a comment by someone who's been running a website since the turn of the millennium, and they said that fully 99% of the links they posted two decades ago were no longer valid.
To really put that into perspective, you have to remember that for most sites to get linked to from a popular site like that, meant that it was usually something of value that would have had a lot of work put into it, and that people found interesting or useful.
It’s truly devastating how much of the old internet has died to the corporations taking over the internet.
The official USBs of Trixie fit all 28 DVDs of AMD64 on a 256GiB USB stick
https://www.linuxcollections.com/products/debian/debianusb.htm?id=51007
You'd probably want the 512GiB with all the sources for a real backup in this scenario
Years ago I bought a physical encyclopedia. I remember having one as a kid and using it for school reports. Also just looking through it can be cool. Learning about something you never knew existed is just a unique experience and doing it through a physical book just deepens the whole experience.
I also learned the practice of printing a physical encyclopedia is going out of fashion. I think there is only one company that still prints a yearly encyclopedia and it's not Encyclopedia Britannica of all things. Might have changed since I bought my copy but go give some physical media some love if you can.
Get out of my mind.
Did I miss something? Whats happening to debian stable?
debian stable became the go to distro for long term usage in case our FOSS support structure goes haywire due to wars
I would add in some rom collections and book repositories as well. The whole library of Nintendo games is under a gig and would go a long way for entertaining people.
Book repos? I didn't know such a thing existed. Can you share more?
Project Gutenberg has a large collection of public domain books
Thank you kindly
I would love to have a small Wikipedia browser that can survive the apocalypse.
E-ink display, mini keyboard and touchpad, multiple ways/ports to transfer info, all wrapped up in a heavy duty equipment case that's able to survive a building that collapses and burns in an earthquake, and that's shielded from EMP.
You mean like the wiki reader:
I used it as an ebook reader until the screen gave out.
Sounds like the beginning of a proper Hitchhikers Guide to the Galaxy.
Actually having something telling me Don't Panic in big friendly letters would help my mental health...
I've got the full 120 GB Wikipedia dump running in Kiwix on a Raspberry Pi Zero. Works great (surprisingly)
Have been using a Minimal Phone for a few months now which has both of those. Can connect to the Pi easily.
Add a USB-C hub (or add a hub to the Pi) and you're set
And that's where I'm limited - My 3D printer can only do so much lol. 😆
I've been working on a side project this week with a Orange Pi Zero 2W (Pi Zero "clone" but with better specs). It's got the Kiwix+Wikipedia like my older Pi (described above) plus a bunch of other neat stuff. It's kind of a combination travel router, portable web app server, party box, and extremely over-engineered bluetooth speaker all-in-one. Hoping to put together a show-and-tell post about it when I get the last of it squared away.
Very interested in your setup for that opi2w. I have one that is being retired from pihole duty that I'll be doing similar to. Also want to add an sdr to it so it can pull ghostnet js8call and the like.
Ooh, I haven't tried RTL-SDR on it yet, but I think I'm nearing capacity on what it can do at once lol.
Here's the block diagram for it (in spoiler below). Everything's up and running except the Bluetooth Receiver -> Snapcast (it works on the bench but I don't have the scripting/automation done yet). I'm also adding an SMA connector for an external antenna, but the new base part is still printing. Photo shows it "as is" of this writing.
SSL for the web apps was a PITA since I wanted real certs. Had to make a wildcard domain under my main hobby domain, so all my apps are like "https://{APP_NAME}.mobile.mydomain.xyz/"
As soon as I can get the Bluetooth + Pulseaudio scripting done, I'm gonna try to do a write up and maybe a show/tell post.
:::spoiler Block Diagram :::
:::spoiler Current Case :::
I keep a wiki copy as well as Reddit pre-fuckuspez. A Debian archive copy sounds like a good idea.
the whole reddit? how big is it?
Ask OpenAI, since it is in their dataset 😂
Speaking of, how do I back up the entirety of ChatGPT 4? I've got a couple of spare SD cards lying around.
I'm also curious about the reddit archive. Did you copy it yourself or is this available somewhere?
I got it from Archive.org. There was a monthly dump. I can't easily find it but that's where I got it from.
don't let your dreams be dreams, friend
This is just minor datahoarding. I do it, on an extreme level.
Or, in this post fact era just generate a wiki with a hallucinating AI instead.
https://github.com/XanderStrike/endless-wiki
Honestly this project looks like a lot of fun.
I would also add Openstreetmap to the list
So I actually have a dockerized Debian/Ubuntu mirror. I think it's like 2 versions of Debian and the latest Ubuntu, and still less than 1TB in total size. The English Wikipedia is 50GB, so overall not that much and very doable. However, pretty unnecessary.
At this point I just keep it because I'm too lazy to change the apt sources files for the VMs/physical PCs in my network again.
I downloaded wikipedia a month or two ago, I recommend it.
How big is Wikipedia?
If you don't care about edit history, and only care about English, there are zim files w/ images for <150 GiB
Wait why keep Debian? What happened to Debian?
Nothing, it's probably an attempt to have something stable and unchanging, so that aging doesn't show much.
The meme doesn't seem to be about Debian becoming bad, more like data hoarding.
Don't forget 3-2-1 when you do!
What is that?
3 copies of data, 2 of which are on different storage media (HDD, tape drive, etc.), 1 at an offsite location.
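For anyone who wants to sanity-check their own setup against that rule, here is a minimal sketch. The `BackupCopy` class, the `satisfies_321` helper, and the example copies are all made up for illustration, and it reads "2 on different media" as "at least two distinct media types".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical model, not from any backup tool)."""
    name: str
    medium: str      # e.g. "hdd", "tape", "cloud"
    offsite: bool    # True if the copy lives at another location


def satisfies_321(copies):
    """Return True if the copies meet 3-2-1: 3+ copies, 2+ media, 1+ offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Example plan (assumed): NAS + external tape at home, plus one cloud copy.
plan = [
    BackupCopy("nas", "hdd", offsite=False),
    BackupCopy("tape-shelf", "tape", offsite=False),
    BackupCopy("cloud-bucket", "cloud", offsite=True),
]
print(satisfies_321(plan))
```

Drop the cloud copy from `plan` and the check fails twice over: only two copies, and nothing offsite.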
Don't forget the copies sealed in faraday containers.
💯
This post foreshadowed today's AWS outage.
👀
Official numbers here https://www.debian.org/mirror/size
About 4.4TB, but that's all architectures and (I believe?) all distributions (stable, testing...).
If you only want source+all+amd64+arm64, and only want stable, it will be smaller of course.
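If you want to try the trimmed-down version, one common tool is `debmirror`. Below is a rough sketch that only builds the command line so you can look at it before running anything. The helper name `debmirror_command`, the target path, the `deb.debian.org` host, and the choice of the `main` section are my assumptions, not something from the numbers above. As far as I know, arch-independent ("all") packages come along with each binary architecture, so they don't need their own `--arch` entry.

```python
import shlex


def debmirror_command(target_dir, arches=("amd64", "arm64"), dists=("stable",),
                      sections=("main",), host="deb.debian.org",
                      include_source=True):
    """Build (but do not run) a debmirror invocation for a partial mirror."""
    argv = [
        "debmirror",
        f"--host={host}",
        "--root=debian",
        "--method=http",
        f"--dist={','.join(dists)}",
        f"--section={','.join(sections)}",
        f"--arch={','.join(arches)}",
        # debmirror mirrors source packages unless told otherwise
        "--source" if include_source else "--nosource",
        "--progress",
        target_dir,
    ]
    return argv


# Hypothetical target directory; adjust to wherever the mirror should live.
cmd = debmirror_command("/srv/mirror/debian")
print(shlex.join(cmd))
```

You will probably also need to give it a keyring so it can verify the Release signatures, and a `--dry-run` first is a good idea to see how much it wants to download.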
Not nothing, but at $10/TB or so, it's not much.
And if you're following 3-2-1, I'm pretty sure the "1" is already handled for you :)
Kinda curious where you’re getting $10/TB from
You're right, for new drives it looks like a little more, with this 20TB retailing for $230, or $11.50/TB.
For refurbished, I recently got a factory renewed 12TB Seagate for $112 ($9.33/TB), but that price is now up to $199 for the same drive (!).
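Quick check of that arithmetic, as a small sketch. It assumes the new drive is 20 TB (which is what the $11.50/TB figure implies) and uses the roughly 4.4 TB full-mirror size quoted earlier; the function name and dictionary labels are just for illustration.

```python
def price_per_tb(price_usd, capacity_tb):
    """Dollars per terabyte, rounded to cents."""
    return round(price_usd / capacity_tb, 2)


# Prices and capacities as quoted in the comments above.
drives = {
    "new 20TB": (230, 20),
    "renewed 12TB (what I paid)": (112, 12),
    "renewed 12TB (current price)": (199, 12),
}

for label, (price, capacity) in drives.items():
    print(f"{label}: ${price_per_tb(price, capacity):.2f}/TB")

# Rough cost of the disk space for a full ~4.4 TB mirror at the new-drive rate.
mirror_cost = round(4.4 * price_per_tb(230, 20), 2)
print(f"full mirror at new-drive pricing: about ${mirror_cost:.2f}")
```

At the current refurbished price the same 12TB drive works out to well over $16/TB, so the "$10/TB or so" figure really depends on when you buy.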
Okay so where do I find some cheap hard drives? Europe if possible :-)
Look for DVRs. They have huge HDDs in them and you can find them at thrift stores for cheap.
You'll need about 500gb of free space. not too much of an ask tbh
It makes me really happy that people can say "500gb ... not too much of an ask" these days.
Well we are talking about the greatest repository of human knowledge ever created. So we can afford to spend a little on it at least.
I know this because I actually do this. It's more like ~300 GB of space, but it's better to have even more just in case.
Be smart and keep it all on thumb drives.
Old PCs off Amazon usually come with a good, reliable 1/2 TB hard drive.
If anyone is interested in philosophy or religion, or just wants to archive it for historical reasons, IIRC sacred-texts.com has a USB version of their entire archive. They sell it, but I'm sure someone could find a workaround there if they were opposed to supporting them* for some reason. It's a massive collection of philosophical and religious works, and I believe they even have things like constitutions and legal works as well.
*I know nothing about the people that run it or their ideology
Thanks for reminding me about this.
Absolutely! Do you dare speak out against the words of the prophets‽
Might store it on an external HDD. I got plenty.
It's been on my to-do list for a few years now.
I still have a copy of wikipedia from 2021 somewhere on my NAS.
why? what happened 3y ago?
All the AI slop and internet censorship asking for ID now?
IT HAS ONLY BEEN 3 YEARS?!?!?!?
I kind of want that hackermans diy pc that runs on 18650 cells
I saw that Wikipedia was having funding problems, what happened to Debian?
They lie. Wikipedia has plenty of money. Do not give those parasites any more.
https://en.wikipedia.org/wiki/Wikimedia_Foundation#Spending_and_fundraising_practices
There should be.
What's a way to create a local repo mirror?
I heard there is Wikipedia on IPFS. Is that a good solution for Linux packages too?
When the Arch wiki was getting DDOS'd a few weeks ago I got a local copy from the AUR that was pretty handy.
I bought a 14tb drive just for backups of all my other drives... and I got a shitload more space.