This post knows where you're viewing it from (Lemmy doesn't proxy external images) [ARCHIVED]

This is possible because Lemmy doesn't proxy external images but instead loads them directly. While not all that bad, this could be used for spy pixels by nefarious posters and commenters.

Note that the only thing that I willingly log is the "hit count" visible in the image, and I have no intention to misuse the data.


targetx reply

programming.dev

Nice example!

I think proxying everything through Lemmy would have a pretty big bandwidth/scalability impact. I expect the Lemmy clients don't send any unique user info on these image requests, so I'm not sure how useful it would be as a spy pixel? Maybe I'm missing something :-)

Goddard Guryon reply

sopuli.xyz

It would be interesting to see just how much info is shared when lemmy requests the image. If there is [potentially] sensitive info being shared, the devs might be interested in working on it too (I have no idea how to check such a thing, this comment is just so I can find the post later when more people have shared their wisdom on it)

Muddybulldog reply

mylemmy.win

None (by Lemmy), as Lemmy doesn't actually request the image (that would be proxying). Your browser requests the image directly by URL. Lemmy, technically, doesn't even know an image exists. It just provides the HTML and lets your browser do the work.

A_A reply

Exactly. The text of this post is simply :

![An external image showing your user-agent and the total "hit count"](https://trilinder.pythonanywhere.com/image.jpg)
I get the same result when I browse directly to the link.

So, if OP links a malicious website we have a problem ... (?).

Goddard Guryon reply

sopuli.xyz

Oh dangit, it's simpler than I thought. So the only data being sent is...just whatever is sent in your average GET request.

newIdentity reply

Yes. It's also a pretty standard way of serving images. A lot of email clients do that too.

That's also how these services that show you when an email is read work.

newIdentity reply

Not really that huge of a problem. When making requests you also usually send a header which includes the user agent.

The program just logs how many times the image has been requested and it reads the user agent data. No Javascript is actually executed.

Well it might be possible to have an XSS somehow but I haven't really done much research into this possibility.

In general it's a pretty standard way of handling embedded images. Email does this too. That's how you have these services that can check if someone read a mail.

A_A reply

okay so I make a test here, with this :
![www.example.com](http://www.example.com/)

I believe this web page doesn't load automatically.

::: spoiler FWi
The domain names example.com, example.net and example.org are second-level domain names in the Domain Name System of the Internet. They are reserved by the Internet Assigned Numbers Authority (IANA) at the direction of the Internet Engineering Task Force (IETF) as special-use domain names for documentation purposes. (...wikipedia)
:::

CoderKat reply

Yup. And to add, your browser will send things like:

- Your IP address. Technically this is sent by the OS doing networking and is unavoidable. At best, a VPN can hide this, because the VPN sits in the middle.
- Various basic request headers, which most notably contain the user agent (identifies the browser) and language headers, both of which you can fake if you want to.
- Cookies for that domain (if you have any). Those can track you across multiple requests and thus build up a profile of you.
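
To make the list above concrete, here is a minimal sketch of what an image host could read from one ordinary GET request. It uses only the Python standard library. The function name `summarize_request`, the host `images.example`, the cookie and the IP address are all made up for illustration; nothing here is taken from OP's server.

```python
from http.client import parse_headers
from http.cookies import SimpleCookie
from io import BytesIO


def summarize_request(raw_request: bytes, peer_ip: str) -> dict:
    """Collect the fields an image host can log from one plain GET request."""
    stream = BytesIO(raw_request)
    request_line = stream.readline().decode("iso-8859-1").strip()
    headers = parse_headers(stream)  # reads header lines up to the blank line
    cookies = SimpleCookie(headers.get("Cookie", ""))
    return {
        "ip": peer_ip,  # comes from the TCP connection, not from any header
        "path": request_line.split(" ")[1],
        "user_agent": headers.get("User-Agent"),
        "language": headers.get("Accept-Language"),
        "referer": headers.get("Referer"),  # None when the client withholds it
        "cookies": {name: morsel.value for name, morsel in cookies.items()},
    }


# Hypothetical request, roughly what a desktop browser might send for an image.
example_request = (
    b"GET /image.jpg HTTP/1.1\r\n"
    b"Host: images.example\r\n"
    b"User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0\r\n"
    b"Accept-Language: en-US,en;q=0.5\r\n"
    b"Cookie: visitor=abc123\r\n"
    b"\r\n"
)

for field, value in summarize_request(example_request, "203.0.113.7").items():
    print(f"{field}: {value}")
```

Everything except the IP address comes from headers the client chooses to send, which is why those parts can be faked or stripped.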

odbol reply

That's why you should use a native app, which won't send any of that identifying info (except for IP but there's nothing you can do on that)

ono reply

Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

One way is for the image host to use the HTTP Referer field. (Standards-respecting web browsers pass the URL of the web page being viewed to the server hosting the image.)

Another way is by posting an image with a unique URL.

Even if Referer is withheld and the image is not unique, the image host can still do basic fingerprinting of your client's request header and your OS's TCP quirks, and associate that fingerprint with your IP address.

An option for Lemmy to proxy media would be very helpful. Small instances could perhaps disable it, although they might not need to, since the additional load would scale with the number of users on that instance.

PoliticalAgitator reply

> Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

I suspect with a coordinated pool of posts or multiple comments on the same post, you could narrow that IP address down to an actual user account.

When a new comment is posted by a user, store, against their username, all IP addresses that visited since the last comment in that thread (by anyone). When a second comment is posted by a user, remove any IP addresses that don't appear in both lists.

I suspect you would have a very short list after two comments, and a single address after 3. It would also be extremely easy to both lure someone into viewing an image and bait them into multiple replies. Geolocate that IP and you now know vaguely where that user lives.

Time to make sure you're always on a VPN I guess.
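
The narrowing step described above boils down to a set intersection. A rough sketch on made-up data follows; the event format, the function name `narrow_candidates`, the usernames and the addresses (from the documentation-reserved ranges) are all assumptions for illustration, not something any image host is known to run.

```python
def narrow_candidates(events):
    """Intersect, per commenter, the IPs that loaded the image before each of their comments.

    `events` is a time-ordered list of ("hit", ip) and ("comment", username) tuples.
    """
    candidates = {}      # username -> IPs consistent with every comment so far
    recent_ips = set()   # IPs seen since the last comment by anyone
    for kind, value in events:
        if kind == "hit":
            recent_ips.add(value)
        elif kind == "comment":
            if value in candidates:
                candidates[value] &= recent_ips   # keep only IPs present both times
            else:
                candidates[value] = set(recent_ips)
            recent_ips = set()                    # start a fresh window
    return candidates


# Hypothetical thread: alice comments twice, bob once.
thread_events = [
    ("hit", "198.51.100.1"), ("hit", "198.51.100.2"), ("hit", "203.0.113.9"),
    ("comment", "alice"),
    ("hit", "198.51.100.2"), ("hit", "203.0.113.50"),
    ("comment", "bob"),
    ("hit", "198.51.100.1"), ("hit", "203.0.113.77"),
    ("comment", "alice"),
]

for user, ips in narrow_candidates(thread_events).items():
    print(user, sorted(ips))
```

In this toy data alice's list shrinks from three addresses to one after her second comment, while bob, with a single comment, still has two candidates. Real traffic is noisier (shared IPs, VPNs, cached images), so this only shows the idea.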

TriLinder reply

You could also send the image through a DM if you want to find a particular user

PoliticalAgitator reply

Oh yeah, that'd be much less effort.

ono reply

Even without that, once your Lemmy interests are sold/shared by IP address, they can be associated with your real identity as soon as you log in to a service that knows who you are.

lazylion_ca reply

Were you expecting otherwise? Loading an external image is no different than loading an external website with images. Lemmy and reddit are link aggregators, not proxies. Having to proxy everything would run up significant bandwidth costs for instance admins, who are often paying out of pocket for hosting.

Seraph reply

Any chance that's why this account is posting the same image and gibberish? @googa

Erika2rsis reply

From what I remember, that image was hosted on hexbear.net, so I don't think so.

How do you get an image to run code? I guess I somehow missed something important in website development.

Edit: I saw that you said you're using Pillow to actually render the image from code. That's neat! ...and scary

possibly a cat reply

CoderKat reply

Proxying external images means that instead of the image being downloaded from the original link, your Lemmy server would download it and serve it for you. The Lemmy server acts as a proxy.

But it means performing a lot of extra traffic. And realistically you'd want to cache the image because otherwise your server will likely get banned for the high volume of requests you send. But caching the images requires more storage and can have potential for legal issues.

And images are one thing, but literally any content is the problem. Images are just the most obvious because they often load without even having to click on the image and thus you'll get far higher volume of user data. Literally anything you link to has this issue and you cannot proxy all of it.
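
As a rough illustration of the proxying idea, an instance could rewrite external image links in a post so that browsers fetch them from the instance instead of the original host. This is only a sketch under assumptions: the `/proxy/image?url=` endpoint, the function name `proxy_images` and the host `lemmy.example` are hypothetical, and the regex handles only the simple `![alt](url)` form.

```python
import re
from urllib.parse import quote, urlsplit

# Matches simple markdown images: ![alt](url)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def proxy_images(markdown: str, instance_host: str) -> str:
    """Point external image URLs at a (hypothetical) proxy endpoint on the instance."""

    def rewrite(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        if urlsplit(url).hostname == instance_host:
            return match.group(0)  # already served by the instance, leave as-is
        proxied = f"https://{instance_host}/proxy/image?url={quote(url, safe='')}"
        return f"![{alt}]({proxied})"

    return IMAGE_PATTERN.sub(rewrite, markdown)


post = "![hit counter](https://trilinder.pythonanywhere.com/image.jpg)"
print(proxy_images(post, "lemmy.example"))
```

The rewrite itself is cheap; the cost discussed above comes from the instance then having to download, and probably cache, every rewritten image.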

elxeno reply

roon reply

Share source code? I'm curious

TriLinder reply

It's just a simple Flask server. I parse the user-agent using the user_agents Python library, apply some conditionals upon the result, render the image using Pillow and send it to the user.
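
For anyone who wants to see the general shape without the dependencies, here is a standard-library sketch of the same idea. It is not the code behind the post: it swaps Flask for `http.server`, the user_agents library for a few naive substring checks, and Pillow for a text-only SVG. All names in it (`describe_client`, `render_svg`, `ImageHandler`, `main`) are made up for this example.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

hit_count = 0  # the only state kept, like the "hit count" shown in the image


def describe_client(user_agent: str) -> str:
    """Very rough browser guess; a real parser handles far more cases."""
    # Order matters: Chrome's user agent string also contains "Safari".
    for marker in ("Firefox", "Chrome", "Safari"):
        if marker in user_agent:
            return marker
    return "unknown client"


def render_svg(user_agent: str, hits: int) -> bytes:
    """Build a small SVG image whose text depends on the request."""
    text = f"You are using {describe_client(user_agent)}. Hit count: {hits}"
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="480" height="40">'
        f'<text x="10" y="25">{escape(text)}</text></svg>'
    ).encode("utf-8")


class ImageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global hit_count
        hit_count += 1
        body = render_svg(self.headers.get("User-Agent", ""), hit_count)
        self.send_response(200)
        self.send_header("Content-Type", "image/svg+xml")
        self.send_header("Cache-Control", "no-store")  # ask caches not to reuse the image
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    """Serve the image on localhost:8000 until interrupted."""
    HTTPServer(("127.0.0.1", 8000), ImageHandler).serve_forever()
```

Calling `main()` and opening `http://127.0.0.1:8000/` in a browser should show the generated image; the image file itself contains no code, the server simply builds a different one for each request.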

Skull giver

[This comment has been deleted by an automated system]


Max reply

nano.garden

Finally. Someone noticed 🥹

vithigar reply

Joke's on you. IP geolocation where I am is an unreliable mess and your image got it wrong by about 1000km!

Skull giver reply

[This comment has been deleted by an automated system]

TwinTusks reply

outpost.zeuslink.net

Location is right, but I highly doubt anyone near me is using Lemmy (dictatorship here).

Skull giver reply

[This comment has been deleted by an automated system]

possibly a cat reply

icepuncher69 reply

Great, hot milfs near my location

mim reply

Thanks for the heads-up.

Routing my Lemmy mobile app through orbot from now on. Seems to have fixed the issue.

lFenix reply

I’m not using a VPN or anything and it got my location wrong by 700 kilometers 🤔

RickyRigatoni reply

Are you sure you are where you think you are? When's the last time you looked outside?

Daisy (she/her) reply

Oh no! I've been kidnapped!

👁️👄👁️ reply

Woah this is really cool. Though I was way off for me and I'm not on a VPN right now.

Skull giver reply

[This comment has been deleted by an automated system]

You can run Geolocation with images now? What the heck? How?

Skull giver reply

[This comment has been deleted by an automated system]

lightstream reply

It's not the image, it's a normal image. The server does the hard work when you make the request, and then it just builds the image accordingly.

Yeah I saw OP's explanation in the comments. That is fucking cool! And scary! I've never needed to generate images with code before, so I've never even considered something like this before.

WndyLady reply

I wonder why the Baltimore community is so dead, then.

TriLinder reply

Thought about adding the user's location, but was worried PythonAnywhere could somehow cache the image between multiple people. A great demo though!

kabobglance reply

You have the code for this? Very interested in how you implemented it

Skull giver reply

kabobglance reply

Damn, PHP is such a sleeper of a language, I always forget how useful it can be. Thanks for sharing!

Skull giver reply

[This comment has been deleted by an automated system]

kabobglance reply

lemmy.villa-straylight.social

Nice, sounds like it's getting modernized. I'll have to give it another round, thanks!

salient_one reply

Genuinely curious, how is it superior to Python in your opinion?

Edit: Apart from the things you listed 😅

It can run natively on an Apache server without any frameworks required to render user website markup and serve pages. That's a pretty awesome advantage.

PHP is the OG bad-ass for getting shit done. No setup, no compile, no deployment pipelines. Hell, you can create and write the files right there on the server with nothing more than an SSH terminal if you want.

scottywh reply

PHP is pretty damn awesome really... Sad that it's gone out of favor IMHO

remotedev reply

My location is accurate, to give some good feedback on your program too lol

Skull giver reply

[This comment has been deleted by an automated system]

Altima NEO reply

lemmy.zip

Hah, not my town, but close. That's where my ISP is located though.

moitoi reply

I'm not using a VPN and the location isn't accurate.

newIdentity reply

Hey. I wanted to do this tomorrow.

Well I have a new idea which is pretty similar

Skull giver reply

[This comment has been deleted by an automated system]

newIdentity reply

I'm planning to make one of these "dox'd memes" where someone says something controversial and another one answers with the IP address.

Skull giver reply

[This comment has been deleted by an automated system]

June reply

It’s got me about an hour from where I actually am

skankhunt42 reply

I hate this so much. It's super cool but MAN what the hell. I don't think I'm going to ever turn off my VPN anymore. I'm in a super small town and that image is correct.

It's cached somewhere because I can't get it to update. Maybe time for a new account too. Hmmmm

Skull giver reply

[This comment has been deleted by an automated system]

skankhunt42 reply

Yeah, app cache had to be cleared. We good

rektifier

I'm fine with this. Instances shouldn't proxy or cache images because it opens instance owners to a lot more liability than text. A client side setting to not load images in comments by default is better.

FancyFeaster reply

lemmy.fail

Each instance stores post thumbnails locally even if the post was on another server. It actually takes up quite a bit of hdd space.

edric

Mlem - knows exactly that it’s Mlem.
Memmy - sees Mobile Safari webkit.
Voyager - same as Memmy.
Thunder - just sees Mobile Client.

moonsnotreal reply

Jerboa - also just sees a Mobile Client

Zenaida macroura reply

Infinity for Lemmy - just says Android

Lmaydev reply

programming.dev

Connect - also says a mobile client

TheButtonJustSpins reply

Same for Liftoff on Android

CookieJarObserver reply

1984 reply

lemmy.today

Doesn't know it's sync.

roon reply

Voyager on Android

DrQuint reply

Which would be correct as Voyager is a Web App

Gollum reply

Lemmios

SokathHisEyesOpen

Oh neat, Jerboa doesn't identify itself. Cool.

Automated_Footprint reply

Same on Sync and on Infinity

charlytune reply

mander.xyz

I get "unknown (mobile?) client" using Jerboa

DavyJones

lemmy.dbzer0.com

What is it supposed to say?

Blizzard reply

lemmy.zip

> What is it supposed to say?

"You are viewing this from The Black Pearl, Davy Jones."

Kissaki reply

It names your browser and OS.

ares35 reply

it got mine wrong because i change default useragent and platform in the browser.

Zetaphor

zemmy.cc

Salient demonstration, but if image proxying were to come to Lemmy I'd hope it was made optional, as it could overburden smaller instances, especially one-person instances (like mine). We also need a simple integrated way of configuring object storage.

Skull giver reply

[This comment has been deleted by an automated system]

minorsecond

I'll be damned. I tried this from three different platforms and you've nailed it.

kostel_thecreed reply

I'm using Firefox on Mac and it thought I was on Windows. Still a big issue though.

some_guy reply

It said I'm on Mac OS X, but that's wrong. It's been macOS for some years now. /s

It still makes me wanna cry.

TriLinder reply

Yeah, I just use whatever the user_agents Python library gives me as user_agent.os.family.

possibly a cat reply

coffeeguy

VPN using Librewolf user checking in. This post got nothing on me.

Forcen

lemmy.one

The easiest way to stop this from happening is to use uBlock Origin to block all third-party requests on your instance.

One way to do this is via dynamic filtering. This is for advanced users so be sure to read the info page: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering

(Consider backing up your ublock settings before doing this)

If you are using lemmy.ml your rule would be this:

lemmy.ml * 3p block

If you're using another instance then change the domain, or use both rules because you might end up visiting the others as well. Note that adding this rule won't work unless you enable advanced features in uBlock Origin.

EDIT: THIS MIGHT BREAK THINGS ON YOUR INSTANCE. It's recommended to learn how to use dynamic filtering to unbreak it: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering:-quick-guide If it breaks stuff, just remove that rule.

You could also block it using static filters but I can't remember how to do that exactly, if you know please reply below.

CookieJarObserver

_I_ reply

Yeah, I'm using Mullvad with misc DNS blockers enabled so it has nothing on me ᕕ( ᐛ )ᕗ

superkret

KidsTryThisAtHome reply

I'm also on jerboa, but a Samsung with GPS, and it also tells me unknown device. Must be jerboa

sfgifz reply

It says unknown (mobile?) client for me too, using Sync with Bluetooth and location enabled and Play Store Services installed.

Whoever wrote that image tracking over-hyped it?

TriLinder reply

The user-agent detection definitely isn't great, this was just meant as a quick proof of concept for anyone curious.

It successfully identified Firefox when I checked it from the browser. Maybe some of the apps don't identify themselves in the useragent string?

synae[he/him]

I would've hoped that lemmy users on a c called privacy would understand the technology better, but I guess not.

ares35

for a little extra creepiness, modify the image-generating script to add geoip location data and http referer to the image.

Skull giver reply

[This comment has been deleted by an automated system]

TriLinder reply

Thought about adding the user's location, but was worried PythonAnywhere could somehow cache the image between multiple people.

jozo

What does it say? On Jerboa it states that I use an unknown mobile client; with Infinity, an Android client. All I have is AdAway on my phone.

judas

Man, I remember I scared the crap out of trolls on Reddit when we started arguing over DM, and I added a link to a meme that tracked their IP and system info (without them knowing ofc). Let's just say they went AFK quickly after that. Good times!

LoudWaterHombre

lemmy.dbzer0.com

unknown device?

scottywh reply

unkown

Ben Hur Horse Race reply

The unkown sounds pretty fucking scary to me

scottywh reply

Ok.. I usually don't laugh much at comments and replies but that right there was pretty funny... I don't care who ya are.

TriLinder reply

Oh, how did I not notice that before? Now should be fixed.

scottywh reply

Still says unkown for me.

TriLinder reply

The user-agent detection definitely isn’t great. If it doesn't recognize a client, it just says unknown. But that wasn't the main point of the post anyway, this was just meant as a quick proof of concept for anyone curious.

LoudWaterHombre reply

lemmy.dbzer0.com

What's the point of unknown?

Gamey

feddit.rocks

Image proxies are a must have, let's hope we get those soon!

skymtf

pricefield.org

I feel like there isn't a real way to fix this, since Lemmy isn't a single service; I can choose any image host I want. The only way I could think of would be to have your instance download the images, but that's currently not even supported on the Mastodon-like platforms. The only thing you can do on Mastodon that I'm aware of is cache the images on your own server, which could get costly.

Monkey With A Shell

lemmy.socdojo.com

Even without an instance proxy, it should be easy enough on the client side to not pull remote images unless directed to do so, similar to most email clients these days. At least it gives people a warning that they're passing data to a 3rd party location.
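
A client-side version of this could be as small as turning untrusted images into plain links before rendering, so nothing is fetched until the user clicks. The sketch below is just one way to do it; `defer_remote_images`, the trusted-host set and the example hosts are hypothetical, and only simple `![alt](url)` images are handled.

```python
import re
from urllib.parse import urlsplit

# Matches simple markdown images: ![alt](url)
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def defer_remote_images(markdown: str, trusted_hosts: set) -> str:
    """Replace images from untrusted hosts with click-to-load links."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlsplit(url).hostname
        if host is None or host in trusted_hosts:
            return match.group(0)  # relative or trusted image, load as usual
        # A plain link is not fetched until the user follows it.
        return f"[Load image from {host}: {alt}]({url})"

    return IMAGE_PATTERN.sub(replace, markdown)


comment = "Look: ![cat](https://tracker.example/cat.png)"
print(defer_remote_images(comment, {"lemmy.example"}))
```

The user's own instance would normally be in the trusted set, since it already sees their requests anyway.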

newIdentity reply

That's pretty stupid for a platform mainly based on images.

Monkey With A Shell reply

lemmy.socdojo.com

Maybe, but on the flip side, as an instance owner, I don't necessarily want my node to be in the logs accessing questionable content on behalf of the end user.

newIdentity reply

That's why it isn't done this way I guess

And because it's more resource efficient.

Mikug

Uriel238 [all pronouns] reply

I got mobile client from Liftoff.

Holy shit. How do we avoid this? VPN?

Slotos reply

feddit.nl

By not using internet. No, seriously, if you access something over the internet, you will leave tracks. This here post is nothing new or inherently scary on its own. I used to have forum signatures that would tell people what browser they were using or from what IP they were coming.

What you really want to do is disable third party cookies on everything you own. That (and things like hsts super cookies) is what tracks you.

If you’re using an app to browse Lemmy, you might ask for their implementation to reject cookies and fingerprinting attempts when displaying images and other embeddables.

a minute later edit: And yeah, if you don’t like web services to know the IP address given to you by your ISP, VPN is a decent option.

Erika2rsis reply

I would say a user agent spoofer would be more useful for this particular image. The Mozilla team recommends User-Agent Switcher and Manager for Firefox users.

ForestOrca reply

Where can I learn more about using this Firefox extension? I've installed it, but it hasn't changed the results of (https://trilinder.pythonanywhere.com/image.jpg).

I see I am able to black list pythonanywhere.com.

TriLinder reply

That's weird. The extension should definitely work with the image, as that's what I used when building this quick demo. Does the content of a site like this update?

Erika2rsis reply

Here's a six-minute YouTube video explaining how to use it

TL;DW: Click on the extension icon, use the drop-down lists to find a browser and OS, select a pre-configured user-agent string from the list, and click "apply (container)" or "apply (all windows)". Having your user-agent string change randomly with each request is possible but requires writing a bit of JSON in the options.

ForestOrca reply

TY!! That link works on Invidious, Yay! I'll check it when I get a break.

CookieJarObserver reply

🇵🇸 Free Palestine 🇵🇸 reply

Wow! But mine didn't. Which filter lists are you using?

CookieJarObserver reply

Hamartiogonic reply

sopuli.xyz

I’m using a VPN, and the picture knows everything about me regardless.

tjaden

Joke's on you! I use a Firefox extension that spoofs my browser profile. https://addons.mozilla.org/en-US/firefox/addon/chameleon-ext/

infinitejester reply

mub

All these people correcting the result are effectively giving useful data to improve data collection and detection methods.

shortypig

A_A reply

it is because the website providing the image is overloaded and cannot create an image.
You just have to reload the image and eventually you will see one.

WhatAmLemmy

Lemmy clients should really include an option to group or only show the first instance of a link for cases like this; where the same link is posted to multiple places.

ZeroHora

User Agent Switcher and Manager

feugnis