#fyp @ the Edge
October 12, 2025
A big part of the Social Media Algorithm Problem (you know, the one with the addictive dark patterns designed to drive “engagement” at the expense of our wellbeing) - at least as I see it - is that we mortals are not in control of the algorithms’ execution.
Instead, the service operator (Facebook, TikTok, YouTube, whoever) keeps all the data about what we view, doesn’t share it with us (or at least not easily), and instead uses it to hook us into MOAR ENGAGEMENT! MOAR EYEBALLZ! MOAR ADDZZZZZ!!!
All of which leads to the synthesis of “my” feed, FYP, etc - an infinite list of stuff chosen for me to look at and engage with, but not necessarily for my benefit.
If it’s not for my benefit, then whose benefit is it for? Who wins from it being the way it is?
Well, the service operator, who sells my “engagement time” to advertisers. So, Google. Apple. Netflix. Amazon.
I’m pretty depressed and fed up with social media feeds, app & product recommendations, search results, all of it, all being so untrustworthy. And I have been for a while.
And I’ve started thinking: what would have to be true for me to get social media feeds and product recommendations that I have more control over - to bring control over my FYP out to the edge of the network, closer to me?
What pieces would we need to make that happen?
Speculation mode: ENGAGE!
Data about me
First, obviously, there’s data. Probably quite a bit about me.
- Browsing history
- Media viewing history
- Purchase history
- Some data about what kinds of things I’m interested in. I might supply this explicitly, and/or there might be some inference based on my behaviour. Critically, I get to choose.
- Some social graph data - who do I know? Which YouTubers’ stuff do I watch to the end? Whose blogs am I subscribed to?
- Anything else I’m comfortable tracking, that might supply context for recommendations or feed content choices.
Creepy, huh? Certainly when a faceless corporation does it.
But maybe I’m a bit more comfortable storing this information on systems that I personally trust. These could be my own machines, which I can turn off, or whose storage I can erase, or whose analysis I can actually learn and benefit from. Or they could be machines run by someone I know, and who I trust not to screw me over.
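To make that a bit more concrete, here’s a rough sketch (in Python, purely illustrative) of what a local activity log might look like. The table layout and event kinds are made up; the point is that the whole thing is one SQLite file on my own disk that I can inspect, back up, or wipe whenever I fancy.

```python
# A minimal, hypothetical local activity log: one SQLite file that I own,
# can open with any SQLite client, and can delete whenever I like.
import sqlite3
from datetime import datetime, timezone

DB_PATH = "my_activity.db"  # lives on my own disk, not someone else's cloud

def open_log(path: str = DB_PATH) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS events (
            id        INTEGER PRIMARY KEY,
            when_utc  TEXT NOT NULL,  -- ISO-8601 timestamp
            kind      TEXT NOT NULL,  -- e.g. 'page_view', 'video_watched', 'purchase'
            subject   TEXT NOT NULL,  -- URL, product id, video id, ...
            detail    TEXT            -- free-form notes or JSON
        )
    """)
    return conn

def record(conn: sqlite3.Connection, kind: str, subject: str, detail: str = "") -> None:
    conn.execute(
        "INSERT INTO events (when_utc, kind, subject, detail) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), kind, subject, detail),
    )
    conn.commit()

def forget_everything(conn: sqlite3.Connection) -> None:
    # The whole point: erasure is one statement away, and it's my call.
    conn.execute("DELETE FROM events")
    conn.commit()

if __name__ == "__main__":
    log = open_log()
    record(log, "video_watched", "https://example.com/videos/42", "watched to the end")
    record(log, "purchase", "book:978-0-00-000000-0")
    print(log.execute("SELECT kind, subject FROM events").fetchall())
```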
Data about others
Now here comes the really tricky part: how do I turn this data into recommendations for things to read, watch or buy?
Suppose I shared some of this data. Voluntarily, and only the parts I’m comfortable making public.
- Likes, ratings and comments
- Subscriptions
- Maybe some reading / viewing info
- Content I’ve made myself
- Purchases, or summaries of purchases
- A bit of demographic info - some stated interests, connections, rough age, nationality, occupation. Nothing I haven’t already put on Facebook or LinkedIn.
And suppose lots of other people did that, too. Collected for themselves, and selectively and voluntarily shared, a portion of the data that the Big Tech Villains already harvest and sell to advertisers.
Suddenly I can find out quite a bit about what my connections, and other folks similar to me, are engaging with. But only as they have chosen to share it.
And assuming they don’t lie, I guess. But even that can be handled if I control the algorithms behind my feed.
Ignoring important obstacles? Hell yes I am.
Obviously this is nowhere near as simple as I’m making it out here. I know there are huge obstacles to all of this. Just for starters:
- Getting all this data out of the walled gardens where it lives.
- Persuading enough people to gather their data (e.g. via Google Takeout), sift through it and share it publicly - without resorting to unethical manipulation!
- Protecting people’s privacy as they do that - all of this stuff is potential fuel for scams, blackmail and identity theft (not that most of the really juicy information isn’t around already, but why make it worse?).
Algorithms
With that sweet sweet data, I can start to do a little bit of the recommending stuff that drives all those feeds & FYPs. And I can do that using systems and algorithms that I control – not in the sense that I’d necessarily understand them, but at least in the sense that I could turn one off, tune it a little, or swap it out for another algorithm that matches my priorities better.
All of which seems quite nice, but aren’t we now just implementing all those dark patterns on our own machines instead of Google’s?
Well, maybe. But I hope not, or at least I think it creates room for an alternative.
Because we own the machine that runs the algorithms, we can (at least in theory) choose our own algorithms. For example, say you have a bit of a problem with videos about cake decorations going horribly wrong. They’re your kryptonite - you see one, you have to click, you just can’t look away, and when you surface an hour later you’re invariably filled with regret.
So you decide you don’t want recommendations for any of those “baking disaster” videos (at least not during the week - maybe a little guilty pleasure on weekends is not too bad).
Rather than trying and failing to manipulate someone else’s algorithms into doing what you want, you tell your algorithm “don’t recommend videos about baking disasters during the week - and keep it to just a few on the weekends.”
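Here’s a hedged sketch of what that kind of rule might look like as code I actually own. The item shape and the “baking_disaster” topic tag are invented for illustration (presumably some upstream component would supply the tags), but the rule itself is mine to read, tweak, or delete.

```python
# Hypothetical sketch: a recommendation rule I can actually read and edit.
# The 'topics' tags on items are assumed to come from whatever produced the candidates.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Item:
    title: str
    topics: set[str] = field(default_factory=set)

def weekday_rule(item: Item, today: date) -> bool:
    """Return True if the item is allowed into today's feed."""
    is_weekend = today.weekday() >= 5  # Saturday or Sunday
    if "baking_disaster" in item.topics and not is_weekend:
        return False  # my kryptonite stays out of the feed Monday to Friday
    return True

def build_feed(candidates: list[Item], today: date, weekend_guilty_pleasures: int = 3) -> list[Item]:
    feed, guilty = [], 0
    for item in candidates:
        if weekday_rule(item, today):
            if "baking_disaster" in item.topics:
                if guilty >= weekend_guilty_pleasures:
                    continue  # ...and only a few even on weekends
                guilty += 1
            feed.append(item)
    return feed

if __name__ == "__main__":
    candidates = [
        Item("Sourdough gone horribly wrong", {"baking_disaster"}),
        Item("How container ships are loaded", {"logistics"}),
    ]
    print([i.title for i in build_feed(candidates, date.today())])
```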
Other examples:
- Avoiding triggering material - mentions of violence or abuse, discrimination, hate speech, etc.
- Promoting viewpoints that challenge your own, in order to avoid echo chambers and filter bubbles.
- Changing your algorithm to suit what you’re doing (or supposed to be doing).
- News in the morning, sitcoms in the evening.
- Holding off on the more grown-up content until after the kids’ bedtime.
You’re in control, at least more so than before. Or, a little more hyperbolically: you’ve seized the means of recommendation.
Hard problems I am totally glossing over right now
There are quite a few of these, each with its own unique difficulties:
- Coming up with alternative algorithms:
- They need to be efficient enough to actually run on consumer hardware.
- They need to provide a better experience than the status quo. Otherwise, who would switch?
- Finding a way to decide what to recommend - one that isn’t dark-pattern-driven engagement - is probably a huge challenge.
- How will all the TikTok-ers and YouTubers get their social validation, if not for Likes?
- Making the tech accessible to people who are not self-hosting enthusiasts, programmers and accessibility nerds (i.e. almost everyone).
- Nasty people selling algorithms that are - overtly or subtly - not serving the end user’s interests. (Not that selling algorithms is necessarily bad! But the product has to be worth the money, respect your privacy, serve your best interests, and just generally not be a scam. And making a profit within those constraints may be literally impossible, especially in the face of probably-unhealthy competition from the incumbents.)
- Honestly, I reckon this would be a bloody amazing public good for an open source project, non-profit or charity to champion.
Filters
On the sharing side, I need a way to choose what I publish about myself and my activities. And it can’t involve checking back over my reading and viewing every day. That will get old super quick.
Ideas:
- Allow- and deny-lists of what to share, and what not (there’s a rough sketch after this list). Defaulting to private, of course. Easily understood, but prone to falling out of date without regular maintenance.
- Subscribing to lists published by knowledgeable parties. This works decently in the IT security space, e.g. blocking known spammers, or domains that host malware.
- Training a machine learning model to predict what I will and won’t choose to share. Possibly a bit more chaotic, but could be capable of adapting as your preferences change, or new cases arise.
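Here’s a minimal sketch of the first idea: allow- and deny-lists with a private-by-default fallback. The category names and event shape are invented; a real version would need far more nuance.

```python
# Hypothetical sketch: decide whether an activity event may be published.
# Category names and the event shape are made up for illustration.
ALLOW = {"public_post", "podcast_subscription", "book_review"}
DENY = {"purchase", "location", "health"}

def may_publish(event: dict) -> bool:
    category = event.get("category", "")
    if category in DENY:
        return False
    if category in ALLOW:
        return True
    return False  # anything unlisted defaults to private

if __name__ == "__main__":
    events = [
        {"category": "book_review", "subject": "A review I wrote"},
        {"category": "purchase", "subject": "Something I bought"},
        {"category": "browsing", "subject": "A page I read"},  # unlisted -> stays private
    ]
    for e in events:
        print(e["subject"], "->", "share" if may_publish(e) else "keep private")
```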
Ignored issues
- Privacy and oversharing
- This is an area where people may sometimes need protection from their own impulsivity.
- I don’t know of any really good systems for reviewing privacy settings. Largely because the kinds of places where they’re required (cough Facebook cough) also have strong incentives to make them overwhelming and difficult to use.
- Protecting children and vulnerable people
- Ethics of building machine learning models, especially those trained on copyrighted and unpaid-for source material.
Feeds
The means by which we share, and which our crawlers (see below) crawl. Probably similar to RSS, ActivityPub, or other existing protocols for sharing events like posts, likes, updates and so on.
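For flavour, here’s a rough sketch of what a single shared feed entry could look like, loosely in the spirit of ActivityStreams-style JSON. The field names are illustrative, not taken from any particular spec.

```python
# Hypothetical feed entry, loosely in the spirit of ActivityStreams / RSS items.
# Field names are illustrative only; a real implementation would follow an existing spec.
import json
from datetime import datetime, timezone

entry = {
    "type": "Like",
    "actor": "https://example.net/people/me",
    "object": "https://example.org/videos/container-ships",
    "published": datetime.now(timezone.utc).isoformat(),
    "summary": "Watched to the end and liked it",
}

print(json.dumps(entry, indent=2))
```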
Ignored issues
- https://xkcd.com/927
Crawlers
These are our hunters and gatherers, bringing us data for our Crunchers; there’s a rough sketch after the list below.
They might:
- Crawl the social graph that we and our connections share (e.g. via HTML microformats), gathering shared data.
- Go and find updates from creators that we have subscribed to. (Yes, just like RSS. This isn’t new.)
- Periodically check webpages for updates (“Is this item back in stock yet?”)
- Assemble your daily agenda based on your email or calendar data - if you decide to trust it with that!
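Here’s a hedged sketch of that kind of crawler, leaning on the “respectful client” point in the issues below: check robots.txt, identify yourself honestly, time out quickly, and wait between requests. The subscription URLs and user agent string are placeholders.

```python
# Hypothetical polite fetcher for a handful of subscribed feeds.
# Uses only the standard library; the URLs and user agent are placeholders.
import time
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

USER_AGENT = "edge-fyp-crawler/0.1 (personal use)"
CRAWL_DELAY_SECONDS = 30  # be gentle: these are other people's small servers

def allowed_by_robots(url: str) -> bool:
    parsed = urlparse(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parsed.scheme}://{parsed.netloc}/robots.txt")
    try:
        rp.read()
    except OSError:
        return False  # if robots.txt is unreachable, err on the side of not crawling
    return rp.can_fetch(USER_AGENT, url)

def fetch(url: str) -> bytes | None:
    if not allowed_by_robots(url):
        return None
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()

if __name__ == "__main__":
    subscriptions = [
        "https://example.org/feed.xml",
        "https://example.net/people/alice/updates.json",
    ]
    for url in subscriptions:
        body = fetch(url)
        print(url, "->", "skipped" if body is None else f"{len(body)} bytes")
        time.sleep(CRAWL_DELAY_SECONDS)
```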
Issues
- Crawlers need to be respectful clients - especially if we’re aiming to put one (or 1000) in every home.
- I don’t imagine that writing a crawler capable of withstanding life on the open web is an easy feat.
- Consumer hardware constraints again.
- Validating & sanitising incoming data. These little critters are meant to introduce arbitrary data into your systems. Literally anything could be in there! So we need malware scanning, sandboxing, and protection against injection attacks that could exfiltrate, delete, or change your stored data.
Crunchers
The decision making parts - these are components that look at your gathered data and decide what’s likely to appeal to you at the moment.
Crunchers would decide what goes in your FYP, but also potentially what you get realtime notifications about, what goes in your daily summary email, the order of podcasts in your queue, or which TV shows to suggest tonight (Full calendar today? Let’s suggest cartoons or reality TV, rather than that highly cerebral Scandinavian Noir thriller).
These might be things you want to tune in the moment, e.g. whether you’re looking for something familiar to watch, or something a little bit new or different. Or how much time you have available.
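Here’s a toy sketch of a cruncher along those lines, with everything (item fields, weights, reasons) made up for illustration: score candidates against today’s context and keep a one-line reason per item, so the decision isn’t a mystery.

```python
# Hypothetical cruncher: rank candidate items against today's context.
# Item fields, weights and the reason strings are all invented for illustration.
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    minutes: int          # rough length
    familiar: bool        # something I already know vs. something new
    base_interest: float  # 0..1, from whatever upstream scoring exists

@dataclass
class Context:
    minutes_free: int
    want_something_new: bool

def score(c: Candidate, ctx: Context) -> tuple[float, str]:
    s = c.base_interest
    reasons = [f"base interest {c.base_interest:.1f}"]
    if c.minutes > ctx.minutes_free:
        s -= 0.5
        reasons.append("too long for the time available")
    if ctx.want_something_new and not c.familiar:
        s += 0.2
        reasons.append("something new, as requested")
    return s, "; ".join(reasons)

def crunch(candidates: list[Candidate], ctx: Context, top_n: int = 3) -> list[tuple[Candidate, str]]:
    ranked = sorted(((score(c, ctx), c) for c in candidates), key=lambda t: t[0][0], reverse=True)
    return [(c, reason) for ((_, reason), c) in ranked[:top_n]]

if __name__ == "__main__":
    ctx = Context(minutes_free=25, want_something_new=False)
    for c, reason in crunch(
        [
            Candidate("Scandinavian noir, episode 7", 55, True, 0.9),
            Candidate("Comfort sitcom rerun", 22, True, 0.6),
            Candidate("New cooking channel", 15, False, 0.5),
        ],
        ctx,
    ):
        print(f"{c.title}  ({reason})")
```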
Issues
- Transparency of decision making
- Making explainable decisions
- Presenting explanations that we users can make sense of, and that help us figure out what (if any) changes we want to make.
- This is where our software is most likely to contain bias. It might be introduced by the author of the software, or by the methods & data used in training if ML models are involved.
Presenters
These are different methods of showing things to you, e.g.: instant notifications, a daily or weekly bulletin, a scrollable feed, a search index, an infographic, your podcast queue. A context layer on your web browser or online map. Probably a few million other things, too!
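And to close the loop, a tiny hypothetical presenter: take whatever the cruncher ranked and render it as a plain-text daily bulletin. Delivering it by email, push notification or RSS is left as an exercise.

```python
# Hypothetical presenter: render ranked items as a plain-text daily bulletin.
from datetime import date

def daily_bulletin(ranked_items: list[tuple[str, str]], for_day: date) -> str:
    lines = [f"Your bulletin for {for_day.isoformat()}", ""]
    for n, (title, reason) in enumerate(ranked_items, start=1):
        lines.append(f"{n}. {title}")
        lines.append(f"   why: {reason}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(daily_bulletin(
        [("Comfort sitcom rerun", "short enough for tonight"),
         ("How container ships are loaded", "you watch this channel to the end")],
        date.today(),
    ))
```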
Issues
It’s getting late! I don’t want to think about the issues anymore.
Summary
This is all wild speculation. But given the availability of small, cheap, silent computers; more and more (and faster, and cheaper!) ubiquitous network access; and growing distrust of tech companies, maybe it’s time to work a little harder on some decent alternatives.