Lemmy Server Performance

lemmy_server uses the Diesel ORM, which automatically generates SQL statements. Serious performance problems in June and July 2023 are preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, client/server code, SQL data, server operator apps, the server operator API (performance and storage monitoring), etc.

founded 3 years ago

MODERATORS

RoundSparrow@lemmy.ml

RoundSparrow@bulletintree.com

Lemmy PERFORMANCE CRISIS: popular instances with many remote instances following big communities could stop federating outbound for Votes on comments/posts - code path identified (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

I spent several hours tracing in production (updating the code a dozen times with extra logging) to identify the actual path the lemmy_server code uses for outbound federation of votes to subscribed servers.

Major popular servers such as Beehaw, Lemmy.world, and Lemmy.ml have a large number of instance servers subscribing to their communities to get copies of every post/comment. Comment votes/likes are the most common activity, and it is proposed that during the PERFORMANCE CRISIS these overwhelmed servers turn off outbound vote/like sharing.

pull request for draft:

https://github.com/LemmyNet/lemmy/compare/main...RocketDerp:lemmy_comment_votes_nofed1:no_federation_of_votes_outbound0

EDIT: LEMMY_SKIP_FEDERATE_VOTES environment variable

[–] chiisana@lemmy.chiisana.net 5 points 2 years ago

Part of what makes Lemmy (and other voting link aggregators) work is the voting aspect. Taking away outbound vote federation forces further consolidation into these popular instances. That further exacerbates the problem, because they are now even more consolidated and the posts and comments eventually become the bottleneck for the exact same underlying chatty protocol. For this reason, I'd be vehemently against this change without a paired PR that allows this information to be requested via a batch pull that the protocol makes available.

[–] King@vlemmy.net 2 points 2 years ago* (last edited 2 years ago)

Thanks for doing all this.

Do we have any real numbers from a real server? How many votes are trying to be federated to how many servers?

Just ballparking some approximate numbers:

!technology@lemmy.world
15k subscribers
4000 subscribed servers
10 votes per subscriber per day

15000 * 4000 * 10 = 600,000,000 federated actions per day. That is around 7,000 per second 24/7 for one community.

IMO, this real time federation just doesn't scale. We need to start planning the specs for federation batching.
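The ballpark figures above can be checked with a short sketch. It uses the same model as the estimate (every vote is delivered to every subscribed server); the function names and the one-minute batch interval are assumptions made for illustration, not anything Lemmy implements.

```rust
/// Seconds in one day.
const SECONDS_PER_DAY: u64 = 24 * 60 * 60;

/// Total federated vote deliveries per day under the estimate's model:
/// every subscriber casts `votes_per_subscriber` votes per day, and each
/// vote is delivered to every subscribed server.
fn deliveries_per_day(subscribers: u64, servers: u64, votes_per_subscriber: u64) -> u64 {
    subscribers * servers * votes_per_subscriber
}

/// Hypothetical batched alternative: one request per subscribed server
/// per batch interval, regardless of how many votes the batch carries.
fn batched_requests_per_day(servers: u64, batch_interval_secs: u64) -> u64 {
    servers * (SECONDS_PER_DAY / batch_interval_secs)
}

fn main() {
    let total = deliveries_per_day(15_000, 4_000, 10);
    let per_second = total / SECONDS_PER_DAY;
    println!("{total} deliveries/day, about {per_second} per second");

    // Assumed interval of 60 seconds, chosen only to show the scale difference.
    let batched = batched_requests_per_day(4_000, 60);
    println!("{batched} batched requests/day at one batch per minute");
}
```

Under these assumptions, batching once per minute would cut the request count by roughly two orders of magnitude, although each request would carry many votes.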

[–] King@lemm.ee 1 points 2 years ago (1 children)

Somewhat related, but why are we federating votes? Why not just federate the upvote count and downvote count? Does each server need to track the identity of every voter on a subscribed community?

Each server will track votes from their own users, preventing duplicate votes.
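A minimal sketch of that idea, assuming a hypothetical `LocalTally` type that is not part of lemmy_server: each instance remembers only its own users' votes, so a repeat vote replaces the earlier one, and only the aggregate counts would be shared with other instances.

```rust
use std::collections::HashMap;

/// Hypothetical per-post tally kept by one instance for its own users.
#[derive(Default)]
struct LocalTally {
    /// Latest vote of each local user (+1 or -1), keyed by local user id.
    votes: HashMap<u64, i8>,
}

impl LocalTally {
    /// Records a vote; a repeat vote by the same user replaces the old one,
    /// so duplicates never inflate the counts.
    fn vote(&mut self, user_id: u64, score: i8) {
        self.votes.insert(user_id, score);
    }

    /// The (upvotes, downvotes) pair that would be federated in place of
    /// individual votes.
    fn counts(&self) -> (u64, u64) {
        let up = self.votes.values().filter(|&&s| s > 0).count() as u64;
        let down = self.votes.values().filter(|&&s| s < 0).count() as u64;
        (up, down)
    }
}

fn main() {
    let mut tally = LocalTally::default();
    tally.vote(1, 1);
    tally.vote(1, 1); // duplicate vote, counted once
    tally.vote(2, -1);
    let (up, down) = tally.counts();
    println!("upvotes: {up}, downvotes: {down}");
}
```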

[–] RoundSparrow@lemmy.ml 1 points 2 years ago

"Why not just federate the upvote count and downvote count?"

I think the answer to that is that it isn't an optimized design.

"Does each server need to track the identity of every voter on a subscribed community?"

I think so. It isn't a terrible assumption that a user who votes will eventually comment/post, and that the profile will then be of use.

[–] RoundSparrow@lemmy.ml 0 points 2 years ago* (last edited 2 years ago)

prototype pull

pull request for prototype: https://github.com/LemmyNet/lemmy/pull/3475

The environment variable LEMMY_SKIP_FEDERATE_VOTES is as good a way as any to reference this code hack.
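As a rough illustration of how such a switch could be read, here is a sketch using only the Rust standard library. The helper names `parse_flag` and `should_federate_vote`, and the rule that "0", "false", or an empty value mean "off", are assumptions for this example; the pull request may interpret the variable differently.

```rust
use std::env;

/// Interprets an optional environment value as a boolean flag.
/// Assumed convention: unset, empty, "0", or "false" mean disabled;
/// any other value means enabled.
fn parse_flag(value: Option<&str>) -> bool {
    match value {
        Some(raw) => {
            let v = raw.trim().to_ascii_lowercase();
            !v.is_empty() && v != "0" && v != "false"
        }
        None => false,
    }
}

/// Hypothetical gate: outbound vote federation is skipped when
/// LEMMY_SKIP_FEDERATE_VOTES is enabled.
fn should_federate_vote() -> bool {
    let value = env::var("LEMMY_SKIP_FEDERATE_VOTES").ok();
    !parse_flag(value.as_deref())
}

fn main() {
    if should_federate_vote() {
        println!("votes would be federated outbound");
    } else {
        println!("outbound vote federation skipped");
    }
}
```

Reading the variable once at startup and caching the result would avoid a lookup on every vote, at the cost of needing a restart to change the setting.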
