The latest Mastodon single-person instance performance/scaling issue:
350+ database queries, each taking over 2 seconds, basically every time I post something.
Very odd.
(In case you wonder why you can’t reach my instance after I post sometimes.)
The lovely @jan is on it and opened an issue (https://github.com/mastodon/mastodon/issues/25949) but there doesn’t seem to be much progress yet.
Looks like my account is a good canary in the coal mine for this stuff :)
@aral @jan It is insanely bad query programming to have 3.5k IDs in a WHERE clause. Assuming the table is properly indexed and vacuumed, I would split them into microbatches of 100-200, and if each batch involves pinging a separate follower's server, make those batches even smaller. LIMIT and OFFSET are the starting point on the database side, followed by some task queueing on the backend (rough sketch below).
#postgresql #sql #performance
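A minimal sketch of what that LIMIT/OFFSET microbatching could look like on the PostgreSQL side. The table and column names are assumptions for illustration, not Mastodon's actual schema:

```sql
-- What the post above criticises (illustrative, not the real Mastodon query):
--   SELECT ... FROM follows WHERE account_id IN (id1, id2, /* ... 3500 ids ... */);

-- Batched alternative: walk the followers in fixed-size chunks.
SELECT follower_account_id
FROM   follows
WHERE  target_account_id = 12345          -- the account that just posted
ORDER  BY follower_account_id
LIMIT  200 OFFSET 0;                      -- next batches: OFFSET 200, 400, ...

-- Each 200-row batch can then be handed to a background job for delivery,
-- instead of one giant statement doing everything at once.
```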
@gytis @jan In their defense, hindsight is always 20/20 and we all learn about what we’re building as we’re building it. It might have made sense at some point (or, at least, wasn’t a hindrance). What matters is we fix things when they become issues. I’d rather that than premature optimisation. The latter is why things never get shipped and/or why we keep shipping over-complicated things. (All within reason, ofc. Not saying we shouldn’t keep learning or follow good practices whenever we can.)
@aral @jan Yet it is often a problem when developers use the database just to persist data, without proper guidance from database experts. Maybe that's not the case with Mastodon (although the query mentioned above suggests otherwise), but I've seen far too many projects where devs just do "dump all" and "select all" with a terribly narrow understanding of "if it does not perform, just add another index".