comment by rob05c
rob05c  ·  3439 days ago  ·  link  ·    ·  parent  ·  post: Hubski dev update: SQL ready to deploy next weekend

I'm predicting no effect until we change more things to load dynamically, because it still loads all publications on startup. So right now, once everything is loaded, the only difference will be the ctags, which I made load dynamically. Those are only loaded when an individual story is displayed, and saved when a user adds a community tag.

And in fact, it already loads less. The ctags used to be part of the pub, loaded from file, the full list of ctags. The only data actually needed to display a story is, "has the current user created a ctag for this publication?" So, instead of loading all ctags, it queries only that, "select count(1) from publication_community_tagses where id = $1 and username = $2;" I think it will be faster, but insignificantly so.
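
A rough sketch of that check from application code, for illustration only. It uses Python's built-in sqlite3 as a stand-in for the real database, so the placeholders are `?` rather than `$1` and `$2`. The function name `user_has_ctag`, the `tag` column, and the sample rows are all made up; only the table name and the `id` and `username` columns come from the query above.

```python
import sqlite3


def user_has_ctag(conn, pub_id, username):
    """Return True if `username` has a community tag on publication `pub_id`.

    Hypothetical helper: runs the count query instead of loading every ctag.
    """
    row = conn.execute(
        "select count(1) from publication_community_tagses "
        "where id = ? and username = ?;",
        (pub_id, username),
    ).fetchone()
    return row[0] > 0


# In-memory stand-in database; the `tag` column and these rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table publication_community_tagses "
    "(id integer, username text, tag text)"
)
conn.executemany(
    "insert into publication_community_tagses values (?, ?, ?)",
    [(1, "rob05c", "programming"), (1, "beezneez", "sql"), (2, "rob05c", "dev")],
)

print(user_has_ctag(conn, 1, "rob05c"))
print(user_has_ctag(conn, 2, "beezneez"))
```

The point is just that the page asks one yes/no question per story instead of pulling the whole tag list.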

For the same amount of data, in theory, SQL should be slower than raw disk reading. But right now, we have to read and parse the full pub, for every pub. So I think it will be faster once everything is changed to fetch only the data that the currently requested page needs. More importantly, it will scale far better, since the work per request no longer depends on loading every pub.

If not, memcached. If hubski had significantly more traffic, memcached would be essential. But I think hubski typically has <1 request/second (can you verify?). If hubski were getting say, 1k requests/second, there's no way a dozen SQL queries for each request would be fast enough. We'll just have to see.
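
To make the memcached idea concrete, here is a minimal cache-aside sketch. It is not how hubski does or would do it: a plain dict stands in for memcached, no real memcached client API is used, and `CachedLookup`, `query_fn`, and `invalidate` are names invented for the example.

```python
class CachedLookup:
    """Cache-aside wrapper: check the cache first, fall back to the query."""

    def __init__(self, query_fn):
        self.query_fn = query_fn  # hypothetical function that hits the database
        self.cache = {}           # plain dict standing in for memcached
        self.misses = 0

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Otherwise run the query once and remember the result.
        self.misses += 1
        value = self.query_fn(key)
        self.cache[key] = value
        return value

    def invalidate(self, key):
        # Call this when the underlying row changes, e.g. a user adds a ctag.
        self.cache.pop(key, None)


calls = []


def slow_query(key):
    # Stand-in for a SQL query; records each call so we can count them.
    calls.append(key)
    return key * 2


lookup = CachedLookup(slow_query)
lookup.get(3)
lookup.get(3)  # second call is served from the cache
print(len(calls))
```

With something like this, repeated requests for the same data cost one query instead of a dozen, at the price of having to invalidate entries on writes.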





beezneez  ·  3438 days ago  ·  link  ·  

    The ctags used to be part of the pub, loaded from file, the full list of ctags.

Sounds dangerous. At this point it would make sense for hubski to run on sqlite, but do you really foresee hubski requiring 1k requests per second in the near future? What 'kind' of file was it loading from beforehand?

rob05c  ·  3438 days ago  ·  link  ·  

    The ctags used to be part of the pub, loaded from file, the full list of ctags.

    Sounds dangerous.

Not really; just slow.

    it would make sense for hubski to run on sqlite

I think hubski would be fine on SQLite, but PostgreSQL doesn't hurt, and it helps with a lot of things, like Elasticsearch and backups.

    do you really foresee hubski requiring 1k requests per second

No. But it would be nice if we handled the Reddit influxes well.

    What 'kind' of file was it loading from beforehand?

Serialised s-expressions. Still is, just without the community tags key.