Recently there’s been a bit of buzz about a piece of software called Mastodon. There’s a standard called OStatus, which builds atop Atom to provide a federated microblogging community, and Mastodon is an implementation of this standard. It’s exploded a bit recently, and that’s what got me looking into it. I wanted to set up a copy for myself to experiment with, but I decided I couldn’t be bothered setting up a whole Rails environment to give it a spin. Much easier to write my own version from scratch, right? Well, no. But it was more fun.
If you’d like to check it out, it’s hosted at don.fknsrs.biz, with source code at fknsrs.biz/p/don. If you search for a user, it’ll auto-subscribe to that user. Feel free to add anyone you like!
OStatus Standard and Protocols
The concepts behind OStatus are really neat - it ties together several existing protocols (Atom, PubSubHubbub, WebFinger, and Salmon, among others), resulting in what seems to be a pretty sensible network design. The protocols it builds on are all quite simple in isolation, and luckily OStatus doesn’t add too much complexity to the mix itself.
Acct URIs
The first thing to learn about is the acct URI scheme. It’s defined in RFC 7565 and used by WebFinger (RFC 7033), and is basically a standardisation of the classic username@host syntax. For example, I have a twitter account, and as an acct URI, I could refer to it as acct:deoxxa@twitter.com. The significant thing about this is that, according to the WebFinger specification, you can use this URI to look up more information if the service it’s associated with operates an appropriate WebFinger endpoint.
Now, twitter doesn’t host a WebFinger endpoint, so we can’t use that URI for anything, but I also have a mastodon.network account - and they do provide a WebFinger endpoint. We can call that account acct:deoxxa@mastodon.network. This is the account that I’ll be using as an example.
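As a rough illustration of the syntax, here is a minimal Python sketch that splits an acct URI into its user and host parts. The `Acct` type and `parse_acct` function are names made up for this example, and it accepts a bare user@host as a convenience, which is an assumption rather than something the specification asks for.

```python
from typing import NamedTuple


class Acct(NamedTuple):
    user: str
    host: str


def parse_acct(uri: str) -> Acct:
    """Split 'acct:user@host' (or a bare 'user@host') into its two parts."""
    # The scheme prefix is treated as optional so bare user@host input works too.
    if uri.lower().startswith("acct:"):
        uri = uri[len("acct:"):]
    # Split on the last '@' so the host is always the final component.
    user, sep, host = uri.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"not an acct URI: {uri!r}")
    return Acct(user=user, host=host.lower())


print(parse_acct("acct:deoxxa@mastodon.network"))
```

The host part is the bit that matters for the next step, since that is where the WebFinger lookup gets sent.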
To get started, you have to find where the WebFinger endpoint is! There are actually two ways to do this. The first is just to slap /.well-known/webfinger onto the end of the host portion of the acct URI. This is actually standardised, so it should work nearly all the time, and you should probably do this first. However! It’s totally possible for someone to host a WebFinger endpoint at any location, and if you want to be able to handle that, you have to know a teeny bit about “Link-based Resource Descriptor Documents”, or LRDD for short. You can fall back to this if, for some reason, someone hosts their WebFinger endpoint at a non-standard path.
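The first approach can be sketched in a few lines of Python. The function name `well_known_webfinger_url` is invented for this example, and it assumes HTTPS and that the account is passed in as a full acct URI.

```python
from urllib.parse import urlencode


def well_known_webfinger_url(acct_uri: str) -> str:
    """Build the standard WebFinger lookup URL for an acct URI."""
    # Everything after the last '@' is the host we need to ask.
    host = acct_uri.rpartition("@")[2]
    # urlencode percent-encodes the ':' and '@' in the resource parameter.
    query = urlencode({"resource": acct_uri})
    return f"https://{host}/.well-known/webfinger?{query}"


print(well_known_webfinger_url("acct:deoxxa@mastodon.network"))
```

If a request to that URL fails, that is the point to fall back to the host metadata lookup.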
Web Host Metadata
So, LRDD. There’s a “Web Host Metadata” specification in the form of RFC 6415 that details a way to request metadata about a web host that you can use to find additional information. Short version: request /.well-known/host-meta, via HTTPS, from the host you’d like to know about. This path (and scheme) is part of the protocol, so you can safely hard-code it. If we use mastodon.network as an example, their metadata looks like the following.
<?xml version="1.0"?>
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Link rel="lrdd" type="application/xrd+xml" template="https://mastodon.network/.well-known/webfinger?resource={uri}"/>
</XRD>
Hooray XML. Get used to it, because literally all these protocols are XML. Right there, look, it says “webfinger” - that must be it! Well, yes. It is, but you can’t just assume that it’s always going to have the string “webfinger” in the URL. It could technically be anything, and there could be many more Link elements depending on the service. What you have to do is parse the document, then find the Link element where the rel attribute is lrdd. This element will have a template attribute, which is a URI Template (as specified in RFC 6570). In this template, the {uri} bit is where you put the acct URI when you want to make a request.
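Here is a minimal sketch of that lookup in Python, using only the standard library. The function names `lrdd_template` and `expand_template` are invented for this example, and the expansion only handles the plain {uri} variable rather than the whole of RFC 6570.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

XRD_NS = {"xrd": "http://docs.oasis-open.org/ns/xri/xrd-1.0"}

# The host-meta document from mastodon.network, as shown above.
HOST_META = """<?xml version="1.0"?>
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Link rel="lrdd" type="application/xrd+xml" template="https://mastodon.network/.well-known/webfinger?resource={uri}"/>
</XRD>"""


def lrdd_template(host_meta_xml: str) -> str:
    """Return the template of the first Link element whose rel is 'lrdd'."""
    root = ET.fromstring(host_meta_xml)
    for link in root.findall("xrd:Link", XRD_NS):
        if link.get("rel") == "lrdd" and link.get("template"):
            return link.get("template")
    raise LookupError("no lrdd link with a template in host-meta")


def expand_template(template: str, acct_uri: str) -> str:
    """Fill in {uri}, percent-encoding everything that isn't unreserved."""
    return template.replace("{uri}", quote(acct_uri, safe=""))


template = lrdd_template(HOST_META)
print(expand_template(template, "acct:deoxxa@mastodon.network"))
```

Matching on the rel attribute rather than on the URL text is the whole point: the template could point anywhere.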
WebFinger
So, tying this all together, we end up with a URL of https://mastodon.network/.well-known/webfinger?resource=deoxxa@mastodon.network. You probably won’t be able to just open that in your browser - your browser will request HTML, and the server won’t supply it, so you’ll get a blank page. Use cURL or something instead if you’d like to give it a try! It should look something like the following.
<?xml version="1.0"?>
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>acct:deoxxa@mastodon.network</Subject>
  <Alias>https://mastodon.network/@deoxxa</Alias>
  <Link rel="http://webfinger.net/rel/profile-page" type="text/html" href="https://mastodon.network/@deoxxa"/>
  <Link rel="http://schemas.google.com/g/2010#updates-from" type="application/atom+xml" href="https://mastodon.network/users/deoxxa.atom"/>
  <Link rel="salmon" href="https://mastodon.network/api/salmon/4714"/>
  <Link rel="magic-public-key" href="..."/>
  <Link rel="http://ostatus.org/schema/1.0/subscribe" template="https://mastodon.network/authorize_follow?acct={uri}"/>
</XRD>
This is an XRD document - an “Extensible Resource Descriptor”. Apparently that makes me a resource. But don’t laugh too soon, because if you have a mastodon account, you are a resource too. So we’re in this together. Probably. Huddle tight for warmth.
So, now that we’ve got my user metadata, we can look at these different links. What are they? There are so many of them! Computers are awesome. The one we’re interested in at the moment is my Atom feed - it’ll always be located in a Link element with a rel of http://schemas.google.com/g/2010#updates-from. This doesn’t mean my updates are hosted on Google or anything - it’s just that Google initially defined that rel value for one of their products, then a few clients started using it, then everyone else started using it for compatibility’s sake. You can safely treat it as an opaque value.
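Pulling the feed URL out of the XRD document could look something like this in Python. It is a sketch only: `find_link_href` is a name invented for this example, and the sample document is a trimmed copy of the response above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XRD_NS = {"xrd": "http://docs.oasis-open.org/ns/xri/xrd-1.0"}
UPDATES_FROM = "http://schemas.google.com/g/2010#updates-from"

# Trimmed copy of the WebFinger response shown above.
WEBFINGER_XRD = """<?xml version="1.0"?>
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>acct:deoxxa@mastodon.network</Subject>
  <Link rel="http://webfinger.net/rel/profile-page" type="text/html" href="https://mastodon.network/@deoxxa"/>
  <Link rel="http://schemas.google.com/g/2010#updates-from" type="application/atom+xml" href="https://mastodon.network/users/deoxxa.atom"/>
  <Link rel="salmon" href="https://mastodon.network/api/salmon/4714"/>
</XRD>"""


def find_link_href(xrd_xml: str, rel: str) -> Optional[str]:
    """Return the href of the first Link with the given rel, or None."""
    root = ET.fromstring(xrd_xml)
    for link in root.findall("xrd:Link", XRD_NS):
        # The rel value is compared as an opaque string, nothing more.
        if link.get("rel") == rel:
            return link.get("href")
    return None


print(find_link_href(WEBFINGER_XRD, UPDATES_FROM))
```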
Atom
According to the document up there, my update feed URL is https://mastodon.network/users/deoxxa.atom. Go take a look at it if you want! It should display fine in your browser, but it’ll be a bit ugly. I’ve copied it below with some highlighting.
<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:activity="http://activitystrea.ms/spec/1.0/" xmlns:poco="http://portablecontacts.net/spec/1.0" xmlns:media="http://purl.org/syndication/atommedia" xmlns:ostatus="http://ostatus.org/schema/1.0" xmlns:mastodon="http://mastodon.social/schema/1.0">
  <id>https://mastodon.network/users/deoxxa.atom</id>
  <title/>
  <updated>2017-04-06T12:11:01Z</updated>
  <logo>https://mastodon.network/system/accounts/avatars/000/004/714/original/f5ae14273acf55ef.jpg?1491480661</logo>
  <author>
    <id>https://mastodon.network/users/deoxxa</id>
    <activity:object-type>http://activitystrea.ms/schema/1.0/person</activity:object-type>
    <uri>https://mastodon.network/users/deoxxa</uri>
    <name>deoxxa</name>
    <email>deoxxa@mastodon.network</email>
    <link rel="alternate" type="text/html" href="https://mastodon.network/@deoxxa"/>
    <link rel="avatar" type="image/jpeg" media:width="120" media:height="120" href="https://mastodon.network/system/accounts/avatars/000/004/714/original/f5ae14273acf55ef.jpg?1491480661"/>
    <link rel="header" type="" media:width="700" media:height="335" href="https://mastodon.network/headers/original/missing.png"/>
    <poco:preferredUsername>deoxxa</poco:preferredUsername>
    <mastodon:scope>public</mastodon:scope>
  </author>
  <link rel="alternate" type="text/html" href="https://mastodon.network/@deoxxa"/>
  <link rel="self" type="application/atom+xml" href="https://mastodon.network/users/deoxxa.atom"/>
  <link rel="hub" href="https://mastodon.network/api/push"/>
  <link rel="salmon" href="https://mastodon.network/api/salmon/4714"/>
  <entry>
    <id>tag:mastodon.network,2017-04-06:objectId=28554:objectType=Status</id>
    <published>2017-04-06T12:10:34Z</published>
    <updated>2017-04-06T12:10:34Z</updated>
    <title>In the beginning...</title>
    <content type="html">&lt;p&gt;In the beginning...&lt;/p&gt;</content>
    <activity:verb>http://activitystrea.ms/schema/1.0/post</activity:verb>
    <link rel="self" type="application/atom+xml" href="https://mastodon.network/users/deoxxa/updates/2810.atom"/>
    <link rel="alternate" type="text/html" href="https://mastodon.network/users/deoxxa/updates/2810"/>
    <activity:object-type>http://activitystrea.ms/schema/1.0/note</activity:object-type>
    <link rel="mentioned" href="http://activityschema.org/collection/public" ostatus:object-type="http://activitystrea.ms/schema/1.0/collection"/>
    <mastodon:scope>public</mastodon:scope>
  </entry>
</feed>
Arrrrggghhhh more XML!
Alright. There’s a lot there. Let’s start from the top. This is a plain old Atom feed - the kind you’d shove in your RSS reader, if it was 2009. I actually kind of miss 2009. First up we can see that the feed has a logo - that’s my avatar. That same image is included later on as an avatar link in the author section. Right at the bottom, there’s an entry element. Normally there’d be several of these, but I’ve only made one post on that account. It’s a good thing too, since that’s already far too much XML for one day.
The interesting part for us right now is actually in the middle - the link elements. You can see one where rel is set to hub - this is a PubSubHubbub server that we can use to subscribe to updates! If we ask it politely, we can have it notify us of every new post that I make. Just what you always wanted - a constant stream of unimportant garbage.
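Finding the hub link is the same trick as before, just in the Atom namespace. Below is a small Python sketch; `feed_links` is a name invented for this example, and the sample feed is cut down from the one above.

```python
import xml.etree.ElementTree as ET
from typing import Dict

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Cut-down version of the feed shown above.
FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <id>https://mastodon.network/users/deoxxa.atom</id>
  <author>
    <name>deoxxa</name>
    <link rel="alternate" type="text/html" href="https://mastodon.network/@deoxxa"/>
  </author>
  <link rel="self" type="application/atom+xml" href="https://mastodon.network/users/deoxxa.atom"/>
  <link rel="hub" href="https://mastodon.network/api/push"/>
  <link rel="salmon" href="https://mastodon.network/api/salmon/4714"/>
</feed>"""


def feed_links(feed_xml: str) -> Dict[str, str]:
    """Map rel -> href for the link elements directly under <feed>."""
    root = ET.fromstring(feed_xml)
    links: Dict[str, str] = {}
    # findall with a bare tag only matches direct children, so the links
    # nested inside <author> and <entry> are left out on purpose.
    for link in root.findall("atom:link", ATOM_NS):
        rel, href = link.get("rel"), link.get("href")
        if rel and href:
            links.setdefault(rel, href)  # keep the first link for each rel
    return links


print(feed_links(FEED)["hub"])
```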
PubSubHubbub
Superfeedr have an excellent article that details exactly how to implement a PubSubHubbub consumer. I’m not going to go into too much detail here.
Firstly, we have to understand the basics of PubSubHubbub. It’s made up of three parts - producers, hubs, and consumers. Producers push events to hubs, consumers receive events from hubs, and hubs act as brokers between the two, forming a fan-out mechanism. Hubs also do some limited redelivery if consumers are unavailable when an event is pushed. This means consumers don’t have to poll for updates, and producers don’t have to implement all that messy retry logic on their own.
A hub is located at a particular URL. A consumer must make a request to that URL to subscribe to a particular “topic”. For our purposes, the topic is just the Atom feed URL. The consumer must also tell the hub where to send events to, when they come in from the producer. This forms the basic handshake.
As we saw before, we can find the hub URL by looking at the link elements in the Atom feed. If we make the right request to this URL, we can ask the hub to notify us of updates as they happen. That request has the following parameters.
hub.callback: the URL you’d like the hub to send events to
hub.mode: this will always be subscribe or unsubscribe
hub.topic: this is the feed URL
hub.verify: this will be sync or async - it determines whether the hub verifies your subscription in-band or out-of-band. This is an advisory parameter. The hub is not obliged to follow it.
hub.lease_seconds: this is how long you’d like to subscribe for, in seconds. This is also advisory. The hub may have a maximum (or minimum) lease time.
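To make the parameters concrete, here is a Python sketch that builds (but doesn’t send) a subscription request as a form-encoded POST. The `subscribe_request` name, the callback URL on don.example, and the one-day default lease are all assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def subscribe_request(hub_url: str, topic_url: str, callback_url: str,
                      lease_seconds: int = 86400) -> Request:
    """Build a form-encoded POST asking the hub to subscribe us to a topic."""
    body = urlencode({
        "hub.callback": callback_url,
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.verify": "async",  # advisory, the hub may ignore it
        "hub.lease_seconds": str(lease_seconds),  # advisory as well
    }).encode("ascii")
    return Request(
        hub_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = subscribe_request(
    "https://mastodon.network/api/push",
    "https://mastodon.network/users/deoxxa.atom",
    "https://don.example/pubsub/abc123",  # hypothetical callback URL
)
print(parse_qs(req.data.decode("ascii")))
```

Passing the result to `urllib.request.urlopen` would actually send it; that step is left out here so the example runs without a network.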
Once you make this request, the hub will immediately make a request to your callback URL to verify your intent. The hub will send a parameter (hub.challenge) that you have to echo back, to confirm to the hub that you did indeed intend to receive updates at that URL. This helps to prevent invalid (either malicious or accidental) subscriptions from being carried out. If you respond correctly to this request, the hub will make your subscription active, and will deliver events to you as they come in, until the subscription expires (as specified in the hub.lease_seconds parameter).
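The verification step boils down to a tiny decision, sketched below in Python as a pure function so it can sit behind whatever HTTP server you like. `verify_response` is a name invented for this example, and it assumes the query string has already been parsed into a flat dict.

```python
from typing import Dict, Tuple


def verify_response(query: Dict[str, str], expected_topic: str) -> Tuple[int, str]:
    """Decide how to answer the hub's verification request.

    Returns an HTTP status code and a response body.
    """
    wanted = (
        query.get("hub.mode") in ("subscribe", "unsubscribe")
        and query.get("hub.topic") == expected_topic
        and "hub.challenge" in query
    )
    if wanted:
        # Echoing the challenge back confirms we really asked for this.
        return 200, query["hub.challenge"]
    # Anything we didn't ask for gets turned away.
    return 404, ""


topic = "https://mastodon.network/users/deoxxa.atom"
print(verify_response(
    {"hub.mode": "subscribe", "hub.topic": topic, "hub.challenge": "xyz"},
    topic,
))
```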
In DON, I generate a unique identifier for each subscription, and that forms part of the callback URL that I send to the hub. This way I can correlate incoming messages with particular subscriptions and feed URLs, without having to rely on the metadata from the hub.
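The idea can be sketched like this in Python. To be clear, this is not DON’s actual code: the `Subscriptions` class, its in-memory dict, and the don.example base URL are all made up to show the shape of the approach.

```python
import secrets
from typing import Dict, Optional


class Subscriptions:
    """Hands out per-subscription callback URLs and remembers their topics."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self.topics: Dict[str, str] = {}

    def new_callback(self, topic_url: str) -> str:
        # A random, URL-safe identifier that is hard to guess.
        sub_id = secrets.token_urlsafe(16)
        self.topics[sub_id] = topic_url
        return f"{self.base_url}/{sub_id}"

    def topic_for(self, callback_path: str) -> Optional[str]:
        # The identifier is the last path segment of the callback URL.
        sub_id = callback_path.rstrip("/").rsplit("/", 1)[-1]
        return self.topics.get(sub_id)


subs = Subscriptions("https://don.example/pubsub")  # hypothetical base URL
callback = subs.new_callback("https://mastodon.network/users/deoxxa.atom")
print(subs.topic_for(callback))
```

A nice side effect of a random identifier is that nobody can push fake updates at a subscription without first guessing its URL.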
Feed Updates
With OStatus, post updates come in the form of Atom feed documents. These documents (usually!) contain author information, and exactly one entry element. You can use the author information to update any locally cached data you have, such as the user’s avatar, display name, etc. The entry will be one of several kinds of activity - it could be a post, a reply, a “share” (kind of like a retweet), or a deletion.
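Telling those kinds apart could look roughly like the Python sketch below. It assumes the kind is carried by the activity:verb element (as in the feed earlier), that a reply is a post with a thr:in-reply-to element, and that a missing verb means a plain post. The `classify` name and those rules are this example’s assumptions, so check them against real feeds.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "activity": "http://activitystrea.ms/spec/1.0/",
    "thr": "http://purl.org/syndication/thread/1.0",
}
VERB_PREFIX = "http://activitystrea.ms/schema/1.0/"

# Cut-down update document based on the entry shown earlier.
UPDATE = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:activity="http://activitystrea.ms/spec/1.0/">
  <entry>
    <id>tag:mastodon.network,2017-04-06:objectId=28554:objectType=Status</id>
    <title>In the beginning...</title>
    <activity:verb>http://activitystrea.ms/schema/1.0/post</activity:verb>
  </entry>
</feed>"""


def classify(update_xml: str) -> str:
    """Return 'post', 'reply', 'share', 'delete', or another verb name."""
    root = ET.fromstring(update_xml)
    entry = root.find("atom:entry", NS)
    if entry is None:
        raise ValueError("update contains no entry")
    # Assumption: an entry without an explicit verb is a plain post.
    verb = entry.findtext("activity:verb", default=VERB_PREFIX + "post", namespaces=NS)
    kind = verb.strip().rsplit("/", 1)[-1]
    # Assumption: a reply is a post that points at what it replies to.
    if kind == "post" and entry.find("thr:in-reply-to", NS) is not None:
        return "reply"
    return kind


print(classify(UPDATE))
```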
In DON, I keep a history of posts for every user I’ve seen in the system. This may change in the future.
All Together
Let’s recap. We started from just an account - deoxxa@mastodon.network. We used the host portion of that account to look up some metadata about the host. That host metadata told us how to find information about an account (my account!) on the service. Then that account metadata told us how to find my updates. Then when we finally fetched my post feed, a link in there told us how to subscribe for notifications. Links on top of links on top of links. Quite neat though, overall. Kind of like clicking through websites.
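The whole chain fits in one short Python function if the HTTP fetching is passed in from outside. This is a sketch under assumptions: `discover` and `_link` are names invented here, `fetch` is any function that takes a URL and returns the body as text, and the canned pages stand in for real responses so the example runs offline.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict
from urllib.parse import quote

XRD = "{http://docs.oasis-open.org/ns/xri/xrd-1.0}"
ATOM = "{http://www.w3.org/2005/Atom}"
UPDATES_FROM = "http://schemas.google.com/g/2010#updates-from"


def _link(root: ET.Element, tag: str, rel: str, attr: str) -> str:
    """Return attr from the first direct child link with the given rel."""
    for link in root.findall(tag):
        if link.get("rel") == rel and link.get(attr):
            return link.get(attr)
    raise LookupError(f"no {rel!r} link found")


def discover(account: str, fetch: Callable[[str], str]) -> Dict[str, str]:
    """Follow host-meta -> WebFinger -> Atom feed and return what we found."""
    host = account.rpartition("@")[2]
    host_meta = ET.fromstring(fetch(f"https://{host}/.well-known/host-meta"))
    template = _link(host_meta, XRD + "Link", "lrdd", "template")
    webfinger_url = template.replace("{uri}", quote("acct:" + account, safe=""))
    xrd = ET.fromstring(fetch(webfinger_url))
    feed_url = _link(xrd, XRD + "Link", UPDATES_FROM, "href")
    feed = ET.fromstring(fetch(feed_url))
    return {"feed": feed_url, "hub": _link(feed, ATOM + "link", "hub", "href")}


# Canned responses standing in for the real network.
PAGES = {
    "https://mastodon.network/.well-known/host-meta":
        '<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">'
        '<Link rel="lrdd" template="https://mastodon.network/.well-known/webfinger?resource={uri}"/>'
        '</XRD>',
    "https://mastodon.network/.well-known/webfinger?resource=acct%3Adeoxxa%40mastodon.network":
        '<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">'
        '<Link rel="http://schemas.google.com/g/2010#updates-from" href="https://mastodon.network/users/deoxxa.atom"/>'
        '</XRD>',
    "https://mastodon.network/users/deoxxa.atom":
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        '<link rel="hub" href="https://mastodon.network/api/push"/>'
        '</feed>',
}

print(discover("deoxxa@mastodon.network", PAGES.__getitem__))
```

Swapping `PAGES.__getitem__` for a function that makes real HTTPS requests is all it would take to run this against a live server.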
That just about wraps up the read side of these protocols. There is quite a bit of semantic behaviour I’ve glossed over for the sake of simplicity, but I’ll go through that when I describe the application logic itself.