So over the past couple of days I moved my PDS hosting from a server on UpCloud to my self-hosted Kubernetes cluster, Phoebe. Not for anything cost-related (UpCloud are actually really cost-effective, and I still run a couple of things on their stack, including the backup storage for Phoebe in their Object Storage offering), but more because I have a crippling Kubernetes addiction and I love self-hosting stuff.
However, there were a couple of gotchas involved when I was trying to build out my setup. So this post is gonna partly document how I host it, and partly cover some of the weird issues I hit along the way.
So the main problems with just going "fuck it, let's run my PDS in Kube" boil down to these three:
1. How do I run it? (there's no official Helm chart for the PDS)
2. How do I handle routing? (K8s is a little painful for wildcard ingress/gateway stuff)
3. How do I cleanly migrate my data? (moving my data from the VPS that hosted it into Kube)
Now, I solved all of those, but we're gonna start by answering question one: how do I run it?
Like I said, the Bluesky team don't publish a Helm chart for the PDS (yet; I may try and submit a PR), but what they do publish is a Docker image. Because of this, we can make use of a wonderful generic deploy-anything Helm chart called app-template, which lets us set up a very quick deployment for the PDS, handles rolling out updates, and makes coordinating things like Services and Gateway HTTPRoutes simple.
First off, let's define our main controller and container. The controller in app-template is basically the unit the chart (and Kubernetes) uses to work out what talks to what, like which Services map to which set of containers.
```yaml
# values.yaml
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/bjw-s-labs/helm-charts/app-template-4.4.0/charts/other/app-template/values.schema.json
controllers:
  pds:
    containers:
      main:
        image:
          repository: ghcr.io/bluesky-social/pds
          tag: 0.4.188
        env:
          PDS_HOSTNAME: pds.<domain>
          PDS_PORT: &port 2583
          PDS_DATA_DIRECTORY: /tmp/data
          PDS_BLOBSTORE_DISK_LOCATION: /tmp/data/blocks
          PDS_BLOB_UPLOAD_LIMIT: '52428800'
          PDS_EMAIL_FROM_ADDRESS: pds@<domain>
          PDS_DID_PLC_URL: https://plc.directory
          PDS_BSKY_APP_VIEW_URL: https://api.pop1.bsky.app
          PDS_BSKY_APP_VIEW_DID: did:web:api.bsky.app
          PDS_REPORT_SERVICE_URL: https://mod.bsky.app
          PDS_REPORT_SERVICE_DID: did:plc:ar7c4by46qjdydhdevvrndac
          PDS_CRAWLERS: https://bsky.network
          LOG_ENABLED: true
          TZ: Europe/London
        envFrom:
          - secretRef:
              name: atproto-pds-secret
        probes:
          liveness: &probes
            enabled: true
            custom: true
            spec:
              httpGet:
                path: /xrpc/_health
                port: *port
              initialDelaySeconds: 0
              periodSeconds: 10
              timeoutSeconds: 1
              failureThreshold: 3
          readiness: *probes
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities: { drop: ["ALL"] }
defaultPodOptions:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
```
Now, this configuration does a few things, but if you've worked with Deployments in Kubernetes before, it should look vaguely familiar. All this is doing currently is:
- Configuring a Deployment for our PDS
- Telling it to use the `ghcr.io/bluesky-social/pds` image at version `0.4.188`
- Giving it environment variables (both inline for non-secret values, and read from a Secret for the sensitive ones)
- Giving it some healthchecks to run so it can automatically replace our container if it stops responding
- Applying some security options to our container so that it runs as non-root and can't escalate itself _to_ root, as well as dropping Linux capabilities that it doesn't need (in the PDS' case, this is _every_ capability)
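One thing the chart won't create for us is the `atproto-pds-secret` it reads via `envFrom`, so that needs to exist in the namespace before the pod starts cleanly. Here's a rough sketch of creating it by hand; the three keys shown are the usual secret values the official installer generates, and the key-generation commands roughly mirror that script, so double-check the exact set you need against the PDS docs:

```sh
# The namespace needs to exist before we can put a secret in it
kubectl create namespace atproto-pds

# Generate the secret values (roughly what the official installer script does)
PDS_JWT_SECRET=$(openssl rand -hex 16)
PDS_ADMIN_PASSWORD=$(openssl rand -hex 16)
PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX=$(openssl ecparam -name secp256k1 -genkey -noout -outform DER | tail -c +8 | head -c 32 | xxd -p -c 32)

# Create the secret the chart's envFrom points at
kubectl create secret generic atproto-pds-secret \
  --namespace atproto-pds \
  --from-literal=PDS_JWT_SECRET="$PDS_JWT_SECRET" \
  --from-literal=PDS_ADMIN_PASSWORD="$PDS_ADMIN_PASSWORD" \
  --from-literal=PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX="$PDS_PLC_ROTATION_KEY_K256_PRIVATE_KEY_HEX"
```

If you manage secrets declaratively instead (SOPS, External Secrets, and so on), the same keys apply; `kubectl` is just the quickest way to follow along.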
We can install this to our cluster by running the following:
```sh
helm repo add bjw-s https://bjw-s-labs.github.io/helm-charts
helm upgrade --install atproto-pds bjw-s/app-template --namespace atproto-pds --create-namespace -f ./values.yaml
```

Now we should see an atproto-pds pod starting up in our cluster. Awesome!
```
❯ kubectl get pods -n atproto-pds
NAME                           READY   STATUS    RESTARTS   AGE
atproto-pds-64cf9cf984-965jr   1/1     Running   0          1m
```

However, we've got no way to _reach_ our PDS yet. So let's go ahead and add a Service to our Helm values, too.
```yaml
# values.yaml
---
controllers:
  ...
defaultPodOptions:
  ...
service:
  pds:
    controller: pds
    ports:
      http:
        port: *port
```

(note: the `*port` here is referencing the `PDS_PORT: &port 2583` we entered earlier. This is a YAML feature called an anchor.)
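If you haven't run into anchors before, here's a tiny standalone illustration (nothing PDS-specific, just the mechanics):

```yaml
# &port attaches an anchor to the value 2583; *port is an alias that reuses it.
env:
  PDS_PORT: &port 2583
service:
  port: *port   # resolves to 2583 when the YAML is parsed
```

Anchors are resolved by the YAML parser before Helm ever sees the values, so the chart just sees `2583` in both places.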
Now we've added a Service, let's update our Helm chart again:
```sh
helm upgrade atproto-pds bjw-s/app-template --namespace atproto-pds -f ./values.yaml
```

Awesome, now you should see a service!
```
❯ kubectl get svc -n atproto-pds
NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
atproto-pds   ClusterIP   10.107.114.16   <none>        2583/TCP   1m
```

Let's try connecting to it to get a response. First, let's get a port forwarded for it.
```
❯ kubectl port-forward service/atproto-pds -n atproto-pds 8080:http
Forwarding from 127.0.0.1:8080 -> 2583
Forwarding from [::1]:8080 -> 2583
```

Now, if we open localhost:8080 in our browser, we should see the PDS welcome page.
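If you'd rather check from the terminal, you can hit the same health endpoint the probes use; it returns the running PDS version as a small JSON blob (the exact shape may vary between versions):

```sh
# With the port-forward still running in another terminal
curl -s http://localhost:8080/xrpc/_health
# Something like: {"version":"0.4.188"}
```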
Cool. Now we've done that, let's close our port forward (hit Ctrl+C in the terminal you started the port-forward command in) and look into persistence.
Currently, if the PDS container restarts or is deleted, all our data is gone with it! Not very handy. To remedy this, let's add a Persistent Volume Claim (PVC) and a volume mount to our setup.
```yaml
# values.yaml
---
controllers:
  pds:
    containers:
      main:
        ...
        env:
          ...
          PDS_DATA_DIRECTORY: /data
          PDS_BLOBSTORE_DISK_LOCATION: /data/blocks
          ...
        ...
persistence:
  data:
    enabled: true
    type: persistentVolumeClaim
    accessMode: ReadWriteOnce
    size: 5Gi
```

Now, thanks to the way that app-template works, because we've defined a persistence item called `data`, it will automatically be mounted at `/data` inside the container, so we don't have to manually tell it to do that.
We've also told the container to use `/data` and `/data/blocks` as its data directory and blob storage location respectively, by setting `PDS_DATA_DIRECTORY` and `PDS_BLOBSTORE_DISK_LOCATION` in the container's environment variables.
The volume is 5 GiB in size (`size: 5Gi`), but you can make that bigger if you want.
Now, when we re-apply the chart, we should have a volume set up and our container should now have a /data directory mounted.
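To sanity-check the storage (the resource names below assume the release name and values from above; app-template derives the PVC name from them, so yours may differ slightly):

```sh
# The PVC should show up as Bound against your default StorageClass
kubectl get pvc -n atproto-pds

# And the /data mount should exist inside the running container
kubectl exec -n atproto-pds deploy/atproto-pds -- ls -la /data
```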
Next, let's set up routing/ingress.
```yaml
---
...
# If you're using Gateway API like me:
route:
  pds:
    annotations:
      external-dns.alpha.kubernetes.io/cloudflare-proxied: "false"
    hostnames:
      - pds.<domain>
      - '*.pds.<domain>'
    parentRefs:
      - name: external
        namespace: network
        sectionName: https
    rules:
      - backendRefs:
          - identifier: pds
            port: *port
```

This sets up a Gateway API HTTPRoute, which is used to route traffic from outside the cluster into the cluster.
Replace the reference in parentRefs with whatever is relevant to your setup.
The annotation tells External DNS to create the DNS record as DNS-only rather than proxied through Cloudflare (if you're not using External DNS with the Cloudflare provider/webhook/etc., just skip that bit).
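For context, here's roughly what a Gateway matching that parentRef could look like. This is a sketch rather than my exact config: the `envoy` gateway class and the `pds-wildcard-tls` secret name are assumptions, and the listener here just terminates TLS and lets routes from any namespace attach:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: external
  namespace: network
spec:
  gatewayClassName: envoy  # whatever GatewayClass your implementation provides
  listeners:
    - name: https          # matched by sectionName: https in the HTTPRoute
      protocol: HTTPS
      port: 443
      allowedRoutes:
        namespaces:
          from: All        # let HTTPRoutes in other namespaces (like atproto-pds) attach
      tls:
        mode: Terminate
        certificateRefs:
          - name: pds-wildcard-tls  # a secret covering pds.<domain> and *.pds.<domain>
```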
If you're not using Gateway API, you can use Ingress instead (there's a rough sketch of that below), though I _strongly_ recommend switching to Gateway API with Envoy Gateway or similar.
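For the Ingress route, app-template has an `ingress:` section that would look something along these lines; the `nginx` class and the cert-manager issuer annotation are placeholders for whatever your cluster actually runs, and it's worth checking the chart's values schema for the exact field names:

```yaml
ingress:
  pds:
    className: nginx
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-dns  # hypothetical issuer name
    hosts:
      - host: pds.<domain>
        paths:
          - path: /
            pathType: Prefix
            service:
              identifier: pds
              port: http
      - host: '*.pds.<domain>'
        paths:
          - path: /
            pathType: Prefix
            service:
              identifier: pds
              port: http
    tls:
      - secretName: pds-wildcard-tls
        hosts:
          - pds.<domain>
          - '*.pds.<domain>'
```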
The only exercise for the reader from here is ensuring your TLS certs will work for pds.<domain> and (crucially) *.pds.<domain>. The second one is a little more annoying, and is the reason I turned off proxying through Cloudflare: most likely because their Universal SSL certificates only cover first-level subdomains, the proxy messes up TLS termination for anything under *.pds.<domain>.
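If you're using cert-manager, one way to cover both names is a single Certificate issued via a DNS-01 capable issuer (HTTP-01 can't issue wildcards); the issuer name here is hypothetical, and the resulting secret should land wherever your Gateway or Ingress expects it:

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: pds-wildcard
  namespace: network  # alongside the Gateway that references the secret
spec:
  secretName: pds-wildcard-tls
  issuerRef:
    name: letsencrypt-dns  # hypothetical DNS-01 ClusterIssuer
    kind: ClusterIssuer
  dnsNames:
    - pds.<domain>
    - '*.pds.<domain>'
```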
Now, finally, we can redeploy the Helm chart and you should have a usable PDS running entirely within Kubernetes!
```sh
helm upgrade atproto-pds bjw-s/app-template --namespace atproto-pds -f ./values.yaml
```

At this point, you should be able to hit pds.<yourdomain> in your browser and be greeted with the same welcome page we saw before. From here, you can use it the same way as if you'd used the installer script; you just need to figure out how to pass it things like the admin password to get your invite codes.
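As an example, creating an invite code is just an admin-authenticated XRPC call (this mirrors what the pdsadmin helper script does), using the admin password from the Kubernetes secret, assuming you stored it there as in the sketch earlier:

```sh
# Pull the admin password back out of the secret
PDS_ADMIN_PASSWORD=$(kubectl get secret atproto-pds-secret -n atproto-pds \
  -o jsonpath='{.data.PDS_ADMIN_PASSWORD}' | base64 -d)

# Ask the PDS for a single-use invite code
curl -s -X POST "https://pds.<domain>/xrpc/com.atproto.server.createInviteCode" \
  -u "admin:$PDS_ADMIN_PASSWORD" \
  -H "Content-Type: application/json" \
  -d '{"useCount": 1}'
```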
Hopefully this ends up being useful for folks, I know it took me a while to get it all working nicely. If you want to see how my setup works in particular, the Flux HelmRelease resource I built is here.
If you enjoyed this, consider supporting me on ko-fi and maybe at some point I'll be able to afford a new node for my on-prem cluster.