By Doron Guttman & Roei Ben-Harush
Webhooks automate workflows by sending data from one app to another on certain events. They require a public URL, which can be a problem for testing or development. Smee.io is a payload delivery service that proxies payloads from the webhook source and transmits them to a locally running app. However, it was designed for GitHub, so customizing it for other services is necessary - here's how we did it.
WIIFM (what's in it for me)
Webhooks are a powerful way to automate workflows and integrate different applications. They allow you to send data from one service to another when certain events happen. For example, you can use webhooks to notify your team on Slack when someone pushes code to GitHub, or when a new issue is created. Slack also uses webhooks to let your application know a user has interacted with a card you sent them on Slack.
However, webhooks have a limitation: they require a public URL that can receive HTTP requests from the webhook source. This means that a webhook cannot be configured to send a message to an endpoint it can't reach, like http://localhost; if you want to test or develop your webhook integration locally, you need some way to expose your local host to the internet.
This is where smee.io comes in handy. Smee.io is a webhook payload delivery service that uses Server-Sent Events (SSE) to proxy payloads from the webhook source, then transmit them to your locally running application. Smee.io is "Made with ♥️ by the Probot team." (and we thank them for it 🙏). It's free, easy to use, and works with any service that supports webhooks - or at least, in theory it should.
Smee.io was designed to work with Probot to enable development of GitHub applications, so it works very well with GitHub webhooks and its UI is somewhat tailored to GitHub (it parses GitHub-specific headers). However, if you want to leverage smee.io for other services that use webhooks, you may hit a few snags.
Specifically, when developing our Slack app, we ran into some of those snags.
In this blog post, I will show how I customized smee.io so we can use it to integrate with Slack webhooks. This should be applicable to other webhook services.
Although smee.io provides a lot of benefits while developing and testing your webhook integrations, especially with GitHub, it does have some downsides, some of which proved to be real blockers for us.
First of all, since smee.io is a free service (thank you again!), it is completely understandable that it would have some limitations. One of them is that there is no guarantee of its availability. Unfortunately, I could not find a service that monitors smee.io, so I cannot provide factual information on how often it is down, but I can tell you that as a team working mostly in the US Eastern time zone, we encounter downtime a lot.
Reservation
In order to create a channel on smee.io, you should point your browser to https://smee.io/new, which will then redirect you to a randomly created channel. That random channel ID would be up to 16 alphanumeric characters. From the smee.io code:
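As a rough illustration (not the verbatim smee.io source), an ID of "up to 16 alphanumeric characters" can be produced by base64-encoding 12 random bytes and stripping the non-alphanumeric characters. The function name `randomChannelId` is made up for this sketch:

```javascript
const crypto = require('crypto');

// Sketch only: 12 random bytes encode to exactly 16 base64 characters.
// Removing '+', '/' and '=' leaves at most 16 alphanumeric characters.
function randomChannelId() {
  return crypto.randomBytes(12).toString('base64').replace(/[+/=]+/g, '');
}

console.log(randomChannelId());
```

Because characters are stripped rather than replaced, the ID is sometimes shorter than 16 characters, which matches the "up to" wording above.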
That channel ID is not saved anywhere, and the channel endpoints (app.get('/:channel'...) and app.post('/:channel'...)) don't care about the channel name itself, which means you can actually pick your own (e.g. https://smee.io/foo). However, anyone who knows your channel ID can listen to messages sent to the channel, so you don't want to pick an easily guessable one.
Smee.io relies on message authentication rather than channel security. This makes a lot of sense, as smee.io has no configuration and all the webhook messages GitHub sends out have a signature header (e.g. X-Hub-Signature-256: sha256=xxxxxxx...). This is actually a good practice, which Slack follows as well (with a different header), but not all services do. It would have been great if you could protect your channel with a key somehow, wouldn't it? Without it, someone can spam or listen to our channel, intentionally or not. A key would also mean that you could give your channels meaningful names, like /slack-integration or /github-app.
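To make "message authentication" concrete, here is a minimal sketch of how a receiving app could check a GitHub-style X-Hub-Signature-256 header: it recomputes the HMAC-SHA256 of the raw body with the shared webhook secret and compares the result in constant time. The function name and the sample secret are assumptions made for this example:

```javascript
const crypto = require('crypto');

// Returns true when `signatureHeader` (e.g. "sha256=ab12...") matches the
// hex HMAC-SHA256 of the raw request body computed with the shared secret.
function verifyGithubSignature(secret, rawBody, signatureHeader) {
  const expected =
    'sha256=' + crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  const a = Buffer.from(expected);
  const b = Buffer.from(String(signatureHeader || ''));
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Note that the check runs over the raw bytes of the request, which is why a proxy that re-serializes the body can break signature verification downstream.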
Smee.io only supports application/json content as the webhook payload. I do not think the limitation was deliberate; rather, smee.io was simply not designed with other content types in mind. This is due to the use of common Express.js body parsers:
Taking Slack webhooks as an example, most events are sent as application/json, but some others, like the interaction events, are sent URL encoded (application/x-www-form-urlencoded). I, personally, don't understand why Slack chose to do so, but it is what it is.
Because smee.io uses the express.urlencoded body parser, when it receives content-type: application/x-www-form-urlencoded it will automatically convert the payload (e.g. key1=Some%20Value&key2=Other%20Value) to JSON:
This is very useful when you want to work with the content, which smee.io does in order to display it on the web UI, but it breaks the forwarding of the message to the clients (as the client expects it to arrive as a URL encoded payload).
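The effect can be reproduced with Node's built-in URLSearchParams, used here only as a stand-in for the parser behind express.urlencoded (for flat keys like these the decoded result is the same):

```javascript
// The example payload from the text, exactly as it arrives on the wire.
const raw = 'key1=Some%20Value&key2=Other%20Value';

// Decode it into a plain object, which is what the forwarded JSON is built from.
const parsed = Object.fromEntries(new URLSearchParams(raw));

console.log(JSON.stringify(parsed));
// {"key1":"Some Value","key2":"Other Value"}
```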
Smee.io is a payload delivery service (PubSub) which receives payloads and forwards them to all subscribers:
This means it automatically responds to the service with 200 OK, even if there are no subscribers:
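The behavior can be sketched with a plain EventEmitter acting as the channel bus. This is an illustration of the pattern, not the smee.io source; `bus` and `publish` are names chosen for the example:

```javascript
const { EventEmitter } = require('events');

const bus = new EventEmitter();

// Hypothetical helper: fan a payload out to whoever listens on the channel
// and report 200 regardless of how many listeners there were.
function publish(channel, payload) {
  const delivered = bus.listenerCount(channel);
  bus.emit(channel, payload);
  return { status: 200, delivered };
}

// Nobody is subscribed to "foo" yet, but the sender still sees success.
console.log(publish('foo', { hello: 'world' }).status); // 200
```

The sender cannot tell a delivered message from a dropped one, and it never gets a response crafted by the local app.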
Unfortunately, with Slack, there's an ownership verification phase, in which you have to respond with a challenge. With a payload delivery service, like Smee.io, there is no way to verify the endpoint, which means you can't use it for Slack webhooks. Note this verification only happens once, when you configure the service.
Smee.io does not support path forwarding. For example, if your webhook service includes some of the important information in the path (e.g. https://smee.io/foo/:type, where :type is the type of payload/event), it will not work with smee.io, because smee.io does not register a route for the part of the path that follows /:channel. The service will actually receive a 404 Cannot POST /foo/:type.
Public IP / Port Forwarding
Public IPs can be costly and hard to manage. In addition, they require a different configuration (and sometimes an entirely new application instance) for each developer. Your app has to be deployed out in the public, as you'll need a public connection to your dev box. You can combine a public IP solution with port forwarding and trade off some of the complexity, but you will probably have to ask your IT for help with this every time there's a change.
Tunneling Service
Tunneling services can be considered as a solution in some cases. Services like ngrok, frp, localtunnel and sish create a public endpoint that tunnels communication to your local endpoint via a tunnel client.
This is great when you need to craft a special response to the webhook service, for example the Slack ownership verification mentioned above. In addition, tunneling solutions support path forwarding out of the box, as they tunnel the full request.
Tunneling services have downsides too. For example, it is a 1:1 relationship between a tunnel endpoint and a local app (there can be only one local app listening to a public endpoint). When using HTTP webhooks, for every webhook message (request) there must be a response.
In our case, we have multiple developers working on the integration at the same time; this means that at any given time, while using ngrok we had to edit the webhook configuration and switch it over from one developer to the other, thus "stealing" the connection.
Let's compare the different solutions discussed above:
Given the above, we decided to take Smee.io and customize it to our needs.
My recommended starting point is to fork/clone the smee.io repo and get it working. And by working, I don't mean deploying it, but being able to build and run it locally to the point where you can place a breakpoint in the code and have it pause there. This is necessary in order to customize any code. I actually had some issues doing that with the main branch as it was in commit 3a01759.
I'm using WSL2 and I'm not sure it was related, but I had to replace node-sass@4.14.0 with sass@1.58.3
and upgrade a bunch of packages:
I also had to add the --inspect flag to the start-dev npm script
and declare a newer version of the node engine compatibility in package.json
as well as in the Dockerfile, where I also specified the platform architecture so the image builds on MacBooks too
You can find the final package.json and Dockerfile here.
Now that I had a working local dev environment and was able to successfully build and run the docker image, I was ready to customize the application.
First, the application needs configuration, which will allow us to set a security operation mode and configure the channels. I chose to use the config package as I had good experience with it and it supports cascading config options.
Create the default base configuration:
Later, during deployment, I can mount a config/local.js file into my Docker container and it will override some items in the config/default.js file.
You'll note that the Slack channel configuration defines a handler: 'slack'. You'll see where that comes into play in a bit.
When a URL encoded payload is received, I want to pass it as is to the subscribers. For that, we need to keep the raw content (it will be signed as well in most cases), so I piggybacked on the body parser verifier:
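A minimal sketch of such a verifier is shown below. The `verify(req, res, buf, encoding)` hook is a documented body-parser option; the function name `keepRawBody` and the wiring shown in the comment are assumptions for this example:

```javascript
// Matches the body-parser `verify(req, res, buf, encoding)` hook signature.
// It runs before parsing, so `buf` still holds the untouched request bytes.
function keepRawBody(req, res, buf, encoding) {
  if (buf && buf.length > 0) {
    req.rawBody = buf.toString(encoding || 'utf8');
  }
}

// Assumed wiring inside server.js, where `app` and `express` already exist:
//   app.use(express.json({ verify: keepRawBody }))
//   app.use(express.urlencoded({ extended: true, verify: keepRawBody }))
```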
This saves the original buffer into a new rawBody property based on this wonderful gist (thanks stigok! 🙏). Then we need to forward the raw body to the subscribers:
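One way to do the forwarding, sketched under the assumption that form-encoded requests should reach subscribers byte-for-byte while everything else keeps the parsed body as before. The helper name `bodyToForward` is hypothetical:

```javascript
// Hypothetical helper: choose what to hand to subscribers. Form-encoded
// requests keep their exact original text; everything else keeps the
// parsed body.
function bodyToForward(req) {
  const type = String((req.headers || {})['content-type'] || '');
  const isForm = type.startsWith('application/x-www-form-urlencoded');
  return isForm && typeof req.rawBody === 'string' ? req.rawBody : req.body;
}
```

Keeping JSON requests on the existing path limits the change to the one content type that was actually broken.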
I wanted to touch the original code as little as possible, so it would be easy to merge updates from the upstream repo. This means there should be a minimal footprint for the custom handlers and security and the changes should be encapsulated as much as possible. The best way to do that in the Express.js world is via middleware.
First things first, load the configuration and create a custom middleware installer:
The middleware will first take care of the /new route and block it if not in open mode:
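As a sketch of that first check, assuming the mode is passed in as a string and that a 403 is an acceptable response (the factory name `guardNewChannel` and the status code are choices made for this example):

```javascript
// Hypothetical factory: returns Express-style middleware that rejects
// requests to /new unless the configured mode is 'open'.
function guardNewChannel(mode) {
  return function (req, res, next) {
    if (req.path === '/new' && mode !== 'open') {
      res.status(403).send('Channel creation is disabled');
      return;
    }
    next(); // anything else continues down the middleware chain
  };
}
```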
Next we need to handle the /:channel routes:
To simplify the password protection, we can embed the password in the channel name (e.g. http://mysmee.io/foo:password). That way we will be able to support all (I hope) webhook services, even those that do not support custom headers or query params. To resolve the password:
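A minimal sketch of that split, assuming the first colon separates the channel name from the password (the function name `splitChannel` is hypothetical):

```javascript
// Split "foo:password" into its parts. The password is everything after
// the first colon, so it may itself contain colons.
function splitChannel(param) {
  const index = param.indexOf(':');
  if (index === -1) return { channel: param, password: undefined };
  return { channel: param.slice(0, index), password: param.slice(index + 1) };
}

console.log(splitChannel('foo:password'));
// { channel: 'foo', password: 'password' }
```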
Resolve the channel options from configuration:
Resolve custom handler from channel options:
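The resolution step could look roughly like the sketch below. The function name, the `load` parameter (injected so the lookup can be exercised without real files), and resolving `handlers` relative to the working directory are all assumptions:

```javascript
const path = require('path');

// Hypothetical resolver: load ./handlers/<name> and bind the connection
// params to its default export; return null when no handler applies.
function resolveHandler(options, params, load = require) {
  if (!options || !options.handler) return null;
  let mod;
  try {
    mod = load(path.resolve('handlers', options.handler));
  } catch (err) {
    // A missing module just means "no handler"; anything else is a real bug.
    if (err.code !== 'MODULE_NOT_FOUND') throw err;
    return null;
  }
  const fn = typeof mod === 'function' ? mod : mod && mod.default;
  return typeof fn === 'function' ? fn.bind(null, params) : null;
}
```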
What this means is that if a handler is configured, the code will try to load it from the /handlers folder. For example, in the configuration shown above the Slack channel had handler: 'slack' configured, which means that the code will try to resolve the module from ./handlers/slack; if it finds it, it will bind the connection params to the default export method and later execute it if it passes all the conditions.
Note that with node, loaded modules are cached, so it actually only loads it once while the application is running.
Moving on... based on the configured mode, apply conditions:
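A sketch of one possible set of conditions is shown below. The policy itself is an assumption: 'open' admits everything, and any other mode admits only configured channels, checking the password when one is set. The function name and status codes are illustrative:

```javascript
// Hypothetical policy: 'open' admits everything; any other mode admits only
// configured channels and, when one has a password, only a matching one.
function checkAccess(mode, channelOptions, password) {
  if (mode === 'open') return { allowed: true };
  if (!channelOptions) return { allowed: false, status: 404 };
  if (channelOptions.password && channelOptions.password !== password) {
    return { allowed: false, status: 401 };
  }
  return { allowed: true };
}
```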
And here's the special Slack handler:
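In sketch form, such a handler only needs to answer Slack's one-time url_verification request and leave everything else to the normal channel flow. Slack sends a JSON body with type: 'url_verification' and a challenge value, and accepts the challenge echoed back as JSON. The handler signature (params, req, res) and the boolean return convention are assumptions for this example:

```javascript
// handlers/slack.js (hypothetical). Returns true when the request was
// fully answered here and must not be forwarded to subscribers.
function slackHandler(params, req, res) {
  const body = req.body || {};
  if (body.type === 'url_verification') {
    // Echo the challenge back so Slack accepts the endpoint.
    res.status(200).json({ challenge: body.challenge });
    return true;
  }
  return false; // let the regular channel flow forward the event
}

module.exports = slackHandler;
```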
With this custom handler, our flow would look like this:
Put it all together with error handling:
Install the Middleware
In the server.js file, we'll import the middleware:
and use it as the last middleware before the endpoint registrations:
Well, that will require a change in the smee-client as well. So maybe a follow-up?
We decided to deploy on our AWS ECS. To simplify this, we created a deploy.sh script. You're welcome to use it as well, though it is outside the scope of this post.