Run JavaScript on Cloudflare with Workers (blog.cloudflare.com)
249 points by jgrahamc a year ago | 185 comments

Hi folks! This is my project at Cloudflare -- which, entirely by coincidence, I joined exactly one year ago today: https://news.ycombinator.com/item?id=13860027

Happy to answer any questions.

This is awesome.

I guess the uses are endless. Debug some webhook endpoints by duplicating the request to a private requestbin - granted this isn't the right level to do this but sometimes we're working with legacy setups. Add missing JSON fields that an API broke for some client - again not the ideal level to handle it but sometimes we band-aid it. Send an API auth spammer gigs of random garbage because chances are their shitty client doesn't have any limitations - granted after 14.9 seconds you'd need to subrequest to your own garbage-generating origin - or a 10 GB garbage file somewhere.

My uses would mainly be live debugging where we don't have a perfect stack (i.e. nearly all the time).

I hate to say this, but you can send more garbage than that. Subrequests have to be begun within 15 seconds, but they themselves can stream data for far longer. We actually support the TransformStream API to allow you to stream your garbage back and not keep it in memory as well.

Beyond that, we support the WebCrypto API, so you are welcome to generate a near infinite amount of cryptographically secure random garbage, and we'll even help out with some from our lava lamps [1].

1- https://www.fastcodesign.com/90137157/the-hardest-working-of...
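For anyone curious what the WebCrypto part might look like, here is a rough sketch rather than anything official. It assumes a global `crypto` object as in browsers and Workers; `randomBytes` and `MAX_PER_CALL` are names made up for illustration. The only real API used is `crypto.getRandomValues()`, which refuses to fill more than 65,536 bytes per call, hence the slicing.

```javascript
// getRandomValues() throws if asked for more than 65,536 bytes at once,
// so larger buffers are filled one slice at a time.
const MAX_PER_CALL = 65536;

// Hypothetical helper: returns `length` cryptographically secure random bytes.
function randomBytes(length) {
  const out = new Uint8Array(length);
  for (let offset = 0; offset < length; offset += MAX_PER_CALL) {
    // subarray() is a view onto the same buffer and clamps the end index.
    crypto.getRandomValues(out.subarray(offset, offset + MAX_PER_CALL));
  }
  return out;
}
```

Calling `randomBytes(100000)` should give back a 100,000-byte `Uint8Array`; to stream it rather than buffer it, the chunks could be written into a stream one slice at a time.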

I can't even load the page, so you're miles ahead of me!

If anyone at Cloudflare cares to take a look, here's what I get:

> Server Error

> Our network is literally on fire

> Ray ID: 3fb0bf524995797f

> Your IP address:

> Error reference number: 502

> Cloudflare Location: Seattle

2nd try had an Error reference number of 524, with everything else being the same.

Shenanigans like this are why I stopped using Cloudflare on my websites. Better to have something I can debug, and to avoid having a third party MitM all my connections, than to gain the fleeting bit of "reliability" or "security" that Cloudflare supposedly offers.

I believe "Our network is literally on fire" is an error from Ghost.io, the platform we use to host our blog.

Shouldn't your edge be caching this static article? I don't see why each page needs to be custom generated...

The FAQ says that these are still in beta and not production ready - is that still true or has that page not been updated with this announcement?

Also, it says 50ms of CPU time, 15 seconds of real time as limits. When developing, are there easy ways to get measurements of how long things are taking? I can imagine 50ms on my MBP may not be the same as in production - either faster or slower - but I wouldn't want to get to production to find out.


> The FAQ says that these are still in beta and not production ready

Oops! That's outdated. Let me go fix that right now...

Workers are already being used in production by customers large and small today.

To answer the second part of your question, we are working on better ways to expose the resource usage of your Workers. That said, the vast majority of Workers consume less than a single ms of CPU time, so it is almost never a concern.

Hey, congratulations! Definitely going to play with this this weekend.

What are my options if I build something complicated using these workers and for whatever reason I need to stop using Cloudflare? Or if I want to write something open-source that people can deploy either on Cloudflare or on-premise? Is there a reasonable way to emulate this functionality (possibly with less sandboxing) on a self-hosted web server?

Also BTW you should update https://cloudflareworkers.com to note that it's live :)

> What are my options if I build something complicated using these workers and for whatever reason I need to stop using Cloudflare?

Great question. We chose to implement the W3C standard Service Workers API in part to give people some flexibility here. E.g. depending on what your Worker does and who your clients are, it might be possible to push your same Worker code to the browser as a regular Service Worker. My hope is that other services choose to implement W3C standard APIs in the future, rather than everyone doing something bespoke as they do today.

I believe there are also some Node.js-based Service Worker harnesses designed for server-side use, though admittedly I haven't tried any myself. It would be cool to see such things developed further.
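To make the portability point concrete, here is a minimal sketch of a Service Worker-style script. The `fetch` event, `event.respondWith()`, `fetch()` and `Response` are the standard APIs; `tagFor`, `handleRequest` and the `X-Handled-By` header are invented for illustration. The registration is guarded so the same file can also be loaded somewhere that has no Service Worker global scope.

```javascript
// Hypothetical helper: classify a request by its path. Kept separate from
// the event handler so it can be exercised outside a Worker.
function tagFor(url) {
  const { pathname } = new URL(url);
  return pathname.startsWith('/api/') ? 'api' : 'site';
}

async function handleRequest(request) {
  // Pass the request through unchanged and wait for the response.
  const response = await fetch(request);
  // Headers on a fetched response are immutable, so copy the response first.
  const modified = new Response(response.body, response);
  modified.headers.set('X-Handled-By', tagFor(request.url));
  return modified;
}

// Register the handler only where a Service Worker-style scope exists.
if (typeof addEventListener === 'function') {
  addEventListener('fetch', (event) => {
    event.respondWith(handleRequest(event.request));
  });
}
```

Whether a given script really runs unchanged in a browser depends on what it does, as noted above.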

> Also BTW you should update https://cloudflareworkers.com to note that it's live :)

Happening as we speak. ;)

Some immediate uses that spring to mind involve storing data as strings. An example is rendering templates from JSON or a more resource-friendly serialization protocol. What are the limits on the file size of the workers?

Also, regarding resource usage, is this memory usage exposed at all to the worker via an API? I'm thinking there may be applications where it would be useful to cache fetched resources but without hitting the memory ceiling and having the requests die with the worker.

The most common way to cache fetched resources is to simply fetch them and let the Cloudflare Cache do the hard work. There's no ostensible limit on how much you can cache there, but rarely accessed things do of course get evicted.

We haven't exposed memory information to the process yet for several reasons, one being it's rare to see a Worker that actually needs anything close to the memory limit (128mb). When people do run into that limit it tends to be in error. We would love to see what you can build that does do creative state management such that it would need that information though!

Are there any extra costs related to outbound requests one might make from a worker?


One more question. What's to stop this being used for amplification DDoS? Register with a stolen CC, then launch 100 parallel HTTP(S) subrequests to your intended target for each incoming request, possibly adding a large random payload (well, as much as you can generate in 50ms minus the time to set up 100 subrequests with the fetch API).

If you do that, it won't actually send very much traffic to the victim site, and you'll find your zone suspended pretty quickly. :)

As with any anti-abuse measures, though, it's best if we don't describe in detail what steps we've taken to prevent this sort of attack.

I don’t agree that it’s best you don’t describe anti-abuse techniques. I’d be impressed if your manager agrees. I’ll argue points if you like.

Can you talk about topology relative to ingress and origin? Eg. is execution pinned in select PoPs per ‘region’ or all? Unrelated, I also wonder about retention of edge cache, is it purged via LRU or similar?

Personal note, I was excited to get the early access email, feels like an excellent offering; I’ll be flexing it soon, have some ideas—been getting an HLS system online in the past month using RasPi+CF without a hiccup (720p@4500kbps) and it’s open source/libre, comparable to YT/Twitch—info incl. design/ lectures in profile.

Also read above, congrats on what must’ve been a good year! Speaks well of your org that you’ve been able to deliver and talk freely.

> I’d be impressed if your manager agrees.

I am his manager and I agree with him

I assumed that’d be the case. What’s the average response and resolution time for pro customers? Difference for free customers? How’s your escalation process, if I may ask? (PagerDuty, etc). Can you speak to false positive rates? I break things and run media channels, would instaquit for plan B if there were an issue. (Don’t mess with my paper is the stem of my concern). I’ve designed and built a similar offering at a big B2B company and we spent considerable time coming up with heuristics to plug the gaping holes possible in this kind of system, hence false positive concerns. I’m also a CF partner. Thanks in advance for any insights or pointers.

> Can you talk about topology relative to ingress and origin? Eg. is execution pinned in select PoPs per ‘region’ or all?

Nope. Every Worker is distributed to every location. Requests are handled at whatever PoP they arrive at.

> I also wonder about retention of edge cache, is it purged via LRU or similar?

TBH I'm not that familiar with the inner workings of our HTTP cache, but I would assume so?

Unless the cache item expired first, yes

Great! Thanks

Do Cloudflare Workers require external HTTP endpoints to have CORS headers, as JavaScript within web pages does?


CORS is designed to protect two things:

- The user's cookies for other origins, which the browser will normally send on any request to those origins.

- Behind-the-firewall servers that might be accessible from the user's browser but not from the public internet.

Neither of these things applies to Workers: a Worker obviously has no access to the browser's cookie jar, nor does it have the ability to see behind-the-firewall services since it's not behind the firewall.

CORS does NOT protect against DDoS: Typically, CORS does not prevent you from sending requests; it prevents you from seeing the content of the responses. Any web site in a browser can use an <img> tag or submit an invisible <form> to cause a cross-origin request, CORS or not -- but it can't read what comes back.

So, Cloudflare Workers do not enforce CORS. We've implemented different measures specifically to mitigate DDoS.

Is any private gitbucket/bitbucket integration on the cards, like most CI SaaS do: get an OAuth2 token for the desired repositories, set up the notification webhooks, and deploy on push (of a nominated branch)?

We're looking into building tools like this, but also note that you can whip up something yourself that pushes via the Workers API. Maybe you can even implement it as a Worker... :)

Lack of tooling integration is one of the most challenging things about using services like yours in production. "Whipping something up" is not something I want my eng team doing. Having built a system where customers want a similar integration, I understand how difficult it is to provide!

I hear you, and we're working on it.

How does a cold start perform? I'm guessing that with sandboxing in V8 vs a whole new container, and deserialization of a single script into a free V8 sandbox slot (perhaps they store the AST or pre-JITed code for an even faster start? I'm not familiar with V8 internals), it is much faster than cold starting Node.js on AWS Lambda?

This is one of those questions I love answering. :)

It turns out V8 is very well-optimized for loading JavaScript quickly, since that's exactly what it needs to do when loading a web site with a user waiting.

We're seeing average cold start times under 5ms.

There are some features in V8 to make loading a script even faster.

I'd be glad to connect over these features on twitter (same username). I implemented them.

So no free tier? That’s kind of a bummer, but I guess it’s time I start paying Cloudflare for something. :)

One too many people joked about running a Bitcoin miner... ;)

Correct, no free tier. You can use Cloudflare's free tier for all our usual stuff (caching, acceleration, SSL, etc.) that's included and add this on for minimum $5/month.


Can Cloudflare be used to serve purely assets? I.e., say you are Disqus: can Cloudflare be used to serve the JavaScript which sites need to include in each page?

Edit: The question was misunderstood. I asked because cloudflare ToS seems to be against serving just assets and they want to serve entire sites

Yes. Cloudflare is, among other things, a CDN, and can do that. With Workers it becomes an exceptionally powerful and configurable CDN, as we actually expose an API to control the behavior of our caching systems to your Worker.

I don't see why not. It's just like serving the assets for your own site, except they're being pulled by other sites.

I've been doing that. The problem is when users' IP get challenged by CloudFlare the assets will not load, even with security level set to essentially off.

Asked a sales rep and got told to contact support, which is pretty much impossible as I don't know who are affected until they reported the problem to us.

Congrats, this is really great. Question, since it’s not based on Node.js there is no way to leverage on a regular package manager, how can I create reusable code between edge workers? How can I make an open source snippet without copy and paste?

For now, I recommend thinking of Workers like you do browsers: Use Webpack or Browserify or one of the many other tools to transpile npm packages into a single script, then upload that to Cloudflare using the API.

Meanwhile now that the base product is out we will be spending more time improving the tooling, and integrating with Cloudflare Apps so that you can package (and sell) Workers for other people to use on their own sites.

The short answer is webpack!

The long answer is webpack, rollup, or browserify :)
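For the webpack route, a configuration along these lines is one possible starting point. It assumes webpack 4 or later; `./src/index.js` and `worker.js` are placeholder names, and the upload step via the API is not shown.

```javascript
// webpack.config.js (sketch; paths are placeholders)
const path = require('path');

module.exports = {
  // Hypothetical entry module that registers the fetch event listener
  // and imports whatever npm packages the Worker needs.
  entry: './src/index.js',
  // Build for a worker-like environment rather than a page or Node.js.
  target: 'webworker',
  mode: 'production',
  output: {
    path: path.resolve(__dirname, 'dist'),
    // Everything ends up in this single file, which is what gets uploaded.
    filename: 'worker.js',
  },
};
```

Packages that rely on Node.js built-ins may still need shims or replacements, since the runtime exposes Service Worker APIs rather than Node ones.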

Are the server edge-points true to UTC Time? As in, are they synced with something like NTP? There is a severe lack of accurate time coming from edge-points in systems such as AWS Lambda, and no one appears to be confirming if they do or don't make sure their time is correct.

It would be incredibly useful if there were an edge-point service that could return Linux epoch time in milliseconds (that's accurate to within 1ms of UTC time). I've been working on a live broadcast syncing system and there really wouldn't be anyone in a better position than CDNs with Lambda-like functionality.

This isn't my department, but I'd assume our servers have accurate time. But I'd also assume Amazon has accurate time, and you say they don't, so maybe I don't know anything.

Then again being accurate to 1ms worldwide sounds like one of those problems that runs up against the speed of light? We don't have atomic clocks like Google does...

Atomic clocks would get you far below the 1ms threshold. I'll rephrase to within 25ms, since that can be pretty acceptable and doable across the internet by just popping up NTP.

Without a synchronizing service, time can't be trusted to be accurate. Time should always (try to) be absolute to a global time, and not individual clocks a devop manually put some numbers in for.

Right, I just find it shocking that AWS wouldn't be running NTP. But I'm not a sysadmin, I dunno.

Correction, it may not be under Amazon API service. Apologies.

Will every request to a worker-enabled site pass through the worker V8 engine and be charged at the going rate, even requests for static resources like favicons or JPGs? Or is there some way to limit the worker engine's request matching to a specific area of your site (like /service)?

You can specify URL patterns where the worker should be enabled or disabled, under "routes" in the config UI. You won't be billed for requests to routes where the worker is disabled.

I wish routing and page rules had rudimentary regex support like VCL or most other location matching route libraries/servers.

Yeah, trouble is, it's depressingly easy to create a regex that performs very poorly, so if we allowed that, people could easily break the service.

However! One of the neat things about Workers is that you totally can use regexes in JavaScript, and our sandbox prevents runaway CPU usage. So if you really wanted regex-based page rules before, you can probably get that with a Worker.

We've exposed an API for controlling Cloudflare features from a Worker, including many things commonly controlled via page rules:


We'll be adding more in the future.
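As a sketch of the regex-in-a-Worker idea mentioned above: the route table, the patterns and the `matchRoute` name are all made up for illustration, and a real script would go on to call different handlers depending on the match.

```javascript
// Hypothetical route table: the first matching pattern wins.
const routes = [
  { pattern: /^\/service\/v(\d+)\/users\/(\w+)$/, name: 'user' },
  { pattern: /^\/assets\/.+\.(?:jpg|png|ico)$/i, name: 'static' },
];

// Returns the matched route name plus captured groups, or null if no
// pattern matches the URL's path.
function matchRoute(url) {
  const { pathname } = new URL(url);
  for (const { pattern, name } of routes) {
    const match = pattern.exec(pathname);
    if (match) return { name, params: match.slice(1) };
  }
  return null;
}
```

With this table, `/service/v2/users/alice` matches `user` with params `['2', 'alice']`, `/assets/img/logo.PNG` matches `static`, and `/` returns `null`.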

> a regex that performs very poorly

Would that still be true if you limited the regex support to DFAs instead of NFAs? I can’t think of a single use-case for backreferences in routing patterns. :)

Would it be possible to invoke native code like in other FaaS runtimes: https://github.com/mankash/nativeGcpFunction?files=1

Once we enable WebAssembly, you'll be able to use compiled languages like C/C++/Rust/Go/etc. through that. It's not exactly "native code" but should accomplish the same goals.

We don't plan to allow true native-code because sandboxing native binaries is much more expensive than sandboxing with V8 / WebAssembly. In fact, for these kinds of massively-multi-tenant scenarios, code running in a V8 sandbox can actually be faster than "native code" due to reduced context switching. See: https://blog.acolyer.org/2017/08/22/javascript-for-extending...

Any way to increase the global state that can be stored and shared across instances? Say to 100mb? Could this be an option to be requested?

Tests I've run using CFW as a memory cache have been very, very fast.

That limit is actually already over 100mb (128 to be exact, but there's a bit of overhead).

I can't promise we can raise it, but please feel free to reach out to us (workers-developer-help@cloudflare.com) and we'll be happy to weigh in.

Did you consider having a 5$ annual minimum charge? Or is there any deeper reason for having a 5$ monthly charge other than just a 5$ minimum charge total.

To be honest, pricing and billing is not my department. But, I don't think we have annual billing in general right now, so doing it for just one product would be weird. There's definitely a lot of discussion going on about better ways to do pricing and billing in the future.

Is WASM support in the cards? This would be awesome for isymtope.org but the engine is in Rust.

I think you just launched the next serverless revolution ;)

> Is WASM support in the cards?

Heck yeah. We almost get it for free with V8, except that the default WebAssembly API is designed around dynamically loading code at runtime, whereas we want to make sure that all code we run is uploaded strictly through the config UI/API and is available for forensics. So basically we just need to build a way to upload WebAssembly blobs and import them.

Is the issue with a length limit on data: uri's? something with the CSP?

Will there be some kind of session storage or is it completely stateless?

> Is the issue with a length limit on data: uri's? something with the CSP?

The problem is if we left the WebAssembly API enabled as-is, then people could fetch code from the internet and execute it. We already disable JS eval() so we had to disable the WebAssembly API too.

> Will there be some kind of session storage or is it completely stateless?

We're working on storage. It's a complicated problem. :) (See elsewhere in this thread.)

In the meantime, for many use cases you can use the HTTP cache as a sort of storage. Also, each worker instance has global variables which are writable -- there's no guarantee we won't reset your worker at any time, but generally one worker instance will handle multiple requests so can do some in-memory caching in globals.
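A minimal sketch of the in-memory caching idea, assuming nothing beyond plain JavaScript: `memo`, `memoGet`, `memoSet` and the entry cap are invented names and values. Because the instance can be reset at any time, anything stored this way has to be safe to lose.

```javascript
// Script-level state survives between requests handled by the same
// instance, but may vanish at any time, so treat it strictly as a cache.
const memo = new Map();
const MAX_ENTRIES = 100; // arbitrary cap to stay well under the memory limit

// Returns the cached value, or undefined if missing or expired.
function memoGet(key, now = Date.now()) {
  const entry = memo.get(key);
  if (!entry) return undefined;
  if (entry.expires <= now) {
    memo.delete(key);
    return undefined;
  }
  return entry.value;
}

// Stores a value for ttlMs milliseconds, evicting the oldest entry if full.
function memoSet(key, value, ttlMs, now = Date.now()) {
  if (memo.size >= MAX_ENTRIES && !memo.has(key)) {
    // A Map iterates in insertion order, so the first key is the oldest.
    memo.delete(memo.keys().next().value);
  }
  memo.set(key, { value, expires: now + ttlMs });
}
```

A handler would call `memoGet` first and fall back to `fetch()` on a miss, storing the parsed result with `memoSet`.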

This site loads very slowly, most images are in the megabyte range :(

Working on that, thanks for the feedback.

Thanks, you inspired me to take a look at Google Pagespeed as well as shrink the images further.

Would there be any issues running something like expressjs on it?

Workers are based on the Service Worker API, not based on Node.js APIs. So, Express is not likely to work out-of-the-box; it would need to be rewritten to be based on Service Workers.

The good news is that the Service Worker API is more modern than Node, e.g. using Promises instead of callbacks, which IMO makes it a better experience to work with.

Is the price for each request or is it request * workers

Same thing. Your site has only one worker script, and each request is therefore processed by at most one Worker. You can of course merge many independent pieces of logic into one script -- it's code, after all.

Enterprise customers are allowed to have multiple scripts mapped to different URL routes. This is mainly so that different teams owning different parts of the site don't step on each other. However, generally if you stuff your logic into one script rather than multiple, it will perform better, since that one script is more likely to be "hot" when needed.

ohhh, it's one worker per domain, you can't have different ones that match on URL? so you have to do that matching in your worker to run different code?

Note that Enterprise customers can have multiple scripts per domain, allowing you to specify a script per route (or routes). Additional matching logic (e.g. headers, cookies, response codes) can then be done within the Worker itself.


Hopefully that limitation can be relaxed over time. Having to stuff all my logic in one big script sounds a bit annoying. At minimum having access to a second worker script route would be welcome for testing/development purposes, so one doesn't muck up a working production script.

> Having to stuff all my logic in one big script sounds a bit annoying.

Keep in mind that you can write your code as a bunch of modules and then use a tool like webpack to bundle it into one script, before you upload it to Cloudflare.

> for testing/development purposes

I agree with this, we definitely plan to add better ways to test script changes. Currently there is the online preview that shows next to the code editor, but it's true that there are some limits to what you can test with it.

Would this be capable of doing image resizing?

I guess it would depend on the resizing profile. Fetching from a remote origin and reading into a buffer should involve <1ms of compute time (latency is not an issue given 15 seconds of async wait), but I would be doubtful about guaranteeing it's all done in 50ms of compute time for resizing. If you could get compute-bound resizing Node.js code to yield control to the event loop every 1ms, you could abort after 14ms and issue a subrequest to your own image-resize processing server or something like imgix.

As the person whose desk is next to @kentonv, I think the only reasonable thing to do is to build an image resizing service, make it incredibly popular, and drive him crazy with your never-ending resource requirements.

Joking aside, we don't know yet what is and isn't possible. Please build this and other things. If you need help or more resources, including CPU time, please reach out (workers-developer-help@cloudflare.com).

>all done in 50ms of compute time for resizing I would be doubtful

I don’t know the details of your environment or timing precision, but I’m quite sure that a single core on a modern server CPU can resize and JPEG compress more than 20 reasonably-sized images per second. :-)

I benchmarked various programs to resize images a year ago. There was a JS package with native bindings to GD, a go program, and a pure JS implementation. Both go and GD were an order of magnitude faster at scaling down and converting PNG to JPG.

You're probably right. Presumably the 50ms of compute time is that of a reasonable 1.6 GHz Xeon or better, and compute time is accounted as actual real compute time, not inclusive of waiting on some virtual CPU to be multiplexed onto a real core.

Happy cake day! Congrats on the big ship!

> Due to the immense amount of work that has gone into optimizing V8, it outperforms just about any popular server programming language with the possible exceptions of C/C++, Rust, and Go.

Odd statement and it’s not true.

I work with Node.js a lot, which is using V8 and have tested a lot of code cross compiled to both JavaScript and the JVM.

As a matter of fact the JVM beats the shit out of V8 in everything but startup time.

And this is not an educated guess, I have the same code, some cross-compiled via a compiler that can target both (plenty of such compilers these days, including Scala, Clojure and Kotlin) and some code hand optimized for each specific platform and the difference is huge in both cases. And let’s be clear, this is not code that handles numeric calculations, for which JS would be at a big disadvantage.

Imagine that I’m not running the same unit tests: for JS I have to do far fewer iterations in property-based testing, because Travis CI chokes on the JS tests. So it’s a constant pain that I feel weekly.

And actually anybody that ever worked with both can attest to that. The performance of V8 is rather poor, except for startup time where the situation is reversed, V8 being amongst the best and the JVM amongst the worst if startup time matters.

You're right, Java should have been included along with the other languages I mentioned. As a strongly-typed, compiled language, it should indeed beat V8 handily. I had intended to say that V8 outperforms other dynamically-typed languages like PHP, Python, Ruby, etc.

But because startup time and RAM usage are so important to our use case, Java has never been in the running as a plausible implementation choice, so to be honest I sort of forgot about it. :/

Yes, I admit the startup time for the JVM is terrible and your choice of V8 makes sense in that light.

Plus indeed people will be able to use WebAssembly and compile languages like Rust to it.

It's funny, I was writing Scala.js for a while and one idea I had was to use it for jobs where startup time really mattered and then use Scala on the JVM for everything else. So I could kind of get the best of both worlds. It turned out we were able to live with the JVM startup time without any problems for our needs, so I never pursued it further.

I think the big things coming down the pipe for the performance of each is: Value Types for java. A lot of what is needed can be done with sun.misc.unsafe already but it's an awkward way to have to program, I haven't kept up with it but hopefully it supports being dropped right onto memory mapped i/o for stuff like CapNProto style parsing.

On the V8 side, I think wasm could be a massive game changer and basically eat everything in concert with JavaScript. I wonder how its performance is going to stack up to true native / a warmed-up JVM (that has done all the fun inlining optimizations that can make it so fast).

I'm also doing Scala.js for the quick startup time btw. The JVM isn't usable with AWS Lambda last I tried.


I think WebAssembly is going to be great, but it's effectively a sandbox for native code, so it targets languages like C++ and Rust, or in other words languages that have a very light runtime and no garbage collector.

I don't know what hooks WebAssembly will be able to provide, but consider that at this point most high level languages would also have to ship at least a garbage collector with the compiled binary, because WebAssembly does not give you one, see open issue: https://github.com/WebAssembly/design/issues/1079

So you won't be able to run languages like Java, Scala, C#, Go, Clojure, etc any time soon. These languages are better off targeting JavaScript for now.

That said the thought of targeting browsers and Node with binaries built out of Rust fills me with joy.

> I'm also doing Scala.js for the quick startup time btw. The JVM isn't usable with AWS Lambda last I tried.

Heh, I was using Scala.js when AWS Lambda was first announced, and I enthusiastically posted to the mailing list at the time that Scala.js would be great with this new paradigm of programming. I think the JVM is going to just be dead in the water on that front. So much of the design of V8 is around being tossed a bunch of code and being told to start running it at full performance ASAP, whereas the JVM has been optimized under a very different set of constraints (e.g. long-running server processes, which it pivoted to 15 years ago after the failure of applets).


> I think WebAssembly is going to be great, but it's effectively a sandbox for native code, so it targets languages like C++ and Rust, or in other words languages that have a very light runtime and no garbage collector.

I think the end goal for wasm is to provide all the hooks that javascript has.[1]

And in the meantime, two stories I came across recently indicate that enhancements and polyfills are moving rapidly to bridge the gap:

https://news.ycombinator.com/item?id=16585315 (Making WebAssembly better for Rust and for all languages)

https://github.com/alexcrichton/wasm-bindgen (A project for facilitating high-level interactions between wasm modules and JS.)

> I don't know what hooks WebAssembly will be able to provide, but consider that at this point most high level languages would also have to ship at least a garbage collector with the compiled binary, because WebAssembly does not give you one, see open issue: https://github.com/WebAssembly/design/issues/1079

> So you won't be able to run languages like Java, Scala, C#, Go, Clojure, etc any time soon. These languages are better off targeting JavaScript for now.

I could see some sort of JVM bytecode interpreter written in wasm happening at some hand-wavy time in the "future". My hunch is that wasm is going to become the universal low-level bytecode, so I think there will be more and more projects emitting wasm as a compilation target in addition to, and then eventually surpassing, x86 / ARM etc. (obviously this won't happen overnight).

That does still leave the area of server processes with long uptimes that benefit from JIT performance optimizations and which are ok with garbage collection overheads.

> That said the thought of targeting browsers and Node with binaries built out of Rust fills me with joy.

With all great power ... And a bit of fear of having articles & guis rendered to canvas removing the ability for users to control their interaction with the web.

Maybe once that happens you'll start seeing your node servers stop falling over.

1: e.g. https://github.com/WebAssembly/host-bindings (Host Bindings Proposal for WebAssembly: This repository is a clone of github.com/WebAssembly/spec/. It is meant for discussion, prototype specification and implementation of a proposal to add host object bindings (including JS + DOM) support to WebAssembly.)

$0.50/million requests, but it’s a minimum of $5/month (giving you 10million requests essentially.)

Not a criticism, this looks like an excellent product for those who can benefit from it, just calling it out for others that read the comments here before the original content.
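The arithmetic behind those figures can be sketched as follows. The two constants are the numbers quoted above; treating everything past the minimum as billed linearly at the same per-million rate is an assumption, not something confirmed in this thread, and `monthlyCost` is a made-up name.

```javascript
const PRICE_PER_MILLION = 0.5; // USD per million requests, as quoted above
const MONTHLY_MINIMUM = 5;     // USD per month, as quoted above

// Assumption: usage beyond the minimum is billed linearly at the same rate.
function monthlyCost(requests) {
  const usageCost = (requests / 1e6) * PRICE_PER_MILLION;
  return Math.max(MONTHLY_MINIMUM, usageCost);
}
```

Under that assumption the minimum covers 5 / 0.5 = 10 million requests, so 1 million and 10 million requests both come to $5, while 40 million come to $20.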

Looks like a little bit cheaper than AWS Lambda@Edge which is $0.6/mln, but more than regular AWS Lambda $0.2/mln. On Lambda you pay extra for resources, but you can get more RAM or CPU there (e.g. running Chrome Headless is an option there).

On the other hand CloudFlare Workers looks more distributed, but suitable just for 50ms CPU time, 15s wall time and 128 MB. This is enough for redirects, A/B testing, but often not enough for writing serverless applications or any kind of rendering.

I wonder whether CloudFlare wants to get into serverless business and this is first iteration or if it's just a CDN which is more customisable by allowing code to run there.

But no data transfer charges from Cloudflare. And the JavaScript deploys in seconds not minutes.

No data transfer fees from Cloudflare is their big competitive advantage. $0.5-0.7 GB / egress on major public clouds can be brutal. Transferring one object out of S3 costs the same as storing it for 2.5 months.

Also note Lambda@Edge has a CPU time component to their pricing.

With a 50ms minimum. Because I guess you can't do anything useful faster than that. :)

It also looks to be much more capable than Lambda@Edge though, for what it's worth (particularly in that it can make multiple outbound requests).

I bet it's enough to drive a static blog from some markdown files and templates.

Which would be interesting for cheap hosting while also being on cloudflare CDN.

Cheap hosting that also manages to be served from 120+ places around the world.

And they typically run in under 1ms and global deploys take less than 30s and it's native V8.

Are Cloudflare Workers implemented as an NGINX module like OpenResty?

Nope. It's a new stand-alone HTTP server. This lets us implement a tighter sandbox, since all the code is purpose-built for running inside it.

> It's a new stand-alone HTTP server.

Are you going to publish more info about that HTTP server? Like the technology used to write it.

Yeah, I'll probably write that up at some point.

For now, here's the source code for the HTTP implementation. :)



That's from the KJ C++ toolkit library, which is part of the Cap'n Proto project, which is my own work that predates my time at Cloudflare. We're using this in the Workers engine and updating the code as needed.

Yes, I know, NIH syndrome. But it has been pretty handy to be able to jump in and make changes to the underlying HTTP code whenever we need to.

I hope to spend some time writing better docs for KJ, then write a blog post all about it.

The only major dependencies of the core Workers runtime are V8, KJ, and BoringSSL.


Wasn't running code on edge nodes how Cloudflare man in the middle attacked millions of encrypted websites for a year?

What I find missing here and in aws lambda or google function is the notion of "state".

I believe it would really be a game changer if we could open and maintain an open connection with a database.

In Lambda and Cloud Functions you can open a connection with a database (or anything else) and it will stay alive across invocations of the same underlying container.

Not sure how Cloudflare Workers behave here, but their docs recommend global variables as a way to persist state, so perhaps an outbound TCP connection will stay alive across invocations of the same V8 "process".
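For what the global-variable pattern looks like in general, here is a minimal sketch in plain JavaScript. The handler and loader names are hypothetical, and it assumes nothing about how long a given V8 instance stays alive, so the global is only safe to use as a cache.

```javascript
// Module-level (global) state: survives for as long as this instance does.
// Assumption: the lifetime is not guaranteed, so treat this as a cache only.
const cache = new Map();
let invocations = 0;

// Hypothetical handler; `loadValue` stands in for an expensive lookup.
function handleRequest(key, loadValue) {
  invocations += 1;
  if (!cache.has(key)) {
    cache.set(key, loadValue(key)); // only runs on a cache miss
  }
  return { value: cache.get(key), invocations };
}

// Two "requests" for the same key: the loader runs once.
let loads = 0;
const slowLookup = (key) => { loads += 1; return key.toUpperCase(); };
handleRequest("greeting", slowLookup);
handleRequest("greeting", slowLookup);
console.log(loads); // 1
```

Anything that must survive a restart or be shared between locations still needs real storage somewhere else.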

Workers only support HTTP requests, not raw TCP. So, no, at present you can't hold a database connection open, though you can make requests to HTTP-based storage APIs available from any number of providers.

Thanks, I knew it doesn't work but I didn't dig down enough to reply to the previous comment.

My use case was to contact Redis and I wasn't able to maintain a connection open.

What about grpc which requires additional trailers but is still http2?

Indeed, we want to provide storage, but easy-to-use storage that can scale to 100+ (or 1000+) nodes is tricky. We're working on it. :)

PS. If you're a distributed storage expert and want to work on this, we're hiring!

I'm not an expert, but I've gotten used to CRDTs being available in Elixir-land (Phoenix.Tracker, in phoenix_pubsub).

But I don't believe >40 or so nodes has ever been Erlang's strong point.

Edit: with the conclusion being of course that you wouldn't want to strongly connect the edge nodes, so instead it'd be something more traditional, where if people pay for a storage add-on, you folks are automating the schlep of spinning them up a long-lived storage cluster, and networking it so that only their edge nodes can access it. At which point you could be building on anything, maybe even something that already exists like Redis or etcd. Hmm, though for hello-world purposes I might still see what could be put together in beam-land, where everything is more at your fingertips ...

Flattering, but I don't think I will fit ;)

You definitely know this world better than I do, but as a first approximation I wouldn't try to provide storage, which we know can be messy.

Just keeping a TCP connection open can go a very long way, and it would let customers hook up to Redis or basically any other database.

FWIW kentonv posting in this thread, who did this project, is the guy who invented protocol buffers :o

Technically, Kenton made Protocol Buffers version 2, and open sourced it, I believe. Kenton did, however, make Cap'n Proto, which builds on what he learned from doing Protocol Buffers. And he also created Sandstorm.io, a self-hosting platform I am quite fond of. :)

+1 Sandstorm is awesome. And porting an app is not complicated, thanks to a sensible architecture and good documentation.

Not really into this space. What would be some usecases for this? Call an api, do a thing? Or more like, process some data and post the results?

A great place to start is what companies like the Financial Times have been doing with Fastly. By pushing authentication to the edge [1] you can cache more. If you're receiving a lot of data that you need to log and analyse later, you can use the Workers to redirect straight to your logging service - no servers involved [2].

1. https://www.fastly.com/blog/how-solve-anything-vcl-part-3-au...

2. Sorry, couldn't find the link. IIRC Fastly does this with its in-built logging feature.

There are some that people already think about doing on the edge, like complex caching rules, routing based on cookies, and edge side includes.

Then there are things people are just starting to think about, like doing a/b testing by serving different variants from the edge and building their API gateway into the edge.

Finally there are things people will only start dreaming of now that the tech is available, like filtering the massive stream of data coming in from IoT at the edge, or powering interactive experiences that require compute which individual machines don't have, but speed which centralization can't provide.

Kenton said Workers don't have access to client's cookies, how can you do routing based on cookies with Workers?

They don't have access to all the cookies in the user's cookie jar, but since they work by intercepting requests, they can read the cookies that the browser sends and act upon them.
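A rough sketch of cookie-based routing, written as plain functions over the Cookie request header so it runs anywhere. The cookie name "variant" and the two backend URLs are invented for the example.

```javascript
// Parse a Cookie request header ("a=1; b=2") into a Map of name -> value.
function parseCookies(header) {
  const cookies = new Map();
  for (const part of (header || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip fragments without "name=value"
    const name = part.slice(0, eq).trim();
    if (name) cookies.set(name, part.slice(eq + 1).trim());
  }
  return cookies;
}

// Hypothetical rule: a cookie named "variant" selects the backend.
function pickBackend(cookieHeader) {
  const variant = parseCookies(cookieHeader).get("variant");
  return variant === "b" ? "https://b.example.com" : "https://a.example.com";
}

console.log(pickBackend("session=xyz; variant=b")); // https://b.example.com
console.log(pickBackend(""));                       // https://a.example.com
```

In a Worker the header would presumably come from the intercepted request (something like `request.headers.get("Cookie")`), and the chosen backend would be the target of the subrequest.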

Exciting times! Seems it's all about making a product that sells the features of serverless in the right way. Technically I would like websockets on these platforms, but I don't know how to sell that as a feature.

These numbers, 50 ms of compute time and 15 s idle, are interesting in the serverless space. Now I'm waiting for sane performance suites to figure out what suits you. I'm guessing this solution will kill in those strange latency tests for AWS Lambda from the other day: https://news.ycombinator.com/item?id=16542286

> Technically I would like websockets on these platforms

FWIW that's something I'm working on. It's tricky because it's not actually in the Service Workers standard. Currently, we support WebSocket pass-through (so if you do something like rewrite a request and return the response, and it happens to be a WebSocket handshake, it will "just work"), but haven't yet added support for terminating a WebSocket directly in a Worker (either as client or server).

Fanout (https://fanout.io) is useful for handling raw WebSockets from a FaaS backend.

Getting it to work with Cloudflare Workers is a little more involved since our Node libraries don't run in their Service Worker environment, but if you implement the negotiations manually it does work.

Do those workers have to complete a captcha ?

I don't think you read the article - this is your own JavaScript that runs on their edge servers, not humans doing work

It was a joke...

Will you be able to share how you sandbox these scripts?

We have multiple layers of sandboxing. To start, each Worker runs in a separate V8 isolate (which is actually a stronger separation than Chrome uses to separate an iframe from a parent frame, by default). We also have an extremely tight seccomp filter, and a long list of other measures.

We made an intentional decision early on to avoid providing any precise timers in Workers -- even Date.now() only returns the time of last I/O (so it doesn't advance during a tight loop). This proved to be a really good idea when Spectre hit. (But we also shipped V8's Spectre mitigations pretty much immediately when they appeared in git -- well before they landed in Chrome.)
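To make the timer point concrete, here is a toy model of a clock that only advances when I/O completes. It is purely illustrative and not the actual runtime code; the class and method names are invented.

```javascript
// Toy model only: a clock whose reading is frozen between I/O events.
class CoarseClock {
  constructor(realNow) {
    this.realNow = realNow;  // source of true time (injected so it can be faked)
    this.frozen = realNow(); // value handed out until the next I/O completes
  }
  // What script code would see when it asks for the time.
  now() {
    return this.frozen;
  }
  // What a runtime would call whenever an I/O operation completes.
  onIoCompleted() {
    this.frozen = this.realNow();
  }
}

let trueTime = 1000;
const clock = new CoarseClock(() => trueTime);
trueTime = 1005;          // 5 ms of pure computation pass
console.log(clock.now()); // 1000
clock.onIoCompleted();    // some I/O finishes
console.log(clock.now()); // 1005
```

With a clock like this, code spinning in a tight loop cannot measure how long its own operations take, which removes the fine-grained timer that Spectre-style timing attacks depend on.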

How does this compare with running AWS Lambdas on CloudFront?


Looks very nice! Is the minimum cost ($5/m) per site or per account?

Can a Cloudflare app install a worker into a Cloudflare customer’s zone?

Not yet, but with Workers launched this is now my top priority, and something we're all very excited about.

Ok thanks.

Do workers play nicely with managed CNAMEs?

It's something we're iterating on, and the results are so far looking promising; but there are a few scenarios in which they may not work (yet). If you reach out to your SE they should be able to get into specifics.

Now this is indeed quite interesting.

Though $0.50 per million is only true if you have more than 10 million requests per month; otherwise the price will be higher, since there is a $5 minimum.

In what case should this be preferred over plain old Service Workers running on the user's browser? Latter is even lower latency and for free.

1. It's easy to maintain/update the code because it is pushed once to Cloudflare and you don't have to worry about browser caching effects on JavaScript delivered to the browser.

2. The performance of the code will be much higher than in the browser because of the server resources available, and also because any subrequests will happen across Cloudflare's fast/reliable links and not whatever the end user is connected to.

3. In the browser, the end user has control over what JavaScript is executed and might use a tool like Disconnect to block external scripts, preventing the code from running at all.

4. Security: you can include things like API keys.

5. Script starts executing earlier.

6. Conserves bandwidth/battery life of mobile users.

A few cases off the top of my head:

- When you need to work with older browsers or non-browser clients (e.g. API clients!) that don't support service workers.

- When it would be a security problem if the user can bypass or interfere with the worker (e.g. you can implement authentication in a CF Worker).

- When you specifically want to optimize your use of the shared HTTP cache at the Cloudflare PoP.

- When startup time of a service worker would be problematic.

- When CORS would prevent you from making the requests you need from the browser side. (CORS doesn't apply to CF Workers, since the things CORS needs to protect against are inherent to the client-side environment.)

It's lower latency once you've delivered the javascript.

There's a lot of interesting stuff you can do with edge applications. Image optimization/resizing, content rewrites, pre-rendering, API gateway, etc, etc, etc. These are all things you want to do once for many visitors.

There's a lot you _can't_ do in a browser because browsers are untrusted. Edge applications can run with a different level of trust.

this looks like something that could be used to host an entire web app (sans database). Pretty cool!

Is it possible to just change/rewrite the request origin like on AWS Edge?

Yes it is! You can make arbitrary requests in your Worker to anywhere on the internet you like, and return any response you like.
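As a sketch of just the origin-rewrite step, here is a plain function using the standard URL class. The hostnames are placeholders, and the function name is made up for the example.

```javascript
// Rewrite only the origin (scheme, host, port) of a URL; keep path and query.
function rewriteOrigin(requestUrl, newOrigin) {
  const target = new URL(requestUrl);
  const origin = new URL(newOrigin);
  target.protocol = origin.protocol;
  target.hostname = origin.hostname;
  target.port = origin.port; // empty string resets to the scheme's default port
  return target.toString();
}

console.log(
  rewriteOrigin("https://www.example.com/img/a.png?w=200", "https://assets.example.net")
);
// https://assets.example.net/img/a.png?w=200
```

The parts are set individually instead of resolving the path against the new origin, because a path that begins with `//` would otherwise be read as a different host. In a Worker, the rewritten URL would presumably then be passed to fetch().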

The JS requests are limited to several MBs, so you can't download large assets. This applies on AWS Edge too, but on AWS you can modify the request's origin and then the CloudFront agent (which has higher limits) will perform the actual request using the origin you set.

The Workers runtime fully supports streaming requests/responses. If you're just passing the response through, it does not get buffered in the worker, and it is not subject to the Worker's memory limit. You can absolutely download multi-gigabyte files through a Worker.

To expand on that, when you invoke fetch() and get a Response object back, that object is returned as soon as response headers have been received; it does not wait for the body. The Response object contains a ReadableStream from which you can read the body, but if you instead pass the response directly to event.respondWith(), then it will stream right back out to the client.
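A minimal sketch of that pass-through pattern follows. The fetch implementation is passed in as a parameter so the function can be exercised outside a Worker, and the listener is only registered where a Service Worker style environment appears to exist; `passThrough` is a made-up name.

```javascript
// Pass-through: hand back the upstream response promise untouched. The body
// is never read here, so nothing is buffered inside the handler.
function passThrough(request, fetchImpl) {
  return fetchImpl(request);
}

// Register only where a Service Worker style "fetch" event is available
// (e.g. a Worker); elsewhere this file just defines the function.
if (typeof addEventListener === "function" && typeof fetch === "function") {
  addEventListener("fetch", (event) => {
    event.respondWith(passThrough(event.request, fetch));
  });
}
```

Reading the body in script (for example with `response.text()`) is the step that would pull the whole payload into memory, so a handler that only needs to forward data should avoid it.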

Can the playground show the resource limits info?

It's something we would love to expose in the future. That said, we have yet to see someone run into those limits (on purpose). The average worker runs with less than 1 ms of CPU time.

Before using a Cloudflare product, please consider if you want to contribute to the Internet's largest man-in-the-middle attack. They have a poor track record when it comes to security[0], privacy[1], and censorship[2]. We're at the point where it's our responsibility to protect the Internet and keep these companies in check. Cloudflare is among the worst existential threats to the Internet.

[0]: https://gizmodo.com/everything-you-need-to-know-about-cloudb...

[1]: https://blog.torproject.org/trouble-cloudflare

[2]: https://www.nytimes.com/2017/09/13/opinion/cloudflare-daily-...

[0]: Cloudflare provides SSL certificates to millions of web sites (even ones that don't pay us), was one of the first to deploy TLS 1.3 and quantum-resistant crypto, provides DDoS mitigation to all customers (again including free customers), etc. But yeah, we had a bug once. :/

[1]: Cloudflare now implements Privacy Pass which means Tor users mostly don't see captchas anymore.

[2]: Please read: https://blog.cloudflare.com/why-we-terminated-daily-stormer/

I agree with what you said, and I like you (so I don't want to hammer on this on a day you should be celebrating a cool thing you made), but...

You missed what I think is the most important thing: Cloudflare currently entails correlated risk, for lack of a better term. A government intrusion into CF represents access to thousands and thousands of sites' decrypted streams. This is a huge target for the US, Russian, and other spy agencies, to the extent that I cannot believe you're not already compromised.

All those small customers who are using you for free TLS should be using Let's Encrypt so they can get end-to-end encryption, necessitating individual, active attacks (I suppose on DNS) rather than sweeping, passive attacks.

I think there are some cool and good things that Cloudflare does, but it's irresponsible to minimize the threat it presents to privacy in today's internet.

[Edit: Also, if you don't want to respond to this thread, I will totally understand, and think that's reasonable. I don't want to shit on your cake!]

Isn't the entire idea of the cloud a massive correlated risk? If AWS is hacked, it would be very bad. That said, experience has seemed to show that people who build infrastructure tend to make fewer mistakes in that way than the millions of people who are building businesses and personal sites would. I agree that in a perfect world security would be easy to get right and federated, but it seems like if you have to pick one, 'right' is the best choice for now.

Do you use any cloud providers?

Yes, TLS termination is something that people get wrong, but there are other ways of decreasing that risk than to hand off the task entirely to someone else.

And yes, if AWS were compromised, that would suck. But right now a lot of CloudFlare sites are backed by AWS. So now their traffic is at risk in two places, not just one.

I don't tend to use cloud providers, no. I self-host some stuff out of my house, with reliance on DNS and CAs being the major points of "correlated risk". I use S3 for serving some public files.

> Cloudflare currently entails correlated risk, for lack of a better term. A government intrusion into CF represents access to thousands and thousands of sites' decrypted streams. This is a huge target for the US, Russian, and other spy agencies, to the extent that I cannot believe you're not already compromised.

Why is this different from a bunch of people running a LAMP monoculture on their own individual servers?

If anything, Cloudflare can use economies of scale to staff a dedicated incident response team, assuming that at all times they are already compromised and trying to stop each attacker. They can invest in systemic least-privilege isolation. They can test the latest upstream versions of software in CI and deploy patches quickly and have 24/7 on-call staff to manage those deployments. I can't do any of that on the Raspberry Pi in my bedroom. If an intelligence agency or even a not-that-intelligent agency decides they want in, they just need to wait for the next zero-day in L, A, M, or P, and bet correctly that I'm not going to patch and restart my server until at least when I get home from work. Scaling this to everyone like me is just a matter of putting their exploit in a for loop.

And I do server maintenance as my day job. I've maintained a many-thousands-of-users shared web host that has been broken into. I certainly don't expect myself as a hobbyist to do a good job of maintaining my systems; what about the person who just wants to run a website and has zero professional experience being a sysadmin?

See my response over here: https://news.ycombinator.com/item?id=16577496 (summary: "now you have two problems")

1. One of the exciting things about this specific project is that it's likely to be no longer necessary to run an EC2 VM behind your Cloudflare site any more - any computation can live entirely within Cloudflare.

2. If you're running behind Cloudflare, one pretty straightforward and common thing is to configure your web server to only respond to requests from Cloudflare. Since Cloudflare has its own WAF that's updated by a skilled security team, this decreases your exposure - something like Shellshock or the Rails mass assignment vulnerability would get dropped at the Cloudflare level before it makes it to your origin server, and nobody else can send you HTTP requests.
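The "only respond to requests from Cloudflare" check boils down to matching the client address against a list of IP ranges. Here is a rough IPv4-only sketch of that logic; the ranges below are documentation placeholders (RFC 5737), not Cloudflare's real list, and in practice this is usually done in firewall or web server config, not application code.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip.split(".").reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True if `ip` falls inside `cidr` (e.g. "203.0.113.0/24").
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // JS shifts are modulo 32, so a /0 prefix needs its own case.
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Placeholder ranges (RFC 5737 documentation blocks), NOT a real proxy list.
const ALLOWED_RANGES = ["203.0.113.0/24", "198.51.100.0/24"];

function isAllowed(ip) {
  return ALLOWED_RANGES.some((range) => inCidr(ip, range));
}

console.log(isAllowed("203.0.113.7")); // true
console.log(isAllowed("192.0.2.1"));   // false
```

The sketch does no input validation and ignores IPv6, so it is only meant to show the shape of the check.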

(At that point you can configure your machine for SSH keys only and reduce your attack surface to pre-authentication OpenSSH vulnerabilities.)

So I don't think you have two problems if you use Cloudflare. You are trading off one problem for another, yes, but for most people that's the right tradeoff.

Genuinely asking: isn't it true that Cloudflare gets to see the plaintext traffic of sites that it proxies?

Even if Cloudflare is a big champion of better encryption and is currently not doing anything shady with this ability, it's a concerning power for one organization to have.

(Of course, if Cloudflare doesn't see the plaintext, then disregard)

Isn't that true of essentially any IaaS provider? Heroku or AWS could access anything running on your machine just by instrumenting their virtualization system if they cared to.

Part of the move to the cloud was the decision that well organized companies with large security teams can do a better job protecting internet resources than the vast majority of individuals. Cloudflare is just that, for cache/firewall/etc. appliances, I don't see the difference.

That's a really good point. Unless you're using plain VMs, you're either giving your SSL keys to the provider or having them set up SSL for you.

Didn't really think about how many services do this: AWS' ELB, any serverless service, Heroku and other PaaS services, etc.

>Unless you're using plain VM

Even a plain VM is easily observable for whoever is hosting it. At the end of the day you have to either trust your service providers or do it yourself, whether that's securing your network infrastructure or emptying the trash can next to your desk.

I hope this doesn’t sound rude, but the number of people who mention Cloudflare as some kind of MITM threat and then also use a cloud provider with elastic load balancer and god knows what else at the same time is staggering - and just plain frustrating.

It does with its free offering. It has a paid version that does not MITM the connection, IIRC.

>[0]: Cloudflare provides SSL certificates to millions of web sites

With let's encrypt SSL certs are free and easier to use than ever.

>[1]: Cloudflare now implements Privacy Pass which means Tor users mostly don't see captchas anymore.

Yeah, at the expense of deanonymizing them and only if they install your addon in their web browser!

Privacy Pass does not deanonymize users. Go read the mathematics before making statements like that: https://privacypass.github.io/

Click "new identity" on the tor button plugin and privacy pass will continue to send the same tokens to the destination. Passes are persisted between browser sessions. It also identifies different people connecting via the same Tor circuit. Am I missing something?

> Am I missing something?

Yes, you're missing the whole cryptographic underpinnings of Privacy Pass which make it impossible to de-anonymize the user. I know, it sounds like impossible magic at first, but read the papers -- it actually works.

"Same tokens" or "same token"? If it sends a different token from the same set of one-time use tokens, and if their crypto does what it claims to do, then that doesn't deanonymize you.

By different users do you mean that it demonstrates to the server that multiple instances of the plugin are behind the same Tor circuit? If it's using different tokens, I don't think the server gets to learn that; it could be multiple instances of the plugin accessing 30 pages each, or one instance accessing 60 pages.

> [0]: Cloudflare provides SSL certificates to millions of web sites (even ones that don't pay us)

No, you don't. You proxy their traffic and encrypt it on the way out, undermining the security of the Web by creating a single point of failure that you've proven unable to defend.

It is noteworthy that Cloudflare's CEO thought it was inherently wrong that he could effectively take a website off the Internet that he didn't like.

That's, at least, a massively better position on censorship than many other web companies, like Google, who explicitly believes they should remove websites they don't like from the Internet.

Dude, Cloudflare is just a hosting company, the internet would carry on fine without them. I'm not sure why NYTimes is losing their rag over their long-overdue decision to take down a Nazi site. The famous nazi site is still up, so the internet continues to be place where you can post incitement to genocide - thank goodness!

If you want to rag on Cloudflare as a shabby internet citizen, ask why they provide DDoS protection services while also hosting most DDoS-for-hire sites: https://www.google.com/search?q=booter

Aren’t they improving, though? They open-sourced a quantum resistant TLS implementation.

They can improve compared to their past performance, but they are strictly an incredibly negative influence on the Internet.

I agree with that. I should admit that they’re much better at PR than they are at doing the right thing.

This was in compliance with a court order. I, for one, am still relatively appreciative corporations can't go around willy-nilly refusing to obey legal orders from the government.

If you have a problem with specific orders of our government, the correct avenue of upset is with them, or, should they be unwilling to change their position, by voting.

The prospect of scihub dying is terrifying, and scihub helped to push in that direction.
