Phaser 4 Dev Report 4

Published on 3rd March 2020

Now that we're entering the month of March, I wanted to post an update on Phaser development progress. Since the release of Phaser 3.22 I switched back to working on Phaser 4 again and progress has been both rapid and exciting. I'll cover this in more depth in this report but first I wanted to give a short-term roadmap covering the next batch of work.

Phaser 3.23

I'm going to spend another few weeks on Phaser 4, getting the first release ready (more on this later). After this, I'll switch back to Phaser 3. My first task will be the release of Phaser 3.23. This should happen towards the end of March. I've already done a lot of work for 3.23, including 100% complete documentation, the brand new Rope Game Object and lots more. If you'd like to get a head start, the GitHub 'master' branch contains 3.23 in its current state, which is perfectly safe to use. There is nothing experimental in there. However, there are some really great pull requests I will merge and further issues to explore before 3.23 is published.

Once 3.23 is out, again hopefully by the end of March, I'll prepare for 3.24 and start clearing down the issues list. At this point in time, v3 releases are mostly about bug fixing. Although this isn't always the case, as the new Rope Game Object demonstrates, it is definitely my main focus while working on it.

As always, thank you to everyone who keeps contributing towards the project, both in terms of patronage and pull requests.

Phaser 4

My objective with Phaser 4 has always been to bring the codebase into the modern browser age and take full advantage of TypeScript. As well as that, it was an opportunity to revisit the API structure and redo things that weren't fully realized the first time around. I've been working solidly to this end and covering a lot of ground, so let's dive into the most recent updates.

Up until recently, I had been creating test after test, trying out different approaches and performance benchmarking various strategies. This had all been taking place using a WebGL2 Renderer and the first 34 or so tests were all built around it. You can read my previous logs for details about those. The problem, however, is that WebGL2 is effectively dead. While it has a moderately decent desktop presence, it will never appear on iOS. Given that represents a significant portion of mobile (perhaps not as large as you may think, but still enough to be a major consideration), it feels pointless pushing any further in this direction. We have the very well supported WebGL1 covering all bases today and the only real choice the future presents is WebGPU.

To this end, it was around test 35 that I decided to switch track and build out a revised WebGL1 Renderer. WebGL1 was always a requirement, anyway, just like Canvas is. This work was always going to happen, so it made sense to get started on it now rather than later. Longer-term, we'll move directly to WebGPU, as this is where the future of rendering on the web lies.

Canvas 4 Ever

Before I dive into the WebGL1 side of things, I just wanted to cover Canvas quickly. A number of frameworks consider canvas support a "legacy" feature or have deprecated it entirely. I understand the reasoning: after all, you can do so much more with WebGL and, more importantly, support for it is virtually everywhere. Yet Canvas is far from dead and to this day has a couple of significant benefits. The first is that if file size is an issue (as I know it is for a lot of devs here creating playable ads), then a Canvas Renderer is often more than enough and can easily be significantly smaller in size than even the most streamlined WebGL one. Secondly, new features are still being added to it.

A few months ago at Chrome Dev Summit 2019, there was a great session on upcoming features such as the new Text Metrics API, which finally gives us accurate text measurements! There are also some new primitives and path constructs, including rounded rectangles and ovals; the Recorded Pictures API, which allows you to record all of the commands a canvas receives and then play them back to the same canvas, or another one; and finally the new Batched Draw Image API, allowing you to draw multiple images with a single API call. This is a significant feature in its own right.

Add Offscreen Canvas to this list and honestly, there is plenty of life in the old dog yet. This is why Phaser 4 will include a Canvas Renderer, just like all versions before it and will not consider it deprecated. As the new browser APIs land, I'll expose them as best I can. Now, back to WebGL.
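The Text Metrics API mentioned above is easy to sketch. The `actualBoundingBoxAscent` and `actualBoundingBoxDescent` properties are real TextMetrics fields; the helper below just combines them into the true rendered glyph height (the commented canvas lines assume a browser context):

```typescript
// Shape of the TextMetrics fields we care about.
interface TextBounds {
  actualBoundingBoxAscent: number;
  actualBoundingBoxDescent: number;
}

// Height of the actual rendered glyphs, not just the font's em box.
function textHeight(metrics: TextBounds): number {
  return metrics.actualBoundingBoxAscent + metrics.actualBoundingBoxDescent;
}

// In a browser you would feed it real metrics:
// const metrics = ctx.measureText('Hello Phaser');
// const height = textHeight(metrics);
```

This is exactly the measurement canvas text rendering has historically lacked, which is why it matters so much for text-heavy game UI.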

WebGL1 Renderer

I went right back to the beginning with this. Literally starting from square one (or perhaps I should say 'quad one'?!) and coding it from scratch. The key wasn't to drop in every possible feature, or even to make it as 'flexible' as possible, it was to just handle one thing and handle it well: the rendering of sprites.

I started building the renderer up, piece by tiny piece. The very first version shoved a tri on-screen:

The second a quad:

And so it went, piece by piece. I added in texture frame / UV support, a camera matrix, a projection matrix, a static buffer, sub-data tests, batching, a basic display list, quad positions and on and on.
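To give a flavour of one of those pieces, here's a sketch (not the actual Phaser 4 source) of the kind of 2D orthographic projection matrix a sprite renderer uploads as a uniform. It maps pixel coordinates into WebGL clip space (-1 to 1), with y flipped so (0, 0) is the top-left:

```typescript
// Build a column-major 4x4 orthographic projection for a 2D canvas
// of the given pixel size. The layout matches what WebGL's
// uniformMatrix4fv expects.
function ortho2D(width: number, height: number): Float32Array {
  const m = new Float32Array(16);
  m[0] = 2 / width;    // scale x into clip space
  m[5] = -2 / height;  // scale and flip y
  m[10] = -1;          // flatten z for 2D
  m[12] = -1;          // translate so x = 0 maps to -1
  m[13] = 1;           // translate so y = 0 maps to +1
  m[15] = 1;
  return m;
}

// Typical usage in a renderer (uniform name is illustrative):
// gl.uniformMatrix4fv(uProjectionLoc, false, ortho2D(800, 600));
```

Multiply this with the camera matrix in the vertex shader and every sprite can be positioned in plain pixel coordinates.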

I built on top of every version, iterating as I went, testing different approaches. You can track the entire evolution of the renderer here on the Phaser 4 Dev Site. You can also find all of the code for each part here on GitHub.

Feel free to try them for yourself. Some demos are interactive, some may crash your browser :) You should think of them as being presented in chronological order. The further down you get, the more complete the renderer becomes.

Multi Texture Support

I was adamant that we would have full multi-texture support right from the start, so this is in and working right now. It's something I really should try and add to Phaser 3, too, after it was frustratingly left out of the original design. The 25 parts culminate in the classic, albeit of limited use, bunny mark tests.

Feel free to try them for yourself and click on the canvas to add in more bunnies to the mix.

Note: None of the tests are expected to work on older browsers and likely not on mobile either.

I produced two bunny mark tests, one using multi-texture support and the other using a single texture. The difference varies based on the GPU and platform, but on my desktop gaming-grade PC I easily get an extra 4 fps when rendering the exact same number of bunnies using the single texture shader, as opposed to the multi-texture shader. This is to be expected, as the single texture shader has zero branching in it. Of course, in a typical asset-heavy game, the multi-texture approach is much more useful. But, if you know for a fact you can fit all your assets in a single texture atlas, then not having multi-texture support in the shader absolutely eases up the load on the GPU.
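The branching cost comes from a GLSL ES 1.0 limitation: a fragment shader can't index a sampler array with a varying, so a multi-texture shader has to branch on the texture id per fragment. A common way to handle this (a sketch, not the actual Phaser 4 shader, and the uniform/varying names are illustrative) is to generate the shader source for however many texture units the GPU reports:

```typescript
// Generate a WebGL1 fragment shader that picks one of N bound
// samplers based on a per-vertex texture id -- the if/else chain
// is the branching a single-texture shader avoids entirely.
function multiTextureFragSrc(maxTextures: number): string {
  let branches = '';
  for (let i = 0; i < maxTextures; i++) {
    branches += `  ${i > 0 ? 'else ' : ''}if (vTextureId == ${i}.0) { color = texture2D(uTextures[${i}], vUv); }\n`;
  }
  return `precision mediump float;
uniform sampler2D uTextures[${maxTextures}];
varying vec2 vUv;
varying float vTextureId;
void main (void) {
  vec4 color;
${branches}  gl_FragColor = color;
}`;
}
```

In practice you would cap `maxTextures` at `gl.getParameter(gl.MAX_TEXTURE_IMAGE_UNITS)` and compile the result once at startup.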

At the end of this process, I had a solid multi-texture batching renderer in place. And what's more, even including the shader source and helpers for loading, textures, matrix math, and a sprite class, it was still only 3.6 KB (min+gz) in size. The single texture shader version was even less, at 3.1 KB. This got me thinking.

Phaser Nano

At this point, I remembered a project I had started back in 2015, Phaser Nano. The whole point of Phaser Nano was to be a bare minimum version of Phaser 2, with just the core elements you needed in order to load assets, render sprites quickly, interact with them and take up the least possible filesize. The point being it would be small enough to use in extreme low-size environments, such as banner ads and playables, where every byte counts.

With this buzzing around my mind, I started refactoring my huge pile of tests into a clean, usable micro-framework. Phaser 4 Nano was born. To date, I've completed all of the following features:

* WebGL1 Multi-Texture Batched Renderer.
* Loader with support for Images, Sprite Sheets, Texture Atlases, JSON, CSV, and XML.
* Game + loop with delta values passed to your Scene.
* Texture Manager, with base Texture and Frame, including trimmed atlas frame support.
* An easy to extend Scene class file.
* A new TypeScript conversion of EventEmitter3 using native ES6 features.
* Keyboard input handler.
* A basic Camera.
* A textured Sprite entity, supporting per-vertex tint and alpha, rotation, scale, origin, and skew.
* A Container entity.
* A Sprite Buffer class, for zero-overhead static sprite rendering in huge numbers.
* A complete transform list. Every Sprite is a Container by default, nest them as deep as you like.

With all of the above included, it's still only 7.60 KB (min+gz), or 22.5 KB min.
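As one example from that list, this is the kind of thing the EventEmitter3 conversion involves. This is just a sketch, not the actual Phaser Nano source, but it shows how native ES6 features (Map and Set) replace the hand-rolled storage of the original library while keeping the familiar `on` / `off` / `emit` shape:

```typescript
type Listener = (...args: unknown[]) => void;

// Minimal EventEmitter3-style emitter built on native ES6 collections.
class EventEmitter {
  private events = new Map<string, Set<Listener>>();

  on(event: string, fn: Listener): this {
    let listeners = this.events.get(event);
    if (!listeners) {
      listeners = new Set();
      this.events.set(event, listeners);
    }
    listeners.add(fn);
    return this;
  }

  off(event: string, fn: Listener): this {
    this.events.get(event)?.delete(fn);
    return this;
  }

  // Returns true if at least one listener was called.
  emit(event: string, ...args: unknown[]): boolean {
    const listeners = this.events.get(event);
    if (!listeners || listeners.size === 0) return false;
    for (const fn of listeners) fn(...args);
    return true;
  }
}
```

Using a Set also means duplicate registrations of the same listener are free deduplication, which the original array-based storage had to check for manually.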

Sprite Buffers

I wanted to quickly mention what the Sprite Buffer class is. With a traditional batch rendering system, the biggest cost (aside from swapping shaders) is uploading all of the buffer data to the GPU. This is why you typically have quite small buffer sizes, say several thousand vertices, and then flush and draw them. It doesn't matter how you store this data or write to it, the very fact of uploading it each frame can be a real performance killer when you do it a lot.

However, if you have a large bunch of sprites you want to render that don't need to move or transform in any way, then you can put all of their transform data into a static buffer, upload it to the GPU and be done with it. Each frame, instead of re-sending all the data, you just tell WebGL to draw from the buffer and it happily does so.

(^ 100,000 sprites can get a bit messy!)

It's insanely fast. Think hundreds of thousands, or even millions, of sprites per frame, rather than just thousands. Of course, you're trading memory for this, because the buffer data has to be held on the GPU - and the larger the buffer, the more memory it takes up. So you still need to be careful. However, if you've got the sort of game where you need to render a large number of images that, once set, don't need to move again (other than when the camera moves), then this is ideal. I added the Sprite Buffer into Phaser Nano specifically for this purpose and it's perfect for things like game backdrops or maps.
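The CPU-side half of that idea can be sketched as follows. This isn't Phaser Nano's actual vertex format, just an illustration: bake every static sprite's quad into one big Float32Array once, upload it once with `gl.STATIC_DRAW`, and from then on each frame is a single draw call with no re-upload:

```typescript
// Position-only static sprite description (illustrative).
interface StaticSprite { x: number; y: number; w: number; h: number; }

// Pack each sprite as two triangles (6 vertices, x/y per vertex)
// into one contiguous buffer, ready for a single STATIC_DRAW upload.
function packQuads(sprites: StaticSprite[]): Float32Array {
  const data = new Float32Array(sprites.length * 12);
  let offset = 0;
  for (const { x, y, w, h } of sprites) {
    // triangle 1: top-left, bottom-left, bottom-right
    data.set([x, y, x, y + h, x + w, y + h], offset);
    // triangle 2: top-left, bottom-right, top-right
    data.set([x, y, x + w, y + h, x + w, y], offset + 6);
    offset += 12;
  }
  return data;
}

// Upload once:
//   gl.bufferData(gl.ARRAY_BUFFER, packQuads(sprites), gl.STATIC_DRAW);
// Then per frame, only:
//   gl.drawArrays(gl.TRIANGLES, 0, sprites.length * 6);
```

A real implementation would interleave UVs and tint data too, but the principle is the same: the per-frame cost drops to a bind and a draw.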

Return of the Container

When Phaser 3 was originally devised we only ever wanted it to have a flat display list. No parents, no children. Just a lovely, easy, fast list that you could control by setting the z-depth of the sprites. Unfortunately, the most requested feature, time and time again, from the community was the ability to create a Container that you could put sprites, and other game objects, into.

We added this to Phaser 3 and, honestly, it caused quite a few issues. There's some nice spaghetti code still in there, even today, because of this late-stage change. In retrospect, I think we should have left them out, but hindsight is a wonderful thing.

It's not a mistake I'm making with any release of Phaser 4, even including this little Nano version! From the very start, Containers are in and what's more, every Sprite is a Container. You can add Sprites as children of other Sprites, to any depth you want, just like you did in Phaser 2.

The difference is that, unlike in Phaser 2, the transform list isn't deferred. It doesn't update at render time, it updates immediately, so you don't get stuck with outdated (i.e. one frame old) values in your transform properties, which was really frustrating in lots of situations. Obviously, the deeper your transform nest goes, the more iterations take place per update. So please, for the sake of VMs everywhere, keep them to a minimum.

Note that just because Sprites are linked in this way for their transforms, it doesn't mean they have to render in that order too. Of course, that will be the default. But there is nothing stopping them from rendering based on their z-depth, regardless of what their transform depth is.
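Decoupling render order from transform order can be as simple as a flat, depth-sorted render list. A sketch, with illustrative names:

```typescript
// Anything with a depth value can be placed in the render list,
// regardless of where it sits in the transform hierarchy.
interface Renderable { name: string; depth: number; }

// Return a depth-sorted copy; Array.prototype.sort is stable in
// modern engines, so equal depths keep their insertion
// (i.e. transform) order as the default.
function renderOrder(list: Renderable[]): Renderable[] {
  return [...list].sort((a, b) => a.depth - b.depth);
}
```

So a deeply nested child can still render behind its own parent, or in front of everything, purely by adjusting its depth.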

First Release of Phaser 4 Nano is Coming Soon

Right now I'm still building Nano in the Phaser 4 Dev repo and it's changing significantly every single day. When I reach a state I'm happy with, I'll merge it all back into the main repo. There's a very small handful of features left that I want to implement and then I will release the first public version. All being well, I hope this will happen within the next few weeks.

It's been quite a journey of exploration to get to this stage. I'm very pleased with what I've learned and built so far. There's a lot more work to come, of course, to build out the full Phaser 4 version. But Nano is the correct first step to take and it's a way to get a useful, working release into the hands of those I know need it, and those who are just curious to tinker!
