Saturday, February 28, 2009

tac: Speed progress

OK, after running some tests, here are some conclusions:

- Culling off-screen triangles is important. There's a 20FPS difference between drawing 15,000 off-screen vertices and 5,400 - the latter should be plenty for visible tiles, so I need to do my own spatial culling.
- Tight-fitting my geometry around the sprite (i.e. minimizing the number of 0-alpha pixels I draw) helps a lot as well. I should auto-fit geometry in my atlas tool. But this still might not be fast enough: if the screen is full of tall tacos, FPS drops to 20 (alpha testing + zbuffer).
- Just using blending with tight-fitting gives me 40FPS with 30x30! Drawing all tall tacos drops it to 24FPS, which is not bad at all.
- Disabling alpha blending AND testing gives me a huge boost - to 45-50FPS even with tall tacos, and it's not really fill-limited at all. With 15,000+ vertices, it keeps a steady 30FPS.
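For the auto-fit idea in the atlas tool, here's a minimal sketch of how the tight rectangle could be found. It assumes the sprite is tightly packed 8-bit RGBA with alpha in the fourth byte; the `Rect` type and the `tight_alpha_bounds` name are made up for illustration, not taken from my actual tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rectangle type: min is inclusive, max is exclusive, in pixels. */
typedef struct { int x0, y0, x1, y1; } Rect;

/* Finds the smallest rectangle containing every pixel with alpha > 0.
 * Assumes tightly packed, row-major, 8-bit RGBA (alpha is byte 3 of each pixel).
 * Returns 1 and fills *out, or returns 0 if the sprite is fully transparent. */
int tight_alpha_bounds(const uint8_t *rgba, int width, int height, Rect *out)
{
    assert(rgba != NULL && out != NULL && width > 0 && height > 0);

    int x0 = width, y0 = height, x1 = -1, y1 = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            uint8_t alpha = rgba[((size_t)y * (size_t)width + (size_t)x) * 4 + 3];
            if (alpha == 0)
                continue;          /* fully transparent: not worth drawing */
            if (x < x0) x0 = x;
            if (x > x1) x1 = x;
            if (y < y0) y0 = y;
            if (y > y1) y1 = y;
        }
    }
    if (x1 < 0)
        return 0;                  /* no visible pixels at all */

    out->x0 = x0;
    out->y0 = y0;
    out->x1 = x1 + 1;              /* convert inclusive max to exclusive */
    out->y1 = y1 + 1;
    return 1;
}
```

The quad's corners and texture coordinates would then be derived from the returned rectangle instead of the full sprite cell. This only gives a tight axis-aligned box, so a diagonal sprite would still carry plenty of empty pixels.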

That last one is the kicker. Basically, I can potentially triangulate all of my sprites, to make them perfectly fit, and disable alpha testing/blending completely. I'll still need culling to avoid sending too many triangles, since each "sprite" might use a lot. I'll need to somehow manage the shuffling of all those vertices as well, to keep everything in one draw call with the spatial culling.

But blending+sorting might be fast enough, and it yields the best visual quality. Without visibility culling, it's totally fill-rate limited. While my test map yields 20FPS worst case, real maps could have much more overlap, causing more fill. There are some details I need to sort out (no pun intended) with between-tile animation, but I think I have a good idea for that (the problem is what order I draw things in - left to right, top to bottom, etc.). Spatial culling is still necessary, since off-screen triangles do still affect FPS. I could do some of my own visibility culling as well, and assuming the map isn't Swiss cheese, it should reduce overdraw.
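The sorting half of blending+sorting is just the painter's algorithm: draw the farthest sprites first so nearer ones blend over them. A rough sketch, assuming each sprite carries a single `depth` value that grows with distance from the viewer (the `Sprite` struct and function names here are hypothetical, and the real ordering rule for between-tile animation is exactly the part I haven't settled):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sprite record; a larger depth means farther from the viewer. */
typedef struct { int id; float depth; } Sprite;

/* qsort comparator: farthest sprite first. */
static int farther_first(const void *a, const void *b)
{
    const Sprite *sa = (const Sprite *)a;
    const Sprite *sb = (const Sprite *)b;
    if (sa->depth > sb->depth) return -1;
    if (sa->depth < sb->depth) return 1;
    /* qsort is not guaranteed stable, so break ties by id
     * to keep the draw order deterministic from frame to frame. */
    return (sa->id > sb->id) - (sa->id < sb->id);
}

/* Sorts sprites in place into back-to-front draw order. */
void sort_back_to_front(Sprite *sprites, size_t count)
{
    assert(sprites != NULL || count == 0);
    if (count < 2)
        return;                    /* nothing to reorder */
    qsort(sprites, count, sizeof *sprites, farther_first);
}
```

For a static tile map this sort would only need to run once at load time; only the moving sprites would need re-sorting per frame.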

Looks like I'll need spatial culling no matter what I decide to do. Basically, I just need a few redundant copies of all the data in separate draw lists, and just draw the ones that are currently visible (easy). Hopefully I won't need more than 4 draw calls at a time. Duplicating the data isn't a huge deal - display lists won't take more than 2 MB each.
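One way this could work (a sketch, not necessarily the layout I'll end up with) is to split the map into a grid of square chunks, each with its own draw list, and work out which chunks the view rectangle touches. If a chunk is at least as big as the view in both dimensions, the view can overlap at most 2x2 = 4 chunks, which is where a 4-draw-call ceiling would come from. All names and parameters below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive range of chunk columns and rows that need drawing. */
typedef struct { int col0, row0, col1, row1; } ChunkRange;

/* Integer division rounding toward negative infinity (b must be > 0). */
static int floor_div(int a, int b)
{
    int q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* The view covers [view_x, view_x + view_w) x [view_y, view_y + view_h) in
 * world units; the map is cols x rows chunks of chunk_size units each,
 * starting at the origin. Returns the number of visible chunks and fills
 * *out, or returns 0 (leaving *out untouched) if the view misses the map. */
int visible_chunks(int view_x, int view_y, int view_w, int view_h,
                   int chunk_size, int cols, int rows, ChunkRange *out)
{
    assert(out != NULL);
    assert(view_w > 0 && view_h > 0 && chunk_size > 0 && cols > 0 && rows > 0);

    int c0 = floor_div(view_x, chunk_size);
    int r0 = floor_div(view_y, chunk_size);
    int c1 = floor_div(view_x + view_w - 1, chunk_size);
    int r1 = floor_div(view_y + view_h - 1, chunk_size);

    if (c1 < 0 || r1 < 0 || c0 >= cols || r0 >= rows)
        return 0;                  /* view is entirely off the map */

    out->col0 = clamp_int(c0, 0, cols - 1);
    out->row0 = clamp_int(r0, 0, rows - 1);
    out->col1 = clamp_int(c1, 0, cols - 1);
    out->row1 = clamp_int(r1, 0, rows - 1);
    return (out->col1 - out->col0 + 1) * (out->row1 - out->row0 + 1);
}
```

With an illustrative 480x320 view and 512-unit chunks, a view at (400, 300) straddles a chunk corner and needs 4 draw lists, while a view at (0, 0) needs just 1.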

Triangulated-sprites could be an interesting approach, but I'd need to limit the shape-detail of each sprite. This is fine for things like walls and pillars, but for characters with complex silhouettes, this could be limiting. But then again, if the artist embraces it, it could make for a unique visual style.

There's still one thing I must try before deciding between alpha-sprites and triangulated-sprites: texture compression. Tri-stripping is probably not worth trying, and glDrawTex is unlikely to be fast at all... maybe I'll try them when I'm bored out of my mind with everything else.
