Showing posts with label 2D Core. Show all posts

Sunday, August 25, 2013

sverx in GBA homebrew land...

(I wrote this post quite a long time ago, during the port of Waimanu Daring Slides to the GBA. It has been sitting here for a while, sorry for the delay. You can find the result of my pretty hard work here, on the Disjointed Studio blog.)

When Nintendo were planning the DS, they decided that the new console would be compatible with its predecessor, the Game Boy Advance (GBA for short). So they put in the same processor, an ARM7TDMI, which is the only processor in the GBA and the 'secondary' processor in the DS when running native DS code. They also put in an evolution of the same 2D core, giving it many powerful new features. So the GBA, from the point of view of a DS homebrewer, is not completely different, but there are lots of differences that you should keep in mind if you decide to venture into GBA homebrew land.
So here's an overview of the GBA 2D core features compared to the DS ones, leaving out the things you may easily notice, such as the GBA having only one screen with a resolution of 240x160 pixels, whereas the DS has two 256x192 pixel screens and 2 separate 2D cores. Of course, the GBA has no 3D core at all.
Please note that the list isn't comprehensive, and I'm describing differences pertaining to the graphical 2D core only.

- The DS supports up to 4 backgrounds at the same time. Two of them can only be 'normal' backgrounds (no rotation or scaling is supported on these), but you can choose the type of the other two. So your options are to have 4 normal backgrounds, or 3 normal backgrounds and 1 that supports rotation and scaling (known as a 'rotscale' or 'affine' background), or even 2 normal and 2 rotscale backgrounds. With the GBA you can have 4 normal backgrounds too, but if you need rotscale backgrounds, you have to give up two normal backgrounds for each rotscale background you want to use. So you'll end up with 2 normal backgrounds and just one rotscale background, or 2 rotscale backgrounds with no other backgrounds at all.
- The DS also features 'extended' rotscale backgrounds, which are rotscale backgrounds supporting up to 1024 different tiles, where each tile can optionally be flipped horizontally and/or vertically and use one of 16 separate 16-color or 256-color palettes. On the GBA there's no such 'extended' rotscale thing, and 'regular' rotscale backgrounds are 256-color backgrounds that support up to 256 different tiles only, with no flipping and of course no palette selection.
- Extended rotscale backgrounds on the DS can also become bitmap backgrounds, making it possible to have bitmaps over (or under!) text/rotscale backgrounds, or even one bitmap over another. The GBA, on the other hand, has very few bitmap-oriented features. You can have only a single bitmap background, even if you have 3 choices of what to show in it. The first choice is a 240x160 15bpp (32 thousand colors) bitmap with a single framebuffer, since such a bitmap requires 75 KB and there's not enough VRAM to have two of them. The second choice is a 240x160 256-color bitmap with a double framebuffer, and the last choice is a rather bizarre 160x128 15bpp bitmap background with a double framebuffer.
- The DS main 2D core (but not the sub core) has a 'large bitmap mode', featuring a single 1024x512 bitmap background. Of course, that doesn't exist on the GBA.
- Palettes: the DS has 16 additional 256-color palettes (known as 'extended' palettes) for backgrounds plus another 16 for sprites, besides the regular 256-color palette for the backgrounds and the regular one for the sprites. Both of these regular palettes can also be used as if they were 16 separate 16-color palettes, for 16-color tiles and sprites. The GBA features the very same regular palettes, but there are no 'extended' palettes.
- The GBA has only a total of 96 KB of video RAM, of which 64 KB are dedicated to background maps and tiles, and 32 KB dedicated to sprites. This means, for example, that only 512 different 256-color tiles for sprites can be stored here, even if the GBA 2D core could use up to 1024 different tiles. Also, when choosing a bitmap mode, only 16 KB are left for sprites as the first 80 KB of VRAM are bound to the background bitmap framebuffer(s).
- On the GBA the sprites always overlap each other according to their order in the OAM (Object Attribute Memory). This means that sprite number 0 will always be 'on top' of sprite number 1, even if the latter has a higher priority than the former. On the DS, by contrast, priority also works sprite-on-sprite, not simply sprite-on-background.
- Bitmap Objects (also known as 15bpp sprites) don't exist on GBA.
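The VRAM figures in the list above are easy to double-check with a bit of arithmetic. Here is a minimal C sketch of that check; the helper names and constants are made up for this example and don't come from any GBA library.

```c
#include <stdint.h>

/* Sizes quoted in the list above, expressed in bytes. */
enum {
    GBA_VRAM_BYTES        = 96 * 1024, /* total video RAM                 */
    GBA_BITMAP_AREA_BYTES = 80 * 1024, /* bound to the bitmap framebuffers */
    TILE_256_COLOR_BYTES  = 8 * 8      /* one 8x8 tile, 1 byte per pixel   */
};

/* Bytes needed by one framebuffer of the given size and pixel width. */
static uint32_t framebuffer_bytes(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Number of 256-color tiles that fit in a sprite area of `bytes` bytes. */
static uint32_t sprite_tiles_256(uint32_t bytes)
{
    return bytes / TILE_256_COLOR_BYTES;
}
```

For instance, framebuffer_bytes(240, 160, 2) gives 76800 bytes, which is exactly 75 KB, and two 160x128 15bpp buffers add up to 80 KB. With the 16 KB left for sprites in the bitmap modes, sprite_tiles_256 gives 256 tiles, half of what fits in the full 32 KB.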

All that said, please don't let this scare you. It's really a lot of fun to code on that neat little machine, and it will surely give you lots of satisfaction.

Wednesday, March 20, 2013

One hundred twenty-seven shades of grey

Nothing to do with the bestseller, I assure you.
As you may already know, the DS 2D core supports paletted and direct colors, both of which are expressed using 5 bits per primary color (red, green, blue). This is 15bpp, also known as HiColor mode.
Having 'only' five bits per primary color means that there are just 32 different shades of gray that can be defined including the darkest - black, and the brightest - white.
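To make the 32 shades concrete, here's a small C sketch that packs a grey level into a 15-bit color word. The function name grey15 is invented for this example; the bit layout (red in the low 5 bits, then green, then blue) is the usual GBA/DS one.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 5-bit grey level (0..31) into a 15-bit color word:
   red in bits 0-4, green in bits 5-9, blue in bits 10-14.
   A grey has the same value in all three primaries. */
static uint16_t grey15(unsigned level)
{
    assert(level < 32);  /* only 32 shades exist with 5 bits */
    return (uint16_t)(level | (level << 5) | (level << 10));
}
```

Level 0 is black and level 31 is white, so there are only 30 intermediate shades between them.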


Back in 2008, while reading the GBATEK specifications, I found out that the DS screens were 18bpp LCD panels, so I started a topic on the gbadev forum suggesting that there might be a way to exploit this. It turned out that, using the hardware alpha blending capabilities of the 2D core, you can indeed force the hardware to show real 18bpp images. Fellow forum member Cydrak even posted a very good demo back then, and here's his original forum post with the link to his 18bpp demo.
So, having now 6 bits per primary color, it's possible to display 64 shades of grey. The improvement is significant, even if banding is still noticeable.


Pushing this further has been tickling me ever since, so very recently I decided to give it a try. Of course, the hardware limit of 18bpp can't be overcome, and the display can't show more colors than it's capable of. However, having a 2D core that can generate 60 frames per second, we can exploit the persistence of vision of the human eye. The idea is that if we display two slightly different images alternately at a sufficiently high rate (60 times per second is surely high enough), our retinas will perceive a sort of average of the two images. And that's what we get: 127 shades of grey, that is the 64 real ones plus the 63 that fall halfway between two adjacent real shades. The additional 63 shades aren't really there, of course: they're simply the result of our perception.
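One way to picture the trick is to split each of the 127 perceived levels into the two 6-bit levels shown on alternating frames. The C sketch below does just that; the GreyPair type and the split_grey function are hypothetical names for this illustration and aren't taken from the demo.

```c
#include <assert.h>
#include <stdint.h>

/* The two 6-bit grey levels (0..63) shown on alternating frames. */
typedef struct {
    uint8_t even_frame;
    uint8_t odd_frame;
} GreyPair;

/* Split a perceived level (0..126) into two 6-bit levels whose sum is
   `level`, so that their average is level / 2. The two levels are equal
   for even inputs and differ by exactly one step for odd inputs. */
static GreyPair split_grey(unsigned level)
{
    GreyPair pair;
    assert(level <= 126);  /* 64 real shades + 63 in-between shades */
    pair.even_frame = (uint8_t)((level + 1) / 2);
    pair.odd_frame  = (uint8_t)(level / 2);
    return pair;
}
```

Even levels map to the same real shade on both frames, while odd levels alternate between two adjacent real shades, and those are the ones that only exist in our perception.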


Note: the second and third image in this post are fake: there are no emulators that can show 18bpp and, of course, there's no way of making any emulator show the image that your eyes perceive. So, I suggest that you test the demo yourself on your DS. You can download it here.

Monday, September 03, 2012

Hardware generated smooth scaling

In my August posts I focused on the improvements made to DSx86's ARM ASM smooth scaling routine, where I did my best to make it as fast as possible, knowing that every CPU cycle saved there would prove useful in the emulation main loop. Then it took me a few more months to realize that the same result can actually be achieved by properly programming the NDS 2D graphical core. So here's how I did it.

The smooth scaling routine takes groups of five 256-color pixels on the same line and turns them into four 32K-color pixels on the DS screen by performing many palette lookups and regular/weighted averages, as we've seen already. The DS 2D core, on the other hand, can perform alpha blending between two backgrounds, without requiring any effort from the CPU. This alpha blending feature can achieve nothing less than an average between each pixel of the first background and the corresponding pixel on the second background, returning a 32K-color image.(1) Additionally, the 2D core can also perform background scaling. We need to exploit both these features.

Let's define the 5 original pixels as p0-p4, and the resulting 4 output pixels as r0-r3. What we need to get is:

r0 as the sum of 3/4 p0 and 1/4 p1
r1 as the sum of 1/2 p1 and 1/2 p2
r2 as the sum of 1/4 p2 and 3/4 p3
r3 as p4
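For reference, the four weighted sums above could be written in plain C for a single color channel as in the sketch below. This is only an illustration of the arithmetic, not the DSx86 ASM routine (which also performs the palette lookups); the function name and the round-to-nearest terms are assumptions.

```c
#include <stdint.h>

/* Blend one color channel of five source pixels p[0..4] into four
   output pixels r[0..3], using the weights listed above.
   The "+ 2" and "+ 1" terms round to nearest (an assumption). */
static void scale5to4_channel(const uint8_t p[5], uint8_t r[4])
{
    r[0] = (uint8_t)((3 * p[0] + p[1] + 2) / 4);  /* 3/4 p0 + 1/4 p1 */
    r[1] = (uint8_t)((p[1] + p[2] + 1) / 2);      /* 1/2 p1 + 1/2 p2 */
    r[2] = (uint8_t)((p[2] + 3 * p[3] + 2) / 4);  /* 1/4 p2 + 3/4 p3 */
    r[3] = p[4];                                  /* p4 unchanged    */
}
```

Doing this for three channels of every pixel, palette lookups included, is the per-pixel CPU work that the hardware approach avoids.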

If we could blend 4 backgrounds together we could simply copy specific pixels in the 4 backgrounds to obtain this (please check that each output pixel is exactly as expected):

BG0: p0 p1 p2 p4
BG1: p0 p1 p3 p4
BG2: p0 p2 p3 p4
BG3: p1 p2 p3 p4
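If you'd rather have the machine do the checking, here's a small C sketch that sums the four layers at each output position. The table is copied from the BG0-BG3 rows above; the names layer_src and layer_sum are made up for this example.

```c
#include <stdint.h>

/* Source pixel index shown by each background at output positions
   r0..r3 (one row per background, BG0 first). */
static const uint8_t layer_src[4][4] = {
    {0, 1, 2, 4},
    {0, 1, 3, 4},
    {0, 2, 3, 4},
    {1, 2, 3, 4},
};

/* Sum of the four layers at output position `out` (0..3) for one color
   channel. Dividing the result by 4 gives the equally weighted blend. */
static unsigned layer_sum(const uint8_t p[5], unsigned out)
{
    unsigned sum = 0;
    for (unsigned bg = 0; bg < 4; bg++)
        sum += p[layer_src[bg][out]];
    return sum;
}
```

The sum at r0 comes out as 3*p0 + p1, at r1 as 2*p1 + 2*p2, at r2 as p2 + 3*p3 and at r3 as 4*p4, which after dividing by 4 are exactly the weights we were after.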

Since the 2D core can do background scaling, we don't even need to copy specific pixels. Each background can be generated the way we need it, starting from the unmodified original image stored in Video RAM, by using the scaling features. Thus, we program the 2D core to skip one source pixel in each group of five, and choose which pixel has to be skipped.
For example, to generate one of the backgrounds (the code here does it for BG2), we have to program the background affine matrix to scale a 320-pixel wide image into a 256-pixel wide background:

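REG_BG2PA = (320 << 8) / 256;  /* horizontal step: 1.25 source pixels per output pixel, 8 fractional bits */
REG_BG2PB = 0;                 /* no shearing or rotation */
REG_BG2PC = 0;
REG_BG2PD = (1 << 8);          /* vertical step: 1.0, no vertical scaling */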


Then we should tell the 2D core to skip pixel p1. This is accomplished by using the reference point X coordinate register:

REG_BG2X = (3 << 8) / 4;

You can think of this register as a sort of counter for the fractional part. We initialize it to a precise value (3/4, in this case) and after each output pixel has been generated, 1/4 gets added to this counter (because 320 divided by 256 gives 1 plus a fractional part of 1/4). When the counter reaches one, the scaling process skips one pixel of the original image, and in this case that happens after processing one pixel. We can also tell the 2D core to use the same 320x200 bitmap for all the backgrounds, then program a different reference point X coordinate value for each background.
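The stepping can be simulated in a few lines of C, assuming the hardware simply takes the integer part of the running X coordinate when picking the source pixel. The function name source_column is hypothetical, and only the 3/4 starting value comes from the text above; the starting values for the other three backgrounds (1/4, 1/2 and 1) are worked out here from the BG0-BG3 table and should be treated as assumptions.

```c
#include <stdint.h>

#define FIX_ONE 256  /* 1.0 with 8 fractional bits */

/* Source column sampled for output column `out`, given the reference
   point X and the horizontal step PA, both with 8 fractional bits.
   Assumes the source pixel is the integer part of the coordinate. */
static int32_t source_column(int32_t ref_x, int32_t pa, int32_t out)
{
    return (ref_x + pa * out) >> 8;
}
```

With pa = (320 << 8) / 256 and ref_x = (3 << 8) / 4, output columns 0 to 4 read source columns 0, 2, 3, 4 and 5, so p1 is skipped and the pattern repeats every four output pixels. Under the same assumption, starting from 1/4 skips p3, starting from 1/2 skips p2, and starting from exactly 1 skips p0.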

Unfortunately, we can't ask the 2D core to blend all 4 backgrounds at the same time. However, we can make it blend 2 of these backgrounds in one frame and the other 2 in the next frame, at 60 frames per second.(2) The LCD screen and our retinas will average the 2 generated images, in effect providing the expected result.
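As a rough model of why two frames are enough, the C sketch below treats each frame as a 50/50 blend of two layers and the perceived value as the average of the two frames. It's a simplification (real blending precision and the response of the LCD and the eye are ignored), and the function names are invented for this example.

```c
/* 50/50 alpha blend of two layer values, as done within one frame. */
static unsigned blend_half(unsigned a, unsigned b)
{
    return (a + b) / 2;
}

/* Perceived value when one frame blends layers a and b and the next
   frame blends layers c and d: the average of the two frames. */
static unsigned perceived(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return (blend_half(a, b) + blend_half(c, d)) / 2;
}
```

Rounding aside, perceived(a, b, c, d) equals (a + b + c + d) / 4, so in this model it doesn't matter which two layers share a frame: the result is the four-way blend that the hardware can't produce in a single pass.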

DSx86 actually uses a slightly different implementation. It performs vertical scaling at the same time (200 lines down to 192 in VGA "Mode 13h" and 240 lines down to 192 in VGA "Mode X", using different affine matrices) in the so-called 'Jitter' mode.

(1) The DS screen output supports 18bpp color, and alpha blending is probably performed with even more precision.
(2) Since only BG2 and BG3 support bitmap backgrounds, the code will blend these two, redefining them as needed on each frame.