New 3D FPS released for 1979 Atari 800 (atari8.dev)
405 points by legoxx on Nov 3, 2021 | 76 comments



Slovak group GMG has released a full-featured FPS for a 40+ year old computer (64 KB of RAM, 1.79 MHz 6502)

Features:

- raycaster engine running at 25-30 FPS

- animated textures

- lighting system

- destructible walls

- automap

- 3 enemy types

- final boss

The game is fully playable in the Altirra emulator.

video: https://www.youtube.com/watch?v=lRd3MucaRoU

homepage: https://atari8.dev/final_assault

discussion: https://atariage.com/forums/topic/326709-final-assault-new-g...


The FPS is really smooth, and the floating point (?!) calculations of the raycasting engine seem to be totally "on point"!


A comment on one limitation of the ray casting:

https://atariage.com/forums/topic/326709-final-assault-new-g...

Simply amazing what they did. Could you imagine if this had come out in the 80s?!


Yes, this would have been mind-blowing.

PS: there was a game that had a similar FPS view in those days. It was a maze game with polygon graphics but no textures. I forget the name. No shooting, though!


* Mercenary: Escape from Targ (Novagen) http://mercenarysite.free.fr/mercframes_graphic.htm "open world" vector FPS/RPG, fully RAM-resident, in 48k & 64k editions.

* Alternate Reality: The City (Datasoft/Paradise Programming) FRPG with 90-degree turns, rendered walls/doors as scaled textures between the animated backdrop and foreground NPC sprites.

I wrote an FAQ for Mercenary as a kid for local BBSes - I think I discovered at least 3 victory conditions. :^)


There were a few.

I had Sultan's Maze[1][2] and Arcticfox[3], but plenty of others existed too.

[1] https://en.m.wikipedia.org/wiki/Sultan's_Maze

[2] https://youtu.be/af1CfayG5Ec

[3] https://en.m.wikipedia.org/wiki/Arcticfox


I could tell something was off, but couldn't put my finger on it. Very cool.


So I was curious about how the Atari 800 handled floating point calculations. As it turns out, Steve Wozniak helped develop the FP routines for the 6502.

http://archive.6502.org/publications/dr_dobbs_journal_select...


I doubt there are any floating point numbers, because FP is very slow to emulate (for example, just to add two numbers you have to shift mantissas, and the 6502 has no fast way to do that). If I were writing a game for an 8-bit CPU, I would use fixed point numbers (for example, 8.8 numbers, which use 8 bits for the integer part and 8 bits for the fractional part, or maybe 10.6 or 12.4 numbers).
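
A minimal sketch of 8.8 fixed point in C, just to make the idea concrete (the helper names and demo values are mine, not from the game):

    #include <stdint.h>
    #include <stdio.h>

    typedef int16_t fx8_8;                 /* 8 integer bits, 8 fractional bits */

    #define FX_ONE 256                     /* 1.0 in 8.8 */

    static fx8_8 fx_from_int(int n)        { return (fx8_8)(n * FX_ONE); }
    /* Addition is a plain integer add - on the 6502, just CLC/ADC over two bytes. */
    static fx8_8 fx_add(fx8_8 a, fx8_8 b)  { return (fx8_8)(a + b); }
    /* Multiplication needs a wider intermediate and a shift back down. */
    static fx8_8 fx_mul(fx8_8 a, fx8_8 b)  { return (fx8_8)(((int32_t)a * b) >> 8); }

    int main(void) {
        fx8_8 half = FX_ONE / 2;                        /* 0.5 */
        fx8_8 x = fx_add(fx_from_int(3), half);         /* 3.5 */
        printf("%f\n", fx_mul(x, x) / (double)FX_ONE);  /* prints ~12.25 */
        return 0;
    }

The shift inside fx_mul is exactly the multi-byte shifting the 6502 is slow at, which is where the logarithm trick below comes in.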

The 6502 cannot multiply or divide (division is a costly operation even on modern CPUs), so I would add or subtract logarithms for this purpose. A 12-bit precision logarithm table requires just 8 KB of RAM (and you can get 13-bit precision with interpolation).
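
A sketch of multiply-by-logarithms in C. The table sizes and scaling here are illustrative and sized for a PC demo; a real 6502 version would precompute much tighter tables and index them with 8-bit-friendly math:

    #include <stdint.h>
    #include <stdio.h>
    #include <math.h>

    #define SCALE 256                          /* log2 stored as 8.8 fixed point */
    static uint16_t log_tab[4096];             /* log2(i) * SCALE for i = 1..4095 */
    static uint32_t exp_tab[24 * SCALE + 1];   /* 2^(j / SCALE), covers a sum of two logs */

    static void init_tables(void) {
        for (int i = 1; i < 4096; i++)
            log_tab[i] = (uint16_t)(log2((double)i) * SCALE + 0.5);
        for (int j = 0; j <= 24 * SCALE; j++)
            exp_tab[j] = (uint32_t)(pow(2.0, (double)j / SCALE) + 0.5);
    }

    /* a * b ~= 2^(log2(a) + log2(b)): two lookups, one add, one inverse lookup. */
    static uint32_t mul_log(uint16_t a, uint16_t b) {
        if (a == 0 || b == 0) return 0;
        return exp_tab[log_tab[a] + log_tab[b]];
    }

    int main(void) {
        init_tables();
        printf("%u\n", (unsigned)mul_log(100, 37));   /* ~3700, small relative error */
        return 0;
    }

Division is the same trick with a subtraction in place of the add.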

The disadvantage of this approach is that every time you convert a number from linear to logarithmic form or vice versa you get up to approximately 0.3% error (for a 12-bit logarithm). So multiplying or dividing several numbers in a row is fine, but if you have to alternate multiplications with additions you will accumulate error, so I would look for formulas that avoid this. For a game, though, a little error in the calculations is not noticeable.

Also, one might think that the most time-consuming part of a pseudo-3D game is the math and calculations. I doubt that. Most CPU cycles are usually spent on rasterisation and applying textures. It is easy to calculate the positions of the 3 vertices of a triangle, but it takes a lot of time to draw it line by line, pixel by pixel, and if you want textures this time can be multiplied 5x-10x.


An 8 KB table for logs would consume 12.5% of the 64 KB you have for the engine, game, intro and end sequence.


> for example, just to add two numbers you have to shift mantissas and 6502 has no fast way to do it

How much work is there to do besides shifting the mantissa? (A shift is also necessary for many fixed point calculations.)
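
For a feel of the extra bookkeeping, here is a toy software float add in C (a made-up 16-bit format, positive normalized values only, no rounding - not the format the 6502 ROMs used). Besides the alignment shift you also unpack, compare exponents, renormalize, and repack:

    #include <stdint.h>

    /* Toy float: 5-bit exponent, 10-bit mantissa with an implicit leading 1. */
    static uint16_t toy_add(uint16_t a, uint16_t b) {
        uint16_t ea = a >> 10, eb = b >> 10;          /* 1. unpack exponents   */
        uint32_t ma = (a & 0x3FF) | 0x400;            /*    and mantissas with */
        uint32_t mb = (b & 0x3FF) | 0x400;            /*    the implicit 1 bit */
        if (ea < eb) {                                /* 2. order by exponent  */
            uint16_t te = ea; ea = eb; eb = te;
            uint32_t tm = ma; ma = mb; mb = tm;
        }
        mb >>= (ea - eb);                             /* 3. align: the shift   */
        uint32_t m = ma + mb;                         /* 4. add mantissas      */
        if (m & 0x800) { m >>= 1; ea++; }             /* 5. renormalize        */
        return (uint16_t)((ea << 10) | (m & 0x3FF));  /* 6. repack             */
    }

Fixed point needs only step 4; that is essentially the whole speed argument.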


IIRC, the Atari ROM routines used a different 6-byte BCD floating point format.


I can assure you that no BCD routines (neither Woz's nor Atari's) were hurt during the production of this game. They are really slow. You need to use all kinds of tricks and cheats when creating a 3D game on an 8-bit machine, and all the needed calculations are precomputed in lookup tables.


- 80x30 resolution

- mostly 8 colors

It's impressive, but unplayable.


Having written realtime ray tracers in the classic demoscene style on e.g. a 300 MHz Pentium II Celeron (the original one with no cache, which overclocked to 450-550 MHz), I sometimes wonder how cool it would have been to open source a modern rendering engine around the time of the first Pentium III with SSE, in 1999.

You could completely revolutionise computer graphics on that era of hardware, with a view to increasing vectorisation, and probably steer it strongly towards ray tracing instead of rasterisation (even skipping over the local minimum of k-D tree methods and the introduction of the Surface Area Heuristic, settling directly on modern BVH building and traversal).


I was under the impression that the theory was there but the hardware was not. Like, rasterization was a necessary evil because it gave better results more quickly (and artists needed that feedback).


Yes, absolutely right, and for games it was the only option with no fast floating point and very limited memory.


First time reading about BVH. Sounds like Kirkpatrick's hierarchy, but in arbitrary dimension.

What's the advantage over BSP/kD-trees/octrees?

And what do you mean by rasterization? We still have to deal with pixels in the end, so it has to happen somewhere. (I'd love to play with a color vector monitor, though!)


> What's the advantage over BSP/kD-trees/octrees?

With a BVH, the partitioning is fundamentally over lists of objects rather than over space; if you split by space, you can/will have objects on both sides of the splitting plane, leading to duplicate references.

Doing it by lists means there are no duplicate references; however, the combined bounding volumes can overlap, which is to be minimised, subject to the Surface Area Heuristic cost. It winds up being something like a quicksort, although for the highest quality acceleration structures you also want to do spatial clipping... this is an extremely deep field, and several people have devoted a considerable part of their professional careers to it, for example the amazing Intel Embree guys :)
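
For the curious, the SAH cost of one candidate split looks roughly like this in C (the names and the two cost constants are illustrative):

    #include <stddef.h>

    typedef struct { float min[3], max[3]; } AABB;

    static float surface_area(const AABB *b) {
        float dx = b->max[0] - b->min[0];
        float dy = b->max[1] - b->min[1];
        float dz = b->max[2] - b->min[2];
        return 2.0f * (dx * dy + dy * dz + dz * dx);
    }

    /* Estimated cost of splitting 'parent' into 'left'/'right': a traversal
       cost plus intersection cost weighted by the probability (area ratio)
       that a random ray hitting the parent also hits each child. */
    static float sah_cost(const AABB *parent,
                          const AABB *left,  size_t n_left,
                          const AABB *right, size_t n_right) {
        const float C_TRAV = 1.0f, C_ISECT = 2.0f;   /* illustrative constants */
        float inv = 1.0f / surface_area(parent);
        return C_TRAV + C_ISECT * (surface_area(left)  * inv * (float)n_left +
                                   surface_area(right) * inv * (float)n_right);
    }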

It also happens to work out best for GPU traversal algorithms, which were investigated via software simulation quite a few years ago by the now-legendary Finnish Nvidia team; together with improvements to parallel BVH building methods and further refinements, that is basically what today's RTX technology is. (As far as I can tell from the literature over the years.)

Here's a fundamental paper to get started: https://research.nvidia.com/publication/understanding-effici... (Note that these are the same Finnish geniuses behind so many things... Umbra PVS, modern alias-free GAN methods, stochastic sampling techniques, ...)


> And what do you mean by rasterization - we still have to deal with pixels in the end, so it has to happen somewhere?

At a high level, you could think of it as where in your nested loops you put the loop over geometry (say, triangles). A basic rasterizer loops over triangles first, and the inner loop is over pixels. A basic ray tracer loops over pixels, and the inner loop is over triangles (with the BVH acting as a loop accelerator). Just swapping the order of the two loops has significant implications.
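
A schematic of that loop swap in C - covers() and hit_t() are trivial placeholders standing in for a real coverage test and a real ray-triangle intersection:

    #include <float.h>

    #define W 64
    #define H 64
    #define NTRIS 16

    typedef struct { float v[3][3]; } Tri;
    static Tri   tris[NTRIS];
    static float zbuf[H][W];
    static int   image[H][W];

    /* Placeholders: a real rasterizer tests pixel coverage and interpolates
       depth; a real ray tracer intersects a per-pixel ray with the triangle. */
    static int   covers(const Tri *t, int x, int y, float *z) { (void)t; *z = (float)(x + y); return 1; }
    static float hit_t (const Tri *t, int x, int y)           { (void)t; return (float)(x + y); }

    static void rasterize(void) {                  /* geometry outer, pixels inner */
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) zbuf[y][x] = FLT_MAX;   /* clear depth */
        for (int i = 0; i < NTRIS; i++)
            for (int y = 0; y < H; y++)
                for (int x = 0; x < W; x++) {
                    float z;
                    if (covers(&tris[i], x, y, &z) && z < zbuf[y][x]) {
                        zbuf[y][x] = z;            /* nearest surface wins */
                        image[y][x] = i;
                    }
                }
    }

    static void raytrace(void) {                   /* pixels outer, geometry inner */
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) {
                float best = FLT_MAX;
                for (int i = 0; i < NTRIS; i++) {  /* this is the loop a BVH prunes */
                    float t = hit_t(&tris[i], x, y);
                    if (t < best) { best = t; image[y][x] = i; }
                }
            }
    }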


Octrees divide space into regular-sized chunks. A BVH divides space into chunks of varying size, but with a balanced population of bodies in each. The idealized BVH divides the population by 2 at each level.

Compared to octrees, BVHs deal well with data that's unevenly distributed. At each level you split along the axis where you have the most extent. Finding the pivot is the interesting part. When I recently implemented a BVH from scratch, I ended up using Hoare partitioning and median-of-three, and it worked really well. The resulting structure is well balanced, splitting the population of bodies roughly in half at each level - and that's not even the state of the art, that's just something my dumb ass coded in an afternoon.
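
Roughly what that looks like in C, as a sketch under my own naming (a real builder would add SAH evaluation, leaf cutoffs, and so on):

    /* One split in the style described above: pick the longest centroid axis,
       take a median-of-three pivot, and Hoare-partition the primitives. */
    typedef struct { float centroid[3]; /* plus bounds, payload, ... */ } Prim;

    static int longest_axis(const Prim *p, int n) {
        float lo[3] = { 1e30f,  1e30f,  1e30f};
        float hi[3] = {-1e30f, -1e30f, -1e30f};
        for (int i = 0; i < n; i++)
            for (int a = 0; a < 3; a++) {
                if (p[i].centroid[a] < lo[a]) lo[a] = p[i].centroid[a];
                if (p[i].centroid[a] > hi[a]) hi[a] = p[i].centroid[a];
            }
        int best = 0;
        for (int a = 1; a < 3; a++)
            if (hi[a] - lo[a] > hi[best] - lo[best]) best = a;
        return best;
    }

    static void swap_prim(Prim *a, Prim *b) { Prim t = *a; *a = *b; *b = t; }

    static float median3(float a, float b, float c) {
        return a < b ? (b < c ? b : (a < c ? c : a))
                     : (a < c ? a : (b < c ? c : b));
    }

    /* Hoare partition on centroid[axis]; returns the index where the right
       half starts. The two halves then recurse until they hit a leaf cutoff. */
    static int partition_prims(Prim *p, int n, int axis) {
        float pivot = median3(p[0].centroid[axis],
                              p[n / 2].centroid[axis],
                              p[n - 1].centroid[axis]);
        int i = -1, j = n;
        for (;;) {
            do { i++; } while (p[i].centroid[axis] < pivot);
            do { j--; } while (p[j].centroid[axis] > pivot);
            if (i >= j) return j + 1;
            swap_prim(&p[i], &p[j]);
        }
    }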


The game uses ray casting, not ray tracing. Ray casting is when you send one ray for every column of pixels to get the distance to a wall. Also, if the walls are only horizontal or vertical, the calculations get simpler.
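
The per-column walk is the classic grid DDA, something like this sketch in C (the map and constants are made up; the real game would do this in fixed point with lookup tables, per the comments above):

    #include <math.h>

    #define MAPW 8
    #define MAPH 8
    static const int map[MAPH][MAPW] = {            /* 1 = wall, 0 = empty */
        {1,1,1,1,1,1,1,1}, {1,0,0,0,0,0,0,1},
        {1,0,0,0,0,1,0,1}, {1,0,0,0,0,1,0,1},
        {1,0,1,1,0,0,0,1}, {1,0,0,0,0,0,0,1},
        {1,0,0,0,0,0,0,1}, {1,1,1,1,1,1,1,1},
    };

    /* Distance from (px,py) along (dx,dy) to the first wall, stepping from
       cell boundary to cell boundary. Called once per screen column; assumes
       dx and dy are nonzero and the ray starts inside the map. */
    static float cast_column(float px, float py, float dx, float dy) {
        int   mx = (int)px, my = (int)py;           /* current map cell */
        float ddx = fabsf(1.0f / dx), ddy = fabsf(1.0f / dy);
        int   sx = dx < 0 ? -1 : 1, sy = dy < 0 ? -1 : 1;
        float tx = (dx < 0 ? px - mx : mx + 1.0f - px) * ddx;
        float ty = (dy < 0 ? py - my : my + 1.0f - py) * ddy;
        int   side = 0;
        while (!map[my][mx]) {                      /* walk until a wall cell */
            if (tx < ty) { tx += ddx; mx += sx; side = 0; }
            else         { ty += ddy; my += sy; side = 1; }
        }
        return side ? ty - ddy : tx - ddx;          /* perpendicular distance */
    }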

Also, I wonder how you can achieve a clock frequency like 300 MHz without a cache. Shouldn't the CPU stumble on fetching every instruction?


> The game uses ray casting, not ray tracing. Ray casting is when you send a ray once for every column of pixels

Both ‘ray casting’ and ‘ray tracing’ are overloaded terms; the distinction isn't as clear as you suggest. You're talking about 2D ray casting, but 3D ray casting is common, and to many people it means the same thing as ‘ray tracing’. Ray casting "is essentially the same as ray tracing for computer graphics". https://en.wikipedia.org/wiki/Ray_casting

There's also Whitted-style recursive ray tracing, and path-tracing-style ray tracing, but ray tracing in its most basic form means to test visibility between two points, which is what ray casting also means from time to time.


I've given up on the semantics of "ray-tracing"; everyone has their own opinion. However, it's fairly common for "ray-casting" to mean non-recursive, and the wiki article you link explicitly says this.

I think the biggest difference among ray-type algorithms is everything else vs ray-marching, because regardless of recursion, and of strategies for lighting, texture sampling and physical realism, with ray-marching a single ray is not really a ray at all but lots of little line segments, and you don't usually bother finding explicit intersections, which gets really complex and expensive... that's the whole point: instead you find proximity or depth, which means you can render implicit surfaces like fractals.
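
That loop in a nutshell - a minimal sphere-tracing sketch in C, where the SDF is just a unit sphere (swap in a fractal distance estimator for the interesting cases):

    #include <math.h>

    static float sdf(float x, float y, float z) {   /* distance to the surface */
        return sqrtf(x * x + y * y + z * z) - 1.0f; /* unit sphere at origin */
    }

    /* Returns the hit distance, or -1 on a miss.
       (ox,oy,oz) is the ray origin, (dx,dy,dz) a unit direction. */
    static float march(float ox, float oy, float oz,
                       float dx, float dy, float dz) {
        float t = 0.0f;
        for (int i = 0; i < 128; i++) {             /* lots of little segments */
            float d = sdf(ox + dx * t, oy + dy * t, oz + dz * t);
            if (d < 1e-4f) return t;                /* close enough: a "hit" */
            t += d;                                 /* largest provably safe step */
            if (t > 100.0f) break;                  /* wandered off: a miss */
        }
        return -1.0f;
    }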


Yes, ray casting does not ever imply recursion; it simply refers to casting a ray to test visibility from one point to another. Ray tracing now most commonly means exactly the same thing, and expecting anything else will often lead to confusion.

"Ray marching" is also overloaded ;) but what you're referring to (also, and originally, called 'sphere tracing') is a new and separate idea from either ray casting or ray tracing (if you're thinking of something other than ray casting when you say it). Ray casting/tracing is most often done using non-iterative analytic intersection functions, whereas the style of ray marching you're referring to is a distance-field query, not a point-to-point visibility query, so "ray marching" generally implies a different traversal algorithm and (usually) a different representation of the geometry.

You can use any/all of these to build path tracers, but they come with different tradeoffs.


The Covington Celerons had an L1 cache but no L2 cache. The Pentium II of the era had an off-die L2 cache. So the Celeron was a binned Pentium II without any L2 cache. A later model of the Celeron was released with a 128K off-die L2 cache.

At their base clock speeds the Celerons were middling chips. But since they readily overclocked, you could get them up to 450-466 MHz. They wouldn't be equivalent to a Pentium II at the same speed (because of the missing L2 cache), but they punched above their weight for the price.


The Celeron-As had on-die cache, making them pretty much a match for a regular P2 of the same clock/bus speed. There was also a mobile P2 with 256K of on-die L2, predating the Coppermine P3s.


I must have misremembered; it's been a long time. I could have sworn there were Celeron 300As that were Covington cores with the external L2 cache, but it makes sense that they were the Mendocino cores with the on-die L2.

I really wanted an overclocked Celeron setup, but for the year or so they were hot shit my upgrade money went to storage. That was more pressing for me at the time. By the time I was due for (and could afford) a system upgrade, I was able to go directly to a Pentium III.


Is there somewhere I can read more about the history of these different techniques?


Here's a video: https://www.youtube.com/watch?v=92K9wnk_4Cw

Pretty interesting, but it's a little hard to make out what's going on sometimes.


Thank you. I can't help but feel the wall textures have screwed the whole thing up. It might be that they've tried to be too clever with this, and a less complex solution (flat shading, Gouraud shading) would have worked better.


Yeah, I had the same thought. I played a few FPSes on my TI-83 back in the day; it was black and white, and really it only drew the edges of the walls and the outlines of people. But I could tell pretty well what was going on, in comparison to this.

This Atari game might also look a lot better on a CRT or something.


If you reduce the window to a very small size (or look at your monitor from a few meters away) then it's actually easier to see what's going on.


I wonder how it would look on a CRT TV over RF.


It's difficult for me to tell if the player is looking at a wall or a hallway...


Maybe it’s a wallway


It's a bit unclear what's demoing visuals vs. gameplay. It often seemed that when the player reached a corner, instead of turning the corner and continuing down the corridor, the player would turn into the corner and wiggle. I wonder: is there something about corners that's a bit janky, or is it just a habit of this particular player to turn left and right randomly, particularly at corners?


The gameplay looks quite interesting; I noticed the upgrade to the game. I hope I have enough time to try it out.


I had an Atari 800 in the early 1980s. The games were close to their arcade versions, and with the BASIC cartridge we could write programs and save them to a cassette tape drive or floppy.

https://en.wikipedia.org/wiki/Atari_8-bit_family

Jay Miner was one of the main developers; he later went on to be the "father of the Amiga".

https://en.wikipedia.org/wiki/Jay_Miner


I was so jealous of my friends who had even an Atari 400 - the 800 was legendary.


1982 is calling but expects to be connected to some HN pedant "helpfully" pointing out that you can't actually shoot stuff.

https://youtu.be/j4OLnrwLcJA


That's incredible. Was this the earliest instance of raycasting in the wild?

Good intro to raycasting for those interested: https://lodev.org/cgtutor/raycasting.html


I think it is. But PLATO Moria might be the OG of 3D in this space.

https://en.wikipedia.org/wiki/Moria_(1978_video_game)


Neither Wayout nor Capture the Flag, its older, prettier brother, uses raycasting. The sources are out, and there's some more info in the AtariAge forums, where some maniacs ripped it apart and even made it faster using more current tricks. https://atariage.com/forums/topic/320414-smooth-3d-movement-...


I had their follow-up game, Capture the Flag, on my Atari 800. Same engine, but slightly upgraded graphics. Fun game!

https://moegamer.net/2019/03/05/atari-a-to-z-capture-the-fla...


I once walked out of a John Carmack SIGGRAPH talk when he opened by claiming he invented the FPS, because Ultima Underworld, The Eidolon, and Wayout all came first...

Like him otherwise, but nope he didn't invent it...


The Lucasfilm games ('84-'85) blew my mind at the time. Fractalus, Ballblazer, Koronis, Eidolon.


Hot take: The Eidolon is more impressive than this effort, and it works with the technological limitations of the platform rather than against them, as this game does.

All those LucasArts games were stunning for the time, and Rescue on Fractalus! has a genuinely terrifying experience within it, not replicated IMO until the first time you hear "Anytime..." in Alien vs Predator...


To be pedantic, it needs a 64K machine like the 1982 1200XL or the 1983 800XL, so it would not run on the 48K (max, without 3rd-party mods) Atari 800 from 1979. But yes, wow!


Both the 400 and 800 had upgrades to 64K. The Atari 400's required some soldering; the 800's was simply a plug-in expansion.


Not so. Behind the two ROM cartridge slots there were three slots for RAM expansion, each being 16KB.


That's for the Atari parts using linear addresses. There were numerous third-party memory expansions that used bank switching, as in the XLs, to get much more. I wrote point-of-sale software using a 128K banked expansion for an 800.

I also had my 400 upgraded to 48K with a mechanical keyboard.


Yeah, the 16KB cards were from Atari and the bank-switched cards were from third parties. Antic Magazine (named for the chip) had a lot of ads in the back for those.


IIRC (and I may not), I upgraded my 400 to 64K and a "real" keyboard. But I think you could only address 48K of it at once, or something hinky.


Yeah, you had ROM space (and potentially cartridge ROM space) eating into your 64K on all the models, and had to bank-switch for anything more than 48K, IIRC. With a BASIC cartridge in place you'd lose another 8K.

Also, I don't recall any of the 400/800 3rd-party upgrades being compatible with the later XL/XE models - they used a different bank-switching mechanism.


"bank switching" sounds familiar, so I'm pretty sure my memory of 64k is correct then. I can't say I understand what it means, but I remember something of that term at the time.


You group your memory into chunks of 16K (it could be other sizes; 16K is what I recall, but it has been years). Then you have a switch that you can program to swap between chunks (also called banks). You can only access one at a time, but you can switch between them very fast. It was possible to get up to 5 MB of memory that way, even though you could only use 64K at a time.
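
A toy model of the idea in C (the sizes and the window address are illustrative, not the exact Atari scheme):

    #include <stdint.h>

    #define BANK_SIZE   (16 * 1024)
    #define NUM_BANKS   8                          /* 128 KB of physical RAM */
    #define WINDOW_BASE 0x4000                     /* where the 16K window sits */

    static uint8_t banks[NUM_BANKS][BANK_SIZE];    /* the switchable groups */
    static uint8_t fixed_mem[64 * 1024];           /* everything outside the window */
    static int     current_bank = 0;

    static void select_bank(int n) { current_bank = n; }

    static uint8_t read_mem(uint16_t addr) {
        if (addr >= WINDOW_BASE && addr < WINDOW_BASE + BANK_SIZE)
            return banks[current_bank][addr - WINDOW_BASE];  /* through the window */
        return fixed_mem[addr];                              /* ordinary memory */
    }

On real hardware, select_bank() amounts to a single write to a memory-mapped register, which is why switching is so fast.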


Where did you get a real keyboard for the 400? I searched and searched and could not find one. I was so envious of the Timex/Sinclair folks, who had a nice replacement.


I can't remember but it was designed for the 400. This was circa 1983.


I have so much respect for people that take on challenges like this.

It might be "useless" in a utilitarian sense, but I think of it as an incredible art form.


I tried to write a raycaster [1] in JavaScript once. I used lines instead of the traditional grid, but it ended up messy and with weird texture artifacts. Still, I would like to come back to this one day and do a proper job.

I'll download the game after work and see if I can run it under an emulator, but the graphics and music alone seem wonderful.

1 - https://github.com/victorqribeiro/myRaycast


I love that the credits mention the raycast tutorial by Permadi https://permadi.com/1996/05/ray-casting-tutorial-table-of-co...

I used this tutorial in the 2000s to write a raycaster myself in C using Allegro. It's wonderful that these tutorials are still useful so many years after they were written.


If this had been released when the 800 was new, it would have blown people’s minds.


I was expecting a very basic raycaster, no textures etc. This is seriously impressive!


Yeah, just updating every pixel randomly at 20-30 fps is a feat on the 8-bit Atari, never mind with a raycaster.


I was expecting it to need 128K!


The theme song linked at the bottom reminds me a lot of the System Shock 1 theme song

https://www.youtube.com/watch?v=uZRuDOqIIBA


This isn't a complaint, it's a suggestion: it would be cool if I could just run it in my browser, like https://www.2600online.com/


Wow this looks amazing. Will give it a try on my real 800XL.


Well, that was awkward. Silly me, clicking the screenshot thumbnails expecting a higher resolution picture to pop up.


The download link isn't working, for anyone trying to download it.


It is working very well for me, Verifex. Please try https://atari8.dev/final_assault/Final%20Assault%201.0.zip


I'm not surprised. The Atari 800 was an exceptional machine. Great work on the game, though.


Yes, it had a lot of custom chips, and some of the same people (like Jay Miner) who worked on the Atari 800 design were later involved in designing the Amiga, which likewise was based on custom chips.


There were 3 custom chips (ANTIC, GTIA and POKEY), 1 semi-custom (SALLY, a VERY minor modification of the 6502) and 1 common industry chip (PIA). I would not say exactly "a lot of" ;-)


Well, relatively speaking, that was a lot for a late-1970s microcomputer. The original Apple ][ and TRS-80 Model I were essentially built from off-the-shelf chips (which is why both got off-brand clones).



