Understanding the AY

Using Arkos Tracker efficiently requires at least a basic knowledge of our beloved AY-3-8912. Here is a quick overview of its capabilities, with sounds and drawings for a better understanding.


The channels

The AY is an old Programmable Sound Generator (PSG) which can produce sound in a rather limited, yet expressive way.

The AY has 3 channels. This means we can produce three sounds at the same time. The output may be mono or stereo depending on the hardware. The CPC has only one speaker, but also a stereo output for plugging in an amplified system. The CPC Plus has two speakers, the Atari STF only one (arh arh).

Waves, amplitudes and periods

By default, the AY generates square waves. This is the most basic, saturated sound you can find.

Note: this fancy WAV player plugin displays the WAV with a symmetry. This is not an accurate representation of the actual sound!

Listening to three square waves is quite tiring, but fortunately, each wave has its own amplitude, so changing it in real-time allows creating more natural sounds.

There are 16 amplitude steps, from 0 (no sound) to 15 (full amplitude). The amplitude curve is logarithmic: this means that the difference between amplitudes 10 and 11 is not as large as the difference between amplitudes 14 and 15.
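To get a feel for what "logarithmic" means here, the following Python sketch models the 16 steps. It assumes each step is worth about 3 dB (two steps halve the amplitude), which is an idealised approximation and not a measurement of a real chip. The name `relative_amplitude` is made up for this example.

```python
def relative_amplitude(level):
    """Idealised output amplitude (0.0 to 1.0) for an AY volume level 0..15.

    Assumption: each step is about 3 dB, i.e. a factor of sqrt(2).
    """
    if not 0 <= level <= 15:
        raise ValueError("level must be between 0 and 15")
    if level == 0:
        return 0.0  # level 0 is silence
    return 2 ** ((level - 15) / 2)


if __name__ == "__main__":
    # Compare a step in the middle of the range with the very last step.
    mid_step = relative_amplitude(11) - relative_amplitude(10)
    top_step = relative_amplitude(15) - relative_amplitude(14)
    print("10 -> 11:", round(mid_step, 3))
    print("14 -> 15:", round(top_step, 3))
```

Under this model, the jump from 14 to 15 is several times larger than the jump from 10 to 11, which is why the last few steps are the ones you hear the most.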

In order to play music, one must play notes. Each channel can be given its own frequency. The little example above, on top of decaying the amplitude, also changes the frequency over time.

Important: the PSG does not work with frequencies (440 Hz for example), but with periods. Periods are the inverse of frequencies. The higher the frequency, the lower the period. Mathematically speaking: period = 1 / frequency. However, we won't be using maths here so don't worry if you don't understand this. Just be aware that periods will be used, and thus by decreasing the period, the pitch of a sound will actually increase. From now on, we will talk about periods, not frequencies.
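For the curious, here is a small Python sketch of the conversion. It assumes the usual AY relation, frequency = clock / (16 × period), a 12-bit period (1 to 4095), and a 1 MHz clock as on the CPC. The names `tone_period` and `CPC_CLOCK_HZ` are only illustrative.

```python
CPC_CLOCK_HZ = 1_000_000  # assumption: 1 MHz AY clock, as on the CPC


def tone_period(frequency_hz, clock_hz=CPC_CLOCK_HZ):
    """Nearest AY tone period for a frequency (higher pitch, lower period)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    period = int(clock_hz / (16 * frequency_hz) + 0.5)  # round to nearest
    return max(1, min(4095, period))  # the tone period is a 12-bit value


if __name__ == "__main__":
    for note_hz in (220, 440, 880):
        print(note_hz, "Hz -> period", tone_period(note_hz))
```

Going up one octave doubles the frequency and halves the period.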


What about producing some drums? Well, the AY has got you covered: it has one noise generator. It can be coupled to any of the three channels, or two of them, or even all three. The period of the noise is set by a value going from 1 to 31 (0 producing the same result as 1, 1 being a very high-pitched noise, 31 a low-pitched one). Here is an example of the noise going from 0 to 31.

Noise can be used alone as seen above, or mixed with a square wave:

I told you there was only one noise generator. If you apply the noise on two channels, both will use the same noise value. This is quite limiting, which is why, most of the time, you will not use two noises at the same time. It simply sounds ugly. Forget your dreams of simulating a hi-hat (noise to 1) and a big explosion sound (noise to 31) at the same time.
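In register terms, this is why: the AY has one noise period register (5 bits) and one mixer register, in which each channel has a tone bit and a noise bit. A bit set to 0 enables the source. The Python sketch below only builds the values and talks to no hardware. The function names are made up for the example, and leaving the two upper mixer bits (I/O port direction) at 0 is an assumption that real hardware may not accept.

```python
CHANNEL_BIT = {"A": 0, "B": 1, "C": 2}


def mixer_value(tone_channels, noise_channels):
    """Mixer register value: bits 0-2 are tone A/B/C, bits 3-5 noise A/B/C.

    The bits are active-low: 0 means "enabled", 1 means "disabled".
    """
    value = 0b00111111  # start with every tone and noise disabled
    for channel in tone_channels:
        value &= ~(1 << CHANNEL_BIT[channel])
    for channel in noise_channels:
        value &= ~(1 << (CHANNEL_BIT[channel] + 3))
    return value & 0xFF


def noise_period_value(period):
    """The single, shared noise period: only 5 bits are kept (0..31)."""
    return period & 0b11111


if __name__ == "__main__":
    # Noise on channels A and B: two mixer bits, but still one period.
    print(bin(mixer_value([], ["A", "B"])), noise_period_value(31))
```

Whatever the channels enabled in the mixer, there is only one `noise_period_value` to share among them.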

The hardware envelope

Everything above is enough to make 99.9% of the game music from the 80s. You may want to go one step beyond and use another neat feature called "the hardware envelope", which is very useful to produce peculiar sounds, mostly used in basses, but which can have great effects in melodies or for special effects.

As you have seen, in order to have more expressive sounds, you would use the amplitude to create attack and decay to simulate real (or not) instruments. The hardware envelope allows you to do it automatically.

One envelope to rule them all

Note the use of the singular when I talk about the hardware envelope. Just like there is only one single noise generator, there is only one single hardware envelope generator. And just like noise, it can be used on one, two, or all three channels at once. But just like the noise, it will probably sound ugly if you try to use it on more than one channel at a time.

The shapes

Two parameters define the hardware envelope:

  • Its shape (going up, or down, cycling?)
  • Its period: how fast it goes.

There are 8 shapes available. Here they are, using a high period for you to hear them, and with the software wave on.

Shape 8: sawtooth from 15 to 0, loops.

Shape 9: from 15 to 0, loops at 0.

Shape 10 (0xA): triangle (15 to 0 to 15 and loops).

Shape 11 (0xB): 15 to 0, loops to 15.

Shape 12 (0xC): sawtooth from 0 to 15, loops (opposite to 8).

Shape 13 (0xD): 0 to 15, loops to 15.

Shape 14 (0xE): triangle (0 to 15 to 0 and loops) (opposite to 0xA).

Shape 15 (0xF): 0 to 15, loops to 0.
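The eight shapes can be derived from the lower three bits of the shape number. The Python sketch below assumes the usual AY meaning of these bits (bit 2 "attack", bit 1 "alternate", bit 0 "hold") and lists the amplitude of each step, 16 steps per cycle. The function `envelope_levels` is hypothetical and only illustrates the shapes listed above.

```python
def envelope_levels(shape, cycles=3):
    """Amplitude steps (0..15) produced by hardware envelope shapes 8..15."""
    if not 8 <= shape <= 15:
        raise ValueError("only shapes 8 to 15 are handled here")
    attack = bool(shape & 0b0100)     # first ramp goes up instead of down
    alternate = bool(shape & 0b0010)  # direction flips after each cycle
    hold = bool(shape & 0b0001)       # stays on one value after first cycle

    up = list(range(16))
    down = up[::-1]
    rising = attack
    levels = list(up if rising else down)
    last = levels[-1]
    # With "hold", the held value is the end of the first ramp,
    # or the opposite extreme when "alternate" is also set.
    held = (15 - last) if alternate else last

    for _ in range(cycles - 1):
        if hold:
            levels.extend([held] * 16)
        else:
            if alternate:
                rising = not rising
            levels.extend(up if rising else down)
    return levels


if __name__ == "__main__":
    for shape in range(8, 16):
        print(shape, envelope_levels(shape, cycles=2))
```

Printing two cycles per shape makes it easy to compare each line with the descriptions above.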

Some important remarks must be made:

  • Some shapes loop endlessly, and some stop after one cycle (they actually loop on the same value).
  • Whatever the shape is, the amplitude always goes between 0 and 15 (or the opposite). You can't ask a shape to go from 14 to 6 and then cycle or stop.

The latter limitation makes the hardware envelope look useless. Why bother using an envelope when I can change the amplitude at will, choosing the exact values I want? In a sense, you are right, but the real interest is explained below.

Speed it up!

So hardware envelopes are limited and boring. Now let's try something. Let's play a sound using a cycling hardware envelope (like 8) and progressively decrease the period of the hardware envelope. The software wave is fixed to a certain period and won't change.

Did you hear that? Isn't it awesome? By using a low period for the hardware envelope, a whole new sound is created from a boring square wave. Note that most of the sample sounded like rubbish, right until the end, where it sounded right. Why? This is explained below.


Why did it sound right? Because the period of the (square) software wave is proportional to that of the hardware envelope. Good results come from having the latter 8, 16 or 32 times (a power of 2) faster than the square wave:

Hardware period 8 times faster than the software period, shape 8.

Hardware period 16 times faster than the software period, shape 8.

Hardware period 8 times faster than the software period, shape 10.

Hardware period 16 times faster than the software period, shape 10.
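In practice, "N times faster" means the hardware period is the software period divided by N, rounded to a whole number. Here is a minimal Python sketch of that division; `hardware_period` is a name made up for this example, and rounding to the nearest integer is one possible choice among others.

```python
def hardware_period(software_period, ratio=8):
    """Hardware envelope period that is `ratio` times faster than the tone."""
    if software_period < 1 or ratio < 1:
        raise ValueError("period and ratio must be at least 1")
    # Periods are whole numbers, so the result has to be rounded.
    return max(1, int(software_period / ratio + 0.5))


if __name__ == "__main__":
    for ratio in (8, 16, 32):
        print("ratio", ratio, "->", hardware_period(568, ratio))
```

With a software period of 568, a ratio of 8 divides exactly, while ratios of 16 and 32 already need rounding.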

So this sounds great, but you will probably very quickly encounter a limitation: the higher the pitch of the sound, the less accurate the periods are. Plus, the hardware envelope is more accurate than the software. So there *will* be times when the envelopes are desynchronized. Most of the time, with low sounds, it won't sound too unpleasant. But go higher and...

The faster the internal clock of the AY is, the more accurate this will be. This explains why some Atari ST music converted to the CPC can have crappy-sounding hardware sounds: the Atari ST has a 2 MHz YM, the CPC a 1 MHz AY.
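One rough way to put numbers on this is to round both periods and see how far "ratio × hardware period" lands from the software period. The Python sketch below does that, assuming the usual relation frequency = clock / (16 × period) and rounding to the nearest integer. The `sync_error` measure is invented for this illustration and is not how any player computes things.

```python
def nearest(value):
    """Round to the nearest whole number, at least 1."""
    return max(1, int(value + 0.5))


def sync_error(frequency_hz, clock_hz, ratio=8):
    """Relative mismatch between the tone and the hardware envelope."""
    software = nearest(clock_hz / (16 * frequency_hz))
    hardware = nearest(software / ratio)
    return abs(hardware * ratio - software) / software


if __name__ == "__main__":
    one_mhz, two_mhz = 1_000_000, 2_000_000
    # Low note versus high note, same 1 MHz clock.
    print("130.81 Hz @ 1 MHz:", round(sync_error(130.81, one_mhz), 4))
    print("1046.5 Hz @ 1 MHz:", round(sync_error(1046.5, one_mhz), 4))
    # Same note, two different clocks.
    print("523.25 Hz @ 1 MHz:", round(sync_error(523.25, one_mhz), 4))
    print("523.25 Hz @ 2 MHz:", round(sync_error(523.25, two_mhz), 4))
```

For these particular notes, the mismatch grows with the pitch and shrinks with the faster clock. Other notes may happen to divide exactly, so take this as a trend and not a rule.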


You may wonder how samples can be played on such limited chips. Well, depending on the hardware, it may be anything from very easy to not-so-easy. The basic idea is this: the amplitude of one (or more!) channel must be changed very quickly (8000 times per second for an 8 kHz sound), thus producing a richer sound than the AY can normally produce by itself. Hardware like the Atari ST has timers that allow this to be done "in the background". But on the CPC, it is up to the coder to perform these changes: synchronization in itself is not hard to do, but things can get very complex if you want the computer to perform other actions while playing the sample.
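As a rough illustration of the data side of this trick, the Python sketch below turns sample values into AY amplitude levels and counts how many amplitude changes a given duration needs. It assumes an idealised logarithmic volume table (about 3 dB per step) and samples normalised between 0.0 and 1.0. All names are hypothetical, and nothing here deals with the timing, which is the hard part on a real machine.

```python
# Assumption: idealised AY volume table, each step a factor of sqrt(2).
LEVELS = [0.0] + [2 ** ((n - 15) / 2) for n in range(1, 16)]


def to_ay_level(sample):
    """Closest AY amplitude (0..15) for a sample between 0.0 and 1.0."""
    return min(range(16), key=lambda n: abs(LEVELS[n] - sample))


def updates_needed(duration_s, rate_hz=8000):
    """How many amplitude changes are needed to play `duration_s` seconds."""
    return int(duration_s * rate_hz)


if __name__ == "__main__":
    samples = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
    print([to_ay_level(s) for s in samples])
    print(updates_needed(1.0), "amplitude changes for one second")
```

At 8 kHz that leaves 125 microseconds between two updates, which is why playing a sample while doing something else is tricky.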


These are advanced topics which will be discussed a bit later!