This essay was originally written as a submission to the 2026 Leonardo Competition.
I would like to begin this essay with an interactive activity. On whichever polyphonic instrument is most convenient, I invite the reader to play the note A4 and, simultaneously, an E5. You will notice that these notes sound good together.
Figure 1.1. Perfect fifth
However, if you play A4 at the same time as E♭5, you will find this combination of notes to be dissonant.
Figure 1.2. Tritone
Playing A4 together with A♯4 arguably sounds even worse.
Figure 1.3. Minor second
The reason for this has to do with the frequencies of these pitches. A4 has a frequency of \(440\,\mathrm{Hz}\) and E5 has a frequency of \(660\,\mathrm{Hz}\). These are in the ratio \(3:2\) (E5 : A4) and so sound good together. Pitches with frequencies in this ratio are said to be in a perfect fifth interval. In fact, perfect fifths sound so good that Billie Joe Armstrong only plays power chords in his songs. However, the frequencies of E♭5 and A4 are in the ratio \(64:45\), a tritone, which sounds nasty due to the non-simple ratio. In fact, this interval has been called the Devil’s interval due to its profound dissonance. Similarly, the frequencies of A4 and A♯4 are in the ratio \(16:15\), a minor second.
However, one might astutely observe that this doesn’t really answer anything at all. Surely our ears can’t tell whether the frequencies of two pitches are in a simple or complex ratio. Many people don’t even know what a ratio is. So how can our ears possibly differentiate between consonant and dissonant intervals? The true answer lies in the harmonic series.
Most people know that all the sounds you hear are just made up of sine waves, where amplitude corresponds to volume and frequency corresponds to pitch. Most sounds, however, are not a single sine wave. For example, listen to the two audio clips below of the bassline for “Money” by Pink Floyd. The first one is played by a pure sine wave, whereas the second is played by Matthew Duan (L5th Head’s).
Figure 2.1. Bassline for “Money” by Pink Floyd
One can clearly hear the difference between a pure sine wave and an instrument. The reason for this difference is that the wave corresponding to the sound made by an instrument is actually composed of multiple sine waves of different amplitudes and frequencies. Consider, for example, an electric bass playing the first note of the bassline above. If we decompose the sound emitted from the amplifier into sine waves (by a Fourier transform), we would see that it is made up of not only B1 (known as the fundamental frequency), but also B2, an F♯3 that is 2 cents sharp (a cent being a hundredth of a semitone), B3 and numerous other pitches at gradually lower volumes. These are known as the overtones or harmonics of B1. Together, they form the overtone series or harmonic series (not to be confused with \(\sum_{n=1}^\infty\frac{1}{n}\), although related) of B1. The figure below shows the first 16 harmonics of A1, where the numbers indicate how sharp or flat the pitches are in cents.
Figure 2.2. Harmonic series of A1
I have also embedded a nice interactive demo of harmonics below. You can click on the strings below to play the harmonics of the pitch corresponding to \(100\,\mathrm{Hz}\) (about 35 cents sharp of G2). Make sure to unmute it at the top right-hand corner.
Demo 2.3. Demonstration of harmonics by Alexander Chen (original link)
The amplitudes of the overtones produced by an instrument determine its unique sound, or timbre. As a side note, harmonics also correspond to natural harmonics played on stringed instruments. For example, the second harmonic corresponds to twelfth-fret harmonics on guitar and the third harmonic corresponds to seventh-fret or nineteenth-fret harmonics. This is because the twelfth fret is \(\frac{1}{2}\) of the way along the string, while the seventh and nineteenth frets are roughly \(\frac{1}{3}\) and \(\frac{2}{3}\) of the way along the string. But what are the pitches in a harmonic series? In fact, the frequencies of harmonics are precisely the integer multiples of the fundamental frequency. For example, the harmonics of A1 (\(55\,\mathrm{Hz}\)) are A2 (\(110\,\mathrm{Hz}\)), E3 (\(165\,\mathrm{Hz}\)), A3 (\(220\,\mathrm{Hz}\)), and so on.
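To make the "integer multiples" rule concrete, here is a minimal Python sketch that lists the first 16 harmonics of A1 and, for each one, finds the nearest note and how far off it is in cents. The function names `nearest_note` and `harmonics` are my own, and the sketch assumes A4 is \(440\,\mathrm{Hz}\) and that note names are spaced in equal semitones of 100 cents each.

```python
import math

# ASCII note names (# = sharp, b = flat), indexed by MIDI note number mod 12.
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "G#", "A", "Bb", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, deviation in cents) for the nearest equal-semitone pitch.

    Assumption: A4 = a4_hz and every semitone is exactly 100 cents wide.
    """
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    cents = 100 * (semitones_from_a4 - nearest)  # positive means sharp
    midi = 69 + nearest  # MIDI convention: A4 is note number 69
    name = f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"
    return name, cents


def harmonics(fundamental_hz, count=16):
    """Harmonics are the integer multiples of the fundamental frequency."""
    return [n * fundamental_hz for n in range(1, count + 1)]


for n, freq in enumerate(harmonics(55.0), start=1):
    name, cents = nearest_note(freq)
    print(f"{n:2d}  {freq:6.1f} Hz  {name:<4} {cents:+6.1f} cents")
```

The third row should come out as E3 roughly 2 cents sharp, and the seventh as a G4 that is about 31 cents flat, which is why the seventh harmonic sounds noticeably "off" against a piano.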
Now, going back to our original question, simple ratios sound good and complex ratios sound bad because when, say, a perfect fifth is played, the two waves line up very often: every 2 periods of the lower-frequency wave, which is also every 3 periods of the higher-frequency wave. This is because the interval appears early in the harmonic series, namely, between the second and third harmonics, whereas more dissonant intervals appear later on in the harmonic series. Furthermore, playing minor seconds or any two close frequencies can also cause beating, where the waves interfere with each other, causing a perceived periodic change in volume. This is clearly observed when, on an electric guitar with heavy distortion, you play two notes off by a semitone. You can even bend the lower note by a small amount like a quarter tone to increase the beating.
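As a rough illustration of how often two pure tones "line up", the sketch below takes a frequency ratio in lowest terms and reports how many periods of each wave fit into one repetition of the combined pattern. The helper name `alignment` is hypothetical, and the model only considers ideal sine waves with the lower note at \(440\,\mathrm{Hz}\).

```python
from fractions import Fraction


def alignment(ratio, lower_hz):
    """For a higher:lower frequency ratio p:q in lowest terms, the combined
    wave repeats after q periods of the lower wave and p periods of the
    higher wave.

    Returns (lower periods, higher periods, repetitions per second).
    """
    ratio = Fraction(ratio)  # Fraction reduces to lowest terms automatically
    lower_periods = ratio.denominator
    higher_periods = ratio.numerator
    return lower_periods, higher_periods, lower_hz / lower_periods


intervals = [
    ("perfect fifth", Fraction(3, 2)),
    ("minor second", Fraction(16, 15)),
    ("tritone", Fraction(64, 45)),
]

for name, ratio in intervals:
    low, high, rate = alignment(ratio, 440.0)
    print(f"{name:13s} lines up every {low} low / {high} high periods "
          f"({rate:.2f} times per second)")
```

With A4 as the lower note, the perfect fifth repeats 220 times per second, whereas the tritone at \(64:45\) only repeats about 10 times per second.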
From here, we may define all the different intervals in terms of the harmonic series. An octave is \(2:1\), a perfect fifth is \(3:2\), a perfect fourth is \(4:3\). We can even define minor sevenths, for example, as \(7:4\). This type of tuning system, based on harmonics, is known as just intonation. However, a problem immediately arises. Suppose we justly tune the A major scale with respect to A, letting A4 be \(440\,\mathrm{Hz}\). We end up with the following:
Table 3.1. Justly tuned A major scale
| Pitch | A4 | B4 | C♯5 | D5 | E5 | F♯5 | G♯5 |
|---|---|---|---|---|---|---|---|
| Ratio from A4 | \(1:1\) | \(9:8\) | \(5:4\) | \(4:3\) | \(3:2\) | \(5:3\) | \(15:8\) |
| Frequency / \(\mathrm{Hz}\) | \(440\) | \(495\) | \(550\) | \(587\) | \(660\) | \(733\) | \(825\) |
You’ll notice that in this tuning system, known as Ptolemaic tuning, A4 and E5 sound very consonant together because of the perfect \(3:2\) ratio. However, if you look at the ratio between B4 and F♯5, which is also supposed to be a perfect fifth, you’ll see that \(\frac{5}{3}:\frac{9}{8}\) is actually \(40:27\), about \(2.96:2\). This interval is known as a wolf fifth or an imperfect fifth because it sounds horrible.
Figure 3.2. Wolf fifth
The reason for this is that just intonation is not equally tempered. In other words, the same interval can have two different ratios. For example, the ratio between A4 and B4 is \(9:8\) whereas the ratio between B4 and C♯5 is \(10:9\), even though both intervals are major seconds.
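These ratios are easy to check by machine. The following is a small sketch using exact fractions, where the dictionary `JUST_MAJOR` simply restates the ratios from Table 3.1 (with ASCII `#` standing in for ♯) and `interval` is a helper name of my own.

```python
from fractions import Fraction

A4_HZ = 440

# Ratios from A4 for the justly tuned A major scale (Table 3.1).
JUST_MAJOR = {
    "A4": Fraction(1, 1),
    "B4": Fraction(9, 8),
    "C#5": Fraction(5, 4),
    "D5": Fraction(4, 3),
    "E5": Fraction(3, 2),
    "F#5": Fraction(5, 3),
    "G#5": Fraction(15, 8),
}


def interval(lower, upper):
    """Exact frequency ratio between two notes of the scale."""
    return JUST_MAJOR[upper] / JUST_MAJOR[lower]


# Frequencies in Hz, rounded to two decimal places for display.
for note, ratio in JUST_MAJOR.items():
    print(f"{note:4s} {float(A4_HZ * ratio):7.2f} Hz")

print("A4  to E5: ", interval("A4", "E5"))    # a pure fifth
print("B4  to F#5:", interval("B4", "F#5"))   # the wolf fifth
print("A4  to B4: ", interval("A4", "B4"))    # one size of major second
print("B4  to C#5:", interval("B4", "C#5"))   # the other size
```

Using `Fraction` rather than floating-point numbers keeps every ratio exact, so the wolf fifth shows up as \(40/27\) instead of a decimal that merely looks close to \(1.5\).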
There have been many proposed solutions to this problem. One is Pythagorean tuning, which, because Pythagoras didn’t believe in irrational numbers, tries to produce every interval using only ratios of integer powers of \(3\) and powers of \(2\). This is essentially trying to produce every pitch by repeatedly either lowering or raising a reference pitch by perfect fifths and octaves. This gives the following ratios for intervals:
Table 3.3. Pythagorean tuning
| Semitones | \(0\) | \(1\) | \(2\) | \(3\) | \(4\) | \(5\) | \(6\) | \(7\) | \(8\) | \(9\) | \(10\) | \(11\) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ratio | \(1:1\) | \(256:243\) | \(9:8\) | \(32:27\) | \(81:64\) | \(4:3\) | \(1024:729\) | \(3:2\) | \(128:81\) | \(27:16\) | \(16:9\) | \(243:128\) |
The ratio for 6 semitones can also be \(729:512\) depending on whether the interval is considered to be a diminished fifth or an augmented fourth (both the same as a tritone in regular tuning). As an exercise, I invite the reader to try to spot the powers of \(2\) and \(3\) in the ratios.
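The whole table can be regenerated by stacking fifths and folding the result back into a single octave. Below is a minimal sketch of that idea. The function name `pythagorean_ratio` and the choice of going from six fifths down to five fifths up are my assumptions, picked so that the output matches Table 3.3.

```python
from fractions import Fraction


def pythagorean_ratio(fifths):
    """Stack `fifths` perfect fifths (negative means downwards), then move
    the result up or down by octaves until it lies in [1, 2)."""
    ratio = Fraction(3, 2) ** fifths
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


# k stacked fifths land (7 * k) mod 12 semitones above the reference pitch.
table = {(7 * k) % 12: pythagorean_ratio(k) for k in range(-6, 6)}

for semitones in sorted(table):
    r = table[semitones]
    print(f"{semitones:2d} semitones  {r.numerator}:{r.denominator}")

# Six fifths upwards gives the other candidate for 6 semitones.
augmented_fourth = pythagorean_ratio(6)
print("augmented fourth:", f"{augmented_fourth.numerator}:{augmented_fourth.denominator}")
```

Six fifths down gives the diminished fifth \(1024:729\), while six fifths up gives the augmented fourth \(729:512\), so the two spellings of the tritone really are different pitches in this system.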
Of course, this carries its own problems. Notably, going up twelve perfect fifths around the circle of fifths should take you back to the original note, 7 octaves higher. However, twelve perfect fifths in Pythagorean tuning give a ratio of \(\left(\frac{3}{2}\right)^{12} \approx 129.75\), whereas 7 octaves give \(2^7 = 128\). This discrepancy is known as the Pythagorean comma, which has a value of \(\frac{1.5^{12}}{2^7} \approx 1.014\), about 23.46 cents. The same discrepancy appears between two enharmonic notes, like the diminished fifth and augmented fourth seen above.
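For anyone who would like to verify the comma, here is a short sketch. It converts a ratio to cents using \(1200\log_2\) of the ratio, which follows from a cent being a hundredth of a semitone and an octave being twelve semitones. The variable names are mine.

```python
import math
from fractions import Fraction

twelve_fifths = Fraction(3, 2) ** 12   # about 129.75
seven_octaves = Fraction(2) ** 7       # exactly 128

# The Pythagorean comma: how far twelve fifths overshoot seven octaves.
comma = twelve_fifths / seven_octaves
comma_cents = 1200 * math.log2(float(comma))

print(f"comma = {comma.numerator}/{comma.denominator} "
      f"= {float(comma):.5f} = {comma_cents:.2f} cents")

# The same gap separates the two Pythagorean tritones.
enharmonic_gap = Fraction(729, 512) / Fraction(1024, 729)
print("matches tritone gap:", enharmonic_gap == comma)
```

As an exact fraction the comma is \(3^{12}/2^{19} = 531441/524288\).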
What we really need is some ratio \(r\) for a semitone such that \(r^{12} = 2\) so that a semitone is always the same, whilst simultaneously satisfying the condition that twelve semitones make up an octave. Because a ratio of \(0.9175+0.5297i\) wouldn’t make much sense here, we instead choose \(r\) to be \(\sqrt[12]{2}\). As such, twelve-tone equal temperament (12 TET) was born. In 12 TET, we begin by defining a semitone to be \(\sqrt[12]{2}\) and A4 to be \(440\,\mathrm{Hz}\). Of course, A440 is largely arbitrary and some songs use other standards; “Pink Triangle” by Weezer, for instance, uses \(445\,\mathrm{Hz}\). From here, any interval can be defined simply as a power of \(\sqrt[12]{2}\). For example, a major third is \((\sqrt[12]{2})^4:1 = 2^{\frac{1}{3}}:1\) and an augmented sixth is \((\sqrt[12]{2})^{10}:1 = 2^{\frac{5}{6}}:1\).
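Here is a minimal sketch of both halves of that paragraph: the twelve complex solutions of \(r^{12} = 2\), of which only one is a positive real number, and the resulting frequency formula. The function name `tet_frequency` and its parameters are my own, and the default reference pitch assumes A4 is \(440\,\mathrm{Hz}\).

```python
import cmath
import math

# All twelve complex solutions of r**12 == 2, evenly spaced around a circle
# of radius 2**(1/12) in the complex plane.
roots = [cmath.rect(2 ** (1 / 12), 2 * math.pi * k / 12) for k in range(12)]

# The k = 0 root is the only positive real one, so it is the semitone ratio.
semitone = roots[0].real


def tet_frequency(semitones_from_a4, a4_hz=440.0):
    """Frequency of the pitch a given number of semitones above (or, if
    negative, below) A4 in twelve-tone equal temperament."""
    return a4_hz * 2 ** (semitones_from_a4 / 12)


print(f"semitone ratio: {semitone:.6f}")
print(f"a less useful root: {roots[1]:.4f}")
for name, steps in [("A#4", 1), ("C#5", 4), ("A5", 12), ("A3", -12)]:
    print(f"{name:4s} {tet_frequency(steps):8.2f} Hz")
```

Changing `a4_hz` to 445 would retune every note at once, which is all that a different pitch standard amounts to.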
Now, you’ll notice that in 12 TET, a perfect fifth is \(2^{\frac{7}{12}}:1\approx 2.997:2\) instead of \(3:2\). This is why, in Figure 2.2, the third harmonic is labeled as 2 cents sharp with respect to 12 TET, and also why Jacob Collier might tell you that his piano is out of tune. This also means that I lied at the beginning when I claimed that E5 was \(660\,\mathrm{Hz}\); in 12 TET it’s actually about \(659.26\,\mathrm{Hz}\). Of course, this is not ideal, but clearly our ears don’t mind anyway, considering all the music you listen to is (probably) in 12 TET.
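To see how much each interval gives up, the sketch below compares a few just ratios mentioned in this essay with their 12 TET counterparts, measured in cents. The helper name `just_minus_tet_cents` is hypothetical, and a positive result means the just interval is wider (sharper) than the equal-tempered one.

```python
import math
from fractions import Fraction


def just_minus_tet_cents(semitones, just_ratio):
    """Size of a just interval minus the 12 TET interval with the given
    number of semitones, in cents (1200 cents per octave)."""
    just_cents = 1200 * math.log2(float(just_ratio))
    return just_cents - 100 * semitones


# (name, semitones in 12 TET, just ratio)
INTERVALS = [
    ("perfect fourth", 5, Fraction(4, 3)),
    ("perfect fifth", 7, Fraction(3, 2)),
    ("major third", 4, Fraction(5, 4)),
    ("7:4 minor seventh", 10, Fraction(7, 4)),
]

for name, semitones, ratio in INTERVALS:
    diff = just_minus_tet_cents(semitones, ratio)
    print(f"{name:18s} {diff:+7.2f} cents")

# E5 above A4 = 440 Hz in each system.
print(f"just E5:   {440 * 3 / 2:.2f} Hz")
print(f"12 TET E5: {440 * 2 ** (7 / 12):.2f} Hz")
```

The fifth and fourth are each only about 2 cents away from their pure versions, whereas the major third is nearly 14 cents away, so 12 TET treats some intervals far more kindly than others.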
Inspirational Poster 3.4. Feel free to hang on walls or set as desktop wallpapers.
All figures and audio created in Musescore except for bassline played by Matthew. Inspirational poster created in Inkscape.