Monday, December 21, 2015

Modeling rhythms using numbers

Context

I like to dabble a bit in generative music from time to time. While thinking about how to generate percussion patterns I was wondering about compact representations of rhythm. This blog entry documents my current approach (which, as usual, may or may not exist already, and may or may not be useful to you.)

(Note to self: this blog entry lacks some pictures for clarity.) 

Encoding rhythm in a number

Consider the following simple rock beats:

This beat has 3 voices: 
  • the upper voice represents the hi-hat
  • the middle voice represents the snare drum.
  • the lower voice represents the bass drum (kick)

Simple case: a voice has notes with equal duration (e.g. upper staff)

In the upper staff, there's an easy conversion between notes being present/absent and the bits in a binary number.

For the hi-hat, consider each measure as consisting of eight 8th notes. To each 8th note we can associate a bit in a binary number. Since an 8th note is played in every one of these eight positions, all bits are set to one. Therefore the hi-hat in the first measure can be represented as the binary number (1111 1111), which can be written as the decimal number 255, with a resolution of 2. (The resolution represents the number of bits per beat. Here it's 2 because there are two 8th notes per beat.)

The bass drum can be seen as consisting of 4th notes. There's a kick on the first and third beat, but not on the second and fourth beat of the measure. The bass drum voice in the first measure therefore can be represented as the binary number (1010) (in decimal this is number 10) with a resolution of 1.

The snare drum also consists of 4th notes. There's a snare drum on beats 2 and 4, but not on beats 1 and 3 of the measure. For this reason, the snare drum can be represented as (0101) (in decimal this is number 5) with resolution of 1. 

The complete first measure therefore can be summarized as:
  • hi-hat: 255 (resolution 2)
  • snare drum: 5 (resolution 1)
  • bass drum (kick): 10 (resolution 1)
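
The conversion described above can be sketched in a few lines of Python. The function name encode_binary and the list-of-flags input are just my own choices for illustration.

```python
def encode_binary(hits):
    """Turn a list of 0/1 flags (one per time slot, first slot first) into an integer."""
    value = 0
    for hit in hits:
        # Shift the bits collected so far one place left and append the new bit.
        value = value * 2 + (1 if hit else 0)
    return value


hi_hat = encode_binary([1, 1, 1, 1, 1, 1, 1, 1])  # eight 8th notes, resolution 2
kick = encode_binary([1, 0, 1, 0])                # beats 1 and 3, resolution 1
snare = encode_binary([0, 1, 0, 1])               # beats 2 and 4, resolution 1
print(hi_hat, kick, snare)  # 255 10 5
```

Note that the resolution is not stored in the number itself, so it has to be kept alongside it.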

Second case: a voice has notes with unequal duration (e.g. lower staff)

The simple notation we used before no longer suffices. The bass drum voice has a kick on the first beat, and one on the second half of the second beat. As a first idea, we can pretend that the notes are written as 8th notes that are tied together. In that case the bass drum could almost be modeled as (1101) with a resolution of 2, except that this number doesn't model at all that the first two 8th notes are tied together to form one longer kick.

To overcome this limitation, introduce a new digit 2. 2 indicates that the current note is present, and tied to the next note (whereas digit 1 indicates that the current note is present but not tied to the next one). An accurate representation for the bass drum in the lower staff therefore is (2101) with a resolution of 2. Because of the number 2, this is no longer a binary number, but it can be interpreted as a ternary number (a number in number base 3).  (2101) in number base 3 corresponds to decimal number 64.

Since one cannot tie a note to a rest, the number combination 20 doesn't make any sense and for all practical purposes can be replaced with 10. 
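
As a minimal sketch of the ternary encoding (the helper names normalize and encode_ternary are hypothetical, and patterns are written as strings of the digits 0, 1 and 2):

```python
def normalize(pattern):
    """Replace a tie into a rest ("20") by a plain note followed by a rest ("10")."""
    # One pass is enough: the replacement can never create a new "20".
    return pattern.replace("20", "10")


def encode_ternary(pattern):
    """Interpret a string of digits 0/1/2 as a base 3 number."""
    return int(normalize(pattern), 3)


print(encode_ternary("2101"))  # 64
print(normalize("2201"))       # 2101
```

Python's built-in int accepts a base argument, so no hand-written conversion is needed here.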

Without loss of generality, we can also interpret the numbers of the upper staff as numbers in number base 3. The upper staff therefore is modeled as:
  • hi-hat: (1111 1111) in number base 3, or 3280 in number base 10, (resolution: 2)
  • snare drum (0101) in number base 3, or 10 in number base 10, (resolution: 1)
  • bass drum (kick): (1010) in number base 3, or 30 in number base 10, (resolution: 1)
The lower staff is modeled as:
  • hi-hat: (1111 1111) in number base 3, or 3280 in number base 10, (resolution: 2)
  • snare drum (0101) in number base 3, or 10 in number base 10, (resolution: 1)
  • bass drum (2101) in number base 3, or 64 in number base 10, (resolution: 2)

What's the point?

Any decimal number can be rewritten in number base 3 and vice versa, so any integer represents a drum pattern voice, and every drum pattern voice can be written as a single integer. So drum pattern voices can be enumerated and constructed systematically.
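
Going the other way, from an integer back to a pattern, could look like the sketch below. The name to_pattern and the explicit width parameter are my own assumptions: the integer alone doesn't tell you how many leading rests (zeros) the pattern has, so the number of digits has to be supplied.

```python
def to_pattern(number, width):
    """Write a non-negative integer as a base 3 digit string of the given width."""
    digits = []
    for _ in range(width):
        number, digit = divmod(number, 3)
        digits.append(str(digit))
    if number:
        raise ValueError("number does not fit in the requested width")
    return "".join(reversed(digits))


print(to_pattern(64, 4))  # 2101

# Enumerate the first few one-measure voices at resolution 1 (4 digits).
for n in range(5):
    print(n, to_pattern(n, 4))
```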

Hah! I bet you can't do triplets can you?

Why not? Of course I can. Suppose you have a drum pattern voice that mixes 8th notes with 8th-based triplets. You can again consider the 8th notes as consisting of 3 tied 16th triplet notes, and the 8th triplet notes as consisting of 2 tied 16th triplet notes. The resolution is 6 (since there are 6 triplet 16ths per beat), and the pattern for 8th note triplets is (212121) (decimal: 637), whereas the pattern for 8th notes is (221221) (decimal: 700).

How did I know I had to use a resolution of 6? A single beat has 3 triplet 8th notes, or 2 8th notes. The least common multiple of 2 and 3 is 6. Therefore I had to subdivide the beat into 6 equal parts (which corresponds to using a triplet 16th as reference length).
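
The same reasoning as a small Python sketch. The helper names resolution_for and tied_notes are made up for this example, and it only covers a single beat filled with equal notes.

```python
from math import gcd


def resolution_for(*notes_per_beat):
    """Least common multiple of all the subdivisions that occur in one beat."""
    result = 1
    for n in notes_per_beat:
        result = result * n // gcd(result, n)
    return result


def tied_notes(notes_per_beat, resolution):
    """One beat of equal notes, each written as tied slots: 2...21."""
    slots = resolution // notes_per_beat
    return ("2" * (slots - 1) + "1") * notes_per_beat


res = resolution_for(2, 3)     # 8th notes and 8th triplets
triplets = tied_notes(3, res)
eighths = tied_notes(2, res)
print(res)                     # 6
print(triplets, int(triplets, 3))  # 212121 637
print(eighths, int(eighths, 3))    # 221221 700
```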

Is this system general enough to encode mixtures of different tuplets?

It is, but if you go for very exotic rhythms, you may end up with large resolutions and many digits. Rest assured: in popular practice most rhythms don't need very complex encodings.

Can you convert representations between different resolutions?

To some extent, yes, but not every pattern can be expressed in any resolution without loss of information. In number base 3, if you understand what we're doing here, it's rather trivial. E.g.
  • (1010) with resolution 2 corresponds to (21002100) in resolution 4. What we did here is replace every 8th note with tied 16th notes. This boils down to applying rewriting rules 1 -> 21 and 0 -> 00. This is "up"sampling the rhythmic representation, and it may be a preparation step for other transformations later on.
  • Similarly, we can upsample (21002100) in resolution 4 to (2221000022210000) in resolution 8. Here we replaced every 16th note with tied 32nd notes. This boils down to applying rewriting rules 2 -> 22, 1 -> 21, 0 -> 00.
  • If you want to halve the resolution, you process the base 3 digits in groups of two: 
    • Take drum pattern represented by 2221000022210000 in number base 3, and group by 2: (22,21,00,00,22,21,00,00)
    • Then substitute: 22 -> 2, 21 -> 1, 00 -> 0 (This is a form of "down"-sampling without loss of information)
    • If you encounter other patterns than 22, 21 or 00 you cannot reduce the resolution without mutilating the rhythm. In that case you down-sample while losing some information (a kind of low-pass filtering).
    • If you downsample the rhythm to a lower resolution, and while doing so are forced to mutilate the rhythm, you can upsample it again, then subtract the resulting number from the original number to get an error rhythm (a kind of high-pass filtered rhythm).
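
The rewriting rules above translate almost literally into Python. This is only a sketch: the names upsample and downsample are my own, and the lossy case is deliberately left out (the function returns None instead), since the text above doesn't fix which rule to use when information has to be thrown away.

```python
UP = {"0": "00", "1": "21", "2": "22"}
DOWN = {pair: digit for digit, pair in UP.items()}


def upsample(pattern):
    """Double the resolution: every slot becomes two slots tied together."""
    return "".join(UP[digit] for digit in pattern)


def downsample(pattern):
    """Halve the resolution, or return None if that would mutilate the rhythm."""
    if len(pattern) % 2:
        return None
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    if any(pair not in DOWN for pair in pairs):
        return None
    return "".join(DOWN[pair] for pair in pairs)


print(upsample("1010"))                # 21002100
print(upsample("21002100"))            # 2221000022210000
print(downsample("2221000022210000"))  # 21002100
print(downsample("1101"))              # None
```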

Are digits 0,1,2 enough to notate any rhythm?

Yes and no. Yes: you can form any rhythm using this system. No: you cannot accurately annotate certain expressive marks (e.g. staccato, marcato, ghost notes) using this system. To add such information should be possible by introducing new digits (which themselves can encode different dimensions of information, e.g. by forming the digits by multiplying prime factors, where presence of a given prime factor indicates presence of a certain expressive mark). In that case not every conceivable number is a valid rhythm anymore and things may get hairy. Instead of absorbing the expressive marks directly in the rhythm model, they can also be added as meta-information, e.g. in the form of a second (binary) number where each bit represents presence or absence of a given expressive mark.
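
One possible reading of the meta-information idea, as a sketch: keep one extra binary number per expressive mark, with one bit per time slot. The accent mask below and the function name slots_with_mark are hypothetical, and other layouts (e.g. one bit per kind of mark) are equally conceivable.

```python
def slots_with_mark(pattern, mark_mask):
    """Return the slot indices (0 = first slot) that sound and carry the mark."""
    width = len(pattern)
    marked = []
    for i, digit in enumerate(pattern):
        # The first slot corresponds to the most significant bit of the mask.
        bit = (mark_mask >> (width - 1 - i)) & 1
        if bit and digit != "0":
            marked.append(i)
    return marked


hi_hat = "11111111"
accents = 0b10001000  # hypothetical accent mask: first and fifth 8th note
print(slots_with_mark(hi_hat, accents))  # [0, 4]
```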

So how do I use this in my generative music?

It's up to you how you use the representation to create music. Here are some possibilities.
  • You can generate random integers and interpret them as drum pattern voices.
  • You can start from an integer, and use rewriting rules like the ones shown above to upsample to a higher resolution. By using rewriting rules other than the ones present in the previous section you can systematically calculate variations on the starting pattern. E.g. try 21->11, 22->11 or 22->21 to break ties, or 21 ->  10 to replace a longer duration with a shorter one.
  • Instead of using rewriting rules, you can also use systematic calculations on the decimal representations (or representation in any other number base really), and interpret the results as rhythms again. In that case the variations are still systematic, but most likely more unpredictable to an observer.
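
A rough sketch of the three possibilities listed above. Everything here is an assumption made for illustration: the helper names, the random seed, the choice to apply rules to consecutive pairs of digits, and the arbitrary constant in the arithmetic variation.

```python
import random


def to_pattern(number, width):
    """Write a non-negative integer as a base 3 digit string of the given width."""
    digits = []
    for _ in range(width):
        number, digit = divmod(number, 3)
        digits.append(str(digit))
    return "".join(reversed(digits))


def rewrite(pattern, rules):
    """Apply rewriting rules to consecutive, non-overlapping pairs of digits."""
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    return "".join(rules.get(pair, pair) for pair in pairs)


# 1. A random integer interpreted as a voice of 8 slots.
rng = random.Random(2015)
random_voice = to_pattern(rng.randrange(3 ** 8), 8)

# 2. A variation by rewriting: break one tie and shorten one note.
base = "2221000022210000"
variation = rewrite(base, {"22": "21", "21": "10"})
print(variation)  # 2110000021100000

# 3. A variation by plain arithmetic, wrapped around to stay within 16 digits.
shifted = to_pattern((int(base, 3) + 12345) % 3 ** len(base), len(base))
print(random_voice, shifted)
```

The results of the first and third approach may contain the combination 20, which can be cleaned up as described earlier by replacing it with 10.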

Sunday, September 6, 2015

Fear of change and its influence on the practice of music composition

Introduction

In this blog entry I will formulate some thoughts about how fear of change can explain a number of principles in music composition. It's very well possible that all this has been written before, and much better explained than I will ever be able to do, but I'm in a philosophical mood today, and perhaps you'll start to think differently about some things after reading this text. If you experience a feeling of skepticism while reading this article, and feel like it's written by an internet crackpot theorist, rest assured that this is in complete correspondence with what the article predicts will happen :)

Fear of change as an organizing principle in the universe

Fear of change, while sounding specific to human psychology, really permeates the universe. In Newtonian physics, any action will cause a counter-action that resists the original action. If you push your table top down, the table top pushes back and cancels out your intention to change it (until you hit it so hard that it breaks or deforms). Dynamic processes strive for equilibrium, that is, a state in which all changes have canceled each other out perfectly and nothing happens anymore. Exactly why all things strive for minimal energy is, to the best of my knowledge, not known to anyone, but it's an empirical observation that has held science together for a few centuries already and has been confirmed over and over again in experiments.

In psychology, “fear of change” is a well-known topic. When Copernicus proposed that the earth revolves around the sun, the idea met with massive resistance. When confronted with the implications of quantum theory (that he helped to establish), Einstein resisted the change in world view it would bring about and declared that "God doesn't throw dice". Announcements of big changes in an organization, for example, are typically met with skepticism, and resistance and conservatism quickly pop up to cancel out the announced change. Just google for “change management” to find a myriad of books explaining how to reorganize a corporation. As I will argue in this blog entry, this same mechanism of fear of change (or better "resistance to change") also permeates music theory. 

At the same time the universe doesn't appear to like complete rest. Quantum physics (a revolutionary theory that of course was met with a lot of skepticism at first!) predicts that there's no such thing as complete “rest”. The Heisenberg uncertainty relation necessitates that even at a temperature of 0 Kelvin (the lowest possible temperature in the universe) there must still be a small rest energy. In nature and technology we also observe constant evolution. Change is inevitable it seems. Similarly, in music, listening to a piece that consists of a single note without volume or rhythmic variation that lasts forever is not a pleasant experience. Ask anyone who's suffering from tinnitus what it's like...

Finally, I want to stress that this fear of change is not a bad thing per se. After all, it has helped us survive since the stone age. It was probably safer to eat the berries that your parents ate than to try new berries every day. And it continues until today, since not all big changes or revolutionary “new insights” really have the merit they claim they have (and that may well apply to this blog entry too!)

Fear of change in music

In this section I will list some places where I see fear of change in action in music composition. If you know about more examples, by all means, comment!


3.1 Music style

In modern classical music, certain experiments have been branded "interesting", whereas other experiments have proved to be wildly successful with wide audiences. 

The “12 tone” music style, that resolutely throws away the organizing principle of “sounding good” and replaces it with a different organizing principle of “using all available notes and treating them without differences”, as introduced by Schoenberg, results in music that has many leaps and bounds and, let's face it, has failed to attract a significant audience. On the other hand, “minimal music” with composers like Philip Glass, Steve Reich, Brian Eno, Michael Nyman, Terry Riley and a myriad of others, makes slowly evolving music and continues to be wildly successful with wide audiences. Compared to the 12-tone music, minimal music minimizes change. It also offers just enough changes to keep it from being boring. As such it avoids complete rest.


3.2 Writing melody

When writing melody, e.g. in the context of counterpoint, or in the context of a song, it is advised to avoid big leaps. The reason given by ancient theorists is that smaller leaps are easier to sing. What makes a bigger leap harder to sing accurately than a smaller leap? Is the larger change of pitch a cause for distress in our brains? At the same time, I also took an introduction to counterpoint class, in which I was warned to avoid “turbulence”, i.e. writing a flurry of notes that doesn't seem to go anywhere. Minimize the change, while avoiding complete lack of direction (lack of direction would be a form of equilibrium or rest).


3.3 Voice leading

When moving from chord to chord, voice leading is the principle that makes you do these movements while minimizing the changes in notes. Voice leading is an important topic in many courses on harmony and jazz theory. It is perhaps the most common principle that governs modern music styles (apart from those styles that avoid it on purpose, like the 12-tone music mentioned earlier). Voice leading is a direct application of minimizing change between chords. The fact that you move between chords and don't just stay on the same chord all the time, is a direct application of avoiding complete rest. 

Minimizing changes between chords historically probably also has a second reason: when playing chords on a keyboard it is easiest to play chords that are close together, i.e. where you minimize the changes in required hand and finger movements. Minimizing unneeded movements is absolutely required when learning to play an instrument at the level of a virtuoso. This synergy between physical minimization of change and psychological minimization of change probably has led voice leading towards the huge role it plays in music composition.


3.4 Fugue construction

While constructing a fugue according to the classical rules, the composer first states the theme, then restates the theme a fifth away from the original theme (but without introducing new accidentals, a so-called modal transposition), then returns to the original theme. This is the so-called “exposition” part of the fugue. During the exposition, the listener is taught the theme that will return in all kinds of variations later on. The theme is taught three times (minimize change), but the second time a fifth away compared to the first and third time (no complete rest). Why a fifth away? At first sight, a fifth seems like a large jump. Why didn't the composer just write the theme a second higher?

There's again an application of the principle here and it requires some explanation. If you transpose all notes from the C major key a perfect fifth up, you get the notes from G major. If you compare the notes of C major and G major, you will notice that they share all the same notes, except for the note f (in C major) compared to a note f# (in G major). G major therefore represents a key that is as close as possible to C major (since it differs in only one accidental) while not being completely the same (since it differs in at least one accidental). A theme written in C major that is modally transposed from c to g will therefore sound maximally the same as the original theme (minimize change). Next time you wonder why moving along the circle of fifths is so popular, fear of change may be the answer you look for.
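
The claim about shared notes is easy to check with a bit of Python. In this sketch, notes are reduced to the 12 pitch classes (c = 0, c# = 1, and so on), which ignores enharmonic spelling; the names major_scale and names are my own.

```python
NOTE_NAMES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic


def major_scale(tonic):
    """Set of pitch classes of the major scale starting on the given tonic."""
    return {(tonic + step) % 12 for step in MAJOR_STEPS}


def names(pitch_classes):
    return sorted(NOTE_NAMES[p] for p in pitch_classes)


c_major = major_scale(0)
g_major = major_scale(7)  # a perfect fifth (7 semitones) up
d_major = major_scale(2)  # a major second (2 semitones) up

print(names(c_major - g_major))  # ['f']
print(names(g_major - c_major))  # ['f#']
print(len(c_major & g_major))    # 6
print(len(c_major & d_major))    # 5
```

So a fifth up keeps 6 of the 7 notes, whereas the seemingly smaller step of a second up keeps only 5.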

3.5 Modulation

Modulation is the art of moving from musical key to musical key. When you move from one key to another, you want to gently guide the listener towards this change. When you read about modulation, you will often be advised to modulate to “near” keys, that is, musical keys that do not differ in number of accidentals too much. This is a direct application of minimization of change. 

One can also modulate to more distant keys. In those cases a sudden change, known as direct modulation, is frowned upon by composers and theorists. To modulate between keys, especially to distant keys, several advanced techniques have been invented and they involve clever voice leading, sometimes going as far as substituting chromatic notes for enharmonic equivalents, towards a cadence to confirm the new key. These techniques incrementally introduce small changes to the audience so they are guided from the old key into the new key without sudden changes.


3.6 Writing hit songs

Commercial pop music often reuses the same chord progressions. During the 1980s, these chords were typically I, IV, V (think: C F G). Nowadays, the new chord progression used in virtually all hit songs is (I, V, vi, IV) (think: C G Am F). Why is that? Why exactly those progressions? Why did I, V, vi, IV come after I, IV, V?

Look at I, IV, V. Remember from the section about fugue construction that transposing a theme a fifth up will maximally retain the existing melody notes. The same is true when transposing a theme a fifth down (note “c” transposed a fifth down gives an “f”. This can just as well be thought of as transposing it a fourth up). When transposing something a fifth up, you need an extra sharp (or one less flat) to completely preserve the same melody. Similarly, when transposing a theme a fifth down, or equivalently a fourth up, you need an extra flat (or one less sharp) to completely preserve the melody. This means that by playing with the chords I, IV, V you have minimized the changes in the set of notes that need to be recognized by an audience, and maximally preserved the possible melodies that can be written on top of these chords.

Complete rest is still not desirable, and so after 20 years of I, IV, V the time was ready for a new chord progression that finds a way to minimize change while avoiding complete rest. And this new chord progression appears to be I, V, vi, IV. It's an evolution from I,IV,V in that it introduces an extra chord. Because the audience is already very used to I, IV, V, this extra chord can inject a bit of much needed change into the music again. The new chord of course is not chosen arbitrarily. It's chosen in such a way that it minimizes change with respect to the old chord progression. 

As before, when going from I to V, you need only one extra accidental. When going from V to vi, you need one less accidental (since vi is the minor equivalent of major I). Then when going from vi to IV, you need one less accidental again, and finally when going back from the final IV to I to sing the next verse, you need a single extra accidental again, making the circle round. Changes have been minimized between every two consecutive chords, and total rest is avoided by moving between different chords.


I'm afraid we're still stuck with I, V, vi, IV for a while, but if you want to define the future, grab your chance, and design a new chord progression that minimizes change while avoiding complete rest :) Unfortunately, you may have trouble selling it to the music publishing companies, since they will probably resist these sudden changes you try to introduce ;) “Never change a winning team/theme!”


Wednesday, August 19, 2015

More notes to self on treating vocals

Steps to follow:

Remove background noise using gate

  • e.g.: threshold: -17dB, reduction: -100dB, attack: 5ms, hold: 30ms, release: 60ms, hysteresis: -3dB, lookahead: 0, high cut: 20kHz, low cut: 20Hz

Corrective Equalization

see "Remove Rumble" and "Sweep Sound" part of previous post

Normalize gain to -3dB

De-esser (some say it should come after compression)

  • e.g.: detection frequency: 9800Hz, sensitivity: 26%
  • e.g.: suppressor: 9300Hz, strength: -9dB

Compression

  • e.g.: attack: 2ms, knee: 1, threshold: -22dB RMS, gain: 8dB, limiter threshold: -0.5dB

Equalization and Enhancing

see "Give Glitter" part of previous post. 

Note: to make sure that all similar vocals are treated similarly, route them through a common bus and apply the effects on the bus.

Add reverb and delay

use send/return configuration for all time based effects

Last minute fix-ups

Autotune + see "Special fx" part of previous post

Sunday, August 16, 2015

Note to self about equalizing vocals

Since I'm sure I will forget this information, I'm putting it online where I know I will find it again. After each step, also check the effect in the mix (i.e. together with other instruments). A subtle effect is usually better than an over-the-top effect. The information here is summarized from https://www.youtube.com/watch?v=qdDDVortvRU . Be sure to check out their video for sound examples.

Step 1: remove rumble

Use a high pass filter (aka low cut filter). Increase the cut-off frequency until you just start to hear the difference, then reduce it a bit. That's right, aim for not hearing the effect. This ensures that you only remove rubbish, and don't remove valuable data. A typical cut-off frequency will be around 80Hz-120Hz.

When done, check the effect in the mix.

Step 2: give glitter

For this purpose use a high shelving filter. Try to boost frequencies above 8kHz with anything from 1dB to about 6dB. If you want a more subtle effect, try to boost above 12kHz-16kHz instead.

When done, check the effect in the mix.

Step 3: sweep sound

Use a narrow bandpass filter, vary its center frequency and search for frequency bands that obviously stand out compared to other frequency bands. You can attenuate these a bit. A typical action is to attenuate around 800Hz-1kHz.

When done, check the effect in the mix.

Step 4: special fx

This step is optional.
  • To make sound brighter, try to boost 2kHz-5kHz.
  • To make the vocal sit better in the mix in quieter passages, try cutting between 100Hz-250Hz.
When done, check the effect in the mix.

Wednesday, January 14, 2015

Stefaan Himpe - Fairy tale for piano solo

Fairy tale for piano solo

It's been a while but I finally found some time to write, practice and record a new piece. All things music have been on a break for a while now, and I'm glad that I can take a break from taking a break.

You can download the piece from my soundcloud account: https://soundcloud.com/stefaanhimpe/fairy-tale-for-piano-solo



So... what do we have here really? 

The piece started as a sad ending theme for a video game. It was never used for that purpose and so I decided to rework it into something longer. Maybe it's interesting if I tell you how I think about the music, bar by bar. (Then again, maybe it's not ;) Feel free to skip the explanations. Most of all, I'm also curious how much of this I will still recognize in a year or in 5 years. I may even come back to my description and update it as my insights change.)

Bars 1-4: The piece starts off quite melancholic (which was a requirement for the sad ending of the video game). This part is written in C minor key, a key that is traditionally regarded as the key in which to express a declaration of love with lament of unhappy love, sighing of the lovesick soul. Clearly someone is thinking back about something.

Bars 5-8: The beginning is repeated with a slight variation in the accompaniment to avoid making it sound the same twice. It is as if the accompaniment adds more detail to the memories as they are repeated.

Bars 9-12: Suddenly the melancholic fantasy is interrupted with a little waltz. A short distraction? Or perhaps a related memory, a sudden free association. 

Bars 13-20: The distracting thoughts are pushed away and make place again for the fantasy. As the fantasy repeats itself, the melody notes again add more details to the memories. By coloring the melody using notes outside the scale, the color of the memories changes. Little imperfections that keep the story alive as it were.

Bars 21-27: The fantasy continues. The fantasizing person remembers and overthinks some of the consequences that resulted from whatever happened in bars 13-20.

Bars 27-32: Stress level increases a bit. Temper gets a bit heated. 

Bars 33-34: Some soothing thoughts manage to calm down the person.

Bars 35-38: There's our distraction waltz again. It's barely interesting enough to keep our thoughts away from what happened.

Bars 39-48: The fantasy is resumed, but in bar 45 it suddenly takes a different path. The person has second thoughts about how everything really went back then. For a while, things are remembered as perhaps more festive than before (bars 47-48).

Bars 49-52: These finish the first fantasy and prepare the music to change into a different key of f minor. F minor is the key in which one traditionally expresses deep depression, funeral lament, groans of misery and longing for the grave. No doubt the passage that follows will have a sinister side to it.

Bars 52-85: A buildup of emotions, a waterfall of notes follows. In the left hand we have a sad but very static accompaniment with typical f minor notes. In the right hand, in bars 53-64, we also have very dissonant chords which work to create a somewhat uneasy feeling. In bars 65-76, the dissonant chords are now replaced with a locrian motif that further increases the uneasiness in the music. In bar 74, the deepest note of the piece (a very low "d") is reached. Then bars 77-85 repeat the same techniques but a fifth higher and with many more, and much faster notes, which adds even more drama to the already dramatic state of mind of our fantast. The fifth higher brings us into the key of Bb minor, which traditionally is used to express feelings of mocking God and the world, discontented with itself and everything, preparation of suicide. Quite restless and dramatic indeed :)

Bars 86-91: The deepest darkest memories subside and make place again for the sweeter earlier theme.

Bars 92-113: The earlier theme is repeated, but again the colors have changed to something more bitter-sweet. This time the chords in the right hand sound much more yearning than in the beginning of the piece. As if the darkest memories increased the feeling of having lost something valuable and make it more painful to think back about what was lost.

Bars 114-117: Eventually, the person snaps out of his trip through memory lane and we hear the waltz playing again.

Bar 118: The piece ends with some bitter sweet ending chords. A near happy end, and time to get back to work ;)