On The Auditory Scene

On "The Auditory Scene," by Albert S. Bregman. Thoughts and tangents by Lucas S. Rohm.

When it comes to the perception and cognition of music and sound, our brains seem to organize separate musical events almost automatically by analyzing the "auditory scene." If a listener with even minimal musical exposure were played a piece of music they had never heard before, they would be able to perceptually isolate and identify the different parts of the piece, if not by naming specific instruments, then by denoting them with orchestral labels (e.g. rhythm section, melody, harmony, bass, percussion, ambiance, etc.). Bregman would most likely argue that this kind of discrimination in perceiving the auditory scene arises from the Gestalt principles of grouping, the sense in which "perceptual elements...were attracting one another like miniature planets in space with the result that they tended to form clusters in our experience" (19). Although there are many variations on the basic Gestalt principles of perception, they collectively point to the idea of Prägnanz, our tendency to perceive visual and auditory scenes as made up of simple, orderly elements.

One of these sub-principles deals with grouping individual elements by proximity, and it has a direct analogue in polyphonic music played with a single timbre. Say, for example, that a pianist plays a solo piece comprising 100 individual notes. Since "tones tend to group perceptually with those that are nearest to them in frequency" (Bregman, 15), we will automatically group one portion of the notes and label them "the bassline," and group and label another portion "the melody," based solely on the proximity of those notes in pitch. It is striking that some, if not all, listeners will inherently draw a perceptual boundary at some middle frequency or frequencies in order to group the notes and attach a name to each group.

What is even more interesting are the analogues that Gestalt-based perception has in the field of semiotics, the study of signs and how they symbolize events in human activities. In the example above, we use the word "bassline" as a sign representing the collection of notes in a solo piano piece whose frequencies fall below a certain value (perhaps middle C, about 261.63 Hz). In another example, if we hear a single melody played, we might represent it with a semiotic sign: a series of note names written out, or a musical staff carrying the time signature and the notes' pitches and durations. Music, in its purest form, is something we can neither physically see nor touch, so it makes sense that we have naturally created a system of signs and symbols to represent its various qualities. Through Gestalt organization, we often make the mental "jump" from the sensation and perception of musical properties straight to their semantic representation.

We also form groups from more complex pieces of music through other principles of Gestalt perception. Two similar melodic lines may operate within close pitch proximity of one another, yet we will separate the notes that comprise them into different groups based on the similarity of the timbres voicing them (a harpsichord has a different tone color than an electric guitar). Similarly, we can group contrapuntal melodies voiced by the same instrument according to common fate: one group of notes seems to steadily rise in pitch while another seems to steadily fall.
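
As a rough illustration of the proximity principle, the short Python sketch below (my own toy example, not a model from Bregman's text; the note frequencies and the two-stream assumption are hypothetical) splits a run of note frequencies into a low stream and a high stream purely by nearness in frequency:

```python
from math import log2

def group_by_proximity(freqs):
    """Partition note frequencies (Hz) into two perceptual streams.

    Each note joins whichever anchor (the lowest or the highest note)
    it is nearer to in log frequency -- a crude stand-in for the Gestalt
    proximity principle: tones group with those nearest them in frequency.
    """
    low, high = min(freqs), max(freqs)
    streams = {"bassline": [], "melody": []}
    for f in freqs:
        # distance measured in octaves, since pitch proximity is roughly logarithmic
        if log2(f / low) <= log2(high / f):
            streams["bassline"].append(f)
        else:
            streams["melody"].append(f)
    return streams

# Hypothetical piano figure: low notes interleaved with high notes.
piece = [130.8, 523.3, 146.8, 587.3, 164.8, 659.3, 196.0, 523.3]
print(group_by_proximity(piece))
# The low notes land in "bassline", the high notes in "melody".
```

A real listener, of course, tracks streams over time and weighs far more than raw frequency, but even this crude split lands on the kind of "perceptual boundary at some middle frequency" described above.
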
Rens Bod, a researcher at the University of Amsterdam who works on computational models of music and language, challenges Gestalt-based theories of musical analysis in his study "A Memory-Based Model for Music Analysis." Bod's experiment provides a computational model for the analysis of a large database of simple Western music, the Essen Folksong Collection. He essentially builds a computer simulation of how an adult human would group simple folk melodies into phrases, trained on small examples of how phrase divisions //should// be made in such situations. He argues that memory plays a larger role than Gestalt principles in the grouping of musical elements, contrasting how his simulation "correctly" placed divisions across certain pitch jumps that a different simulation, defined by the Gestalt principles of proximity and similarity, would have split into incorrect phrases (a toy sketch of this contrast appears at the end of these notes).
This study seems to do very little to discredit the principles of Gestalt psychology as they apply to perceiving auditory scenes or events, for a few reasons. The first is that all Bod tests is a //simulation// of human musical analysis, which seems beside the point, since the emergence of Gestalt organizational principles appears to be a uniquely human phenomenon. The second reason the study seems to hold little water is that one of the guiding ideas of the Gestalt school is "reproductive thinking," that is, perceiving individual elements by drawing on what is already known about form, in other words a **memory-based** theory of solving problems and organizing elements. Additionally, Bod's study seems rife with discrepancies and loose ends ("It is noteworthy that our DOP-Markov parser predicted to a very high degree...the correct grouping boundaries...although it often assigned additional subphrases within these phrases"). All in all, Gestalt principles of perceptual organization seem to largely define how we represent, listen to, and compose music. The real debate is whether this aspect of our perception of sound is beneficial or detrimental to our ability to compose. Even atonal music seems subject to principles of Gestalt organization, if for no other reason than the timbral qualities of sound alone.
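
To make Bod's contrast concrete, here is the toy sketch promised above (my own illustration, emphatically not Bod's DOP-Markov parser; the melodies, MIDI pitches, and thresholds are hypothetical). A Gestalt-style segmenter posits a phrase boundary at any large pitch jump, while a naive memory-based segmenter simply reuses boundaries remembered from an earlier, similar melody, so the two rules can disagree in exactly the way Bod's argument turns on:

```python
def gestalt_boundaries(pitches, jump=5):
    """Proximity rule: posit a phrase boundary after note i whenever the
    interval to the next note exceeds `jump` semitones."""
    return [i for i in range(len(pitches) - 1)
            if abs(pitches[i + 1] - pitches[i]) > jump]

def memory_boundaries(pitches, remembered):
    """Naive memory rule: reuse the boundary positions of a previously
    heard melody of the same length, regardless of local pitch jumps."""
    for past_pitches, past_bounds in remembered:
        if len(past_pitches) == len(pitches):
            return past_bounds
    return []

# Hypothetical melodies in MIDI note numbers.
melody = [60, 62, 64, 65, 64, 62, 60, 67, 69, 71, 72, 71]
corpus = [([57, 59, 61, 62, 61, 59, 57, 64, 66, 68, 69, 68], [5])]

print(gestalt_boundaries(melody))          # [6]: boundary at the 7-semitone leap
print(memory_boundaries(melody, corpus))   # [5]: boundary where a similar tune had one
```

Where the two rules disagree, Bod's claim is that listeners' judgments follow the remembered phrase divisions rather than the raw interval size; my objection above is that such a memory is itself compatible with the Gestalt notion of reproductive thinking.
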