Help Privacy Policy Disclaimer
  Advanced SearchBrowse




Journal Article

Evaluating an analysis-by-synthesis model for Jazz improvisation


Frieler,  Klaus
Scientific Services, Max Planck Institute for Empirical Aesthetics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)

(Publisher version), 2MB

Supplementary Material (public)
There is no public supplementary material available

Frieler, K., & Zaddach, W.-G. (2022). Evaluating an analysis-by-synthesis model for Jazz improvisation. Transactions of the International Society for Music Information Retrieval, 5(1), 20-34. doi:10.5334/tismir.87.

Cite as: https://hdl.handle.net/21.11116/0000-000A-1F3C-2
This paper pursues two goals. First, we present a generative model for (monophonic) jazz improvisation whose main purpose is testing hypotheses on creative processes during jazz improvisation. It uses a hierarchical Markov model based on mid-level units and the Weimar Bebop Alphabet, with statistics taken from the Weimar Jazz Database. A further ingredient is chord-scale theory to select pitches. Second, as there are several issues with Turing-like evaluation processes for generative models of jazz improvisation, we decided to conduct an exploratory online study to gain further insight while testing our algorithm in the context of a variety of human generated solos by eminent masters, jazz students, and non-professionals in various performance renditions. Results show that jazz experts (64.4% accuracy) but not non-experts (41.7% accuracy) are able to distinguish the computer-generated solos amongst a set of real solos, but with a large margin of error. The type of rendition is crucial when assessing artificial jazz solos because expressive and performative aspects (timbre, articulation, micro-timing and band-soloist interaction) seem to be equally if not more important than the syntactical (tone) content. Furthermore, the level of expertise of the solo performer does matter, as solos by non-professional humans were on average rated worse than the algorithmic ones. Accordingly, we found indications that assessments of origin of a solo are partly driven by aesthetic judgments. We propose three possible strategies to install a reliable evaluation process to mitigate some of the inherent problems.