Abstract
The appreciation of dance, film, and other temporal art forms relies on the continuous integration of auditory and visual streams. In this study, we investigate how bimodal audiovisual preferences arise from unimodal auditory and visual preferences. To this end, we created and validated the open-resource complexity in audiovisual aesthetics stimulus set (https://osf.io/e5uh9/), consisting of 120 short, dynamic, and abstract auditory, visual, and audiovisual stimuli in which auditory and visual complexity corresponds to the number and variety of elements. In Experiment 1, 87 participants rated liking and perceived complexity for each stimulus, with visual, auditory, and audiovisual blocks fully randomized. In Experiment 2, 53 participants rated liking only, with the audiovisual block presented first, to avoid potential bias arising from prior exposure to the unimodal stimuli and from making simultaneous complexity judgments. Structural equation modeling and linear mixed-effects analysis show that liking for audiovisual stimuli can be explained by a weighted sum of liking for their auditory and visual components, modulated by audiovisual congruence. Audiovisual preferences exhibit inverted-U-shaped relationships with auditory and visual complexity, the latter mediated by perceived complexity and modulated by congruence. Our findings provide a carefully controlled starting point for better understanding the role that predicting sequential structure plays in the experience of dynamic audiovisual art forms such as dance or film.