Visual complexity influences human attention, memorability, and task difficulty, yet most computational estimators produce a single point value without reporting confidence. We introduce an uncertainty-aware framework that estimates both a central tendency and a calibrated interval for image complexity using only pairwise human judgments. Our approach couples a heteroskedastic pairwise aggregation model with a compact predictor that produces a mean and variance from intermediate visual features. To provide distribution-free coverage guarantees, we further wrap predictions with normalized conformal intervals. We propose an evaluation protocol that measures rank correlation with human scores alongside probabilistic calibration (coverage and sharpness), and show that uncertainty improves downstream decisions in complexity-aware compression and layout scheduling. Experiments across multiple categories demonstrate state-of-the-art correlations with human judgments while delivering well-calibrated prediction intervals. We release code, annotation interface, and splits to facilitate reproducible research.
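The normalized conformal wrapper mentioned above can be sketched with standard split conformal prediction: calibration residuals are scaled by the model's predicted standard deviation, and the resulting quantile widens or narrows each test interval according to predicted uncertainty. This is a minimal illustration under assumed inputs (arrays of predicted means and standard deviations plus held-out labels), not the paper's released code:

```python
import numpy as np

def normalized_conformal_intervals(mu_cal, sigma_cal, y_cal,
                                   mu_test, sigma_test, alpha=0.1):
    """Split conformal intervals with sigma-normalized nonconformity scores.

    mu_*, sigma_*: model-predicted means and stds; y_cal: calibration labels.
    Returns (lo, hi) arrays with marginal coverage >= 1 - alpha (exchangeability).
    """
    # Nonconformity score: absolute residual scaled by predicted std,
    # so high-uncertainty points get proportionally wider intervals.
    scores = np.abs(y_cal - mu_cal) / sigma_cal
    n = len(scores)
    # Finite-sample-corrected quantile level for the coverage guarantee.
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level, method="higher")
    return mu_test - q * sigma_test, mu_test + q * sigma_test
```

Because the score divides by the predicted standard deviation, a single calibrated quantile yields per-image interval widths that track the heteroskedastic variance estimate, which is what makes the intervals adaptive rather than uniformly wide.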