Abstract
Autism Spectrum Disorder (ASD) is known to have both genetic and environmental etiology, yet the behavioral manifestation of ASD can vary drastically from person to person. Such phenotypic heterogeneity complicates both biomarker discovery and clinical diagnosis. Currently, trained clinicians perform diagnostic procedures via behavioral assessments, but knowledge about the biological mechanisms underlying ASD is lacking. Autism research has long sought categorical subtypes to better identify individuals poorly served by a binary diagnosis alone; however, these subtyping efforts have yielded unstable, non-replicable groupings and have failed to explain phenotypic heterogeneity. Categorical framing represents a fundamental oversimplification in the field: forcing discrete models on data with continuous topology yields inconclusive and misleading results. Advances in human neuroimaging, particularly structural, diffusion, and functional magnetic resonance imaging (MRI), have enabled more effective pathological characterization at levels behavioral assessments cannot reach. However, MRI data is difficult to obtain in large quantities because image acquisition is time-consuming and expensive; moreover, neuroimaging yields high-dimensional, noisy feature spaces. The confluence of these conditions leads to the “curse of dimensionality,” where the number of predictors far exceeds the number of available participants. This necessitates thoughtful modeling choices and careful feature selection and extraction, so that the resulting models remain interpretable in terms of the underlying biological mechanisms.
In this work, we explore autism as a sex-moderated, context-dependent condition through the lens of multimodal phenotypic and neural data, with particular attention to biological underpinnings and interpretability. Heterogeneity in autism manifests differently across behavioral domains, requiring topology-aware modeling strategies. Using multimodal neuroimaging, genetics, and behavioral phenotyping, we pursue three objectives. First, we apply linear discriminant analysis to executive function and repetitive behavior profiles, revealing sex-by-diagnosis interactions that differentiate autistic males from autistic females along dimensions invisible to binary diagnostic frameworks. Second, we introduce and rigorously validate an interpretable dimensionality reduction pipeline that fuses high-dimensional neurogenetic data while preserving feature traceability and biological provenance, achieving improved generalization over classification pipelines that use traditional PCA-based dimensionality reduction. Third, we demonstrate that sensory responsivity in autism does not form discrete subtypes but instead organizes along a continuous severity manifold detectable via spectral embedding. Critically, the neural correlates of this sensory gradient are context-dependent: insula-motor connectivity associations emerge only during social-sensory integration tasks and not at rest, with sex-specific decoupling patterns indicating compensatory mechanisms in autistic females. These findings have direct implications for neuroimaging protocol design, biomarker discovery, and understanding the female diagnostic gap. More broadly, this work establishes that subtyping should be treated as a testable hypothesis, not a default assumption, with analytic methods matched to the intrinsic topology of clinical data.