STIX Back Projection vs. AIA flare locations

A statistical examination via Gaussian Mixture Models


Back Projection images made with STIX for flares in August and September 2021 can disagree with AIA-observed locations of simultaneous events by a large amount that cannot be accounted for by spacecraft roll-angle correction. STIX and AIA had similar views of the solar disk at the time, so agreement should be good.

Many events in August and September did not have a large number of (background-subtracted) counts, making imaging difficult. However, low counts alone do not explain the disagreement: many back projection images made from such low-count data give locations in decent agreement with AIA.

Can the events be separated into populations that are easy or difficult to image (small or large difference between STIX- and AIA-derived coordinates), based only on characteristics of the flare itself, such as location and instrument counts?

Datasets

raw

  • CFL flare list from STIX website. First ~70 entries are bad.
  • Back Projection coordinate list
  • AIA events from HEK queries based on Back Projection and CFL flare list peak event times

processed

  • Roll-angle corrected Back Projection coordinates
  • All AIA coordinates rotated into the Solar Orbiter reference frame, and vice versa
  • Additional information on flare properties (e.g. GOES class), joined via HEK queries or existing entries in the CFL flare list
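
As a rough illustration of the roll-angle step, the correction can be thought of as a rotation of the image-plane coordinates about disk centre. The sketch below assumes that form; the function name `correct_roll` and the sign convention are hypothetical and would need to be checked against the pointing data actually used.

```python
import numpy as np

def correct_roll(x, y, roll_deg):
    """Rotate (x, y) about disk centre by -roll_deg.

    The sign convention is an assumption made for illustration.
    """
    theta = np.deg2rad(-roll_deg)
    c, s = np.cos(theta), np.sin(theta)
    # Standard 2D rotation; the distance from disk centre is preserved.
    return c * x - s * y, s * x + c * y

# Illustrative values only, not real flare coordinates.
x_new, y_new = correct_roll(np.array([100.0]), np.array([0.0]), roll_deg=90.0)
```

A rotation of this kind changes only the position angle of a source, not its distance from disk centre, which is one way to sanity-check the result.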

Data selection

  • get rid of bad CFL records
  • only keep entries with both Back Projection and AIA coordinates
  • only keep HEK records from SolarSoft (SSW Latest Events), as these contain the GOES class information and Flare Detective entries do not
  • discard entries where the time difference between the STIX peak time and the AIA peak time is > 1000 s

350 unique events remained
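
The cuts above could be expressed in pandas roughly as follows. The toy table, its column names, and the `select_events` helper are all illustrative assumptions, not the actual schema. The removal of bad CFL records is left out because the exact criterion is not given here.

```python
import pandas as pd

# Toy table standing in for the joined STIX/HEK event list.
# Column names and values are hypothetical.
events = pd.DataFrame({
    "stix_peak_utc": pd.to_datetime([
        "2021-08-26 10:00:00", "2021-08-26 12:00:00",
        "2021-09-01 03:00:00", "2021-09-02 08:00:00",
    ]),
    "aia_peak_utc": pd.to_datetime([
        "2021-08-26 10:05:00", "2021-08-26 12:30:00",
        "2021-09-01 03:01:00", "2021-09-02 08:02:00",
    ]),
    "bproj_x": [100.0, 250.0, None, -400.0],
    "bproj_y": [-50.0, 300.0, 20.0, 150.0],
    "hpc_x": [120.0, 240.0, 10.0, -380.0],
    "hpc_y": [-40.0, 310.0, 25.0, 160.0],
    "hek_source": ["SSW Latest Events", "SSW Latest Events",
                   "SSW Latest Events", "Flare Detective"],
})

def select_events(df, max_dt_s=1000.0):
    """Apply the selection cuts to a joined event table."""
    # Keep only rows with both a back projection and an AIA position.
    out = df.dropna(subset=["bproj_x", "bproj_y", "hpc_x", "hpc_y"])
    # Keep only SolarSoft records, since those carry the GOES class.
    out = out[out["hek_source"] == "SSW Latest Events"]
    # Drop rows whose STIX and AIA peak times differ by more than max_dt_s.
    dt = (out["stix_peak_utc"] - out["aia_peak_utc"]).dt.total_seconds().abs()
    return out[dt <= max_dt_s]

selected = select_events(events)
```

In this toy table only the first row passes all three cuts: the second fails the time cut, the third lacks a back projection coordinate, and the fourth is a Flare Detective entry.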

Features for the model

After some trial and error, the following features were selected as input to the model:

Name                 Description
Bproj_x_corrected    Roll-angle corrected flare x-coordinate, STIX POV, divided by apparent size of solar disk
Bproj_y_corrected    Roll-angle corrected flare y-coordinate, STIX POV, divided by apparent size of solar disk
Bproj_peak_binned    Peak counts in back projection image, rounded to the nearest ten
log_HEK_GOES_flux    Log10 of GOES flux as recorded by HEK/AIA
Duration (s)         Duration of STIX flare, in seconds
hpc_x                Flare x-coordinate, AIA POV
hpc_y                Flare y-coordinate, AIA POV

log_STIX_GOES_flux was also considered, but it had missing values more often than log_HEK_GOES_flux.
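
The derived features could be built along these lines. This is a sketch under assumptions: the raw column names and values are invented, and "apparent size of solar disk" is taken here to mean the apparent solar radius in arcsec, which the source does not specify.

```python
import numpy as np
import pandas as pd

# Hypothetical raw inputs; names and values are illustrative only.
raw = pd.DataFrame({
    "bproj_x_arcsec": [480.0, -960.0],
    "bproj_y_arcsec": [0.0, 240.0],
    # Apparent disk size as seen from STIX (assumed to be the radius).
    "solar_radius_arcsec": [960.0, 960.0],
    "bproj_peak": [123.0, 47.0],
    "hek_goes_flux": [1e-6, 1e-5],
})

features = pd.DataFrame({
    # Normalise positions by the apparent disk size.
    "Bproj_x_corrected": raw["bproj_x_arcsec"] / raw["solar_radius_arcsec"],
    "Bproj_y_corrected": raw["bproj_y_arcsec"] / raw["solar_radius_arcsec"],
    # Round peak counts to the nearest ten.
    "Bproj_peak_binned": (raw["bproj_peak"] / 10).round() * 10,
    # Work with the base-10 logarithm of the GOES flux.
    "log_HEK_GOES_flux": np.log10(raw["hek_goes_flux"]),
})
```

Normalising by the apparent disk size puts positions from different Solar Orbiter distances on a common scale, and the log transform keeps the wide range of GOES fluxes from dominating the fit.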

Gaussian Mixture Model
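
A minimal sketch of how such a model might be fitted, assuming scikit-learn's `GaussianMixture` (the source does not say which implementation was used). The data are synthetic stand-ins for the seven features; the standardisation step and the choice of `covariance_type` are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the 7-column feature matrix:
# four blobs of 50 points each around fixed centres.
centers = 8.0 * np.eye(4, 7)
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 7)) for c in centers])

# Put all features on a comparable scale before fitting (assumed step).
X_scaled = StandardScaler().fit_transform(X)

# Fit a 4-component mixture and assign each event to a cluster.
gmm = GaussianMixture(n_components=4, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X_scaled)

# Soft assignments: probability of each event belonging to each component.
probs = gmm.predict_proba(X_scaled)
```

Unlike a hard clustering, the mixture model also gives per-event membership probabilities, which can help spot events that sit between two clusters.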

Remarks

  • Clusters 0 and 3 together represent the majority of flares

    • The input features of these two clusters differ significantly in only one respect: their x-location on the solar disk. Most flares in cluster 3 have a negative x-coordinate as seen by AIA, while most flares in cluster 0 have a positive one. Due to the relative position of Solar Orbiter and AIA (which depends on the time of year), this corresponds to Solar Orbiter mostly seeing the flares on the opposite side of the disk from its perspective.
    • Interestingly, when features not included in the model are plotted by assigned cluster, flares in cluster 0 appear to show disagreement between the AIA and Solar Orbiter positions more often. Flares whose position is likely to be inaccurately represented by STIX back projection images might therefore come from a different population than those that are represented more accurately.
  • Cluster 1 represents long-duration flares

  • Cluster 2 represents events that happened very close together in time. This occurred on days when a large active region produced many flares of varying sizes.

Model evaluation

  • Silhouette Coefficient

    • Ranges from -1 (incorrect clustering) to +1 (highly dense clustering); values near 0 indicate overlapping clusters
  • Calinski-Harabasz Index

    • Ratio of the sum of between-cluster dispersion to the sum of within-cluster dispersion, over all clusters
    • Higher when clusters are dense and well separated
  • Davies-Bouldin Index

    • This index signifies the average ‘similarity’ between clusters, where the similarity is a measure that compares the distance between clusters with the size of the clusters themselves.
    • Zero is the lowest possible score. Values closer to zero indicate a better partition.
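
The three scores above are all available in `sklearn.metrics`. The sketch below computes them on synthetic data with known labels, purely to show the calls and the expected direction of each score; it does not reproduce the scores obtained for the flare clusters.

```python
import numpy as np
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

rng = np.random.default_rng(1)

# Two tight, well-separated synthetic clusters with known labels.
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(40, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(40, 2)),
])
labels = np.repeat([0, 1], 40)

sil = silhouette_score(X, labels)        # closer to +1 is better
ch = calinski_harabasz_score(X, labels)  # higher is better
db = davies_bouldin_score(X, labels)     # closer to 0 is better

print(f"silhouette={sil:.3f}  calinski_harabasz={ch:.1f}  davies_bouldin={db:.3f}")
```

For a clean partition like this one, the silhouette is close to +1 and the Davies-Bouldin Index is close to 0; overlapping clusters push both towards their poor ends.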

Interpretation

  • Clusters 0 and 3 are not well separated. The Silhouette Coefficient indicates that a fair amount of overlap exists, and the Davies-Bouldin Index also indicates a poor partition
  • The implication that flares in these clusters, which appear to be related to low and high values of _Bproj_AIAx/ydiff, come from different populations is therefore not strongly supported
  • If so, a test case of 3 clusters instead of 4 ought to result in 3 well-separated clusters, where flares from (former) clusters 0 and 3 are now together in a single cluster


Test case: 3 clusters instead of 4
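
A comparison of this kind could be set up as sketched below, again assuming scikit-learn. The `score_gmm` helper and the synthetic data (three blobs) are hypothetical and serve only to show how the same scores can be computed for different numbers of components.

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.mixture import GaussianMixture

def score_gmm(X, n_components, seed=0):
    """Fit a GMM with n_components and return two partition scores."""
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    labels = gmm.fit_predict(X)
    return {
        "silhouette": silhouette_score(X, labels),
        "davies_bouldin": davies_bouldin_score(X, labels),
    }

rng = np.random.default_rng(2)

# Synthetic data: three blobs of 60 points each in two dimensions.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(60, 2)) for c in centers])

# Score the 3- and 4-component fits on the same data.
results = {k: score_gmm(X, k) for k in (3, 4)}
for k, scores in results.items():
    print(k, scores)
```

If two clusters really belong to one population, merging them should raise the Silhouette Coefficient and lower the Davies-Bouldin Index relative to the fit with one more component.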

Conclusion

  • The data itself does not present any definitive reason why flare locations should disagree so much between STIX Back Projection and AIA imaging.

  • STIX Back Projection has known issues, and these should be addressed first

  • The 4-cluster GMM might provide a way to identify flares whose back projection results need closer examination. However, this requires both an initial back projection location from STIX and a location from AIA, which makes it redundant: the location difference can be checked directly
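
The direct check mentioned in the last point could look like the sketch below. It assumes both positions are already expressed in the same frame and units; the 100 arcsec threshold and all coordinate values are made up for illustration.

```python
import numpy as np

# Hypothetical STIX back projection and AIA positions (arcsec, same frame).
stix_x = np.array([100.0, -300.0, 520.0])
stix_y = np.array([-50.0, 200.0, 10.0])
aia_x = np.array([110.0, -100.0, 515.0])
aia_y = np.array([-45.0, 350.0, 20.0])

# Plane-of-sky separation between the two position estimates.
separation = np.hypot(stix_x - aia_x, stix_y - aia_y)

# Flag events whose separation exceeds an (assumed) tolerance.
threshold = 100.0
needs_review = separation > threshold
```

Here only the second event is flagged, with a separation of 250 arcsec.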