Findability and Accessibility of historical (1610-1980) Raw Sunspot Numbers
This BRAIN 2.0 (https://www.belspo.be/belspo/brain2-be/project_p2_en.stm#2023) project was selected mid-2022.
Our project is centered on historical sunspot collections, a heritage of national and international origin, on which WDC SILSO deploys its expertise within an international network of collaborators.
Context: Visual sunspot observations go back to the beginning of the seventeenth century (1610) and form the basis of the only long-term indicator of solar activity: the international sunspot number. The farther back we go in time, the harder it is to find and understand sunspot data. Although these data have been the focus of a series of workshops (SSN workshops, ISSI team) over the last 10 years, which led to a recalibration of the series (cf. topical issue of Solar Physics), the corresponding historical data are still largely scattered: some have been digitized (Zurich journals for the period 1610-1918, digitized at ROB between 2017 and 2019), some have been scanned but not digitized (Zurich journals between 1918 and 1980), some have been published in articles by various teams over time (with or without a link to the data), and some are still in archives or personal collections. To reconstruct this crucial long-term index, we need to gather all of these scattered data and make them easily findable and accessible for wider scientific use.
Objectives: The goal of this project is to make the identified raw sunspot data Findable and Accessible by defining common criteria, submitting them to the scientific community for validation, and including the data in an existing Virtual Observatory (VO). This standardization process aims to fill a gap: at present, experts in statistics cannot use these specific data (many gaps, little overlap between observers, sparse data, inhomogeneous quality, changes in observing techniques, etc.) without the intervention of a data expert. We will translate this solar data expertise into a set of common criteria that will be used as metadata for the historical sunspot data.
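To illustrate what "common criteria used as metadata" could look like in practice, here is a minimal Python sketch of a single raw observation record with a few validation checks. Every field name, the `validate` helper and the checks themselves are assumptions made for illustration; the actual criteria are to be defined and validated by the community during the project. The only established element is the classical Wolf formula R = k(10g + s), where g is the number of groups, s the number of individual spots and k an observer-dependent scaling factor.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class RawSunspotRecord:
    """One raw observation with illustrative metadata (field names are hypothetical)."""
    observer: str
    obs_date: date
    groups: Optional[int]  # number of sunspot groups, None if not recorded
    spots: Optional[int]   # number of individual spots, None if not recorded
    source: str            # where the value was taken from (journal, archive, ...)
    technique: str         # observing technique, free text in this sketch

    def wolf_number(self, k: float = 1.0) -> Optional[float]:
        """Classical Wolf number R = k * (10 * groups + spots), or None if incomplete."""
        if self.groups is None or self.spots is None:
            return None
        return k * (10 * self.groups + self.spots)


def validate(record: RawSunspotRecord) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed.

    The checks are examples only, not the project's validation criteria.
    """
    problems = []
    if not (date(1610, 1, 1) <= record.obs_date <= date(1980, 12, 31)):
        problems.append("date outside 1610-1980")
    for name in ("groups", "spots"):
        value = getattr(record, name)
        if value is not None and value < 0:
            problems.append(f"negative {name} count")
    if (record.groups is not None and record.spots is not None
            and record.spots < record.groups):
        # Each group contains at least one spot.
        problems.append("fewer spots than groups")
    if not record.observer.strip():
        problems.append("missing observer")
    return problems


# Made-up values, for demonstration only.
example = RawSunspotRecord("Observer A", date(1850, 3, 14), groups=3, spots=12,
                           source="made-up example", technique="unspecified")
print(example.wolf_number())  # 42.0
print(validate(example))      # []
```

Keeping the checks in a function that returns a list of problems, rather than raising on the first one, would let a data expert review all issues of a record at once.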
Methods: The project is organized around 4 axes: (1) gathering data sources, (2) processing the source medium when necessary, (3) validating data by adding homogeneous metadata, and (4) disseminating the data. (1) will rely on the knowledge acquired through the Sunspot Workshops and the ISSI team (a non-exhaustive list of sources is already available). (2) requires careful preparation of the different types of data: character recognition (Optical Character Recognition, OCR, or Handwritten Text Recognition, HTR) for the scanned sources and quality control of all other sources. (3) aims at making the data Findable through the creation of validation criteria and detailed metadata, extending methods developed for present data (through the BRAIN VALUSUN project) and past data (ISSI team), while (4) will enable Accessibility via the Virtual Observatory and the dissemination of the data and the results of the project.
Results: Apart from the reconstruction methods themselves, a homogeneous validation of all existing datasets is currently missing. Many archives are still unused precisely because they lack this added value. The output of this project will enable the whole scientific community to readily find and use all raw sunspot data, with expert knowledge included. At the end of the project, we will be able to apply existing methods to the dataset, for example gap-filling using singular value decomposition (Dudok de Wit, 2011), combining data using tied ranking (Dudok de Wit: ISSI workshop), or matrix correlations (Usoskin et al., 2016).
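As an illustration of the kind of gap-filling mentioned above, the following Python sketch fills missing values in a days-by-observers matrix by iterating a truncated singular value decomposition. It is a generic low-rank imputation scheme written for this text, not necessarily the exact algorithm of Dudok de Wit (2011); the function name `svd_gap_fill`, its parameters and the toy matrix are assumptions.

```python
import numpy as np


def svd_gap_fill(data, rank=1, n_iter=200, tol=1e-8):
    """Fill NaN gaps in a 2-D array (rows: days, columns: observers).

    Gaps are initialised with column means, then repeatedly replaced by the
    values of a rank-`rank` SVD approximation until the fill stops changing.
    Assumes every column contains at least one observed value.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    col_means = np.nanmean(data, axis=0)
    filled = np.where(missing, col_means, data)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Observed entries are never modified; only the gaps are updated.
        new_filled = np.where(missing, approx, data)
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    return filled


# Toy example with made-up numbers: three observers record the same daily
# activity level, each scaled by a different personal factor.
activity = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
factors = np.array([1.0, 0.8, 1.2])
truth = np.outer(activity, factors)
observed = truth.copy()
observed[1, 2] = np.nan
observed[4, 0] = np.nan
reconstructed = svd_gap_fill(observed, rank=1)
print(np.max(np.abs(reconstructed - truth)))
```

The toy matrix is exactly rank one, so the gaps can be recovered almost perfectly; real historical records are noisy, sparse and far less regular, and the choice of rank and stopping rule would need to be validated against the data.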
Dissemination: The data gathered and homogenized through this project will be made accessible through a virtual observatory and standard access protocols to the scientific community and beyond. In addition, a call for additional data will be extended to the public in Belgium and around the world through a citizen science project (via the ISSI community).
Expected impact: Making these raw historical sunspot data collections findable and accessible will strengthen the activities led by the World Data Center SILSO at the Royal Observatory of Belgium. It will increase the visibility of this key dataset in the international community, and thereby improve the sustainability of Belgian leadership in this area and its international recognition.
Sunspot workshops (2010-2017): https://ssnworkshop.fandom.com/wiki/Home
ISSI team (2018-2020): https://www.issibern.ch/teams/sunspotnoser/
Topical issue of Solar Physics, 2016: https://link.springer.com/article/10.1007/s11207-016-1017-8
Zurich Journals, #1, 1850: http://articles.adsabs.harvard.edu/pdf/1850MiZur...1....3W
Dudok de Wit, 2011: https://arxiv.org/pdf/1107.4253.pdf, https://doi.org/10.1051/0004-6361/201117024
Usoskin et al., 2016: https://arxiv.org/abs/1512.06421, https://doi.org/10.1007/s11207-015-0838-1