Analysis framework for rotation period extraction (MSAP4-03)#

This notebook provide an example of the analysis performed by the PLATO MSAP4-03 submodules: results from Lomb-Scargle periodogram, autocorrelation functions and composite spectrum are used to produce a set of features exploited by an existing instance of ROOSTER to return the final rotation period of the analysed target. You will find that what is done here is very similar to the ROOSTER tutorial notebook (rooster_training_framework, you should run it before doing this tutorial), the only big difference is actually that we will use here a pre-trained ROOSTER instance !

Here, the MSAP4-01A and MSAP4-02 steps required to feed MSAP4-03 are included in order to be able to run this notebook independently from the MSAP4-01A and MSAP4-02 notebooks.

Note that, due its significant computing time, the MSAP4-01B component dedicated to background analysis is executed independently in another notebook.

Note: This notebook has been designed for the purpose of scientific justification of MSAP4-03. The notebook illustrated the precise flowchart envisaged for MSAP4-03 is cs_rooster_sph_analysis.ipynb

import star_privateer as sp

A simple example#

import importlib
import tqdm
import os
import numpy as np
import matplotlib.pyplot as plt

if not os.path.exists ('stellar_analysis_features') :
    os.mkdir ('stellar_analysis_features')
if not os.path.exists ('stellar_analysis_plots') :
    os.mkdir ('stellar_analysis_plots')

Our working case is KIC 3733735, a well-known Kepler fast rotating star.

filename = sp.get_target_filename (sp.timeseries, '003733735')
t, s, dt = sp.load_resource (filename)

The first thing we have to do is run the analysis pipeline. In particular, we can take a look at the plots made from the different analysis methods.

p_in = np.linspace (0, 100, 1000)
(p_ps, p_acf, ps, acf,
 cs, features, feature_names, _) = sp.analysis_pipeline (t, s, periods_in=p_in, figsize=(8,10),
                                                            wavelet_analysis=False, plot=True,
                                                            filename='stellar_analysis_plots/003733735.png',
                                                        )

../../_images/stellar_analysis_framework_7_0.png

We then save the results to a csv file:

fileout = 'stellar_analysis_features/003733735.csv'
df = sp.save_features (fileout, 3733735, features, feature_names)

As in the previous tutorial, let’s build a feature catalog. This is actually not required here because we are analysing only one star, but this step allows to ROOSTER-analyse several stars together with a simple framework.

df = sp.build_catalog_features ('stellar_analysis_features')

Then, let’s load the ROOSTER instance that we have trained in the previous tutorial:

chicken = sp.load_rooster_instance (filename='rooster_instances/rooster_tutorial')

As previously, let’s split the DataFrame into ROOSTER required inputs:

(target_id, p_candidates,
 e_p_candidates, E_p_candidates,
 features, feature_names) = sp.create_rooster_feature_inputs (df, return_err=True)

Here, we can see that there is actually (almost) nothing to do, as the three methods have yielded the same \(P_\mathrm{rot}\) estimate. However, we need ROOSTER to provide us with the rotation score of the target. ROOSTER will also select one of the three p_candidates as the final estimate for our target.

p_candidates

array([[2.55994471, 2.59507401, 2.49247385]])

The analyseSet function implemented in ROOSTER allows to analyse the features we extracted with the analysis pipeline. By providing feature_names, we ensure that ROOSTER was trained with the same features that those we extracted.

rotation_score, prot, e_p, E_p = chicken.analyseSet (features, p_candidates, e_p_err=e_p_candidates,
                                                     E_p_err=E_p_candidates, feature_names=feature_names)

We finally get the rotation score and the final \(P_\mathrm{rot}\). A rotation score above 0.5 means that the ROOSTER analysis favours a detection of stellar surface rotation signal.

rotation_score, prot, e_p, E_p

(array([0.82]), array([2.55994471]), array([0.0604509]), array([0.0634239]))

Analysing a PLATO simulated light curves dataset#

In order to illustrate the pipeline features described above, we can apply the pipeline to a larger dataset of 255 PLATO simulated light curves in order to check what we recover.

import plato_msap4_demonstrator_datasets.plato_sim_dataset as plato_sim_dataset

if not os.path.exists ('plato_sim_features') :
    os.mkdir ('plato_sim_features')
if not os.path.exists ('plato_sim_plots') :
    os.mkdir ('plato_sim_plots')

list_id = sp.get_list_targets (plato_sim_dataset)

Note that in the current version of the demonstrator, we apply a 55-day high-pass finite impulse response filter to the simulated light curves (preprocess) function in order to remove low-frequency systematics while preserving at most the signature of stellar activity in the data. In the future, data product calibrated specifically for MSAP4 or dedicated calibration step aimed at reducing systematics at most would allow to significantly improve the analysis performances.

for elt in tqdm.tqdm (list_id) :
    str_elt = str (elt).zfill (3)
    fileout = 'plato_sim_features/{}.csv'.format(str_elt)
    filename = sp.get_target_filename (plato_sim_dataset, str_elt, filetype='csv')
    if not os.path.exists (fileout) :
        t, s, dt = sp.load_resource (filename)
        s = sp.preprocess (t, s, cut=55)
        (p_ps, p_acf, ps, acf,
         cs, features, feature_names, _) = sp.analysis_pipeline (t, s, periods_in=p_in,
                                                                    wavelet_analysis=False, plot=True,
                                                                    filename='plato_sim_plots/{}.png'.format(str_elt),
                                                                    figsize=(10,16),
                                                                    lw=1, dpi=300, smooth_acf=True)
        df = sp.save_features (fileout, str_elt, features, feature_names)

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 255/255 [00:00<00:00, 44256.53it/s]

We can now analyse the obtained features with ROOSTER to provide our final results.

df = sp.build_catalog_features ('plato_sim_features')
(target_id, p_candidates,
 e_p_candidates, E_p_candidates,
 features, feature_names) = sp.create_rooster_feature_inputs (df, return_err=True)
rotation_score, prot, e_p, E_p = chicken.analyseSet (features, p_candidates, e_p_err=e_p_candidates,
                                                     E_p_err=E_p_candidates, feature_names=feature_names)
df

	prot_ps	prot_acf	prot_cs	e_prot_ps	E_prot_ps	e_prot_acf	E_prot_acf	e_prot_cs	E_prot_cs	sph_ps	sph_acf	sph_cs	h_ps	fa_prob_ps	hacf	gacf	hcs
target_id
0	43.384664	32.631736	29.219674	1.152496	1.216480	-1.0	-1.0	1.242817	1.242817	754.547891	747.270985	722.371710	0.243752	0.000000e+00	0.204530	0.417558	0.457579
1	33.054982	11.208262	16.348645	4.861871	6.916978	-1.0	-1.0	3.874861	3.874861	196.002432	197.573001	217.818959	0.016114	0.000000e+00	-0.154258	0.000649	0.489209
2	17.353865	18.215161	17.308824	2.362525	3.234358	-1.0	-1.0	0.424494	0.424494	130.530199	132.035535	130.541438	0.083866	0.000000e+00	0.271299	0.626426	0.889873
3	21.034988	20.923477	20.924822	2.248453	2.852080	-1.0	-1.0	0.727613	0.727613	107.469907	107.619504	107.619504	0.152170	0.000000e+00	0.552096	1.054497	0.983289
4	28.923109	7.708284	8.893554	0.722536	0.760135	-1.0	-1.0	2.896222	2.896222	156.269982	149.832641	149.472409	0.011790	4.780577e-250	-0.077873	0.002125	0.357145
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
250	31.552483	29.027592	31.219175	5.676881	8.918001	-1.0	-1.0	3.982823	3.982823	215.787517	205.336661	202.222122	0.160314	0.000000e+00	0.561387	1.081777	0.855115
251	20.416312	19.117933	20.219246	2.109279	2.665563	-1.0	-1.0	1.734646	1.734646	184.403461	175.416258	184.363934	0.160625	0.000000e+00	0.397343	0.852581	0.851365
252	36.534454	37.638648	36.899833	4.307302	5.618974	-1.0	-1.0	4.736251	4.736251	1353.429169	1370.863674	1358.401532	0.250936	0.000000e+00	0.752538	1.515209	0.986710
253	17.353865	17.416555	17.478516	3.479639	5.770656	-1.0	-1.0	2.233398	2.233398	163.173697	163.119816	163.063621	0.078761	0.000000e+00	0.332430	0.760885	0.981911
254	18.760936	18.819324	19.007517	2.696398	3.768938	-1.0	-1.0	0.604807	0.604807	1162.545259	1161.618942	1157.821920	0.253955	0.000000e+00	0.343084	0.878379	0.973676

255 rows × 17 columns

Next, let’s load the reference catalog for these simulated light curves in order to compare the results from our pipeline with what was injected in the data.

prot_ref = sp.get_prot_ref (target_id, catalog='plato-sim')
cond_0 = (rotation_score>0.5)
cond_1 = (np.abs (prot - prot_ref) < 0.1 * prot_ref)
cond_2 = (np.abs (prot - prot_ref) < 0.1 * prot_ref) & (rotation_score>0.5)
score_0 = target_id[cond_0].size / target_id.size
score_1 = target_id[cond_1].size / target_id.size
score_2 = target_id[cond_2].size / target_id.size
score_0, score_1, score_2

(0.9450980392156862, 0.6274509803921569, 0.6078431372549019)

The score computed here means that we were able to successfully detect a rotation signal and recover the correct rotation period for about 61% of the stars in the sample. We can take a look at histograms to check the rotation score of our population and to compare the input rotation periods distribution to the one we recover.

fig, (ax1, ax2) = plt.subplots (1, 2, figsize=(10, 4))

bins = np.linspace (0, 1, 20, endpoint=False)
ax1.hist (rotation_score, bins=bins, color='darkorange')
ax1.axvline (0.5, ls='--', color='blue', lw=2)
bins = np.linspace (0, 80, 20, endpoint=False)
ax2.hist (prot, bins=bins, color='darkorange')
ax2.hist (prot_ref, bins=bins, facecolor='none',
         edgecolor='black', label='Ref')

ax1.set_ylabel (r'Number of stars')
ax1.set_xlabel (r'Rotation score')
ax2.set_xlabel (r'$P_\mathrm{rot}$ (day)')

ax1.set_xlim (0, 1)
ax2.set_xlim (0, 80)

(0.0, 80.0)

../../_images/stellar_analysis_framework_32_1.png

Analysis framework for rotation period extraction (MSAP4-03)

Contents

Analysis framework for rotation period extraction (MSAP4-03)#

A simple example#

Analysing a PLATO simulated light curves dataset#