ROOSTER

class star_privateer.ROOSTER(rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)

ROOSTER object, wrapping a random forest classifier framework designed to analyse surface rotation in stellar light curves.

__init__(rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)

Initialise a new ROOSTER instance. A RotClass and a PeriodSel classifier are both created as attributes of the ROOSTER object. Additional keyword arguments provided when initialising a ROOSTER instance are passed to sklearn.ensemble.RandomForestClassifier.

Parameters:
  • rotclass_kwargs (dict) – Keyword arguments to pass to the RotClass random forest classifier. Optional, default None.

  • periodsel_kwargs (dict) – Keyword arguments to pass to the PeriodSel random forest classifier. Optional, default None.

  • **kwargs – Keyword arguments common to both random forest classifiers if trained on the fly.
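
The sketch below illustrates how the shared **kwargs and the per-classifier dictionaries could combine into the settings each random forest receives. The helper effective_rf_kwargs is hypothetical, and the precedence rule (classifier-specific values override shared ones) is an assumption: the documentation above does not state how conflicts are resolved. The keys used are ordinary sklearn.ensemble.RandomForestClassifier parameters.

```python
# Illustrative only: how shared and per-classifier settings might combine.
# ASSUMPTION: classifier-specific keyword arguments override the shared ones;
# the documentation does not state the actual precedence.


def effective_rf_kwargs(common_kwargs, specific_kwargs=None):
    """Return the keyword arguments one random forest would receive
    (hypothetical helper, not part of star_privateer)."""
    merged = dict(common_kwargs)          # start from the shared settings
    merged.update(specific_kwargs or {})  # None means "no specific settings"
    return merged


# Shared settings, which would be passed as **kwargs.
common = {"n_estimators": 300, "random_state": 42}
# Settings for one classifier only.
rotclass_kwargs = {"max_depth": 10}
periodsel_kwargs = {"n_estimators": 500}

rotclass_settings = effective_rf_kwargs(common, rotclass_kwargs)
periodsel_settings = effective_rf_kwargs(common, periodsel_kwargs)
print(rotclass_settings)
print(periodsel_settings)
```

Following the signature above, the corresponding constructor call would be ROOSTER(rotclass_kwargs=rotclass_kwargs, periodsel_kwargs=periodsel_kwargs, **common).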

analyseSet(features, p_candidates, e_p_err=None, E_p_err=None, feature_names=None, target_id=None)

Analyse provided targets using ROOSTER.

Parameters:
  • features (ndarray) – Features on which to perform the classification. Must be of shape (n, n_features).

  • p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.

  • e_p_err (ndarray) – Lower uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.

  • E_p_err (ndarray) – Upper uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.

  • feature_names (ndarray) – Feature names. Must be of shape (n_features). If the provided feature names are not consistent with the ones used to train the classifiers, an exception is raised. Optional, default None.

  • target_id (ndarray) – If provided, will be stored in memory as the last set of identifiers analysed by the classifiers. Optional, default None.

Returns:

Tuple of arrays with, in this order, the rotation score attributed by RotClass, the rotation period selected by PeriodSel, and, if e_p_err and E_p_err were provided as input, the corresponding lower and upper uncertainties on periods.

Return type:

tuple of arrays

computePeriodSelTrueAccuracy(target_id, predicted_periods, tolerance=0.1, catalog='santos-19-21')

Compute the PeriodSel true accuracy for a given sample of targets by comparing the reference period value to the value chosen by ROOSTER, with a tolerance interval.

Parameters:
  • target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.

  • predicted_periods (ndarray) – Periods predicted by PeriodSel.

  • tolerance (float) – Tolerance to consider when comparing predicted_periods to the reference periods. Optional, default 0.1.

  • catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.

Returns:

The PeriodSel classifier true accuracy.

Return type:

float
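
As an illustration of the comparison described above, the following sketch computes the fraction of targets whose predicted period agrees with a reference period. It is not the library's implementation: the helper true_accuracy is hypothetical, the reference periods are passed in directly rather than looked up in a catalog, and the tolerance is assumed to be relative, i.e. a match means |P_pred - P_ref| <= tolerance * P_ref.

```python
import numpy as np


def true_accuracy(predicted_periods, reference_periods, tolerance=0.1):
    """Fraction of targets whose predicted period matches the reference.

    ASSUMPTION: the tolerance is relative to the reference period,
    so a match means |P_pred - P_ref| <= tolerance * P_ref.
    """
    predicted = np.asarray(predicted_periods, dtype=float)
    reference = np.asarray(reference_periods, dtype=float)
    matches = np.abs(predicted - reference) <= tolerance * reference
    return float(np.mean(matches))


# Made-up periods (in days) for four targets.
reference = np.array([10.0, 25.0, 3.0, 40.0])
predicted = np.array([10.5, 12.5, 3.1, 47.0])

# Targets 1 and 3 fall inside the interval; target 2 is a half-period
# harmonic and target 4 is off by more than 10 percent.
accuracy = true_accuracy(predicted, reference, tolerance=0.1)
print(accuracy)
```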

getFeatureNames()

Get the names of the features that ROOSTER requires for classification.

getLastAnalysisInfo()

Get list of identifiers and corresponding classes obtained with the last analysis run.

Returns:

Arrays with target identifiers (might be None if they were not provided), corresponding rotation score, and index of selected period (with respect to the p_candidates array that was provided).

Return type:

tuple of arrays

getNumberEltTest()

Return a tuple of integers, corresponding to the number of elements used to test each ROOSTER classifier.

getNumberEltTrain()

Return a tuple of integers, corresponding to the number of elements used to train each ROOSTER classifier.

getScore()

Return the ROOSTER classification scores. Scores are returned in the following order: RotClassTestScore, PeriodSelTestScore. The ROOSTER instance must have been trained and tested beforehand.

getTestPeriodSelInfo()

Get list of identifiers and corresponding classes obtained when testing PeriodSel.

Returns:

Arrays with target identifiers, corresponding reference classes, and predicted classes.

Return type:

tuple of arrays

getTestRotClassInfo()

Get list of identifiers and corresponding classes obtained when testing RotClass.

Returns:

Arrays with target identifiers, corresponding reference classes, and predicted classes.

Return type:

tuple of arrays

getTrainingPeriodSelInfo()

Get list of identifiers and corresponding classes used to train PeriodSel.

Returns:

Arrays with target identifiers and corresponding reference classes.

Return type:

tuple of arrays

getTrainingRotClassInfo()

Get list of identifiers and corresponding classes used to train RotClass.

Returns:

Arrays with target identifiers and corresponding reference classes.

Return type:

tuple of arrays

save(filename)

Save the ROOSTER instance as filename.

selectParam(candidates)

Select a parameter corresponding to the rotation periods selected previously by PeriodSel. The analyseSet function must be run before using this function.

Parameters:

candidates (ndarray) – Parameters to consider to perform the selection. The array must have the same shape as the p_candidates array that was provided to analyseSet.
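
The selection can be pictured as per-row indexing: for each target, the column chosen by PeriodSel is read from an array shaped like p_candidates. The sketch below is a NumPy illustration under that assumption, not the library's code. The helper select_by_index, the amplitude array and the index values are all hypothetical; in practice the index of the selected period is the one reported by getLastAnalysisInfo.

```python
import numpy as np


def select_by_index(candidates, selected_index):
    """Pick, for each row, the column given by selected_index
    (hypothetical helper mirroring what selectParam is described to do)."""
    candidates = np.asarray(candidates)
    selected_index = np.asarray(selected_index, dtype=int)
    if (candidates.ndim != 2 or selected_index.ndim != 1
            or candidates.shape[0] != selected_index.shape[0]):
        raise ValueError("expected candidates of shape (n, n_class) "
                         "and selected_index of shape (n,)")
    rows = np.arange(candidates.shape[0])
    return candidates[rows, selected_index]


# Two targets, three methods each (made-up values).
p_candidates = np.array([[12.1, 11.8, 24.0],
                         [5.2, 5.0, 5.1]])
# Another per-method quantity with the same shape, e.g. an amplitude.
amplitude = np.array([[0.8, 0.6, 0.3],
                      [0.4, 0.9, 0.5]])
# Column selected for each target (made-up values).
selected_index = np.array([0, 1])

selected_period = select_by_index(p_candidates, selected_index)
selected_amplitude = select_by_index(amplitude, selected_index)
print(selected_period, selected_amplitude)
```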

test(target_id, p_candidates, features, catalog='santos-19-21', verbose=False, feature_names=None, e_p_err=None, E_p_err=None, tolerance=0.1)

Test ROOSTER classifiers with the provided test set.

Parameters:
  • target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.

  • p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.

  • features (ndarray) – Features on which to perform the classification. Must be of shape (n, n_features).

  • feature_names (ndarray) – Feature names. Must be of shape (n_features). Optional, default None.

  • e_p_err (ndarray) – Lower uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.

  • E_p_err (ndarray) – Upper uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.

  • catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.

  • verbose (bool) – Output verbosity. Optional, default False.

  • tolerance (float) – Tolerance to consider when checking that at least one period in p_candidates is compatible with the target reference period (therefore allowing the target to be used in the test set). Optional, default 0.1.

Returns:

Tuple of arrays with, in this order, target identifiers tested for RotClass, inferred class (rot or norot), target identifiers tested for PeriodSel, selected periods, and, if e_p_err and E_p_err were provided as input, corresponding lower and upper uncertainties on periods.

Return type:

tuple of arrays

train(target_id, p_candidates, features, feature_names=None, catalog='santos-19-21', verbose=False, tolerance=0.1)

Train ROOSTER classifiers with the provided training set.

Parameters:
  • target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.

  • p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.

  • features (ndarray) – Features on which to perform the training. Must be of shape (n, n_features).

  • feature_names (ndarray) – Feature names. Must be of shape (n_features). Optional, default None.

  • catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.

  • verbose (bool) – Output verbosity. Optional, default False.

  • tolerance (float) – Tolerance to consider when checking that at least one period in p_candidates is compatible with the target reference period (therefore allowing the target to be used in the training set). Optional, default 0.1.
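
The tolerance check can be sketched as a row-wise mask: a target is kept when at least one of its candidate periods lies within the tolerance interval around its reference period. This is an illustration under stated assumptions, not the library's code. The helper usable_for_training is hypothetical, the reference periods are supplied directly rather than read from a catalog, and the tolerance is assumed to be relative to the reference period.

```python
import numpy as np


def usable_for_training(p_candidates, reference_periods, tolerance=0.1):
    """Boolean mask of targets with at least one compatible candidate.

    ASSUMPTION: a candidate is compatible when
    |P_cand - P_ref| <= tolerance * P_ref.
    """
    p_candidates = np.asarray(p_candidates, dtype=float)
    # Reshape to (n, 1) so each reference broadcasts across its row.
    reference = np.asarray(reference_periods, dtype=float)[:, np.newaxis]
    compatible = np.abs(p_candidates - reference) <= tolerance * reference
    return compatible.any(axis=1)


# Three targets, three candidate periods each (made-up values, in days).
p_candidates = np.array([[10.2, 20.1, 9.0],
                         [3.0, 6.1, 12.0],
                         [15.0, 30.4, 7.6]])
reference = np.array([10.0, 25.0, 15.2])

mask = usable_for_training(p_candidates, reference, tolerance=0.1)
print(mask)
```

Under these assumptions the second target would be left out, since none of its candidates is close to the reference value.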

star_privateer.create_rooster_feature_inputs(df, return_err=False, candidate_names=None, candidate_lower_errors=None, candidate_upper_errors=None, priority=None, verbose=False)

Take a DataFrame created by build_catalog_features and return ready-to-use input arrays for ROOSTER training and classification.

Parameters:
  • df (pandas.DataFrame) – The dataframe created by the build_catalog_features function, containing the features that will be used to train and test ROOSTER classifiers.

  • return_err (bool) – If set to True, the uncertainties on the candidate periods will be returned by the function. Optional, default False.

  • candidate_names (list) – Names of the candidate periods to select among the features. If not provided, the default is to consider the features whose names start with prot. Note that related candidate_names, candidate_lower_errors, and candidate_upper_errors should be named with the same suffix in order to be consistently sorted. Optional, default None.

  • candidate_lower_errors (list) – Names of the lower uncertainties on the candidate periods to select among the features. If not provided, the default is to consider the features whose names start with e_prot. Optional, default None.

  • candidate_upper_errors (list) – Names of the upper uncertainties on the candidate periods to select among the features. If not provided, the default is to consider the features whose names start with E_prot. Optional, default None.

  • priority (list of strings) – In case candidate_names is None, provides the rule to follow to prioritise the methods. The code will attempt to find each provided string as a substring of the candidate names (and errors). For each string provided in priority, the first matching element in candidate_names (and related candidate_lower_errors and candidate_upper_errors) will be moved towards the beginning of the list according to the defined priority rule. Optional, default None. In this case, the code will perform the sorting operation using priority = ["ps", "cs", "acf", "ls"].

  • verbose (bool) – Output verbosity. Optional, default False.

Returns:

Tuple of arrays, including, in this order, the target identifiers, the candidate rotation periods, the lower and upper uncertainties on rotation periods (only if return_err is set to True), the feature array, and the corresponding feature names.

Return type:

tuple of arrays
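
The priority rule described for the priority parameter can be sketched in plain Python as follows. This is one possible reading of the description, not the library's implementation: the helper sort_by_priority and the column names (prot_ps, e_prot_ps, and so on) are hypothetical. It also shows why related period and uncertainty columns need matching suffixes: the same substring rule then puts them in the same order.

```python
def sort_by_priority(candidate_names, priority=("ps", "cs", "acf", "ls")):
    """Reorder names so that, for each priority string in turn, the first
    name containing it is moved to the front (hypothetical helper)."""
    remaining = list(candidate_names)
    ordered = []
    for key in priority:
        for name in remaining:
            if key in name:
                ordered.append(name)
                remaining.remove(name)
                break  # only the first match is moved for each key
    # Names that matched no priority string keep their original order.
    return ordered + remaining


# Hypothetical column names sharing the same suffixes.
names = ["prot_acf", "prot_ls", "prot_cs", "prot_ps"]
lower_errors = ["e_prot_acf", "e_prot_ls", "e_prot_cs", "e_prot_ps"]

sorted_names = sort_by_priority(names)
sorted_lower_errors = sort_by_priority(lower_errors)
print(sorted_names)
print(sorted_lower_errors)
```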

star_privateer.load_rooster_instance(filename=None, methods=None, verbose=False, seed=None, rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)

If filename is provided, load the ROOSTER instance saved under this name, otherwise train a ROOSTER instance on the fly.

Parameters:
  • filename (str or Path instance) – If provided, the ROOSTER instance saved under this name will be loaded instead of training on the fly. Optional, default None.

  • methods (list) – The list of methods considered to select the training parameters if filename is None and the ROOSTER instance is trained on the fly. Allowed methods are "WPS", "GLS", "ACF", and "CS". If not specified, the full set of parameters will be considered for the training. Note that, in any case, if "WPS" is among the listed methods, the CS parameters will correspond to the GWPS x ACF composite spectrum (otherwise, the CS parameters will be taken from the GLS x ACF composite spectrum). Optional, default None.

  • verbose (bool) – Output verbosity. Optional, default False.

  • seed (int) – Random seed to use in case ROOSTER is trained on the fly. Optional, default None.

  • rotclass_kwargs (dict) – Keyword arguments to pass to the RotClass random forest classifier if trained on the fly. Optional, default None.

  • periodsel_kwargs (dict) – Keyword arguments to pass to the PeriodSel random forest classifier if trained on the fly. Optional, default None.

  • **kwargs – Keyword arguments common to both random forest classifiers.

Returns:

The loaded ROOSTER object.

Return type:

ROOSTER instance
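
Put together, the functions documented on this page suggest a workflow along the following lines. This is a sketch based only on the signatures and return orders listed above and has not been checked against the package. The wrapper analyse_catalog and its arguments are hypothetical, and df is assumed to be a DataFrame produced by build_catalog_features. The package is imported inside the function so that the definition itself does not require star_privateer to be installed.

```python
def analyse_catalog(df, rooster_file):
    """Run a saved ROOSTER instance on a feature DataFrame
    (hypothetical wrapper, illustrative only)."""
    # Lazy import: calling this function requires star_privateer.
    import star_privateer as spr

    # Load a previously saved instance instead of training on the fly.
    chicken = spr.load_rooster_instance(filename=rooster_file)

    # With return_err=True, the uncertainties are returned between
    # the candidate periods and the features.
    (target_id, p_candidates, e_p_err, E_p_err,
     features, feature_names) = spr.create_rooster_feature_inputs(
        df, return_err=True)

    # Uncertainties on the selected period are returned because both
    # e_p_err and E_p_err are provided.
    rot_score, prot, e_prot, E_prot = chicken.analyseSet(
        features, p_candidates,
        e_p_err=e_p_err, E_p_err=E_p_err,
        feature_names=feature_names, target_id=target_id,
    )
    return target_id, rot_score, prot, e_prot, E_prot
```

Passing feature_names lets analyseSet raise an exception when the columns do not match the ones used for training, as noted in its parameter description.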