ROOSTER
- class star_privateer.ROOSTER(rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)
ROOSTER object, wrapping a random forest classifier framework designed to analyse surface rotation in stellar light curves.
- __init__(rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)
Initialise a new ROOSTER instance. A RotClass and a PeriodSel classifier are both created as attributes of the ROOSTER object. Additional parameters provided when initialising a ROOSTER instance will be passed to sklearn.ensemble.RandomForestClassifier.
- Parameters:
rotclass_kwargs (dict) – Keyword arguments to pass to the RotClass random forest classifier. Optional, default None.
periodsel_kwargs (dict) – Keyword arguments to pass to the PeriodSel random forest classifier. Optional, default None.
**kwargs – Keyword arguments common to both random forest classifiers if trained on the fly.
- analyseSet(features, p_candidates, e_p_err=None, E_p_err=None, feature_names=None, target_id=None)
Analyse provided targets using ROOSTER.
- Parameters:
features (ndarray) – Features on which to perform the classification. Must be of shape (n, n_features).
p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.
e_p_err (ndarray) – Lower uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.
E_p_err (ndarray) – Upper uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.
feature_names (ndarray) – Feature names. Must be of shape (n_features). If the provided feature names are not consistent with the ones used to train the classifiers, an exception will be raised. Optional, default None.
target_id (ndarray) – If provided, will be stored in memory as the last set of identifiers analysed by the classifiers. Optional, default None.
- Returns:
Tuple of arrays with, in this order, the rotation score attributed by RotClass, the rotation period selected by PeriodSel, and, if e_p_err and E_p_err were provided as input, the corresponding lower and upper uncertainties on periods.
- Return type:
tuple of arrays
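The shape requirements above can be checked before calling analyseSet. The following is a minimal sketch, assuming NumPy is available; check_rooster_inputs is a hypothetical helper, not part of star_privateer. It also assumes that the uncertainty arrays share the shape of p_candidates and must be given as a pair, which is stricter than what the description above states.

```python
import numpy as np


def check_rooster_inputs(features, p_candidates, e_p_err=None, E_p_err=None):
    """Hypothetical helper (not part of star_privateer): validate the array
    shapes described for analyseSet and return (n, n_features, n_class)."""
    features = np.asarray(features)
    p_candidates = np.asarray(p_candidates)
    if features.ndim != 2 or p_candidates.ndim != 2:
        raise ValueError("features and p_candidates must be 2D arrays")
    if features.shape[0] != p_candidates.shape[0]:
        raise ValueError("features and p_candidates must describe the same n targets")
    # Assumption: uncertainties are only useful as a pair, since the selected
    # period uncertainties are returned only when both bounds are provided.
    if (e_p_err is None) != (E_p_err is None):
        raise ValueError("provide both e_p_err and E_p_err, or neither")
    if e_p_err is not None:
        for err in (e_p_err, E_p_err):
            # Assumption: one uncertainty per candidate period.
            if np.asarray(err).shape != p_candidates.shape:
                raise ValueError("uncertainties must have the same shape as p_candidates")
    n, n_features = features.shape
    n_class = p_candidates.shape[1]
    return n, n_features, n_class


# Three targets, four features, two candidate periods per target
# (all values are placeholders).
features = np.zeros((3, 4))
p_candidates = np.array([[12.1, 24.3], [5.0, 5.2], [30.4, 15.1]])
shape_summary = check_rooster_inputs(features, p_candidates)
```

Running such a check first turns a shape mismatch into an explicit error message rather than a failure inside the classifiers.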
- computePeriodSelTrueAccuracy(target_id, predicted_periods, tolerance=0.1, catalog='santos-19-21')
Compute the PeriodSel true accuracy for a given sample of targets by comparing the reference period value to the value chosen by ROOSTER, within a tolerance interval.
- Parameters:
target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.
predicted_periods (ndarray) – Periods predicted by PeriodSel.
tolerance (float) – Tolerance to consider when comparing predicted_periods to the reference periods. Optional, default 0.1.
catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.
- Returns:
The PeriodSel classifier true accuracy.
- Return type:
float
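To illustrate what a tolerance-based accuracy can look like, here is a minimal sketch in plain Python. It is not the library's implementation: true_accuracy is a hypothetical function, the reference periods are passed in directly instead of being read from a catalog, and the tolerance is assumed to be a fraction of the reference period (the description above does not say whether it is relative or absolute).

```python
def true_accuracy(predicted_periods, reference_periods, tolerance=0.1):
    """Illustrative sketch only: fraction of targets whose predicted period
    lies within a relative tolerance of the reference period.

    Assumption: tolerance is a fraction of the reference period."""
    if len(predicted_periods) != len(reference_periods):
        raise ValueError("both sequences must have the same length")
    if len(predicted_periods) == 0:
        raise ValueError("at least one target is required")
    n_correct = sum(
        abs(p_pred - p_ref) <= tolerance * p_ref
        for p_pred, p_ref in zip(predicted_periods, reference_periods)
    )
    return n_correct / len(predicted_periods)


# Hypothetical values in days: the first two predictions fall within 10% of
# the reference, the third is close to half the reference period.
predicted = [10.5, 24.0, 6.1]
reference = [10.0, 25.0, 12.0]
accuracy = true_accuracy(predicted, reference, tolerance=0.1)  # 2 of 3 match
```

In this example the third target is counted as a failure because the predicted value is near the first harmonic of the reference period, which a 10% tolerance does not cover.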
- getFeatureNames()
Get the names of the features that ROOSTER requires for classification.
- getLastAnalysisInfo()
Get list of identifiers and corresponding classes obtained with the last analysis run.
- Returns:
Arrays with target identifiers (might be None if they were not provided), corresponding rotation score, and index of selected period (with respect to the p_candidates array that was provided).
- Return type:
tuple of arrays
- getNumberEltTest()
Return a tuple of integers, corresponding to the number of elements used to test each ROOSTER classifier.
- getNumberEltTrain()
Return a tuple of integers, corresponding to the number of elements used to train each ROOSTER classifier.
- getScore()
Return the ROOSTER classification scores. Scores are returned in the following order: RotClassTestScore, PeriodSelTestScore. The ROOSTER instance must have been trained and tested beforehand.
- getTestPeriodSelInfo()
Get list of identifiers and corresponding classes obtained when testing PeriodSel.
- Returns:
Arrays with target identifiers, corresponding reference classes, and predicted classes.
- Return type:
tuple of arrays
- getTestRotClassInfo()
Get list of identifiers and corresponding classes obtained when testing RotClass.
- Returns:
Arrays with target identifiers, corresponding reference classes, and predicted classes.
- Return type:
tuple of arrays
- getTrainingPeriodSelInfo()
Get list of identifiers and corresponding classes used to train PeriodSel.
- Returns:
Arrays with target identifiers and corresponding reference classes.
- Return type:
tuple of arrays
- getTrainingRotClassInfo()
Get list of identifiers and corresponding classes used to train RotClass.
- Returns:
Arrays with target identifiers and corresponding reference classes.
- Return type:
tuple of arrays
- save(filename)
Save the ROOSTER instance as filename.
- selectParam(candidates)
Select a parameter corresponding to the rotation periods selected previously by PeriodSel. The analyseSet function must be run before using this function.
- Parameters:
candidates (ndarray) – Parameters to consider to perform the selection. The array must have the same shape as the p_candidates array that was provided to analyseSet.
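Since the last analysis keeps, for each target, the index of the selected period with respect to p_candidates, selecting a companion parameter can be pictured as row-wise indexing. The sketch below assumes NumPy and assumes that the selection reduces to this indexing; select_by_index and the peak_heights array are hypothetical and only stand in for the idea.

```python
import numpy as np


def select_by_index(candidates, selected_index):
    """Hypothetical stand-in for the selection step: pick, for each target,
    the column that was chosen for the rotation period."""
    candidates = np.asarray(candidates)
    selected_index = np.asarray(selected_index)
    if candidates.ndim != 2 or selected_index.shape != (candidates.shape[0],):
        raise ValueError("expected (n, n_class) candidates and n indices")
    rows = np.arange(candidates.shape[0])
    # Fancy indexing pairs row i with column selected_index[i].
    return candidates[rows, selected_index]


# Hypothetical example: one value per method and per target, laid out like
# p_candidates, and the per-target index of the selected period.
peak_heights = np.array([[0.9, 0.4, 0.1],
                         [0.2, 0.7, 0.3]])
selected_index = np.array([0, 1])
selected_heights = select_by_index(peak_heights, selected_index)
```

This is why the candidates array needs the same shape as p_candidates: the stored indices are only meaningful if the columns refer to the same methods in the same order.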
- test(target_id, p_candidates, features, catalog='santos-19-21', verbose=False, feature_names=None, e_p_err=None, E_p_err=None, tolerance=0.1)
Test ROOSTER classifiers with the provided test set.
- Parameters:
target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.
p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.
features (ndarray) – Features on which to perform the classification. Must be of shape (n, n_features).
feature_names (ndarray) – Feature names. Must be of shape (n_features). Optional, default None.
e_p_err (ndarray) – Lower uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.
E_p_err (ndarray) – Upper uncertainties on periods from p_candidates. Selected period uncertainties will be provided as output if e_p_err and E_p_err are provided. Optional, default None.
catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.
verbose (bool) – Output verbosity. Optional, default False.
tolerance (float) – Tolerance to consider when checking that at least one period in p_candidates is compatible with the target reference period (so that the target can be used in the test set). Optional, default 0.1.
- Returns:
Tuple of arrays with, in this order, target identifiers tested for RotClass, inferred class (rot or norot), target identifiers tested for PeriodSel, selected periods, and, if e_p_err and E_p_err were provided as input, corresponding lower and upper uncertainties on periods.
- Return type:
tuple of arrays
- train(target_id, p_candidates, features, feature_names=None, catalog='santos-19-21', verbose=False, tolerance=0.1)
Train ROOSTER classifiers with the provided training set.
- Parameters:
target_id (ndarray) – Identifiers of the n targets for which parameters are provided. Non-unique identifiers are allowed.
p_candidates (ndarray) – Candidate periods to be recovered by PeriodSel. Must be of shape (n, n_class) where n_class is the number of methods used to provide rotation period candidates for each target.
features (ndarray) – Features on which to perform the training. Must be of shape (n, n_features).
feature_names (ndarray) – Feature names. Must be of shape (n_features). Optional, default None.
catalog (str) – Catalog to consider for the reference rotation period value of each target. Optional, default santos-19-21.
verbose (bool) – Output verbosity. Optional, default False.
tolerance (float) – Tolerance to consider when checking that at least one period in p_candidates is compatible with the target reference period (so that the target can be used in the training set). Optional, default 0.1.
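The tolerance check described for train and test can be sketched as a boolean mask over targets. This is an illustration under stated assumptions, not the library's code: has_compatible_candidate is a hypothetical function, the reference periods are passed in directly instead of being read from a catalog, and "compatible" is assumed to mean within a relative tolerance of the reference period.

```python
import numpy as np


def has_compatible_candidate(p_candidates, p_ref, tolerance=0.1):
    """Sketch: flag each target that has at least one candidate period within
    a relative tolerance of its reference period (assumed criterion)."""
    p_candidates = np.asarray(p_candidates, dtype=float)
    # Reshape to (n, 1) so each reference broadcasts across its n_class candidates.
    p_ref = np.asarray(p_ref, dtype=float)[:, np.newaxis]
    compatible = np.abs(p_candidates - p_ref) <= tolerance * p_ref
    return compatible.any(axis=1)


# Hypothetical periods in days for three targets and two methods.
p_candidates = np.array([[10.2, 20.5],
                         [3.0, 6.5],
                         [15.0, 14.0]])
p_ref = np.array([10.0, 12.0, 14.5])
usable = has_compatible_candidate(p_candidates, p_ref)
```

Under these assumptions the second target would be left out, because none of its candidates is close to the reference value, so no candidate could serve as the correct class for it.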
- star_privateer.create_rooster_feature_inputs(df, return_err=False, candidate_names=None, candidate_lower_errors=None, candidate_upper_errors=None, priority=None, verbose=False)
Take a DataFrame created by build_catalog_features and return ready-to-use input arrays for ROOSTER training and classification.
- Parameters:
df (pandas.DataFrame) – The dataframe created by the build_catalog_features function, containing the features that will be used to train and test ROOSTER classifiers.
return_err (bool) – If set to True, the uncertainties on the candidate periods will be returned by the function. Optional, default False.
candidate_names (list) – Names of the candidate periods to select among the features. If not provided, the default will be to consider features with names starting with prot. Note that related candidate_names, candidate_lower_errors, and candidate_upper_errors should be named with the same suffix in order to be consistently sorted. Optional, default None.
candidate_lower_errors (list) – Names of the lower uncertainties on the candidate periods to select among the features. If not provided, the default will be to consider features with names starting with e_prot. Optional, default None.
candidate_upper_errors (list) – Names of the upper uncertainties on the candidate periods to select among the features. If not provided, the default will be to consider features with names starting with E_prot. Optional, default None.
priority (list of strings) – In case candidate_names is None, provides the rule to follow to prioritise the methods. The code will attempt to find each provided string as a substring of the candidate names (and errors). For each string provided in priority, the first matching element in candidate_names (and related candidate_lower_errors and candidate_upper_errors) will be moved towards the beginning of the list according to the defined priority rule. Optional, default None. In this case, the code will perform the sorting operation using priority = ["ps", "cs", "acf", "ls"].
verbose (bool) – Output verbosity. Optional, default False.
- Returns:
Tuple of arrays, including, in this order, the target identifiers, the candidate rotation periods, the lower and upper uncertainties on rotation periods (only if return_err is set to True), the training feature arrays and the corresponding feature names.
- Return type:
tuple of arrays
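The priority rule can be pictured with a short plain-Python sketch. It is an illustrative reimplementation of the behaviour described above, not the library's code: sort_by_priority and the column names (prot_acf, prot_ps, and so on) are hypothetical, chosen only so that each one contains a single priority substring.

```python
def sort_by_priority(candidate_names, priority=("ps", "cs", "acf", "ls")):
    """Illustrative sketch: for each priority string, move the first name
    containing it to the front, in priority order. Unmatched names keep
    their original relative order at the end."""
    remaining = list(candidate_names)
    ordered = []
    for key in priority:
        for name in remaining:
            if key in name:
                ordered.append(name)
                remaining.remove(name)
                break  # only the first match is moved for each priority string
    return ordered + remaining


# Hypothetical column names following the documented prot / e_prot prefixes.
names = ["prot_acf", "prot_cs", "prot_ls", "prot_ps", "prot_other"]
lower_errors = ["e_prot_acf", "e_prot_cs", "e_prot_ls", "e_prot_ps"]

sorted_names = sort_by_priority(names)
# -> ["prot_ps", "prot_cs", "prot_acf", "prot_ls", "prot_other"]
sorted_lower_errors = sort_by_priority(lower_errors)
# -> ["e_prot_ps", "e_prot_cs", "e_prot_acf", "e_prot_ls"]
```

Because the periods and their uncertainties share the same suffixes, applying the same rule to each list keeps the columns aligned, which is the reason for the naming recommendation given for candidate_names.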
- star_privateer.load_rooster_instance(filename=None, methods=None, verbose=False, seed=None, rotclass_kwargs=None, periodsel_kwargs=None, **kwargs)
If filename is provided, load the ROOSTER instance saved under this name, otherwise train a ROOSTER instance on the fly.
- Parameters:
filename (str or Path instance) – If provided, the ROOSTER instance saved under this name will be loaded instead of training on the fly. Optional, default None.
methods (list) – The list of methods considered to select the training parameters if filename is None and the ROOSTER instance is trained on the fly. Allowed methods are "WPS", "GLS", "ACF", and "CS". If not specified, the full set of parameters will be considered for the training. Note that, in any case, if "WPS" is among the listed methods, the CS parameters will correspond to the GWPS x ACF composite spectrum (otherwise, the CS parameters will be taken from the GLS x ACF composite spectrum). Optional, default None.
verbose (bool) – Output verbosity. Optional, default False.
seed (int) – Random seed to use in case ROOSTER is trained on the fly. Optional, default None.
rotclass_kwargs (dict) – Keyword arguments to pass to the RotClass random forest classifier if trained on the fly. Optional, default None.
periodsel_kwargs (dict) – Keyword arguments to pass to the PeriodSel random forest classifier if trained on the fly. Optional, default None.
**kwargs – Keyword arguments common to both random forest classifiers.
- Returns:
The loaded ROOSTER object.
- Return type:
ROOSTER instance
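Putting the pieces together, a load-then-analyse workflow might look like the sketch below. It relies only on the signatures and return orders documented on this page; classify_catalog and its sp_module argument are hypothetical, the latter existing only so that the sketch can be exercised with a stand-in object when star_privateer is not installed.

```python
def classify_catalog(df, filename, sp_module=None):
    """Sketch of a workflow: load a saved ROOSTER instance, build its inputs
    from a features DataFrame, and analyse the targets.

    df is assumed to be a DataFrame created by build_catalog_features and
    filename the path of a previously saved ROOSTER instance."""
    if sp_module is None:
        # Imported lazily: requires star_privateer to be installed.
        import star_privateer as sp_module

    rooster = sp_module.load_rooster_instance(filename=filename)
    # With return_err left to False, four arrays are returned, in this order.
    target_id, p_candidates, features, feature_names = (
        sp_module.create_rooster_feature_inputs(df)
    )
    # Without uncertainties, analyseSet returns the rotation score and the
    # selected rotation period.
    rot_score, prot = rooster.analyseSet(
        features, p_candidates, feature_names=feature_names, target_id=target_id
    )
    return target_id, rot_score, prot
```

Passing feature_names lets ROOSTER raise an exception if the columns do not match the features used for training, instead of classifying on misaligned inputs.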