We proposed STALLION (STacking-based Predictor for ProkAryotic Lysine AcetyLatION), a novel predictor comprising six prokaryotic species-specific models for accurately identifying lysine acetylation (Kace) sites. To capture informative patterns around Kace sites, we employed eleven encodings representing three different characteristics. A systematic and rigorous feature selection procedure was then applied to identify the optimal feature set independently for each of five tree-based ensemble algorithms, and a baseline model was built with each algorithm for each species. Finally, the predicted values of the baseline models were used as inputs to train an appropriate classifier under the stacking strategy, yielding STALLION.
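The stacking step described above can be illustrated with a minimal sketch. It is not the STALLION implementation: the data are synthetic, the five tree-based learners and the logistic-regression meta-classifier are assumptions chosen for illustration (the text here does not name the algorithms or the final classifier), and all learners share one feature matrix rather than individually selected feature sets. It relies on scikit-learn's `StackingClassifier`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for encoded sequence windows (NOT real Kace data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical baseline learners: five tree-based ensembles picked for
# illustration only; the source does not name its algorithms here.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("bag", BaggingClassifier(n_estimators=25, random_state=0)),
]

# The meta-classifier is trained on out-of-fold predicted probabilities of
# the baseline models (cv=5), so it never sees predictions that a baseline
# model made on its own training samples.
stacked = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    stack_method="predict_proba",
    cv=5,
)
stacked.fit(X_train, y_train)

proba = stacked.predict_proba(X_test)  # columns: P(negative), P(positive)
print("held-out accuracy:", stacked.score(X_test, y_test))
```

In a species-specific setting, one such stacked model would be fitted per species on that species' own training windows.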

Example protein sequence fragments:

>Positive|P12047|K115
LLKQANDILLKDLERFVDIIK
>Positive|P12047|K162
LKLALWHEEMKRNLERFKQAK
>Positive|P12047|K398
PFRELVEAEEKITSRLSPEKI
>Negative|P17922|K785
EQTLTEEEVTKAHSKVLKALE
>Negative|Q08788|K202
KCIRDAEGWKKWAKDITFHQF
>Negative|P80879|K135
MLLAIHQNIEKHNWMLKAYLG
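As a rough illustration of how such fragments might be read, the sketch below parses the six example records and checks that each window is 21 residues long with a lysine (K) at its center, which is what the examples show. The header layout `>label|accession|Kposition` is inferred from the examples; reading the number after `K` as the residue position in the full protein is an assumption, and the names `Fragment`, `RECORD`, and `parse_fragments` are hypothetical.

```python
import re
from typing import List, NamedTuple


class Fragment(NamedTuple):
    label: str       # "Positive" or "Negative"
    accession: str   # protein accession taken from the header
    position: int    # assumed: position of the lysine in the full protein
    window: str      # sequence window around the candidate site


# The six example records, copied from the text above.
RAW = (
    ">Positive|P12047|K115 LLKQANDILLKDLERFVDIIK "
    ">Positive|P12047|K162 LKLALWHEEMKRNLERFKQAK "
    ">Positive|P12047|K398 PFRELVEAEEKITSRLSPEKI "
    ">Negative|P17922|K785 EQTLTEEEVTKAHSKVLKALE "
    ">Negative|Q08788|K202 KCIRDAEGWKKWAKDITFHQF "
    ">Negative|P80879|K135 MLLAIHQNIEKHNWMLKAYLG"
)

# Header layout inferred from the examples: >label|accession|K<position>,
# followed by whitespace (space or newline) and the residue window.
RECORD = re.compile(r">(Positive|Negative)\|(\w+)\|K(\d+)\s+([A-Z]+)")


def parse_fragments(text: str) -> List[Fragment]:
    """Return one Fragment per FASTA-like record found in `text`."""
    return [
        Fragment(label, accession, int(position), window)
        for label, accession, position, window in RECORD.findall(text)
    ]


fragments = parse_fragments(RAW)
for frag in fragments:
    center = frag.window[len(frag.window) // 2]
    print(frag.label, frag.accession, frag.position, len(frag.window), center)
```

Because the pattern accepts any whitespace between header and sequence, the same function handles records written on one line or on separate header and sequence lines.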