The proposed methodology is a six-phase bug-triage workflow. It begins with dataset collection from the Eclipse and Mozilla bug repositories, continues with text preprocessing and TF-IDF feature extraction, and then trains Random Forest, XGBoost, and DistilBERT classifiers. The remaining phases cover performance evaluation (Top-3 Accuracy, F1-score), explainability analysis with SHAP values, and integration into a web prototype for agile teams.
Proposed Methodology Framework
{ "headers": [ "Phase", "Key Activities", "Techniques/Tools" ], "rows": [ [ "1. Dataset Collection", "Gather bug reports from public repositories", "Eclipse and Mozilla bug repositories" ], [ "2. Preprocessing", "Clean text data and extract features", "TF-IDF vectorization" ], [ "3. Model Development", "Train and tune classifiers", "Random Forest, XGBoost, DistilBERT" ], [ "4. Evaluation", "Assess model performance", "Top-3 Accuracy, F1-score" ], [ "5. Explainability", "Analyze feature importance and model decisions", "SHAP values" ], [ "6. Prototype Integration", "Develop a deployable triage tool", "Web prototype for agile teams" ] ] }
Source: MS Research Synopsis
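The evaluation metrics in Phase 4 can be illustrated with a minimal, standard-library-only Python sketch. It assumes each model returns a ranked list of candidate assignees per bug report; the function names, the developer names, and the choice of macro averaging for F1 are illustrative assumptions, since the synopsis does not state how F1 is averaged.

```python
from collections import defaultdict


def top_k_accuracy(ranked_predictions, true_labels, k=3):
    """Fraction of reports whose true assignee appears in the top-k candidates."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def macro_f1(predicted, true_labels):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for pred, truth in zip(predicted, true_labels):
        if pred == truth:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    classes = set(predicted) | set(true_labels)
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical ranked assignee suggestions for four bug reports.
ranked = [
    ["alice", "bob", "carol"],
    ["bob", "dave", "alice"],
    ["carol", "alice", "bob"],
    ["dave", "carol", "bob"],
]
truth = ["bob", "alice", "dave", "dave"]

print(top_k_accuracy(ranked, truth, k=3))      # 3 of 4 reports hit: 0.75
print(top_k_accuracy(ranked, truth, k=1))      # 1 of 4 reports hit: 0.25
print(macro_f1([r[0] for r in ranked], truth))
```

Top-3 Accuracy is more forgiving than Top-1 because a triager only needs the correct developer to appear among the three suggestions, which is why the two scores differ on the same predictions.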
Speaker Notes
Outline the six sequential phases of the proposed methodology, emphasizing the combination of classical ML (Random Forest), gradient boosting (XGBoost), and a transformer model (DistilBERT), together with SHAP-based explainability and the web prototype aimed at agile teams.