EXPLORING THE CAPACITY AND PERFORMANCE OF SUPERVISED LEARNING METHODS FOR LABEL CLASSIFICATION IN CAUSAL INFERENCE: A COMPARATIVE STUDY

dc.contributor.author: Abu Saqer, Ola
dc.date.accessioned: 2024-08-25T07:30:10Z
dc.date.available: 2024-08-25T07:30:10Z
dc.date.issued: 2024-08-08
dc.description.abstract: Discussions about machine learning are increasingly prevalent because of its predictive accuracy and its ability to handle vast amounts of data. At the same time, many relationships in life are causal, which motivates efforts to understand the cause-and-effect relationships among variables. For instance, it is crucial to understand how much a particular medicine affects an individual with an illness. While this might seem straightforward at first glance, a deeper examination reveals the complexity inherent in using machine learning for causality. Machine learning methods have made a valuable contribution to the field of causal inference because, unlike traditional approaches, they offer greater flexibility in estimating causal effects and do not require modelling hypotheses. Yet estimating causal effects when both the treatment and the outcome are binary variables remains an open research problem: machine learning has proven its ability to predict, but prediction does not imply causality. Perhaps this is the challenge for machine learning in obtaining more accurate and less biased estimates of causal effects. This study conducts a comparative analysis of supervised learning methods for label classification in causal inference. We evaluate the performance and capacity of four techniques, Causal Forest (CF), Support Vector Machine (SVM), Generalized Linear Models (GLM), and Linear Probability Models (LPM), in estimating causal effects for a categorical response variable. A simulation of a randomized controlled trial and experiments on real data were performed to evaluate the methods' performance under varying conditions, by changing the main characteristics of the data, including the sample size and the number of explanatory variables.
We focused on these four methods because of their specific advantages: Causal Forests are particularly well suited to causal inference; Support Vector Machines are recognised for their effectiveness in binary classification tasks; Generalised Linear Models are well established as optimal for modelling a binary response variable; and Linear Probability Models are used for their ability to provide predictions as probabilities. The results of the simulation study provide valuable insights into the strengths and limitations of each method in each scenario. Furthermore, the methods are able to detect heterogeneity in the real data, and it was expected that SVM, GLM and LPM would detect more heterogeneity than Causal Forest. This thesis improves our knowledge of machine learning techniques in causal inference and emphasizes the importance of carefully evaluating their performance in real-world applications.
dc.identifier.uri: https://hdl.handle.net/20.500.11888/19434
dc.language.iso: en
dc.publisher: An-Najah National University
dc.supervisor: EID, Abdelrahman
dc.title: EXPLORING THE CAPACITY AND PERFORMANCE OF SUPERVISED LEARNING METHODS FOR LABEL CLASSIFICATION IN CAUSAL INFERENCE: A COMPARATIVE STUDY
dc.title.alternative: استكشاف قدرة وأداء طرق التعلم بالشراف لتصنيف التسميات في الستدلل السببي : دراسة مقارنة
dc.type: Thesis
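
The setting described in the abstract (a simulated randomized controlled trial with a binary treatment and a binary outcome) can be illustrated with a minimal sketch. This is not the thesis's code or its simulation design: the sample size, the single covariate, the logistic outcome model, its coefficients, and the function names `simulate_rct` and `lpm_ate` are all assumptions chosen for illustration. It shows only the Linear Probability Model step, where the coefficient on the treatment indicator in an ordinary least squares fit serves as the estimate of the average treatment effect.

```python
import numpy as np


def simulate_rct(n, rng):
    """Simulate a toy RCT (hypothetical design, not taken from the thesis)."""
    x = rng.normal(size=n)            # one explanatory variable
    t = rng.integers(0, 2, size=n)    # randomized binary treatment
    # Assumed logistic outcome model; the coefficients are arbitrary.
    p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * t + 0.8 * x)))
    y = rng.binomial(1, p)            # binary outcome
    # Sample average of the true individual effects, known only in simulation.
    p1 = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * x)))
    p0 = 1.0 / (1.0 + np.exp(-(-0.5 + 0.8 * x)))
    return x, t, y, float(np.mean(p1 - p0))


def lpm_ate(x, t, y):
    """Linear Probability Model: OLS of y on [1, t, x]; return the t coefficient."""
    design = np.column_stack([np.ones_like(x), t, x])
    coef, *_ = np.linalg.lstsq(design, y.astype(float), rcond=None)
    return float(coef[1])


rng = np.random.default_rng(0)
x, t, y, true_ate = simulate_rct(20_000, rng)
lpm_estimate = lpm_ate(x, t, y)
# Unadjusted difference in mean outcomes between the two arms.
diff_in_means = float(y[t == 1].mean() - y[t == 0].mean())

print(f"true ATE (simulation): {true_ate:.3f}")
print(f"LPM estimate:          {lpm_estimate:.3f}")
print(f"difference in means:   {diff_in_means:.3f}")
```

Because treatment is randomized, the LPM coefficient and the raw difference in means should both land close to the simulated effect. Varying `n` or adding columns to the design matrix mimics, in a very reduced form, the changes in sample size and number of explanatory variables that the abstract mentions.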