An-Najah National University
Faculty of Graduate Studies

Mapping the Strategies of Evaluation as Employed by the English Language Faculty Instructors at the Palestinian Universities & Higher Education Institutions

By
Ghada Hamdan

Supervisor
Dr. Nidal Jayousi

This Thesis is Submitted in Partial Fulfillment of the Requirements for the Master's Degree in Methods of Teaching English Language, Faculty of Graduate Studies, An-Najah National University, Nablus, Palestine.
2017

This thesis was defended successfully on 15/1/2017 and approved by:
Defense Committee Members / Signature
Dr. Nidal Jayousi / Supervisor ....................
Dr. Anwar Abd-Alraziq / External Examiner ....................
Dr. Suzaane Arafat / Internal Examiner ....................
Dr. Soheil Salha / Internal Examiner ....................

Dedication
To my parents, my family, my friends, my teachers and my students.

Acknowledgement
My utmost gratitude is due to Allah before and after. I would like to express my deepest gratitude and respect to my supervisor, Dr. Nidal Jayousi, for his inexhaustible patience, inspiring experience and tactful guidance. I would also like to express my warmest gratitude to my former professors at the English Methodology Department and the English Department at An-Najah University: Dr. Suzan Arafat, Dr. Ahmed Awad, Dr. Sameer El-Essa, Dr. Soheil Salha and others who never hesitated to help or waited to be thanked. Special thanks are due to Dr. Mohammad Ateya from Al-Aqsa University in Gaza for his meticulous suggestions and constructive criticism. Special thanks are also due to Dr. Anwar Abd-Alrazeq from Birzeit University for his valuable comments. Finally, I would like to express my deepest gratitude to my family and my friends who stood by my side in times of need.
Declaration
I, the undersigned, the author of the thesis entitled "Mapping the Strategies of Evaluation as Employed by the English Language Faculty Instructors at the Palestinian Universities & Higher Education Institutions", declare that the content of this thesis is the product of my own effort, except where otherwise referenced, and that this thesis as a whole, or any part of it, has not previously been submitted for any degree or scientific research at any other educational or research institution.
The work provided in this thesis, unless otherwise referenced, is the researcher's own work, and has not been submitted elsewhere for any other degree or qualification.
Student's Name: Ghada Hamdan
Signature: ....................
Date: 15/01/2017

Table of Contents
Dedication
Acknowledgement
Declaration
Table of Contents
List of Tables
List of Appendices
Abstract
Chapter One: Introduction and Theoretical Background
1.1 The Theoretical Background of the Study
1.2 The Statement of the Problem
1.3 The Objectives of the Study
1.4 The Significance of the Study
1.5 The Questions of the Study
1.6 The Hypotheses of the Study
1.7 The Limitations of the Study
1.8 The Operational Definitions
1.9 Summary
Chapter Two: Review of Related Literature
2.1 Introduction
2.2 Aspects of Conventional Instructional Practices
2.3 Aspects of Conventional Evaluation Practices
2.4 Faculty Members & Students: Conflicting Perceptions
2.5 Selection of Evaluation Tool
2.6 The Effects of Demographic Variables
2.7 Conclusion
2.8 Summary
Chapter Three: Methodology
3.1 Introduction
3.2 Methodology
3.3 Population of the Study
3.4 Samples of the Study
3.5 Instruments of the Study
3.6 Validity of the Study
3.7 Reliability of the Study
3.8 Procedures of the Study
3.9 Questions of the Study
3.10 Hypotheses of the Study
3.11 Variables of the Study
3.12 Statistical Analysis
3.13 Ethical Issues
3.14 Summary
Chapter Four: Results
4.1 Introduction
4.2 Results Related to the Faculty's Questionnaire
4.3 Results Related to the Multiple-Choice Question
4.4 Results Related to the First Sub-Question
4.5 Results Related to the Main Question
4.6 Results Related to the Hypotheses of the Study
4.7 Results Related to the Students' Questionnaire
4.7.1 Results Related to the First Sub-Question
4.7.2 Results Related to the Second Sub-Question
4.8 Results Related to the Hypotheses of the Study
4.9 Summary
Chapter Five: Results, Conclusions & Recommendations
5.1 Introduction
5.2 Discussion of the Research Results
5.2.1 The Main Practices in Instruction & Evaluation
5.2.2 The Underlying Institutional Practices
5.2.3 Evaluation Between Beliefs & Practices
5.2.4 Total Score of All Domains
5.3 Faculty's Preferable Evaluation Practices
5.4 Discussion of the Results Related to Faculty's Hypotheses
5.5 Faculty's Practices as Perceived by Students
A- Faculty's Instructional Practices as Perceived by Students
B- Faculty's Evaluation Practices as Perceived by Students
C- Total Score of All Domains
5.6 Discussion of the Results Related to the Students' Hypotheses
A- Discussion of the Results Related to the First Hypothesis
B- Discussion of the Results Related to the Second Hypothesis
5.7 Summary
5.8 Conclusions
Recommendations
References
Appendices
Abstract (in Arabic)

List of Tables
Table (1) Distribution of sample according to gender
Table (2) Distribution of sample according to academic qualifications
Table (3) Distribution of sample according to professional experience
Table (4) Distribution of sample according to age
Table (5) Distribution of sample according to pedagogical interest
Table (6) Distribution of students' sample according to university type
Table (7) Distribution of sample according to major
Table (8) Reliability coefficients of domains and the total scores
Table (9-14) Frequencies and percentages of responses to the multiple-choice question
Table (15) Means, standard deviations, percentages and levels of instruction practices among instructors
Table (16) Means, standard deviations, percentages and levels of evaluation practices among instructors
Table (17) Total degrees of the instructional and evaluation domains
Table (18) Frequency of instructors' evaluation preferences
Table (19) T-Test for independent samples due to gender
Table (20) T-Test for independent samples due to academic qualification
Table (21) Frequencies, means, and standard deviations due to professional experience
Table (22) One-Way ANOVA to test the differences due to professional experience
Table (23) Scheffe post hoc results and total degree due to professional experience
Table (24) Frequencies, means, and standard deviations due to age
Table (25) One-Way ANOVA to test the differences due to age
Table (26) Scheffe post hoc results and total degree due to age
Table (27) Frequencies, means, and standard deviations of the instruction and evaluation practices due to interest in modern pedagogy
Table (28) One-Way ANOVA to test the differences due to interest in modern pedagogy
Table (29) Means, standard deviations, percentages and levels of students' views about instruction practices
Table (30) Means, standard deviations, percentages and levels of students' views about evaluation practices
Table (31) Means, standard deviations, percentages and levels of students' views about instruction, evaluation practices and total degree
Table (32) T-Test for independent samples of the students' views about instruction and evaluation practices due to university
Table (33) Frequencies, means, and standard deviations of the students' views about instruction and evaluation practices due to major
Table (34) One-Way ANOVA to test the differences due to major
Table (35) Scheffe post hoc results

List of Appendices
Appendix A Questionnaire Validity Committee
Appendix B Faculty Members' Questionnaire
Appendix C Students' Questionnaire

Abstract
This study aimed at mapping the common evaluation practices employed by the English language faculty members at all Palestinian universities in the West Bank and Gaza. The study investigated the faculty members' preferences among various evaluation tools. Along with the evaluation practices, the underlying instructional practices were explored to trace their effect on evaluation practices. In addition, students' views about faculty members' instructional and evaluation practices were surveyed in order to recognize students' rights and significant roles in the evaluation process. The study examined the effects of the following variables on the instructors' practices: professional qualifications, experience, gender, age and interest in modern pedagogy.
On the students' side, the variables of the major discipline of English (English Language & Literature, Translation or TEFL) and the type of university (public or private) were examined. This study was conducted at all Palestinian universities in the West Bank and Gaza in the academic year 2015-2016. The sample of the study is a stratified random sample. It consisted of 166 instructors and 400 students from the two populations. The instructors' sample constitutes 75.4% and the students' sample 26.6% of their respective populations. Two questionnaires were distributed: one for faculty members at all universities, and the other for majors of English at An-Najah National University in Nablus and at the Arab American University in Jenin. The results suggest that conventional practices in lecturing and testing are common among Palestinian faculty members. Concerning the preferences among evaluation tools, formal testing is the faculty members' highest-rated choice. There are significant differences among the faculty members attributed to academic qualifications, experience, age and gender, but no significant differences are attributed to the faculty members' interest in modern pedagogy. Majors of English, Translation and TEFL in both universities have moderate views regarding their instructors' performance in instruction and evaluation. However, there are significant differences among students attributed to the type of university, in favor of the private university, and to the major discipline, in favor of the Translation major. In the light of the results of the study, faculty members are recommended to reconsider their practices, embrace training and experiment with modern evaluation pedagogies in order to tailor practices to students' needs.
They are invited to strike a balance between institutional restrictions and students' best interests, open channels of communication with students and listen to their suggestions and criticism, involve them in the evaluation process and establish an enlightened assessment culture at the English department that can bring together academics' efforts to respond to the highly diverse educational needs of foreign language teaching.

Chapter One
Introduction and Theoretical Background
1.1. Introduction:
Higher education has a leading role in helping to build a modern country by providing better qualified generations in various fields. However, in general, higher education in the Arab World faces many problems that hinder its ability to face up to the challenges of modern times. Al-Rashdan (2010) pointed out a number of problems and challenges, especially the lack of academic freedom. He described how the current situation of higher education affects instructors and students, as the faculty's roles are reduced to those of information providers and students are not encouraged to think critically or analytically. Two main manifestations of this situation are the deeply rooted traditional type of lecturing as a common instructional practice and traditional testing as the main evaluation practice. Given the development in learning-teaching pedagogies, a teacher-based practice like lecturing has become more debatable. It is usually simply planned (and technologically assisted), aiming mostly at presenting information, explaining concepts and modeling thinking. In addition, lecturing usually sets limited time for discussion and occasional questions from students at the end of the class, and it is criticized for its controversial attention span and its inadequacy for changing values or teaching behavioral skills (Bligh, 2000; Bates, 2015). Further debate concerns traditional lecturing versus active learning. Weiman (2014) asserted that traditional lecturing has become indefensible.
He argued that instructors' justifications for lecturing, such as teaching large classes, workload and content coverage, are usually presented without any experimenting with alternatives. Rahman (2011) indicated that the value of lecturing depends on the instructor's specific objectives. If the aim is to communicate information, lecturing is reasonably efficient; if it is meant to develop critical thinking and problem-solving skills, discussions and other active learning strategies are more effective according to modern research. However, McKeachie and Svinicki (2006) discussed other purposes of lecturing, such as compiling updated material from a variety of sources and adapting it to students' interests, as well as helping students discover models of thinking and key concepts. Nevertheless, the researcher would like to add that the priority for teaching has changed from transmitting and organizing knowledge to generating knowledge and using high-order thinking. Proponents of lecturing have attempted to use modern learning theories to avoid the criticism against the inherent disadvantages of traditional lecturing. Snell and Steinert (1999) discussed how interactive lecturing involves an increased interchange between teachers, students and the lecture content to promote active learning practices. This is usually achieved through the use of modern technology. However, the researcher argues that this form of lecturing is more of a byproduct of active learning. Still, it is not a sufficient equivalent to active learning, which emphasizes students' roles and individualized learning. The quality of instruction is determined by whether it utilizes high-order thinking and leads to better learning rather than teaching to the test. As testing puts pressure on students and teachers, students' efforts are channeled into cramming and instructional time is focused on preparing for the test.
Teaching to the test narrows down the curriculum, minimizes students' creativity and undermines the faculty's professional autonomy. The quality of instruction is determined by whether it addresses students' needs and realities by providing authentic tasks, intrinsic motivation, engagement and high-order thinking (Gardner, 1993). Brown (2009), Race (2010) and Fautley & Savage (2008) confirmed that evaluation is a systematic process overlapping with and inseparable from instruction and learning. It is a valuable tool providing answers to all stakeholders on essential questions. This is how it becomes an inseparable part of instruction: it aims at measuring visible evidence of learning and skills more directly than when the learner communicates them with pen and paper, as in summative and traditional evaluation. In modern pedagogies, better and different strategies are called for to draw attention to the 21st century's demands for different learning and training, and to respond to the global call for educational reform based on more enlightened theories of learning and teaching, in the wake of the growing dissatisfaction with the deeply rooted practices in instruction and evaluation. Views about assessment and testing have started to change more radically since the early 1990s. Wiggins (1990) and later Brown (2009) confirmed that assessment and testing are not synonymous. Brown defined tests as basic formal and institutional procedures set at certain intervals in the curriculum, in which learners are expected to do their best to demonstrate how much knowledge they have attained. In the current study, the evaluation practices used by Palestinian faculty members aim at arriving at a formal grade to be given to students. They focus on endorsing a summative, final judgment based mainly on testing and grading achievement. Sometimes other course components (such as discussion, cooperation and attendance) are included.
In contrast, the term 'assessment' is used in modern pedagogy to refer to a more holistic, formative, continuous, learner-based and outcome-based process. It is an interactive process between students and faculty that informs them about the progress of learning and teaching (Angelo & Cross, 1993). Within the constructivist learning theory, meaningful evidence of learning is sought for assessment. Reeves (2006) confirmed this by criticizing the evaluation of college students' teaching. He confirmed that in an authentic learning environment, assessment is based on observations of students' engagement and analysis of learning products rather than on just one method. Effective assessment requires the critical analysis of multiple forms of evidence that learning outcomes have been attained. Robinson (2010), Shihadeh (2009), Mustafa (2010) and Al-Absi (2010) pointed out that evaluation has become a testing activity only, with traditional tests and conventional question formats as the most common methods of evaluation. These types of tests most often measure information recall unless the educator is extremely skilled in test construction. This kind of evaluation is still separated from the learning process and set at the end of instruction. The reasons why teachers seek these forms of assessment are quite predictable: these tests are usually handy, easy to grade and formal (Robinson, 2010). Thus, instruction has become a tool for testing, or teaching to the test. Shihadeh (2009) described the stressful effect of this insufficient form of evaluation on both students and instructors. Students are encouraged to score higher grades. Instructors, as testing is the most common practice for evaluation, usually feel under pressure to make students pass tests. Shihadeh asserted the need for more holistic, quality learning that is visible in learning outcomes and not only in test results.
The view of knowledge in traditional testing and lecturing places most value on 'knowing that', whereas 'knowing how' can be difficult both to teach and to assess through pencil-and-paper means. The end-of-unit test practice originated from a view of the role of education which treats learners as 'empty vessels'; thus, the role of the teacher is to fill in as much knowledge as possible. Conventional evaluation then calculates what has been filled in, from the perspective of how full of knowledge the learner has become (Fautley & Savage, 2008, p.7). One more disadvantage of conventional testing is that it undermines chances of distinction in teaching (Hoffman, Assaf, and Paris, 2001). Researchers such as Hoffman et al. (2001), Serra, Gómez and Sáiz (2014) and Aquino, Ramos and Nolasco (2015) indicated that faculty members highly regarded assessment as a vital tool to enhance learning and to promote students' development. However, in striking contrast, the faculty considered students' participation in evaluation unnecessary. Another striking contrast the aforementioned researchers confirmed was that the most frequent tool, used by 100% of faculty, was written tests and quizzes. Seemingly, instructors found pedagogical value in using written tests and quizzes as their main evaluation methods. Besides the workload that leads instructors to summative assessment, they explained this by noting that in many universities written tests are the most formally acknowledged evaluation practice. Nevertheless, Pellegrino, Chudowsky and Glaser (2001) argued that conventional tests can still do a practical job, such as testing facts and concepts. However, they admitted that these facts and concepts are limited sections of the curriculum which fail to probe the depth of learning. As a result, current testing practices cannot be relied on for important decision making. Moreover, they have shortcomings in identifying critical differences in high-order thinking among students.
However, active academic professionalism stresses that the faculty's practices have to be filtered. Eraut (1994) and Brookfield (1987) discussed the requirements for reflective practice and college teaching professionalism. They identified the powerful values of critical, reflective, self-monitoring practitioners who can wisely invest the insights of academic life, and they called for making use of students' evaluation of faculty, in addition to institutional evaluation, action research, peer engagement and the use of scholarly literature. However, research has identified certain obstacles that can deter instructors' attempts at monitoring and managing their practices, such as inadequate or absent training in the demanding process of assessment and in seeking new forms of instruction and assessment (Hills, 1991; Sullivan & Chalnick, 1991). Naturally, training can help faculty manage time and course plans, as well as assist teachers in choosing appropriate formats to assess different achievement targets that suit course objectives and instruction (Stiggins, 1992; Fink, 2003; Sabagh and Saroyani, 2014). One important aspect of the faculty's tendencies is adherence to teacher-based practices even after training in professional development (Samuelowicz and Bain, 2001; Woodbury & Gress-Newsome, 2002; Fung and Chow, 2002; Ebert-May, Derting, Hodder, Momsen, Long and Jardeleza, 2011). According to these studies, instructors demonstrated very good theoretical and pedagogical knowledge, but it was rarely reflected in classroom practices. Assessment culture, administrative and institutional restrictions, self-efficacy, attitudes, academic qualifications, lack of training and of interest in modern pedagogies, lack of motivation for research and workload can be behind this tendency. Other reasons are related to working conditions, collegial relations, students' competences and departmental policies (Gess-Newsome, Southerland and Johnston, 2003).
In this theoretical background, it is worth discussing the assessment culture, since it is an important factor in shaping the administrative and institutional policy of universities in general. Concerning the effect of the spreading quality assurance culture on higher education, Lamine (2010) pointed out how superficially and traditionally the procedures are dealt with in the Arab countries. He confirmed that there is no significant impact on public education institutions, since quality assurance criteria are not integrated into university life or its management. According to Diyen (2010), Haywood (2010) and Hutching (2010), when an assessment culture is institutionally fostered, faculty are more likely to be positively involved. Hence, the lack of an assessment-oriented institutional policy is often cited as a primary obstacle to the faculty's genuine involvement in the work of assessment. Looking at the students' side, perceptions of the faculty's practices in instruction and evaluation are important components of the teaching-learning process. Exploring students' attitudes towards instructional and evaluation practices arises from the fact that students' perceptions can affect all levels of the education hierarchy: they set objectives for instructors, clarify standards for students, modify instructional designs, provide valuable feedback, monitor progress and assess and evaluate performance (Herman, Aschbacher and Winters, 1992). According to Robinson (2010), students are like customers, so it is essential that teachers consider and seek their satisfaction by exploring attitudes, identifying needs and obtaining valuable feedback from students as one part of balanced assessment. Further, regarding the students' and the faculty members' perceptions, research has provided good evidence that faculty members and students tend to have contrasting views about instruction and evaluation practices (Brown, 2006; Rashidi and Moghdam, 2014).
One significant indicator of the importance of students' perceptions of classroom practices is their experiences with different modes of assessment. Sambell, McDowell and Brown (1997) investigated students' attitudes when experiencing different modes of assessment and the particular effects of these modes on their learning. The researchers reported that students often had negative attitudes when they discussed traditional assessment because they considered that it might have a negative effect on their learning. In contrast, when students were exposed to new forms of assessment, they demonstrated quite dramatic attitudes. This introduction has presented the main issues raised in this mapping study and their overlapping and inseparable relations in the diverse field of English language teaching in higher education.

1.2. The Statement of the Problem:
Peterson and Einarson (2001), Race (2010) and Rust (2007) indicated that despite the plethora of research in educational pedagogies, a review of the literature shows relatively little research on faculty members' perceptions of their instructional and assessment practices. McLellan (2001), Knight (2002), Carless (2006) and Rust (2007) considered the current situation of assessment in higher education complicated and confused due to the heavy demands on it. Furthermore, the faculty's practices and falling standards are increasingly criticized, especially in higher education in different parts of the world. As Pellegrino et al. (2001) and Brown (2009) observed, instructors' practices are embedded in social and administrative structures which they consider difficult to change, especially in testing practices. It is noticeable that there is an overuse of conventional testing in English language teaching in most Palestinian higher education institutions.
The time has come to identify the evaluation strategies in our universities and seek more valid and inclusive forms of assessment which can provide more reliable evidence of students' competences and skills.

1.3. Objectives of the Study:
This study aims at mapping the common evaluation practices among the faculty members of English in Palestinian higher education institutions. Other objectives of this study are:
1- Identifying the evaluation preferences among the faculty members.
2- Exploring the common instructional practices which underlie these evaluation practices.
3- Examining the rationale behind utilizing the common evaluation practices.
4- Defining the existing variances attributed to age, gender, qualifications and years of academic experience as well as interest in modern pedagogy.
5- Exploring English majors' perceptions of the current underlying instructional and evaluation practices.

1.4. Significance of the Study:
There have been few studies in the field of the faculty's practices, not only locally but also globally, since more attention has been given to school teachers' practices. The significance of exploring evaluation practices arises from the fact that it can shed light on evaluation as a very significant area in the teaching process, as well as help to explore performances, perceptions and trends among the faculty members. Once it is known where English faculty members stand in their evaluation practices, their performance is expected to improve as they experiment with new techniques of evaluation. Consequently, the students' competences are expected to develop further as they are trained to do more meaningful tasks and demonstrate evidence of their learning that is not available in pen-and-paper tests. This study sheds more light on the current evaluation practices in English in higher education institutions and also on majors' perceptions of and general satisfaction with these practices.
It points out exactly where the faculty members stand in relation to the growing interest in and attention given to assessment and evaluation by global educational circles.

1.5. Questions of the Study:
The main questions of the study are:
1- What are the most common evaluation practices utilized by the English language faculty members at the Palestinian higher education institutions?
2- What are the faculty members' preferences among evaluation practices?
In the light of the two major questions above, the researcher considers the following sub-questions:
1- What are the most common instructional practices underlying the evaluation practices?
2- Are there any significant differences in evaluation practices among the faculty members attributed to the variables of academic qualifications, professional experience, age, gender and interest in modern pedagogy?
3- How do majors of English at An-Najah National University (ANU) in Nablus and the Arab American University (AAU) in Jenin perceive the instructional practices underlying the evaluation practices?
4- How do majors of English at An-Najah National University in Nablus and the Arab American University in Jenin perceive the evaluation practices?
5- Are there any significant differences in the perceptions of evaluation practices among majors of English at An-Najah National University and at the Arab American University in Jenin attributed to the major discipline (English Language & Literature, Translation and TEFL) or to the type of university (public or private)?

1.6. Hypotheses of the Study:
The study examines the following hypotheses:
1- There are no statistically significant differences at (α≤0.05) among the faculty members in evaluation practices attributed to academic qualifications, years of experience, gender, age and interest in modern pedagogy.
2- There are no statistically significant differences at (α≤0.05) among the majors of English Language & Literature, TEFL and Translation at An-Najah National University (ANU) and the Arab American University (AAU) in the perceptions of the evaluation practices attributed to the type of university (public or private).
3- There are no statistically significant differences at (α≤0.05) among the majors of English Language & Literature, TEFL and Translation at ANU and AAU in the perceptions of the evaluation practices attributed to the major discipline variable.

1.7. The Limitations of the Study:
This study has the following limitations:
1. Locative limitations: the populations of the study consist of the English language faculty members at Palestinian universities and majors of English at ANU & AAU in Palestine.
2. Temporal limitations: the study is carried out in the academic year 2015-2016.
3. Human limitations: the populations of the study consist of the faculty members and majors of English at ANU & AAU in Palestine.
4. Topical limitations: this study aims at mapping evaluation practices at higher education institutions in Palestine.

1.8. Operational Definitions:
1- Evaluation: in this study, evaluation is a summative, judgmental and test-based process conducted by instructors in order to arrive at an official grade or score. It is used consistently in this study to refer to Palestinian faculty members' testing practices.
2- Evaluation practices: a set of repetitive procedures taken by a faculty instructor to deliver a summative grade to students.
3- Instructional practices: a set of repetitive procedures taken by faculty to teach English.
4- English faculty member: the instructor or teacher of English in the English department of a higher education institution (the term is used consistently in the study regardless of whether the instructor holds a master's or a doctoral degree).
5- A major of English: the student who studies English Language & Literature, TEFL or Translation.
6- Assessment: various techniques that educators use to evaluate, measure, document and follow up on students’ learning progress and skill acquisition. In modern pedagogy, these methods are designed to give students more opportunities to learn and to engage in more authentic tasks and critical thinking than paper-and-pencil tests allow.
7- Conventional evaluation: testing students to measure how much they know. The most commonly used tools are traditional tests which contain different types of questions such as multiple-choice, fill-ins, matching, short essays, sentence completions, short answers, true-or-false statements and definitions. These kinds of questions call for short, definite answers.
8- Lecturing: a teacher-based instructional practice that aims at delivering more information and covering more content. It inherently limits student engagement.
9- Perceptions: instructors’ or students’ feelings and views about their experiences regarding learning and teaching that can be reflected in their behavior and choices.
10- Major discipline: these are English Language & Literature, Translation and Methods of Teaching English (or TEFL). At Palestinian universities, the TEFL major belongs to the Faculty of Education unless it is a minor, which can be taught in the English Department in the Humanities Faculty.
1.9. Summary:
This theoretical introduction draws attention to the current conventional practices in instruction and evaluation in the Arab World universities. It also casts more light on assessment culture and its restriction of academic freedom in evaluation as well as the inadequacy of traditional lecturing and testing in meeting the challenges and demands of teaching in the 21st century. The integration of instruction and evaluation in modern pedagogies is confirmed due to its vital effect on the learning-teaching process.
In the light of this assumption, the instructional practices are explored in parallel with evaluation practices to confirm their interconnectivity and overlapping effect on each other. Another important issue in the introduction is the distinction between assessment and evaluation. According to the constructive theory, for learning to be assessed, it should have more visible evidence generated from cognitive processes. In conventional evaluation, learning and teaching are narrowed down for testing purposes. Further, faculty’s perceptions and responses are investigated to explore the reasons behind certain teacher-based practices. In addition, the theoretical background discusses the gap between faculty’s beliefs and practices. Faculty members usually attribute their choices to departmental instructions, class size, workload, and the levels of students’ competences. However, research provides evidence that testing is still their main choice in evaluation. The introduction above stresses the need for reflective practice and active professionalism which can monitor habitual practices and survive the challenges of the teaching career. Furthermore, this introduction emphasizes students’ rights to have their views integrated in the assessment process since their contributions are considered valuable feedback.
Chapter Two
Review of Related Literature
2.1. Introduction:
The foreign language teaching and learning process is complicated and multi-faceted, as several factors affect teaching and learning. As this is a mapping study, certain evaluation practices will be discussed as well as their underlying instructional practices. Below is a survey of what the educational literature says about several of faculty members’ practices in instruction and in evaluation in particular. Faculty’s and students’ perceptions as well as faculty members’ favored assessment tools are also explored.
This review highlights some key points which are relevant to the main questions of the study.
2.2. Aspects of Conventional Instructional Practices:
Traditional lecturing is expected to continue to be the main practice almost all over the world, given the pressures higher education is facing due to economic demands (Bligh, 2000; Bates, 2015), especially in many developing countries (Khan & Akbar, 1997). It is considered to be the best method to teach large numbers of students and consequently lowers costs (Moore, 1996). However, the inherent limitations of lecturing jeopardize deeper learning by making students passive listeners and dependent on one source of knowledge (Grunwald & Peterson, 2003).
Fink (2003) and Sabagh and Saroyani (2014) criticized higher education pedagogies and their justifications since teaching facts and concepts is prioritized over developing intellect and values. They stressed the fact that high-order thinking skills are widely assumed to be at the core of college education. However, the literature has also indicated some students’ preference for traditional learning because of poor competences, lack of training and lack of self-confidence (Struyven et al., 2008).
Orata (1999) and Bligh (2000) confirmed in their studies the effect of class size on traditional classroom practices. The researchers indicated that, as class size increases, lectures tend to focus more heavily on the transmission of information rather than on clarification and discussion. In turn, as numbers increase, faculty members resort to more conventional testing and limited feedback. Workload is another factor working against quality. Faculty members tend to spend less time on preparation and consequently seek quicker and easier methods of evaluation.
Compared to active learning, traditional lecturing is considered ineffective based on pedagogical considerations drawn from cognitive theory (Hansen & Stephens, 2000; Sullivan, 2002; Berry, Chen and Honig, 2008).
It is not very effective in promoting high-order thinking and can suppress learners’ creativity, encourage passivity, give limited feedback and neglect individual differences and motor skills (Killen, 2007; Moore, 1996). In the same context, research has confirmed that content coverage is still a high priority for faculty members and one of the faculty’s self-declared reasons for using traditional lecturing (Cooper and Robinson, 2000).
Concerning the role of PowerPoint presentations in traditional lecturing to cover more content, Robinson (2010) and Bates (2015) argued that PPT presentations are usually loaded with a large amount of content to be presented in a short period of time. They raised the question of the quantity of learning versus its quality. The traditional use of PPT presentations is less student-centered. Modern pedagogical research calls for smarter use of rich media, but not as a cosmetic means. In conclusion, any instructional practice that minimizes the learner’s roles is expected to be ineffective even when utilizing rich media.
Dependence on conventional lecturing can affect the specification of language skills and the time given to authentic and meaningful language activities and tasks because better linguistic processing depends on both input and output. Carefully structured classroom activities can make foreign language learners attempt to generate better output. The need for interaction in the classroom context is best met by asking learners to perform tasks that require both oral and written language (Krashen, 1982; Skehan, 1998; Swain, 1995; Ellis, 2001). Findings in Umbach and Wawrzynski’s research (2005) and Vo’s (2010) are in line with this approach. These researchers discussed the results of national research data collected from thousands of respondents (both students and instructors). The findings suggested that students were more attracted to non-traditional activities used by instructors in different educational institutions.
Another aspect of traditional instruction is limiting students’ chances to give oral presentations in the classroom for fear of plagiarism and of consuming course time. King (2002) discussed the merits of having undergraduates design oral presentations, such as developing real communication, integrating language skills, enhancing team work and activating students in their own learning. Zovkovic (2014) stressed that English language instruction should assist students in developing these communicative presenting skills. For this reason, she called for considering oral presentations an important part of language teaching.
Similarly, written assignments and research papers are not given the necessary attention. Andrews (2003) and Badke (2014) tackled the writing skills of language majors as a manifestation of language in the argumentative and persuasive styles. Badke pointed out how students lack basic research skills in addition to critical thinking and writing skills. Although Badke admitted that writing is a demanding process, he confirmed that instructors do not teach sufficient and consistent research or writing skills. Rafidi’s research (2013) at Birzeit University drew attention to a more student-centered method of developing English majors’ writing skills. The findings indicated students’ preference for cooperative learning. When infused with critical thinking strategies, it effectively promoted critical thinking and progress in writing in English.
Another implication of traditional lecturing is the lack of interest in, and time devoted to, asking questions and classroom discussion. Effective questions and discussions are other tools that can be used to foster a thoughtful environment and enrich thinking in the classroom. Felder (1994) and Pennell (2000) warned against the way faculty members use questions, especially the ‘any question?’ practice at the end of the presentation.
They called upon instructors to utilize questions through all the stages of the classroom presentations as an integrated part of the course plan (Cashin, 199; Nilson, 2010).
The various aforementioned practices in instruction have been tackled in the current study. The literature review provides good evidence about the issues raised by this study and their inherent relation to conventional teaching, especially lecturing. The literature above confirms how conventional lecturing creates a teacher-based model and minimizes students’ roles in demonstrating oral and written language. It also discusses the justifications behind these models, such as faculty members’ workload, class size and content coverage concerns.
2.3. Aspects of Conventional Evaluation Practices:
Assessment and evaluation are complicated processes which depend on a variety of strategies, practices and procedures to reach a judgment or a measurement (Kwako, 2003). Since the main aim of this study is to map the common evaluation practices of English language faculty members, more focus will be given to conventional or traditional testing as the most pervasive practice for summative evaluation utilized by English language faculty members.
Assessing learning is an integral part of the learning process; therefore, this connection should be well-defined. Herman et al. (1992) used cognitive learning theory as a basis for the discussion of instruction and assessment. Cognitive learning emphasizes the generation of knowledge and individualized learning experiences through developing critical thinking skills, discussing new ideas, encouraging diverse thinking and managing individual learning differences. In the light of this theory, traditional testing has to be evaluated. Rudner (1991) and Meisels (1993) asserted that conventional testing neglects the vital cognitive processes since it only focuses on getting the right answer.
As it mainly emphasizes the acquisition of simple facts and low-level thinking, it fosters superficial memorization and grade-based achievement.
As might be expected, there is plenty of educational literature that has concentrated on the negative sides of conventional testing. In highly evaluative situations, foreign language testing anxiety is more detectable. The studies conducted by Vogel and Collins (2002), Kassim, Hanafi and Hancock (2008) and Huberty (2009) explored different aspects of tests such as the negative attitude towards instruction, especially when the test involves content that was not taught in class. However, there is also evidence that moderate, reasonable and natural test anxiety leads to better performance. Herrera et al. (2007), Aydin (2007) and Yahya (2013) criticized teaching to the test and the narrowing down of instruction and curriculum by conventional testing.
Regarding students’ views, Sambell et al. (1997) reported how students criticized the effects of conventional assessment on their learning because it depends more on information recall. Sambell et al. stated that conventional testing, from the students’ perspective, is unfair and inaccurate because it is a one-shot attempt depending on last-minute cramming. Similar results were found by the action research project conducted by Waters et al. (2004), who addressed the effect of non-traditional testing on students and their assessment preferences. They found that most students preferred the new forms of assessment to be flexible in giving students more choices and chances for better learning and decision making.
In a significant Australian case study, Campell (2008) reported students’ negativity towards evaluation practices and called for educational reform and reconsideration of the current evaluation practices. He confirmed that assessment is a powerful, far-reaching tool which influences the quality of higher education.
In a similar study, Scot (2006) conducted a large national Australian study which included hundreds of courses to investigate key elements of the teaching-learning process. Students criticized the current practices of rote learning, conventional testing, low-level thinking skills and lack of authenticity.
However, despite their weaknesses, classroom tests are more flexible than standardized tests since they are usually designed around the curriculum. Although they are conventional tests, standardized tests are given more attention in the literature since they are broadly used as a main tool for decision making in students’ admission to college. They are also economical, since they can be given to large numbers of students at lower costs. Further, standardized tests are considered more statistically reliable (Mathison, 1997; Gasporro, 1997; Franklin, 2002).
Attempting to put the current evaluation practices in a less negative perspective, Kheir Allah (1998) and Almojahed (2006) called for more flexibility and expansion in constructing tests. They considered test construction more of a cooperative process with students as main stakeholders. Hence, constructing tests becomes a learning experience, not an exclusively one-sided process or an administrative obligation with students at the receiving end.
Studies on the effect of administrative and organizational structure and its cultural effect on assessment are few (Whitchurch, 2006; Ashwin, Ylänne, Trigwell and Nevgi, 2006; Hutchings, 2010; Haywood Shaw and Laird, 2010; Diyen, 2010). These studies confirmed that pedagogical principles are not the only factors that can influence and shape assessment. Other complex administrative and organizational forces have to be taken into consideration. Knight’s study (2002) in three colleges at a British university indicated that lecturers tended to follow the imposed existing assessment practices.
In addition, workload was another important factor found to influence lecturers’ decisions not to undertake time-consuming or innovative assessment and marking tasks.
Despite the hindering forces in administration and organizational hierarchy, researchers believe that poor testing cultures and other obstacles can still be addressed through reflective practice and active professionalism. Studies conducted by Eraut (1994), Harris (1998) and Kreber (2009) identified the powerful value of critical, reflective self-monitoring practices using the insights of academic life, peer engagement and research. The primary benefit of reflective practice for teachers is a deeper understanding of practices and striving for greater effectiveness. It can verify teachers’ beliefs and challenge traditional practices.
The review above supports the assumptions of the current study. It highlights the cognitive theory approach in utilizing knowledge and activating mental processes through high-order thinking. It also supports the current study design in exploring instruction and its interconnection with conventional evaluation, which is test-based. It also explores students’ attitudes towards testing as it is currently applied.
2.4. Faculty Members & Students: Conflicting Perceptions:
Investigating how faculty perceive their practices is academically and professionally rewarding since faculty can check and compare their practices against current research and their colleagues’ beliefs. In addition, they can understand what their students expect from them and consequently develop better pedagogical techniques, decision making and reflective practices (Pajares, 1992).
Concerning how faculty members perceive their performance, Eison (2010) reported findings from extensive workshop experience with faculty members.
He pointed out that most instructors think of themselves as being very good lecturers, especially in using lectures to transmit information (Lacy & Sheehan, 1997; Noordin, 2009; Toker, 2011). However, investigating teachers’ assessment practices revealed that they were not well prepared to meet the demands of classroom assessment due to inadequate training. Researchers reported that teachers are not always qualified to choose appropriate formats. Research also explained that the time constraints the teachers complained about (which prevent them from experimenting with new tools of assessment) are a result of a lack of training in pursuing new forms of assessment (Hills, 1991; Sullivan, 2002; Stiggins, 2004).
Likewise, Musawy’s study (2009) explored teachers’ and students’ perceptions of classroom assessment in a higher education institute in Afghanistan. The majority of the students involved in the study criticized the weakness of the traditional methods which were dominant in this institution, although teachers favored the summative achievement tests. Additionally, the study indicated that the faculty members had not attended any workshops or courses about classroom assessment; they just relied on their own experiences.
Herrera, Murry & Cabral (2007), Sambell et al. (1997), Campell (2008), Waters et al. (2004), Gayton (2007) and Kvale (2007) reported how students criticized the effects of conventional assessment on their learning because it depends more on information recall. From the students’ perspective, conventional testing lacks fairness and accuracy, as discussed earlier in this chapter.
In general, limited research has been done on English faculty members’ beliefs and practices (Sullivan, 2002; Woods, 1996; Borg, 2003, 2006; TALIS, 2009).
However, available research could detect a noticeable gap between beliefs and practices in faculty’s performance (Gómez and Sáiz, 2014; Aquino et al., 2015; Ebert-May et al., 2011; Woodbury & Gress-Newsome, 2002; Samuel and Bain, 2002; Norton et al., 2005; Rieg & Wilson, 2009). While a few authors report a positive connection between faculty’s beliefs and practices, others conclude that there is no direct link (Wilcox-Herzog, 2002). It is also important to look at the impact of professional background, type of training, qualifications and professional development, major discipline and length of experience on faculty’s beliefs, practices and attitudes. It is important to note that any of these relationships can have a different effect.
Again, looking at the students’ side, several studies have been conducted to determine if there are differences between teachers’ and students’ perceptions of the teaching-learning process. McCollin (2000) and Cothran and Ward (2000) are among the researchers who reported results about the discrepancy between faculty’s perceptions and students’ perceptions. In general, faculty members tend to consider students’ evaluation of their performance biased and immature (Douglas & Douglas, 2006; Theall and Franklin, 2001). However, advocates of students’ rights to evaluate their instructors consider students the target group most influenced by the teaching practices of their professors. They consider it a learning experience for students to develop a clearer conception of teaching that will in turn contribute to their learning. Consequently, it is essential that teachers be receptive to students’ feedback (Williams & Burden, 1997; Birenbaum, 1996; Kwan, 1999; Cotterall, 1999; Davis, 2009; Shishavan & Sadeghi, 2009).
Other researchers have also looked at the discrepancy from another perspective.
In several studies (Horwitz, 1990; Kern, 1995; Moore, 1996; Schulz, 1996; Kikuchi, 2005; Brown, 2006; Shishavan & Sadeghi, 2009; Rashidi & Moghdam, 2014), there was a significant negative difference between teachers’ beliefs about their performance and students’ satisfaction with them. While teachers think highly of their practices in the classroom, students are not always satisfied with them.
In contrast with the instructors’ generally high perception of their performance, research provides evidence that they tend to view students less positively in terms of levels of academic competences. Cherif et al. (2011) conducted a study about the reasons behind students’ failure in college. According to (68%) of faculty, many students come to college with poor academic backgrounds to the extent that they need remedial or developmental classes in at least one necessary discipline before taking college courses. In the same context, the Higher Education Research Institute (2004-2005) conducted a national study with a sample of (40,670) faculty members at (421) colleges and universities of all types. Overall, (41%) of the faculty believed that most of the students they teach lack the basic skills needed for college-level work. By contrast, findings from the institute’s results showed that (70%) of college students rated themselves as above average.
Hechinger Report (2011) and Spaights, Kenner and Dixon (2010) examined students’ perception of their academic self-image (in contrast with their instructors’ general opinions about them). According to a study from the University of Wisconsin, the findings in general highlighted the positive academic self-image students had. In a national American study conducted by the Higher Education Research Institute in (2005), (70%) of college students rated themselves as above average, whereas only (36%) of faculty considered students to be well-prepared.
A similar study was that of Salli-Copur (2008), who explored the academic self-concept of Turkish English graduates over 4 years. Findings revealed that the graduates perceived themselves to be competent; however, they expressed their need for more practice.
According to Marton and Sajlo (1997), Drew (2001), Fredericks (2005) and Mostafa (2010), effective evaluation is an authentic, continuous and collaborative process between students and teachers. Hence, students can start to develop individual responsibility and self-monitoring. Drew (2011) indicated students’ needs for clear assessment and feedback. His findings indicated that students prefer individual and written feedback (although they are aware of their instructors’ workload). He stressed that students’ motivations and orientations influence the ways in which they perceive and act upon their understanding of assessment. McGivney (1996) highlighted more details of students’ needs for feedback. He indicated that they need rapid and regular feedback as well as specific instructions to improve and guide their work. He also indicated students’ needs for clear explanations of the grading system, practice in examination techniques and discussion of answers. Similar findings were reported in other studies (Seedhouse, 2001; Zacharias, 2007; Abu Shawish and Abd Al-Raheem, 2015).
The review above supports the current study’s approach in pursuing students’ perceptions and views regarding their instructors’ performance. It supports the validity of the issues raised about students’ academic self-image, their need for individual feedback from their instructors and their attitude towards evaluation in general. This review also provides evidence of various gaps between students and their instructors, which supports the approach of this study in seeking students’ views and feedback.
2.5. Selection of Evaluation Tools:
There are several factors which affect the selection and design of the evaluation tool.
However, culture is almost always one of the most influential ones. Other factors include a lack of formal training in assessment options, time constraints, academic competences and administrative restrictions.
Research has indicated that instructors tend to find traditional testing very handy even after receiving training or being free to select among assessment tools. Traditional testing question formats are the primary form of assessment in higher education (Kvale, 2007). This kind of testing is relatively easy to design, administer and score, in addition to its measurement of explicit learning and its institutional approval (Norton et al., 2006).
Ebert-May et al. (2011) reported the findings of a year-long professional development training program to help faculty move from teacher-centered to learner-centered teaching in undergraduate programs. The professional training was given to instructors over a long period of time to test how learner-centered the teaching would become and how compatible the instructors’ reported practices were with the feedback given by independent observers of their performance. The majority of faculty (75%) used lecture-based and teacher-centered pedagogy, showing a clear disconnection between faculty’s perceptions of their teaching and their classroom practices.
In a similar study, Aquino et al. (2015) investigated faculty’s perceptions, skills, and practices of assessment in undergraduate programs. The sample consisted of (90) professors and instructors having postgraduate ranks. Faculty’s self-reported views and responses indicated that they highly regarded assessment as a major tool to enhance learning, promote students’ development and assign grades. These major findings were similar to those of studies conducted by Serra, Gómez and Sáiz (2014).
They indicated that faculty regarded student learning as important and that they were also confident of their skills to carry out an assessment for that purpose. However, it was reported that faculty felt students’ participation in such evaluation was not necessary. The most frequent tool, used by (100%) of faculty, was written tests and quizzes. Another important finding was that all respondents admitted that there are university guidelines that affect their practices to a very significant extent. This major finding was also indicated by Grunewald & Peterson (2003). Furthermore, faculty members expressed their need for more training in classroom assessment. They revealed that their workloads and administrative duties affect their time for preparing assessment and directly lead to the handier summative testing. These findings confirm that written tests are part of the university evaluation culture.
This review supports the effect of assessment culture in defining assessment tools and provides more evidence about faculty members’ gap between theory and practice. It shows how deeply rooted and resistant to change the testing culture is.
2.6. The Effects of Faculty’s & Students’ Demographic Variables:
Concerning the variables that might affect faculty members’ views and practices, this study has explored some conventional and non-conventional variables, namely academic qualifications, experience, interest in modern pedagogy, gender and age. However, it is important to bear in mind that studies of the effects of demographic variables have produced varied and sometimes contradictory results. In general, the studies related to gender have produced inconclusive results, but most have shown that this variable has little or no impact on faculty performance (Marsh Arreola, 2000; Theall & Franklin, 2001; Algozzine et al., 2004).
Norton et al. (2005) conducted a study in the UK to explore the influence of gender, pedagogical training, years of experience and institutional and department culture on the beliefs and practices of faculty members. The findings indicated that the department culture has a greater influence on practices than on beliefs. It was found that the length of teaching experience and pedagogical training have no significant influence on practices. However, concerning the gender of the instructor, it was found that females tend to be more receptive to modifying their practices than males. In a similar study, Al-Thimiri & Hamdi (2015) explored instructional and evaluation strategies. The findings of the study indicated that the faculty members’ perception of evaluation standards was high among respondents, followed by standards related to teaching strategies. The findings showed that there were no statistically significant differences in evaluation in all domains attributed to gender, academic rank and the academic experience of the faculty member.
Regarding the variables of age and length of experience, although they are not usually included in faculty’s demographics, Coffery & Gibs (2002) and Kreber (2005) noted that years of teaching experience can still play a role in reflective experience and so can improve learning outcomes. Al-Qaffas and Al-Farahati (2011) found that more experienced teachers with more educational qualifications tended to be more interested in evaluating different learning domains and following up on students’ progress.
The current study explores the effect of certain unconventional variables that might influence students’ beliefs and perceptions, such as the type of university (private or public) in addition to the major discipline (Translation, TEFL or English Language & Literature).
Concerning the university variable, research has shown that private universities usually show a slight advantage in students’ perceptions and level of satisfaction. Jones (2003) and Telford & Masson (2005) reported the influence of quality assurance in higher education as one main reason, since it has become a focus of attention for private universities. Choi’s study (2013) in a Malaysian private college yielded similar results on the effect of lecturers’ competencies on students’ satisfaction. Mazumder (2013) conducted a study to explore students’ perceptions at chosen public and private universities in Bangladesh. Only five universities responded to the research questionnaire because the other universities were uncooperative. This may indicate that most of the higher education institutions do not necessarily consider students’ satisfaction a priority. It was found that students at private universities have higher satisfaction levels than students at public universities. However, Naidu and Derani’s (2016) comparison between private and public universities in Malaysia showed less significant differences between the two types of universities.
Concerning the effect of discipline, some studies have shown that Humanities students regard their instructors more positively than students in the social sciences or science faculties (Neumann, 2001; Franklin & Theall 1995; Scarboro, 2012). In the Palestinian context, Essa and Naqa (2009) conducted a study about students’ views regarding faculty competences in lecturing, classroom activities and methods of evaluation. The instructional competences ranked slightly higher than the average evaluation practices related to testing, grading and classroom activities. There were no statistical differences related to students’ variables of gender, major or levels. Similar results are found in Khader and Shaat’s study (2010).
Finally, it is worth mentioning that the call for changing practices in the teaching-learning process, set against the adherence to traditional practices, must have created some confusion and reservation among faculty members in different parts of the world. Assessment reform has been inconsistently applied by instructors in different parts of the world (Dassa, 1990; Gipps, 1994). In general, even where changes have been introduced and assisted by training, or where assessment reform is directly introduced into teaching programs, the pace of change is slow because it is still difficult for teachers to change practices which are closely embedded within the culture around them (Shepard, 1995).

2.7. Conclusion:

First, it is important to bear in mind the peculiarities of foreign language teaching and the distinctive roles foreign language teachers have. Foreign language teaching is regarded as more complex and varied than the teaching of other subjects. Its methodologies are considered more progressive than those of other subjects and, consequently, English language teachers need to be more up to date to cope with the advanced and progressive nature of language teaching methodology (Borg, 2006). For teachers, this is a necessary step towards further professional development. In addition, language teaching is always in need of new ideas and successful practices.

2.8. Summary:

The previous review probably conveys a more realistic picture of the practices associated with foreign language teaching in higher education. The review highlights the main aspects of the conventional practices in instruction and evaluation. It draws attention to some justifications behind these practices, such as the institutional assessment culture, economic considerations, content coverage and lack of pedagogical training.
The conventional practices minimize students’ chances of demonstrating their learning because they involve lower cognitive processes, less engagement and limited chances to demonstrate language competences in oral and written activities. Conversely, assessment which supports the learning process as well as the products of learning tends to be more satisfactory and sufficient, and it goes beyond the limited results of conventional testing. Instead of rote learning and the teaching of basic facts, students are trained to practice problem-solving, open-ended questions and more authentic tasks that can generate more personalized and genuine learning. The review also highlights the role of active professionalism and reflective practice in improving and monitoring habitual practices. In addition, the review highlights the conflicting views of faculty members and students. Faculty members generally hold positive perceptions of their own performance, whereas they tend to think less positively of students’ competences. In contrast, students who are conventionally instructed and evaluated tend to regard their instructors’ practices less positively while holding a positive academic self-image. Consequently, more researchers call upon faculty members to make the best use of students’ evaluation of the teaching-learning process as a valuable source of feedback and a tool for development in order to bridge this gap. Finally, the review draws attention to the inconclusive results concerning the effects of demographic variables in the diverse populations of students and faculty members.

Chapter Three
Methodology and Procedures

3.1. Introduction:

In this chapter, the methodology used in collecting and analyzing the data is defined.
The researcher presents the research methodology, the population and the sample of the study, the research instruments, the validity and reliability of the instruments, the study procedures and the statistical analysis.

3.2 Methodology:

A descriptive, analytical approach is used to achieve the main purpose and answer the research questions. To approach the problem, develop hypotheses and generate qualitative data, the researcher drew on observation, contacts and interviews with faculty members and English majors from different universities. In addition, previous studies were used to generate more qualitative data. Two questionnaires were used to collect data about the faculty members’ common evaluation practices and students’ perceptions of these practices. The quantitative data is based on the statistical analysis of the responses, which was used to formulate generalizations about the faculty members’ practices and the English majors’ perceptions as well as to answer the research questions.

3.3. Population of the Study:

The population of this study consisted of all the English language instructors in Palestinian universities in Gaza and the West Bank. The students’ population consisted of the English Language & Literature, TEFL and Translation majors at An-Najah National University in Nablus and the Arab American University in Jenin. The study was carried out in the academic year (2015/2016). The total number of instructors was (220) and the total number of students was (1500), according to the statistics provided by the English departments.

3.4. The Sample of the Study:

The sample of the study is a stratified random sample. It consisted of (166) instructors and (400) students from the whole population. The instructors’ sample represents (75.4 %) of the instructors’ population, and the students’ sample represents (26.7 %) of the students’ population. Tables (1-7) below show the sample distribution according to the instructors’ and students’ independent variables.
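Before the distribution tables, the sampling fractions reported in this section can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sampling_fraction` is hypothetical and not part of the study's materials. Note that 166 out of 220 evaluates to roughly 75.45 %, so the reported (75.4 %) appears to be truncated rather than rounded.

```python
def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Return the sample size as a percentage of the population size."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return 100.0 * sample_size / population_size

# Sample and population sizes as stated in sections 3.3 and 3.4.
instructors_pct = sampling_fraction(166, 220)
students_pct = sampling_fraction(400, 1500)

print(f"Instructors: {instructors_pct:.2f} % of the population")
print(f"Students: {students_pct:.2f} % of the population")
```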
A- Instructors’ Variables:

Table (1): Distribution of Sample According to Gender Variable:
Gender | Frequency | Percentage %
Male | 103 | 62.0
Female | 63 | 38.0
Total | 166 | 100 %

Table (2): Distribution of Sample According to Academic Qualification Variable:
Academic qualification | Frequency | Percentage %
Master | 112 | 67.5
Ph.D | 54 | 32.5
Total | 166 | 100 %

Table (3): Distribution of Sample According to Professional Experience Variable:
Professional experience | Frequency | Percentage %
Less than 5 years | 44 | 40.4
6-10 years | 44 | 14.1
More than 10 years | 44 | 14.5
Total | 166 | 100 %

Table (4): Distribution of Sample According to Age Variable:
Age | Frequency | Percentage %
25-35 | 15 | 41.4
36-45 | 14 | 44.1
46-55 | 14 | 41.4
More than 56 | 14 | 44.2
Total | 166 | 100 %

Table (5): Distribution of Sample According to Interest in Modern Pedagogy Variable:
Interest in Modern Pedagogy | Frequency | Percentage %
Average | 44 | 4.4
Good | 45 | 14.4
Very good | 20 | 04.4
Total | 166 | 100 %

B- Students’ Variables:

Table (6): Distribution of Sample According to University Type:
University | Frequency | Percentage %
Public | 440 | 01.7
Private | 440 | 14.1
Total | 400 | 100 %

Table (7): Distribution of Sample According to Major:
Major | Frequency | Percentage %
English language & literature | 440 | 11.4
English language Methodology | 410 | 11.7
Translation | 25 | 44.0
Total | 400 | 100 %

3.5. Instruments of the Study:

The researcher developed two questionnaires based on educational literature, related studies and other particular less-tested variables. The instructors’ questionnaire consisted of (5) sections:
- The first section consisted of (6) items about demographic data, namely the instructor’s gender, age, academic qualification, university, professional experience and interest in modern pedagogy.
- The second section consisted of a six-item multiple-choice question to explore the general instructional and evaluation practices among faculty members.
This section is a secondary, introductory question which asked faculty members to choose the answers that best suited their cases from (5) options. The first three items were intended to explore general common instructional practices and the other three items to explore general evaluation practices. This question also aimed at eliciting more responses from instructors.
- The third and fourth sections consisted of (24) items to explore more details about the instructional practices (items 1-8) and the evaluation practices (items 1-16). The researcher applied a four-level Likert scale to test the frequency of particular instructional and evaluation practices as well as to explore certain views among instructors. Here is the scale:
• Never: 1 degree
• Rarely: 2 degrees
• Sometimes: 3 degrees
• Always: 4 degrees
- The fifth section is a rank-order scaling question consisting of (14) options of evaluation tools for instructors to choose from according to their own preferences and priorities.

The students’ questionnaire consisted of (3) sections:
- The first section consisted of (3) items about demographic data, namely governorate, type of university and major.
- The second and third sections consisted of (32) items exploring students’ perceptions regarding their instructors’ instructional practices (items 1-14) and evaluation practices (items 1-18). A five-level Likert scale was used:
• Strongly disagree: 1 degree
• Disagree: 2 degrees
• Neutral: 3 degrees
• Agree: 4 degrees
• Strongly agree: 5 degrees

3.6. Validity of the Instrument:

The two questionnaires were presented to a jury of specialists in the fields of English language and TEFL at An-Najah University and Al-Aqsa University in Gaza, in addition to the researcher’s supervisor. The jury recommended some modifications and additions.

3.7. Reliability of the Questionnaire:

The Cronbach Alpha coefficient was used to determine the reliability of the instructors’ questionnaire and of the students’ questionnaire, for each domain and for the total score. Table (8) shows the reliability coefficient of each domain and of the total score of each questionnaire. The coefficients range from (0.70) to (0.86), which is considered suitable for the scientific purposes of the study.

Table (8): Reliability Coefficients of Each Domain and the Total Score of the Questionnaire:
Domains | Number of items | Reliability coefficient
Instruction practices | 8 | 0.70
Evaluation practices | 16 | 0.73
Instructors’ questionnaire | 24 | 0.75
Views on the instruction and lecturing practices | 14 | 0.77
Views on the evaluation practices | 18 | 0.76
Students’ questionnaire | 32 | 0.86

3.8. Procedures of the Study:

The formal procedures were followed to carry out the study. First, after establishing the utility of the instrument, the necessary modifications were made. Second, permission was given to the researcher to start administering the questionnaires. The questionnaires were distributed in the first and second semesters of the academic year 2015-2016. Every instructor and student was invited to complete the questionnaire. In order to obtain more valid and credible results, the researcher made several trips to all Palestinian universities to meet instructors and distribute questionnaires. Later, the researcher collected the questionnaires from the instructors and students. A few of the instructors’ questionnaires from Gaza were completed online, but the majority were completed in hard copy and then sent by parcel mail to the researcher. The researcher herself distributed copies to most of the majors of English and TEFL at An-Najah and the majors of English at the Arab American University in Jenin. The questionnaires were then collected for statistical analysis.

3.9. Questions of the Study:

This research has two main questions:
• What are the most common evaluation practices utilized by faculty members of English at Palestinian higher education institutions?
• What do faculty members of English prefer to use for evaluation at Palestinian universities?

The first question entails these sub-questions:
• What are the instructional practices utilized by faculty members at Palestinian universities which underlie the common evaluation practices?
• Are there any significant differences in evaluation practices among the faculty members at the Palestinian universities attributed to the academic qualification variable?
• Are there any significant differences in evaluation practices among the faculty members at the Palestinian universities attributed to the professional experience variable?
• Are there any significant differences in evaluation practices among the faculty members at the Palestinian universities attributed to the age variable?
• Are there any significant differences in evaluation practices among the faculty members at the Palestinian universities attributed to the gender variable?
• Are there any significant differences in evaluation practices among the faculty members at the Palestinian universities attributed to the interest in modern pedagogy variable?

There is one main question in the students’ questionnaire:
• How do the majors of English, TEFL and Translation at An-Najah National University in Nablus and at the Arab American University in Jenin perceive the evaluation practices employed by their instructors?

A further, secondary question is:
• How do the majors of English, TEFL and Translation at An-Najah National University in Nablus and at the Arab American University in Jenin perceive the instructional practices underlying the evaluation practices?
This question entails further sub-questions:
• Are there any significant differences among the majors of English, TEFL and Translation at ANU and at AAU attributed to the type of university variable (private or public)?
• Are there any significant differences among the majors of English, TEFL and Translation at ANU and at AAU attributed to the major variable (TEFL, English Language & Literature or Translation)?

3.10. Hypotheses of the Study:
• There are no statistically significant differences at (α ≤ 0.05) in evaluation practices among the Palestinian faculty members attributed to academic qualifications, professional experience, age, gender and interest in modern pedagogy.
• There are no statistically significant differences at (α ≤ 0.05) in the perceptions of evaluation practices among the majors of English, TEFL and Translation at ANU and at AAU attributed to type of university and major.

3.11. Variables of the Study:

1. Instructors’ Independent Variables:
• Gender (male, female).
• Academic Qualification (Master, Ph.D).
• Professional Experience (less than 5 years, 6-10 years, more than 10 years).
• Age (less than 25 years, 26-35 years, 36-45 years, 46-55 years, more than 56 years).
• Interest in Modern Pedagogy (average, good, very good).

2. Instructors’ Dependent Variables:
The common evaluation practices which are employed by faculty members at Palestinian universities.

3. Students’ Independent Variables:
• University Type (private, public).
• Major (English Language & Literature, Translation, TEFL (Methods of Teaching English)).

4. Students’ Dependent Variables:
The English majors’ perceptions of the common evaluation practices which are employed by their instructors.

3.12. Statistical Analysis:

The Statistical Package for Social Science (SPSS) version 17.0 was used to analyze the data.
Various statistical measures and tests were used, including means, standard deviations, percentages, frequencies, the independent-samples T-test, One-Way ANOVA and the Scheffe post hoc test, to determine the sources of differences in the rejected hypotheses.

To estimate the instructors’ responses about instructional practices and evaluation practices, a four-level Likert scale was used. The levels of responses were calculated in percentages as follows:
• 81.25% and more is a very high degree.
• 62.50-81.24% is a high degree.
• 43.75-62.49% is a low degree.
• 43.74% and less is a very low degree.

To estimate students’ responses, a five-level Likert scale was used. The levels of responses were calculated in percentages as follows:
• 80% and more is a very high degree.
• 70-79.9% is a high degree.
• 60-69.9% is a moderate degree.
• 50-59.9% is a low degree.
• Less than 50% is a very low degree.

3.13. Ethical Issues:

Permission to conduct this study was granted by the Faculty of Graduate Studies at An-Najah University. Participants were informed about the purpose of the study, and their participation was voluntary. All the data of this study is considered confidential and is used for the purpose of academic research only.

3.14. Summary:

In this chapter, the researcher has presented the main components of the study. The populations, the instruments and the samples have been defined. In addition, the questions of the study, the hypotheses and the variables have been specified. The validity and reliability have been described as well.

Chapter Four
Results

4.1. Introduction:

This chapter is divided into several parts which present the results for the research questions and hypotheses. The results were analyzed using the Statistical Package for Social Sciences (SPSS). The statistical analysis revealed the following results:

4.2. Results Related to the Faculty’s Questionnaire:

At first, the researcher explores general instructional and evaluation practices through a six-item multiple-choice question (about instruction and testing), which was meant as a means of orientation. Frequencies and percentages for each practice were calculated.

4.2.1. Results Related to the Multiple Choice Questions:

This section presents the results related to the responses about common instructional practices utilized by faculty members of English:

A. Results Related to Lecturing as a Favorable Practice:

Table (9): Frequencies and Percentages of Responses of Lecturing as a Favorable Instructional Practice
No. | Response | Frequency | Percentage
1 | It can cover a lot of material | 72 | 43.4
2 | It is part of the instructor's responsibility. | 65 | 39.2
3 | Students like lectures. | 44 | 2.4
4 | This statement does not apply to my case. | 9 | 5.4
5 | Information is inaccessible. | 1 | 4.1
Total | 166 | 100.0

Table (9) shows that (43.4%) of instructors see lecturing as a favorable practice of instruction because it can cover a lot of material, whereas (39.2%) see it as a part of the instructor's responsibility. Only (5.4%) do not consider lecturing as a favorable practice or applicable in their cases.

B. Results Related to Skills Specification:

Table (10): Frequencies and Percentages of Skills Specification Responses:
No. | Response | Frequency | Percentage
1 | There are many students. | 43 | 25.9
2 | Three formal tests are sufficient to examine skills. | 14 | 44.2
3 | Students' skills are observable by the faculty instructor. | 14 | 44.2
4 | There is no time to specify all skills requirements. | 41 | 41.0
5 | It does not apply to my case. | 41 | 41.2
Total | 166 | 100.0

Table (10) shows that instructors have different views about the lack of skill specification. The highest response (25.9 %) indicates that the number of students does not allow instructors to define more learning skills.
There are also equal response rates which indicate that instructors consider that skills can be either observed by the instructor or formally tested.

C. Results Related to the Use of Tests More than Presentations:

Table (11): Frequencies and Percentages of Responses about Tests Being Better than Presentations:
No. | Response | Frequency | Percentage
1 | Students are not trained to give them. | 67 | 40.4
2 | They are demanding to their levels. | 32 | 19.3
3 | Students are usually shy. | 44 | 44.1
4 | This statement does not apply to my case. | 44 | 44.1
5 | It is a waste of course time. | 41 | 4.4
Total | 166 | 100.0

Table (11) shows that (40.4%) of instructors think tests are better than presentations because students are not trained to give presentations, whereas (19.3%) consider presentations too demanding for students’ levels.

D. Results Related to Students’ Non-Test Based Work:

Table (12): Frequencies and Percentages of Responses about Students’ Non-Test Based Work:
No. | Response | Frequency | Percentage
1 | They are usually plagiarized from the internet. | 64 | 38.6
2 | They need more follow up. | 48 | 28.9
3 | They are hard to grade. | 44 | 44.4
4 | There is not enough time for these. | 42 | 44.1
5 | This statement does not apply to my case. | 41 | 4.1
Total | 166 | 100.0

Table (12) shows that (38.6%) of instructors think students' non-test based work is usually plagiarized from the internet, whereas (28.9%) think that it needs more follow-up.

E. Results Related to the Use of PowerPoint:

Table (13): Frequencies and Percentages of Responses about the Use of PowerPoint:
No. | Response | Frequency | Percentage
1 | They help me present more information | 68 | 41.0
2 | I like to use technology. | 35 | 21.1
3 | They impress students. | 44 | 44.2
4 | They enable students to take notes | 44 | 41.1
5 | This statement does not apply to my case. | 41 | 4.4
Total | 166 | 100.0

Table (13) shows that (41.0%) of instructors explain that they use PowerPoint presentations in order to present more information, whereas only (21.1%) of instructors like to use technology.

F. Results Related to the Reasons Behind Formal Testing:

Table (14): Frequencies and Percentages of Responses:
No. | Response | Frequency | Percentage
1 | It is more effective and reliable for evaluation. | 62 | 37.3
2 | An administrative procedure. | 60 | 36.1
3 | Other evaluation techniques are not suitable for our students. | 45 | 44.5
4 | Tests are easy to administer. | 44 | 45.4
5 | This statement does not apply to my case. | 4 | 1.4
Total | 166 | 100.0

Table (14) shows that (37.3%) of instructors rely on formal testing because they regard it as more effective and reliable for evaluation, whereas (36.1%) of instructors regard it as an administrative procedure.

4.4. Results Related to the First Sub-Question about the Underlying Instructional Practices:

Table (15): Means, Standard Deviations, Percentages and Levels of Instruction Practices Among Faculty:
No | Item | M | SD | Percentage | Level
1 | Using brainstorming or concept mapping is better than students’ listening and note-taking. | 3.19 | 5.44 | 79.75 | High
2 | I think I need to collect data about my effective teaching from different sources (from other instructors and students, for example). | 3.13 | 5.44 | 78.25 | High
3 | I prefer to discuss students’ questions at the end of the class to help achieve more goals. | 3.05 | 5.45 | 76.25 | High
4 | I think that current generations of our students might not be sufficiently motivated to perform, create or produce. | 3.05 | 5.45 | 76.25 | High
5 | When I teach extra courses, I depend on formal testing as a reliable evaluation technique. | 2.98 | 5.41 | 74.5 | High
6 | I think all language skills can be evaluated through testing as a reliable rating process. | 2.85 | 5.44 | 71.25 | High
7 | I don’t provide student-based tasks in classroom for every lecture because it is very demanding and time-consuming. | 2.58 | 5.45 | 64.50 | High
8 | I find classroom tasks, which represent meaningful instructional activities, demanding and time consuming. | 2.55 | 5.42 | 63.75 | High
Total score of instruction practices among instructors | 2.92 | 0.32 | 73.0 | High

Table (15) shows that the instructional practices among instructors achieved a mean of (2.92) and a percentage of (73.0). This means that the tested instructional practices received high responses. The responses range between (63.75-79.75%). Items (7-8) are not as high as the rest. The highest levels, for items (1-6), range between (71.25-79.75%).

A- Results Related to the Main Question about Faculty’s Common Evaluation Practices:

Table (16): Means, Standard Deviations, Percentages and Levels of Evaluation Practices Among Faculty Members:
No. | Item | M | SD | Percentage | Level
1 | I like to go over the exam questions with students after handing in their papers to let them learn from their mistakes. | 3.48 | 5.44 | 87.0 | Very high
2 | I keep up to date with new developments in evaluation and assessment. | 3.45 | 5.44 | 86.25 | Very high
3 | I try to think of different techniques for evaluations other than testing. | 3.33 | 5.44 | 83.25 | Very high
4 | I think, since I want my students to possess meaningful learning, it is important to reconsider my evaluation practices. | 3.19 | 5.41 | 79.75 | High
5 | The administrative instructions related to tests affect the quality of testing practices as adopted by the faculty staff. | 3.07 | 5.44 | 76.75 | High
6 | Testing, as it is currently applied, overlooks complex thinking and problem-solving skills. | 2.98 | 5.41 | 74.50 | High
7 | I think the test scores represent improvement or decline in teaching and learning. | 2.97 | 5.41 | 74.25 | High
8 | Essays can be a better choice than open-ended questions | 2.95 | 5.44 | 73.75 | High
9 | I have found that multiple-choice testing format is very helpful even if they have a limited relevance to real-world learning. | 2.86 | 5.42 | 71.50 | High
10 | I believe experimenting with non-test based evaluation needs time, effort and training which the teaching staff cannot afford. | 2.81 | 5.45 | 70.25 | High
11 | Open-ended questions are hard to mark and grade by many instructors. | 2.80 | 5.45 | 70.0 | High
12 | Multiple-choice tests are more efficient in determining how well facts and concepts have been acquired. | 2.75 | 5.41 | 68.75 | High
13 | I do not allow students to take part in evaluation because it is a formal administrative procedure | 2.68 | 4.51 | 67.0 | High
14 | I think non-test based evaluation techniques do not apply in our case. | 2.61 | 5.24 | 65.25 | High
15 | Writing notes on test papers is unnecessary because the test grade can provide the necessary feedback to the student. | 2.31 | 5.24 | 57.75 | Moderate
16 | I think giving students additional tasks to improve their performance in formal tests is against faculty instructions and policy. | 2.23 | 4.54 | 55.75 | Moderate
Total score of evaluation practices among instructors | 2.91 | 0.31 | 72.75 | High

Table (16) shows that the evaluation practices among instructors achieved a mean of (2.91) and a percentage of (72.75). This means that the tested evaluation practices received high responses. The very high responses, for items (1-3), range between (83.25-87.0%). The high responses, for items (4-14), range between (65.25-79.75%). There are two moderate responses, for items (15-16), ranging between (55.75-57.75%).

Table (17): Total Degrees of the Instruction and Evaluation Domains:
No. | Domain | M | SD | Percentage | Level
1 | Instruction practices | 2.92 | 0.32 | 73.0 | High
2 | Evaluation practices | 2.91 | 0.31 | 72.75 | High
Total degree | 2.91 | 0.27 | 72.75 | High

Table (17) shows that the total degree of instruction and evaluation practices achieved a mean of (2.91) and a percentage of (72.75), which indicates that the items used to explore the faculty’s practices and beliefs have a relatively high degree of regularity in instruction and evaluation (according to the four-level Likert scale applied).
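The percentages reported in Tables (15-17) correspond to each mean divided by the top score of the four-level scale. The Python sketch below illustrates that conversion together with the level cut-offs listed for the instructors' scale under Statistical Analysis in Chapter Three. The names `mean_to_percentage` and `response_level` are hypothetical and used only for illustration; they are not SPSS functions. The four-level scheme defines no "moderate" band, so this sketch would label items (15-16) of Table (16) as low.

```python
SCALE_MAX = 4  # highest score on the instructors' four-level Likert scale

# Lower bounds (in percent) of each level, as listed in Chapter Three.
LEVELS = [
    (81.25, "very high"),
    (62.50, "high"),
    (43.75, "low"),
]

def mean_to_percentage(mean: float, scale_max: int = SCALE_MAX) -> float:
    """Express a Likert mean as a percentage of the maximum score."""
    return 100.0 * mean / scale_max

def response_level(percentage: float) -> str:
    """Return the first level whose lower bound the percentage reaches."""
    for lower_bound, label in LEVELS:
        if percentage >= lower_bound:
            return label
    return "very low"

# Domain means as reported in Table (17).
for domain, mean in [("Instruction practices", 2.92), ("Evaluation practices", 2.91)]:
    pct = mean_to_percentage(mean)
    print(f"{domain}: {pct:.2f} % ({response_level(pct)})")
```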
B- Results Related to the Second Main Question about Faculty Members’ Evaluation Preferences:

The researcher suggested (14) alternatives and asked instructors to rank (6) of them. Table (18) shows the results.

Table (18): Frequency of the Best Evaluation Preferences:
Order | No | Tool | Frequency
1 | 3 | Formal tests & exams | 104
2 | 9 | Conducting research | 94
3 | 11 | Creative papers | 88
4 | 1 | Student-proposed projects | 87
5 | 6 | A series of quizzes or chapter tests instead of comprehensive, high-stakes tests | 84
6 | 4 | Students’ writing of critiques | 71
7 | 12 | Collective projects | 67
8 | 13 | Students’ journals | 59
9 | 5 | Annotated portfolio of students’ work through the term | 49
10 | 14 | Interviews & questionnaires | 42
11 | 2 | Drama & performances | 35
12 | 8 | Utilizing self-assessment & rubrics | 32
13 | 10 | Students producing films & videos | 26
14 | 7 | Student-designed tests | 18

Table (18) indicates that formal tests & exams are the most preferred tool with a frequency of (104). Conducting research is the second with (94), creative papers are the third with (88), and student-proposed projects are the fourth with (87). A series of quizzes or chapter tests instead of comprehensive, high-stakes tests is the fifth with (84), and students’ writing of critiques is the sixth with (71).

4.6. Results Related to the Faculty Members’ Hypotheses:

A. Results Related to the First Hypothesis:

There are no significant differences at (α≤0.05) in evaluation practices among faculty members due to gender.

The T-Test for independent samples was used. Table (19) shows the results.
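To make the test named above concrete, the following Python sketch computes a pooled-variance independent-samples t statistic from summary statistics alone. It is an illustration under stated assumptions, not the procedure SPSS ran: the function name `pooled_t_statistic` is hypothetical, equal variances are assumed, and the inputs are the rounded evaluation-practice means and standard deviations by gender reported in Table (19), so the result differs slightly from the T-value computed from the raw responses.

```python
import math

def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df

# Evaluation practices by gender: male (N=103) vs. female (N=63), rounded values.
t_value, df = pooled_t_statistic(2.95, 0.30, 103, 2.83, 0.31, 63)
print(f"t = {t_value:.3f}, df = {df}")
```

The degrees of freedom come out as 164, matching the D.F reported with Table (19). Obtaining the significance value would additionally require the t distribution, for example from a statistics library, which is left out of this sketch.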
Table (19): T-Test for Independent Samples of Instruction and Evaluation Practices Due to Gender:
Domain | Male (N=103) Mean | Male S.D | Female (N=63) Mean | Female S.D | T-value | Sig.*
Instruction practices | 2.95 | 0.31 | 2.87 | 0.33 | 1.581 | 0.116
Evaluation practices | 2.95 | 0.30 | 2.83 | 0.31 | 2.521 | 0.013*
Total degree | 2.95 | 0.26 | 2.84 | 0.27 | 2.617 | 0.010*
* Significant at (≤ 0.05), D.F = 164.

Table (19) shows that there are no significant differences at (α≤0.05) in the instruction practices due to gender, while there are significant differences at (α≤0.05) in the evaluation practices and the total degree. These differences are in favor of males. Hence, the first hypothesis is rejected.

B. Results Related to the Second Hypothesis:

There are no significant differences at (α≤0.05) in evaluation practices among faculty members due to academic qualification.

The T-Test for independent samples was used. Table (20) shows the results.

Table (20): T-Test for Independent Samples of Instruction and Evaluation Practices Due to Academic Qualifications:
Domain | Master (N=112) Mean | Master S.D | Ph.D (N=54) Mean | Ph.D S.D | T-value | Sig.*
Instruction practices | 2.88 | 0.30 | 3.02 | 0.35 | 2.637 | 0.009*
Evaluation practices | 2.86 | 0.33 | 3.00 | 0.25 | 2.764 | 0.006*
Total degree | 2.87 | 0.27 | 3.01 | 0.23 | 3.251 | 0.001*
* Significant at (≤ 0.05), D.F = 164.

Table (20) shows that there are significant differences at (α≤0.05) in the evaluation practices due to academic qualification. These differences are in favor of Ph.D holders. Hence, the second hypothesis is rejected.

C. Results Related to the Third Hypothesis:

There are no significant differences at (α≤0.05) in evaluation practices among faculty members due to professional experience.

One-Way ANOVA was used to test the hypothesis. Tables (21-23) show the frequencies, means and standard deviations of the instruction and evaluation practices due to professional experience and the results of the One-Way ANOVA respectively.
Table (21): Frequencies, Means, and Standard Deviations of the Instruction and Evaluation Practices Due to Professional Experience: D