An-Najah National University
Faculty of Graduate Studies

Developing Trip Generation Models Utilizing Linear Regression Analysis: Jericho City as a Case Study

By
Alaa Mohammad Yousef Dodeen

Supervisor
Prof. Sameer Abu-Eisheh

This Thesis is Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Roads and Transportation Engineering, Faculty of Graduate Studies, An-Najah National University, Nablus, Palestine.
2014

Dedication

This research effort is dedicated to my family, friends, and instructors. Without their love and support, I could not have achieved this goal.

Acknowledgment

First of all, thanks to God. My appreciation and thanks are extended to my instructors at An-Najah National University. My special thanks go to Professor Sameer Abu-Eisheh for his continuous help, support, and time throughout this thesis. I would also like to thank the defense committee members for their valuable discussions. Finally, I thank my friends at Jericho Municipality, who provided me with the required data (aerial photo, master plan, and AutoCAD maps), and my friends who helped me in data collection and analysis.

Declaration

The work provided in this thesis, entitled "Developing Trip Generation Models Utilizing Linear Regression Analysis: Jericho City as a Case Study", unless otherwise referenced, is the researcher's own work, and has not been submitted elsewhere for any other degree or qualification.

Student's name:
Signature:
Date:

Table of Contents

Dedication
Acknowledgment
Declaration
Table of Contents
List of Tables
List of Figures
List of Appendices
Abstract
Chapter One: Introduction
1.1 General Background
1.2 The Problem of Study
1.3 Objectives of the Study
1.4 Study Area: Jericho City
1.5 Thesis Outline
Chapter Two: Literature Review
2.1 Overview of Trip Generation
2.2 Literature Review of Trip Generation Variables
2.2.1 In the Developed Countries
2.2.2 In the Developing Countries
2.3 Summary of Literature Review
Chapter Three: Analytical Framework and Methodology
3.1 General Steps of the Methodology
3.2 Methods of Survey
3.2.1 Personal Interview Surveys
3.2.2 Telephone Interviews
3.2.3 Mail-Back Surveys
3.2.4 Online Surveys
3.2.5 Data Collection Method Used
3.3 Sample Size Calculation Methods
3.3.1 Standards of Bureau of Public Roads (BPR)
3.3.2 Sample Size Statistical Formulas
3.4 Overview of Linear Regression Method
3.4.1 The Linear Regression Analysis Process
3.4.2 Regression Model Building Approaches
3.5 Unit of Analysis
3.6 Data Analysis Software
3.7 Model Specification
3.8 Models Estimation
3.8.1 General Trip Generation Model
3.8.2 Trip Generation Models by Trip Purpose
3.8.3 Temporal Trip Generation Models
3.9 Statistical Tests
3.9.1 Correlation Matrix and VIF: Testing for Multicollinearity
3.9.2 R-Squared: Goodness of Fit
3.9.3 F-Test: Testing Overall Significance of Model
3.9.4 T-Test: Testing Individual Coefficients
3.10 Logical Aspects Used in Model Selection
3.11 Determination of Study Area
3.12 Zoning System
Chapter Four: Field Survey and Data Collection
4.1 Population of Study Area
4.2 Sample Size of Study Area
4.3 Sampling Method
4.4 Questionnaire Design
4.5 Required Information
4.7 Conducting Field Survey
Chapter Five: Data Analysis and Results
5.1 Descriptive Data
5.1.1 Descriptive Data of Dependent Variables
5.1.2 Descriptive Data of Explanatory Variables
5.2 General Trip Generation Model
5.2.1 Interpretation of Regression Coefficients
5.2.2 Testing Individual Coefficients: T-Test
5.2.3 Testing for Multicollinearity: Correlation Matrix and VIF
5.2.4 Testing Goodness of Fit: R-Squared (R²)
5.2.5 Testing Overall Significance of Model: F-Test
5.2.6 Model Verification
5.3 Trip Generation Models by Purpose
5.3.1 Work Trip Generation Model
5.3.2 Education Trip Generation Model
5.3.3 Shopping Trip Generation Model
5.3.4 Social Trip Generation Model
5.3.5 Recreational Trip Generation Model
5.4 Temporal Trip Generation Models
5.4.1 Trip Generation Model for Trips Made before 8 AM
5.4.2 Trip Generation Model for Trips Made between 8-9 AM
5.4.3 Trip Generation Model for Trips Made between 9 AM - 12 PM
5.4.4 Trip Generation Model for Trips Made between 12 - 4 PM
5.4.5 Trip Generation Model for Trips Made after 4 PM
Chapter Six: Conclusions and Recommendations
6.1 Summary and Conclusions
6.2 Recommendations
References
Appendices
Abstract (in Arabic)

List of Tables

Table 3.1 Standards of Bureau of Public Roads (BPR) for Sample Size Calculation
Table 3.2 ANOVA Test Results Table
Table 3.3 Land Uses and Areas for Traffic Zones
Table 4.1 Localities in the Study and Estimates of Population
Table 4.2 Number of Households per Traffic Zone for Study Area and Sample Size Required
Table 4.3 Sample Size Calculation According to Statistical Formulas
Table 4.4 Explanatory Variables Used in the Models
Table 4.5 Dependent Variables Used in the Models
Table 5.1 Descriptive Data for the Total Daily Household Trips
Table 5.2 Descriptive Data for the Daily Household Trips by Purpose
Table 5.3 Descriptive Data for the Daily Household Trips by Time
Table 5.4 Distribution of Daily Household Trips by Purpose
Table 5.5 Temporal Distribution of Daily Household Trips
Table 5.6 Descriptive Data for the Household Size
Table 5.7 Gender Distribution for the Sample
Table 5.8 Descriptive Data for the Gender Variable
Table 5.9 Descriptive Data for the Number of Employed Persons
Table 5.10 Descriptive Data for the Number of Persons Continuing Education
Table 5.11 Distribution of Survey Respondents by Age Groups
Table 5.12 Descriptive Data for the Number of Licensed Drivers
Table 5.13 Distribution of Transportation Vehicles
Table 5.14 Descriptive Data for the Number of Vehicles Owned per Household
Table 5.15 Descriptive Data for the Monthly Household Income
Table 5.16 Regression Results for the General Trip Generation Model
Table 5.17 ANOVA Table for the General Trip Generation Model
Table 5.18 General Trip Generation Model Verification
Table 5.19 Regression Results for the Work Trip Generation Model
Table 5.20 Regression Results for the Education Trip Generation Model
Table 5.21 Regression Results for the Shopping Trip Generation Model
Table 5.22 Regression Results for the Social Trip Generation Model
Table 5.23 Regression Results for the Recreational Trip Generation Model
Table 5.24 Regression Results for the Trip Generation Model (Number of Daily Trips Made before 8 AM per Household)
Table 5.25 Regression Results for the Trip Generation Model (Number of Daily Trips Made between 8-9 AM per Household)
Table 5.26 Regression Results for the Trip Generation Model (Number of Daily Trips Made between 9 AM - 12 PM per Household)
Table 5.27 Regression Results for the Trip Generation Model (Number of Daily Trips Made between 12 - 4 PM per Household)
Table 5.28 Regression Results for the Trip Generation Model (Number of Daily Trips Made after 4 PM per Household)

List of Figures

Figure 1.1 Sequences of Activities in Transportation Analysis
Figure 1.2 Origins and Destinations for Traffic Zones in an Urban Area
Figure 1.3 Types of Trips according to Movement in Study Area
Figure 1.4 Position of Jericho City in the West Bank
Figure 3.1 Draft Master Plan of Jericho City (2013)
Figure 3.2 Major Roads and Natural Barriers in Jericho City Used for Specifying Zones Boundaries
Figure 3.3 Explanation of Commuter Shed
Figure 3.4 Land Uses Plan with the Boundaries of Traffic Zones
Figure 3.5 TAZs of Jericho City
Figure 5.1 Distribution of Daily Household Trips by Purpose
Figure 5.2 Temporal Distribution of Daily Household Trips
Figure 5.3 Daily Household Trips and Household Size
Figure 5.4 Gender Distribution for the Selected Sample
Figure 5.5 Daily Household Trips and Number of Males
Figure 5.6 Daily Household Trips and Number of Females
Figure 5.7 Daily Household Trips and Number of Employed Persons
Figure 5.8 Daily Household Trips and Number of Persons Continuing Education
Figure 5.9 Distribution of Survey Respondents by Age Groups
Figure 5.10 Daily Household Trips and Number of Licensed Drivers
Figure 5.11 Distribution of Transportation Vehicles
Figure 5.12 Daily Household Trips and Number of Cars per Household
Figure 5.13 Daily Household Trips and Number of Bicycles per Household
Figure 5.14 Daily Household Trips and Number of Motorcycles per Household
Figure 5.15 Number of Daily Household Trips and Monthly Household Income

List of Appendices

Appendix A Correlation Matrix
Appendix B Questionnaire Form
Appendix C SPSS Results (General Trip Generation Model)
Appendix D SPSS Results (Work Trip Generation Model)
Appendix E SPSS Results (Education Trip Generation Model)
Appendix F SPSS Results (Shopping Trip Generation Model)
Appendix G SPSS Results (Social Trip Generation Model)
Appendix H SPSS Results (Recreational Trip Generation Model)
Appendix I SPSS Results (Trip Generation Model for Trips Made before 8 AM)
Appendix J SPSS Results (Trip Generation Model for Trips Made between 8-9 AM)
Appendix K SPSS Results (Trip Generation Model for Trips Made between 9 AM - 12 PM)
Appendix L SPSS Results (Trip Generation Model for Trips Made between 12 - 4 PM)
Appendix M SPSS Results (Trip Generation Model for Trips Made after 4 PM)

Abstract

The aim of this research is to develop trip generation models to predict the number of trips generated by households in the Palestinian areas, considering Jericho City as the case study. The models are developed using multiple linear regression analysis, which establishes a relationship between the number of trips generated by households and selected socioeconomic attributes.

Three types of models are developed. The first is a general trip generation model (i.e., a general model regardless of trip purpose and trip time). The second type comprises trip generation models by trip purpose.
These models include the work trip generation model, the education trip generation model, the shopping trip generation model, the social trip generation model, and the recreational trip generation model. Finally, five trip generation models by trip time are developed.

The data are primary data collected through a household survey of 713 randomly selected households in Jericho City, the study area.

The results indicate that the estimated general trip generation model has good explanatory power. The R-squared for this model is 0.69, indicating that the explanatory variables included in the model explain 69% of the variation in the dependent variable. The variables that most affect trip generation are found to be the number of persons receiving education in the household, the number of employed persons in the household, and the monthly household income.

The work trip generation model has an R-squared value of 0.74. In this model, the number of employed persons in the household and the number of persons aged between 31 and 50 years are the variables that most affect work trips. The education trip generation model has an R-squared value of 0.97. The number of persons receiving education in the household is the main factor in this model. The shopping trip generation model depends on the number of persons in the household and the monthly household income. The social trip generation model depends mainly on the number of females in the household and the number of employed persons in the household. Finally, the recreational trip generation model depends mainly on the number of persons receiving education in the household, the number of persons between 51 and 64 years old, and the monthly household income.

Chapter One
Introduction

1.1 General Background

Transportation planning processes have been used extensively to estimate future travel demand.
The estimated travel demand is used as a basis for planning future transportation facilities and services. For the transportation system, it is necessary to quantify both the inputs and the outputs. The system inputs are the quantum of demand for transportation in future years, while the system outputs are the system characteristics planned to meet that demand in the horizon years (Hutchinson, 1974).

To quantify the inputs and outputs, a major and commonly used planning method, known as the Urban Transportation Modeling System (UTMS), follows four analytical steps to obtain the total demand in the horizon year. The characteristics of the system proposed for the horizon year condition are to be assumed. The analytical steps are: trip generation, trip distribution, modal split, and route assignment. These steps are shown in Figure 1.1.

This research considers the first step of this sequence, trip generation, which is studied and analyzed in detail for Jericho City as a case study of Palestinian cities. The expected outputs can be applied to other Palestinian cities with proper calibration. However, how the model can be applied to other cities is outside the scope of this research.

Figure 1.1: Sequences of Activities in Transportation Analysis
Source: Principles of Urban Transport Systems Planning (Hutchinson, 1974)

Trip generation analysis means understanding the trips generated in different traffic zones in an urban area. For the purpose of analysis, a trip is defined as a one-way movement by a person from an origin to a destination. The entire urban area is usually divided into smaller traffic zones, where the points of origin and destination are fixed at the zone centroids. This is illustrated in Figure 1.2.
Figure 1.2: Origins and Destinations for Traffic Zones in an Urban Area

After delineating the urban area boundary and fixing the zone centroids as points of origin or points of destination, the trips can be classified according to spatial movement into four kinds (Meyer and Miller, 1986):
1. Internal to internal trip
2. Internal to external trip
3. External to internal trip
4. External to external trip

These types of trips according to movement are shown in Figure 1.3. Trip generation analysis is mainly concerned with internal to internal trips and, to a lesser extent, with internal to external trips. The home interview survey is the major tool for establishing trip generation models, as will be discussed later. The other kinds (internal to external, external to internal, and external to external) can be identified by cordon surveys, that is, surveys conducted at convenient points on the cordon line or at points where radiating roads intersect it.

Figure 1.3: Types of Trips according to Movement in Study Area

The principal task in trip generation analysis is to relate the intensity of trip making (the number of trips made from a point to several other points) to and from traffic zones to measures of the type and intensity of land use in these zones and to other socio-economic characteristics. Two types of trip generation analysis can be carried out:
1. Trip production analysis
2. Trip attraction analysis

The term trip production refers to the trips generated by residential zones, where these trips are either trip origins or trip destinations (Papacostas and Prevedouros, 2004). Trips that have one end at home are called home-based trips, or trip productions, while the term trip attraction is used to describe trips generated by activities at the non-home ends (Hutchinson, 1974).

Home-based trips and non-home-based trips are analyzed separately, because it is difficult to combine these categories of trips in developing models.
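To make the terminology above concrete, the following minimal sketch classifies one-way trips by spatial movement and tallies home-based productions and attractions per zone. The zone numbers, the trip records, and the function names `spatial_class` and `tally_home_based` are hypothetical and serve only as illustration; they are not taken from the Jericho survey.

```python
from collections import Counter

# Hypothetical set of internal traffic zone IDs; any other zone is treated as external.
INTERNAL_ZONES = {1, 2, 3}


def spatial_class(origin, destination, internal=INTERNAL_ZONES):
    """Return which of the four spatial trip kinds a one-way trip belongs to."""
    o = "internal" if origin in internal else "external"
    d = "internal" if destination in internal else "external"
    return f"{o} to {d}"


def tally_home_based(trips):
    """Count home-based productions (home end) and attractions (non-home end) per zone.

    Each trip is a tuple (origin_zone, destination_zone, home_zone). A trip is
    home-based when the traveller's home zone is its origin or its destination.
    """
    productions, attractions = Counter(), Counter()
    for origin, destination, home in trips:
        if home == origin:
            productions[home] += 1
            attractions[destination] += 1
        elif home == destination:
            productions[home] += 1
            attractions[origin] += 1
        # Non-home-based trips are skipped here; they are modeled separately.
    return productions, attractions


# Made-up trip records: (origin, destination, home zone of the traveller).
trips = [(1, 2, 1), (2, 1, 1), (3, 2, 1), (1, 9, 1)]
productions, attractions = tally_home_based(trips)
print(spatial_class(1, 9))
print(dict(productions), dict(attractions))
```

Note that both the outbound trip (1 to 2) and the return trip (2 to 1) are counted as productions of the home zone, which is why productions are tied to the residential end rather than to the trip origin.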
Separate trip generation models are needed for home-based and non-home-based trips because the variables that influence each type may differ, and even where the variables are the same, their effect on trip making may differ (Papacostas and Prevedouros, 2004).

The process of relating the trips produced by households to the factors influencing trip production by an appropriate analytical technique is termed trip production modeling, and the process of relating the trips attracted to non-residential ends to the factors influencing trip attraction by an appropriate analytical technique is termed trip attraction modeling (Papacostas and Prevedouros, 2004).

Although the individual is usually the trip maker, the number of trips is usually estimated per household. The household is defined, according to the Palestinian Central Bureau of Statistics (PCBS), as one person or a group of persons, with or without a family relationship, who live in the same housing unit, share meals, and make joint provision of food and other essentials of living. This is the general condition, where the persons may be related, unrelated, or both. Households of unrelated persons are called institutional households and are also considered in this study (Palestinian Central Bureau of Statistics, 2003).

Some of the factors that usually influence trip production are related to household characteristics, because the household is a major unit of trip production. These can be listed as follows (Arasan, 2012):
1. Household size and composition, where household size is the number of persons in the household and composition covers attributes such as the average age of household members and their distribution by sex
2. Number of employed persons
3. Number of students
4. Household income
5. Vehicle ownership
6. Number of persons in the household who have a driving license
7. Type of house (independent house or apartment)

The factors that usually influence trip attraction are related to non-home or non-residential zones, such as commercial, industrial, institutional, and recreational zones. These factors, which reflect the type and intensity of land use, can be as follows (Arasan, 2012):
1. Retail trade floor area
2. Service and office floor area
3. Manufacturing and wholesale floor area
4. Number of employment opportunities in retail trade
5. Number of employment opportunities in service and offices
6. Number of employment opportunities in manufacturing and wholesale
7. School and college enrollment
8. Number of special activity centers, such as transport terminals, sports stadiums, and major recreational, cultural, and religious places

Trip productions, or home-based trips, can also be classified by trip purpose into the following categories:
1. Work trips
2. Education trips
3. Shopping trips
4. Social and recreational trips

Finally, trip productions, or home-based trips, can be classified by the time at which the trip is made.

Trip generation can be analyzed at an aggregate level (zone or area) or at a disaggregate level (household or person). In this research, analysis at the household level is adopted, because studying the movements of each household is expected to yield more accurate results.

1.2 The Problem of Study

There is a lack of specialized studies on quantifying and modeling travel demand in Palestinian cities. Several factors have led to considerable increases in trip generation in these cities, including reduced taxes on cars, which have increased the number of cars, as well as the growing need for mobility that accompanies population growth.
This is combined with the limited development of the transport networks within the cities and the observed overloading of, and congestion in, the existing networks.

As the number of cars is expected to continue increasing, transportation planning is essential for these cities. Proper models describing the four analytical steps (trip generation, trip distribution, modal split, and route assignment) are lacking. To provide a strong basis for the transportation planning process, this research addresses the need for studies in this area by studying the trip production part of the trip generation stage and its various categories.

1.3 Objectives of the Study

The aim of this study is to lay the basis for predicting current and future trips generated from the different traffic zones that comprise a Palestinian city. To this end, it studies and models the trips produced by households according to their characteristics, relying on the principles of regression analysis. It also aims to predict the number of home-based trips (trip productions) by trip purpose and by trip time for different traffic zones.

1.4 Study Area: Jericho City

Jericho City is chosen as the study area. It is located in the middle part of the West Bank and is considered medium-sized in area, with a total area of 57.43 km² (Palestinian Central Bureau of Statistics, 2012). According to the latest population estimate by the Palestinian Central Bureau of Statistics (PCBS) in 2012, the population of Jericho City was about 20,253 people. The number of households was estimated to be 3,510, living in 3,386 buildings. The population density in Jericho City was estimated to be 1,055 inhabitants/km². In the refugee camps in Jericho, the population density ranges between 3,763 and 4,190 inhabitants/km². According to these statistics, Jericho City needs to expand by between 2,000 and 7,000 acres.
According to the PCBS statistics, there are nearly 3,386 buildings in Jericho City, containing 4,549 units. Around 84.4% of these buildings are owned by their occupants. The city includes the city center, the old residential area, areas of moderate-density expansion, urban areas, newly developed suburbs, and refugee camps. These area types differ from one another in purpose, number of housing units, nature of the buildings, and population density. Many housing projects are in progress in the city. The position of Jericho City in the West Bank is shown in Figure 1.4.

Figure 1.4: Position of Jericho City in the West Bank
Source: The Palestinian Central Bureau of Statistics (2010)

1.5 Thesis Outline

This thesis is divided into six chapters. Chapter One presents a general background, the problem of the study, the objectives of the study, the study area, and the thesis outline. Chapter Two is concerned with the literature review. Chapter Three discusses the methodology, while Chapter Four describes the field survey and data collection. Chapter Five addresses data analysis and results. Finally, Chapter Six presents the main conclusions and recommendations of the study.

Chapter Two
Literature Review

2.1 Overview of Trip Generation

Trip generation analysis involves estimating the total number of trips entering or leaving a parcel of land as a function of the socio-economic, locational, and land use characteristics of the parcel. The function of trip generation analysis is to establish meaningful relationships between land use and trip making activity, so that changes in land use can be used to predict subsequent changes in transport demand (Paquette and Ashford, 1982).

2.2 Literature Review of Trip Generation Variables

This section reviews the literature on the explanatory variables included in trip generation models in studies conducted in both developed and developing countries.
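As a rough illustration of the kind of relationship described in Section 2.1, the sketch below fits a household-level linear model by ordinary least squares and reports its R-squared. It is only a sketch under stated assumptions: the household records are made up, the choice of two explanatory variables (household size and number of employed persons) is illustrative, and the helper names `fit_ols` and `r_squared` are hypothetical. It does not reproduce the SPSS analysis reported in the appendices of this thesis.

```python
def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one tuple of explanatory values per household; y: observed trips.
    Returns [intercept, b1, b2, ...]. Assumes X'X is non-singular.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def r_squared(rows, y, coef):
    """Share of the variation in y explained by the fitted model."""
    pred = [coef[0] + sum(b * v for b, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up households: (household size, number of employed persons)
households = [(2, 1), (4, 1), (5, 2), (3, 2), (6, 3), (4, 0), (7, 2), (3, 1)]
daily_trips = [4, 6, 8, 6, 11, 3, 9, 5]  # made-up daily trips per household

coef = fit_ols(households, daily_trips)
print("intercept and coefficients:", [round(b, 3) for b in coef])
print("R-squared:", round(r_squared(households, daily_trips, coef), 3))
```

Each coefficient is read as the change in daily household trips associated with a one-unit change in that variable while the other is held constant, and R-squared is the share of variation explained, which is the sense in which the studies reviewed in this chapter describe a variable as having more or less explanatory power.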
Several research papers have studied the variables that affect trip generation and found significant relationships between trip generation and the characteristics mentioned above. Most of the empirical research has been conducted in the US and other developed countries. However, some empirical research has also been conducted in developing countries.

2.2.1 In the Developed Countries

Hunt and Broadstock (2010) constructed a trip generation model for a cross section of residential developments around the UK. The empirical model tested whether trip making patterns for residential developments are independent of car ownership. The result was that trip generation depends on car ownership, socio-economic factors, and site-specific characteristics, in particular land-zone type. However, public transport services were not found to have a significant relationship with trip generation. Consequently, a policy implication of the results was that increasing bus services to residential developments is not associated with a reduction in generated trips.

Giuliano (2003), Giuliano and Dargay (2006), and Giuliano and Narayan (2003) found significant differences in travel behavior between different demographic groups in the USA and the UK. Their data showed that American participants made 4.4 trips per day, travelling approximately 31 miles (49.6 km), whereas the British participants travelled only 16 miles (25.6 km) in 3 trips per day. The method of data collection, however, varied between the US and the British studies. The US data were collected by telephone using a stratified sample, with participants using a one-day 'recall' diary. British participants were selected using a stratified random sample based on post code, were interviewed directly, and were required to complete a seven-day travel diary.
Thus, both the sampling and the data collection methods differed between the two groups. The authors found that participants aged 65 years or older in the UK travelled half the distance of, and were less likely to travel on any given day than, participants aged 18 to 64 years. In the USA study, participants aged 65 years or more travelled 60% of the distance of the younger participants. The authors also suggested that lower household incomes in the UK compared with the USA produced lower travel demand and car ownership. Gender, age, and household income were all found to influence travel behavior, and there were significant differences between travel behavior in the USA and the UK. It was also found that the significantly higher transport costs in the UK led to fewer trips.

Newbold et al. (2005) studied the travel behavior of Canadians aged 65 years or more to determine whether their travel patterns differed from those of younger Canadians. Their study used data from the General Social Survey (GSS) of Canada. The data from approximately 19,000 participants partially confirmed the research hypothesis, while the authors recognized that factors other than age can influence travel behavior. The results showed that older Canadians make fewer daily trips than younger Canadians, but, as expected, this could be because the older participants were no longer employed and hence no longer making journeys to work. Thus, daily trip numbers and duration decreased significantly due to changes in employment and health status. In addition, there was a greater reliance on the car and a significant reduction in the use of public transport as the principal travel mode compared with younger Canadians.

Best and Lanzendorf (2005) conducted an empirical study in Cologne, Germany, to determine whether there were gender differences in car use and travel patterns for maintenance travel.
Overall, the authors found no significant differences between men and women in the total number of trips or distances travelled. However, the type or destination of trips did show some gender differences. The authors found that women made fewer journeys to work by car and more journeys for non-work activities such as shopping and child care. This was also confirmed by Boarnet and Sarmiento (1998) in their study of travel behavior in southern California.

Moriarty and Honnery (2005) studied urban travel in all Australian state capital cities. Although the major emphasis was on the relationship between the distance from place of residence to the Central Business District (CBD) of each city and its impact on travel behavior, their study also found that women on average travel less often and for shorter distances than men.

Olaru et al. (2005) studied travel behavior in the Sydney Metropolitan area, Australia, and found that a number of socio-demographic variables influenced travel behavior. Women were more likely than men to travel closer to home, particularly if they came from a non-English-speaking household.

Simma and Axhausen (2004) conducted a study to explore the impacts of personal characteristics and spatial structure on travel behavior, especially mode choice, in Upper Austria. The spatial structure is described by, among other things, accessibility measures. The models were estimated using structural equation modeling (SEM) and were based on the 1992 Upper Austrian Travel Survey and the Upper Austrian Transport Model.

The results highlighted the key roles of car ownership, gender, and work status in explaining the observed level and intensity of travel. The most important spatial variable was the number of facilities that can be reached by a household. The municipality-based variables and the accessibility measures had rather little explanatory power. The reasons for this low explanatory power were considered.
Although the findings in this study indicated that the spatial structure is not a decisive determinant of traffic, the results provided useful hints for possible policy alternatives.

Polk (2004) tested the influence of gender on daily car use and on willingness to reduce car use in Sweden. Car use was modeled in terms of practical factors combined with manifestations of the specific influence of gender. Willingness to reduce car use was modeled in terms of attitudinal factors using a theory of environmentalism. The results confirmed the existence of a gender component. The author found a significant relationship between sustainable travel patterns and gender. Women were more willing than men to reduce their use of the car, more positive towards reducing the environmental impact of travel modes, and more positive towards ecological issues. The concluding discussion suggested that more research is needed to further the theoretical understanding of, and methodological expertise in, how gender can be modeled in travel research, in order to attain current policy goals regarding a gender-equal transportation system.

Georggi and Pendyala (2001) conducted a detailed analysis of long-distance travel behavior in the USA for two key socio-economic groups of the population: the elderly and the low income. The analysis utilized data from the 1995 American Travel Survey, which provided a rich source of information on long-distance travel undertaken over a period of 12 months. The analysis focused on comparing the elderly and the low-income groups against other groups with respect to various demographic and trip characteristics. The travel behavior comparison included an analysis by trip purpose, travel mode, distance, trip duration, and trip frequency. In addition, regression models of long-distance trip generation were estimated separately for different groups to examine differences in trip generation propensity across the groups.
The results showed that both the elderly and the low-income groups undertake significantly fewer long-distance trips than other socio-economic groups. It was found that nearly half of the low-income and elderly respondents made no long-distance trips in the one-year survey period. In addition, long-distance trips made by these groups were more likely to be undertaken by bus and geared towards social and personal business activities.

Badoe and Steuart (1997) estimated the total number of household shopping trips for the Greater Toronto Metropolitan area, Canada, using variables such as household size, number of workers, number of licensed persons, and number of vehicles. Regression analysis was applied to 1964 and 1986 data for the area. The different model specifications provided little explanation of the variation in household shopping trips, and household size was found to have little explanatory power. The authors concluded that different approaches are needed to explain the variation in non-work trips, including shopping trips.

Golob (1989) undertook a study in the Netherlands to model the causal relationships, at the household level, among income, car ownership, and trip generation. The study was based on data obtained from the Dutch National Mobility Panel, which consisted of approximately 1,800 households, stratified by life cycle. The author found that car ownership directly affects public transport trip making, with additional effects from income. The author also stated that there are direct links from the lowest and highest income categories to public transport demand.

Vickerman and Barmby (1984) analyzed the relationship between shopping expenditure and shopping trips. The data used in their study were collected as part of an earlier research project on shopping travel. The study sample consisted of 1,074 households in 25 districts in the County of Sussex, England.
The authors estimated the number of weekly shopping trips and the total weekly shopping expenditures using simultaneous equations. The explanatory variables included household size, income, and auto ownership. It was found that income has little effect on trip making, but that accessibility and the cost of travel are important factors.

Stopher and McDonald (1983) conducted a trip generation analysis on data from the Midwest, USA, using multiple classification analysis (MCA) rather than linear regression analysis. The household-structure variable was tested using both analysis of variance and MCA to determine how well it performs in various model structures when compared with other variables. The variables tested were the number of cars or vehicles available to the household, household size, housing type, total number of employed persons, household income, and total number of licensed drivers. The analysis concluded that the household-structure variable did not perform significantly better than the other variables tested.

Supernak et al. (1983) presented a person-category model of trip generation as an alternative to household-based trip generation models in Washington, D.C., USA. In their model, a homogeneous group of persons was used as the unit of analysis. The variables of age, employment status, and automobile availability were found to be the most significant descriptors of a person's mobility. The final version of the model was based on eight person categories.

Oldfield (1981) analyzed the effect of income on bus travel in the UK. The study was based on Family Expenditure Surveys and the National Travel Survey and considered mainly non-car-owning households. The author examined cross-sectional data on bus travel as a function of household income, and the way in which it changed over time, with the intention of seeing whether demand elasticity with respect to earnings can be estimated.
Unfortunately, no consistent pattern emerged from the analysis that would allow demand elasticity with respect to earnings to be fixed with precision.

Downes et al. (1978) conducted a study on household and person trip generation models. The authors used a single data bank, comprising over 60,000 trips from sample household surveys in the UK, to compare two alternative types of trip generation models: one based on household trip rates and the other on person trip rates for each household. Their performance was found to be similar, each accounting for over 50% of the variability in household trip rates, but the person trip rate model was shown to be simpler to use and statistically more acceptable. The most important variables for modeling home-based trips were household size and car ownership in both types of models. Work trips required only household employment in a household rate model and car ownership in an employed person rate model. Household location and the year of study had a small but discernible effect on trip rates due to some reduction in the inner and middle area rates between the two years.

Robinson and Vickerman (1976) were concerned with the development of a basic methodology for the future study of shopping. The authors demonstrated how the existing methodology failed to allow for some of the most important dimensions of the shopping choice decision. A revised model was developed and tested against specially collected data from a cross-section of households in Sussex, South East England. The authors demonstrated the importance of attraction and accessibility in determining variations in levels of household shopping activity.

2.2.2 In the Developing Countries

Moussa (2013) conducted a study to develop a trip generation model for Gaza City, Palestine, to determine the household travel characteristics pattern in the study area.
The study also aimed to compare trip rates as modeled by the Conventional Cross Classification (CCA) method with those of the Multiple Cross Classification (MCA) method in Gaza City. Furthermore, the researcher aimed to develop a trip attraction model using the Multiple Linear Regression (MLR) technique. A household interview survey was conducted to collect the primary data. The survey was distributed to 425 households in different districts of Gaza City. The results indicated that vehicle ownership, household size, income level, and the number of licensed drivers in the household are the main factors that affected trip production in Gaza City. In addition, the results showed that the Multiple Cross Classification (MCA) models are more effective in expressing trip rates for trip production than the Conventional Cross Classification (CCA) models.

Sofia et al. (2012) developed a relationship between daily household trips and socio-economic characteristics for Al-Diwaniyah City, Iraq. The authors used the stepwise regression technique (multiple linear regression) after the collected data had been entered into the SPSS software. The city was divided into 5 sectors with 70 zones covering an area of 52 km². Home questionnaire forms were distributed through arrangements with the secondary, industrial, and commercial school administrations, and some colleges. The results showed that the trip generation model mainly depends on family size, gender, the number of workers, and the number of students in the family.

Sarsam and Al-Hassani (2011) developed statistical models to predict trip volumes for a target year in the Al-Karkh side of Baghdad City, Iraq. Non-motorized trips were considered in the modeling process. The traditional method of forecasting the trip generation volume according to trip rate, based on family type, was used in the study. Families were classified by three characteristics: social class, income, and number of vehicles owned.
The study area was divided into 10 sectors. Each sector was subdivided into a number of zones. The total number of zones was 45, based on the administrative divisions. The trip rate for the family was determined by sampling. A questionnaire was designed and interviews were conducted for data collection from the selected zones. Two techniques were used: full interview and home questionnaire. The questionnaire forms were distributed in many educational institutes, including intermediate, secondary, and commercial schools. The developed models were total trips per household, work trips per household, education trips per household, shopping trips per household, and social trips per household. These models were developed using the stepwise regression technique after the collected data had been entered into the SPSS software.

The results showed that total trips per household are related to family size and to structure variables such as the number of persons above 6 years of age, the number of males, the total number of workers, the total number of students in the household, and the number of private vehicles. The model had a coefficient of determination of 0.67 for the whole study area. The results also showed that home-based work trips are related to the number of workers in the household, the number of male workers in the household, the number of female workers in the household, and the number of persons between 25 and 60 years of age. This model had a coefficient of determination of 0.82 for the whole study area. Home-based education trips are strongly related to the number of students in the household, and this model had a coefficient of determination of 0.90 for the whole study area.

Priyanto and Friandi (2010) developed a trip generation model for public transport passengers in Yogyakarta, Indonesia, using multiple linear regression analysis. The authors established a relationship between the number of trips and socio-economic attributes.
The data consisted of primary and secondary data. Primary data were collected through surveys of randomly selected households. The resulting model showed that public transportation trips are negatively correlated with income, motorcycle ownership, and car ownership. This means that the number of public transport trips decreases as income, the number of motorcycles owned, and the number of cars owned increase. This differs from the general trip generation model (the model for all trips by either private or public transportation), where the number of trips commonly rises with increases in income, motorcycle ownership, and car ownership. The model also showed that the number of public transport trips increases as family size increases: the higher the number of family members, the more public transport trips will be made.

Pettersson and Schmocker (2010) conducted a study to analyze travel patterns of those aged 60 or over in Metro Manila, the Philippines. Trip frequency and tour complexity were analyzed with ordered probit regression, separating the effects of socio-demographic characteristics from those of land-use patterns. The results were compared with observations made for cities in developed countries, in particular London. The authors showed that there is a pronounced decrease in total trips made with increasing age in Manila. However, when analyzing specific trip purposes, the authors found, similarly to trends in developed countries, that the number of recreational trips is fairly constant across all age groups.

Said (1990) conducted an empirical study to estimate work trip rates for households in Kuwait using a generalized linear model (GLM). Seven different household groups were identified from the 1985 census. One of these groups, Kuwaiti households living in villas, was used for some illustrative GLM analysis.
The study showed that work trip rates of this household group are influenced by car ownership, household size, and the interactive effect of these two variables.
[Footnote: http://www.sciencedirect.com/science/article/pii/019126079090056C]

Deaton (1985) undertook an empirical analysis of income and trip making for developing countries and cities including Hong Kong, Sri Lanka, Tunisia, Thailand, and several suburbs of Delhi, India. In this paper, the author illustrated that data from household surveys in developing countries and cities could be used for analyzing the relationship between household income/expenditure and trip characteristics. The author also suggested that the proportion of households spending anything on travel tended to increase with income in all surveys. The author also mentioned that, conditional on vehicle ownership, both private and public transport demands were income elastic.

Maunder (1981) undertook extensive household surveys in two communities of different income levels in Delhi, India. The two study areas were Shakarpur and West Patel Nagar. Shakarpur was a low/middle income unauthorized residential community, while West Patel Nagar consisted of both middle and upper income residents. Although the average monthly household income in West Patel Nagar was 50% higher than that in Shakarpur, the trip rate in the two communities was similar. However, it is worth noting that both communities had a large proportion of households in possession of bicycles (30% in West Patel Nagar and 48% in Shakarpur). Also, in Shakarpur, the bicycle was the major personal vehicle.

Heraty (1980), with the cooperation of the Jamaican Government, carried out a study on public transport in Kingston, Jamaica, and its relation to low-income households. In the study, the author analyzed the role public transport played in the lifestyle of low-income households, using Jamaican Government survey data and in-depth research with low-income households.
The author concluded that household expenditures on travel increased with income. However, the percentage of total household income spent on travel was greater for low-income families. The author also mentioned that car ownership was strongly related to income, so that dependence on public transport was greater for the lower-income group.

2.3 Summary of Literature Review

To summarize, a range of socio-demographic variables influences trip generation. These variables include gender, age, number of children, household income, employment status, number of workers in the household, retirement status, education status, vehicle ownership, family size, and type of house.

In terms of the methods used in developing trip generation models, the primary tool is regression analysis. Two types of regression are commonly used. The first uses data aggregated at the zonal level, with the average number of trips per household in the zone as the dependent variable and average zonal characteristics as the explanatory variables. The second uses disaggregated data at the household or individual level, with the number of trips made by a household or individual as the dependent variable and the household and personal characteristics as the explanatory variables.

Chapter Three
Analytical Framework and Methodology

3.1 General Steps of the Methodology

The research will mainly rely on the following methodology:

1. Desk and Internet Research: This step is done by reviewing the related literature, including publications on trip generation analysis using different methods, among them the linear regression method.

2. Study Area Selection: This involves selecting the boundaries of the study area and dividing the city into traffic zones according to set criteria, benefiting from maps made available by the Municipality of Jericho or other relevant agencies.

3. Selecting Sample Size and Designing Household Questionnaire: This involves identifying the needed information and designing a questionnaire for proper collection of relevant socio-economic variables and trip data.

4. Collecting the Required Data: This is related to field data collection from the sample of households in the different traffic zones. The required data include the total number of trips made by household members and the number of trips made by the household for each category of trip production, whether according to trip purpose or trip time, collected through a home interview survey.

5. Analyzing Available Data: This is performed using relevant computer programs.

6. Building Models: This includes estimation of feasible models for the total number of trips produced by the household, and for the total number of trips in each category of home-based trips (either according to trip purpose or trip time), utilizing the linear regression method.

7. Selection of Appropriate Models: This is done using the available statistical and rational methods to choose the most appropriate model for predicting the total number of trips produced by the household, and the models for predicting the number of trips in each category of home-based trips.

8. Verification of Results: This is done by comparing the outputs of the models with the actual numbers in order to verify these models and to calibrate the variables used in model building.

The details of the steps of the methodology are shown in the following sections and chapters.

3.2 Methods of Survey

Once the questionnaire is ready, the next step is to conduct the actual survey. There are various types of survey methods, and each has its own advantages and disadvantages. Before choosing the proper survey method, there is a need to know the characteristics of the people who will be surveyed, the sample size, the cost, the expected response rate, and the size of the study area.
For example, if the target population is composed of college students, the online survey method would be preferable. In terms of sample size, the number of respondents in the sample should be considered when choosing the survey method. Online surveys or mail-back surveys are best for surveys requiring hundreds or thousands of responses. On the other hand, telephone surveys are ideal for studies requiring a limited number of responses. If the budget is limited or the study area is large, online or mail-back surveys are preferable. For better response rates, the personal interview survey is preferable. The most common methods of survey are discussed below.

3.2.1 Personal Interview Surveys

In this method, also known as a face-to-face survey, the researcher visits the home of the respondent, asks the questions, and fills in the questionnaire personally. This method is considered to be the most effective one. The purpose of conducting a personal interview survey is to explore the responses of the people in order to gather more and deeper information (Sincero, 2012).

There are two types of personal interview surveys according to how the interviewer approaches the respondents: intercept and door-to-door interviews. In an intercept interview, the interviewer usually conducts a short but concise survey by drawing the sample from public places such as malls, theaters, food courts, or tourist spots. On the other hand, a door-to-door interview survey involves going directly to the house of the respondent and conducting the interview either on the spot or at a scheduled date (Sincero, 2012).

Personal interview surveys have many advantages, including a high response rate, more precise data, and better observation of respondents' behavior. However, personal interview surveys are more expensive than other types of surveys and more time-consuming, because there is a need to travel and meet the respondents at different locations.
3.2.2 Telephone Interviews

Conducting interviews by telephone may offer potential advantages associated with access, speed, and lower cost. This method allows the researcher to make contact with participants with whom it would be impractical to conduct a face-to-face interview because of the distance, the prohibitive costs involved, and the time required. Even where "long-distance" access is not an issue, conducting interviews by telephone may still offer advantages associated with speed of data collection and lower cost. In other words, this approach may be seen as more convenient.

However, there are a number of significant issues that weigh against attempting to collect qualitative data by telephone. Seeking to conduct qualitative interviews by telephone may lead to reduced reliability, where the participants are less willing to engage in an exploratory discussion, or even to a refusal to take part. There are also some other practical issues that would need to be managed. These relate to the researcher's ability to control the pace of a telephone interview and to record any data that are forthcoming. Conducting an interview by telephone while taking notes is an extremely difficult process, so audio recording is recommended.

In addition, with telephone interviews the researcher loses the opportunity to witness the non-verbal behavior of participants, which may adversely affect the interpretation of how far to pursue a particular line of questioning. Participants may be less willing to give the researcher as much time to talk to them as in a face-to-face interview. The researcher may also encounter difficulties in developing more complex questions in comparison with a face-to-face interview situation.
[Footnote: http://explorable.com/types-of-survey]

For the above-mentioned reasons, interviewing by telephone is likely to be appropriate only in particular circumstances.
It may be appropriate to conduct a short follow-up telephone interview to clarify the meaning of some data, where the researcher has already undertaken a face-to-face interview with a participant. It may also be appropriate where access would otherwise be prohibited because of long distance.

3.2.3 Mail-Back Surveys

The researcher sends the questionnaire to the respondents and asks them to fill in the details and mail it back with the required information. Care should be taken to design the questionnaire so that it is self-explanatory.

Mail-back surveys save the researcher's time. In addition, this method is considered a quick process of collecting data. It involves minimal cost in comparison with other methods of survey such as face-to-face surveys.

The main disadvantage of mail-back surveys is that they need more effort in questionnaire design, because the questionnaire must be clear and concise enough to encourage every respondent to reply. This method also has a low response rate and generates less precise data in comparison with other methods of survey.

3.2.4 Online Surveys

For the past few years, the Internet has been used by many companies in conducting all sorts of studies all over the world. Whether it is market or scientific research, the online survey has been a faster way of collecting data from respondents than other survey methods such as personal interviews (Sincero, 2012).

Advantages of online surveys include ease of data gathering, minimal cost, automation in data input and handling, an increase in response rate, and flexibility of questionnaire design. However, online surveys have some disadvantages, such as the absence of an interviewer, the inability to reach some target groups of the population, and survey fraud.

3.2.5 Data Collection Method Used

The method of survey used in this research to collect data is the personal (door-to-door) interview. This method is used because it generates the highest response rate and the most precise data.
In addition, the study area is not very large and the target population comprises all kinds of people.

3.3 Sample Size Calculation Methods

Unfortunately, there is no single straightforward and objective answer to the question of how to calculate the sample size. Determining the sample size is a problem of trade-offs (Ortuzar and Willumsen, 1996):

1. A sample that is much too large may imply a data collection and analysis process that is too expensive given the study objective and its required degree of accuracy.

2. A sample that is far too small may imply results that are subject to an unacceptably high degree of variability, reducing the value of the whole exercise.

[Footnotes: http://explorable.com/research-methodology ; http://www.wisegeek.com/what-are-online-surveys.htm ; http://explorable.com/selecting-the-survey-method]

Therefore, between these two extremes lies the most efficient (in cost terms) sample size for the given study objective. Two major methods of sample size calculation are usually used, which are presented below.

3.3.1 Standards of Bureau of Public Roads (BPR)

The sample size can be determined on the basis of the population of the study area, using the standards given by the USA Bureau of Public Roads (BPR), as shown in Table 3.1.

Table 3.1: Standards of Bureau of Public Roads (BPR) for Sample Size Calculation

Population of Study Area     Sample Size (Dwelling Units)
                             Recommended          Minimum
Under 50,000                 1 in 5  (20%)        1 in 10  (10%)
50,000 – 150,000             1 in 8  (12.5%)      1 in 20  (5%)
150,000 – 300,000            1 in 10 (10%)        1 in 35  (2.86%)
300,000 – 500,000            1 in 15 (6.67%)      1 in 50  (2%)
500,000 – 1,000,000          1 in 20 (5%)         1 in 70  (1.43%)
Over 1,000,000               1 in 25 (4%)         1 in 100 (1%)

Source: U.S. Bureau of Public Roads, 1967

3.3.2 Sample Size Statistical Formulas

The sample size can also be calculated according to statistical formulas. These formulas can be found in most statistics textbooks, especially those on descriptive statistics dealing with probability.
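As a rough illustration of how such statistical formulas are applied, the calculation can be sketched in Python. The sketch assumes the commonly used form n = Z²P(1 − P)/C² for an infinite population, followed by the finite-population correction n / (1 + (n − 1) / population). The function names and the example inputs (95% confidence, P = 0.5, C = 0.05, and a population of 4,000 households) are hypothetical and are not taken from this study.

```python
import math
from statistics import NormalDist


def sample_size_infinite(confidence, p, c):
    """Sample size for an infinite population: n = Z^2 * P * (1 - P) / C^2."""
    alpha = 1.0 - confidence
    # Two-sided standard normal statistic for the chosen confidence level.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z * p * (1.0 - p) / (c * c)


def sample_size_finite(n_infinite, population):
    """Finite-population correction: n / (1 + (n - 1) / population)."""
    return n_infinite / (1.0 + (n_infinite - 1.0) / population)


# Hypothetical inputs, not taken from the Jericho survey.
n0 = sample_size_infinite(confidence=0.95, p=0.5, c=0.05)
n_households = sample_size_finite(n0, population=4000)

# Round up, since a fraction of a household cannot be interviewed.
print(math.ceil(n0), math.ceil(n_households))
```

Choosing P = 0.5 is the conservative option, because P(1 − P) is largest at that value and therefore yields the largest required sample.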
The sample size for an infinite population can be calculated as:

$$ n = \frac{Z^2 \, P \, (1 - P)}{C^2} $$

Where:
n: Sample size required.
Z: Standard normal statistic corresponding to the (1 − α) confidence level.
α: Fraction of area under the normal curve representing events not within the confidence level (thus, 1 − α is the desired level of confidence).
P: Percentage of population picking a choice, expressed as a decimal.
C: Confidence interval or tolerable margin of error, expressed as a decimal.

The sample size for a finite population is calculated as follows:

$$ n_{new} = \frac{n}{1 + \dfrac{n - 1}{P}} $$

Where:
n: Old sample size.
P: Population of the study area (here P denotes the population, not the proportion defined above).

Therefore, the sample size is calculated using the infinite population formula first. Then the sample size derived from that calculation is used to calculate a sample size for a finite population.

3.4 Overview of Linear Regression Method

There are several methods used in relating trips produced or attracted to the causal factors, which include the regression method and the cross-classification method. The linear regression method assumes that observations of the magnitude of the dependent variable, Y (number of trips), can be obtained for n observations of the explanatory variables (Xs), and that an equation of the form

$$ Y = \beta_0 + \beta_1 X + u $$

is to be fitted to the data. This equation is for the case of a single independent variable. For m independent variables, the equation becomes:

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m + u $$

Estimating the best regression line, judged by the coefficient of determination (R²), the t-test for the parameters, and the F-test, can be done using computer programs such as Excel, SPSS, or XLSTAT.

3.4.1 The Linear Regression Analysis Process

Regression equations can be developed through the following sequence of steps (Hutchinson, 1974):

1. Examining the relationships between the dependent variable and each of the independent variables in order to detect nonlinearities. If nonlinearities are detected, the relationship must be linearized by transforming the dependent variable, the independent variable, or both.

2. Developing the inter-correlation matrix involving all the independent variables.

3. Examining the simple correlation matrix in order to detect potential sources of multicollinearity between pairs of independent variables.

4. After examining the correlation matrix, if there is multicollinearity between two independent variables (a correlation coefficient close to one), then one of them must be eliminated from the regression process.

5. After selecting the independent variables to be retained, several candidate regression equations are proposed, and then the parameters of each of the potential regression equations are estimated.

6. For every model built, relevant tests are conducted to assess the goodness of the model based on logic and statistical testing. The statistical tests include the coefficient of determination (R²), the t-test for the parameters of the models, and the F-test for each model. In addition, there are logical aspects that must be taken into consideration, because the model must be valid both statistically and logically.

7. After all the models or equations are examined based on the test results, the outcome is summarized. The best model is then chosen.

3.4.2 Regression Model Building Approaches

The most commonly used approaches for selecting the explanatory variables to be included in the regression model are the forward selection approach and the backward elimination approach.

The forward selection approach starts with a model that contains only the constant term. At each step, the explanatory variable that results in the largest improvement in overall fit is added to the model. This process is repeated until no remaining variable results in a significant increase in overall fit.

The backward elimination approach starts with a regression model that contains all the explanatory variables. At each step, the explanatory variable whose removal changes the overall fit least is removed.
The removal process is repeated until the removal of any variable would result in a significant change in overall fit. When deciding which explanatory variable should be removed from the regression model, the t-test is usually employed. Generally speaking, the explanatory variable with the smallest absolute t-value is selected for removal. In this research, the backward elimination approach is employed for building the trip generation models.

3.5 Unit of Analysis

Variables can generally be inferred from the unit of analysis. For travel modeling, two units are normally considered: the household and the individual. The household unit is preferred for various reasons: from a trip-making point of view, the home is the base where most trips start and end; from an economic point of view, income and car ownership are usually shared by all members of the household; and from a social point of view, the family constitutes the "cell society" where all basic needs are usually met.

Alternatively, if the individual is considered to be the base unit, then the problem of allocating some of the above-mentioned variables needs to be overcome, or different quantities have to be taken into consideration. Therefore, the household is used as the unit of analysis in this research.

3.6 Data Analysis Software

There are many statistical software packages that can be used for data analysis. The Statistical Package for the Social Sciences (SPSS) is one of the most widely used. The SPSS program provides a wide range of procedures and tests used in statistics. Moreover, it offers descriptive statistics such as frequencies, means, and correlations. Finally, it is useful for making charts. In this research, the SPSS software will be used to estimate the trip generation models using the linear regression analysis method.

3.7 Model Specification

Multiple regression analysis is one of the popular forms of model structure that can be applied to trip generation models.
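To make the backward elimination approach of Section 3.4.2 concrete for a multiple regression model of this kind, a minimal sketch is given below. It is written in plain Python purely for illustration, whereas SPSS is the software actually used in this research. The variable names, the data, and the cut-off |t| < 2.0 (roughly a 5% two-sided test for large samples) are illustrative assumptions.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(X, y):
    """Ordinary least squares; returns coefficients and t-values (constant first)."""
    rows = [[1.0] + list(r) for r in X]
    n, p = len(rows), len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    s2 = sse / (n - p)  # residual variance
    t_values = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        inv_jj = solve(XtX, unit)[j]  # j-th diagonal element of (X'X)^-1
        t_values.append(beta[j] / math.sqrt(s2 * inv_jj))
    return beta, t_values


def backward_elimination(X, y, names, t_min=2.0):
    """Drop the variable with the smallest |t| until all remaining |t| >= t_min."""
    keep = list(range(len(names)))
    while keep:
        subset = [[row[j] for j in keep] for row in X]
        _, t_values = fit(subset, y)
        abs_t = [abs(t) for t in t_values[1:]]  # the constant is always retained
        worst = min(range(len(keep)), key=lambda i: abs_t[i])
        if abs_t[worst] >= t_min:
            break
        keep.pop(worst)
    return [names[j] for j in keep]


# Hypothetical households: [household size, house-type dummy] and daily trips.
X = [[1, 1], [2, 0], [3, 0], [4, 1], [5, 1], [6, 0], [7, 0], [8, 1]]
y = [3.5, 4.5, 7.5, 8.5, 11.5, 12.5, 15.5, 16.5]
selected = backward_elimination(X, y, ["household_size", "house_type"])
print(selected)
```

In this made-up data set the trips depend on household size only, so the house-type dummy is the variable removed. A fixed cut-off is a simplification; a statistical package would normally compare each t-value against the critical value for the actual degrees of freedom.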
Consequently, multiple regression equations will be used for developing the trip generation models for this study. The most common form of trip generation model is the linear function of the form

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m + u $$

Where Y is the dependent variable (number of trips), $\beta_0$ is the constant term, $X_1, X_2, \dots, X_m$ are the explanatory variables, $\beta_1, \beta_2, \dots, \beta_m$ are the partial slope coefficients of the regression equation, and u is an error term that is assumed to be a random variable. The coefficients of the regression equation can be obtained by performing regression analysis using statistical analysis software. The above equation is called the multiple linear regression equation.

3.8 Models Estimation

After reviewing the related literature, the trip generation models will be estimated. The models will be estimated using the multiple linear regression technique by regressing the dependent variable on the explanatory variables.

In this research, the trip generation models are divided into three categories. The first includes a general trip generation model. The second includes trip generation models according to trip purpose. The final category includes trip generation models according to trip time.

3.8.1 General Trip Generation Model

The general trip generation model will be estimated according to the following linear regression equation:

$$ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_m X_m + u $$

Where Y is the total number of trips produced by the household.

3.8.2 Trip Generation Models by Trip Purpose

The work trip generation model, the education trip generation model, the shopping trip generation model, the social trip generation model, and the recreational trip generation model will be estimated according to the general form of the linear regression equation:

$$ Y_i = \beta_{0i} + \beta_{1i} X_1 + \beta_{2i} X_2 + \dots + \beta_{mi} X_m + u_i $$

Where i = 1 to 5, covering the five trip purposes.

3.8.3 Temporal Trip Generation Models

Five temporal trip generation models will be estimated.
These models cover trips made before 8 AM, between 8 and 9 AM, between 9 AM and 12 PM, between 12 and 4 PM, and after 4 PM. The trip generation models according to timing will be estimated according to the following equation:

$$Y_k = \beta_{0k} + \beta_{1k} X_1 + \beta_{2k} X_2 + \dots + \beta_{nk} X_n + u_k$$

where k = 6 to 10, covering the five periods under consideration.

3.9 Statistical Tests

The most common statistical tests that are used in the model selection process are discussed below.

3.9.1 Correlation Matrix and VIF: Testing for Multicollinearity

The problem of multicollinearity arises when two or more explanatory variables included in the regression model are linearly related to one another. There are two types of multicollinearity: exact (perfect) multicollinearity and inexact (imperfect) multicollinearity. If two or more independent variables have an exact linear relationship, then we have exact or perfect multicollinearity. Inexact multicollinearity, on the other hand, occurs when two or more independent variables are highly, but not perfectly, correlated.

In the case of exact multicollinearity, there is no unique solution to the normal equations derived from the least squares principle. As a result, the regression coefficients cannot be estimated. By contrast, when explanatory variables have less than exact linear relationships, the normal equations can usually be solved to yield unique estimates. However, there are some consequences. First, the estimators of the regression coefficients are still best linear unbiased estimators (BLUE). Second, it is more difficult to precisely identify the separate effects of the correlated explanatory variables. Finally, variances and standard errors are usually higher, making t-statistics lower and possibly insignificant.

Analysis often relies on what is called the Variance Inflation Factor (VIF) to detect multicollinearity more formally.
The VIF shows how the variance of an estimator is inflated by the presence of multicollinearity. It is computed as $VIF_j = 1 / (1 - R_j^2)$, where $R_j^2$ is the coefficient of determination obtained by regressing explanatory variable j on the remaining explanatory variables in the model. As $R_j^2$ increases toward one, the VIF also increases. The larger the value of the VIF, the greater the degree of multicollinearity of one explanatory variable with the other explanatory variables. As a rule of thumb, if the VIF of a variable exceeds 10, that variable is said to be highly collinear (Kleinbaum et al., 1988).

3.9.2 R-Squared: Goodness of Fit

The R-squared ($R^2$), also known as the coefficient of determination, measures the goodness of fit of the regression model. It measures the proportion of the total variation in the dependent variable that can be explained by the explanatory variables included in the model. The value of R-squared lies between 0 and 1. A value closer to 1 indicates that the model has a good fit, whereas a value closer to 0 indicates a poor fit. However, there is no standard for how high an $R^2$ value is "good" enough; it depends on the application. The R-squared is calculated as follows:

$$R^2 = \frac{SS_{\text{Regression}}}{SS_{\text{Total}}} = 1 - \frac{SS_{\text{Residual}}}{SS_{\text{Total}}}$$

The ANOVA test results are used to show the analysis of the total variance in the dependent variable. This variance is divided into two sources: variance due to regression and variance due to errors.

3.9.3 F-Test: Testing Overall Significance of Model

The F-statistic is used to test whether the regression coefficients are jointly equal to zero or not. In other words, the F-test is used to test the overall significance of the regression model. The null hypothesis is that the regression coefficients for the explanatory variables are all equal to zero. The alternative hypothesis is that at least one of these coefficients is not equal to zero. Usually, the F-value is judged at the 95% confidence level (a 5% level of significance). Table 3.2 shows the format of the ANOVA test results table.
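The measures described in Sections 3.9.1 to 3.9.3 can be illustrated with a small sketch. The example below is only a minimal pure-Python illustration, not the SPSS procedure used in this research: the data for eight households are hypothetical, and the helper names (`solve`, `ols_fitted`, `r_squared_and_f`, `vif`) are assumptions introduced for the example. It fits a two-variable model by least squares, derives R-squared and the F-statistic from the regression, residual, and total sums of squares, and computes the VIF of each explanatory variable.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_fitted(columns, y):
    """Fit y on the given columns plus an intercept; return fitted values."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    return [sum(b * v for b, v in zip(beta, r)) for r in rows]


def r_squared_and_f(y, fitted, n_predictors):
    """Return (R-squared, F-statistic) from the ANOVA sums of squares."""
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_residual = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_regression = ss_total - ss_residual
    ms_regression = ss_regression / n_predictors
    ms_residual = ss_residual / (len(y) - n_predictors - 1)
    return ss_regression / ss_total, ms_regression / ms_residual


def vif(columns, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    others = [c for i, c in enumerate(columns) if i != j]
    fitted_j = ols_fitted(others, columns[j])
    r2_j, _ = r_squared_and_f(columns[j], fitted_j, len(others))
    return 1.0 / (1.0 - r2_j)


# Hypothetical data for eight households (illustration only, not survey data).
household_size = [2, 3, 4, 5, 6, 7, 8, 4]
cars_owned = [0, 1, 1, 0, 2, 1, 2, 0]
daily_trips = [3, 5, 6, 6, 10, 9, 12, 5]

predictors = [household_size, cars_owned]
fitted = ols_fitted(predictors, daily_trips)
r2, f_stat = r_squared_and_f(daily_trips, fitted, len(predictors))
vif_size = vif(predictors, 0)
vif_cars = vif(predictors, 1)
print(f"R^2 = {r2:.3f}, F = {f_stat:.2f}, VIF = {vif_size:.2f}, {vif_cars:.2f}")
```

With only two explanatory variables the two VIF values coincide, because each is based on the same pairwise correlation. In a model with more variables, each VIF would come from its own auxiliary regression and would be compared against the rule-of-thumb threshold of 10.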
Table 3.2: ANOVA Test Results Table

| Source | Sum of Squares | Degrees of Freedom | Mean Square | F-Statistic | Significance |
|---|---|---|---|---|---|
| Regression | | | | | |
| Residual | | | | | |
| Total | | | | | |

3.9.4 T-Test: Testing Individual Coefficients

The t-statistic is used to test the significance of individual regression coefficients. As a rule of thumb, if the calculated t-statistic is greater than two in absolute value, it is concluded that the estimate is statistically different from zero at the 95% confidence level.

3.10 Logical Aspects Used in Model Selection

The following logical aspects should be taken into consideration in selecting the best regression model:

1. The regression coefficients must have the correct expected sign and their magnitudes must be reasonable. For example, a model is questionable if an explanatory variable that is expected to have a positive effect on the dependent variable shows a negative effect in the regression results.
2. The constant (intercept) term must be reasonable both in value and sign.

3.11 Determination of Study Area

Urban transportation planning means planning the transportation facilities based on the growth potential of the urban area over the coming decades. The study area must therefore cover the existing and potential continuously built-up areas of the city. The study area is specified by an imaginary line representing the city boundaries, called the external or outer cordon line. There are four factors for selecting and fixing this cordon line (Arasan, 2012):

1. The external cordon line should cover the built-up areas and the areas that will be developed in the future during the planning period. Figure 3.1 shows the Draft Master Plan of Jericho City for the year 2013. There is an imaginary line representing the approved expansion areas, and depending on this line, the external cordon line is fixed as the boundary of the study area.
2. The external cordon line should include all areas of systematic daily life of people oriented towards the city center, and should in effect be the commuter shed. Because there are areas beside Jericho City that include systematic daily life and affect travel patterns, these areas are included within the external cordon line. These areas are Al-Nwe'ma and Ein Dyok Alfoqa.
3. The external cordon line should be continuous and uniform in its course, so that movements cross it at only one point. The line should intersect roads where it is safe and convenient to carry out traffic surveys.
4. The external cordon line should be compatible with previous studies of the area or with studies planned for the future. These previous studies could be related to census operations or the population census.

Figure 3.1: Draft Master Plan of Jericho City (2013)
Source: Jericho Municipality, 2013.

3.12 Zoning System

Zoning means dividing the study area, once its boundary has been defined, into smaller land use areas called traffic analysis zones (TAZs), or simply zones. The zones within the study area or external cordon line are called internal zones. The division of internal zones and their number depend on a compromise between a series of criteria discussed in this section.

The region external to the study area is normally divided into a number of external zones. In some cases, as in this case, it might be enough to consider each external zone to represent the rest of the surrounding regions in a particular direction; the boundaries of these different slices of the surrounding regions could represent the natural catchment areas of the transport links feeding into the study area (Ortuzar and Willumsen, 1996).

Zones are modeled as if all their attributes and properties were concentrated in a single point called the zone centroid. The centroids are connected to the nearest road junction or transport station by centroid connectors.
The centroid and the centroid connectors are notional, and it is assumed that all people in a zone have the same travel cost from the centroid to the nearest transport facility, equal to the average for the zone (Ortuzar and Willumsen, 1996).

The following are the general guidelines for the zoning process (Ortuzar and Willumsen, 1996):

1. The zoning system must be compatible with other administrative divisions, particularly with census zones. This is probably the fundamental criterion, and the rest should only be followed if they do not lead to inconsistencies with it.
2. Zones should be as homogeneous as possible in their land use and population composition, so as to reflect accurately the associated trip making behavior. There are different types of land use in urban areas, including residential, industrial, institutional, and recreational.
3. The anticipated changes in land use should be considered when subdividing the study area into zones.
4. Zone boundaries must be compatible with cordons and screen lines and with those of previous zoning systems. In this study, major roads are considered as zone borders and as commuter sheds for particular traffic zones. In addition, external cordon lines are considered as part of the boundaries of traffic zones, and wadis or rivers are also considered as zone boundaries.
5. The shape of the zones should allow an easy determination of their centroid connectors. This is important for the later estimation of intra-zonal characteristics and for the determination of the centroid, which represents the origin and destination of travel.
6. Zones do not have to be of equal size; if anything, they could be of similar dimensions in travel time units, therefore generating smaller zones in congested than in uncongested areas.
7. In general, zones should not be too large or too small, because zones that are too large cause considerable errors in the data, and zones that are too small make the data difficult to handle and analyze.
As a general guide, a population of 1,000-3,000 may be optimal for a small area and a population of 5,000-10,000 may be optimal for large urban areas; overall, the size of each TAZ should be between 0.25 and 1 square mile, or 0.5 and 2.5 square kilometers. Also, no zone should generate more than 15,000 person trips in the base and future years (Cambridge Systematics, Inc., 2007).
8. Natural or physical barriers such as canals, rivers, and so on can form convenient zone boundaries (Cambridge Systematics, Inc., 2007). In this case, there are two rivers which can be considered as screen lines and zone boundaries, as shown in Figure 3.2.

Figure 3.2: Major Roads and Natural Barriers in Jericho City Used for Specifying Zones Boundaries

9. Zone boundaries should preferably be watersheds of trip making (Arasan, 2012). If many commuters from an adjoining area travel towards an arterial road, then that adjoining area becomes a commuter shed for this particular road, analogous to a catchment area or a watershed (Arasan, 2012). The concept of the commuter shed is explained in Figure 3.3. Arterial roads in Jericho City, which are considered as commuter sheds, are also shown in Figure 3.3.

Figure 3.3: Explanation of Commuter Shed

The purpose of the zoning system is to:

1. Facilitate spatial quantification of land use and economic factors, which influence the travel pattern (Arasan, 2012). This can be done by matching the land-use plan with the boundaries of traffic zones, as shown in Figure 3.4. After matching the land-use plan with the zone boundaries, the area and land uses of each traffic zone are specified for this research, as shown in Table 3.3.

Figure 3.4: Land Uses Plan with the Boundaries of Traffic Zones
Source: Master Plan of Jericho (Jericho Municipality, 2010)

Table 3.3: Land Uses and Areas for Traffic Zones

| Traffic Zone No. | Area (km²) | Land Uses in Traffic Zone |
|---|---|---|
| 1 | 0.547 | Residential, commercial, public facility. |
| 2 | 2.479 | Residential, commercial, public facility, agricultural, heritage. |
| 3 | 2.409 | Residential, agricultural, heritage. |
| 4 | 1.254 | Residential, commercial, public facility, parking, industrial, agricultural. |
| 5 | 3.726 | Residential, agricultural, commercial, heritage, cemetery. |
| 6 | 2.049 | Residential, agricultural, commercial, public facility, park. |
| 7 | 1.668 | Residential, agricultural, commercial, public facility, park, heritage. |
| 8 | 2.142 | Residential, agricultural, commercial, public facility. |
| 9 | 2.113 | Residential, agricultural, public facility. |
| 10 | 4.185 | Residential, commercial, public facility, agricultural. |
| 11 | 1.262 | Residential, commercial, public facility, agricultural, cemetery. |
| 12 | 3.864 | Residential, agricultural, public facility, cemetery, parking, industrial, commercial. |
| 13 | 10.182 | Residential, agricultural, public facility, commercial. |
| 14 | 3.005 | Residential, public facility, commercial, parking, agricultural. |

Source: Master Plan of Jericho (Jericho Municipality, 2010)

2. Help in geographically associating the origins and destinations of travel (Arasan, 2012).

Based on all of the above points, the final TAZs adopted for Jericho City are shown in Figure 3.5.

Figure 3.5: TAZ's of Jericho City

Chapter Four
Field Survey and Data Collection

4.1 Population of Study Area

In general, statistics from the Palestinian Central Bureau of Statistics (PCBS) will be used to specify the population of the study area. Within the specified borders of the study area, there are refugee camps and villages that lie outside the borders of Jericho Municipality. Because these camps and villages are located just beyond the municipality borders, have systematic daily life, and affect the travel pattern of Jericho City, they are included in the study area. On the other hand, there are villages considered as part of the Jericho and Valleys Governorate that will not be considered in this research because they are far from Jericho City.
In order to calculate the sample size that will be used in this study, the population size is first specified from the PCBS, as appears in Table 4.1. The population that will be considered in this study is 35,885 people. Based on a random sample of 100 household respondents, the average household size in the study area is calculated to be 5.03 persons. The number of households in the study area is thus 7,134 households (35,885/5.03).

Table 4.1: Localities in the Study Area and Estimates of Population, 2013

| Locality Name | Locality Code | Locality Type | Population |
|---|---|---|---|
| Jericho City | 351920 | Urban | 21,411 |
| Al New'ma | 351840 | Rural | 1,453 |
| Ein Dyok Alfoqa | 351845 | Rural | 958 |
| Ein Al Sultan Refugee Camp | 351865 | Refugee Camp | 3,688 |
| Aqbat Jaber Refugee Camp | 351975 | Refugee Camp | 8,375 |
| Total urban | | | 21,411 |
| Total rural | | | 2,411 |
| Total refugee camps | | | 12,063 |
| Total of Study Area | | | 35,885 |

Source: The Palestinian Central Bureau of Statistics, 2013.

4.2 Sample Size of Study Area

According to the standards of the USA Bureau of Public Roads (BPR, 1967), the minimum sample size that will be taken in this study is 1 in 10 dwelling units, since the population of the study area is under 50,000. The sample size that will be considered in this study is therefore 10% of the total number of households in the study area, that is, 713 households (10% of 7,134), distributed over the 14 TAZs according to the percentages and numbers shown in Table 4.2.

In order to calculate the sample size based on statistical formulas, a random sample of 100 household respondents is initially taken to calculate the percentages of the population (P) that make different types of trips. These percentages are then used in the equation to calculate the actual sample size required for this study, as shown in Table 4.3. In that table, the values are consistent with an unadjusted sample size of $Ss = Z^2 P (1 - P) / c^2$ and a modified sample size of $Ss / (1 + (Ss - 1)/N)$, where N = 35,885 is the population of the study area.

Table 4.2: Number of Households per Traffic Zone for Study Area and Sample Size Required

| Traffic Zone No. | # of Housing Units * | Population ** | % of Housing Units | Sample Size |
|---|---|---|---|---|
| 1 | 188 | 946 | 3% | 19 |
| 2 | 489 | 2,458 | 7% | 49 |
| 3 | 190 | 958 | 3% | 19 |
| 4 | 733 | 3,688 | 10% | 73 |
| 5 | 289 | 1,453 | 4% | 29 |
| 6 | 647 | 3,256 | 9% | 65 |
| 7 | 560 | 2,816 | 8% | 56 |
| 8 | 606 | 3,047 | 8% | 61 |
| 9 | 55 | 275 | 1% | 5 |
| 10 | 301 | 1,512 | 4% | 30 |
| 11 | 580 | 2,915 | 8% | 58 |
| 12 | 162 | 814 | 2% | 16 |
| 13 | 632 | 3,179 | 9% | 63 |
| 14 | 1,703 | 8,838 | 24% | 171 |
| Total | 7,134 | 35,886 | 100% | 713 |

* The number of housing units in every traffic zone was found from the field survey.
** The population in every traffic zone was found by multiplying the number of housing units by the average household size (5.03 persons/household).

Table 4.3: Sample Size Calculation According to Statistical Formulas

| Trip Type | P | 1-P | α | Z | c | Ss | Modified S |
|---|---|---|---|---|---|---|---|
| Work | 0.22 | 0.78 | 0.05 | 1.96 | 0.04 | 412 | 407 |
| Education | 0.29 | 0.71 | 0.05 | 1.96 | 0.04 | 494 | 488 |
| Social | 0.17 | 0.83 | 0.05 | 1.96 | 0.04 | 339 | 336 |
| Shopping | 0.24 | 0.76 | 0.05 | 1.96 | 0.04 | 438 | 433 |
| Recreational | 0.09 | 0.91 | 0.05 | 1.96 | 0.04 | 197 | 196 |
| Total | 1.00 | | | | | 1,880 | 1,859 |

So, the sample size according to the statistical formula is 370 households (1,859/5.03). However, as the sample size calculated from the BPR standards is found to be 713, which is higher than the 370 households, the former will be adopted in this study. This higher sample size is adopted to get a higher degree of accuracy.

4.3 Sampling Method

The sample that is drawn from the household population is a stratified random sample that includes 713 households. Stratification reflects geographic locations defined as TAZs. Therefore, the study area is divided into 14 TAZs, and the samples are randomly selected from each of them. The sample size of each TAZ is shown in Table 4.2.

4.4 Questionnaire Design

In terms of layout, the order of the questions normally seeks to minimize the respondent's resistance to answering them. The survey instrument (and any personal interviews) should try to satisfy the following criteria (Ortuzar and Willumsen, 1996):

1. The questions should be simple and direct.
2. The number of open questions should be minimized, so that long answers are avoided; the use of coded answers is preferable.
3. Difficult questions (e.g. relating to income) are usually put at the end.
4. Travel information must include the purpose of the trip.
5. Because of the growth of non-car modes, information should be sought about all modes of travel, including non-motorized travel.
6. Due to the growing importance of independent trips by children, and of non-motorized modes, all people in the household should be included in the survey, including non-family members, like maids in developing countries.
7. To facilitate the respondents' task of recording all travel, it is recommended that an activity-recall framework be used: people record travel in the context of the activities they have undertaken rather than simply the trips they have made. This has been shown to result in much more accurate travel measurement.
8. Since people have difficulty recalling infrequent and discretionary activities, even when they are recent, it is recommended that the travel day or days be assigned to each household in advance, and that households be given a brief diary in advance of these days. The information in the diary may then be transferred to the self-completion form or reported to the interviewer at the end of the day. In this study, however, the recall of trips or activities depended on the memory of the persons in the household.

4.5 Required Information

For any type of household survey, it is recommended that the survey be divided into two parts:

1. Personal and household characteristics and identification: This part includes questions designed to classify the household members according to their relation to the head of household (e.g. wife or son), sex, age, possession of a driving license, educational level, and occupation.
In order to reduce the possibility of a subjective classification, it is important to define a complete set of occupations (non-household surveys are usually concerned only with the person being interviewed; however, the relevant questions are the same or very similar). This part also includes questions designed to obtain socio-economic data about the household, such as characteristics of the house, identification of the household vehicles, house ownership and family (Ortuzar and Willumsen, 1996).
2. Trip data: This part of the survey aims at detecting and characterizing all trips made by the household members identified in the first part. A trip is defined as any movement outside a building or premise with a given purpose; but the information sought considers trips by stages, where a stage is defined by a change of mode (including walking). Each stage is characterized on the basis of variables such as origin and destination (normally expressed by their nearest road junction), purpose, start and ending times, mode used, amount of money paid for the trip, and so on (Ortuzar and Willumsen, 1996).

According to the points mentioned in the questionnaire design and the required information, the questionnaire form used in this thesis is shown in Appendix B. In its first part, this questionnaire aims to collect the personal and identification characteristics of households, which are considered as explanatory variables. These variables are used in the attempt to build the models, which will be examined statistically and logically in order to adopt the most appropriate model that can predict the trips generated by the TAZs.

The questionnaire contains two tables. The first table gives details about the socio-economic variables of households. These details include gender, age, possession of a driving license, possession of transportation facilities like private cars, bicycles, and motorcycles, house type (independent home or apartment), and household income.
Bicycles are taken into consideration because Jericho City is flat and there are bikes in almost every household. The explanatory variables that have been considered for possible use in the modeling process are summarized in Table 4.4.

The age variable is divided into five groups, as shown in Table 4.4. The first age group, of ages up to 16 years, includes persons who are assumed to be in school. The second age group, of ages between 17 and 30, includes persons who are either continuing their education in universities or are part of the labor force. The third and fourth age groups include persons who are heads of families and part of the labor force; they are divided into an age group from 31 to 50 and another from 51 to 64. Finally, the fifth age group, of ages 65 and above, includes elderly persons who are usually retired.

The second table is designed to collect data concerned with the second part of the survey (trip data). This table is used to calculate the total trips generated by the household, the number of trips of each type (work trips, education trips, shopping trips, social trips, and recreational trips), and the trips according to their time. These trips are divided into five groups. The first group includes trips made before 8 AM. Trips made during this period are called the morning peak period trips, which are usually made by most students and employees. The second time group includes trips that are made between 8 and 9 AM. During this time period, some students and employees make their trips. The third and fourth time periods include trips made between 9 AM and 12 PM and between 12 and 4 PM, respectively. These time groups are usually considered to be the AM and PM off-peak periods. The fifth time group includes trips made after 4 PM. This time period is considered to be the evening peak period. Usually, most shopping, social, and recreational trips are made during this time period. All these trips are summarized as shown in Table 4.5.
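The grouping of trips by time described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample start times are hypothetical, the labels Y6 to Y10 follow the temporal dependent variables used in this research, and it is assumed here that a trip starting exactly on a boundary (for example, at 8:00 AM) belongs to the later period, which the questionnaire description does not specify.

```python
from collections import Counter
from datetime import time

# Upper bound of each period and the label of the matching dependent variable.
# Assumption: a boundary instant belongs to the later period (08:00 -> Y7).
PERIODS = [
    (time(8, 0), "Y6"),   # before 8 AM (morning peak)
    (time(9, 0), "Y7"),   # 8-9 AM
    (time(12, 0), "Y8"),  # 9 AM-12 PM (AM off-peak)
    (time(16, 0), "Y9"),  # 12-4 PM (PM off-peak)
]
LABELS = ("Y6", "Y7", "Y8", "Y9", "Y10")


def period_of(start):
    """Return the time-period label for a trip start time."""
    for upper, label in PERIODS:
        if start < upper:
            return label
    return "Y10"  # after 4 PM (evening peak)


def temporal_counts(start_times):
    """Count one household's daily trips in each of the five periods."""
    counts = Counter(period_of(t) for t in start_times)
    return {label: counts.get(label, 0) for label in LABELS}


# Hypothetical trip start times reported by one household.
household_trips = [time(7, 15), time(7, 40), time(8, 30),
                   time(13, 0), time(17, 45), time(19, 10)]
counts = temporal_counts(household_trips)
print(counts)
```

The five counts for a household always sum to its total number of daily trips, so the temporal models partition the same trips that the general model explains.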
Table 4.4: Explanatory Variables Used in the Models

| Explanatory Variable | Symbol |
|---|---|
| Number of persons in the household | X1 |
| Number of males in the household | X2 |
| Number of females in the household | X3 |
| Number of employed persons in the household | X4 |
| Number of persons receiving education in the household | X5 |
| Number of persons aged 16 years or under in the household | X6 |
| Number of persons between 17 and 30 years in the household | X7 |
| Number of persons between 31 and 50 years in the household | X8 |
| Number of persons between 51 and 64 years in the household | X9 |
| Number of persons aged 65 years and above in the household | X10 |
| Number of licensed drivers in the household | X11 |
| Number of cars owned by a household | X12 |
| Number of bicycles owned by a household | X13 |
| Number of motorcycles owned by a household | X14 |
| Monthly household income (thousand New Israeli Shekels) | X15 |
| House type: 1 if independent, 0 if apartment | X16 |

Table 4.5: Dependent Variables Used in the Models

| Dependent Variable | Symbol |
|---|---|
| Number of daily trips made by household | Y |
| Number of daily work trips made by household | Y1 |
| Number of daily educational trips made by household | Y2 |
| Number of daily shopping trips made by household | Y3 |
| Number of daily social trips made by household | Y4 |
| Number of daily recreational trips made by household | Y5 |
| Number of daily trips made by household before 8 AM | Y6 |
| Number of daily trips made by household between 8 and 9 AM | Y7 |
| Number of daily trips made by household between 9 AM and 12 PM | Y8 |
| Number of daily trips made by household between 12 PM and 4 PM | Y9 |
| Number of daily trips made by household after 4 PM | Y10 |

4.6 Conducting Field Survey

After determining the sample size, designing the questionnaire, and choosing the method of survey, the field survey in this study was conducted according to the following steps (Mathew, 2011):

1. Two skilled enumerators are chosen to help in the survey process.
2. The enumerators are trained by briefing them about the details of the survey and how to conduct it.
3. The enumerators are given random household addresses.
4. The random sample of households in the different traffic zones is numbered on a printed map and the same numbers are put on the questionnaire layout, where each number on the map and the questionnaire represents one household.
5. Since the actual survey may take place at any time during the day, the respondents are required to answer the questions about travel details according to the previous day. This previous day must be a typical working day, because working days represent the most critical situation. In Palestine, the typical working days are from Sunday to Thursday.
6. The enumerators first have to get permission from the household to conduct the survey. Then, they ask the household members about the details required in the questionnaire. Each member of the household should give answers about his or her own travel details, except for children below 12 years. Trip details of children below 5 years are normally ignored.

Chapter Five
Data Analysis and Results

5.1 Descriptive Data

In this section, descriptive statistics for both the dependent variables and the explanatory variables will be shown and discussed. This section is intended to give an overview of the distribution of the various trips according to their purpose as well as their time. This shows which trips are the most frequent. Moreover, this section includes figures that relate the number of daily trips to each of the explanatory variables, so as to show graphically whether these variables are related and whether the relationships are linear.

5.1.1 Descriptive Data of Dependent Variables

Table 5.1 lists descriptive statistics for the total daily household trips. A total of 4,913 trips were made by the 713 surveyed households.
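As a quick check on the totals just given, the average number of trips per household can be recomputed directly, and the same kind of summary (mean, standard deviation, maximum, minimum, and range) can be produced for any list of household trip counts. The sketch below is a minimal illustration: the `describe` helper and the short list of trip counts are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def describe(values):
    """Return the summary measures used in the descriptive tables."""
    return {
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation (n - 1), an assumption
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
    }


# Hypothetical daily trip counts for eight households (not survey data).
sample_trips = [0, 4, 6, 7, 7, 8, 10, 16]
summary = describe(sample_trips)

# Average implied by the survey totals: 4,913 trips over 713 households.
trips_per_household = 4913 / 713

print(summary)
print(round(trips_per_household, 2))
```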
Table 5.1: Descriptive Data for the Total Daily Household Trips

| Total Trips | Mean | Standard Deviation | Maximum | Minimum | Range |
|---|---|---|---|---|---|
| Y | 6.89 | 3.096 | 16 | 0 | 16 |

It is apparent from Table 5.1 that the average number of daily trips per household is around 7. The maximum number of daily trips per household is 16, whereas the minimum number is 0.

The descriptive statistics for the daily household trips according to their purpose are shown in Table 5.2.

Table 5.2: Descriptive Data for the Daily Household Trips by Purpose

| Trip Purpose | Mean | Standard Deviation | Maximum | Minimum | Range |
|---|---|---|---|---|---|
| Work | 1.60 | 0.817 | 5 | 0 | 5 |
| Education | 1.84 | 1.575 | 6 | 0 | 6 |
| Shopping | 1.52 | 0.996 | 7 | 0 | 7 |
| Social | 1.03 | 1.143 | 7 | 0 | 7 |
| Recreational | 0.92 | 1.450 | 6 | 0 | 6 |

The descriptive statistics for the daily household trips according to their time are shown in Table 5.3.

Table 5.3: Descriptive Data for the Daily Household Trips by Time

| Trip Time | Mean | Standard Deviation | Maximum | Minimum | Range |
|---|---|---|---|---|---|
| Before 8 AM | 2.58 | 1.640 | 7 | 0 | 7 |
| 8 AM - 9 AM | 0.57 | 0.820 | 5 | 0 | 5 |
| 9 AM - 12 PM | 0.24 | 0.501 | 3 | 0 | 3 |
| 12 PM - 4 PM | 0.54 | 0.653 | 4 | 0 | 4 |
| After 4 PM | 2.95 | 1.820 | 10 | 0 | 10 |

Classifying trips according to their purpose is important, since people make trips for various reasons. In this research, among the 4,913 trips generated by the 713 surveyed households, the largest share (27%) are education trips. Work trips account for 23% of the total trips generated. The distribution of household trips according to purpose is shown in Table 5.4. The same data are shown graphically in Figure 5.1.

Table 5.4: Distribution of Daily Household Trips by Purpose

| Trip Purpose | Number | % |
|---|---|---|
| Work | 1,141 | 23% |
| Education | 1,310 | 27% |
| Shopping | 1,085 | 22% |
| Social | 727 | 15% |
| Recreational | 650 | 13% |
| Total | 4,913 | 100% |

Figure 5.1: Distribution of Daily Household Trips by Purpose

People carry out their activities at different times within a day. Hence, it is essential to examine the temporal distribution of the trips generated.
According to the survey data, most of the trips are concentrated in the peak periods: 37% of trips are made during the morning peak (before 8:00 AM) and 43% are made during the evening peak (after 4:00 PM). The temporal distribution of daily household trips is shown in Table 5.5. The same data are presented graphically in Figure 5.2.

Table 5.5: Temporal Distribution of Daily Household Trips

| Trip Time | Number | % |
|---|---|---|
| Before 8 AM | 1,841 | 37% |
| 8 AM - 9 AM | 409 | 8% |
| 9 AM - 12 PM | 174 | 4% |
| 12 PM - 4 PM | 383 | 8% |
| After 4 PM | 2,106 | 43% |
| Total | 4,913 | 100% |

Figure 5.2: Temporal Distribution of Daily Household Trips

5.1.2 Descriptive Data of Explanatory Variables

The descriptive statistics for the household size are shown in Table 5.6.

Table 5.6: Descriptive Data for Household Size
Mean Standard Deviation Maximum Minimum Range