With continuing advances in technology, collecting and analyzing large volumes of data is no longer a major obstacle. The more important task is to extract from these data the factors that lead people to use Taiwan's rail systems more frequently, and then to use the resulting patterns to improve traffic conditions.
This study is based on open data published by the government and the Taiwan Railways Administration of the Ministry of Transportation and Communications (交通部鐵路管理局). We combine station-level ridership data for the MRT, Taiwan Railways (TRA), and Taiwan High Speed Rail with historical weather data from the Central Weather Bureau (中央氣象台) to identify how weather affects ridership on the three systems. We also hope the results can be used to predict ridership for station operators, for example to add services and manage crowd flow on rainy days and during commuting and school hours.
Ridership data serve as the response variable and the weather data as covariates. The weather variables are first categorized and explored. Decision tree and stepwise regression models are then compared for variable selection at each station, which identifies the factors that affect passengers' willingness to ride at each station.
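The variable-selection step can be illustrated with a minimal sketch. The abstract does not say which tree algorithm, weather variables, or category thresholds were used, so everything below is an assumption: the daily records are invented, the `rain_class` and `temp_class` cut-offs are hypothetical, and the ranking uses a CART-style criterion (reduction in the sum of squared errors from a single binary split) rather than the authors' actual model.

```python
from statistics import mean

# Hypothetical daily records for one station: (rain_mm, max_temp_c, ridership).
# These numbers are invented for illustration and are not from the study.
RECORDS = [
    (0.0, 31.0, 5200), (0.0, 24.0, 5100), (0.5, 33.0, 5150), (0.0, 18.0, 5250),
    (12.0, 30.0, 4300), (25.0, 22.0, 4100), (8.0, 17.0, 4400), (40.0, 27.0, 3900),
]


def rain_class(rain_mm):
    """Bin daily rainfall into 0 = dry, 1 = light, 2 = heavy (assumed cut-offs)."""
    if rain_mm < 1.0:
        return 0
    if rain_mm < 20.0:
        return 1
    return 2


def temp_class(temp_c):
    """Bin daily maximum temperature into 0 = cool, 1 = mild, 2 = hot (assumed cut-offs)."""
    if temp_c < 20.0:
        return 0
    if temp_c < 28.0:
        return 1
    return 2


def sse(values):
    """Sum of squared deviations from the mean; 0.0 for an empty list."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (sse_reduction, threshold) for the best binary split x <= threshold."""
    total = sse(ys)
    best = (0.0, None)
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        gain = total - sse(left) - sse(right)
        if gain > best[0]:
            best = (gain, threshold)
    return best


def rank_covariates(records):
    """Rank the categorized weather covariates by their best single-split SSE reduction."""
    ys = [r[2] for r in records]
    covariates = {
        "rain_class": [rain_class(r[0]) for r in records],
        "temp_class": [temp_class(r[1]) for r in records],
    }
    scores = {name: best_split(xs, ys) for name, xs in covariates.items()}
    return sorted(scores.items(), key=lambda item: item[1][0], reverse=True)


ranking = rank_covariates(RECORDS)
for name, (gain, threshold) in ranking:
    print(name, gain, threshold)
```

A full decision tree would repeat this split search recursively within each resulting group, and the analysis would be run separately for each station. Stepwise regression would instead add or drop covariates according to a model-fit criterion, which is why the two methods can select different variables.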