Modeling - Naive Bayes


Overview


Naive Bayes is a generative, supervised machine learning method for classification. It is generative because the model learns the probability of the data given the label, together with how often each label occurs. This contrasts with a discriminative model, where the goal is to find a function that separates the groups directly (e.g. Logistic Regression). It is supervised because the model is given labels to learn from. The “naive” part comes from the assumption that the features are conditionally independent given the label, which is what makes Bayes’ Theorem practical to apply. The assumption is naive because it is unlikely that the variables within the data are truly independent. For example, consider a model built from customer reviews that tries to classify whether a review is positive or negative. A review might contain words such as “happy” and “glad”, which are clearly not independent terms. However, independence is assumed so that the calculations remain tractable.

That example is one of sentiment analysis, but there are many other applications, since the method extends naturally to n-class classification. For example, predicting whether a label is “true” or “false” is a 2-class problem, but Naive Bayes handles more than two labels as well. Returning to the sentiment analysis example, a review could be “positive”, “negative”, or “neutral”. Applications also include document classification, such as assigning an article to a category like “politics”, “sports”, or “entertainment”.


In general, Naive Bayes uses the conditional independence assumption to apply Bayes’ Theorem. The goal is to find the probability of a label given a datapoint: P(label | data) = P(data | label) × P(label) / P(data). Bayes’ Theorem computes this from three quantities found in the data: the probability of the data itself occurring, the probability of the label occurring, and the conditional probability in the opposite direction (i.e. the probability of the datapoint given the label). Under the independence assumption, the probability of the datapoint given the label is the product of the per-feature conditional probabilities. Some of these conditional probabilities can be zero, for instance when a feature value never appears alongside a label in the training data. This is a problem because the calculation is multiplicative, so a single zero wipes out the entire probability. Smoothing techniques account for this, with the Laplacian Correction being the most common; it adds 1 to each case’s count. More general additive smoothing adds a specified value, alpha, to each count instead.
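To make the zero-probability issue concrete, here is a minimal sketch of additive smoothing in plain Python. The function name `smoothed_likelihoods` and the tiny word list are hypothetical and exist only for illustration; they are not taken from the analysis code.

```python
from collections import Counter

def smoothed_likelihoods(values, vocabulary, alpha=1.0):
    """Estimate P(value | label) from the values observed under one label,
    adding `alpha` to every count (alpha=1 is the Laplacian Correction)."""
    counts = Counter(values)
    total = len(values) + alpha * len(vocabulary)
    return {v: (counts[v] + alpha) / total for v in vocabulary}

# Hypothetical words seen in "positive" reviews; "bad" never appears.
vocabulary = ["happy", "glad", "bad"]
positive_words = ["happy", "glad", "happy", "happy"]

unsmoothed = smoothed_likelihoods(positive_words, vocabulary, alpha=0.0)
smoothed = smoothed_likelihoods(positive_words, vocabulary, alpha=1.0)

print(unsmoothed["bad"])  # 0.0, which would zero out any product it appears in
print(smoothed["bad"])    # 1/7: small, but no longer zero
```

With alpha set to 1, every word in the vocabulary keeps a non-zero probability and the estimates still sum to 1.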


There are several primary forms of Naive Bayes:


Multinomial Naive Bayes


Multinomial Naive Bayes "is suitable for classification with discrete features (e.g., word counts for text classification)" [sklearn documentation]. Other applications with discrete features include spam detection, sentiment analysis, and document categorization. Overall, it is a very efficient Naive Bayes method for text-based tasks.
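As a rough illustration of how count features enter the calculation, the sketch below scores one document under two labels. The per-label word probabilities and priors are made-up values standing in for parameters that would normally be estimated (with smoothing) from training data. The multinomial coefficient is left out because it is the same for every label and does not change which label wins.

```python
import math

# Hypothetical per-label parameters, as if already estimated from training data.
# Each list holds P(word | label) for the words in `vocabulary`, in order.
vocabulary = ["happy", "glad", "bad"]
word_probs = {
    "positive": [0.6, 0.3, 0.1],
    "negative": [0.1, 0.2, 0.7],
}
priors = {"positive": 0.5, "negative": 0.5}

def multinomial_log_score(counts, label):
    """log P(label) + sum over words of count * log P(word | label)."""
    score = math.log(priors[label])
    for count, prob in zip(counts, word_probs[label]):
        score += count * math.log(prob)
    return score

review_counts = [2, 1, 0]  # "happy" twice, "glad" once, "bad" never
scores = {label: multinomial_log_score(review_counts, label) for label in priors}
prediction = max(scores, key=scores.get)
print(prediction)
```

Working in log space avoids multiplying many small probabilities together, which can underflow on long documents.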


Gaussian Naive Bayes


Gaussian Naive Bayes can be used when features are continuous, and are assumed to follow a normal (or Gaussian) distribution. Some of the specific applications could be medical diagnoses, real-time classification, and anomaly detection.
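The normality assumption can be sketched with a single continuous feature. The temperatures below are hypothetical and equal priors are assumed. With several features, the per-feature log densities would simply be summed.

```python
import math
from statistics import mean, pvariance

def gaussian_log_pdf(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

# Hypothetical temperatures observed under two labels.
temps_by_label = {
    "snow": [7.0, 11.4, 17.6],
    "clear-day": [50.0, 60.0, 70.0],
}

# "Training" amounts to storing a mean and a variance per label.
params = {label: (mean(v), pvariance(v)) for label, v in temps_by_label.items()}

def predict(temp):
    # Equal priors are assumed here, so only the likelihood matters.
    return max(params, key=lambda label: gaussian_log_pdf(temp, *params[label]))

predicted = predict(20.0)
print(predicted)
```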


Bernoulli Naive Bayes


Bernoulli Naive Bayes applies to discrete data where every feature is binary. Extending the multinomial example, instead of word counts, each feature is a 0 or 1 indicating whether the word appears. The multinomial applications carry over as well, as long as the features are used in this binary sense.
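A small sketch of the difference from the multinomial case: counts are first reduced to presence or absence, and an absent word then contributes to the likelihood through 1 - p. The counts and probabilities are hypothetical.

```python
import math

def binarize(counts):
    """Turn word counts into 0/1 presence indicators."""
    return [1 if c > 0 else 0 for c in counts]

def bernoulli_log_likelihood(x, probs):
    """Sum of log p for present features and log (1 - p) for absent ones."""
    return sum(math.log(p) if xi else math.log(1 - p) for xi, p in zip(x, probs))

counts = [3, 0, 1]             # hypothetical counts for three words
presence = binarize(counts)    # [1, 0, 1]
probs = [0.8, 0.5, 0.1]        # hypothetical P(word present | label)
log_likelihood = bernoulli_log_likelihood(presence, probs)
print(presence, log_likelihood)
```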


Categorical Naive Bayes


Categorical Naive Bayes is "suitable for classification with discrete features that are categorically distributed" [sklearn documentation]. Note that this method requires an encoding that represents the categorical variables as numbers. Ordinal encoding preserves the order of ordered categories (e.g. letter grades), and non-ordinal categorical data (i.e. with no implied order) can still be encoded. Specific applications include retail and marketing customer segmentation and healthcare classification. Essentially, if the classification problem contains categorical columns, this could be a suitable method.
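The two encoding situations can be sketched in plain Python. The letter grades and region values are illustrative. For the unordered column, the integer codes are arbitrary identifiers and carry no ranking.

```python
# Ordered categories: the code reflects the position in a known ordering.
grade_order = ["F", "D", "C", "B", "A"]
grade_to_code = {grade: i for i, grade in enumerate(grade_order)}
grades = ["B", "A", "C", "A"]
encoded_grades = [grade_to_code[g] for g in grades]

# Unordered categories: any consistent mapping works (sorted here for repeatability).
regions = ["West", "Northeast", "West"]
region_to_code = {r: i for i, r in enumerate(sorted(set(regions)))}
encoded_regions = [region_to_code[r] for r in regions]

print(encoded_grades)   # [3, 4, 2, 4]
print(encoded_regions)  # [1, 0, 1]
```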


Analysis Code


The code for this analysis can be found [here].


Data Preparation


This project applies three of the Naive Bayes techniques: Multinomial, Gaussian, and Bernoulli.


Each of the methods requires different data preparation, which will be detailed below.


Initial Data

Following from the previous sections, this section will use the same datasets. As a reminder, here is a preview of the data that will be used:



Ski Resorts Data
Resort | state_province_territory | Country | City | Overall Rating | Elevation Difference | Elevation Low | Elevation High | Trails Total | Trails Easy | Trails Intermediate | Trails Difficult | Lifts | Price | Resort Size | Run Variety | Lifts Quality | Latitude | Longitude | Pass | Region
49 Degrees North Mountain Resort | Washington | United States | Chewelah | 3.4 | 564 | 1196 | 1760 | 68.0 | 20.0 | 27.0 | 21.0 | 7 | 82.0 | 3.5 | 4.0 | 3.3 | 48.277375 | -117.701815 | Other | West
Crystal Mountain (WA) | Washington | United States | Sunrise | 3.3 | 796 | 1341 | 2137 | 50.0 | 8.0 | 27.0 | 15.0 | 11 | 199.0 | 3.2 | 3.6 | 3.7 | 46.928167 | -121.504535 | Ikon | West
Mt. Baker | Washington | United States | White Salmon | 3.4 | 455 | 1070 | 1525 | 100.0 | 24.0 | 45.0 | 31.0 | 10 | 91.0 | 3.9 | 4.3 | 3.0 | 45.727775 | -121.486699 | Other | West
Mt. Spokane | Washington | United States | Mead | 3.0 | 610 | 1185 | 1795 | 26.0 | 6.5 | 16.0 | 3.5 | 7 | 75.0 | 2.7 | 3.1 | 3.0 | 47.919072 | -117.092505 | Other | West
Sitzmark | Washington | United States | Tonasket | 2.6 | 155 | 1330 | 1485 | 7.5 | 2.0 | 3.0 | 2.5 | 2 | 50.0 | 1.9 | 2.4 | 2.9 | 48.863907 | -119.165077 | Other | West
Stevens Pass | Washington | United States | Baring | 3.3 | 580 | 1170 | 1750 | 39.0 | 6.0 | 18.0 | 15.0 | 10 | 119.0 | 3.1 | 3.5 | 3.6 | 47.764031 | -121.474822 | Epic | West
The Summit at Snoqualmie | Washington | United States | Snoqualmie Pass | 3.0 | 380 | 800 | 1180 | 27.9 | 5.2 | 13.7 | 9.0 | 22 | 135.0 | 2.6 | 3.0 | 3.2 | 47.405235 | -121.412783 | Ikon | West
Wenatchee Mission Ridge | Washington | United States | Wenatchee | 3.2 | 686 | 1392 | 2078 | 36.0 | 4.0 | 21.0 | 11.0 | 4 | 119.0 | 2.9 | 3.3 | 3.6 | 47.292466 | -120.399871 | Other | West
Abenaki | New Hampshire | United States | Wolfeboro | 2.1 | 70 | 180 | 250 | 2.0 | 1.2 | 0.5 | 0.3 | 1 | 24.0 | 1.4 | 1.8 | 1.4 | 43.609528 | -71.229692 | Other | Northeast
Attitash Mountain Resort | New Hampshire | United States | Bartlett | 3.2 | 533 | 183 | 716 | 37.0 | 7.4 | 17.4 | 12.2 | 8 | 129.0 | 2.9 | 3.3 | 3.7 | 44.084603 | -71.221525 | Epic | Northeast


Weather Data
datetime tempmax tempmin temp feelslikemax feelslikemin feelslike dew humidity precip precipprob precipcover snow snowdepth windgust windspeed winddir pressure cloudcover visibility solarradiation solarenergy uvindex sunrise sunset moonphase icon stations resort tzoffset severerisk type_freezingrain type_ice type_none type_rain type_snow
2019-01-01 16.4 2.0 7.0 10.2 -13.3 -0.6 -1.2 69.1 0.008 100.0 20.83 0.0 20.7 18.30000 10.6 4.9 1014.5 59.0 8.6 116.8 9.9 5.0 07:26:20 16:51:51 0.85 snow ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 0 1
2019-01-02 24.3 -0.9 11.4 21.9 -11.9 5.4 -10.5 39.9 0.004 100.0 4.17 0.0 20.8 29.77377 8.7 353.1 1021.4 0.0 9.9 121.6 10.7 5.0 07:26:27 16:52:41 0.89 snow ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 0 1
2019-01-03 29.0 5.3 17.6 21.9 -4.0 8.6 4.1 56.1 0.004 100.0 4.17 0.2 20.8 32.20000 9.8 328.9 1024.7 0.0 9.8 123.3 10.6 5.0 07:26:31 16:53:33 0.92 snow ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1
2019-01-04 34.0 11.9 23.4 28.7 3.4 17.1 7.0 50.4 0.001 100.0 4.17 0.1 20.8 20.80000 9.0 311.0 1025.5 0.0 9.9 123.7 10.7 5.0 07:26:34 16:54:26 0.96 snow ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1
2019-01-05 34.1 14.3 27.1 29.4 4.3 20.1 1.9 33.9 0.001 100.0 4.17 0.0 20.4 20.80000 10.1 243.5 1022.2 19.4 9.7 110.3 9.6 5.0 07:26:34 16:55:20 0.00 rain ['72467523063', '72206103038', 'CACMC', 'DYGC2', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1
2019-01-06 29.9 18.5 25.9 22.4 5.1 16.1 18.1 72.5 0.035 100.0 58.33 0.6 20.6 33.30000 16.9 266.7 1009.3 78.7 6.3 47.3 4.1 2.0 07:26:32 16:56:16 0.02 snow ['72467523063', '72206103038', 'CACMC', '72038500419', 'DYGC2', 'KCCU', 'KEGE', 'A0000594076', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1
2019-01-07 24.8 14.7 20.3 12.8 2.5 6.5 13.7 75.2 0.004 100.0 8.33 0.4 21.3 45.70000 27.9 271.2 1015.6 83.7 4.8 35.8 3.0 2.0 07:26:27 16:57:13 0.06 snow ['72467523063', '72206103038', 'CACMC', 'DYGC2', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 0 1
2019-01-08 34.6 17.2 25.1 34.6 5.0 17.8 12.0 59.5 0.013 100.0 8.33 0.0 21.3 27.70000 15.2 312.1 1029.4 34.5 9.5 122.9 10.5 5.0 07:26:21 16:58:11 0.09 rain ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1
2019-01-09 38.3 23.0 28.6 38.3 13.6 22.6 9.9 45.4 0.000 0.0 0.00 0.0 21.2 23.00000 13.0 142.9 1029.6 1.0 9.9 114.0 9.8 5.0 07:26:12 16:59:11 0.12 clear-day ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 1 0 0
2019-01-10 33.7 17.0 26.4 33.7 9.8 22.6 14.3 60.6 0.026 100.0 12.50 0.8 21.4 17.20000 8.8 323.7 1023.3 39.9 8.3 75.9 6.6 4.0 07:26:01 17:00:11 0.16 snow ['72467523063', '72206103038', 'CACMC', 'KCCU', 'KEGE', 'KLXV', 'DJTC2', 'K20V', '72467393009'] Vail 0.0 0.0 0 0 0 1 1


Google Places Data


Using these datasets, the following preparation was carried out for each of the Naive Bayes methods.



Multinomial Naive Bayes


The most recent full year of weather data was 2023, and the weather type occurrences are stored as daily binary indicators. These indicators were summed for each resort over 2023, giving a count of days per weather type.
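The yearly aggregation described above can be sketched as follows. The rows are hypothetical stand-ins for the daily weather records (only two of the type columns are shown). The original analysis may well have used a dataframe library, so this plain-Python version is only an illustration of the logic.

```python
from collections import defaultdict

# Hypothetical daily records with binary weather-type indicators.
daily_rows = [
    {"datetime": "2023-01-01", "resort": "Vail", "type_snow": 1, "type_rain": 0},
    {"datetime": "2023-01-02", "resort": "Vail", "type_snow": 1, "type_rain": 1},
    {"datetime": "2022-12-31", "resort": "Vail", "type_snow": 1, "type_rain": 0},
    {"datetime": "2023-01-01", "resort": "Mt. Baker", "type_snow": 0, "type_rain": 1},
]
type_columns = ["type_snow", "type_rain"]

# Sum the daily indicators per resort, keeping only dates in 2023.
yearly_counts = defaultdict(lambda: dict.fromkeys(type_columns, 0))
for row in daily_rows:
    if row["datetime"].startswith("2023"):
        for col in type_columns:
            yearly_counts[row["resort"]][col] += row[col]

print(dict(yearly_counts))
```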

The ski resort data additionally has the number of trails per difficulty and the number of lifts, which will also be used in this analysis.

Using this count data, the analysis will attempt to model the Pass type of each resort. The possible options are Epic, Ikon, and Other.

Below is a snippet of the data prepared for Multinomial Naive Bayes.

type_freezingrain | type_ice | type_none | type_rain | type_snow | Trails Easy | Trails Intermediate | Trails Difficult | Lifts | Pass
16 | 0 | 134 | 199 | 112 | 20 | 27 | 21 | 7 | Other
11 | 3 | 113 | 229 | 74 | 1 | 0 | 0 | 1 | Other
9 | 3 | 100 | 184 | 140 | 1 | 0 | 0 | 1 | Other
12 | 3 | 147 | 167 | 85 | 2 | 3 | 2 | 19 | Epic
3 | 2 | 130 | 221 | 51 | 2 | 4 | 2 | 3 | Other
21 | 6 | 73 | 236 | 132 | 10 | 6 | 4 | 5 | Other
4 | 3 | 145 | 192 | 79 | 1 | 1 | 1 | 7 | Other
8 | 3 | 130 | 206 | 80 | 2 | 1 | 1 | 12 | Other
8 | 0 | 116 | 208 | 151 | 11 | 21 | 39 | 8 | Ikon
0 | 0 | 70 | 199 | 200 | 14 | 31 | 17 | 7 | Ikon

Gaussian Naive Bayes


Because this method can handle continuous data, the entirety of the weather dataset was used for this Naive Bayes technique.

However, a small change was made to the labels for this dataset. To balance the labels, "partly-cloudy-day", "cloudy", "wind", and "fog" were merged into "other".
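The relabeling step amounts to a simple mapping. In the sketch below, the list of `icons` is a hypothetical sample and `merged_labels` is an illustrative name for the four labels named above.

```python
from collections import Counter

# Labels that are folded into a single "other" class.
merged_labels = {"partly-cloudy-day", "cloudy", "wind", "fog"}

# Hypothetical sample of the original icon labels.
icons = ["snow", "cloudy", "rain", "fog", "clear-day", "partly-cloudy-day", "wind"]

collapsed = ["other" if icon in merged_labels else icon for icon in icons]
label_counts = Counter(collapsed)
print(label_counts)
```

Checking the label counts before and after the merge is a quick way to confirm that the classes ended up more balanced.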

This reduced the labels to those that appear in the prepared data: "snow", "rain", "clear-day", and "other".

Below is a snippet of the data prepared for Gaussian Naive Bayes.

temp | dew | humidity | pressure | cloudcover | icon
7.0 | -1.2 | 69.1 | 1014.5 | 59.0 | snow
11.4 | -10.5 | 39.9 | 1021.4 | 0.0 | snow
17.6 | 4.1 | 56.1 | 1024.7 | 0.0 | snow
23.4 | 7.0 | 50.4 | 1025.5 | 0.0 | snow
27.1 | 1.9 | 33.9 | 1022.2 | 19.4 | rain
25.9 | 18.1 | 72.5 | 1009.3 | 78.7 | snow
20.3 | 13.7 | 75.2 | 1015.6 | 83.7 | snow
25.1 | 12.0 | 59.5 | 1029.4 | 34.5 | rain
28.6 | 9.9 | 45.4 | 1029.6 | 1.0 | clear-day
26.4 | 14.3 | 60.6 | 1023.3 | 39.9 | snow

Bernoulli Naive Bayes


For the Bernoulli setup, the data was used to test the efficacy of the Google Places API itself. The categories returned by each API call were encoded as 0s and 1s.

Using this binary data, the label categories are:



Essentially, the Google API also returns a list of accompanying categories for the call category. This will test whether there are any patterns between the call category and the other returned categories.
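A minimal sketch of this kind of 0/1 encoding is shown below. The two `places` records are hypothetical. The field names mirror the column prefixes seen in the prepared data (for example "Initial Category_bar"), but the exact encoding used in the analysis is an assumption.

```python
# Hypothetical API results; field names follow the column prefixes in the data.
places = [
    {"Call Category": "Restaurants", "Initial Category": "bar", "Secondary Category": "restaurant"},
    {"Call Category": "Restaurants", "Initial Category": "cafe", "Secondary Category": "food"},
]
category_fields = ["Initial Category", "Secondary Category"]

# One column per (field, value) pair, e.g. "Initial Category_bar".
columns = sorted({f"{field}_{p[field]}" for p in places for field in category_fields})

encoded = []
for p in places:
    active = {f"{field}_{p[field]}" for field in category_fields}
    row = {"Call Category": p["Call Category"]}  # the label stays as text
    row.update({col: int(col in active) for col in columns})
    encoded.append(row)

for row in encoded:
    print(row)
```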

Below is a snippet of the data prepared for Bernoulli Naive Bayes.

Call Category Initial Category_amusement_park Initial Category_art_gallery Initial Category_atm Initial Category_bakery Initial Category_bar Initial Category_beauty_salon Initial Category_bicycle_store Initial Category_book_store Initial Category_bowling_alley Initial Category_cafe Initial Category_campground Initial Category_car_repair Initial Category_car_wash Initial Category_casino Initial Category_cemetery Initial Category_church Initial Category_clothing_store Initial Category_convenience_store Initial Category_dentist Initial Category_department_store Initial Category_doctor Initial Category_drugstore Initial Category_electrician Initial Category_electronics_store Initial Category_finance Initial Category_florist Initial Category_furniture_store Initial Category_gas_station Initial Category_general_contractor Initial Category_grocery_or_supermarket Initial Category_gym Initial Category_hair_care Initial Category_hardware_store Initial Category_health Initial Category_home_goods_store Initial Category_hospital Initial Category_jewelry_store Initial Category_laundry Initial Category_lawyer Initial Category_liquor_store Initial Category_lodging Initial Category_meal_delivery Initial Category_meal_takeaway Initial Category_movie_rental Initial Category_movie_theater Initial Category_museum Initial Category_night_club Initial Category_park Initial Category_parking Initial Category_pet_store Initial Category_pharmacy Initial Category_physiotherapist Initial Category_post_office Initial Category_real_estate_agency Initial Category_restaurant Initial Category_rv_park Initial Category_school Initial Category_shoe_store Initial Category_shopping_mall Initial Category_spa Initial Category_storage Initial Category_store Initial Category_supermarket Initial Category_tourist_attraction Initial Category_travel_agency Initial Category_university Initial Category_veterinary_care Initial Category_zoo Secondary Category_amusement_park Secondary Category_art_gallery Secondary 
Category_atm Secondary Category_bakery Secondary Category_bank Secondary Category_bar Secondary Category_beauty_salon Secondary Category_bicycle_store Secondary Category_book_store Secondary Category_bowling_alley Secondary Category_cafe Secondary Category_campground Secondary Category_car_repair Secondary Category_car_wash Secondary Category_casino Secondary Category_church Secondary Category_city_hall Secondary Category_clothing_store Secondary Category_convenience_store Secondary Category_dentist Secondary Category_department_store Secondary Category_doctor Secondary Category_drugstore Secondary Category_electronics_store Secondary Category_finance Secondary Category_florist Secondary Category_food Secondary Category_furniture_store Secondary Category_gas_station Secondary Category_general_contractor Secondary Category_grocery_or_supermarket Secondary Category_gym Secondary Category_hair_care Secondary Category_hardware_store Secondary Category_health Secondary Category_home_goods_store Secondary Category_hospital Secondary Category_insurance_agency Secondary Category_jewelry_store Secondary Category_laundry Secondary Category_lawyer Secondary Category_liquor_store Secondary Category_local_government_office Secondary Category_lodging Secondary Category_meal_delivery Secondary Category_meal_takeaway Secondary Category_movie_rental Secondary Category_movie_theater Secondary Category_moving_company Secondary Category_museum Secondary Category_night_club Secondary Category_park Secondary Category_parking Secondary Category_pet_store Secondary Category_pharmacy Secondary Category_physiotherapist Secondary Category_place_of_worship Secondary Category_plumber Secondary Category_point_of_interest Secondary Category_post_office Secondary Category_premise Secondary Category_real_estate_agency Secondary Category_restaurant Secondary Category_roofing_contractor Secondary Category_rv_park Secondary Category_school Secondary Category_shoe_store Secondary 
Category_shopping_mall Secondary Category_spa Secondary Category_storage Secondary Category_store Secondary Category_supermarket Secondary Category_tourist_attraction Secondary Category_travel_agency Secondary Category_veterinary_care Tertiary Category_amusement_park Tertiary Category_art_gallery Tertiary Category_atm Tertiary Category_bakery Tertiary Category_bank Tertiary Category_bar Tertiary Category_beauty_salon Tertiary Category_book_store Tertiary Category_cafe Tertiary Category_campground Tertiary Category_car_dealer Tertiary Category_car_rental Tertiary Category_car_repair Tertiary Category_car_wash Tertiary Category_church Tertiary Category_clothing_store Tertiary Category_convenience_store Tertiary Category_department_store Tertiary Category_doctor Tertiary Category_drugstore Tertiary Category_electronics_store Tertiary Category_establishment Tertiary Category_finance Tertiary Category_florist Tertiary Category_food Tertiary Category_furniture_store Tertiary Category_gas_station Tertiary Category_general_contractor Tertiary Category_grocery_or_supermarket Tertiary Category_gym Tertiary Category_hair_care Tertiary Category_health Tertiary Category_home_goods_store Tertiary Category_hospital Tertiary Category_insurance_agency Tertiary Category_jewelry_store Tertiary Category_laundry Tertiary Category_library Tertiary Category_liquor_store Tertiary Category_local_government_office Tertiary Category_lodging Tertiary Category_meal_delivery Tertiary Category_meal_takeaway Tertiary Category_movie_theater Tertiary Category_museum Tertiary Category_night_club Tertiary Category_park Tertiary Category_parking Tertiary Category_pet_store Tertiary Category_pharmacy Tertiary Category_physiotherapist Tertiary Category_place_of_worship Tertiary Category_point_of_interest Tertiary Category_premise Tertiary Category_real_estate_agency Tertiary Category_restaurant Tertiary Category_rv_park Tertiary Category_school Tertiary Category_shoe_store Tertiary Category_spa Tertiary 
Category_store Tertiary Category_supermarket Tertiary Category_tourist_attraction Tertiary Category_travel_agency Tertiary Category_veterinary_care Tertiary Category_zoo
Restaurants 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

Training and Testing Sets

Additionally, training and testing sets were created. The two sets are disjoint, and must be disjoint. Evaluating on data that overlaps with the training set won't give an accurate representation of the performance of the model. First, overlap can hide overfitting, where the model describes noise rather than the underlying distribution, because the model has already seen the rows it is scored on. Second, a disjoint testing set stands in for real-world data (i.e. unseen data).
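One way to guarantee disjoint sets is to shuffle the row indices once and cut the shuffled list in two. The helper name, test fraction, and seed below are illustrative assumptions, not the values used in the analysis.

```python
import random

def train_test_split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices once, then cut them into two non-overlapping parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = train_test_split_indices(10, test_fraction=0.3, seed=42)

# Every row lands in exactly one of the two sets.
print(set(train_idx).isdisjoint(test_idx))  # True
```

Because each index appears exactly once in the shuffled list, no row can end up in both sets.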


Multinomial Naive Bayes Training Dataset

Unnamed: 0 | type_freezingrain | type_ice | type_none | type_rain | type_snow | Trails Easy | Trails Intermediate | Trails Difficult | Lifts | Pass
373 | 20 | 4 | 67 | 251 | 111 | 20 | 16 | 10 | 8 | Ikon
278 | 1 | 0 | 199 | 139 | 74 | 9 | 15 | 10 | 5 | Other
249 | 21 | 6 | 76 | 227 | 132 | 5 | 7 | 9 | 4 | Other
283 | 7 | 0 | 113 | 220 | 135 | 6 | 17 | 5 | 3 | Other
167 | 10 | 1 | 162 | 123 | 101 | 2 | 2 | 1 | 3 | Other
275 | 39 | 8 | 55 | 227 | 145 | 4 | 8 | 10 | 5 | Other
119 | 4 | 2 | 164 | 194 | 26 | 1 | 2 | 1 | 7 | Epic
364 | 8 | 6 | 88 | 212 | 134 | 1 | 4 | 4 | 2 | Other
202 | 8 | 0 | 101 | 207 | 153 | 5 | 12 | 14 | 6 | Other
19 | 2 | 0 | 145 | 145 | 143 | 24 | 40 | 70 | 5 | Other

Multinomial Naive Bayes Testing Dataset

Unnamed: 0 | type_freezingrain | type_ice | type_none | type_rain | type_snow | Trails Easy | Trails Intermediate | Trails Difficult | Lifts | Pass
287 | 9 | 0 | 88 | 231 | 160 | 20 | 45 | 50 | 10 | Other
329 | 26 | 0 | 103 | 182 | 164 | 27 | 28 | 58 | 14 | Ikon
323 | 20 | 9 | 101 | 226 | 113 | 25 | 30 | 28 | 20 | Ikon
145 | 18 | 2 | 90 | 244 | 79 | 4 | 3 | 1 | 2 | Epic
55 | 11 | 4 | 56 | 247 | 133 | 2 | 1 | 1 | 3 | Other
93 | 4 | 4 | 161 | 170 | 107 | 10 | 20 | 20 | 9 | Other
340 | 29 | 1 | 88 | 243 | 124 | 2 | 2 | 0 | 1 | Other
82 | 2 | 0 | 132 | 145 | 172 | 31 | 69 | 20 | 12 | Epic
365 | 10 | 4 | 199 | 104 | 68 | 10 | 12 | 8 | 6 | Other
148 | 8 | 2 | 75 | 207 | 137 | 6 | 8 | 9 | 3 | Other

Gaussian Naive Bayes Training Dataset

Unnamed: 0 | temp | dew | humidity | pressure | cloudcover | icon
58778 | 7.9 | 0.8 | 72.5 | 1020.1 | 51.8 | other
220139 | 70.0 | 35.8 | 31.5 | 1020.9 | 9.2 | clear-day
56946 | 41.3 | 34.9 | 77.9 | 1015.7 | 30.4 | other
679637 | 33.8 | 18.3 | 54.1 | 1015.2 | 31.6 | other
742765 | -5.6 | -11.2 | 76.1 | 1010.6 | 85.2 | snow
469639 | 31.8 | 13.1 | 47.1 | 1024.2 | 2.2 | clear-day
35073 | 38.1 | 27.6 | 66.8 | 1020.0 | 4.7 | clear-day
22505 | 50.6 | 29.2 | 43.7 | 1029.4 | 23.5 | other
600010 | 59.3 | 55.4 | 87.7 | 1007.3 | 84.9 | rain
34050 | 42.1 | 25.4 | 52.5 | 1009.9 | 88.6 | other

Gaussian Naive Bayes Testing Dataset

Unnamed: 0 | temp | dew | humidity | pressure | cloudcover | icon
106540 | 30.5 | 27.4 | 88.3 | 1002.4 | 97.2 | snow
725950 | 30.2 | 24.5 | 79.3 | 1004.7 | 71.8 | snow
240926 | 55.5 | 44.0 | 66.5 | 1017.2 | 69.0 | other
87906 | 38.4 | 30.4 | 73.8 | 1024.0 | 91.1 | other
195013 | 45.9 | 38.9 | 79.2 | 1016.3 | 62.2 | other
556715 | 53.9 | 47.3 | 79.6 | 1019.5 | 73.0 | rain
585788 | 56.6 | 49.1 | 78.2 | 1021.2 | 56.8 | other
436588 | 66.8 | 48.9 | 57.7 | 1009.8 | 20.3 | other
102425 | 60.2 | 47.2 | 65.1 | 1014.0 | 44.6 | other
482230 | 42.7 | 31.5 | 66.3 | 1015.8 | 69.8 | other

Bernoulli Naive Bayes Training Dataset

Unnamed: 0 Call Category Initial Category_amusement_park Initial Category_art_gallery Initial Category_atm Initial Category_bakery Initial Category_bar Initial Category_beauty_salon Initial Category_bicycle_store Initial Category_book_store Initial Category_bowling_alley Initial Category_cafe Initial Category_campground Initial Category_car_repair Initial Category_car_wash Initial Category_casino Initial Category_cemetery Initial Category_church Initial Category_clothing_store Initial Category_convenience_store Initial Category_dentist Initial Category_department_store Initial Category_doctor Initial Category_drugstore Initial Category_electrician Initial Category_electronics_store Initial Category_finance Initial Category_florist Initial Category_furniture_store Initial Category_gas_station Initial Category_general_contractor Initial Category_grocery_or_supermarket Initial Category_gym Initial Category_hair_care Initial Category_hardware_store Initial Category_health Initial Category_home_goods_store Initial Category_hospital Initial Category_jewelry_store Initial Category_laundry Initial Category_lawyer Initial Category_liquor_store Initial Category_lodging Initial Category_meal_delivery Initial Category_meal_takeaway Initial Category_movie_rental Initial Category_movie_theater Initial Category_museum Initial Category_night_club Initial Category_park Initial Category_parking Initial Category_pet_store Initial Category_pharmacy Initial Category_physiotherapist Initial Category_post_office Initial Category_real_estate_agency Initial Category_restaurant Initial Category_rv_park Initial Category_school Initial Category_shoe_store Initial Category_shopping_mall Initial Category_spa Initial Category_storage Initial Category_store Initial Category_supermarket Initial Category_tourist_attraction Initial Category_travel_agency Initial Category_university Initial Category_veterinary_care Initial Category_zoo Secondary Category_amusement_park Secondary Category_art_gallery 
Secondary Category_atm Secondary Category_bakery Secondary Category_bank Secondary Category_bar Secondary Category_beauty_salon Secondary Category_bicycle_store Secondary Category_book_store Secondary Category_bowling_alley Secondary Category_cafe Secondary Category_campground Secondary Category_car_repair Secondary Category_car_wash Secondary Category_casino Secondary Category_church Secondary Category_city_hall Secondary Category_clothing_store Secondary Category_convenience_store Secondary Category_dentist Secondary Category_department_store Secondary Category_doctor Secondary Category_drugstore Secondary Category_electronics_store Secondary Category_finance Secondary Category_florist Secondary Category_food Secondary Category_furniture_store Secondary Category_gas_station Secondary Category_general_contractor Secondary Category_grocery_or_supermarket Secondary Category_gym Secondary Category_hair_care Secondary Category_hardware_store Secondary Category_health Secondary Category_home_goods_store Secondary Category_hospital Secondary Category_insurance_agency Secondary Category_jewelry_store Secondary Category_laundry Secondary Category_lawyer Secondary Category_liquor_store Secondary Category_local_government_office Secondary Category_lodging Secondary Category_meal_delivery Secondary Category_meal_takeaway Secondary Category_movie_rental Secondary Category_movie_theater Secondary Category_moving_company Secondary Category_museum Secondary Category_night_club Secondary Category_park Secondary Category_parking Secondary Category_pet_store Secondary Category_pharmacy Secondary Category_physiotherapist Secondary Category_place_of_worship Secondary Category_plumber Secondary Category_point_of_interest Secondary Category_post_office Secondary Category_premise Secondary Category_real_estate_agency Secondary Category_restaurant Secondary Category_roofing_contractor Secondary Category_rv_park Secondary Category_school Secondary Category_shoe_store Secondary 
Category_shopping_mall Secondary Category_spa Secondary Category_storage Secondary Category_store Secondary Category_supermarket Secondary Category_tourist_attraction Secondary Category_travel_agency Secondary Category_veterinary_care Tertiary Category_amusement_park Tertiary Category_art_gallery Tertiary Category_atm Tertiary Category_bakery Tertiary Category_bank Tertiary Category_bar Tertiary Category_beauty_salon Tertiary Category_book_store Tertiary Category_cafe Tertiary Category_campground Tertiary Category_car_dealer Tertiary Category_car_rental Tertiary Category_car_repair Tertiary Category_car_wash Tertiary Category_church Tertiary Category_clothing_store Tertiary Category_convenience_store Tertiary Category_department_store Tertiary Category_doctor Tertiary Category_drugstore Tertiary Category_electronics_store Tertiary Category_establishment Tertiary Category_finance Tertiary Category_florist Tertiary Category_food Tertiary Category_furniture_store Tertiary Category_gas_station Tertiary Category_general_contractor Tertiary Category_grocery_or_supermarket Tertiary Category_gym Tertiary Category_hair_care Tertiary Category_health Tertiary Category_home_goods_store Tertiary Category_hospital Tertiary Category_insurance_agency Tertiary Category_jewelry_store Tertiary Category_laundry Tertiary Category_library Tertiary Category_liquor_store Tertiary Category_local_government_office Tertiary Category_lodging Tertiary Category_meal_delivery Tertiary Category_meal_takeaway Tertiary Category_movie_theater Tertiary Category_museum Tertiary Category_night_club Tertiary Category_park Tertiary Category_parking Tertiary Category_pet_store Tertiary Category_pharmacy Tertiary Category_physiotherapist Tertiary Category_place_of_worship Tertiary Category_point_of_interest Tertiary Category_premise Tertiary Category_real_estate_agency Tertiary Category_restaurant Tertiary Category_rv_park Tertiary Category_school Tertiary Category_shoe_store Tertiary Category_spa Tertiary 
Category_store Tertiary Category_supermarket Tertiary Category_tourist_attraction Tertiary Category_travel_agency Tertiary Category_veterinary_care Tertiary Category_zoo
20362 Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
16572 Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
22264 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
99 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
18185 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
9198 Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3742 Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
7828 Shopping 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
8377 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2764 Shopping 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Bernoulli Naive Bayes Testing Dataset

Unnamed: 0 Call Category Initial Category_amusement_park Initial Category_art_gallery Initial Category_atm Initial Category_bakery Initial Category_bar Initial Category_beauty_salon Initial Category_bicycle_store Initial Category_book_store Initial Category_bowling_alley Initial Category_cafe Initial Category_campground Initial Category_car_repair Initial Category_car_wash Initial Category_casino Initial Category_cemetery Initial Category_church Initial Category_clothing_store Initial Category_convenience_store Initial Category_dentist Initial Category_department_store Initial Category_doctor Initial Category_drugstore Initial Category_electrician Initial Category_electronics_store Initial Category_finance Initial Category_florist Initial Category_furniture_store Initial Category_gas_station Initial Category_general_contractor Initial Category_grocery_or_supermarket Initial Category_gym Initial Category_hair_care Initial Category_hardware_store Initial Category_health Initial Category_home_goods_store Initial Category_hospital Initial Category_jewelry_store Initial Category_laundry Initial Category_lawyer Initial Category_liquor_store Initial Category_lodging Initial Category_meal_delivery Initial Category_meal_takeaway Initial Category_movie_rental Initial Category_movie_theater Initial Category_museum Initial Category_night_club Initial Category_park Initial Category_parking Initial Category_pet_store Initial Category_pharmacy Initial Category_physiotherapist Initial Category_post_office Initial Category_real_estate_agency Initial Category_restaurant Initial Category_rv_park Initial Category_school Initial Category_shoe_store Initial Category_shopping_mall Initial Category_spa Initial Category_storage Initial Category_store Initial Category_supermarket Initial Category_tourist_attraction Initial Category_travel_agency Initial Category_university Initial Category_veterinary_care Initial Category_zoo Secondary Category_amusement_park Secondary Category_art_gallery 
Secondary Category_atm Secondary Category_bakery Secondary Category_bank Secondary Category_bar Secondary Category_beauty_salon Secondary Category_bicycle_store Secondary Category_book_store Secondary Category_bowling_alley Secondary Category_cafe Secondary Category_campground Secondary Category_car_repair Secondary Category_car_wash Secondary Category_casino Secondary Category_church Secondary Category_city_hall Secondary Category_clothing_store Secondary Category_convenience_store Secondary Category_dentist Secondary Category_department_store Secondary Category_doctor Secondary Category_drugstore Secondary Category_electronics_store Secondary Category_finance Secondary Category_florist Secondary Category_food Secondary Category_furniture_store Secondary Category_gas_station Secondary Category_general_contractor Secondary Category_grocery_or_supermarket Secondary Category_gym Secondary Category_hair_care Secondary Category_hardware_store Secondary Category_health Secondary Category_home_goods_store Secondary Category_hospital Secondary Category_insurance_agency Secondary Category_jewelry_store Secondary Category_laundry Secondary Category_lawyer Secondary Category_liquor_store Secondary Category_local_government_office Secondary Category_lodging Secondary Category_meal_delivery Secondary Category_meal_takeaway Secondary Category_movie_rental Secondary Category_movie_theater Secondary Category_moving_company Secondary Category_museum Secondary Category_night_club Secondary Category_park Secondary Category_parking Secondary Category_pet_store Secondary Category_pharmacy Secondary Category_physiotherapist Secondary Category_place_of_worship Secondary Category_plumber Secondary Category_point_of_interest Secondary Category_post_office Secondary Category_premise Secondary Category_real_estate_agency Secondary Category_restaurant Secondary Category_roofing_contractor Secondary Category_rv_park Secondary Category_school Secondary Category_shoe_store Secondary 
Category_shopping_mall Secondary Category_spa Secondary Category_storage Secondary Category_store Secondary Category_supermarket Secondary Category_tourist_attraction Secondary Category_travel_agency Secondary Category_veterinary_care Tertiary Category_amusement_park Tertiary Category_art_gallery Tertiary Category_atm Tertiary Category_bakery Tertiary Category_bank Tertiary Category_bar Tertiary Category_beauty_salon Tertiary Category_book_store Tertiary Category_cafe Tertiary Category_campground Tertiary Category_car_dealer Tertiary Category_car_rental Tertiary Category_car_repair Tertiary Category_car_wash Tertiary Category_church Tertiary Category_clothing_store Tertiary Category_convenience_store Tertiary Category_department_store Tertiary Category_doctor Tertiary Category_drugstore Tertiary Category_electronics_store Tertiary Category_establishment Tertiary Category_finance Tertiary Category_florist Tertiary Category_food Tertiary Category_furniture_store Tertiary Category_gas_station Tertiary Category_general_contractor Tertiary Category_grocery_or_supermarket Tertiary Category_gym Tertiary Category_hair_care Tertiary Category_health Tertiary Category_home_goods_store Tertiary Category_hospital Tertiary Category_insurance_agency Tertiary Category_jewelry_store Tertiary Category_laundry Tertiary Category_library Tertiary Category_liquor_store Tertiary Category_local_government_office Tertiary Category_lodging Tertiary Category_meal_delivery Tertiary Category_meal_takeaway Tertiary Category_movie_theater Tertiary Category_museum Tertiary Category_night_club Tertiary Category_park Tertiary Category_parking Tertiary Category_pet_store Tertiary Category_pharmacy Tertiary Category_physiotherapist Tertiary Category_place_of_worship Tertiary Category_point_of_interest Tertiary Category_premise Tertiary Category_real_estate_agency Tertiary Category_restaurant Tertiary Category_rv_park Tertiary Category_school Tertiary Category_shoe_store Tertiary Category_spa Tertiary 
Category_store Tertiary Category_supermarket Tertiary Category_tourist_attraction Tertiary Category_travel_agency Tertiary Category_veterinary_care Tertiary Category_zoo
6689 Shopping 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
8197 Lodging 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8149 Medical 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
19980 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
20495 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
7231 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
16839 Lodging 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9682 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
12540 Grocery 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
14777 Restaurants 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


Code

Once again, the code can be found [here].


Results

For each method, the balance of the labels was examined, the accuracy of the model on the test set was reported, and a confusion matrix was created.
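As a rough illustration of those three checks, the sketch below computes label balance, accuracy, and a confusion matrix in plain Python. The toy labels and the helper names (`label_balance`, `accuracy`, `confusion_matrix`) are hypothetical and are not taken from the project's code; libraries such as scikit-learn offer equivalents (`accuracy_score` and `confusion_matrix` in `sklearn.metrics`).

```python
from collections import Counter


def label_balance(labels):
    """Return each label's share of the dataset as a fraction."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical toy data, for illustration only.
labels = ["Grocery", "Restaurants", "Shopping"]
y_true = ["Grocery", "Grocery", "Restaurants", "Shopping", "Restaurants", "Grocery"]
y_pred = ["Grocery", "Restaurants", "Restaurants", "Shopping", "Restaurants", "Grocery"]

balance = label_balance(y_true)
acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, labels)

print(balance)
print(f"accuracy = {acc:.2%}")
for label, row in zip(labels, cm):
    print(label, row)
```

Reading the matrix row by row shows which true labels the model struggles with, which a single accuracy figure cannot show.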



Multinomial Naive Bayes


Balance Across Labels for Multinomial Naive Bayes.

The accuracy for this result was 75.44%.


Confusion Matrix for Multinomial Naive Bayes.

Although this accuracy would be acceptable in many applications, the labels were heavily skewed toward the Other pass type. Because the model was trained on such an unbalanced dataset, the accuracy is somewhat misleading. In the confusion matrix, the Epic pass did not receive a single correct prediction, and Ikon had more incorrect predictions than correct ones.

Balancing the dataset or experimenting with hyperparameters could be beneficial for this problem.
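To make the concern about misleading accuracy concrete, the following sketch compares plain accuracy with a majority-class baseline and with per-class recall. The label counts are invented for illustration and do not reproduce the project's data; the helper names are hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most common label."""
    counts = Counter(y_true)
    return max(counts.values()) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true label, the fraction of its rows predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical, heavily skewed labels: a model that only ever predicts "Other".
y_true = ["Other"] * 8 + ["Ikon"] + ["Epic"]
y_pred = ["Other"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
baseline = majority_baseline_accuracy(y_true)
recall = per_class_recall(y_true, y_pred)
# Balanced accuracy is the unweighted mean of the per-class recalls.
balanced_accuracy = sum(recall.values()) / len(recall)

print(f"accuracy          = {plain_accuracy:.2f}")
print(f"majority baseline = {baseline:.2f}")
print(f"per-class recall  = {recall}")
print(f"balanced accuracy = {balanced_accuracy:.2f}")
```

When accuracy is no better than the majority baseline and the minority classes have zero recall, the headline number mostly reflects the label skew rather than what the model learned.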


Gaussian Naive Bayes


Balance Across Labels for Gaussian Naive Bayes.

The accuracy for this result was 68.25%.


Confusion Matrix for Gaussian Naive Bayes.

Balancing was attempted for this problem by combining the low-occurrence labels into an Other category, although the resulting labels still weren't perfectly balanced. This was a rather large dataset, so there should be plenty of data to train on even for the less frequent labels. Weather prediction is a notoriously difficult problem. Possible improvements for this specific data include adding location data, at least at a regional scale, or performing some hyperparameter tuning.
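The label-combining step described above could look something like the sketch below. The `min_count` threshold, the helper name `combine_rare_labels`, and the weather labels are all assumptions made for illustration; the threshold actually used in the analysis is not stated here.

```python
from collections import Counter


def combine_rare_labels(labels, min_count, other_label="Other"):
    """Replace every label seen fewer than min_count times with other_label."""
    counts = Counter(labels)
    return [label if counts[label] >= min_count else other_label for label in labels]


# Hypothetical weather descriptions, for illustration only.
weather = ["Clear"] * 5 + ["Rain"] * 4 + ["Fog"] + ["Hail"]

combined = combine_rare_labels(weather, min_count=3)
combined_counts = Counter(combined)

print(Counter(weather))
print(combined_counts)
```

Grouping reduces the number of very small classes, but as noted above it does not guarantee that the remaining classes are evenly sized.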


Bernoulli Naive Bayes


Balance Across Labels for Bernoulli Naive Bayes.

The accuracy for this result was 92.94%.


Confusion Matrix for Bernoulli Naive Bayes.

Given that the accuracy indicates a successful model, this result corroborates a pattern between the sub-categories and the parent call category given to the API. Hyperparameter tuning could improve performance here; however, the incorrect predictions could themselves reveal correlations worth investigating. For instance, the Bars label was predicted correctly more often than not, but its most prevalent incorrect prediction was Restaurants. Bars and Restaurants are often closely related, and more relationships like this appear in the confusion matrix.
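One way to surface these label-to-label relationships systematically is to pull the largest off-diagonal entry out of each row of the confusion matrix. The sketch below assumes rows are true labels and columns are predicted labels; the matrix values and the helper name `top_confusions` are hypothetical and do not reproduce the reported results.

```python
def top_confusions(matrix, labels):
    """For each true label, return (most common wrong prediction, count).

    Labels with no incorrect predictions are omitted from the result.
    """
    result = {}
    for i, row in enumerate(matrix):
        errors = [(count, labels[j]) for j, count in enumerate(row)
                  if j != i and count > 0]
        if errors:
            count, predicted = max(errors)
            result[labels[i]] = (predicted, count)
    return result


# Hypothetical confusion matrix (rows = true label, columns = predicted label).
labels = ["Bars", "Restaurants", "Grocery"]
matrix = [
    [6, 3, 1],   # true Bars
    [1, 20, 2],  # true Restaurants
    [0, 1, 15],  # true Grocery
]

confusions = top_confusions(matrix, labels)
for true_label, (predicted, count) in confusions.items():
    print(f"{true_label} most often mistaken for {predicted} ({count} times)")
```

Pairs that show up this way are candidates for the kind of follow-up investigation suggested above.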



Conclusions


In this analysis, several different patterns were explored.

Potential links between daily weather events and the types of ski resort passes were investigated. This didn't yield great results and could require more data or a different methodology in a future analysis.

The relationship between real-time weather outcomes and the general weather descriptions for each day was examined. This yielded more promising results, although location-based data or further refinement of the methods might yield better correlations.

The relationships between different categories of businesses located near ski resorts were assessed. If someone were to search on Google for a specific service near a ski resort, they would likely receive the services they requested. The findings also suggest that when the returned services aren't exactly what was requested, they are highly correlated with it. For example, if a user searched for a bar, they might receive results for a restaurant.