Association rule mining (ARM) is a technique used to find and quantify relationships within sets,
specifically the co-occurrence of events. A common use of, and colloquialism for, ARM is
market basket analysis, where items purchased together by customers are examined. The goal of this
specific analysis is to answer the question “Given a limited number of items at a location,
what items are most associated with each other?” More generally, does a set or subset imply another set or subset? This is the idea of a rule. ARM can also
be used for purposes other than customer-based studies, and such a use is the focus of this particular analysis.
['Interstellar', 'Click', 'The Lord of the Rings', 'Up', 'Scarface'] |
['The Martian', 'Die Hard', 'Beerfest'] |
['Dune', 'Scarface', 'Forest Gump'] |
['ET', 'Toy Story', 'Beerfest', 'Inception', 'Click'] |
['Interstellar', 'Inception', 'Raiders of the Lost Ark', 'Toy Story', 'Fight Club'] |
['The Martian', 'The Matrix'] |
['Shawshank Redemption', 'The Martian', 'Die Hard'] |
['Shawshank Redemption', 'The Martian'] |
['Up', 'ET'] |
['Toy Story', 'ET', 'Scarface', 'The Matrix', 'Inception'] |
index | support | itemsets |
---|---|---|
0 | 0.2 | frozenset({'Beerfest'}) |
1 | 0.2 | frozenset({'Click'}) |
2 | 0.2 | frozenset({'Die Hard'}) |
3 | 0.3 | frozenset({'ET'}) |
4 | 0.3 | frozenset({'Inception'}) |
5 | 0.2 | frozenset({'Interstellar'}) |
6 | 0.3 | frozenset({'Scarface'}) |
7 | 0.2 | frozenset({'Shawshank Redemption'}) |
8 | 0.4 | frozenset({'The Martian'}) |
9 | 0.2 | frozenset({'The Matrix'}) |
10 | 0.3 | frozenset({'Toy Story'}) |
11 | 0.2 | frozenset({'Up'}) |
12 | 0.2 | frozenset({'The Martian', 'Die Hard'}) |
13 | 0.2 | frozenset({'Inception', 'ET'}) |
14 | 0.2 | frozenset({'Toy Story', 'ET'}) |
15 | 0.3 | frozenset({'Toy Story', 'Inception'}) |
16 | 0.2 | frozenset({'The Martian', 'Shawshank Redemption'}) |
17 | 0.2 | frozenset({'Toy Story', 'Inception', 'ET'}) |
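The frequent itemsets above can be reproduced with a short level-wise search. The following is a minimal sketch using only the Python standard library. The function names and the 0.2 support threshold are assumptions chosen to match the table, not necessarily what produced it.

```python
from itertools import combinations

# The ten example transactions listed above, copied verbatim.
transactions = [
    {"Interstellar", "Click", "The Lord of the Rings", "Up", "Scarface"},
    {"The Martian", "Die Hard", "Beerfest"},
    {"Dune", "Scarface", "Forest Gump"},
    {"ET", "Toy Story", "Beerfest", "Inception", "Click"},
    {"Interstellar", "Inception", "Raiders of the Lost Ark", "Toy Story", "Fight Club"},
    {"The Martian", "The Matrix"},
    {"Shawshank Redemption", "The Martian", "Die Hard"},
    {"Shawshank Redemption", "The Martian"},
    {"Up", "ET"},
    {"Toy Story", "ET", "Scarface", "The Matrix", "Inception"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Apriori-style search: grow itemsets one item at a time and keep
    only the candidates whose support meets the threshold."""
    current = [frozenset([i]) for i in sorted(set().union(*transactions))]
    result = {}
    while current:
        kept = {c: s for c in current
                if (s := support(c, transactions)) >= min_support}
        result.update(kept)
        survivors = list(kept)
        k = len(survivors[0]) + 1 if survivors else 0
        # Join surviving itemsets into candidates that are one item larger.
        candidates = {a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == k}
        # Apriori pruning: every smaller subset must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return result


itemsets = frequent_itemsets(transactions, min_support=0.2)
for items, s in sorted(itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(s, set(items))
```

The pruning step is what separates this from brute force: a pair such as Die Hard and Shawshank Redemption is not frequent, so no triple containing it is ever counted.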
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'The Martian'}) | frozenset({'Die Hard'}) | 0.4 | 0.2 | 0.2 | 0.500000 | 2.500000 | 0.12 | 1.6 | 1.000000 |
1 | frozenset({'Die Hard'}) | frozenset({'The Martian'}) | 0.2 | 0.4 | 0.2 | 1.000000 | 2.500000 | 0.12 | inf | 0.750000 |
2 | frozenset({'Inception'}) | frozenset({'ET'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
3 | frozenset({'ET'}) | frozenset({'Inception'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
4 | frozenset({'Toy Story'}) | frozenset({'ET'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
5 | frozenset({'ET'}) | frozenset({'Toy Story'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
6 | frozenset({'Toy Story'}) | frozenset({'Inception'}) | 0.3 | 0.3 | 0.3 | 1.000000 | 3.333333 | 0.21 | inf | 1.000000 |
7 | frozenset({'Inception'}) | frozenset({'Toy Story'}) | 0.3 | 0.3 | 0.3 | 1.000000 | 3.333333 | 0.21 | inf | 1.000000 |
8 | frozenset({'The Martian'}) | frozenset({'Shawshank Redemption'}) | 0.4 | 0.2 | 0.2 | 0.500000 | 2.500000 | 0.12 | 1.6 | 1.000000 |
9 | frozenset({'Shawshank Redemption'}) | frozenset({'The Martian'}) | 0.2 | 0.4 | 0.2 | 1.000000 | 2.500000 | 0.12 | inf | 0.750000 |
10 | frozenset({'Toy Story', 'Inception'}) | frozenset({'ET'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
11 | frozenset({'Toy Story', 'ET'}) | frozenset({'Inception'}) | 0.2 | 0.3 | 0.2 | 1.000000 | 3.333333 | 0.14 | inf | 0.875000 |
12 | frozenset({'Inception', 'ET'}) | frozenset({'Toy Story'}) | 0.2 | 0.3 | 0.2 | 1.000000 | 3.333333 | 0.14 | inf | 0.875000 |
13 | frozenset({'Toy Story'}) | frozenset({'Inception', 'ET'}) | 0.3 | 0.2 | 0.2 | 0.666667 | 3.333333 | 0.14 | 2.4 | 1.000000 |
14 | frozenset({'Inception'}) | frozenset({'Toy Story', 'ET'}) | 0.3 | 0.2 | 0.2 | 0.666667 | 3.333333 | 0.14 | 2.4 | 1.000000 |
15 | frozenset({'ET'}) | frozenset({'Toy Story', 'Inception'}) | 0.3 | 0.3 | 0.2 | 0.666667 | 2.222222 | 0.11 | 2.1 | 0.785714 |
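Each metric column in the rule table follows from three numbers: the support of the antecedent, the support of the consequent, and the support of both together. As a rough sketch (the function name `rule_metrics` is hypothetical), the standard definitions can be written as:

```python
import math


def rule_metrics(sup_a, sup_c, sup_ac):
    """Metrics for a rule A -> C.

    sup_a  : support of the antecedent A
    sup_c  : support of the consequent C
    sup_ac : support of A and C occurring together
    """
    confidence = sup_ac / sup_a                # P(C | A)
    lift = confidence / sup_c                  # 1.0 means A and C are independent
    leverage = sup_ac - sup_a * sup_c          # observed minus expected co-occurrence
    # Conviction is undefined (reported as inf) when the rule never fails.
    conviction = math.inf if confidence >= 1.0 else (1 - sup_c) / (1 - confidence)
    denom = max(sup_ac * (1 - sup_a), sup_a * (sup_c - sup_ac))
    zhang = leverage / denom if denom else 0.0  # Zhang's metric, in [-1, 1]
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
        "zhangs_metric": zhang,
    }


# Toy Story -> ET, using the supports from the table above.
toy_story_to_et = rule_metrics(sup_a=0.3, sup_c=0.3, sup_ac=0.2)
# Die Hard -> The Martian, a rule that holds in every transaction containing Die Hard.
die_hard_to_martian = rule_metrics(sup_a=0.2, sup_c=0.4, sup_ac=0.2)
```

Note that lift and leverage are symmetric in A and C, while confidence, conviction, and Zhang's metric depend on the direction of the rule, which is why paired rows in the table share some values but not others.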
This analysis will focus on finding associations between categories returned by the Google Places API. The API itself returns a list of categories associated with each business.
The categories themselves will be analyzed; however, a few labels can also be applied to the transaction-type data to help identify associations, namely the call category, resort, country, pass, and region.
Preparing data for this type of analysis consists of creating the initial transaction-type data and then allowing for expansion into labels.
The general preparation process:
Apply ast.literal_eval() to ensure list type.
The initial dataset contains a column with list values required for the transaction data. The cleaning steps mentioned above prepare this data for merging with the cleaned Google Places data.
latitude | longitude | name | rating | types | total_ratings | vicinity | resort | call_category | price_level |
---|---|---|---|---|---|---|---|---|---|
39.639411 | -106.367836 | Manor Vail Lodge | 4.7 | ['bar', 'lodging', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 370.0 | 595 Vail Valley Drive, Vail | Vail | Restaurants | NaN |
39.641578 | -106.371678 | Gravity Haus Vail | 4.4 | ['gym', 'spa', 'lodging', 'restaurant', 'food', 'point_of_interest', 'health', 'establishment'] | 256.0 | 352 East Meadow Drive, Vail | Vail | Restaurants | NaN |
39.642639 | -106.377803 | Leonora | 4.3 | ['restaurant', 'food', 'point_of_interest', 'establishment'] | 167.0 | 16 Vail Road, Vail | Vail | Restaurants | 3.0 |
39.638962 | -106.369379 | Larkspur Events & Dining | 4.5 | ['restaurant', 'food', 'point_of_interest', 'establishment'] | 198.0 | 458 Vail Valley Drive, Vail | Vail | Restaurants | 3.0 |
39.630370 | -106.418694 | Subway | 2.7 | ['meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 105.0 | 2161 North Frontage Road West #11-12, Vail | Vail | Restaurants | 1.0 |
39.640861 | -106.374665 | Sweet Basil | 4.4 | ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 838.0 | 193 Gore Creek Drive, Vail | Vail | Restaurants | 3.0 |
39.640228 | -106.374381 | Elway's | 4.3 | ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 385.0 | Located Upstairs in The Lodge at Vail, 174 Gore Creek Drive, Vail | Vail | Restaurants | 4.0 |
39.643914 | -106.390088 | The Little Diner | 4.7 | ['restaurant', 'food', 'point_of_interest', 'store', 'establishment'] | 1390.0 | 616 West Lionshead Circle, Vail | Vail | Restaurants | 2.0 |
39.640248 | -106.373333 | Red Lion | 3.9 | ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 740.0 | 304 Bridge Street St.1, Vail | Vail | Restaurants | 2.0 |
39.641490 | -106.397471 | Chicago Pizza | 3.9 | ['meal_delivery', 'meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] | 216.0 | 1031 South Frontage Road West, Vail | Vail | Restaurants | 1.0 |
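When a column of lists is saved to CSV and read back, each list typically comes back as a string, which is why the list-type step in the preparation process is needed. A minimal sketch of that step follows. It assumes the `types` column is stored as a string representation of a Python list, and the helper name `to_list` is hypothetical.

```python
import ast


def to_list(value):
    """Return `value` as a list, parsing it when it is a string such as
    "['bar', 'food']". Values that are already lists pass through unchanged."""
    if isinstance(value, list):
        return value
    parsed = ast.literal_eval(value)  # safely evaluates literals only, not arbitrary code
    if not isinstance(parsed, list):
        raise ValueError(f"expected a list literal, got {type(parsed).__name__}")
    return parsed


# The `types` value for Manor Vail Lodge as it would appear after a CSV round trip.
raw_types = "['bar', 'lodging', 'restaurant', 'food', 'point_of_interest', 'establishment']"
types = to_list(raw_types)
```

With pandas, the same helper could be applied to the whole column, for example with `df["types"].apply(to_list)`, assuming the frame is named `df`.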
The final cleaned Google Places data used across this project, into which the initial dataset will be merged.
Latitude | Longitude | Name | rating | total_ratings | Resort | Call Category | Initial Category | Secondary Category | Tertiary Category |
---|---|---|---|---|---|---|---|---|---|
39.639411 | -106.367836 | Manor Vail Lodge | 4.7 | 370.0 | Vail | Restaurants | bar | lodging | restaurant |
39.641578 | -106.371678 | Gravity Haus Vail | 4.4 | 256.0 | Vail | Restaurants | gym | spa | lodging |
39.642639 | -106.377803 | Leonora | 4.3 | 167.0 | Vail | Restaurants | restaurant | food | point_of_interest |
39.638962 | -106.369379 | Larkspur Events & Dining | 4.5 | 198.0 | Vail | Restaurants | restaurant | food | point_of_interest |
39.630370 | -106.418694 | Subway | 2.7 | 105.0 | Vail | Restaurants | meal_takeaway | restaurant | food |
39.640861 | -106.374665 | Sweet Basil | 4.4 | 838.0 | Vail | Restaurants | bar | restaurant | food |
39.640228 | -106.374381 | Elway's | 4.3 | 385.0 | Vail | Restaurants | bar | restaurant | food |
39.643914 | -106.390088 | The Little Diner | 4.7 | 1390.0 | Vail | Restaurants | restaurant | food | point_of_interest |
39.640248 | -106.373333 | Red Lion | 3.9 | 740.0 | Vail | Restaurants | bar | restaurant | food |
39.641490 | -106.397471 | Chicago Pizza | 3.9 | 216.0 | Vail | Restaurants | meal_delivery | meal_takeaway | restaurant |
The ARM-ready dataset. The main transaction data is held in the types column, while the label columns are available to the expansion functions in the functions script.
types | Call Category | Resort | Country | Pass | Region |
---|---|---|---|---|---|
['bar', 'lodging', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['gym', 'spa', 'lodging', 'restaurant', 'food', 'point_of_interest', 'health', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['restaurant', 'food', 'point_of_interest', 'store', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
['meal_delivery', 'meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] | Restaurants | Vail | United States | Epic | West |
An isolated snippet of the transaction-type data.
['bar', 'lodging', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
['gym', 'spa', 'lodging', 'restaurant', 'food', 'point_of_interest', 'health', 'establishment'] |
['restaurant', 'food', 'point_of_interest', 'establishment'] |
['restaurant', 'food', 'point_of_interest', 'establishment'] |
['meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
['restaurant', 'food', 'point_of_interest', 'store', 'establishment'] |
['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
['meal_delivery', 'meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'] |
Using only the main transaction-type data (i.e., no labels included), the Apriori algorithm was run to find frequent itemsets, and an Apriori-based rule algorithm was then run to find association rules.
Because this was a large dataset, a low support threshold was used for the initial algorithm and a low confidence threshold for the secondary algorithm, in order to capture as many frequent itemsets and association rules as possible. The final association rules can always be reduced by filtering on different
thresholds if required. The size of the dataset is relevant to support because support measures the proportion of the entire dataset in which an itemset occurs; with a higher threshold, rarer occurrences would be pruned.
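The effect of the support threshold can be seen even on the ten-row snippet above. The sketch below counts every itemset by brute force, which is only practical for a handful of rows, and compares two thresholds. The values 0.5 and 0.1 are illustrative assumptions, not the thresholds used in this analysis.

```python
from collections import Counter
from itertools import combinations

# The ten transactions from the isolated snippet above.
snippet = [
    ['bar', 'lodging', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    ['gym', 'spa', 'lodging', 'restaurant', 'food', 'point_of_interest', 'health', 'establishment'],
    ['restaurant', 'food', 'point_of_interest', 'establishment'],
    ['restaurant', 'food', 'point_of_interest', 'establishment'],
    ['meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    ['restaurant', 'food', 'point_of_interest', 'store', 'establishment'],
    ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    ['meal_delivery', 'meal_takeaway', 'restaurant', 'food', 'point_of_interest', 'establishment'],
]


def itemset_supports(transactions):
    """Support of every itemset that occurs in at least one transaction."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            counts.update(frozenset(c) for c in combinations(items, k))
    return {s: c / len(transactions) for s, c in counts.items()}


supports = itemset_supports(snippet)
high = {s for s, v in supports.items() if v >= 0.5}  # stricter, illustrative threshold
low = {s for s, v in supports.items() if v >= 0.1}   # looser, illustrative threshold
print(len(high), "itemsets at 0.5;", len(low), "itemsets at 0.1")
```

At the stricter threshold only combinations of the four categories present in every row survive, so less common categories such as bar or meal_takeaway never reach the rule-generation step.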
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'point_of_interest'}) | frozenset({'establishment'}) | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
1 | frozenset({'establishment'}) | frozenset({'point_of_interest'}) | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
2 | frozenset({'point_of_interest'}) | frozenset({'food'}) | 1.000000 | 0.371505 | 0.371505 | 0.371505 | 1.0 | 0.0 | 1.0 | 0.0 |
3 | frozenset({'food'}) | frozenset({'establishment', 'point_of_interest'}) | 0.371505 | 1.000000 | 0.371505 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
4 | frozenset({'establishment'}) | frozenset({'food', 'point_of_interest'}) | 1.000000 | 0.371505 | 0.371505 | 0.371505 | 1.0 | 0.0 | 1.0 | 0.0 |
5 | frozenset({'point_of_interest'}) | frozenset({'establishment', 'food'}) | 1.000000 | 0.371505 | 0.371505 | 0.371505 | 1.0 | 0.0 | 1.0 | 0.0 |
6 | frozenset({'establishment', 'point_of_interest'}) | frozenset({'food'}) | 1.000000 | 0.371505 | 0.371505 | 0.371505 | 1.0 | 0.0 | 1.0 | 0.0 |
7 | frozenset({'food', 'point_of_interest'}) | frozenset({'establishment'}) | 0.371505 | 1.000000 | 0.371505 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
8 | frozenset({'establishment', 'food'}) | frozenset({'point_of_interest'}) | 0.371505 | 1.000000 | 0.371505 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
9 | frozenset({'food'}) | frozenset({'establishment'}) | 0.371505 | 1.000000 | 0.371505 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
10 | frozenset({'establishment'}) | frozenset({'food'}) | 1.000000 | 0.371505 | 0.371505 | 0.371505 | 1.0 | 0.0 | 1.0 | 0.0 |
11 | frozenset({'food'}) | frozenset({'point_of_interest'}) | 0.371505 | 1.000000 | 0.371505 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
12 | frozenset({'point_of_interest', 'store'}) | frozenset({'establishment'}) | 0.331246 | 1.000000 | 0.331246 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
13 | frozenset({'establishment'}) | frozenset({'store'}) | 1.000000 | 0.331246 | 0.331246 | 0.331246 | 1.0 | 0.0 | 1.0 | 0.0 |
14 | frozenset({'store'}) | frozenset({'establishment', 'point_of_interest'}) | 0.331246 | 1.000000 | 0.331246 | 1.000000 | 1.0 | 0.0 | inf | 0.0 |
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'establishment', 'point_of_interest', 'restaurant'}) | frozenset({'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
1 | frozenset({'establishment', 'restaurant'}) | frozenset({'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
2 | frozenset({'food'}) | frozenset({'point_of_interest', 'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
3 | frozenset({'food'}) | frozenset({'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
4 | frozenset({'restaurant'}) | frozenset({'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
5 | frozenset({'restaurant'}) | frozenset({'food', 'point_of_interest'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
6 | frozenset({'establishment', 'point_of_interest', 'food'}) | frozenset({'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
7 | frozenset({'food', 'point_of_interest'}) | frozenset({'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
8 | frozenset({'restaurant'}) | frozenset({'establishment', 'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
9 | frozenset({'food'}) | frozenset({'establishment', 'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
10 | frozenset({'point_of_interest', 'restaurant'}) | frozenset({'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
11 | frozenset({'establishment', 'food'}) | frozenset({'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
12 | frozenset({'establishment', 'food'}) | frozenset({'point_of_interest', 'restaurant'}) | 0.371505 | 0.241776 | 0.241776 | 0.650801 | 2.691751 | 0.151955 | 2.171324 | 1.000000 |
13 | frozenset({'establishment', 'restaurant'}) | frozenset({'food', 'point_of_interest'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
14 | frozenset({'restaurant'}) | frozenset({'establishment', 'point_of_interest', 'food'}) | 0.241776 | 0.371505 | 0.241776 | 1.000000 | 2.691751 | 0.151955 | inf | 0.828904 |
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'establishment', 'point_of_interest', 'store', 'supermarket'}) | frozenset({'grocery_or_supermarket'}) | 0.026149 | 0.043170 | 0.026149 | 1.0 | 23.164454 | 0.025020 | inf | 0.982522 |
1 | frozenset({'food', 'convenience_store', 'drugstore'}) | frozenset({'health'}) | 0.011377 | 0.208704 | 0.011377 | 1.0 | 4.791464 | 0.009002 | inf | 0.800401 |
2 | frozenset({'food', 'convenience_store', 'drugstore'}) | frozenset({'establishment'}) | 0.011377 | 1.000000 | 0.011377 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
3 | frozenset({'finance', 'food', 'store'}) | frozenset({'establishment', 'point_of_interest'}) | 0.012611 | 1.000000 | 0.012611 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
4 | frozenset({'convenience_store', 'drugstore'}) | frozenset({'establishment', 'food'}) | 0.011377 | 0.371505 | 0.011377 | 1.0 | 2.691751 | 0.007150 | inf | 0.635727 |
5 | frozenset({'finance', 'food', 'point_of_interest', 'store'}) | frozenset({'establishment'}) | 0.012611 | 1.000000 | 0.012611 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
6 | frozenset({'finance', 'food', 'store', 'establishment'}) | frozenset({'point_of_interest'}) | 0.012611 | 1.000000 | 0.012611 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
7 | frozenset({'establishment', 'convenience_store', 'drugstore'}) | frozenset({'health'}) | 0.011377 | 0.208704 | 0.011377 | 1.0 | 4.791464 | 0.009002 | inf | 0.800401 |
8 | frozenset({'convenience_store', 'drugstore', 'health'}) | frozenset({'establishment'}) | 0.011377 | 1.000000 | 0.011377 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
9 | frozenset({'convenience_store', 'drugstore'}) | frozenset({'establishment', 'health'}) | 0.011377 | 0.208704 | 0.011377 | 1.0 | 4.791464 | 0.009002 | inf | 0.800401 |
10 | frozenset({'drugstore', 'pharmacy'}) | frozenset({'point_of_interest', 'store', 'health'}) | 0.011465 | 0.037878 | 0.011465 | 1.0 | 26.400466 | 0.011031 | inf | 0.973280 |
11 | frozenset({'establishment', 'convenience_store', 'drugstore'}) | frozenset({'point_of_interest'}) | 0.011377 | 1.000000 | 0.011377 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
12 | frozenset({'convenience_store', 'point_of_interest', 'drugstore'}) | frozenset({'establishment'}) | 0.011377 | 1.000000 | 0.011377 | 1.0 | 1.000000 | 0.000000 | inf | 0.000000 |
13 | frozenset({'point_of_interest', 'drugstore', 'pharmacy'}) | frozenset({'store', 'health'}) | 0.011465 | 0.037878 | 0.011465 | 1.0 | 26.400466 | 0.011031 | inf | 0.973280 |
14 | frozenset({'store', 'drugstore', 'pharmacy'}) | frozenset({'point_of_interest', 'health'}) | 0.011465 | 0.208704 | 0.011465 | 1.0 | 4.791464 | 0.009072 | inf | 0.800473 |
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'grocery_or_supermarket', 'store', 'supermarket'}) | frozenset({'food', 'establishment'}) | 0.026149 | 0.371505 | 0.026149 | 1.0 | 2.691751 | 0.016434 | inf | 0.645370 |
1 | frozenset({'finance', 'convenience_store', 'atm'}) | frozenset({'establishment', 'store'}) | 0.011641 | 0.331246 | 0.011641 | 1.0 | 3.018903 | 0.007785 | inf | 0.676631 |
2 | frozenset({'cafe', 'bakery'}) | frozenset({'point_of_interest', 'store'}) | 0.011156 | 0.331246 | 0.011156 | 1.0 | 3.018903 | 0.007461 | inf | 0.676299 |
3 | frozenset({'convenience_store', 'atm'}) | frozenset({'finance', 'food', 'establishment'}) | 0.011641 | 0.012744 | 0.011641 | 1.0 | 78.470588 | 0.011493 | inf | 0.998885 |
4 | frozenset({'cafe', 'point_of_interest', 'bakery'}) | frozenset({'store'}) | 0.011156 | 0.331246 | 0.011156 | 1.0 | 3.018903 | 0.007461 | inf | 0.676299 |
5 | frozenset({'establishment', 'convenience_store', 'point_of_interest', 'atm'}) | frozenset({'finance'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
6 | frozenset({'establishment', 'convenience_store', 'atm'}) | frozenset({'finance', 'point_of_interest'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
7 | frozenset({'convenience_store', 'point_of_interest', 'atm'}) | frozenset({'finance', 'establishment'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
8 | frozenset({'convenience_store', 'atm'}) | frozenset({'finance', 'establishment', 'point_of_interest'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
9 | frozenset({'restaurant', 'cafe', 'point_of_interest', 'bakery'}) | frozenset({'food', 'store'}) | 0.010495 | 0.172017 | 0.010495 | 1.0 | 5.813381 | 0.008689 | inf | 0.836765 |
10 | frozenset({'finance', 'establishment', 'convenience_store', 'atm'}) | frozenset({'store'}) | 0.011641 | 0.331246 | 0.011641 | 1.0 | 3.018903 | 0.007785 | inf | 0.676631 |
11 | frozenset({'establishment', 'convenience_store', 'store', 'atm'}) | frozenset({'finance'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
12 | frozenset({'establishment', 'convenience_store', 'atm'}) | frozenset({'finance', 'store'}) | 0.011641 | 0.012964 | 0.011641 | 1.0 | 77.136054 | 0.011490 | inf | 0.998662 |
13 | frozenset({'food', 'convenience_store', 'atm'}) | frozenset({'finance', 'establishment'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
14 | frozenset({'convenience_store', 'store', 'atm'}) | frozenset({'finance', 'establishment'}) | 0.011641 | 0.013449 | 0.011641 | 1.0 | 74.354098 | 0.011485 | inf | 0.998171 |
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|---|
0 | frozenset({'food', 'drugstore'}) | frozenset({'point_of_interest', 'convenience_store', 'store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
1 | frozenset({'food', 'store', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
2 | frozenset({'food', 'point_of_interest', 'store', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
3 | frozenset({'food', 'drugstore', 'establishment'}) | frozenset({'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
4 | frozenset({'food', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
5 | frozenset({'food', 'point_of_interest', 'drugstore'}) | frozenset({'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
6 | frozenset({'food', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'point_of_interest', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
7 | frozenset({'food', 'drugstore'}) | frozenset({'convenience_store', 'point_of_interest', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
8 | frozenset({'food', 'store', 'drugstore', 'establishment'}) | frozenset({'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
9 | frozenset({'food', 'drugstore', 'establishment'}) | frozenset({'convenience_store', 'store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
10 | frozenset({'food', 'drugstore', 'establishment'}) | frozenset({'point_of_interest', 'convenience_store', 'store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
11 | frozenset({'food', 'store', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'point_of_interest', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
12 | frozenset({'food', 'point_of_interest', 'drugstore'}) | frozenset({'establishment', 'convenience_store', 'store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
13 | frozenset({'food', 'establishment', 'store', 'point_of_interest', 'drugstore'}) | frozenset({'convenience_store', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
14 | frozenset({'food', 'store', 'drugstore', 'establishment'}) | frozenset({'convenience_store', 'point_of_interest', 'health'}) | 0.011597 | 0.011641 | 0.011377 | 0.980989 | 84.268406 | 0.011242 | 51.987671 | 0.999727 |
The top rules without the lift filter show that Establishment and Point of Interest are very common, if not present in every rule. Applying
a lift threshold of greater than 1 therefore surfaces rules with more significant associations. In fact, sorting the rules by descending lift illustrates
some of the most significant associations.
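The lift filter and the descending sort can be sketched in a few lines. The example below uses four rules taken from the tables above and represents each rule as a plain dictionary, which is an assumption made to keep the sketch dependency-free.

```python
# Four rules copied from the tables above (antecedents, consequents, lift).
rules = [
    {"antecedents": frozenset({"point_of_interest"}),
     "consequents": frozenset({"establishment"}), "lift": 1.0},
    {"antecedents": frozenset({"restaurant"}),
     "consequents": frozenset({"food"}), "lift": 2.691751},
    {"antecedents": frozenset({"drugstore", "pharmacy"}),
     "consequents": frozenset({"point_of_interest", "store", "health"}), "lift": 26.400466},
    {"antecedents": frozenset({"food", "drugstore"}),
     "consequents": frozenset({"establishment", "convenience_store", "health"}), "lift": 84.268406},
]

# Keep rules whose antecedent and consequent co-occur more often than
# independence would predict, then rank the strongest associations first.
by_lift = sorted(
    (r for r in rules if r["lift"] > 1),
    key=lambda r: r["lift"],
    reverse=True,
)

for r in by_lift:
    print(sorted(r["antecedents"]), "->", sorted(r["consequents"]), r["lift"])
```

A lift of exactly 1 means the consequent is no more likely given the antecedent than it is overall, which is why rules built around categories present in every record drop out.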
To further illustrate these associations, network visualizations were created. Note that these networks are interactive and contain hover information.
Associations between the returned categories reveal interesting patterns in their own right. However, insight can also be gained into the efficacy of the Google Places API by appending the
call category label to the datasets. In other words, when the API was called with a specific business category in mind, what was actually returned?
For this process, the label was appended to the transaction-type data and association rules were generated again with the same low thresholds to capture as many associations as possible. Once the rules were created, they
were filtered down to only those with the call category as the single antecedent.
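The two steps described here, appending the label and then keeping only rules whose sole antecedent is a call category, might look roughly like the following. The `call_` prefix matches the labels that appear in the rules, but the exact transformation from the Call Category column and both helper names are assumptions.

```python
def with_call_label(types, call_category):
    """Return the transaction as a set with a 'call_<category>' item added."""
    label = "call_" + call_category.lower().replace(" ", "_")
    return set(types) | {label}


def has_single_call_antecedent(rule):
    """True when the rule's only antecedent is a call-category label."""
    antecedents = rule["antecedents"]
    return len(antecedents) == 1 and next(iter(antecedents)).startswith("call_")


# One row from the ARM-ready dataset with its label appended.
labelled = with_call_label(
    ['bar', 'restaurant', 'food', 'point_of_interest', 'establishment'],
    "Restaurants",
)

# Two labelled rules and two unlabelled rules, all taken from the rule tables
# on this page; the unlabelled ones are included to show what gets dropped.
rules = [
    {"antecedents": frozenset({"call_bars"}), "consequents": frozenset({"bar"}), "lift": 8.695552},
    {"antecedents": frozenset({"call_restaurants"}), "consequents": frozenset({"bar"}), "lift": 1.998176},
    {"antecedents": frozenset({"restaurant"}), "consequents": frozenset({"food"}), "lift": 2.691751},
    {"antecedents": frozenset({"establishment", "restaurant"}), "consequents": frozenset({"food"}), "lift": 2.691751},
]

call_rules = [r for r in rules if has_single_call_antecedent(r)]
```

Restricting to a single call-category antecedent makes each remaining rule read directly as "when this category was requested, this type was returned".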
antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
---|---|---|---|---|---|---|---|---|---|
frozenset({'call_grocery'}) | frozenset({'atm'}) | 0.150631 | 0.012435 | 0.010980 | 0.072892 | 5.861883 | 0.009107 | 1.065211 | 0.976497 |
frozenset({'call_restaurants'}) | frozenset({'bakery'}) | 0.198033 | 0.020945 | 0.012788 | 0.064574 | 3.082947 | 0.008640 | 1.046640 | 0.842473 |
frozenset({'call_bars'}) | frozenset({'bar'}) | 0.067863 | 0.115001 | 0.067863 | 1.000000 | 8.695552 | 0.060059 | inf | 0.949430 |
frozenset({'call_restaurants'}) | frozenset({'bar'}) | 0.198033 | 0.115001 | 0.045507 | 0.229793 | 1.998176 | 0.022733 | 1.149040 | 0.622898 |
frozenset({'call_spas'}) | frozenset({'beauty_salon'}) | 0.075756 | 0.029897 | 0.028839 | 0.380675 | 12.732968 | 0.026574 | 1.566388 | 0.996992 |
frozenset({'call_restaurants'}) | frozenset({'cafe'}) | 0.198033 | 0.032058 | 0.023768 | 0.120018 | 3.743829 | 0.017419 | 1.099957 | 0.913871 |
frozenset({'call_bars'}) | frozenset({'establishment'}) | 0.067863 | 1.000000 | 0.067863 | 1.000000 | 1.000000 | 0.000000 | inf | 0.000000 |
frozenset({'call_bars'}) | frozenset({'food'}) | 0.067863 | 0.371505 | 0.038540 | 0.567901 | 1.528649 | 0.013328 | 1.454516 | 0.371005 |
frozenset({'call_bars'}) | frozenset({'point_of_interest'}) | 0.067863 | 1.000000 | 0.067863 | 1.000000 | 1.000000 | 0.000000 | inf | 0.000000 |
frozenset({'call_bars'}) | frozenset({'restaurant'}) | 0.067863 | 0.241776 | 0.034174 | 0.503574 | 2.082810 | 0.017766 | 1.527364 | 0.557729 |
As seen with the unlabeled networks, Point of Interest and Establishment are central returns for the majority of the categories. However,
this also illustrates that there are significant associations between what was called within the API and the categories that could be expected as returns.
In other words, this shows that the Google Places API performed well at returning appropriate business types for a given call.
Several results were found within the Google data via Association Rule Mining. Most notably:
Association Rule Mining, although more commonly applied in market basket analysis with transaction-specific data, can be quite useful in finding associations and relationships across many applications. One example is categorizing businesses near ski resorts using the Google Places application interface. Different main business categories surround ski resorts, such as Restaurants, Bars, Shopping Centers, and Medical Services, and the individual businesses may carry different subcategories within the Google Places interface. By looking at the associations between main categories and subcategories of businesses surrounding ski resorts, patterns begin to emerge. For instance, businesses categorized as Food and Restaurant establishments are highly prevalent within this area. In general, Stores will be in the area given that other businesses are nearby. Additionally, where there are Food and Store categories, there is a general trend of Convenience Stores and Health centers associated with the ski resort location. In summary, multiple businesses offering general amenities are almost certain to exist in these locations. Where there is one, there are likely many.