This project looks at free Android and iOS apps that are monetized through in-app ads. Revenue for any given app therefore depends on how many people use it. The goal of the project is to analyze the available data to help developers understand which types of apps are likely to attract more users.
After analyzing the data, we outlined a profitable app profile. The main takeaway is that adding interactive elements (such as podcasts), and even gamification, is likely to improve our chances of maximizing profit.
For more details, please refer to the full analysis below.
1. Create the open_dataset function and use it to open the data files as lists of lists.
from csv import reader

def open_dataset(file_name, header=False):
    # Read a CSV file into a list of rows; drop the first row when header=True.
    # The with-statement closes the file once it has been read.
    with open(file_name, encoding='utf8') as opened_file:
        data_set = list(reader(opened_file))
    if header:
        return data_set[1:]
    return data_set

android_apps = open_dataset('googleplaystore.csv')
ios_apps = open_dataset('AppleStore.csv')
2. Create a function for exploring a dataset, then explore both datasets.
Documentation for the Android dataset: Android_apps_doc. Documentation for the iOS dataset: IOS_apps_doc.
def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

explore_data(android_apps, 0, 2, True)
explore_data(ios_apps, 0, 2, True)
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
Number of rows: 10842
Number of columns: 13
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']
Number of rows: 7198
Number of columns: 16
3. Start data cleaning. Delete the row with a missing value, which shifts the remaining values in that row one column to the left. This step is based on the dataset's discussion section, where users report problems with the data. You can take a look at it here: Discussions
print(android_apps[10473])
del android_apps[10473]
print(len(android_apps))
['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
10841
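A shifted row like the one above can also be located without knowing its index in advance, by comparing each row's length with the header's length. The following is a minimal sketch; the function name find_malformed_rows and the sample rows are illustrative assumptions, not part of the original notebook.

```python
def find_malformed_rows(rows):
    """Return (index, length) pairs for rows whose length differs from the header's.

    Assumes rows[0] is the header row (hypothetical helper, for illustration only).
    """
    expected = len(rows[0])
    return [
        (index, len(row))
        for index, row in enumerate(rows[1:], start=1)
        if len(row) != expected
    ]


# Made-up sample: the third row is missing its Category value.
sample = [
    ['App', 'Category', 'Rating'],
    ['Good App', 'TOOLS', '4.1'],
    ['Shifted App', '1.9'],
]
print(find_malformed_rows(sample))  # [(2, 2)]
```

The returned indexes refer to positions in the original list, so they can be passed directly to del.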
4. Let's continue cleaning the data. It is reasonable to check both datasets for duplicate entries. To do this, we loop through each dataset and keep track of every app name we have already seen, which tells us whether each app is unique.
def duplicates_checking(data_set, app_name_id_column=0):
    unique_apps = []
    duplicate_apps = []
    for app in data_set:
        app_name = app[app_name_id_column]
        if app_name in unique_apps:
            duplicate_apps.append(app_name)
        else:
            unique_apps.append(app_name)
    print('Numbers of duplicate apps:', len(duplicate_apps))
    print('\n')
    print('Examples of duplicate apps:', duplicate_apps[:5])

duplicates_checking(android_apps)
print('\n')
duplicates_checking(ios_apps, 1)
Numbers of duplicate apps: 1181
Examples of duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings']
Numbers of duplicate apps: 2
Examples of duplicate apps: ['Mannequin Challenge', 'VR Roller Coaster']
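The same count can be obtained with collections.Counter from the standard library, which avoids searching a growing list for every row. This is only a sketch of an alternative; count_duplicates and the sample data are assumptions made for illustration.

```python
from collections import Counter


def count_duplicates(rows, name_column=0):
    """Return the number of surplus rows and a dict of names seen more than once."""
    counts = Counter(row[name_column] for row in rows)
    duplicates = {name: n for name, n in counts.items() if n > 1}
    # Every occurrence after the first one is a surplus (duplicate) row.
    extra_rows = sum(n - 1 for n in duplicates.values())
    return extra_rows, duplicates


# Made-up sample with one name repeated three times and one repeated twice.
sample = [['Box'], ['Box'], ['Instagram'], ['Box'], ['Zoom'], ['Zoom']]
extra, dupes = count_duplicates(sample)
print(extra)  # 3
print(dupes)  # {'Box': 3, 'Zoom': 2}
```

The first returned value corresponds to the length of duplicate_apps in the function above, since that list also receives one entry per repeated occurrence.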
5. This raises another question: which duplicates should we delete? Rather than removing them at random, let's define a criterion. If we examine the rows for the 'Facebook' app, the main difference is in the fourth position of each row, which holds the number of reviews. The entry with the largest number of reviews was presumably collected most recently, so we treat it as the most accurate data point and keep it.
for app in android_apps:
    if app[0] == 'Facebook':
        print(app)
print('\n')
for app in ios_apps:
    if app[1] == 'Mannequin Challenge':
        print(app)
['Facebook', 'SOCIAL', '4.1', '78158306', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'August 3, 2018', 'Varies with device', 'Varies with device']
['Facebook', 'SOCIAL', '4.1', '78128208', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'August 3, 2018', 'Varies with device', 'Varies with device']
['1173990889', 'Mannequin Challenge', '109705216', 'USD', '0.0', '668', '87', '3.0', '3.0', '1.4', '9+', 'Games', '37', '4', '1', '1']
['1178454060', 'Mannequin Challenge', '59572224', 'USD', '0.0', '105', '58', '4.0', '4.5', '1.0.1', '4+', 'Games', '38', '5', '1', '1']
6. Create a function based on the chosen criterion and build the dictionaries used to remove duplicates.
def reviews_max(data_set, header=True, app_name_column=0, reviews_column=3):
    # Map each app name to the highest value found in reviews_column.
    total_reviews = {}
    rows = data_set[1:] if header else data_set  # skip the header row when present
    for app in rows:
        name = app[app_name_column]
        reviews = float(app[reviews_column])
        if name not in total_reviews or total_reviews[name] < reviews:
            total_reviews[name] = reviews
    return total_reviews
android_reviews = reviews_max(android_apps)
# Note: index 8 in the iOS data is user_rating_ver; rating_count_tot is index 5.
ios_reviews = reviews_max(ios_apps, True, 1, 8)
# Checking for data in criterion dictionaries.
print(android_reviews['Facebook'])
print(len(android_reviews))
print(10841-1181-1) # rows left after step 3 - duplicate_apps - header
print(ios_reviews['Mannequin Challenge'])
print(len(ios_reviews))
print(7198-2-1)
78158306.0
9659
9659
4.5
7195
7195
7. Remove the duplicate entries. We use an extra list of already-added names so that duplicate rows with the same number of reviews are not added to the cleaned dataset twice.
def dataset_cleaning(data_set, criterion_dictionary, app_name_column=0, reviews_column=3, header=True):
    dataset_cleaned = []
    already_added = []
    rows = data_set[1:] if header else data_set  # skip the header row when present
    for app in rows:
        name = app[app_name_column]
        reviews = float(app[reviews_column])
        # Keep only the first row that matches the stored maximum for this name.
        if name not in already_added and reviews == criterion_dictionary[name]:
            dataset_cleaned.append(app)
            already_added.append(name)
    return dataset_cleaned, already_added
android_cleaned, android_added = dataset_cleaning(android_apps, android_reviews)
print('Expected length for android_cleaned is 9659 rows, actual length is:', len(android_cleaned))
print('\n')
ios_cleaned, ios_added = dataset_cleaning(ios_apps, ios_reviews, 1, 8)
print('Expected length for ios_cleaned is 7195 rows, actual length is:', len(ios_cleaned))
Expected length for android_cleaned is 9659 rows, actual length is: 9659
Expected length for ios_cleaned is 7195 rows, actual length is: 7195
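Steps 6 and 7 can also be combined into a single pass that stores the best row per app name directly. The sketch below is an alternative under stated assumptions, not the approach used in this notebook: keep_most_reviewed is a hypothetical name, the input must not contain a header row, and on ties the first row wins.

```python
def keep_most_reviewed(rows, name_column=0, reviews_column=3):
    """Return one row per app name: the row with the highest review count.

    Assumes rows has no header row. On ties the first row seen is kept.
    """
    best = {}
    for row in rows:
        name = row[name_column]
        reviews = float(row[reviews_column])
        if name not in best or reviews > float(best[name][reviews_column]):
            best[name] = row
    return list(best.values())


# Made-up sample rows: name, category, rating, reviews.
sample = [
    ['A', 'x', '1', '10'],
    ['B', 'x', '1', '5'],
    ['A', 'x', '1', '30'],
    ['B', 'x', '1', '5'],
]
print(keep_most_reviewed(sample))
# [['A', 'x', '1', '30'], ['B', 'x', '1', '5']]
```

The result is ordered by the first appearance of each name, which may differ from the row order produced by dataset_cleaning.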
8. Prepare a function to detect non-English apps. English app names can still contain special symbols such as emojis or ™, so a single non-ASCII character is not enough to reject a name. Instead, the function counts such characters and rejects a name only once it has seen three of them, which makes it a reasonably effective filter.
def english_app_check(app_name):
    out_of_range_count = 0
    for character in app_name:
        code_point = ord(character)
        # The count is tested before the current character is counted, so a name
        # is rejected only when at least one character follows its third non-ASCII one.
        if out_of_range_count == 3:
            return False
        elif code_point > 127:
            out_of_range_count += 1
    return True

print(english_app_check('Instagram'))
print(english_app_check('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(english_app_check('Docs To Go™ Free Office Suite'))
print(english_app_check('Instachat 😜'))
True
False
True
True
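The result of english_app_check depends on where the non-ASCII characters sit in the name: three of them at the very end pass, while three at the start followed by more text do not. A position-independent variant is sketched below. The name is_probably_english and the max_non_ascii threshold are assumptions for illustration; using this variant would give slightly different counts from the ones reported in this notebook.

```python
def is_probably_english(name, max_non_ascii=3):
    """Return True when the name has at most max_non_ascii non-ASCII characters.

    Hypothetical variant: the threshold of 3 is an assumption, not a value
    taken from the notebook's own function.
    """
    non_ascii = sum(1 for character in name if ord(character) > 127)
    return non_ascii <= max_non_ascii


print(is_probably_english('Instagram'))                      # True
print(is_probably_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))  # False
print(is_probably_english('Docs To Go™ Free Office Suite'))  # True
print(is_probably_english('Instachat 😜'))                   # True
```

Counting over the whole name first makes the verdict independent of character order, at the cost of always scanning every character.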
9. In this step, let's exclude non-English apps from both datasets using the function created above. Note that neither dataset has a header row after cleaning, so the new function has no header option.
def only_english_apps(data_set, app_name_column=0):
    english_apps_only = []
    excluded_apps = []
    for app in data_set:
        english_app = english_app_check(app[app_name_column])
        if english_app:
            english_apps_only.append(app)
        else:
            excluded_apps.append(app[app_name_column])
    return english_apps_only, excluded_apps

android_eng, android_non_eng = only_english_apps(android_cleaned)
print('Amount of Android English apps is: ', len(android_eng))
print('Some of non-English apps: ', android_non_eng[:5])
print('\n')
# Note: the default column 0 is the id column in the iOS data; track_name is column 1.
ios_eng, ios_non_eng = only_english_apps(ios_cleaned)
print('Amount of IOS English apps is: ', len(ios_eng))
print('Some of non-English apps: ', ios_non_eng[:5])
Amount of Android English apps is: 9600
Some of non-English apps: ['Truyện Vui Tý Quậy', 'Flame - درب عقلك يوميا', 'At home - rental · real estate · room finding application such as apartment · apartment', '乐屋网: Buying a house, selling a house, renting a house', 'သိင်္ Astrology - Min Thein Kha BayDin']
Amount of IOS English apps is: 7195
Some of non-English apps: []
10. Since our goal is to explore the revenue potential of free apps, the last cleaning step is to isolate them in a separate list.
def isolate_free_apps(data_set, app_price_column=7):
    free_apps = []
    for app in data_set:
        # Free apps are stored as '0' in the Android data and '0.0' in the iOS data.
        if app[app_price_column] == '0.0' or app[app_price_column] == '0':
            free_apps.append(app)
    return free_apps

android_free = isolate_free_apps(android_eng)
print('Amount of isolated non-free Android apps: ', len(android_eng) - len(android_free))
print('\n')
ios_free = isolate_free_apps(ios_eng, 4)
print('Amount of isolated non-free IOS apps: ', len(ios_eng) - len(ios_free))
Amount of isolated non-free Android apps: 749
Amount of isolated non-free IOS apps: 3141
11. Both datasets have now been cleaned: we removed the corrupted row, removed duplicate entries, excluded non-English apps, and isolated the free apps. Next we need to choose an analysis approach that fits the main goal of the project. Our validation strategy has three steps and ultimately targets both stores, so we need to find app profiles that are successful in both markets. Let's begin the analysis by determining the most common genres in each market.
def freq_table(data_set, column_number):
    dictionary = {}
    data_set_length = len(data_set)
    for app in data_set:
        item = app[column_number]
        if item in dictionary:
            dictionary[item] += 1
        else:
            dictionary[item] = 1
    # Convert the raw counts to percentages.
    for key in dictionary:
        dictionary[key] = round(dictionary[key] / data_set_length * 100, 2)
    return dictionary

def display_table(dataset, index):
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
    table_sorted = sorted(table_display, reverse=True)
    for entry in table_sorted:
        print(entry[1], ':', entry[0])

display_table(android_free, 1)
print('\n')
display_table(android_free, 9)
print('\n')
display_table(ios_free, 11)
FAMILY : 18.94
GAME : 9.73
TOOLS : 8.45
BUSINESS : 4.6
PRODUCTIVITY : 3.9
LIFESTYLE : 3.89
FINANCE : 3.71
MEDICAL : 3.54
SPORTS : 3.39
PERSONALIZATION : 3.32
COMMUNICATION : 3.23
HEALTH_AND_FITNESS : 3.08
PHOTOGRAPHY : 2.95
NEWS_AND_MAGAZINES : 2.8
SOCIAL : 2.67
TRAVEL_AND_LOCAL : 2.34
SHOPPING : 2.25
BOOKS_AND_REFERENCE : 2.14
DATING : 1.86
VIDEO_PLAYERS : 1.8
MAPS_AND_NAVIGATION : 1.39
FOOD_AND_DRINK : 1.24
EDUCATION : 1.16
ENTERTAINMENT : 0.96
LIBRARIES_AND_DEMO : 0.94
AUTO_AND_VEHICLES : 0.93
HOUSE_AND_HOME : 0.8
WEATHER : 0.79
EVENTS : 0.71
PARENTING : 0.66
ART_AND_DESIGN : 0.64
COMICS : 0.61
BEAUTY : 0.6

Tools : 8.44
Entertainment : 6.08
Education : 5.36
Business : 4.6
Productivity : 3.9
Lifestyle : 3.88
Finance : 3.71
Medical : 3.54
Sports : 3.46
Personalization : 3.32
Communication : 3.23
Action : 3.11
Health & Fitness : 3.08
Photography : 2.95
News & Magazines : 2.8
Social : 2.67
Travel & Local : 2.33
Shopping : 2.25
Books & Reference : 2.14
Simulation : 2.04
Dating : 1.86
Arcade : 1.84
Video Players & Editors : 1.77
Casual : 1.76
Maps & Navigation : 1.39
Food & Drink : 1.24
Puzzle : 1.13
Racing : 0.99
Role Playing : 0.94
Libraries & Demo : 0.94
Auto & Vehicles : 0.93
Strategy : 0.92
House & Home : 0.8
Weather : 0.79
Events : 0.71
Adventure : 0.68
Comics : 0.6
Beauty : 0.6
Art & Design : 0.6
Parenting : 0.5
Card : 0.45
Casino : 0.43
Trivia : 0.42
Educational;Education : 0.4
Board : 0.38
Educational : 0.37
Education;Education : 0.34
Word : 0.26
Casual;Pretend Play : 0.24
Music : 0.2
Racing;Action & Adventure : 0.17
Puzzle;Brain Games : 0.17
Entertainment;Music & Video : 0.17
Casual;Brain Games : 0.14
Casual;Action & Adventure : 0.14
Arcade;Action & Adventure : 0.12
Action;Action & Adventure : 0.1
Educational;Pretend Play : 0.09
Simulation;Action & Adventure : 0.08
Parenting;Education : 0.08
Entertainment;Brain Games : 0.08
Board;Brain Games : 0.08
Parenting;Music & Video : 0.07
Educational;Brain Games : 0.07
Casual;Creativity : 0.07
Art & Design;Creativity : 0.07
Education;Pretend Play : 0.06
Role Playing;Pretend Play : 0.05
Education;Creativity : 0.05
Role Playing;Action & Adventure : 0.03
Puzzle;Action & Adventure : 0.03
Entertainment;Creativity : 0.03
Entertainment;Action & Adventure : 0.03
Educational;Creativity : 0.03
Educational;Action & Adventure : 0.03
Education;Music & Video : 0.03
Education;Brain Games : 0.03
Education;Action & Adventure : 0.03
Adventure;Action & Adventure : 0.03
Video Players & Editors;Music & Video : 0.02
Sports;Action & Adventure : 0.02
Simulation;Pretend Play : 0.02
Puzzle;Creativity : 0.02
Music;Music & Video : 0.02
Entertainment;Pretend Play : 0.02
Casual;Education : 0.02
Board;Action & Adventure : 0.02
Video Players & Editors;Creativity : 0.01
Trivia;Education : 0.01
Travel & Local;Action & Adventure : 0.01
Tools;Education : 0.01
Strategy;Education : 0.01
Strategy;Creativity : 0.01
Strategy;Action & Adventure : 0.01
Simulation;Education : 0.01
Role Playing;Brain Games : 0.01
Racing;Pretend Play : 0.01
Puzzle;Education : 0.01
Parenting;Brain Games : 0.01
Music & Audio;Music & Video : 0.01
Lifestyle;Pretend Play : 0.01
Lifestyle;Education : 0.01
Health & Fitness;Education : 0.01
Health & Fitness;Action & Adventure : 0.01
Entertainment;Education : 0.01
Communication;Creativity : 0.01
Comics;Creativity : 0.01
Casual;Music & Video : 0.01
Card;Action & Adventure : 0.01
Books & Reference;Education : 0.01
Art & Design;Pretend Play : 0.01
Art & Design;Action & Adventure : 0.01
Arcade;Pretend Play : 0.01
Adventure;Education : 0.01

Games : 55.62
Entertainment : 8.24
Photo & Video : 4.12
Social Networking : 3.53
Education : 3.26
Shopping : 2.98
Utilities : 2.69
Lifestyle : 2.32
Finance : 2.07
Sports : 1.95
Health & Fitness : 1.87
Music : 1.65
Book : 1.63
Productivity : 1.53
News : 1.43
Travel : 1.38
Food & Drink : 1.06
Weather : 0.76
Reference : 0.49
Navigation : 0.49
Business : 0.49
Catalogs : 0.22
Medical : 0.2
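Frequency tables like the ones above can also be built with collections.Counter, whose most_common method already returns entries sorted by count. The sketch below is one possible shortcut; percentage_table and the sample rows are assumptions for illustration.

```python
from collections import Counter


def percentage_table(rows, column):
    """Return (value, percentage) pairs for one column, most frequent first."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return [(value, round(n / total * 100, 2)) for value, n in counts.most_common()]


# Made-up sample: app name and genre.
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]
for genre, share in percentage_table(sample, 1):
    print(genre, ':', share)
# Games : 75.0
# Music : 25.0
```

Unlike display_table, which breaks ties by sorting on the key in reverse order, most_common keeps tied entries in the order they were first seen.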
12. A quick look at the results of the previous step supports a few conclusions. First, among free apps the Android store looks fairly balanced across categories, while the iOS store is dominated by apps made for fun (Games alone account for 55.62%). Second, these frequencies alone do not give us enough confidence to define the app profile that brings the most revenue, because a genre with many apps does not necessarily have many users. So let's check the average number of user ratings per genre within each dataset. We are going to use a nested loop.
ios_genres = freq_table(ios_free, 11)
for genre in ios_genres:
    total = 0
    len_genre = 0
    for app in ios_free:
        name = app[11]
        if name == genre:
            rating = float(app[5])  # column 5 is rating_count_tot
            total += rating
            len_genre += 1
    print(genre, "with an average rating:", round(total / len_genre, 1), "and number of apps:", len_genre)
print('\n')
# Average number of user ratings across all free iOS apps.
total = 0
for app in ios_free:
    rating = float(app[5])
    total += rating
average_rating = total / len(ios_free)
print(round(average_rating, 2))
Social Networking with an average rating: 53078.2 and number of apps: 143
Photo & Video with an average rating: 27249.9 and number of apps: 167
Games with an average rating: 18941.1 and number of apps: 2255
Music with an average rating: 56482.0 and number of apps: 67
Reference with an average rating: 67447.9 and number of apps: 20
Health & Fitness with an average rating: 19952.3 and number of apps: 76
Weather with an average rating: 47220.9 and number of apps: 31
Utilities with an average rating: 14010.1 and number of apps: 109
Travel with an average rating: 20216.0 and number of apps: 56
Shopping with an average rating: 18746.7 and number of apps: 121
News with an average rating: 15892.7 and number of apps: 58
Navigation with an average rating: 25972.0 and number of apps: 20
Lifestyle with an average rating: 8978.3 and number of apps: 94
Entertainment with an average rating: 10823.0 and number of apps: 334
Food & Drink with an average rating: 20179.1 and number of apps: 43
Sports with an average rating: 20129.0 and number of apps: 79
Book with an average rating: 8498.3 and number of apps: 66
Finance with an average rating: 13522.3 and number of apps: 84
Education with an average rating: 6266.3 and number of apps: 132
Productivity with an average rating: 19053.9 and number of apps: 62
Business with an average rating: 6367.8 and number of apps: 20
Catalogs with an average rating: 1779.6 and number of apps: 9
Medical with an average rating: 459.8 and number of apps: 8
19759.36
According to the output, the average number of user ratings per free iOS app is about 19,759. Genres that exceed this average look promising for an app profile recommendation. Reference, Music, and Social Networking have the highest averages, so they are the most attractive for development. To be more specific, we need to see which apps make up these genres; that will help us understand what the recommended app profile should look like.
12. Let's explore the apps in each suitable genre and recommend a possible app profile.
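The per-genre averages can also be computed in a single pass over the data instead of a nested loop, by accumulating a running total and a count for each genre. This is a minimal sketch; average_by_group and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def average_by_group(rows, group_column, value_column):
    """Return {group: average of value_column} using one pass over rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_column]
        totals[group] += float(row[value_column])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Made-up sample: genre and number of ratings.
sample = [['Music', '100'], ['Music', '300'], ['Games', '50']]
print(average_by_group(sample, 0, 1))  # {'Music': 200.0, 'Games': 50.0}
```

The nested loop visits every app once per genre, while this version visits each app exactly once, which matters more as the number of genres grows.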
def top5_genre(data_set, genre_name, app_column=1, genre_column=11, rating_column=5, android_check=False):
    genre_apps = {}
    for app in data_set:
        name = app[app_column]
        genre = app[genre_column]
        value = app[rating_column]
        # Android install counts look like '10,000+'; strip the extra characters
        # here without modifying the original row.
        if android_check:
            value = value.replace(',', '').replace('+', '')
        rating = float(value)
        if genre == genre_name:
            genre_apps[name] = rating
    sorted_genre_apps = []
    for key in genre_apps:
        key_val_as_tuple = (genre_apps[key], key)
        sorted_genre_apps.append(key_val_as_tuple)
    sorted_genre_apps = sorted(sorted_genre_apps, reverse=True)
    print('Top-5 in', genre_name, 'genre:')
    for entry in sorted_genre_apps[:5]:
        print('App', entry[1], 'with rating', entry[0])
    print('\n')

top5_genre(ios_free, 'Reference')
top5_genre(ios_free, 'Social Networking')
top5_genre(ios_free, 'Music')
Top-5 in Reference genre:
App Bible with rating 985920.0
App Dictionary.com Dictionary & Thesaurus with rating 200047.0
App Dictionary.com Dictionary & Thesaurus for iPad with rating 54175.0
App Google Translate with rating 26786.0
App Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran with rating 18418.0

Top-5 in Social Networking genre:
App Facebook with rating 2974676.0
App Pinterest with rating 1061624.0
App Skype for iPhone with rating 373519.0
App Messenger with rating 351466.0
App Tumblr with rating 334293.0

Top-5 in Music genre:
App Pandora - Music & Radio with rating 1126879.0
App Spotify Music with rating 878563.0
App Shazam - Discover music, artists, videos & lyrics with rating 402925.0
App iHeartRadio – Free Music & Radio Stations with rating 293228.0
App SoundCloud - Music & Audio with rating 135744.0
The top-rated genres reveal some interesting patterns. Among the top five Reference apps, two have a religious context (Bible and Muslim Pro). In the Music genre, the leading apps let users choose music according to their interests, and the best of them allow users to build flexible playlists. Finally, the Social Networking list consists of very popular social networks, including two of the biggest messengers (Skype and Messenger) and Pinterest, the biggest picture-sharing service.
*So the most profitable app profile could be an app that combines the features explored above from the leading genres. Such an app might be based on a historical book or an international magazine that gathers a large community of readers, and it could include a chat feature. It might also offer its own ready-made music playlists created specifically for that community.*
13. Let's explore the same question for Android apps. Some preparation is needed before we can run the same calculations as for iOS. Here we work with the Installs column, whose values are strings that cannot be used in calculations directly.
print(android_free[0][5])
10,000+
To fix this, we remove the characters that prevent conversion to a number (the commas and the plus sign).
android_genres = freq_table(android_free, 1)
for category in android_genres:
    total = 0
    len_category = 0
    for app in android_free:
        category_app = app[1]
        if category_app == category:
            len_category += 1
            installs = app[5]
            installs = installs.replace(',', '')
            installs = installs.replace('+', '')
            total += float(installs)
    print(category, "with an average installs:", round(total / len_category, 1), "and total number of apps:", len_category)
print('\n')
# Average number of installs across all free Android apps.
total = 0
for app in android_free:
    installs = app[5]
    installs = installs.replace(',', '')
    installs = installs.replace('+', '')
    installs = float(installs)
    total += installs
average_installs = total / len(android_free)
print(round(average_installs, 2))
ART_AND_DESIGN with an average installs: 1986335.1 and total number of apps: 57
AUTO_AND_VEHICLES with an average installs: 647317.8 and total number of apps: 82
BEAUTY with an average installs: 513151.9 and total number of apps: 53
BOOKS_AND_REFERENCE with an average installs: 8814199.8 and total number of apps: 189
BUSINESS with an average installs: 1712290.1 and total number of apps: 407
COMICS with an average installs: 832613.9 and total number of apps: 54
COMMUNICATION with an average installs: 38590581.1 and total number of apps: 286
DATING with an average installs: 854028.8 and total number of apps: 165
EDUCATION with an average installs: 1833495.1 and total number of apps: 103
ENTERTAINMENT with an average installs: 11640705.9 and total number of apps: 85
EVENTS with an average installs: 253542.2 and total number of apps: 63
FINANCE with an average installs: 1387692.5 and total number of apps: 328
FOOD_AND_DRINK with an average installs: 1924897.7 and total number of apps: 110
HEALTH_AND_FITNESS with an average installs: 4188822.0 and total number of apps: 273
HOUSE_AND_HOME with an average installs: 1360598.0 and total number of apps: 71
LIBRARIES_AND_DEMO with an average installs: 638503.7 and total number of apps: 83
LIFESTYLE with an average installs: 1446158.2 and total number of apps: 344
GAME with an average installs: 15606004.0 and total number of apps: 861
FAMILY with an average installs: 3695641.8 and total number of apps: 1676
MEDICAL with an average installs: 120550.6 and total number of apps: 313
SOCIAL with an average installs: 23253652.1 and total number of apps: 236
SHOPPING with an average installs: 7036877.3 and total number of apps: 199
PHOTOGRAPHY with an average installs: 17840110.4 and total number of apps: 261
SPORTS with an average installs: 3650602.3 and total number of apps: 300
TRAVEL_AND_LOCAL with an average installs: 13984077.7 and total number of apps: 207
TOOLS with an average installs: 10830252.0 and total number of apps: 748
PERSONALIZATION with an average installs: 5201482.6 and total number of apps: 294
PRODUCTIVITY with an average installs: 16787331.3 and total number of apps: 345
PARENTING with an average installs: 542603.6 and total number of apps: 58
WEATHER with an average installs: 5145550.3 and total number of apps: 70
VIDEO_PLAYERS with an average installs: 24727872.5 and total number of apps: 159
NEWS_AND_MAGAZINES with an average installs: 9549178.5 and total number of apps: 248
MAPS_AND_NAVIGATION with an average installs: 4049274.6 and total number of apps: 123
8501318.48
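The install strings are cleaned in several places, so the conversion could live in one small helper. The sketch below assumes every value follows the pattern seen in the data ('10,000+', '1,000,000,000+'); parse_installs is a hypothetical name, not a function from this notebook.

```python
def parse_installs(text):
    """Convert an install bucket such as '10,000+' to the integer 10000."""
    return int(text.replace(',', '').replace('+', ''))


print(parse_installs('10,000+'))         # 10000
print(parse_installs('1,000,000,000+'))  # 1000000000
```

Note that these values are lower bounds of install buckets rather than exact counts, so the averages above are approximations, and several apps can tie at the same value.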
As shown above, a few categories in the Android marketplace are extremely popular. The Communication category averages more than 38 million installs per app, followed by Video_Players with almost 25 million and Social with just over 23 million. Let's look at them more closely.
top5_genre(android_free, 'COMMUNICATION', 0, 1, 5, True)
top5_genre(android_free, 'VIDEO_PLAYERS', 0, 1, 5, True)
top5_genre(android_free, 'SOCIAL', 0, 1, 5, True)
Top-5 in COMMUNICATION genre:
App WhatsApp Messenger with rating 1000000000.0
App Skype - free IM & video calls with rating 1000000000.0
App Messenger – Text and Video Chat for Free with rating 1000000000.0
App Hangouts with rating 1000000000.0
App Google Chrome: Fast & Secure with rating 1000000000.0

Top-5 in VIDEO_PLAYERS genre:
App YouTube with rating 1000000000.0
App Google Play Movies & TV with rating 1000000000.0
App MX Player with rating 500000000.0
App VivaVideo - Video Editor & Photo Movie with rating 100000000.0
App VideoShow-Video Editor, Video Maker, Beauty Camera with rating 100000000.0

Top-5 in SOCIAL genre:
App Instagram with rating 1000000000.0
App Google+ with rating 1000000000.0
App Facebook with rating 1000000000.0
App Snapchat with rating 500000000.0
App Facebook Lite with rating 500000000.0
The output shows a relationship between the popular apps in both datasets, which supports our main idea of a composite application: one that combines media and text content with chat and posting features for individual users. We can also note that Video_Players apps attract a large share of installs on the Android market. Users want interactive content, and this observation can be included in the final app profile.
Through all these steps, we arrived at a concept for our profitable app profile. The main takeaway is that adding interactive elements (such as podcasts), and even gamification, is likely to improve our chances of maximizing profit. The app should also offer substantial communication features, such as a built-in chat or forum. This gives us good flexibility and should make the app more competitive in the store. Thank you for your time; I hope this was useful. Have a nice day!