Pata Mchumba Dating-App-Recommender

Group Members

  1. Iain Mosima
  2. Benson Muriu
  3. Elsie Kiprop
  4. Fred Mutuma
  5. Peter Kigotho
  6. Oscar Karuga

1.1 Business Understanding

With the current generation embracing technology and its applications, many people have become accustomed to using dating apps. Pata Mchumba, a dating company, has therefore approached us to create a recommendation system that increases the effectiveness of matches based on users' preferences. Moreover, our recommender will focus mainly on emotional connection rather than physical appearance.

1.2 Objectives

Main Objective:

  • To build a dating app recommender system that successfully maximises the matches.
  • Specific Objectives

    1.3 Data Understanding

    Our data was sourced from https://www.tandfonline.com/doi/abs/10.1080/10691898.2015.11889737.

    We considered the ethical concerns: no rights were infringed in the collection of this data, as OKCupid provided it for use by the public.

    The data contained 59964 rows and 31 columns.

    Column names and description

    1. age : How old the person is
    2. status : The person's relationship status (e.g. single, divorced, etc.)
    3. sex : The person's biological sex (e.g. male, female, etc.)
    4. orientation : The person's sexual orientation (e.g. straight, gay, etc.)
    5. body_type : The person's body type (e.g. slim, average, etc.)
    6. diet : The person's diet (e.g. vegetarian, non-vegetarian, etc.)
    7. drinks : The person's drinking habits (e.g. social drinker, heavy drinker, etc.)
    8. drugs : The person's drug habits (e.g. never, occasionally, etc.)
    9. education : The person's educational attainment (e.g. high school, college, etc.)
    10. ethnicity : The person's ethnic background (e.g. Hispanic, Asian, etc.)
    11. height : The person's height in inches
    12. income : The person's annual income
    13. job : The person's current job (e.g. doctor, lawyer, etc.)
    14. last_online : The date the person was last active on the website
    15. location : The person's current city
    16. offspring : The person's desire to have children
    17. pets : The person's pet preferences (e.g. dog, cat, etc.)
    18. religion : The person's religious beliefs
    19. sign : The person's astrological sign
    20. smokes : The person's smoking habits (e.g. nonsmoker, occasional smoker, etc.)

    essay columns

    1. essay 0: My self summary
    2. essay 1: What I’m doing with my life
    3. essay 2: I’m really good at...
    4. essay 3: The first thing people usually notice about me...
    5. essay 4: Favourite books, movies, shows, music, and food
    6. essay 5: The six things I could never do without
    7. essay 6: I spend a lot of time thinking about...
    8. essay 7: On a typical Friday night I am...
    9. essay 8: The most private thing I am willing to admit...
    10. essay 9: You should message me if...

    1.4 Problem Questions

    1. What is the gender distribution of the users?
    2. What is the age distribution of the users?
    3. What is the orientation distribution of the users?
    4. What is the frequency distribution of the consumption of drugs, alcohol and smoking?

    1.5 Data Preparation

    Data cleaning

    The dataset had 273202 missing values.

    The dataset had no duplicated values, hence it is consistent.
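The two checks above can be sketched as follows. This is a minimal, standard-library illustration on hypothetical toy records, not the project's actual cleaning code; the column subset, the sample values, and the helper names `count_missing` and `count_duplicates` are assumptions made for the example. With pandas, comparable checks are usually written with `DataFrame.isna` and `DataFrame.duplicated`.

```python
from collections import Counter

# Hypothetical toy records standing in for profile rows; None marks a missing value.
profiles = [
    {"age": 27, "sex": "m", "diet": None, "smokes": "no"},
    {"age": 31, "sex": "f", "diet": "vegetarian", "smokes": None},
    {"age": 27, "sex": "m", "diet": None, "smokes": "no"},  # exact repeat of the first row
]


def count_missing(rows):
    """Return the number of missing (None or empty-string) cells per column."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value == "":
                missing[column] += 1
    return missing


def count_duplicates(rows):
    """Return how many rows repeat an earlier row exactly."""
    seen = set()
    duplicates = 0
    for row in rows:
        # Sort by column name so the key does not depend on dict ordering.
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


missing_per_column = count_missing(profiles)
total_missing = sum(missing_per_column.values())
duplicate_rows = count_duplicates(profiles)
print(dict(missing_per_column), total_missing, duplicate_rows)
```

Counting missing cells per column, rather than only in total, shows which fields (for example the essays or `income`) would need imputation or an "unspecified" category before modelling.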

    1.6 Modelling

    The final model will be a hybrid one. It will include the above model and a model that matches users based on their essays. The essay-based model is created by checking for sentence similarity between the essays provided by each user.
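The essay-matching idea can be illustrated with a small sketch. The README does not say which sentence-similarity method the project uses, so this example swaps in a plain TF-IDF bag-of-words representation with cosine similarity, written with the standard library only. The function names, the user labels, and the sample essays are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency keeps every weight positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(essays, user):
    """Return the other users sorted by essay similarity to `user`, best first."""
    names = list(essays)
    vectors = dict(zip(names, tfidf_vectors([essays[n] for n in names])))
    scores = [
        (other, cosine_similarity(vectors[user], vectors[other]))
        for other in names
        if other != user
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical essays, one combined text per user.
essays = {
    "user_a": "I love hiking, reading novels and cooking for friends.",
    "user_b": "Weekends are for hiking and cooking; I also love reading.",
    "user_c": "I spend most nights gaming and watching football.",
}
ranking = rank_matches(essays, "user_a")
print(ranking)
```

Word-overlap measures like this one miss paraphrases ("trekking" versus "hiking"), so a sentence-embedding model may well be a better fit for the emotional-connection goal; the ranking logic would stay the same with a different vectorizer.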

    1.7 Evaluation

    1.8 Conclusions

    EDA

    1. Q. What is the gender distribution of the users?
    2. A. From the above plot, the majority of the users are male at 60.1%, with female users making up the remaining 39.9%.

    3. Q. What is the age distribution of the users?
    4. A. The majority of the users are between 22 and 35. However, there seem to be odd ages on the dating site, such as 109.

      A. The majority of the users of this app are male, and most of them are aged 23 to 28. That same age bracket holds the highest number of female users.

    5. Q. What is the orientation distribution of the users?
    6. A. The majority of the users are straight, and most of the straight users are male. The same holds for the gay orientation. For the bisexual orientation, the majority are female.

    7. Q. What is the frequency distribution of the consumption of drugs, alcohol and smoking?
    8. A.
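The distribution questions above can be answered with a few lines of counting. The sketch below is an illustration on made-up values, not the project's EDA code: the helper names are hypothetical, and the plausible-age bounds of 18 and 100 are an assumption chosen only to show how an entry such as 109 could be flagged.

```python
from collections import Counter


def percentage_distribution(values):
    """Return each distinct value's share of the non-missing values, in percent."""
    present = [v for v in values if v is not None]
    if not present:
        return {}
    counts = Counter(present)
    return {value: round(100 * count / len(present), 1) for value, count in counts.items()}


def flag_implausible_ages(ages, minimum=18, maximum=100):
    """Return the ages that fall outside the assumed range [minimum, maximum]."""
    return [age for age in ages if age is not None and not minimum <= age <= maximum]


# Made-up sample values; None marks a missing entry.
sex = ["m", "m", "f", "m", "f", None]
ages = [24, 31, 109, 27, 22, 35]

sex_share = percentage_distribution(sex)
odd_ages = flag_implausible_ages(ages)
print(sex_share, odd_ages)
```

The same `percentage_distribution` helper applies unchanged to the `orientation`, `drinks`, `drugs`, and `smokes` columns.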

    Conclusions

    1. The majority of the users on dating sites are male.
    2. Most of them are straight.
    3. It is evident that most of the people who resort to using dating apps are young, mostly from 23 to 35 years of age.

    1.9 Recommendations

    1. Because words are a way to one's heart, matching on essays makes this system less superficial.
    2. Offer more personalized matches that replace the traditional swipe or like to match.
    3. Deal with unspecified descriptions in profiles to maximize matches.