Social Media Recommender Engine

Objective

The Social Media Recommender Engine is a Proof of Concept (POC) that presents each user with their best matches on a social platform. Matching draws on several factors, including demographic data, professional background, and personal aspirations, with the aim of connecting users who share similar interests and life goals.

Methodology

The dataset contains user-provided text in multiple languages, including personal and professional bios, job titles, and life ambitions. After data cleaning and feature engineering, the text is converted into vector embeddings with LaBSE (Language-agnostic BERT Sentence Embedding), a multilingual sentence encoder available on TensorFlow Hub. Because LaBSE maps sentences from different languages into a shared vector space, profiles can be compared regardless of the language they were written in.
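The embedding step can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the field names (`bio`, `job_title`, `ambitions`), the whitespace-only cleaning, and the choice to embed each field separately are assumptions. The `encode` argument stands in for a call to a loaded LaBSE model, so the sketch runs without TensorFlow.

```python
import math
import re
from typing import Callable, Dict, List, Sequence

# Hypothetical field names; the source only says profiles include bios,
# job titles, and life ambitions.
TEXT_FIELDS = ("bio", "job_title", "ambitions")


def clean_text(text: str) -> str:
    """Collapse runs of whitespace and trim (a stand-in for the real cleaning step)."""
    return re.sub(r"\s+", " ", text or "").strip()


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned as zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def embed_profile(
    profile: Dict[str, str],
    encode: Callable[[str], Sequence[float]],
) -> Dict[str, List[float]]:
    """Return one unit-length embedding per text field.

    `encode` is any sentence encoder mapping a string to a vector; in the
    project this role is played by LaBSE (assumed wiring, not shown here).
    """
    return {
        field: l2_normalize(encode(clean_text(profile.get(field, ""))))
        for field in TEXT_FIELDS
    }
```

Normalizing each vector to unit length means a later cosine similarity reduces to a plain dot product, which is convenient when comparing many profiles.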

Innovation

The system uses cosine similarity to measure how closely two users align, and combines the similarities across their shared attributes into a final similarity score. A post-processing step then promotes diversity in the recommendations by avoiding matches from the same company.
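The scoring and post-processing described above might look like the sketch below. Several details are assumptions, since the source does not specify them: the final score is taken to be a weighted average of per-field cosine similarities, the weights are hypothetical, and "avoiding matches from the same company" is read as excluding candidates who share the target user's company. The dictionary layout of a user (`id`, `company`, `embeddings`) is likewise illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def final_score(user_a: dict, user_b: dict, weights: Dict[str, float]) -> float:
    """Weighted average of per-field cosine similarities (assumed combination rule)."""
    total_weight = sum(weights.values())
    weighted = sum(
        w * cosine_similarity(user_a["embeddings"][field], user_b["embeddings"][field])
        for field, w in weights.items()
    )
    return weighted / total_weight


def recommend(
    target: dict,
    candidates: List[dict],
    weights: Dict[str, float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank candidates by final score, skipping the target and same-company users."""
    scored = []
    for candidate in candidates:
        if candidate["id"] == target["id"]:
            continue
        # Diversity post-processing: drop candidates from the target's company.
        if candidate["company"] == target["company"]:
            continue
        scored.append((candidate["id"], final_score(target, candidate, weights)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Applying the company filter before ranking keeps `top_k` slots from being spent on excluded users; filtering after ranking would also work but could return fewer than `top_k` results.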

Evaluation and Impact

Evaluation was qualitative: human reviewers assessed the recommendations and endorsed the methodology. The approach outperformed clustering methods such as k-means, hierarchical clustering, and DBSCAN, even when those methods were combined with dimensionality reduction techniques such as PCA and t-SNE. Stakeholders responded positively to the results.

Future Direction

Following the success of the POC, the project is set to expand. The next phase will collect data at a larger scale, with the goal of covering the entire user base of tens of thousands of users, widening the network's reach and improving the quality of the connections it can foster.