Mining and Analyzing LinkedIn Data

Apply Data Science and Artificial Intelligence techniques to extract and analyze your LinkedIn network

LinkedIn is a social network focused on professional experience, built to create connections and relationships between professionals from different areas. Professionals can list their professional skills and search for jobs by connecting with people around the world. For example, if you would like to work in Data Science, you can connect with companies and people who work in this field, increasing your chances of getting a job. Companies, in turn, can search for candidates according to the résumés and skills provided by users. In 2017, LinkedIn established itself as the largest business platform and an important strategic tool for both professionals and companies.

What you’ll learn

  • Extract data from your LinkedIn profile using the LinkedIn API and .csv files.
  • Extract and analyze the connections between users, invitations, and text messages.
  • Generate fake usernames to mask real information.
  • Explore and view data related to your contacts’ companies and job titles.
  • Use Levenshtein (edit) distance, n-gram similarity and Jaccard distance to measure similarity between strings.
  • Cluster contacts based on similarity between positions, as well as generate HTML views to improve data presentation.
  • Use location APIs to extract the latitude and longitude of contacts, in order to capture the city and country where they live.
  • View the location of contacts dynamically with Google Earth and the Basemap library.
  • Cluster contacts using the k-means algorithm.
  • Apply natural language processing techniques to analyze your LinkedIn text messages.
  • Generate a word cloud to view the most frequent terms.
  • Extract named entities from text messages.
  • Create a sentiment classifier to extract the polarity of the LinkedIn text messages.
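
To give a feel for the string-similarity measures listed above, here is a minimal pure-Python sketch. It is not the course's own code: the function names are hypothetical, and "n-gram similarity" is assumed here to mean Jaccard similarity over character bigrams, which is only one of several common definitions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def ngrams(text: str, n: int = 2) -> set:
    """Set of character n-grams in a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_distance(a: set, b: set) -> float:
    """1 - |A & B| / |A | B|; 0.0 means the sets are identical."""
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Assumed definition: Jaccard similarity over character n-grams."""
    return 1 - jaccard_distance(ngrams(a, n), ngrams(b, n))


# Two made-up job titles, compared three ways.
title_a = "data scientist"
title_b = "data science"
print(levenshtein(title_a, title_b))
print(jaccard_distance(set(title_a.split()), set(title_b.split())))
print(ngram_similarity(title_a, title_b))
```

Levenshtein distance works on characters, while the Jaccard call above works on whole words, so the two can disagree about which titles are "close"; that difference matters when grouping similar positions.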

Course Content

  • Introduction –> 3 lectures • 8min.
  • LinkedIn datasets –> 10 lectures • 1hr 22min.
  • Connections between users and invitations –> 31 lectures • 3hr 46min.
  • Messages between users –> 8 lectures • 1hr 2min.
  • Final remarks –> 1 lecture • 1min.

Requirements

  • Programming logic.
  • Basic Python programming.
  • No LinkedIn knowledge is necessary.

It is important that professionals know how to use LinkedIn data in their favor. LinkedIn provides some datasets related to your profile, to which you can apply Data Science and analysis techniques to extract interesting insights about your network of connections. You can answer questions such as: What are the main positions of the people connected to you? Which companies are sending invitations to your profile? Where are your contacts located? Is your LinkedIn network made up of people and companies related to your job? Are the companies you want to work for sending invitations to your profile? These and other questions are answered during the course, so you can assess whether your network is in line with what you want professionally. Below are the main topics that will be implemented step by step:

  • Extract data from your LinkedIn profile using the LinkedIn API and .csv files. If you do not have a LinkedIn account, you can follow the course using data from my profile
  • Extract and analyze connections between users, invitations and text messages
  • Generate fake data to mask real information
  • Explore and visualize data related to your contacts’ companies and job titles
  • Use Levenshtein distance, n-gram similarity and Jaccard distance to measure similarity between strings
  • Cluster contacts based on similarity between positions, as well as generate HTML views to improve data presentation
  • Use location APIs to extract the latitude and longitude of contacts to capture the city and country where they live
  • View the location of contacts dynamically with Google Earth and the Basemap library
  • Cluster contacts using the k-means algorithm
  • Apply natural language processing techniques to analyze your LinkedIn text messages
  • Generate a word cloud to view the most frequent terms
  • Extract named entities from your text messages
  • Create a sentiment classifier to extract the polarity from LinkedIn messages
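
As an illustration of the k-means topic above, the following is a minimal sketch of Lloyd's algorithm applied to contact locations. It is an assumption-laden toy rather than the course's implementation: the `kmeans` function is hypothetical, the coordinates are approximate city positions chosen for illustration, centroids are seeded with the first `k` points to keep the run deterministic, and plain Euclidean distance on latitude and longitude is only a rough stand-in for real geographic distance.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    # Assumption: seed with the first k points so the sketch is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


# Illustrative (latitude, longitude) pairs, roughly:
# Sao Paulo, London, Rio de Janeiro, Paris.
points = [(-23.55, -46.63), (51.51, -0.13), (-22.91, -43.17), (48.86, 2.35)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

In practice a library implementation such as scikit-learn's `KMeans` would usually be preferred, since it handles initialization and repeated restarts more carefully than this sketch does.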

During the course, we will use the Python programming language and Google Colab, so you do not need to spend time installing anything on your own machine. You will be able to follow the course with just a browser and an Internet connection! This is the best course if this is your first contact with social media data analysis!
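
As a small taste of the kind of analysis that runs in a notebook, here is a sketch that counts the most common job titles and companies in a connections file using only the Python standard library. The column names (`Company`, `Position`) and every row are made up for illustration; the real LinkedIn export may use different headers, and `top_values` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a connections .csv export; all rows are fictional.
sample_csv = """\
First Name,Last Name,Company,Position
Ana,Silva,Acme,Data Scientist
Bruno,Costa,Globex,Software Engineer
Carla,Souza,Acme,Data Scientist
Diego,Lima,Initech,Product Manager
"""


def top_values(csv_text, column, n=3):
    """Return the n most frequent non-empty values found in one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column].strip() for row in reader if row.get(column))
    return counts.most_common(n)


print(top_values(sample_csv, "Position"))
print(top_values(sample_csv, "Company", n=1))
```

With a real export, the same function could read the file's text instead of `sample_csv`, once the actual column names have been checked.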