Python for Biostatistics: Analyzing Infectious Diseases Data

Forecast infectious disease rates, build epidemiological models, and map the spread of infectious disease with heatmaps

Welcome to the Python for Biostatistics: Analyzing Infectious Diseases Data course. This is a comprehensive, project-based course in which you will learn, step by step, how to perform complex analysis and visualization on infectious disease datasets. The course combines biostatistics with Python, equipping you with the tools and techniques to tackle real-world challenges in public health. It concentrates on three major aspects: data analysis, where you will explore infectious disease data from multiple perspectives; time series forecasting, where you will be guided step by step through forecasting the spread of infectious diseases using an STL model; and public health policy, where you will learn how to make data-driven public health policy based on epidemiological modeling.

In the introduction session, you will learn the fundamentals of biostatistics, including the challenges commonly faced when analyzing biostatistics data and the statistical models used in the course, for instance STL, which stands for Seasonal and Trend decomposition using Loess. You will then learn how to calculate infectious disease transmission using the Kermack-McKendrick equations (the SIR model), an important concept to understand before the coding sessions. Afterward, you will learn several factors that can accelerate the spread of infectious diseases, such as population density, healthcare accessibility, and antigenic variation.

Once you have covered the necessary background in biostatistics, we will start the project. First, you will be guided step by step through setting up the Google Colab IDE, and you will learn how to find and download an infectious disease dataset from Kaggle. Once everything is ready, we will enter the main section of the course, the project. It consists of three parts: the first is to conduct exploratory data analysis, the second is to build a forecasting model that predicts the future spread of the disease using a time series model, and the third is to perform epidemiological modelling and use the results to develop a public health policy to slow the spread of the infectious disease.
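
To give a feel for the Kermack-McKendrick (SIR) model mentioned above, here is a minimal sketch in plain Python. It is not the course's own notebook code: the function name `simulate_sir`, the forward-Euler integration scheme, and the parameter values (`BETA`, `GAMMA`, a population of 10,000) are all assumptions chosen only for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, steps_per_day=10):
    """Integrate the SIR equations with a simple forward-Euler scheme.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # infected
    r = 0.0                                   # recovered
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


# Hypothetical parameters, chosen only for illustration.
BETA = 0.3   # transmission rate per day
GAMMA = 0.1  # recovery rate per day

history = simulate_sir(population=10_000, initial_infected=10,
                       beta=BETA, gamma=GAMMA, days=160)

# In this model the basic reproduction number is beta / gamma.
basic_reproduction_number = BETA / GAMMA

# Find the day on which the infected compartment is largest.
peak_day, _, peak_infected, _ = max(history, key=lambda row: row[2])

print(f"R0 = {basic_reproduction_number:.1f}")
print(f"Peak of about {peak_infected:.0f} infected on day {peak_day}")
```

Lowering `BETA` (for example, to mimic a policy that reduces contacts) and re-running the simulation is one simple way to compare the effect of an intervention on the size and timing of the peak.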

What you’ll learn

  • Learn the fundamentals of biostatistics and infectious disease analysis.
  • Learn how to find the correlation between population and disease rate.
  • Learn how to analyze infected patient demographics.
  • Learn how to map infectious disease per county using a heatmap.
  • Learn how to analyze yearly trends in infectious disease.
  • Learn how to perform confidence interval analysis.
  • Learn how to forecast infectious disease rates using time series decomposition.
  • Learn how to do epidemiological modeling using the SIR model.
  • Learn how to perform public health policy evaluation.
  • Learn how to calculate the infectious disease transmission rate using the SIR model.
  • Learn several factors that affect how fast infectious disease spreads, such as population density, herd immunity, and antigenic variation.
  • Learn how to detect potential outliers using the Z-score method.
  • Learn how to clean a dataset by removing rows with missing values and duplicate rows.
  • Learn how to find and download datasets from Kaggle.
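
As a rough preview of the cleaning and outlier-detection steps listed above, the sketch below uses only the Python standard library. The records, county labels, and the helper names `clean_records` and `zscore_outliers` are hypothetical, and the course itself may well use other tooling for these steps.

```python
from statistics import mean, pstdev


def clean_records(records):
    """Drop rows containing missing values (None), then drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        if any(value is None for value in row):
            continue  # row has a missing value
        if row in seen:
            continue  # exact duplicate of an earlier row
        seen.add(row)
        cleaned.append(row)
    return cleaned


def zscore_outliers(values, threshold=3.0):
    """Return the values whose absolute Z score exceeds `threshold`."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical (county, case count) rows, for illustration only.
records = [
    ("A", 12), ("B", 14), ("C", 13), ("D", 15), ("E", 11),
    ("F", 13), ("G", 14), ("H", 12), ("I", 13), ("J", 95),
    ("B", 14),    # duplicate row
    ("K", None),  # missing case count
]

cleaned = clean_records(records)
case_counts = [cases for _, cases in cleaned]

# A threshold of 3 is a common default; the sample here is tiny,
# so a slightly lower threshold is used.
outliers = zscore_outliers(case_counts, threshold=2.5)
print(len(cleaned), outliers)
```

A flagged value is only a potential outlier: it may be a data-entry error or a genuine local outbreak, so it deserves a closer look before being dropped.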

Course Content

  • Introduction –> 3 lectures • 15min.
  • Tools, IDE, and Datasets –> 1 lecture • 11min.
  • Introduction to Biostatistics –> 1 lecture • 8min.
  • Calculating Infectious Disease Transmission with SIR Model –> 1 lecture • 10min.
  • Factors That Accelerate the Spread of Infectious Disease –> 1 lecture • 5min.
  • Setting Up Google Colab IDE –> 1 lecture • 5min.
  • Finding & Downloading Infectious Disease Dataset From Kaggle –> 1 lecture • 5min.
  • Project Preparation –> 2 lectures • 10min.
  • Cleaning Infectious Disease Dataset by Removing Missing Values & Duplicates –> 1 lecture • 7min.
  • Detecting Potential Outliers with Z Score –> 1 lecture • 9min.
  • Finding Correlation Between Population & Disease Rate –> 1 lecture • 8min.
  • Analyzing Infected Patients Demographics –> 1 lecture • 14min.
  • Mapping Infectious Disease per County with Heatmap –> 1 lecture • 14min.
  • Analyzing Infectious Disease Yearly Trend –> 1 lecture • 15min.
  • Performing Confidence Interval Analysis –> 1 lecture • 5min.
  • Forecasting Infectious Disease Rate with Time Series –> 1 lecture • 13min.
  • Epidemiological Modelling with SIR Model –> 1 lecture • 14min.
  • Public Health Policy Evaluation –> 1 lecture • 13min.
  • Conclusion & Summary –> 1 lecture • 5min.

Requirements

Before getting into the course, it is worth asking: why learn biostatistics, and infectious disease analysis in particular? There are several reasons. First, if you are interested in working in public health or the healthcare industry, biostatistics knowledge is valuable and can help you advance your career. In addition, you will learn skills that transfer to other projects; for example, time series decomposition can also be applied to stock, real estate, commodity, and cryptocurrency market data. Last but not least, this course will train you to be a better public health policy maker, as you will learn extensively how to make data-driven decisions and take external factors into consideration.

Below is what you can expect to learn from this course:

  • Learn the fundamentals of biostatistics and infectious disease analysis
  • Learn how to calculate the infectious disease transmission rate using the SIR model
  • Learn several factors that affect how fast infectious disease spreads, such as population density, herd immunity, and antigenic variation
  • Learn how to find and download datasets from Kaggle
  • Learn how to clean a dataset by removing rows with missing values and duplicate rows
  • Learn how to detect potential outliers using the Z-score method
  • Learn how to find the correlation between population and disease rate
  • Learn how to analyze infected patient demographics
  • Learn how to map infectious disease per county using a heatmap
  • Learn how to analyze yearly trends in infectious disease
  • Learn how to perform confidence interval analysis
  • Learn how to forecast infectious disease rates using a time series decomposition model
  • Learn how to do epidemiological modeling using the SIR model
  • Learn how to perform public health policy evaluation
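
To illustrate what confidence interval analysis can look like, here is a small sketch that computes a normal-approximation interval for a mean disease rate. The `yearly_rates` values and the function name `mean_confidence_interval` are hypothetical; for a sample this small, a t-distribution interval would usually be the more careful choice, and the course may take a different approach.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    standard_error = stdev(values) / sqrt(n)  # sample standard deviation
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error


# Hypothetical yearly disease rates per 100,000 people, for illustration only.
yearly_rates = [42.1, 39.8, 45.3, 41.0, 43.7, 40.2, 44.5, 38.9]

low, high = mean_confidence_interval(yearly_rates)
print(f"95% CI for the mean rate: ({low:.2f}, {high:.2f})")
```

A narrower interval means the mean rate is estimated more precisely; raising `confidence` to 0.99 widens the interval.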