STATISTICS WITH PYTHON

Description

In this course, students will be introduced to the field of statistics, including where data come from,
study design, data management, and exploring and visualizing data. Learners will identify different
types of data, and learn how to visualize, analyze, and interpret summaries for both univariate and
multivariate data. They will also be introduced to the differences between probability
and non-probability sampling from larger populations, the idea of how sample estimates vary, and
how inferences can be made about larger populations based on probability sampling. Moreover,
students will explore basic principles behind using data for estimation and for assessing theories.
They will analyze both categorical data and quantitative data, starting with one-population
techniques and expanding to comparisons of two populations. Students will learn how to
construct confidence intervals and use sample data to assess whether a theory about the
value of a parameter is consistent with the data. A major focus will be on interpreting inferential
results appropriately. Additionally, students will apply what they’ve learned using Python within
the course environment, working through case studies that reinforce each week’s statistical
concepts and explore Python libraries including statsmodels, pandas, and seaborn in more
depth.

Curriculum Objectives

This course is divided into three parts:
– Individual and group online learning and practice, drawn from online courses, with in-class
support from the lecturer to expand students’ knowledge and experience.
– In-class lectures, taught in a student-centered style.
– Research Team Project
▪ Students will be split into teams of 3-4. Each team will have a
unique research project assignment. Each team will use the project sessions to work as a
group, with the teacher providing support, mentoring, and suggestions
when necessary.
▪ The projects will allow students to practise the fundamentals and process of
scientific research.

Finally, each group will give a presentation and/or demonstration to the class, followed by a
discussion of the results.

Course Outline

LO1: Develop an outlook for the course and summarize future concepts and objectives.
LO2: Explore various uses of statistics and examine where data come from.
LO3: Properly identify various data types and understand the different uses for each.
LO4: Understand the basic functions of Python to import, clean, and manage data.
LO5: Understand the various graphical displays used for univariate categorical and
quantitative data.
LO6: Interpret histograms and boxplots to describe quantitative data.
LO7: Obtain key interpretations used for describing quantitative data.
LO8: Create histograms, box plots, and numerical summaries through Python.
LO9: Create graphs and summary statistics of multivariate data, both categorical and
quantitative.
LO10: Summarize important information obtained through visualizations of multivariate
data.
LO11: Communicate statistical ideas clearly and concisely to a broad audience.
LO12: Integrate statistical reasoning into decisions and situations in your daily life.
LO13: Distinguish between probability and non-probability sampling.
LO14: Describe the concept of a sampling distribution, and how one can make inferences
about a population parameter based on the estimated features of that distribution.
LO15: Identify appropriate analytic techniques for probability and non-probability
samples.
LO16: Explain how poorly-designed samples can lead to descriptions of population
features that are biased in nature.
LO17: Develop an outlook for the course and summarize future concepts and objectives.
LO18: Explain the framework for making decisions using data along with the potential
consequences of those decisions.
LO19: Identify the basic concepts central to Bayesian and frequentist statistics, which will
be used throughout this course.
LO20: Write basic Python functions and interpret documentation.
LO21: Define a confidence interval.
LO22: Determine the assumptions needed to calculate a confidence interval for each
population parameter.
LO23: Calculate confidence intervals by hand for one population proportion, difference in
two population proportions, one population mean, one population mean difference
for paired data, and difference in population means for independent groups.
LO24: Demonstrate your understanding of confidence intervals by communicating
statistical ideas clearly and concisely for a potential client.
LO25: Create confidence intervals in Python.
LO26: Differentiate between various scenarios and determine the appropriate analysis
method.
LO27: Apply techniques of hypothesis testing and interpret the results.
LO28: Run hypothesis tests in Python and interpret the output.
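
As a rough illustration of the kind of work behind LO23, LO25, and LO28, the sketch below computes a normal-approximation confidence interval and a two-sided z-test for one population proportion, using only the Python standard library. The function names `one_proportion_ci` and `one_proportion_ztest` and the survey counts are hypothetical, chosen for this example. The course itself may instead use statsmodels or other methods.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for one population proportion."""
    p_hat = successes / n
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def one_proportion_ztest(successes, n, p_null):
    """Two-sided one-sample z-test of the null hypothesis p = p_null."""
    p_hat = successes / n
    # The standard error is computed under the null hypothesis.
    se = sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up data for illustration: 540 "yes" answers out of 1,000 respondents.
low, high = one_proportion_ci(540, 1000)
z_stat, p_value = one_proportion_ztest(540, 1000, p_null=0.5)
print(f"95% CI: ({low:.3f}, {high:.3f})")
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

Both calculations rely on the sample being large enough for the normal approximation to hold, which is the kind of assumption LO22 asks students to check.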

Department
Data Science Academy
Campus
Sunrise Institute
Level
Intermediate
Method
Online, Physical
Start Date
2023/01/23, 2023/01/28
Duration
45hrs (Weekend 2pm - 5pm and Mon/Fri 6pm - 8pm)