Cosima Meyer

bit.ly/wids2023-slides

Who am I?



👩🏼‍🎓 PhD in Political Science @ University of Mannheim

👩🏼‍💻 Data Science @ IBM

🦸🏼‍♀️ Co-organizer of the joint PyLadies 💜 R-Ladies event series

What we have ahead of us today

Quarto logo.


A data science workflow

can readily accommodate more than one language

Data access

A snake with glasses sitting on a book.


Python

# Load library
import pandas as pd

# Load data
url = "https://bit.ly/url-data"

health_df = pd.read_csv(url)

R

# Load data
url <- "https://bit.ly/url-data"

health_data <- read.csv(url)
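As a minimal sketch of the same loading step without network access, the snippet below reads a small inline CSV through `io.StringIO`. The column names and values are hypothetical and are not the workshop's health data; `pd.read_csv` accepts a URL, a local path, or a file-like object.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV behind the workshop URL
csv_text = """country,year,life_expectancy
A,2000,70.1
A,2010,72.4
B,2000,65.3
"""

# read_csv accepts a URL, a local path, or a file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))
print(sample_df)
```

Swapping `io.StringIO(csv_text)` for a URL string gives the call used on the slide.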

👩🏼‍💻

Data Wrangling

Quarto logo.

Data wrangling: Common first steps

Python

# Print first lines
health_df.head()

# Extract the size
health_df.shape

# Extract info
health_df.info()

R

# Print first lines
head(health_data)

# Extract the size
dim(health_data)

# Extract info
str(health_data)
summary(health_data)
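The first steps above can be tried on a toy table. This is a sketch on made-up data (the columns are assumptions, not the workshop dataset). It also adds `describe()`, which is roughly the pandas counterpart of R's `summary()` for numeric columns.

```python
import pandas as pd

# Hypothetical toy data, not the workshop dataset
toy_df = pd.DataFrame(
    {
        "country": ["A", "A", "B"],
        "year": [2000, 2010, 2000],
        "life_expectancy": [70.1, 72.4, 65.3],
    }
)

# First rows, like head() in R
first_rows = toy_df.head(2)

# (rows, columns), like dim() in R; note that shape is an attribute, not a method
n_rows, n_cols = toy_df.shape

# Column types and non-null counts, printed to stdout, similar to str() in R
toy_df.info()

# Numeric summary statistics, roughly comparable to summary() in R
stats = toy_df.describe()
print(stats)
```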

Data wrangling: The logic of pandas

Quarto logo.
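The slide's figure is not reproduced in this text. As a hedged sketch, assuming it refers to the usual pandas model (a DataFrame is a set of labeled columns, each a Series, sharing one row index), the toy example below shows column, row, and condition-based selection. All names and values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data with explicit row labels (the index)
toy_df = pd.DataFrame(
    {"year": [2000, 2010, 2000], "life_expectancy": [70.1, 72.4, 65.3]},
    index=["A_2000", "A_2010", "B_2000"],
)

# Selecting one column returns a Series that keeps the row index
col = toy_df["life_expectancy"]

# Selecting one row by label returns a Series indexed by column names
row = toy_df.loc["A_2010"]

# A boolean mask keeps only the rows where the condition holds
recent = toy_df[toy_df["year"] >= 2010]
print(recent)
```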

👩🏼‍💻

Visualization

Quarto logo.
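The slides do not show which plotting library was used in the session. As one possible sketch, assuming matplotlib and a made-up table, a simple line plot from a pandas DataFrame could look like this:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, not the workshop dataset
toy_df = pd.DataFrame(
    {"year": [2000, 2005, 2010], "life_expectancy": [70.1, 71.2, 72.4]}
)

# One line with point markers, plus labeled axes
fig, ax = plt.subplots()
ax.plot(toy_df["year"], toy_df["life_expectancy"], marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Life expectancy (years)")
ax.set_title("Toy example")
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure.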

👩🏼‍💻

What else?



If you are curious to learn more about how well R and Python work together, have a look at our R-Ladies 💜 PyLadies event series:

✨ Kickoff (including input talks on working in bilingual teams and using MLflow for MLOps)

👩‍🎨 Visualization with Plotnine

🤖 Automating Workflows Using GitHub Actions and Quarto

🧠 AutoML with H2O

… to be continued!

cosimameyer.com

cosimameyer

cosimameyer

@cosima_meyer@mas.to