# Posts by Category

## Instrumental variable analysis with a binary outcome

Here is an additional post on instrumental variable (IV) analysis. This follows an exercise where I employed two methods of IV analysis, comparing a Bayesian...

## Instrumental variable analysis with Bayesian modeling and statsmodels

Instrumental variable (IV) analysis is one method for causal inference. This approach relies on using an instrumental variable $Z$ to find the true relations...

## Time series with varying intercepts

I’ve modeled time-series data with time-to-event models and would like to explore modeling with mixed effects models. I’ll take an iterative approach, in the s...

## Follow-up after getting causal estimates

Notes for Chapter 5 of Causal Inference with Survey Data on LinkedIn Learning, given by Franz Buscha. I’m using this series of posts to take some notes.

## Longitudinal Survey Designs

Notes for Chapter 4 of Causal Inference with Survey Data on LinkedIn Learning, given by Franz Buscha. I’m using this series of posts to take some notes.

## Cross-Sectional Survey Designs

Notes for Chapter 3 of Causal Inference with Survey Data on LinkedIn Learning, given by Franz Buscha. I’m using this series of posts to take some notes.

## Experimental Survey Designs

Notes for Chapter 2 of Causal Inference with Survey Data on LinkedIn Learning, given by Franz Buscha.

## Cause and effect

I’m basically a fan-boy of Richard McElreath’s Statistical Rethinking. That’s no secret. But I thought it would be prudent to learn more about causal inferen...

## Generating a predictive distribution for the number of people attending your party

A few years ago, not long after I started writing on this blog, I wrote a piece called The probability of making your Friday night party. Well, the opportuni...

## When the Spider-Man meme is relevant to multilevel models

For a while, I’ve wondered about the different approaches to multilevel modeling, also known as mixed effects modeling. My initial understanding is with a Ba...

## LKJCorr and LKJCov in pymc

While continuing to deep dive on covariance priors following my prior post, I investigated implementations in pymc. I played around with the LKJcorr and LKJc...

## Weird ways that covariance matrices are made

Covariance priors for multivariate normal models are an important tool for the implementation of varying effects. By representing more than one parameter wit...

## Escaping the Devil’s Funnel

Multi-level models are great for improving our estimates. However, the intuitive way these kinds of models are specified (which goes by the unhelpful name “c...

## Correlated data, different DAGs

One of the lessons from Statistical Rethinking that really hit home for me was the importance of considering the data generation process. Different datasets ...

## Running models forwards and backwards

The value of simulations is highlighted by Dr. McElreath throughout his textbook and by van de Schoot and colleagues. I didn’t entirely appreciate its value u...

## Exploring modeling failure

In my last post, I gave an example of a multilevel model using a binomial generalized linear model (GLM). The varying intercept model helped illustrate pa...

## Multilevel modeling with binomial GLM

I’ve been on a journey learning multilevel models and Bayesian inference through Richard McElreath’s Statistical Rethinking book. The concepts of shrinkage a...

## Working with PyTorch’s Dataset and Dataloader classes (part 1)

Recently, I built a simple NLP algorithm for a work project, following the template described in this tutorial. As I looked to increase my model’s complexity...

## PyMC linear regression part 4: predicting actual height

At last, we have come to the end. This is the final post in a series of linear regression posts using PyMC3, from my reading of Statistical Rethinking. Part ...

## PyMC linear regression part 3: predicting average height

This is the next post in a series of linear regression posts using PyMC3. This series has been inspired by my reading of Statistical Rethinking. Part 1 was d...

## PyMC linear regression part 2: understanding the posterior distribution

In a previous post, I wrote about my initial experience using PyMC3. The point was to take a deep dive into some of the package’s objects using a linear regre...

## PyMC linear regression part 1: PyMC objects

I previously wrote about my discovery of Statistical Rethinking. The book’s title could not be more spot-on–it’s helped me look at statistics in a different ...

## Bayes-ball part 3: the credible interval and doing the math

In the last post, we learned about the beta distribution and why it would be a more realistic prior in the context of our problem. We also selected appropria...

## Bayes-ball part 2: a more realistic prior

I meant to post this some time ago, but I have been busy. But with the baseball example I am using, it is only fitting that I post this now, just after this ...

## Bayes-ball part 1: determining a true talent level

In my previous post, we saw how Bayes’ theorem was applied to a relatively simple problem with Bertrand’s box paradox. Here I’ll talk about another applicati...

## Approaching Bertrand’s box paradox, including with Bayes’ theorem

Bayes’ theorem is one of the most useful applications in statistics. But it is not always easy to recognize when and how to apply it. I was doing s...

## Histograms and recursion in SQL

A few weeks ago, while making a histogram in a SQL query, I discovered that some solutions out there do not include bins with 0 counts. This bugged me so I f...

## F-in statistics!

I recently read this passage in the section on multiple linear regression from the fantastic book Introduction to Statistical Learning:

## Using CASE in the WHERE statement of SQL

Problem statement

## PostgreSQL and Jupyter notebooks

PostgreSQL is one of the most popular variants of SQL. It is common to use PostgreSQL with pgadmin but I am not a big fan of their UI. By contrast, interacti...

## Iterators in Python

One of the things about Python that I haven’t fully appreciated is the use of iterators. I’ll go over some iterators that are a part of base Python and then...

Seems like every statistics class starts off with a coin toss. It’s simple enough for me. Some fancy teachers might start right off the bat and get into the ...

## The probability of making your Friday night party

My wife and I have enjoyed living in the Bay Area where we’ve been able to satisfy our love of outdoor activities while being near a cool city. While we’re f...

## A neuro-educational approach to taking Andrew Ng’s Machine Learning Course

I recently finished Andrew Ng’s fantastic and well-known Machine Learning course through Coursera. As I progress into my data science journey, I felt that ta...

## A ggplot-inspired scatterplot function for Python

I coded for a couple of years in R but switched over to Python almost a year ago. I have to say that I miss R’s ggplot2. Like a lot.

## Improving my writing

I have identified writing well as a skill I will prioritize. This improved skill will benefit both smaller forms I have taken for granted (like...

## Should we mandate instruction practices that are known to improve student learning?

This is a summary of a discussion I led through a discussion group. This group is called the “STEM Education & Diversity Discussion Group” and it is orga...

## Vectorization with np.dot and broadcasting

Vectorization and broadcasting are tricks I have used sparingly and absent-mindedly if at all. However, they are critical skills for algorithmic code to run ef...
