From b7e1d6466ea9b125c2bbb3215c186562fd30f18c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois=20Laurent?= <francois.laurent@posteo.net> Date: Sun, 26 Sep 2021 19:52:25 +0200 Subject: [PATCH] small additions in conf. int. and lin. regr. --- notebooks/scipy_TP.ipynb | 666 +++++++++++++++++++++++++++++ notebooks/scipy_TP_solutions.ipynb | 416 +++++++++++++----- notebooks/scipy_cours.ipynb | 282 ++++++++++-- 3 files changed, 1217 insertions(+), 147 deletions(-) create mode 100644 notebooks/scipy_TP.ipynb diff --git a/notebooks/scipy_TP.ipynb b/notebooks/scipy_TP.ipynb new file mode 100644 index 0000000..c5ddcb4 --- /dev/null +++ b/notebooks/scipy_TP.ipynb @@ -0,0 +1,666 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a5a5210d", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Import `numpy`, `pandas`, the `pyplot` module from `matplotlib`, `seaborn`, and the `stats` module from `scipy`." + ] + }, + { + "cell_type": "markdown", + "id": "5ac6cc32", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "529c5f56", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "93ad4aaf", + "metadata": {}, + "source": [ + "# Comparison of two group means" + ] + }, + { + "cell_type": "markdown", + "id": "0e4fd0d9", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Load the `mi.csv` data file located in the `data` directory of the course repository." + ] + }, + { + "cell_type": "markdown", + "id": "08c1dd12", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "00130518", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "9cc036b2", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Anything missing?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "99d5dc74", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a648a9b", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "3512f950", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Show a summary table for these data." + ] + }, + { + "cell_type": "markdown", + "id": "6984434b", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a7a7d087", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "04163591", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Inspect the distribution of variables `Age` and `OwnsHouse`." + ] + }, + { + "cell_type": "markdown", + "id": "d6baac23", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5de6412d", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "1e94c17b", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Isolate the house-owners group from the others, draw their respective age distributions and report their mean ages as $99\\%$ confidence intervals." + ] + }, + { + "cell_type": "markdown", + "id": "497142f3", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "55d18f16", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "ea79970d", + "metadata": {}, + "source": [ + "## Q\n", + "\n", + "Check the age is normally distributed in any one group, first following a graphical approach." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3c23a350", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "15e4d4c9", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## A (with nested Q&A)" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "id": "ddf5d4b0", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 432x288 with 1 Axes>" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "(theoretical_quantiles, observed_quantiles), (slope, intercept, _) = stats.probplot(house_owners_age, fit=True)\n", + "plt.scatter(theoretical_quantiles, observed_quantiles, marker='+', color='b')\n", + "plt.axline((0, intercept), slope=slope, color='r')\n", + "plt.xlabel('theoretical quantiles')\n", + "plt.ylabel('ordered observations (age)');" + ] + }, + { + "cell_type": "markdown", + "id": "24b49c4c", + "metadata": { + "hidden": true + }, + "source": [ + "The red line is fitted to the blue points and does not align well on the linear part.\n", + "\n", + "### Q\n", + "\n", + "To better illustrate that the central part is approximately linear, perform a linear regression with the observations whose corresponding theoretical quantiles (abscissa) fall in the $[-1,1]$ interval, and make a probability plot replacing the default regression line by your regression line." 
+ ] + }, + { + "cell_type": "markdown", + "id": "e6f91f58", + "metadata": { + "hidden": true + }, + "source": [ + "### A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0f888c53", + "metadata": { + "hidden": true + }, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "2cc80be1", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Are the sample sizes and variances of the two groups similar enough for running a standard $t$ test?" + ] + }, + { + "cell_type": "markdown", + "id": "cd58c73a", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0dbb79f7", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "d61f454a", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Test whether the group mean ages are equal." + ] + }, + { + "cell_type": "markdown", + "id": "b076e8e6", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1d238900", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "62b30b76", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "How would you report the result of this test?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "efeac3ab", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "341157b6", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "f72698b7", + "metadata": {}, + "source": [ + "## Q\n", + "\n", + "\\[optional; good for playing with Python rather than statistical methods\\]\n", + "\n", + "Although tractable in principle, the group difference in variance is quite large and -- had we smaller samples -- we could instead use Welch's $t$ test, which is known to better control type-1 errors when variances differ, at the cost of slightly lower power.\n", + "\n", + "As it is now clear we have a relationship between age and owning a house, let us compute the rejection rate (or power) as a function of sample size.\n", + "\n", + "Proposal:\n", + "* loop over decreasing sample sizes (*e.g.* 200, 50, 20, 10, 5),\n", + "* randomly pick a subsample of that size from each group,\n", + "* compare their means using the standard Student $t$-test and Welch $t$-test,\n", + "* observe whether each test successfully rejects $H_0$ for a constant significance level (*e.g.* 5%),\n", + "* replicate this procedure many times (*e.g.* 100),\n", + "* and compute the rejection rate for each sample size and type of test." 
+ ] + }, + { + "cell_type": "markdown", + "id": "6b7d5d56", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Help: subsampling" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "ee947953", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEGCAYAAABiq/5QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAQ1klEQVR4nO3de6xlZXnH8e8PxqkK6nCZTMe5OGMkWCoVZERuMRbaBlsr1FIusXZisENStVCtivYPQhuTkhgvMa0yAS1tqYKIAYmBUkRTSzN2uBiEkUoRmOE6XpDWJtKRp3/sNXA4czmbM2ftPWe/309ysvdaa++znjdnn99Z59lrvTtVhSSpHfuMuwBJ0mgZ/JLUGINfkhpj8EtSYwx+SWrMgnEXMIyDDz64Vq1aNe4yJGleufXWW39YVYunr58Xwb9q1So2btw47jIkaV5J8sDO1tvqkaTGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8kuaNZStWkmTkX8tWrBz30OfUvJiyQZIAHt6ymTMuvmXk+73inONGvs8+ecQvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxvQZ/kj9LcleS7yb5QpIXJlmdZEOSe5NckWRhnzVIkp6rt+BPsgz4U2BNVb0G2Bc4E7gI+ERVvQr4CXB2XzVIknbUd6tnAfCiJAuAFwOPACcCV3XbLwNO7bkGSdIUvQV/VT0EfAx4kEHg/xS4FXiiqrZ1D9sCLNvZ85OsS7IxycatW7f2VaYkNafPVs8BwCnAauDlwH7AycM+v6rWV9WaqlqzePHinqqUpPb02er5DeAHVbW1qv4PuBo4HljUtX4AlgMP9ViDJGmaPoP/QeCYJC9OEuAk4G7gZuC07jFrgWt6rEGSNE2fPf4NDN7EvQ24s9vXeuBDwPuS3AscBFzaVw2SpB0tmPkhs1dVFwAXTFt9H3B0n/uVJO2aV+5KUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JoIy1asJMnIv5atWDnuoUvPW6/z8Uuj8vCWzZxx8S0j3+8V5xw38n1Ke8ojfklqjEf8kp6XZStW8vCWzeMuQ3vA4Jf0vIyrrQa21uaKrR5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhrjefyaM17YI80PBr/mjBf2SPODwS/NU/6Hpdky+KV5yhlJNVu+uStJjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmN6Df4ki5JcleR7STYlOTbJgUluTPL97vaAPmuQJD1X30f8nwKur6pXA68FNgHnAzdV1SHATd2yJGlEegv+JC8D3ghcClBVT1XVE8ApwGXdwy4DTu2rBknSjvo84l8NbAU+n+T2JJck2Q9YUlWPdI95FFiysycnWZdkY5KNW7du7bFMSWpLn8G/AHgd8JmqOhL4GdPaOlVVQO3syVW1vqrWVNWaxYsX91imJLWlz+DfAmypqg3d8lUM/hA8lmQpQHf7eI81SJKm6S34q+pRYHOSQ7tVJwF3A9cCa7t1a4Fr+qpBkrSjvq
dlfi9weZKFwH3AOxn8sbkyydnAA8DpPdcgSZqi1+CvqjuANTvZdFKf+5Uk7ZpX7kpSYwx+SWqMwS9JjTH4JakxBr8kNabv0zmlybbPApKMuwrpeTH4pT3x9DbOuPiWsez6inOOG8t+Nf/Z6pGkxnjEL0kzGVNL7+XLV/DQ5gfn/Psa/JI0kzG19Ppq59nqkaTGDBX8SY4fZp0kae837BH/p4dcJ0nay+22x5/kWOA4YHGS903Z9FJg3z4LkyT1Y6Y3dxcC+3ePe8mU9U8Cp/VVlCSpP7sN/qr6JvDNJH9XVQ+MqCZJUo+GPZ3zl5KsB1ZNfU5VndhHUZKk/gwb/F8CPgtcAvyiv3IkSX0bNvi3VdVneq1EkjQSw57O+dUkf5JkaZIDt3/1WpkkqRfDHvGv7W4/MGVdAa+c23IkSX0bKviranXfhUiSRmOo4E/yRztbX1V/P7flSJL6Nmyr5/VT7r8QOAm4DTD4JWmeGbbV896py0kWAV/soyBJUr9mOy3zzwD7/pI0Dw3b4/8qg7N4YDA5268AV/ZVlCSpP8P2+D825f424IGq2tJDPZKkng3V6ukma/segxk6DwCe6rMoSVJ/hv0ErtOBbwN/AJwObEjitMySNA8N2+r5C+D1VfU4QJLFwL8AV/VVmCSpH8Oe1bPP9tDv/Oh5PFeStBcZ9oj/+iQ3AF/ols8AvtZPSZKkPs30mbuvApZU1QeSvA04odv078DlfRcnSZp7Mx3xfxL4MEBVXQ1cDZDk8G7b7/ZYmySpBzP16ZdU1Z3TV3brVvVSkSSpVzMF/6LdbHvRHNYhSRqRmYJ/Y5I/nr4yybuAW4fZQZJ9k9ye5LpueXWSDUnuTXJFkoXPv2xJ0mzN1OM/D/hKkrfzbNCvARYCvzfkPs4FNgEv7ZYvAj5RVV9M8lngbMDP85WkEdntEX9VPVZVxwEXAvd3XxdW1bFV9ehM3zzJcuB3gEu65QAn8uyFX5cBp86ydknSLAw7H//NwM2z+P6fBD7IYI4fgIOAJ6pqW7e8BVi2sycmWQesA1i5cuUsdi1J2pnerr5N8hbg8aoa6r2A6apqfVWtqao1ixcvnuPqJKldw165OxvHA29N8tsMPq7xpcCngEVJFnRH/cuBh3qsQZI0TW9H/FX14apaXlWrgDOBr1fV2xm0jLbP7LkWuKavGiRJOxrHRGsfAt6X5F4GPf9Lx1CDJDWrz1bPM6rqG8A3uvv3AUePYr+SpB05tbIkNcbgl6TGGPyS1JiR9Pg1WstWrOThLZvHXYakvZTBP4Ee3rKZMy6+ZeT7veKc40a+T0nPn8Evaf7YZ8F4DjD2mayonKzRSJpsT2/j8AuuH/lu77zw5JHvs0++uStJjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mN8RO4NHfG9bF43b4lDcffFs2dMX0sHkzeR+NJfTL4NRn8EO7RGed/dpoTDb5qNZH8EO7R8T+7ec83dyWpMQa/JDXG4Jekxhj8ktQYg1+SGuNZPdKe8KI1zUO+cqQ94amNmocMfkmayYRdIGjwS9JMJuwCwd7e3E2yIsnNSe5OcleSc7v1Bya5Mcn3u9sD+qpBkrSjPs/q2Qa8v6oOA44B3p3kMOB84KaqOgS4qVuWJI1Ib8FfVY9U1W3d/f8GNgHLgFOAy7qHXQac2lcNkqQdjaTHn2QVcCSwAVhSVY90mx4FluziOeuAdQArV64cQZUTxNkTJe1G78GfZH/gy8B5VfVkkme2VVUlqZ09r6rWA+sB1qxZs9PHaBcm7I0oSXOr1yt3k7yAQehfXlVXd6sfS7K0274UeLzPGiRJz9XnWT0BLgU2VdXHp2y6Fljb3V8LXNNXDZKkHfXZ6jkeeAdwZ5I7unUfAf4auDLJ2cADwOk91iBJmqa34K+qbwHZxeaT+tqvJGn3nJ1Tkhpj8E
tSYwx+SWqMwS9JjTH4JakxTssszVdOzaFZMvil+cqpOTRLtnokqTEGvyQ1xuCXpMbY4+/JshUreXjL5nGXIUk7MPh78vCWzZxx8S1j2bdnekjaHVs9ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUmIm/ctepEyTpuSY++Mc1dYLTJkjaW9nqkaTGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktSYiZ+ygX0WjGf6hHHtV5JmMPnB//Q2Dr/g+pHv9s4LTx7LfrfvW5J2xVaPJDXG4Jekxowl+JOcnOSeJPcmOX8cNUhSq0Ye/En2Bf4GeDNwGHBWksNGXYcktWocR/xHA/dW1X1V9RTwReCUMdQhSU1KVY12h8lpwMlV9a5u+R3AG6rqPdMetw5Y1y0eCtwzy10eDPxwls+drxxzGxzz5NvT8b6iqhZPX7nXns5ZVeuB9Xv6fZJsrKo1c1DSvOGY2+CYJ19f4x1Hq+chYMWU5eXdOknSCIwj+P8DOCTJ6iQLgTOBa8dQhyQ1aeStnqraluQ9wA3AvsDnququHne5x+2iecgxt8ExT75exjvyN3clSePllbuS1BiDX5IaMzHBn2RFkpuT3J3kriTndusPTHJjku93tweMu9a5kuSFSb6d5DvdmC/s1q9OsqGbEuOK7k30iZJk3yS3J7muW57oMSe5P8mdSe5IsrFbN7GvbYAki5JcleR7STYlOXaSx5zk0O7nu/3rySTn9THmiQl+YBvw/qo6DDgGeHc3FcT5wE1VdQhwU7c8KX4OnFhVrwWOAE5OcgxwEfCJqnoV8BPg7PGV2JtzgU1TllsY869X1RFTzuue5Nc2wKeA66vq1cBrGfy8J3bMVXVP9/M9AjgK+F/gK/Qx5qqayC/gGuA3GVzxu7RbtxS4Z9y19TTeFwO3AW9gcKXfgm79scAN465vjse6vPsFOBG4DkgDY74fOHjauol9bQMvA35AdwJKC2OeNs7fAv6trzFP0hH/M5KsAo4ENgBLquqRbtOjwJJx1dWHruVxB/A4cCPwX8ATVbWte8gWYNmYyuvLJ4EPAk93ywcx+WMu4J+T3NpNZwKT/dpeDWwFPt+19C5Jsh+TPeapzgS+0N2f8zFPXPAn2R/4MnBeVT05dVsN/mRO1PmrVfWLGvxruJzBBHivHm9F/UryFuDxqrp13LWM2AlV9ToGs9q+O8kbp26cwNf2AuB1wGeq6kjgZ0xrcUzgmAHo3p96K/Cl6dvmaswTFfxJXsAg9C+vqqu71Y8lWdptX8rgyHjiVNUTwM0M2hyLkmy/OG/SpsQ4HnhrkvsZzOx6IoNe8CSPmap6qLt9nEHf92gm+7W9BdhSVRu65asY/CGY5DFv92bgtqp6rFue8zFPTPAnCXApsKmqPj5l07XA2u7+Wga9/4mQZHGSRd39FzF4T2MTgz8Ap3UPm6gxV9WHq2p5Va1i8O/w16vq7UzwmJPsl+Ql2+8z6P9+lwl+bVfVo8DmJId2q04C7maCxzzFWTzb5oEexjwxV+4mOQH4V+BOnu39foRBn/9KYCXwAHB6Vf14LEXOsSS/BlzGYOqLfYArq+ovk7ySwdHwgcDtwB9W1c/HV2k/krwJ+POqesskj7kb21e6xQXAP1XVR5McxIS+tgGSHAFcAiwE7gPeSfc6Z3LHvB/wIPDKqvppt27Of84TE/ySpOFMTKtHkjQcg1+SGmPwS1JjDH5JaozBL0mNMfilGSQ5NUklmeirotUOg1+a2VnAt7pbad4z+KXd6OZ+OoHBNM9nduv2SfK33TzxNyb5WpLTum1HJflmN5naDdsvtZf2Jga/tHunMJgT/j+BHyU5CngbsAo4DHgHg/mRts8V9WngtKo6Cvgc8NFxFC3tzoKZHyI17SwGk8DBYEqIsxj83nypqp4GHk1yc7f9UOA1wI2DqaPYF3gEaS9j8Eu7kORABrN/Hp6kGAR58e
y8OTs8Bbirqo4dUYnSrNjqkXbtNOAfquoVVbWqqlYw+FSoHwO/3/X6lwBv6h5/D7A4yTOtnyS/Oo7Cpd0x+KVdO4sdj+6/DPwyg/ni7wb+kcFHXv60qp5i8MfioiTfAe4AjhtZtdKQnJ1TmoUk+1fV/3RT5n4bOL6bQ17a69njl2bnuu5DcBYCf2Xoaz7xiF+SGmOPX5IaY/BLUmMMfklqjMEvSY0x+CWpMf8P/S1i2GC7UfwAAAAASUVORK5CYII=\n", + "text/plain": [ + "<Figure size 432x288 with 1 Axes>" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "# let us consider an example sample\n", + "sample = others_age\n", + "\n", + "# and a subsample size\n", + "n = 200\n", + "\n", + "# we need a random generator\n", + "rng = np.random.default_rng()\n", + "\n", + "# now we can pick n observations from the original sample\n", + "# calling the `choice` method of the random generator\n", + "subsample = rng.choice(sample, n)\n", + "\n", + "# in principle the smaller sample will exhibit similar\n", + "# properties as the original sample; both are drawn from\n", + "# the population in similar ways\n", + "bins = np.arange(20, 70+1, 5)\n", + "sns.histplot(sample, bins=bins)\n", + "sns.histplot(subsample, bins=bins);" + ] + }, + { + "cell_type": "markdown", + "id": "b44a7b2b", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "81e31c08", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "b7d98432", + "metadata": {}, + "source": [ + "# Comparing two distributions" + ] + }, + { + "cell_type": "markdown", + "id": "7f5453a9", + "metadata": {}, + "source": [ + "Now let proceed to comparing age between people living with kids and those living without kids.\n", + "Plot the data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "0aeaeee7", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 432x288 with 1 Axes>" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [] + }, + { + "cell_type": "markdown", + "id": "02cbfb3c", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "How do the common descriptive statistics (mean, variance) compare?" + ] + }, + { + "cell_type": "markdown", + "id": "d2f481d2", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e9c1cc3c", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "34b67889", + "metadata": {}, + "source": [ + "## Q\n", + "\n", + "How can we compare the two groups to state they differ from one another?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dd8085f4", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "89181560", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## A (with nested Q&A)" + ] + }, + { + "cell_type": "markdown", + "id": "ef4093df", + "metadata": { + "hidden": true + }, + "source": [ + "We need a two-sample goodness-of-fit test.\n", + "\n", + "This can be done in two ways:\n", + "\n", + "* with a $\\chi^2$ test of homogeneity, binning the age;\n", + "* with a two-sample Kolmogorov-Smirnov test.\n", + "\n", + "### Q\n", + "\n", + "Bin the two groups, first with 5-year-wide bins, extract frequencies and proceed to performing a $\\chi^2$ test." 
+ ] + }, + { + "cell_type": "markdown", + "id": "4338fe92", + "metadata": { + "hidden": true + }, + "source": [ + "### A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f57a8ff6", + "metadata": { + "hidden": true + }, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "218a59ec", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Are all the assumptions met? Adjust the procedure if necessary. Any interpretation?" + ] + }, + { + "cell_type": "markdown", + "id": "abdce59a", + "metadata": {}, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3e64b5ce", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "id": "1be0bd8d", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "# ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f2a026da", + "metadata": { + "hidden": true + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": false, + "skip_h1_title": false, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/scipy_TP_solutions.ipynb b/notebooks/scipy_TP_solutions.ipynb index 878502a..9d54cb7 100644 --- 
a/notebooks/scipy_TP_solutions.ipynb +++ b/notebooks/scipy_TP_solutions.ipynb @@ -3,7 +3,9 @@ { "cell_type": "markdown", "id": "a5a5210d", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -47,7 +49,9 @@ { "cell_type": "markdown", "id": "0e4fd0d9", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -287,7 +291,9 @@ { "cell_type": "markdown", "id": "9cc036b2", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -708,7 +714,9 @@ { "cell_type": "markdown", "id": "3512f950", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -992,7 +1000,9 @@ { "cell_type": "markdown", "id": "04163591", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1074,7 +1084,17 @@ "source": [ "## Q\n", "\n", - "Isolate the house-owners group from the others, draw their respective age distributions and check they are normally distributed." + "Isolate the house-owners group from the others, draw their respective age distributions and report their mean ages as $99\\%$ confidence intervals." + ] + }, + { + "cell_type": "markdown", + "id": "497142f3", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## A" ] }, { @@ -1118,9 +1138,81 @@ "sns.histplot(hue='OwnsHouse', y='Age', data=df, kde=True);" ] }, + { + "cell_type": "markdown", + "id": "a721a374", + "metadata": { + "hidden": true + }, + "source": [ + "For the confidence intervals, we need to evaluate the inverse survival function of the standard normal distribution at $0.5\\%$." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 29, + "id": "1fca5a60", + "metadata": { + "hidden": true + }, + "outputs": [], + "source": [ + "alpha = 0.01\n", + "z = stats.norm().isf(alpha / 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "id": "9f044f06", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "House owners: 39.82 ± 2.25 years old on average\n", + "Others: 50.12 ± 1.32 years old on average\n" + ] + } + ], + "source": [ + "for group_name, group_age in (\n", + " ('House owners', house_owners_age),\n", + " ('Others', others_age),\n", + "):\n", + " m = np.mean(group_age)\n", + " z_times_sem = z * stats.sem(group_age)\n", + " print(f'{group_name}: {m:.2f} ± {z_times_sem:.2f} years old on average')" + ] + }, + { + "cell_type": "markdown", + "id": "ea79970d", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Check the age is normally distributed in any one group, first following a graphical approach." 
+ ] + }, + { + "cell_type": "markdown", + "id": "15e4d4c9", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## A (with nested Q&A)" + ] + }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 37, "id": "ddf5d4b0", "metadata": { "hidden": true @@ -1128,7 +1220,7 @@ "outputs": [ { "data": { - "image/png": "\n", + "image/png": "\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] @@ -1142,7 +1234,9 @@ "source": [ "(theoretical_quantiles, observed_quantiles), (slope, intercept, _) = stats.probplot(house_owners_age, fit=True)\n", "plt.scatter(theoretical_quantiles, observed_quantiles, marker='+', color='b')\n", - "plt.axline((0, intercept), slope=slope, color='r');" + "plt.axline((0, intercept), slope=slope, color='r')\n", + "plt.xlabel('theoretical quantiles')\n", + "plt.ylabel('ordered observations (age)');" ] }, { @@ -1152,12 +1246,27 @@ "hidden": true }, "source": [ - "The red line is fitted to the blue points and does not align well on the linear part. To better illustrate what is the linear part, we reimplement the regression (the exact implementation is out of the scope of this session):" + "The red line is fitted to the blue points and does not align well on the linear part.\n", + "\n", + "### Q\n", + "\n", + "To better illustrate that the central part is approximately linear, perform a linear regression with the observations whose corresponding theoretical quantiles (abscissa) fall in the $[-1,1]$ interval, and make a probability plot replacing the default regression line by your regression line." 
+ ] + }, + { + "cell_type": "markdown", + "id": "153d570e", + "metadata": { + "heading_collapsed": true, + "hidden": true + }, + "source": [ + "### A" ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 35, "id": "0f888c53", "metadata": { "hidden": true @@ -1177,30 +1286,63 @@ } ], "source": [ - "import statsmodels.api as sm # anticipating the next class...\n", - "central = (-1<theoretical_quantiles) & (theoretical_quantiles<1)\n", - "model = sm.OLS(observed_quantiles[central], sm.add_constant(theoretical_quantiles[central])).fit()\n", - "a, b = model.params\n", + "central_part = (-1<theoretical_quantiles) & (theoretical_quantiles<1)\n", + "b, a, _, _, _ = stats.linregress(theoretical_quantiles[central_part], observed_quantiles[central_part])\n", "plt.scatter(theoretical_quantiles, observed_quantiles, marker='+', color='b')\n", "plt.axline((0, a), slope=b, color='r');" ] }, { "cell_type": "markdown", - "id": "f35584b7", + "id": "7981096d", "metadata": { "hidden": true }, "source": [ "The misalignment of the default regression line on the central part of the distribution is indicative of some asymmetry, while the diverging tails also hint at some departure from normality (kurtosis). 
The sampling procedure clearly excluded people younger than 20 years old or elder than 70, which results in truncated distributions.\n", "\n", + "We can seek confirmation with a normality test, although it is already clear the age is not normally distributed in our sample:" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": "ed0042b6", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "NormaltestResult(statistic=148.99086986471391, pvalue=4.436532649003701e-33)" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "stats.normaltest(house_owners_age)" + ] + }, + { + "cell_type": "markdown", + "id": "f35584b7", + "metadata": { + "hidden": true + }, + "source": [ "Here, we have comfortable sample sizes and these departures from normality may not affect the power of the statistical test." ] }, { "cell_type": "markdown", "id": "2cc80be1", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1247,13 +1389,16 @@ "hidden": true }, "source": [ - "`ttest_ind` allows standard deviation ratios [up to $2$](https://en.wikipedia.org/wiki/Student%27s_t-test#Equal_or_unequal_sample_sizes,_similar_variances_(1/2_%3C_sX1/sX2_%3C_2)). The groups can have different sample sizes." + "`ttest_ind` allows standard deviation ratios [up to $2$](https://en.wikipedia.org/wiki/Student%27s_t-test#Equal_or_unequal_sample_sizes,_similar_variances_(1/2_%3C_sX1/sX2_%3C_2)).\n", + "The groups can have different sample sizes." 
] }, { "cell_type": "markdown", "id": "d61f454a", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1300,7 +1445,9 @@ { "cell_type": "markdown", "id": "62b30b76", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1319,7 +1466,7 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 14, "id": "341157b6", "metadata": { "hidden": true @@ -1331,7 +1478,7 @@ "(814, -10.305953282828284, -0.7954424784394866, -0.7954424784394866)" ] }, - "execution_count": 54, + "execution_count": 14, "metadata": {}, "output_type": "execute_result" } @@ -1371,7 +1518,9 @@ { "cell_type": "markdown", "id": "f72698b7", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1392,7 +1541,7 @@ }, { "cell_type": "markdown", - "id": "8a2bc253", + "id": "6b7d5d56", "metadata": { "heading_collapsed": true }, @@ -1402,15 +1551,15 @@ }, { "cell_type": "code", - "execution_count": 48, - "id": "fd050fbc", + "execution_count": 15, + "id": "ee947953", "metadata": { "hidden": true }, "outputs": [ { "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAX4AAAEGCAYAAABiq/5QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAQ8UlEQVR4nO3df6xfdX3H8eer1E4Ftfy46Wp/rDUaHJMJUlF+xDjYFtycMMf4Eec6g4Nk6mQ6Fd0fhC0mkhiVmE1pQIcb0yJiQGJgDNHMsdS1gKlQmQyBXn7WH8jmElnlvT++p3K5Le3t7T3fb+/383wkN/d7zvl+73l/cr993dP3OefzTVUhSWrHglEXIEkaLoNfkhpj8EtSYwx+SWqMwS9JjVk46gJm4rDDDqtVq1aNugxJmlc2bdr0g6qamL5+XgT/qlWr2Lhx46jLkKR5Jcn9u1pvq0eSGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EuaN5atWEmSoX8tW7Fy1EOfU/NiygZJAnhocitnXnrr0Pe7/rzjh77PPnnEL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMb0Gf5K/SHJnku8k+XyS5yZZnWRDknuSrE+yqM8aJEnP1FvwJ1kG/DmwpqpeARwAnAVcDHy8ql4K/Bg4p68aJEk767vVsxB4XpKFwPOBh4GTgKu77VcAp/VcgyRpit6Cv6oeBD4KPMAg8H8CbAIer6rt3dMmgWW7en2Sc5NsTLJx27ZtfZUpSc3ps9VzMHAqsBp4MXAgcMpMX19V66pqTVWtmZiY6KlKSWpPn62e3wS+X1Xbqur/gGuAE4DFXesHYDnwYI81SJKm6TP4HwBem+T5SQKcDNwF3AKc3j1nLXBtjzVIkqbps8e/gcFJ3NuAzd2+1gEfAN6T5B7gUODyvmqQJO1s4Z6fMntVdSFw4bTV9wLH9rlfSdKz885dSWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg19jYdmKlSQZ+teyFStHPXRpr/U6H780LA9NbuXMS28d+n7Xn3f80Pcp7SuP+CWpMR7xS9ory1as5KHJraMuQ/vA4Je0V0bVVgNba3PFVo8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY3xOn7NGW/skeYHg19zxht7pPnB4JfmKf+Hpdky+KV5yhlJNVue3JWkxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTG9Bn+SxUmuTvLdJFuSHJfkkCQ3Jfle9/3gPmuQJD1T30f8lwA3VNXLgVcCW4ALgJur6mXAzd2yJGlIegv+JC8CXgdcDlBVT1bV48CpwBXd064ATuurBknSzvo84l8NbAM+m+T2JJclORBYUlUPd895BFiyqxcnOTfJxiQbt23b1mOZktSWPoN/IfAq4FNVdTTwU6a1daqqgNrVi6tqXVWtqao1ExMTPZYpSW3pM/gngcmq2tAtX83gD8GjSZYCdN8f67EGSdI0vQV/VT0CbE1yeLfqZOAu4DpgbbduLXBtXzVIknbW97TM7wKuTLIIuBd4G4M/NlclOQe4Hzij5xokSVP0GvxVdQewZhebTu5zv5KkZ+edu5LUGINfkhpj8EtSYwx+SWqMwS9Jjen7ck5pvC1YSJJRVyHtFYNf2hdPbefMS28dya7Xn3f8SPar+c9WjyQ1xiN+SdqTEbX0Xrx8BQ9ufWDOf67BL0l7MqKWXl/tPFs9ktSYGQV/khNmsk6StP+b6RH/J2e4TpK0n9ttjz/JccDxwESS90zZ9ELggD4LkyT1Y08ndxcBB3XPe8GU9U8Ap/dVlCSpP7sN/qr6BvCNJH9fVfcPqSZJUo9mejn
nLyVZB6ya+pqqOqmPoiRJ/Zlp8H8R+DRwGfDz/sqRJPVtpsG/vao+1WslkqShmOnlnF9J8mdJliY5ZMdXr5VJknox0yP+td33901ZV8BL5rYcSVLfZhT8VbW670IkScMxo+BP8se7Wl9Vn5vbciRJfZtpq+fVUx4/FzgZuA0w+CVpnplpq+ddU5eTLAa+0EdBkqR+zXZa5p8C9v0laR6aaY//Kwyu4oHB5Gy/ClzVV1GSpP7MtMf/0SmPtwP3V9VkD/VIkno2o1ZPN1nbdxnM0Hkw8GSfRUmS+jPTT+A6A/gW8IfAGcCGJE7LLEnz0ExbPX8FvLqqHgNIMgH8C3B1X4VJkvox06t6FuwI/c4P9+K1kqT9yEyP+G9IciPw+W75TOCr/ZQkSerTnj5z96XAkqp6X5I3Ayd2m/4duLLv4iRJc29PR/yfAD4IUFXXANcAJDmy2/Z7PdYmSerBnoJ/SVVtnr6yqjYnWdVPSZL0LBYsZP15x49kv+NkT6NZvJttz5vDOiRpz57azpEX3jD03W6+6JSh77NPe7oyZ2OSP52+MsnbgU0z2UGSA5LcnuT6bnl1kg1J7kmyPsmivS9bkjRbezriPx/4cpK38HTQrwEWAb8/w328G9gCvLBbvhj4eFV9IcmngXMAP89XkoZkt0f8VfVoVR0PXATc131dVFXHVdUje/rhSZYDvwtc1i0HOImnb/y6AjhtlrVLkmZhpvPx3wLcMouf/wng/Qzm+AE4FHi8qrZ3y5PAsl29MMm5wLkAK1eunMWuJUm70tvdt0neCDxWVTM6FzBdVa2rqjVVtWZiYmKOq5OkdvV5jdIJwJuS/A6Dj2t8IXAJsDjJwu6ofznwYI81SJKm6e2Iv6o+WFXLq2oVcBbwtap6C4OW0Y6ZPdcC1/ZVgyRpZ6OYaO0DwHuS3MOg53/5CGqQpGYN5Xa0qvo68PXu8b3AscPYryRpZ06tLEmNMfglqTEGvyQ1ZrymnBMAy1as5KHJraMuQ9J+yuAfQw9NbuXMS28d+n5HMl2upL1mq0eSGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjJ+5q/GwYOFoPvN3gf+ENP/4rtV4eGo7R154w9B3u/miU4a+T2lf2eqRpMYY/JLUGINfkhpj8EtSYwx+SWqMV/VI85WXsGqW/A1K85WXsGqWbPVIUmMMfklqjK0eaV+Mqs8u7QODX9oXI+qzg712zV5vwZ9kBfA5YAlQwLqquiTJIcB6YBVwH3BGVf24rzo0RB79alyN2RVUfR7xbwfeW1W3JXkBsCnJTcCfADdX1UeSXABcAHygxzo0LB79alyN2RVUvZ3craqHq+q27vF/A1uAZcCpwBXd064ATuurBknSzoZyVU+SVcDRwAZgSVU93G16hEEraFevOTfJxiQbt23bNowyJakJvQd/koOALwHnV9UTU7dVVTHo/++kqtZV1ZqqWjMxMdF3mZLUjF6DP8lzGIT+lVV1Tbf60SRLu+1Lgcf6rEGS9Ey9BX+SAJcDW6rqY1M2XQes7R6vBa7tqwZJ0s76vKrnBOCtwOYkd3TrPgR8BLgqyTnA/cAZPdYgSZqmt+Cvqm8CeZbNJ/e1X0nS7jlXjyQ1xikbJO0d79Ce9wx+SXvHO7TnPYN/HHlEJmk3DP5xNGbzikiaW57claTGGPyS1BiDX5IaY/BLUmMMfklqjFf19GTZipU8NLl11GVI0k4M/p48NLmVMy+9dST79hp+Sbtjq0eSGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhoz9nfuOnWCJD3T2Af/qKZOcNoESfsrWz2S1BiDX5IaY/BLUmPGvsc/MgsW2ueXtF8y+Pvy1HaOvPCGkex680WnjGS/kuYHWz2S1BiDX5IaY/BLUmPGv8fvSVZ
JeobxD/4RnWT1BKuk/ZWtHklqjMEvSY0x+CWpMQa/JDVmJMGf5JQkdye5J8kFo6hBklo19OBPcgDwt8AbgCOAs5McMew6JKlVozjiPxa4p6ruraongS8Ap46gDklqUqpquDtMTgdOqaq3d8tvBV5TVe+c9rxzgXO7xcOBu2e5y8OAH8zytfOVY26DYx5/+zreX6mqiekr99sbuKpqHbBuX39Oko1VtWYOSpo3HHMbHPP462u8o2j1PAismLK8vFsnSRqCUQT/fwAvS7I6ySLgLOC6EdQhSU0aequnqrYneSdwI3AA8JmqurPHXe5zu2gecsxtcMzjr5fxDv3kriRptLxzV5IaY/BLUmPGJviTrEhyS5K7ktyZ5N3d+kOS3JTke933g0dd61xJ8twk30ry7W7MF3XrVyfZ0E2Jsb47iT5WkhyQ5PYk13fLYz3mJPcl2ZzkjiQbu3Vj+94GSLI4ydVJvptkS5LjxnnMSQ7vfr87vp5Icn4fYx6b4Ae2A++tqiOA1wLv6KaCuAC4uapeBtzcLY+LnwEnVdUrgaOAU5K8FrgY+HhVvRT4MXDO6ErszbuBLVOWWxjzb1TVUVOu6x7n9zbAJcANVfVy4JUMft9jO+aqurv7/R4FHAP8L/Bl+hhzVY3lF3At8FsM7vhd2q1bCtw96tp6Gu/zgduA1zC4029ht/444MZR1zfHY13e/QM4CbgeSANjvg84bNq6sX1vAy8Cvk93AUoLY542zt8G/q2vMY/TEf8vJFkFHA1sAJZU1cPdpkeAJaOqqw9dy+MO4DHgJuC/gMeranv3lElg2YjK68sngPcDT3XLhzL+Yy7gn5Ns6qYzgfF+b68GtgGf7Vp6lyU5kPEe81RnAZ/vHs/5mMcu+JMcBHwJOL+qnpi6rQZ/Msfq+tWq+nkN/mu4nMEEeC8fbUX9SvJG4LGq2jTqWobsxKp6FYNZbd+R5HVTN47he3sh8CrgU1V1NPBTprU4xnDMAHTnp94EfHH6trka81gFf5LnMAj9K6vqmm71o0mWdtuXMjgyHjtV9ThwC4M2x+IkO27OG7cpMU4A3pTkPgYzu57EoBc8zmOmqh7svj/GoO97LOP93p4EJqtqQ7d8NYM/BOM85h3eANxWVY92y3M+5rEJ/iQBLge2VNXHpmy6DljbPV7LoPc/FpJMJFncPX4eg3MaWxj8ATi9e9pYjbmqPlhVy6tqFYP/Dn+tqt7CGI85yYFJXrDjMYP+73cY4/d2VT0CbE1yeLfqZOAuxnjMU5zN020e6GHMY3PnbpITgX8FNvN07/dDDPr8VwErgfuBM6rqRyMpco4l+XXgCgZTXywArqqqv07yEgZHw4cAtwN/VFU/G12l/UjyeuAvq+qN4zzmbmxf7hYXAv9UVR9Ocihj+t4GSHIUcBmwCLgXeBvd+5zxHfOBwAPAS6rqJ926Of89j03wS5JmZmxaPZKkmTH4JakxBr8kNcbgl6TGGPyS1BiDX9qDJKclqSRjfVe02mHwS3t2NvDN7rs07xn80m50cz+dyGCa57O6dQuS/F03T/xNSb6a5PRu2zFJvtFNpnbjjlvtpf2JwS/t3qkM5oT/T+CHSY4B3gysAo4A3spgfqQdc0V9Eji9qo4BPgN8eBRFS7uzcM9PkZp2NoNJ4GAwJcTZDP7dfLGqngIeSXJLt/1w4BXATYOpozgAeBhpP2PwS88iySEMZv88MkkxCPLi6XlzdnoJcGdVHTekEqVZsdUjPbvTgX+oql+pqlVVtYLBp0L9CPiDrte/BHh99/y7gYkkv2j9JPm1URQu7Y7BLz27s9n56P5LwC8zmC/+LuAfGXzk5U+q6kkGfywuTvJt4A7g+KFVK82Qs3NKs5DkoKr6n27K3G8BJ3RzyEv7PXv80uxc330IziLgbwx9zSce8UtSY+zxS1JjDH5JaozBL0mNMfglqTEGvyQ15v8BiGF1P6/1jG0AAAAASUVORK5CYII=\n", + 
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEGCAYAAABiq/5QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAQ1klEQVR4nO3de6xlZXnH8e8PxqkK6nCZTMe5OGMkWCoVZERuMRbaBlsr1FIusXZisENStVCtivYPQhuTkhgvMa0yAS1tqYKIAYmBUkRTSzN2uBiEkUoRmOE6XpDWJtKRp3/sNXA4czmbM2ftPWe/309ysvdaa++znjdnn99Z59lrvTtVhSSpHfuMuwBJ0mgZ/JLUGINfkhpj8EtSYwx+SWrMgnEXMIyDDz64Vq1aNe4yJGleufXWW39YVYunr58Xwb9q1So2btw47jIkaV5J8sDO1tvqkaTGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8kuaNZStWkmTkX8tWrBz30OfUvJiyQZIAHt6ymTMuvmXk+73inONGvs8+ecQvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhpj8EtSYwx+SWqMwS9JjTH4JakxvQZ/kj9LcleS7yb5QpIXJlmdZEOSe5NckWRhnzVIkp6rt+BPsgz4U2BNVb0G2Bc4E7gI+ERVvQr4CXB2XzVIknbUd6tnAfCiJAuAFwOPACcCV3XbLwNO7bkGSdIUvQV/VT0EfAx4kEHg/xS4FXiiqrZ1D9sCLNvZ85OsS7IxycatW7f2VaYkNafPVs8BwCnAauDlwH7AycM+v6rWV9WaqlqzePHinqqUpPb02er5DeAHVbW1qv4PuBo4HljUtX4AlgMP9ViDJGmaPoP/QeCYJC9OEuAk4G7gZuC07jFrgWt6rEGSNE2fPf4NDN7EvQ24s9vXeuBDwPuS3AscBFzaVw2SpB0tmPkhs1dVFwAXTFt9H3B0n/uVJO2aV+5KUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JoIy1asJMnIv5atWDnuoUvPW6/z8Uuj8vCWzZxx8S0j3+8V5xw38n1Ke8ojfklqjEf8kp6XZStW8vCWzeMuQ3vA4Jf0vIyrrQa21uaKrR5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUGINfkhrjefyaM17YI80PBr/mjBf2SPODwS/NU/6Hpdky+KV5yhlJNVu+uStJjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmN6Df4ki5JcleR7STYlOTbJgUluTPL97vaAPmuQJD1X30f8nwKur6pXA68FNgHnAzdV1SHATd2yJGlEegv+JC8D3ghcClBVT1XVE8ApwGXdwy4DTu2rBknSjvo84l8NbAU+n+T2JJck2Q9YUlWPdI95FFiysycnWZdkY5KNW7du7bFMSWpLn8G/AHgd8JmqOhL4GdPaOlVVQO3syVW1vqrWVNWaxYsX91imJLWlz+DfAmypqg3d8lUM/hA8lmQpQHf7eI81SJKm6S34q+pRYHOSQ7tVJwF3A9cCa7t1a4Fr+qpBkrSjvqdlfi9weZKFwH3AOxn8sbkyydnAA8DpPdcgSZqi1+CvqjuANTvZdFKf+5Uk7ZpX7kpSYwx+SWqMwS9JjTH4JakxBr8kNabv0zmlybbPApKMuwrpeTH4pT3x9DbOuPiWsez6inOOG8t+Nf/Z6pGkxnjEL0kzGVNL7+XLV/DQ5gfn/Psa/JI0kzG19Ppq59nqkaTGDBX8SY4fZp0kae837BH/p4dcJ0nay+22x5/kWOA4YHGS903Z9FJg3z4LkyT1Y6Y3dxcC+3ePe8mU9U8Cp/VVlCSpP7sN/qr6JvDNJH9XVQ
+MqCZJUo+GPZ3zl5KsB1ZNfU5VndhHUZKk/gwb/F8CPgtcAvyiv3IkSX0bNvi3VdVneq1EkjQSw57O+dUkf5JkaZIDt3/1WpkkqRfDHvGv7W4/MGVdAa+c23IkSX0bKviranXfhUiSRmOo4E/yRztbX1V/P7flSJL6Nmyr5/VT7r8QOAm4DTD4JWmeGbbV896py0kWAV/soyBJUr9mOy3zzwD7/pI0Dw3b4/8qg7N4YDA5268AV/ZVlCSpP8P2+D825f424IGq2tJDPZKkng3V6ukma/segxk6DwCe6rMoSVJ/hv0ErtOBbwN/AJwObEjitMySNA8N2+r5C+D1VfU4QJLFwL8AV/VVmCSpH8Oe1bPP9tDv/Oh5PFeStBcZ9oj/+iQ3AF/ols8AvtZPSZKkPs30mbuvApZU1QeSvA04odv078DlfRcnSZp7Mx3xfxL4MEBVXQ1cDZDk8G7b7/ZYmySpBzP16ZdU1Z3TV3brVvVSkSSpVzMF/6LdbHvRHNYhSRqRmYJ/Y5I/nr4yybuAW4fZQZJ9k9ye5LpueXWSDUnuTXJFkoXPv2xJ0mzN1OM/D/hKkrfzbNCvARYCvzfkPs4FNgEv7ZYvAj5RVV9M8lngbMDP85WkEdntEX9VPVZVxwEXAvd3XxdW1bFV9ehM3zzJcuB3gEu65QAn8uyFX5cBp86ydknSLAw7H//NwM2z+P6fBD7IYI4fgIOAJ6pqW7e8BVi2sycmWQesA1i5cuUsdi1J2pnerr5N8hbg8aoa6r2A6apqfVWtqao1ixcvnuPqJKldw165OxvHA29N8tsMPq7xpcCngEVJFnRH/cuBh3qsQZI0TW9H/FX14apaXlWrgDOBr1fV2xm0jLbP7LkWuKavGiRJOxrHRGsfAt6X5F4GPf9Lx1CDJDWrz1bPM6rqG8A3uvv3AUePYr+SpB05tbIkNcbgl6TGGPyS1JiR9Pg1WstWrOThLZvHXYakvZTBP4Ee3rKZMy6+ZeT7veKc40a+T0nPn8Evaf7YZ8F4DjD2mayonKzRSJpsT2/j8AuuH/lu77zw5JHvs0++uStJjTH4JakxBr8kNcbgl6TGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktQYg1+SGmPwS1JjDH5JaozBL0mN8RO4NHfG9bF43b4lDcffFs2dMX0sHkzeR+NJfTL4NRn8EO7RGed/dpoTDb5qNZH8EO7R8T+7ec83dyWpMQa/JDXG4Jekxhj8ktQYg1+SGuNZPdKe8KI1zUO+cqQ94amNmocMfkmayYRdIGjwS9JMJuwCwd7e3E2yIsnNSe5OcleSc7v1Bya5Mcn3u9sD+qpBkrSjPs/q2Qa8v6oOA44B3p3kMOB84KaqOgS4qVuWJI1Ib8FfVY9U1W3d/f8GNgHLgFOAy7qHXQac2lcNkqQdjaTHn2QVcCSwAVhSVY90mx4FluziOeuAdQArV64cQZUTxNkTJe1G78GfZH/gy8B5VfVkkme2VVUlqZ09r6rWA+sB1qxZs9PHaBcm7I0oSXOr1yt3k7yAQehfXlVXd6sfS7K0274UeLzPGiRJz9XnWT0BLgU2VdXHp2y6Fljb3V8LXNNXDZKkHfXZ6jkeeAdwZ5I7unUfAf4auDLJ2cADwOk91iBJmqa34K+qbwHZxeaT+tqvJGn3nJ1Tkhpj8EtSYwx+SWqMwS9JjTH4JakxTssszVdOzaFZMvil+cqpOTRLtnokqTEGvyQ1xuCXpMbY4+/JshUreXjL5nGXIUk7MPh78vCWzZxx8S1j2bdnekjaHVs9ktQYg1+SGmPwS1JjDH5JaozBL0mNMfglqTEGvyQ1xuCXpMYY/JLUmIm/ctepEyTpuSY++Mc1dYLTJkjaW9nqkaTGGPyS1BiDX5IaY/BLUmMMfklqjMEvSY0x+CWpMQa/JDXG4Jekxhj8ktSYiZ+ygX0WjGf6hHHtV5JmMPnB//Q2Dr/g+pHv9s4LTx
7LfrfvW5J2xVaPJDXG4Jekxowl+JOcnOSeJPcmOX8cNUhSq0Ye/En2Bf4GeDNwGHBWksNGXYcktWocR/xHA/dW1X1V9RTwReCUMdQhSU1KVY12h8lpwMlV9a5u+R3AG6rqPdMetw5Y1y0eCtwzy10eDPxwls+drxxzGxzz5NvT8b6iqhZPX7nXns5ZVeuB9Xv6fZJsrKo1c1DSvOGY2+CYJ19f4x1Hq+chYMWU5eXdOknSCIwj+P8DOCTJ6iQLgTOBa8dQhyQ1aeStnqraluQ9wA3AvsDnququHne5x+2iecgxt8ExT75exjvyN3clSePllbuS1BiDX5IaMzHBn2RFkpuT3J3kriTndusPTHJjku93tweMu9a5kuSFSb6d5DvdmC/s1q9OsqGbEuOK7k30iZJk3yS3J7muW57oMSe5P8mdSe5IsrFbN7GvbYAki5JcleR7STYlOXaSx5zk0O7nu/3rySTn9THmiQl+YBvw/qo6DDgGeHc3FcT5wE1VdQhwU7c8KX4OnFhVrwWOAE5OcgxwEfCJqnoV8BPg7PGV2JtzgU1TllsY869X1RFTzuue5Nc2wKeA66vq1cBrGfy8J3bMVXVP9/M9AjgK+F/gK/Qx5qqayC/gGuA3GVzxu7RbtxS4Z9y19TTeFwO3AW9gcKXfgm79scAN465vjse6vPsFOBG4DkgDY74fOHjauol9bQMvA35AdwJKC2OeNs7fAv6trzFP0hH/M5KsAo4ENgBLquqRbtOjwJJx1dWHruVxB/A4cCPwX8ATVbWte8gWYNmYyuvLJ4EPAk93ywcx+WMu4J+T3NpNZwKT/dpeDWwFPt+19C5Jsh+TPeapzgS+0N2f8zFPXPAn2R/4MnBeVT05dVsN/mRO1PmrVfWLGvxruJzBBHivHm9F/UryFuDxqrp13LWM2AlV9ToGs9q+O8kbp26cwNf2AuB1wGeq6kjgZ0xrcUzgmAHo3p96K/Cl6dvmaswTFfxJXsAg9C+vqqu71Y8lWdptX8rgyHjiVNUTwM0M2hyLkmy/OG/SpsQ4HnhrkvsZzOx6IoNe8CSPmap6qLt9nEHf92gm+7W9BdhSVRu65asY/CGY5DFv92bgtqp6rFue8zFPTPAnCXApsKmqPj5l07XA2u7+Wga9/4mQZHGSRd39FzF4T2MTgz8Ap3UPm6gxV9WHq2p5Va1i8O/w16vq7UzwmJPsl+Ql2+8z6P9+lwl+bVfVo8DmJId2q04C7maCxzzFWTzb5oEexjwxV+4mOQH4V+BOnu39foRBn/9KYCXwAHB6Vf14LEXOsSS/BlzGYOqLfYArq+ovk7ySwdHwgcDtwB9W1c/HV2k/krwJ+POqesskj7kb21e6xQXAP1XVR5McxIS+tgGSHAFcAiwE7gPeSfc6Z3LHvB/wIPDKqvppt27Of84TE/ySpOFMTKtHkjQcg1+SGmPwS1JjDH5JaozBL0mNMfilGSQ5NUklmeirotUOg1+a2VnAt7pbad4z+KXd6OZ+OoHBNM9nduv2SfK33TzxNyb5WpLTum1HJflmN5naDdsvtZf2Jga/tHunMJgT/j+BHyU5CngbsAo4DHgHg/mRts8V9WngtKo6Cvgc8NFxFC3tzoKZHyI17SwGk8DBYEqIsxj83nypqp4GHk1yc7f9UOA1wI2DqaPYF3gEaS9j8Eu7kORABrN/Hp6kGAR58ey8OTs8Bbirqo4dUYnSrNjqkXbtNOAfquoVVbWqqlYw+FSoHwO/3/X6lwBv6h5/D7A4yTOtnyS/Oo7Cpd0x+KVdO4sdj+6/DPwyg/ni7wb+kcFHXv60qp5i8MfioiTfAe4AjhtZtdKQnJ1TmoUk+1fV/3RT5n4bOL6bQ17a69njl2bnuu5DcBYCf2Xoaz7xiF+SGmOPX5IaY/BLUmMMfklqjMEvSY0x+CWpMf8P/S1i2GC7UfwAAAAASUVORK5CYII=\n", "text/plain": [ "<Figure size 
432x288 with 1 Axes>" ] @@ -1455,8 +1604,8 @@ }, { "cell_type": "code", - "execution_count": null, - "id": "35541f89", + "execution_count": 16, + "id": "81e31c08", "metadata": { "hidden": true }, @@ -1471,7 +1620,7 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 17, "id": "2ae175e5", "metadata": { "hidden": true @@ -1507,8 +1656,8 @@ }, { "cell_type": "code", - "execution_count": 42, - "id": "d9641fa6", + "execution_count": 18, + "id": "d15c23da", "metadata": { "hidden": true }, @@ -1556,49 +1705,49 @@ " <th>2</th>\n", " <td>57</td>\n", " <td>Student</td>\n", - " <td>0.97</td>\n", + " <td>1.00</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>57</td>\n", " <td>Welch</td>\n", - " <td>0.97</td>\n", + " <td>1.00</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>28</td>\n", " <td>Student</td>\n", - " <td>0.80</td>\n", + " <td>0.76</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>28</td>\n", " <td>Welch</td>\n", - " <td>0.80</td>\n", + " <td>0.76</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>14</td>\n", " <td>Student</td>\n", - " <td>0.55</td>\n", + " <td>0.51</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>14</td>\n", " <td>Welch</td>\n", - " <td>0.55</td>\n", + " <td>0.51</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>7</td>\n", " <td>Student</td>\n", - " <td>0.25</td>\n", + " <td>0.29</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>7</td>\n", " <td>Welch</td>\n", - " <td>0.23</td>\n", + " <td>0.29</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", @@ -1608,17 +1757,17 @@ " sample size test power\n", "0 288 Student 1.00\n", "1 288 Welch 1.00\n", - "2 57 Student 0.97\n", - "3 57 Welch 0.97\n", - "4 28 Student 0.80\n", - "5 28 Welch 0.80\n", - "6 14 Student 0.55\n", - "7 14 Welch 0.55\n", - "8 7 Student 0.25\n", - "9 7 Welch 0.23" + "2 57 Student 1.00\n", + "3 57 Welch 1.00\n", + "4 28 Student 0.76\n", + "5 28 Welch 0.76\n", + "6 14 Student 0.51\n", + "7 14 Welch 0.51\n", + "8 7 Student 0.29\n", + 
"9 7 Welch 0.29" ] }, - "execution_count": 42, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -1629,7 +1778,7 @@ }, { "cell_type": "markdown", - "id": "f0ffbdab", + "id": "c4c702bc", "metadata": { "hidden": true }, @@ -1659,7 +1808,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 19, "id": "0aeaeee7", "metadata": {}, "outputs": [ @@ -1684,7 +1833,9 @@ { "cell_type": "markdown", "id": "02cbfb3c", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Q\n", "\n", @@ -1703,7 +1854,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 20, "id": "e9c1cc3c", "metadata": { "hidden": true @@ -1715,7 +1866,7 @@ "(47.758187772925766, 44.85779329608938, 16.298908849529322, 9.611832029475966)" ] }, - "execution_count": 17, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -1741,7 +1892,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 21, "id": "b6bfdb58", "metadata": { "hidden": true @@ -1789,15 +1940,19 @@ { "cell_type": "markdown", "id": "89181560", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ - "## A" + "## A (with nested Q&A)" ] }, { "cell_type": "markdown", "id": "ef4093df", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "We need a two-sample goodness-of-fit test.\n", "\n", @@ -1808,22 +1963,27 @@ "\n", "### Q\n", "\n", - "Bin the two groups, extract frequencies and proceed to performing a $\\chi^2$ test." + "Bin the two groups, first with 5-year-wide bins, extract frequencies and proceed to performing a $\\chi^2$ test." 
] }, { "cell_type": "markdown", "id": "4338fe92", - "metadata": {}, + "metadata": { + "heading_collapsed": true, + "hidden": true + }, "source": [ "### A" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 22, "id": "f57a8ff6", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -1832,7 +1992,7 @@ " array([ 2, 12, 44, 67, 65, 53, 57, 34, 16, 8]))" ] }, - "execution_count": 19, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -1847,16 +2007,20 @@ { "cell_type": "markdown", "id": "980a9794", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "Let us check we did not miss any observation:" ] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 23, "id": "de19e80b", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -1864,7 +2028,7 @@ "816" ] }, - "execution_count": 20, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } @@ -1876,44 +2040,98 @@ }, { "cell_type": "markdown", - "id": "bbbc965c", - "metadata": {}, + "id": "0150c1f5", + "metadata": { + "hidden": true + }, "source": [ - "Check there are at least 5 observations per combination of factor levels:" + "Note that we have fewer than 5 observations in one combination of factor levels. In principle we should revise the binning so that all bins contain at least 5 observations.\n", + "\n", + "For the purpose of comparing the impact of the binning, let us run the test anyway." 
] }, { "cell_type": "code", - "execution_count": 52, - "id": "32afe08a", - "metadata": {}, + "execution_count": 24, + "id": "b021e03f", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "χ²(9) = 228.0, p-value = 4.33e-44\n" + ] + } + ], + "source": [ + "chi2, pvalue, dof, _ = stats.chi2_contingency(np.stack((lives_with_kids_freqs, lives_without_kids_freqs), axis=0))\n", + "print(f'χ²({dof}) = {chi2:.1f}, p-value = {pvalue:.3g}')" + ] + }, + { + "cell_type": "markdown", + "id": "218a59ec", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## Q\n", + "\n", + "Are all the assumptions met? Adjust the procedure if necessary. Any interpretation?" + ] + }, + { + "cell_type": "markdown", + "id": "abdce59a", + "metadata": { + "heading_collapsed": true + }, + "source": [ + "## A" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "3e64b5ce", + "metadata": { + "hidden": true + }, "outputs": [ { "data": { "text/plain": [ - "(array([22, 23, 26]), array([ 2, 8, 12]))" + "(array([107, 49, 52, 95, 155]), array([ 14, 111, 118, 91, 24]))" ] }, - "execution_count": 52, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "np.sort(lives_without_kids_freqs)[:3], np.sort(lives_with_kids_freqs)[:3]" + "bins = np.arange(20, 70+1, 10)\n", + "lives_without_kids_freqs, _ = np.histogram(lives_without_kids, bins)\n", + "lives_with_kids_freqs, _ = np.histogram(lives_with_kids, bins)\n", + "lives_without_kids_freqs, lives_with_kids_freqs" ] }, { "cell_type": "code", - "execution_count": 21, - "id": "b021e03f", - "metadata": {}, + "execution_count": 26, + "id": "e79b98f2", + "metadata": { + "hidden": true + }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "χ²(9) = 228.0, p-value = 4.33e-44\n" + "χ²(4) = 208.0, p-value = 7.32e-44\n" ] } ], @@ -1924,28 +2142,20 @@ }, { "cell_type": "markdown", - "id": "3b829782", - "metadata": {}, - 
"source": [ - "### Q\n", - "\n", - "Similarly, perform a two-sample Kolmogorov-Smirnov test." - ] - }, - { - "cell_type": "markdown", - "id": "8e66c8ce", + "id": "949edcfb", "metadata": { - "heading_collapsed": true + "hidden": true }, "source": [ - "### A" + "The low frequency in one group, in the first case, did not affect the outcome of the test because of the relatively large number of bins.\n", + "\n", + "Although there is no doubt we do have an effect here, we can also run a two-sample Kolmogorov-Smirnov test. It is good practice to seek confirmation with different but equivalent tests." ] }, { "cell_type": "code", - "execution_count": 22, - "id": "3c694746", + "execution_count": 27, + "id": "7ea01f76", "metadata": { "hidden": true }, @@ -1956,7 +2166,7 @@ "KstestResult(statistic=0.31230026103290964, pvalue=1.1102230246251565e-16)" ] }, - "execution_count": 22, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } @@ -1967,17 +2177,21 @@ }, { "cell_type": "markdown", - "id": "901dca51", - "metadata": {}, + "id": "982eafde", + "metadata": { + "heading_collapsed": true + }, "source": [ - "# Correlations" + "# ..." 
] }, { "cell_type": "code", "execution_count": null, - "id": "73b4a48d", - "metadata": {}, + "id": "09247a33", + "metadata": { + "hidden": true + }, "outputs": [], "source": [] } diff --git a/notebooks/scipy_cours.ipynb b/notebooks/scipy_cours.ipynb index f6cea83..1b23c75 100644 --- a/notebooks/scipy_cours.ipynb +++ b/notebooks/scipy_cours.ipynb @@ -1068,7 +1068,7 @@ { "cell_type": "code", "execution_count": 153, - "id": "c58a021f", + "id": "b00250f8", "metadata": { "hidden": true }, @@ -1103,7 +1103,7 @@ }, { "cell_type": "markdown", - "id": "60f07aba", + "id": "2b311609", "metadata": { "hidden": true }, @@ -1116,7 +1116,7 @@ { "cell_type": "code", "execution_count": 154, - "id": "0b70263b", + "id": "e60ba8f3", "metadata": { "hidden": true }, @@ -1187,20 +1187,20 @@ }, { "cell_type": "markdown", - "id": "77dfe22a", + "id": "e2b36e29", "metadata": { "hidden": true }, "source": [ "As they may differ, instead of reporting the sample mean alone, we can report a range of possible values for the population mean, and this is made possible by the fact the mean estimator is known to be normally distributed.\n", "\n", - "Note: a normal distribution with mean $\\mu$ and variation $\\sigma^2$ can be represented in `scipy` with the `norm` class that features that many distribution related measurements:" + "Note: a normal distribution with mean $\\mu$ and variance $\\sigma^2$ can be represented in `scipy` with the `norm` class whose methods implement many distribution-related measurements:" ] }, { "cell_type": "code", "execution_count": 145, - "id": "8f54f1c4", + "id": "7e3b5520", "metadata": { "hidden": true }, @@ -1214,7 +1214,7 @@ { "cell_type": "code", "execution_count": 151, - "id": "8eef01b5", + "id": "d6bbb149", "metadata": { "hidden": true }, @@ -1238,7 +1238,7 @@ { "cell_type": "code", "execution_count": 152, - "id": "d93bb006", + "id": "1341b01d", "metadata": { "hidden": true }, @@ -1261,7 +1261,7 @@ }, { "cell_type": "markdown", - "id": "128bee18", + "id": 
"2b05f3b3", "metadata": { "hidden": true }, @@ -1269,16 +1269,16 @@ "For example, we may report an interval around the sample mean that should include the population mean with a $1-\\alpha=95\\%$ probability:\n", "\n", "$$\n", - "\\bar{x} \\pm u_{\\alpha/2}\\frac{\\sigma}{\\sqrt{n}}\n", + "\\bar{x} \\pm z_{1-\\alpha/2}\\frac{\\sigma}{\\sqrt{n}}\n", "$$\n", "\n", - "$u_{\\alpha/2}$ is calculated as follows:" + "$z_{1-\\alpha/2}$ is calculated as follows:" ] }, { "cell_type": "code", - "execution_count": 139, - "id": "db08728d", + "execution_count": 197, + "id": "62f75d44", "metadata": { "hidden": true }, @@ -1289,7 +1289,7 @@ "1.9599639845400545" ] }, - "execution_count": 139, + "execution_count": 197, "metadata": {}, "output_type": "execute_result" } @@ -1301,12 +1301,13 @@ }, { "cell_type": "markdown", - "id": "705c2e67", + "id": "7728089e", "metadata": { "hidden": true }, "source": [ - "For a $95\\%$ confidence interval, we usually take $u\\approx 1.96$.\n", + "For a $95\\%$ confidence interval, we usually take $z\\approx 1.96$.\n", + "[isf](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.isf.html) is the inverse survival function, here of the standard normal distribution (with null mean and unit variance).\n", "\n", "$\\frac{\\sigma}{\\sqrt{n}}$ is the standard deviation of the sample mean and can be calculated using the `sem` function from `scipy.stats`." 
] @@ -1314,7 +1315,7 @@ { "cell_type": "code", "execution_count": 155, - "id": "86912ddf", + "id": "a2a23163", "metadata": { "hidden": true }, @@ -1337,7 +1338,7 @@ { "cell_type": "code", "execution_count": 156, - "id": "9b723a68", + "id": "99fe274d", "metadata": { "hidden": true }, @@ -3355,7 +3356,7 @@ }, { "cell_type": "markdown", - "id": "6277cebb", + "id": "ca862d48", "metadata": { "heading_collapsed": true, "hidden": true @@ -3366,7 +3367,7 @@ }, { "cell_type": "markdown", - "id": "dde0e024", + "id": "2b1a4872", "metadata": { "hidden": true }, @@ -3384,6 +3385,7 @@ "cell_type": "markdown", "id": "141dc10b-c964-4621-9e53-32f36c11d2da", "metadata": { + "heading_collapsed": true, "tags": [] }, "source": [ @@ -3394,6 +3396,7 @@ "cell_type": "markdown", "id": "6efe325b", "metadata": { + "hidden": true, "tags": [] }, "source": [ @@ -3412,6 +3415,8 @@ "cell_type": "markdown", "id": "344da42e-5707-4a0a-9b98-4cbf47781fbe", "metadata": { + "heading_collapsed": true, + "hidden": true, "tags": [] }, "source": [ @@ -3422,6 +3427,7 @@ "cell_type": "markdown", "id": "0c8b61ee-e2da-4dc4-b3ad-02790d407d5b", "metadata": { + "hidden": true, "jp-MarkdownHeadingCollapsed": true, "tags": [] }, @@ -3442,7 +3448,9 @@ "cell_type": "code", "execution_count": 48, "id": "f5a36568-d5b5-4388-b3f9-f38855c58c80", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -3466,7 +3474,9 @@ { "cell_type": "markdown", "id": "a9fbd953", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "The correlation coefficient is a commonly-used effect size for the linear relationship between the two variables, similarly to (but not to be confused with) a regression coefficient:" ] @@ -3475,7 +3485,9 @@ "cell_type": "code", "execution_count": 49, "id": "3581eeb0-f98d-4b2f-a0f1-bef073d86980", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -3499,7 +3511,9 @@ "cell_type": "code", "execution_count": 50, "id": 
"730a1bd8-51b6-4c06-9431-9a0fce53c42c", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -3523,7 +3537,9 @@ "cell_type": "code", "execution_count": 51, "id": "bfd1c00e-20b3-48bd-9724-7342ae70f626", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -3546,7 +3562,9 @@ { "cell_type": "markdown", "id": "53338c9a-9345-4ead-8487-1b51254d7330", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "Pearson $r$ assumes the observations are drawn from normal distributions.\n", "\n", @@ -3555,17 +3573,19 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 175, "id": "39423438-a688-4afa-b258-e17076ba1fbd", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { "text/plain": [ - "(7.087239246711281e-05, 8.421503263379038e-05)" + "(0.018346666276950804, 0.009331832682953218)" ] }, - "execution_count": 52, + "execution_count": 175, "metadata": {}, "output_type": "execute_result" } @@ -3575,6 +3595,10 @@ "x1 = rng.integers(10, size=30)\n", "x2 = x1 + rng.integers(10, size=x1.size)\n", "\n", + "# plus a few noisy observations\n", + "x1 = np.r_[x1, 9, 12]\n", + "x2 = np.r_[x2, 1, 2]\n", + "\n", "pearson_r, pearson_pv = stats.pearsonr(x1, x2)\n", "spearman_r, spearman_pv = stats.spearmanr(x1, x2)\n", "\n", @@ -3583,13 +3607,15 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 176, "id": "bca6551e-9ffb-4f60-8b96-9a085b2e2282", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { - "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXgAAAEGCAYAAABvtY4XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVfklEQVR4nO3df7DddX3n8ecbEhpAEAwXC7ncTRRlEDBArmhFoxtKgcBiyxSFWYoUmKyKlbhsKi4OAxl3Ol2wLaMdslTQxWJgUSkuSMAWqNsBg1xMzC8sIhFu+B0FgQ6QhPf+cb6hl5DcnHu+55t78snzMXPmnp/vzzs53/O6n/M53/O9kZlIksqz03g3IElqhgEvSYUy4CWpUAa8JBXKgJekQk0Y7wZG2meffXLq1Knj3YYkbTeGhoaezcy+zd3WUwE/depU7r///vFuQ5K2GxHxqy3d5hKNJBXKgJekQhnwklSonlqDl6TxsG7dOoaHh3n55ZfHu5UtmjRpEv39/UycOLHtxxjwknZ4w8PD7LHHHkydOpWIGO923iQzWbt2LcPDw0ybNq3tx7lEI2mH9/LLLzN58uSeDHeAiGDy5MljfofRaMBHxF4R8Z2IeDAiVkXE7zU5niR1qlfDfaNO+mt6ieYKYFFm/nFE7ALs1vB4kqRKYzP4iHgrMBO4GiAzX83M55oaT5J6zYIFC7j22mvHbfwmZ/DTgGeAb0TEdGAIOD8zXxp5p4iYA8wBGBgYaLAdaduZMW/0F/XQZWduo040XtavX8+nPvWpce2hyTX4CcCRwJWZeQTwEnDhpnfKzKsyczAzB/v6Nns4BUkaNy+99BInnngi06dP59BDD+WGG25gaGiIj3zkI8yYMYPjjjuOJ554AoCPfvSjzJ07l8HBQa644gouueQSLr/8cgAefvhhjj/+eGbMmMGHP/xhHnzwQQBuvPFGDj30UKZPn87MmTO72nuTM/hhYDgzF1eXv8NmAl6SetmiRYvYf//9ufXWWwF4/vnnOeGEE7j55pvp6+vjhhtu4KKLLuKaa64B4NVXX339mFqXXHLJ63XmzJnDggULeNe73sXixYv5zGc+w5133sn8+fO5/fbbmTJlCs8991xXe28s4DPzyYh4LCIOysyfA8cAK5saT5KacNhhh3HBBRfwhS98gZNOOom9996b5cuXc+yxxwKwYcMG9ttvv9fv/4lPfOJNNV588UXuueceTj311Neve+WVVwA4+uijOeuss/j4xz/OKaec0tXem96L5s+A66o9aH4J/GnD40lSV7373e/mgQce4Ac/+AFf+tKXmDVrFocccgj33nvvZu+/++67v+m61157jb322oslS5a86bYFCxawePFibr31VmbMmMHQ0BCTJ0/uSu+N7gefmUuq9fX3ZuYfZuZvmhxPkrrt8ccfZ7fdduOMM85g3rx5LF68mGeeeeb1gF+3bh0rVqwYtcaee+7JtGnTuPHGG4HWN1OXLl0KtNbm3//+9zN//nz6+vp47LHHuta7hyqQpFEsW7aMefPmsdNOOzFx4kSuvPJKJkyYwOc+9zmef/551q9fz9y5cznkkENGrXPdddfx6U9/mi9/+cusW7eO0047jenTpzNv3jweeughMpNjjjmG6dOnd633yMyuFatrcHAw/YMfKoG7SW5fVq1axcEHHzzebWzV5vqMiKHMHNzc/T0WjSQVyoCXpEIZ8JJUKANekgplwEtSoQx4SSqU+8FL0ia2tpvrWLW7W+yiRYs4//zz2bBhA+eeey4XXljv8F3O4CWpB2zYsIHzzjuP2267jZUrV7Jw4UJWrqx3+C4DXpJ6wH333ceBBx7IO97xDnbZZRdOO+00br755lo1DXhJ6gFr1qzhgAMOeP1yf38/a9asqVXTgJekQhnwktQDpkyZ8oYjSQ4PDzNlypRaNQ14SeoB73vf+3jooYd45JFHePXVV7n++us5+eSTa9V0N0lJ2sR4HO1zwoQJfO1rX+O4445jw4Y
NnH322Vs9BPFWa3apN0lSTbNnz2b27Nldq+cSjSQVyoCXpEIZ8JJUKANekgplwEtSoQx4SSqUu0lK0iYenX9YV+sNXLxsq/c5++yzueWWW9h3331Zvnx5V8Z1Bi9JPeCss85i0aJFXa3Z6Aw+IlYDLwAbgPWZOdjkeJK0vZo5cyarV6/uas1tsUTzHzPz2W0wjiRpBJdoJKlQTc/gE7gjIhL4X5l51aZ3iIg5wByAgYEBYPS/hzgeBwGSum1rH+K186GctDVNz+A/lJlHAicA50XEzE3vkJlXZeZgZg729fU13I4k7TgancFn5prq59MRcRNwFPCjJseUpLrG4x3U6aefzt13382zzz5Lf38/l156Keecc06tmo0FfETsDuyUmS9U5/8AmN/UeJK0PVu4cGHXazY5g387cFNEbBzn25nZ3Z08JUlb1FjAZ+YvgelN1Zckjc7dJCUJyMzxbmFUnfRnwEva4U2aNIm1a9f2bMhnJmvXrmXSpEljepwHG5O0w+vv72d4eJhnnnlmvFvZokmTJtHf3z+mxxjwknZ4EydOZNq0aePdRte5RCNJhTLgJalQBrwkFcqAl6RCGfCSVCgDXpIKZcBLUqEMeEkqlAEvSYUy4CWpUAa8JBXKgJekQhnwklQoA16SCmXAS1KhDHhJKpQBL0mFMuAlqVAGvCQVyoCXpEIZ8JJUKANekgrVeMBHxM4R8dOIuKXpsSRJ/25bzODPB1Ztg3EkSSM0GvAR0Q+cCHy9yXEkSW82oeH6fwP8ObDHlu4QEXOAOQADAwMNt6OxmDHv2lFvH7rszG3Uyfh7dP5ho94+cPGybdSJ1L7GZvARcRLwdGYOjXa/zLwqMwczc7Cvr6+pdiRph9PkEs3RwMkRsRq4HpgVEX/f4HiSpBEaC/jM/GJm9mfmVOA04M7MPKOp8SRJb+R+8JJUqKY/ZAUgM+8G7t4WY0mSWpzBS1KhDHhJKpQBL0mFMuAlqVAGvCQVyoCXpEIZ8JJUKANekgplwEtSoQx4SSqUAS9JhTLgJalQbQV8RPxTO9dJknrHqEeTjIhJwG7APhGxNxDVTXsCUxruTZJUw9YOF/xfgLnA/sAQ/x7wvwW+1lxbkqS6Rg34zLwCuCIi/iwzv7qNepIkdUFbf/AjM78aER8Epo58TGZe21BfkqSa2gr4iPgW8E5gCbChujoBA16SelS7f7JvEHhPZmaTzUiSuqfd/eCXA7/bZCOSpO5qdwa/D7AyIu4DXtl4ZWae3EhXkqTa2g34S5psQpLUfe3uRfPPTTciSequdveieYHWXjMAuwATgZcyc8+mGpMk1dPuDH6PjecjIoCPAR9oqilJUn1jPppktvwDcNxo94uISRFxX0QsjYgVEXFpp01Kksau3SWaU0Zc3InWfvEvb+VhrwCzMvPFiJgI/EtE3JaZP+6sVUnSWLS7F81/GnF+PbCa1jLNFlVfinqxujixOvlFKUnaRtpdg//TTopHxM60jkJ5IPC3mbl4M/eZA8wBGBgY6GSYUc2YN/rRFIYuO7PrY46Xbf1vfXT+YaPePnDxsq6ON9KO9LxuzXg+D+3o9f5K1u4f/OiPiJsi4unq9N2I6N/a4zJzQ2YeDvQDR0XEoZu5z1WZOZiZg319fWP+B0iSNq/dD1m/AXyf1nHh9wf+b3VdWzLzOeAu4Pgx9idJ6lC7Ad+Xmd/IzPXV6ZvAqNPtiOiLiL2q87sCxwIP1mlWktS+dgN+bUScERE7V6czgLVbecx+wF0R8TPgJ8APM/OWOs1KktrX7l40ZwNfBf6a1p4w9wBnjfaAzPwZcESd5iRJnWs34OcDn8zM3wBExNuAy2kFvySpB7W7RPPejeEOkJm/xtm5JPW0dgN+p4jYe+OFagbf7uxfkjQO2g3prwD3RsSN1eVTgf/RTEuSpG5o95us10bE/cCs6qpTMnNlc21Jkupqe5mlCnRDXZK2E2M+XLAkaftgwEtSoQx4SSqUAS9JhTLgJalQBrw
kFcqAl6RCGfCSVCgDXpIKZcBLUqEMeEkqlAEvSYUy4CWpUAa8JBXKgJekQhnwklQoA16SCmXAS1KhDHhJKpQBL0mFaizgI+KAiLgrIlZGxIqIOL+psSRJbzahwdrrgQsy84GI2AMYiogfZubKBseUJFUam8Fn5hOZ+UB1/gVgFTClqfEkSW/U5Az+dRExFTgCWLyZ2+YAcwAGBga2RTu1zJh37ai3D112Zlfr3bTHZVu8beDiZWMaa7x1+/9OzXh0/mGj3r69bXc7ssY/ZI2ItwDfBeZm5m83vT0zr8rMwcwc7Ovra7odSdphNBrwETGRVrhfl5nfa3IsSdIbNbkXTQBXA6sy86+aGkeStHlNzuCPBv4EmBURS6rT7AbHkySN0NiHrJn5L0A0VV+SNDq/ySpJhTLgJalQBrwkFcqAl6RCGfCSVCgDXpIKZcBLUqEMeEkqlAEvSYUy4CWpUAa8JBXKgJekQhnwklQoA16SCmXAS1KhDHhJKpQBL0mFMuAlqVAGvCQVyoCXpEIZ8JJUKANekgplwEtSoQx4SSqUAS9JhWos4CPimoh4OiKWNzWGJGnLmpzBfxM4vsH6kqRRNBbwmfkj4NdN1ZckjW7CeDcQEXOAOQADAwNbvf+j8w8b9faBi5d1pa9O9XJ/vdxbt+1I/9YdSbef1+25Xju1xv1D1sy8KjMHM3Owr69vvNuRpGKMe8BLkpphwEtSoZrcTXIhcC9wUEQMR8Q5TY0lSXqzxj5kzczTm6otSdo6l2gkqVAGvCQVyoCXpEIZ8JJUKANekgplwEtSoQx4SSqUAS9JhTLgJalQBrwkFcqAl6RCGfCSVCgDXpIKZcBLUqEMeEkqlAEvSYUy4CWpUAa8JBXKgJekQhnwklQoA16SCmXAS1KhDHhJKpQBL0mFMuAlqVAGvCQVqtGAj4jjI+LnEfGLiLiwybEkSW/UWMBHxM7A3wInAO8BTo+I9zQ1niTpjZqcwR8F/CIzf5mZrwLXAx9rcDxJ0giRmc0Ujvhj4PjMPLe6/CfA+zPzs5vcbw4wp7p4EPDzrZTeB3i2i63uSPV6ubdu1+vl3rpdr5d763a9Xu5tvOr9h8zs29wNE7rYSEcy8yrgqnbvHxH3Z+Zgt8bfker1cm/drtfLvXW7Xi/31u16vdxbL9ZrcolmDXDAiMv91XWSpG2gyYD/CfCuiJgWEbsApwHfb3A8SdIIjS3RZOb6iPgscDuwM3BNZq7oQum2l3Os12itXq/Xy711u14v99bter3cW8/Va+xDVknS+PKbrJJUKANekgq1XQV8Nw99EBHXRMTTEbG8C30dEBF3RcTKiFgREefXrDcpIu6LiKVVvUvr9ljV3TkifhoRt3Sh1uqIWBYRSyLi/pq19oqI70TEgxGxKiJ+r0atg6qeNp5+GxFza9T7fPUcLI+IhRExqdNaVb3zq1orOulrc9ttRLwtIn4YEQ9VP/euWe/Uqr/XImJMu+htod5l1XP7s4i4KSL26rTWiNsuiIiMiH1q9nZJRKwZsb3MrlnvhhG1VkfEkhq1Do+IH298jUXEUe329rrM3C5OtD6ofRh4B7ALsBR4T416M4EjgeVd6G0/4Mjq/B7Av9bsLYC3VOcnAouBD3Shz/8KfBu4pQu1VgP7dOm5/d/AudX5XYC9urjNPEnriyCdPH4K8Aiwa3X5/wBn1ejnUGA5sButHRz+EThwjDXetN0C/xO4sDp/IfCXNesdTOtLh3cDg13o7w+ACdX5v2y3vy29Rmntfn078KuxbINb6O0S4L91+HyOmiHAV4CLa/R2B3BCdX42cPdYe9yeZvBdPfRBZv4I+HU3GsvMJzLzger8C8AqWuHQab3MzBerixOrU61PwyOiHzgR+HqdOt0WEW+ltXFfDZCZr2bmc10qfwzwcGb+qkaNCcCuETGBVjA/XqPWwcDizPy3zFwP/DNwylgKbGG7/RitX5JUP/+wTr3MXJWZW/tG+Vjq3VH9ewF+TOs7MR3
Vqvw18OeM8TXRzdf81upFRAAfBxbWqJXAntX5t9LBtrc9BfwU4LERl4epEaJNiYipwBG0Zt116uxcvb17GvhhZtaqB/wNrRfFazXrbJTAHRExFK3DTXRqGvAM8I1q+ejrEbF7d1rkNNp8gW1OZq4BLgceBZ4Ans/MO2r0sxz4cERMjojdaM3KDtjKY9rx9sx8ojr/JPD2LtRsytnAbZ0+OCI+BqzJzKXda4nPVstH14xleWsrPgw8lZkP1agxF7gsIh6jtR1+cawFtqeA73kR8Rbgu8DczPxtnVqZuSEzD6c12zkqIg6t0ddJwNOZOVSnp018KDOPpHW00PMiYmaHdSbQemt6ZWYeAbxEa5mhlurLdScDN9aosTet2fE0YH9g94g4o9N6mbmK1hLFHcAiYAmwodN6WxgjqflurykRcRGwHriuw8fvBvx34OIutnUl8E7gcFq/xL/SpbqnU2NyUfk08PnMPAD4PNW73LHYngK+pw99EBETaYX7dZn5vW7VrZYr7gKOr1HmaODkiFhNa2lrVkT8fc2+1lQ/nwZuorWE1olhYHjEO5Tv0Ar8uk4AHsjMp2rU+H3gkcx8JjPXAd8DPlinqcy8OjNnZOZM4De0Pq+p66mI2A+g+vl0F2p2VUScBZwE/Ofql1An3knrl+3SalvuBx6IiN/ttK/MfKqaTL0G/B2db8evq5bzTgFuqFnqk7S2OWhNVMbc2/YU8D176INqve1qYFVm/lUX6vVt3NMgInYFjgUe7LReZn4xM/szcyqt/7c7M7PjmWhE7B4Re2w8T+tDtI72RsrMJ4HHIuKg6qpjgJWd9jZCN2ZQjwIfiIjdquf4GFqfr3QsIvatfg7QCoFv1+wRWq+DT1bnPwnc3IWaXRMRx9NaHjw5M/+t0zqZuSwz983MqdW2PExr54Yna/S234iLf0SH2/Emfh94MDOHa9Z5HPhIdX4WMPblnk4+PR6vE601y3+ltTfNRTVrLaT1lmwdrQ3lnBq1PkTrbfHPaL3tXgLMrlHvvcBPq3rLafOT+DZrf5Sae9HQ2pNpaXVa0YXn4nDg/urf+w/A3jXr7Q6sBd7ahf+vS2n9cl0OfAv4nZr1/h+tX2BLgWM6ePybtltgMvBPVQD8I/C2mvX+qDr/CvAUcHvNer+g9fnZxtfGgk5rbXL7asa2F83mevsWsKza9r4P7FenXnX9N4FPdeF5/RAwVG0ri4EZY91ePFSBJBVqe1qikSSNgQEvSYUy4CWpUAa8JBXKgJekQhnwklQoA16SCmXAS1sQEe+rDkI1qfr27oo6xwSStjW/6CSNIiK+DEwCdqV1zJy/GOeWpLYZ8NIoquMe/QR4GfhgZnb16I9Sk1yikUY3GXgLrb/UVevP9UnbmjN4aRQR8X1ah1ieRutAVJ8d55aktk0Y7wakXhURZwLrMvPbEbEzcE9EzMrMO8e7N6kdzuAlqVCuwUtSoQx4SSqUAS9JhTLgJalQBrwkFcqAl6RCGfCSVKj/D/kH3ZIM5ertAAAAAElFTkSuQmCC\n", + "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXgAAAEGCAYAAABvtY4XAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVfUlEQVR4nO3dfZRddX3v8feXJNyQECAkI81jkyvPiIIEpdKybJB7gVKoKAirEVNub64YBApG7UUppHb1Ae2tq95FLirSKCKCIFgooKLeWm1oEoNJiIpUDBOeYhQEeiEPfO8fZycMIZmcOefsOTO/vF9rzZpzzj77u79nZs9n9tn7t/eJzESSVJ49ut2AJKkeBrwkFcqAl6RCGfCSVCgDXpIKNbLbDfQ1ceLEnDFjRrfbkKRhY9myZb/IzJ4dTRtSAT9jxgyWLl3a7TYkadiIiJ/vbJq7aCSpUAa8JBXKgJekQg2pffA7smnTJnp7e3nhhRe63cpOjR49mqlTpzJq1KhutyJJ2wz5gO/t7WXcuHHMmDGDiOh2O6+SmWzYsIHe3l5mzpzZ7XYkaZshv4vmhRdeYMKECUMy3AEiggkTJgzpdxiSdk9DPuCBIRvuWw31/iTtnoZFwEuSBm63DPhFixaxePHibrchSbUa8gdZO23z5s28973vHfB8Dz76i51OO3zaxHZakqRaDNuAf/755zn77LPp7e1ly5YtfPSjH+XAAw/k0ksv5bnnnmPixIlcf/31TJo0ibe+9a0cddRRfPe73+Xcc8/l2WefZe+99+YDH/gADz/8MPPnz2f9+vWMGTOGT3/60xx66KHcfPPNXHXVVYwYMYJ9992XRTfc2u2XLEkDMmwD/u6772by5MnceeedADzzzDOccsop3H777fT09HDTTTdx+eWXc9111wGwcePGbde5ufLKK7fVmTdvHosWLeKggw5iyZIlvO997+O+++5j4cKF3HPPPUyZMoWnn36ax57dPOivUZLaMWwD/sgjj+Syyy7jQx/6EKeddhrjx49n1apVnHTSSQBs2bKFSZMmbXv+u971rlfVeO655/je977HWWedte2xF198EYDjjz+euXPncvbZZ3PmmWfW/GokqfOGbcAffPDBLF++nLvuuouPfOQjzJ49myOOOILvf//7O3z+2LFjX/XYSy+9xH777ceKFSteNW3RokUsWbKEO++8k2OOOYYv3nEv+43fv9MvQ5JqM2xH0Tz22GOMGTOGOXPmsGDBApYsWcL69eu3BfymTZtYvXp1vzX22WcfZs6cyc033ww0zkp94IEHAHj44Yd585vfzMKFC+np6eHxx9bV+4IkqcOG7Rb8ypUrWbBgAXvssQejRo3immuuYeTIkVx00UU888wzbN68mUsuuYQjjjii3zo33HADF1xwAR/72MfYtGkT55xzDm94wxtYsGABDz30EJnJiSeeyKGHv26QXpkkdUZkZrd72GbWrFm5/Qd+rFmzhsMOO6xLHb1sV8Mkh0qfknYvEbEsM2ftaNqw3UUjSeqfAS9JhTLgJalQBrwkFcqAl6RCGfCSVKhhNw7+mAWdvczvsqvP2+Vz7r77bi6YfyFbtmzhHefM4b/Pv7ijPUhSHdyC34UtW7Ywf/58Fv3Dl7jjm//CXXfcxk9/8uNutyVJu2TA78L999/PgQceyLTfnMGee+7Jqb//B3zr3n/qdluStEsG/C6sW7eOadOmbbt/wKTJPPnk413sSJKaY8BLUqEM+F2YMmUKjz766Lb7Tz7+GAccMKmfOSRpaKg14CPiTyJidUSsiogbI2J0ncurw7HHHstDDz1E79qfs3HjRu762lf53ZNO7nZbkrRLtQ2TjIgpwEXA4Zn5/yLiy8A5wPXt1G1mWGMnjRw5kk996lPMe/fZvLTlJd7+rnM58JBDB7UHSWpF3ePgRwJ7RcQmYAzwWM3Lq8Wpp57KXd95U7fbkKQBqW0XTWauAz4OrAUeB57JzHu3f15EzIuIpRGxdP3
\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] @@ -3607,13 +3633,15 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 177, "id": "a2dee6dc-0f17-4b33-9431-c6dc2d2bd418", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { - "image/png": "\n", + "image/png": "\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] @@ -3632,7 +3660,9 @@ { "cell_type": "markdown", "id": 
"18902dd3-022a-447a-9681-6713b2e9296e", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "What could possibly go wrong?\n", "\n", @@ -3643,10 +3673,155 @@ "On the other side, Spearman coefficient is based on ranks and may catch less intuitive patterns." ] }, + { + "cell_type": "markdown", + "id": "37816425", + "metadata": { + "heading_collapsed": true, + "hidden": true + }, + "source": [ + "### Linear regression" + ] + }, + { + "cell_type": "markdown", + "id": "8abb81b1", + "metadata": { + "hidden": true + }, + "source": [ + "As mentioned earlier, a correlation coefficient is not directly related to the regression lines we routinely plot to illustrate the relationship that the coefficient is supposed to quantify.\n", + "\n", + "Linear regression offers another approach for quantifying the association between two quantitative variables." + ] + }, + { + "cell_type": "code", + "execution_count": 180, + "id": "fb9d664f", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(5.43327239488117, 0.5804387568555759, 0.01834666627695083)" + ] + }, + "execution_count": 180, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "slope, intercept, R, pvalue, slope_std_err = stats.linregress(x1, x2)\n", + "intercept, slope, pvalue" + ] + }, + { + "cell_type": "markdown", + "id": "8cfc5265", + "metadata": { + "hidden": true + }, + "source": [ + "The $p$-value returned by [linregress](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html) is for a test of the null hypothesis $H_0$: the slope is $0$.\n", + "\n", + "If the slope is $0$, there is no linear association between the variables.\n", + "\n", + "A linear regression is also a model that can predict the value of one variable from the value of the other variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 182, + "id": "7f69b66e", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "8.335466179159049" + ] 
+ }, + "execution_count": 182, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x1_observed = 5\n", + "x2_predicted = intercept + slope * x1_observed\n", + "x2_predicted" + ] + }, + { + "cell_type": "code", + "execution_count": 185, + "id": "14836378", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7.867716535433071" + ] + }, + "execution_count": 185, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x2_observed = 10\n", + "x1_predicted = (x2_observed - intercept) / slope\n", + "x1_predicted" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "id": "51bff28b", + "metadata": { + "hidden": true + }, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "<Figure size 432x288 with 1 Axes>" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "ax = sns.regplot(x=x1, y=x2)\n", + "x1_min, _ = ax.get_xlim()\n", + "x2_min, x2_max = ax.get_ylim()\n", + "ax.plot([x1_min, x1_observed, x1_observed], [x2_predicted, x2_predicted, x2_min], 'r:', linewidth=1)\n", + "ax.plot([x1_min, x1_predicted, x1_predicted], [x2_observed, x2_observed, x2_min], 'r:', linewidth=1)\n", + "ax.set_ylim([x2_min, x2_max])\n", + "ax.set_xlabel('x1')\n", + "ax.set_ylabel('x2');" + ] + }, { "cell_type": "markdown", "id": "9f26feda-d0fd-4d9d-a748-b0b82d0f84b4", - "metadata": {}, + "metadata": { + "heading_collapsed": true + }, "source": [ "## Effect sizes and test power" ] @@ -3654,7 +3829,9 @@ { "cell_type": "markdown", "id": "5d8555e2-f91e-4ed8-b518-d95b1f46aa11", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "`scipy.stats` does not offer any helper for effect size and power calculation.\n", "\n", @@ -3666,6 +3843,7 @@ "execution_count": 55, "id": "78c838fa-6eff-459a-867c-b9e930b0cebd", "metadata": { + "hidden": true, "jupyter": { "outputs_hidden": true }, @@ -3714,7 +3892,9 @@ "cell_type": 
"code", "execution_count": 56, "id": "5eaf887e-9310-4b89-acca-ccde9dff5f02", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [], "source": [ "from statsmodels.stats import power\n", @@ -3725,7 +3905,8 @@ "cell_type": "markdown", "id": "5532ddbe-202e-48f5-96fd-825c465a503f", "metadata": { - "heading_collapsed": true + "heading_collapsed": true, + "hidden": true }, "source": [ "### Effect sizes" @@ -3788,7 +3969,10 @@ { "cell_type": "markdown", "id": "3b30414c-6575-4dff-9637-664cb0f29a5a", - "metadata": {}, + "metadata": { + "heading_collapsed": true, + "hidden": true + }, "source": [ "### Power analysis" ] @@ -3796,7 +3980,9 @@ { "cell_type": "markdown", "id": "9b54a050-85cc-44c3-b406-650734d0d2ec", - "metadata": {}, + "metadata": { + "hidden": true + }, "source": [ "Prior to collecting data, in the presence of preliminary data to roughly predict the expected effect size, one can estimate the sample size necessary for a test to detect such an effect.\n", "\n", @@ -3817,7 +4003,9 @@ "cell_type": "code", "execution_count": 169, "id": "de47ba11-f486-44e3-9f3b-9ddffacc95ee", - "metadata": {}, + "metadata": { + "hidden": true + }, "outputs": [ { "data": { @@ -3843,8 +4031,10 @@ { "cell_type": "code", "execution_count": null, - "id": "fdbaf062", - "metadata": {}, + "id": "084b9635", + "metadata": { + "hidden": true + }, "outputs": [], "source": [] } -- GitLab