{"cells":[{"cell_type":"markdown","metadata":{"id":"qDa0LVC-aDnz"},"source":["# Tutorial 1: Introduction to Pandas"]},{"cell_type":"markdown","metadata":{"id":"66pVTgxJaDn1"},"source":["## Tools"]},{"cell_type":"markdown","metadata":{"id":"wq0BVHO6aDn2"},"source":["### Anaconda\n","An easy way to get a **local** Python environment with the most common libraries is to install *Anaconda*. To do so:\n","\n","- Download the latest version of Python from the following link: https://www.python.org/downloads/\n","- Download the latest version of Anaconda from the following link: https://www.anaconda.com/distribution/\n","- You can test your installation by running `python` in an `Anaconda Prompt` terminal and checking that it prints something like `Python 3.7.6 |Anaconda 4.9.0` at the start. Optionally, if you want to run the Anaconda tools from the system terminal, make sure Anaconda's `bin` directory is on your PATH (Windows guide, *Add Anaconda to Path (Optional)*: https://www.datacamp.com/community/tutorials/installing-anaconda-windows).\n","\n","**Installing libraries:**\n","Anaconda makes it much easier to install the libraries we will use in this lab. Installing the libraries (`scikit-learn`, `jupyter`) from scratch can be somewhat complicated, so installing Anaconda is highly recommended for these lab sessions.\n","\n","1. Open the Anaconda Prompt application.\n","2. Run the command: conda install *library*\n","\n","For this tutorial, install the libraries *numpy*, *scikit-learn*, *pandas*, *matplotlib*, and *seaborn*."]},{"cell_type":"markdown","metadata":{"id":"paF0ayd_aDn4"},"source":["### Jupyter\n","\n","**Jupyter Notebook** (included with Anaconda) is a web application for creating documents that contain Python code, similar to R Notebooks or R Markdown. For this tutorial and the labs we will use a **notebook**, and you must write your answers in that same file.\n","\n","To load and edit an .ipynb file, open the terminal and run `jupyter notebook`. This opens the browser, where you can look for the .ipynb file in the directory. TIP: press Shift-Enter to run each cell of the notebook.\n","\n","\n","To export the notebook to **HTML**, run the following command from the Anaconda console:\n","\n","`jupyter nbconvert nombre_archivo.ipynb --to html`\n","\n","A simpler option is to download it from the notebook itself, by clicking:\n","*File -> Download as -> HTML (.html)*"]},{"cell_type":"markdown","metadata":{"id":"6v6J9KgeaDn5"},"source":["## Google Colab\n","\n","Although we will use a local notebook (because it is important that you become familiar with Anaconda), you should know about Colaboratory, also called \"Colab\", which is essentially a Jupyter notebook with the following advantages:\n","- It requires no setup\n","- It gives free access to GPUs\n","- It makes sharing content easy"]},{"cell_type":"markdown","metadata":{"id":"T9RatKEEaDn5"},"source":["## Data Frames\n","A data frame is a table with rows and columns. Every column must have a name; rows can also be named, but this is not recommended."]},{"cell_type":"markdown","metadata":{"id":"GUNrAjN_aDn6"},"source":["### Introduction to Pandas"]},{"cell_type":"markdown","metadata":{"id":"lG0Q3q3HaDn6"},"source":["Pandas is a fast, powerful, flexible, and easy-to-use open-source tool for data manipulation and analysis. This Python package provides data structures similar to R's data frames (tables with rows of observations and columns of variables).\n","\n","Pandas provides efficient mechanisms for working with different data formats such as CSV (comma-separated values) files, Excel files, or databases.\n","\n","The two main data structures in Pandas are **Series** (a one-dimensional labeled array of homogeneous type) and **DataFrame** (a two-dimensional data structure whose columns can hold different data types). We can think of the Pandas data structures as flexible containers for lower-dimensional data. For example, a DataFrame is a container for Series, and a Series is a container for scalars."]},{"cell_type":"code","execution_count":1,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"pMXzA0PDaDn6","executionInfo":{"status":"ok","timestamp":1742188113599,"user_tz":240,"elapsed":1190,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"b0f85e3d-f932-4995-c5bf-5103ca3e12c7"},"outputs":[{"output_type":"stream","name":"stdout","text":[" x y voltaje\n","0 10 a 1\n","1 20 b 1\n","2 30 c 1\n"]}],"source":["import pandas as pd\n","\n","# Define a DataFrame with three columns: 'x', 'y', and an additional column 'voltaje'\n","data = {'x': [10, 20, 30], 'y': ['a', 'b', 'c'], 'voltaje': [1, 1, 1]}\n","df = pd.DataFrame(data)\n","\n","# Print the whole DataFrame; note how the column headers are created.\n","print(df)\n"]},{"cell_type":"code","execution_count":2,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":178},"id":"eJp7FwCqaDn9","executionInfo":{"status":"ok","timestamp":1742188117036,"user_tz":240,"elapsed":16,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"7c349e77-30bd-4fa0-80c8-255d3f71f734"},"outputs":[{"output_type":"execute_result","data":{"text/plain":["0 10\n","1 20\n","2 30\n","Name: x, dtype: int64"],"text/html":["
"]},"metadata":{},"execution_count":2}],"source":["# Show only column 'x'.\n","df['x']"]},{"cell_type":"code","execution_count":3,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":178},"id":"cxcLlZHgaDn_","executionInfo":{"status":"ok","timestamp":1742188119321,"user_tz":240,"elapsed":53,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"c757f1ea-4870-40fe-9b0d-573db287b1f3"},"outputs":[{"output_type":"execute_result","data":{"text/plain":["0 a\n","1 b\n","2 c\n","Name: y, dtype: object"],"text/html":["
"]},"metadata":{},"execution_count":3}],"source":["# Show only column 'y'.\n","df['y']"]},{"cell_type":"code","execution_count":4,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"WTEpx-F6aDoA","executionInfo":{"status":"ok","timestamp":1742188121467,"user_tz":240,"elapsed":51,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"3bd14f02-6d7c-4417-9c04-c5f825470692"},"outputs":[{"output_type":"execute_result","data":{"text/plain":["(3, 3)"]},"metadata":{},"execution_count":4}],"source":["# Number of rows and columns of df\n","df.shape"]},{"cell_type":"code","execution_count":5,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"FtQzNyS2aDoA","executionInfo":{"status":"ok","timestamp":1742188124077,"user_tz":240,"elapsed":11,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"a1ef7ccf-6886-4b08-ccec-b88d9b8bd1fa"},"outputs":[{"output_type":"stream","name":"stdout","text":["3\n","3\n"]}],"source":["# Number of rows of df.\n","# Using the shape attribute\n","num_filas = df.shape[0]\n","print(num_filas)\n","\n","# Using the len function\n","num_filas = len(df)\n","print(num_filas)\n"]},{"cell_type":"code","execution_count":6,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"YtxMam_1aDoB","executionInfo":{"status":"ok","timestamp":1742188129809,"user_tz":240,"elapsed":40,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"87e8a779-4835-48d9-bb43-e73f7622554a"},"outputs":[{"output_type":"stream","name":"stdout","text":["3\n","3\n"]}],"source":["# Number of columns of df.\n","\n","# Using the shape attribute\n","num_columnas = df.shape[1]\n","print(num_columnas)\n","\n","# Using the len function together with columns\n","num_columnas = len(df.columns)\n","print(num_columnas)\n"]},{"cell_type":"markdown","metadata":{"id":"rq3lzckkaDoC"},"source":["## Example: Traffic Accident Data in Chile\n","\n","We will use data on traffic accidents in Chile for the years 2010 and 2011.\n","\n","You can download the data to your computer from the following addresses:\n","\n","\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"mCfS-4P_aDoC"},"source":["You can download them or load them remotely.\n","\n","If you download them:"]},{"cell_type":"code","execution_count":7,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":356},"id":"TjuA3SvtaDoC","executionInfo":{"status":"error","timestamp":1742188146180,"user_tz":240,"elapsed":146,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"dee01255-bc96-450f-97fb-17f67f7c4972"},"outputs":[{"output_type":"error","ename":"FileNotFoundError","evalue":"[Errno 2] No such file or directory: 'accidentes_2010_2011.txt'","traceback":["FileNotFoundError: [Errno 2] No such file or directory: 'accidentes_2010_2011.txt'"]}],"source":["\n","# Read the file 'accidentes_2010_2011.txt' into a DataFrame called 'tipos'\n","# (the files are space-separated, so sep=' ' is passed as in the remote version)\n","tipos = pd.read_table(\"accidentes_2010_2011.txt\", sep=' ')\n","\n","# Read the file 'afectados_2010_2011.txt' into a DataFrame called 'afectados'\n","afectados = pd.read_table(\"afectados_2010_2011.txt\", sep=' ')\n"]},{"cell_type":"markdown","metadata":{"id":"w3_cnvCAaDoD"},"source":["To load the data remotely:"]},{"cell_type":"code","execution_count":8,"metadata":{"id":"SWa74bD3aDoD","executionInfo":{"status":"ok","timestamp":1742188178702,"user_tz":240,"elapsed":2041,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[],"source":["\n","# Read the file 'accidentes_2010_2011.txt' into a DataFrame called 'tipos'\n","tipos = pd.read_table(\"https://users.dcc.uchile.cl/~hsarmien/mineria/datasets/accidentes_2010_2011.txt\", sep=' ')\n","\n","# Read the file 'afectados_2010_2011.txt' into a DataFrame called 'afectados'\n","afectados = pd.read_table(\"https://users.dcc.uchile.cl/~hsarmien/mineria/datasets/afectados_2010_2011.txt\", sep=' ')\n"]},{"cell_type":"markdown","metadata":{"id":"CqIkzpNAaDoD"},"source":["This last option is convenient because the files are small.\n","\n","Whenever a dataset lands in your hands, the first step is an initial review to understand how the data is structured. This means understanding how many records there are, how many columns, what each column describes, the data type of each column, and whether the data is normalized, among other things.\n","\n","In our case, the tipos dataset contains the frequency of the different types of accidents that occurred in Chile in 2010 and 2011. The afectados dataset, in turn, records the severity outcomes of the accidents in Chile. The two datasets naturally complement each other.\n"]},{"cell_type":"markdown","metadata":{"id":"fftQTXSQaDoE"},"source":["### Attributes of a dataset"]},{"cell_type":"code","execution_count":9,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"OJJwfrCFaDoE","executionInfo":{"status":"ok","timestamp":1742188181145,"user_tz":240,"elapsed":54,"user":{"displayName":"rob","userId":"11204549018487637648"}},"outputId":"3c3a0b33-f920-4011-c308-ffc6a5fc4989"},"outputs":[{"output_type":"stream","name":"stdout","text":["\n","Index: 4296 entries, 1 to 4296\n","Data columns (total 5 columns):\n"," # Column Non-Null Count Dtype \n","--- ------ -------------- ----- \n"," 0 Muestra 4296 non-null object\n"," 1 Descripcion 4296 non-null object\n"," 2 Anio 4296 non-null int64 \n"," 3 TipoAccidente 4296 non-null object\n"," 4 Cantidad 4296 non-null int64 \n","dtypes: int64(2), object(3)\n","memory usage: 201.4+ KB\n"]}],"source":["tipos.info()"]},{"cell_type":"markdown","metadata":{"id":"-1ok2EwmaDoF"},"source":["This shows information about the DataFrame, including the column names, the number of non-null entries, and the data type of each column.\n","\n","Now, the same for afectados:"]},{"cell_type":"code","execution_count":10,"metadata":{"id":"SdvHcTH3aDoG","outputId":"bfeb1580-4171-43f4-9fac-cfc7ef82663b","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188183037,"user_tz":240,"elapsed":128,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"stream","name":"stdout","text":["\n","Index: 2864 entries, 1 to 2864\n","Data columns (total 5 columns):\n"," # Column Non-Null Count Dtype \n","--- ------ -------------- ----- \n"," 0 Muestra 2864 non-null object\n"," 1 Descripcion 2864 non-null object\n"," 2 Anio 2864 non-null int64 \n"," 3 Estado 2864 non-null object\n"," 4 Cantidad 2864 non-null int64 \n","dtypes: int64(2), object(3)\n","memory usage: 134.2+ KB\n"]}],"source":["afectados.info()"]},{"cell_type":"markdown","metadata":{"id":"-XLllFKxaDoH"},"source":["### The head function\n","\n","The head function gives us an idea of what the data looks like: it shows the first 5 rows of the dataset along with the header of each attribute. This is useful for checking whether the data was loaded correctly (showing a few rows is better than showing the entire dataset). It also accepts an integer argument specifying the number of rows you would like to get."]},{"cell_type":"code","execution_count":11,"metadata":{"id":"T8DygWqyaDoH","outputId":"30fe8eeb-74d7-4714-f8e0-38e9a8006257","colab":{"base_uri":"https://localhost:8080/","height":206},"executionInfo":{"status":"ok","timestamp":1742188184593,"user_tz":240,"elapsed":62,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Muestra Descripcion Anio TipoAccidente Cantidad\n","1 Nacional Nacional 2010 Atropello 8247\n","2 Nacional Nacional 2011 Atropello 8339\n","3 Regional XV Región Arica y Parinacota 2010 Atropello 115\n","4 Regional XV Región Arica y Parinacota 2011 Atropello 159\n","5 Comunal ARICA 2010 Atropello 115"],"text/html":["\n","
\n"]},"metadata":{},"execution_count":11}],"source":["tipos.head()"]},{"cell_type":"markdown","metadata":{"id":"3gqcUdZsaDoH"},"source":["### The describe function\n","The describe function computes summary statistics for each column. In particular, it reports the mean, the median (the 50% row), the quartiles, and the maximum and minimum values, among others."]},{"cell_type":"code","execution_count":12,"metadata":{"id":"1nEdqG9UaDoI","outputId":"5c5077fb-ca3b-4850-cf0e-bbb6d82b49bb","colab":{"base_uri":"https://localhost:8080/","height":394},"executionInfo":{"status":"ok","timestamp":1742188186635,"user_tz":240,"elapsed":79,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Muestra Descripcion Anio TipoAccidente Cantidad\n","count 4296 4296 4296.000000 4296 4296.000000\n","unique 3 359 NaN 6 NaN\n","top Comunal Nacional NaN Atropello NaN\n","freq 4104 12 NaN 716 NaN\n","mean NaN NaN 2010.500000 NaN 84.203911\n","std NaN NaN 0.500058 NaN 835.751218\n","min NaN NaN 2010.000000 NaN 0.000000\n","25% NaN NaN 2010.000000 NaN 1.000000\n","50% NaN NaN 2010.500000 NaN 5.000000\n","75% NaN NaN 2011.000000 NaN 20.000000\n","max NaN NaN 2011.000000 NaN 31487.000000"],"text/html":["\n","
\n"]},"metadata":{},"execution_count":12}],"source":["tipos.describe(include='all')"]},{"cell_type":"code","execution_count":13,"metadata":{"id":"VcE_4QixaDoI","outputId":"14f62294-e182-471c-a778-ab57141320b8","colab":{"base_uri":"https://localhost:8080/","height":394},"executionInfo":{"status":"ok","timestamp":1742188189129,"user_tz":240,"elapsed":130,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","count 2864 2864 2864.000000 2864 2864.000000\n","unique 3 358 NaN 4 NaN\n","top Comunal Nacional NaN Muertos NaN\n","freq 2736 8 NaN 716 NaN\n","mean NaN NaN 2010.500000 NaN 115.583799\n","std NaN NaN 0.500087 NaN 1220.347115\n","min NaN NaN 2010.000000 NaN 0.000000\n","25% NaN NaN 2010.000000 NaN 3.000000\n","50% NaN NaN 2010.500000 NaN 9.000000\n","75% NaN NaN 2011.000000 NaN 32.000000\n","max NaN NaN 2011.000000 NaN 43034.000000"],"text/html":["\n","
\n"]},"metadata":{},"execution_count":13}],"source":["afectados.describe(include='all')"]},{"cell_type":"markdown","metadata":{"id":"qIX92qCKaDoI"},"source":["We can also compute each statistic separately using the following functions:"]},{"cell_type":"code","execution_count":14,"metadata":{"id":"3VvdhkB_aDoJ","outputId":"195b3362-bcdb-400b-de0f-d5ff32396337","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188191191,"user_tz":240,"elapsed":9,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["84.20391061452514"]},"metadata":{},"execution_count":14}],"source":["tipos['Cantidad'].mean()"]},{"cell_type":"code","execution_count":15,"metadata":{"id":"vmOXGeBLaDoJ","outputId":"627e0009-52a1-4bbf-feb4-e061bdc8ec02","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188191941,"user_tz":240,"elapsed":9,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["835.7512175536759"]},"metadata":{},"execution_count":15}],"source":["# standard deviation\n","tipos['Cantidad'].std()"]},{"cell_type":"code","execution_count":16,"metadata":{"id":"Zx_O7B0SaDoK","outputId":"182399e5-5117-4bea-e74e-cc21adc043a3","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188192672,"user_tz":240,"elapsed":23,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["0"]},"metadata":{},"execution_count":16}],"source":["# minimum (use max() for the maximum)\n","tipos['Cantidad'].min()"]},{"cell_type":"code","execution_count":17,"metadata":{"id":"rutj99qgaDoK","outputId":"4e64044e-878c-45d0-de34-21576697d65a","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188193800,"user_tz":240,"elapsed":56,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["5.0"]},"metadata":{},"execution_count":17}],"source":["# median\n","tipos['Cantidad'].median()"]},{"cell_type":"code","execution_count":19,"metadata":{"id":"uiHugERaaDoL","outputId":"6791267f-c272-4016-cb7d-5e45080ea02b","colab":{"base_uri":"https://localhost:8080/","height":241},"executionInfo":{"status":"ok","timestamp":1742188195865,"user_tz":240,"elapsed":50,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["0.00 0.0\n","0.25 1.0\n","0.50 5.0\n","0.75 20.0\n","1.00 31487.0\n","Name: Cantidad, dtype: float64"],"text/html":["
"]},"metadata":{},"execution_count":19}],"source":["# quantiles: the value at quantile $q$ is greater than or equal to a fraction $q$ of the data\n","tipos['Cantidad'].quantile([0,0.25, 0.5, 0.75, 1])"]},{"cell_type":"code","execution_count":20,"metadata":{"id":"O4mg3fAiaDoN","outputId":"50df75ec-5cd1-4d17-f92e-e1b05c63a64d","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188196670,"user_tz":240,"elapsed":89,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["19.0"]},"metadata":{},"execution_count":20}],"source":["# interquartile range: quartile 3 minus quartile 1 (Q3 - Q1), i.e. quantile 0.75 minus quantile 0.25\n","\n","q1,q3 = tipos['Cantidad'].quantile([0.25,0.75])\n","q3-q1"]},{"cell_type":"markdown","metadata":{"id":"cT4FqdwbaDoN"},"source":["### Queries on data frames (projection and filtering)\n","\n","To project (select) columns of a table, we put the column name inside [ ]. (We will usually call head to keep this manual short.)"]},{"cell_type":"code","execution_count":21,"metadata":{"id":"hO4ntX45aDoN","outputId":"5ed1a3fd-e242-4ae6-da5a-a00ad866c092","colab":{"base_uri":"https://localhost:8080/","height":241},"executionInfo":{"status":"ok","timestamp":1742188198022,"user_tz":240,"elapsed":64,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["1 8247\n","2 8339\n","3 115\n","4 159\n","5 115\n","Name: Cantidad, dtype: int64"],"text/html":["
"]},"metadata":{},"execution_count":21}],"source":["# show only the Cantidad column\n","# note that the result of this operation is a pandas Series, not a DataFrame\n","\n","tipos['Cantidad'].head()"]},{"cell_type":"code","execution_count":22,"metadata":{"id":"LXEWeqdfaDoO","outputId":"b3e61653-c940-4473-f696-97a34e6328c9","colab":{"base_uri":"https://localhost:8080/","height":423},"executionInfo":{"status":"ok","timestamp":1742188199925,"user_tz":240,"elapsed":97,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Cantidad TipoAccidente\n","1 8247 Atropello\n","2 8339 Atropello\n","3 115 Atropello\n","4 159 Atropello\n","5 115 Atropello\n","... ... ...\n","4292 2 Otros\n","4293 1 Otros\n","4294 2 Otros\n","4295 5 Otros\n","4296 14 Otros\n","\n","[4296 rows x 2 columns]"],"text/html":["\n","
\n"],"application/vnd.google.colaboratory.intrinsic+json":{"type":"dataframe","summary":"{\n \"name\": \"tipos[['Cantidad','TipoAccidente']]\",\n \"rows\": 4296,\n \"fields\": [\n {\n \"column\": \"Cantidad\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 835,\n \"min\": 0,\n \"max\": 31487,\n \"num_unique_values\": 390,\n \"samples\": [\n 108,\n 226,\n 228\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"TipoAccidente\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 6,\n \"samples\": [\n \"Atropello\",\n \"Caida\",\n \"Otros\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"}},"metadata":{},"execution_count":22}],"source":["# more than one column can be selected by passing a list of names\n","\n","tipos[['Cantidad','TipoAccidente']]"]},{"cell_type":"markdown","metadata":{"id":"3NXFgseIaDoO"},"source":["To pick out rows as well, we chain the two selections as [column][row]: first the column name, then the row's index label."]},{"cell_type":"code","execution_count":23,"metadata":{"id":"xMKw8fJ0aDoO","outputId":"e3a7e8a7-0f26-45d1-e6b6-8dce76f2bc5e","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188201211,"user_tz":240,"elapsed":64,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["115"]},"metadata":{},"execution_count":23}],"source":["# value of the Cantidad column in the row whose index label is 5\n","tipos['Cantidad'][5]"]},{"cell_type":"code","execution_count":24,"metadata":{"id":"d-KD1zkEaDoP","outputId":"9bedcc4b-4ddc-4d24-89d3-107e275889f2","colab":{"base_uri":"https://localhost:8080/","height":241},"executionInfo":{"status":"ok","timestamp":1742188202337,"user_tz":240,"elapsed":52,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["Muestra Regional\n","Descripcion XV Región Arica y Parinacota\n","Anio 2010\n","TipoAccidente Atropello\n","Cantidad 115\n","Name: 3, dtype: object"],"text/html":["
"]},"metadata":{},"execution_count":24}],"source":["# Select the entire third row\n","# iloc selects rows by integer position (counting from 0), so position 2 is the third row\n","tipos.iloc[2, :]"]},{"cell_type":"code","execution_count":25,"metadata":{"id":"RQf0XdwKaDoP","outputId":"3632e6d3-db7f-4006-ee2a-ea0079511b42","colab":{"base_uri":"https://localhost:8080/","height":237},"executionInfo":{"status":"ok","timestamp":1742188213644,"user_tz":240,"elapsed":51,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Anio TipoAccidente\n","1 2010 Atropello\n","2 2011 Atropello\n","3 2010 Atropello\n","4 2011 Atropello\n","5 2010 Atropello\n","6 2011 Atropello"],"text/html":["\n","
\n"],"application/vnd.google.colaboratory.intrinsic+json":{"type":"dataframe","summary":"{\n \"name\": \"tipos[['Anio','TipoAccidente']][:6]\",\n \"rows\": 6,\n \"fields\": [\n {\n \"column\": \"Anio\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 0,\n \"min\": 2010,\n \"max\": 2011,\n \"num_unique_values\": 2,\n \"samples\": [\n 2011,\n 2010\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"TipoAccidente\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Atropello\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"}},"metadata":{},"execution_count":25}],"source":["# Show the first 6 rows of the selected columns\n","tipos[['Anio','TipoAccidente']][:6]"]},{"cell_type":"markdown","metadata":{"id":"O2IPMp2jaDoP"},"source":["We can also build conditions, or filters:"]},{"cell_type":"code","execution_count":26,"metadata":{"id":"Zm_pL10OaDoP","outputId":"8550f9aa-2285-489b-80fe-88649cf091a4","colab":{"base_uri":"https://localhost:8080/","height":458},"executionInfo":{"status":"ok","timestamp":1742188215191,"user_tz":240,"elapsed":10,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["1 True\n","2 False\n","3 True\n","4 False\n","5 True\n"," ... \n","2860 False\n","2861 True\n","2862 False\n","2863 True\n","2864 False\n","Name: Anio, Length: 2864, dtype: bool"],"text/html":["
"]},"metadata":{},"execution_count":26}],"source":["# For each value in the Anio column, report whether it equals 2010 (as True or False)\n","afectados[\"Anio\"] == 2010"]},{"cell_type":"code","execution_count":27,"metadata":{"id":"zbyTf5gaaDoQ","outputId":"294d9688-9215-40d0-f47e-09f878e7a866","colab":{"base_uri":"https://localhost:8080/","height":423},"executionInfo":{"status":"ok","timestamp":1742188216895,"user_tz":240,"elapsed":84,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","1 Nacional Nacional 2010 Muertos 1595\n","3 Regional XV Región Arica y Parinacota 2010 Muertos 28\n","5 Comunal ARICA 2010 Muertos 24\n","7 Comunal CAMARONES 2010 Muertos 2\n","9 Comunal PUTRE 2010 Muertos 2\n","... ... ... ... ... ...\n","2855 Comunal TALAGANTE 2010 Leves 279\n","2857 Comunal EL MONTE 2010 Leves 86\n","2859 Comunal ISLA DE MAIPO 2010 Leves 112\n","2861 Comunal PADRE HURTADO 2010 Leves 98\n","2863 Comunal PENAFLOR 2010 Leves 249\n","\n","[1432 rows x 5 columns]"],"text/html":["\n","
\n"],"application/vnd.google.colaboratory.intrinsic+json":{"type":"dataframe","summary":"{\n \"name\": \"afectados[afectados[\\\"Anio\\\"] == 2010]\",\n \"rows\": 1432,\n \"fields\": [\n {\n \"column\": \"Muestra\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3,\n \"samples\": [\n \"Nacional\",\n \"Regional\",\n \"Comunal\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Descripcion\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 358,\n \"samples\": [\n \"PUCON\",\n \"CANELA\",\n \"LLANQUIHUE\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Anio\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 0,\n \"min\": 2010,\n \"max\": 2010,\n \"num_unique_values\": 1,\n \"samples\": [\n 2010\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Estado\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 4,\n \"samples\": [\n \"Graves\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Cantidad\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 1203,\n \"min\": 0,\n \"max\": 41744,\n \"num_unique_values\": 234,\n \"samples\": [\n 65\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"}},"metadata":{},"execution_count":27}],"source":["# now use the previous result to keep only the rows that are True, as a new data frame\n","afectados[afectados[\"Anio\"] == 2010]"]},{"cell_type":"code","execution_count":28,"metadata":{"id":"UbvCPR0OaDoQ","outputId":"19254aee-cb98-49d0-c391-7214b15512d4","colab":{"base_uri":"https://localhost:8080/","height":241},"executionInfo":{"status":"ok","timestamp":1742188218958,"user_tz":240,"elapsed":82,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["Muestra 1432\n","Descripcion 1432\n","Anio 1432\n","Estado 1432\n","Cantidad 1432\n","dtype: int64"],"text/html":["
"]},"metadata":{},"execution_count":28}],"source":["# count the non-missing values in each column for the 2010 rows\n","afectados[afectados[\"Anio\"] == 2010].count()"]},{"cell_type":"markdown","metadata":{"id":"-5K_Cwo7aDoQ"},"source":["A useful way to count how many values in a column are NA is the following:"]},{"cell_type":"code","execution_count":29,"metadata":{"id":"2tb7pbCraDoQ","outputId":"d29e027e-b407-462a-80ac-4a6265dea53f","colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"status":"ok","timestamp":1742188226873,"user_tz":240,"elapsed":11,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["0"]},"metadata":{},"execution_count":29}],"source":["afectados['Anio'].isna().sum()"]},{"cell_type":"markdown","metadata":{"id":"1NsF1mnLaDoR"},"source":["For all columns:"]},{"cell_type":"code","execution_count":30,"metadata":{"id":"QFlEWhB9aDoR","outputId":"a6150331-0203-4ae6-8d7b-e3bfb343b4e8","colab":{"base_uri":"https://localhost:8080/","height":241},"executionInfo":{"status":"ok","timestamp":1742188229625,"user_tz":240,"elapsed":17,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":["Muestra 0\n","Descripcion 0\n","Anio 0\n","Estado 0\n","Cantidad 0\n","dtype: int64"],"text/html":["
"]},"metadata":{},"execution_count":30}],"source":["afectados.isna().sum()"]},{"cell_type":"markdown","metadata":{"id":"HPVpBlTeaDoS"},"source":["Returning to filters, for example, to show only the 2011 data:"]},{"cell_type":"code","execution_count":31,"metadata":{"id":"4hNdfzLWaDoS","outputId":"60eeb277-f52e-4850-a764-10ade6d79519","colab":{"base_uri":"https://localhost:8080/","height":206},"executionInfo":{"status":"ok","timestamp":1742188231381,"user_tz":240,"elapsed":38,"user":{"displayName":"rob","userId":"11204549018487637648"}}},"outputs":[{"output_type":"execute_result","data":{"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","2 Nacional Nacional 2011 Muertos 1573\n","4 Regional XV Región Arica y Parinacota 2011 Muertos 33\n","6 Comunal ARICA 2011 Muertos 29\n","8 Comunal CAMARONES 2011 Muertos 2\n","10 Comunal PUTRE 2011 Muertos 2"],"text/html":["\n","
\n"],"application/vnd.google.colaboratory.intrinsic+json":{"type":"dataframe","summary":"{\n \"name\": \"afectados[afectados[\\\"Anio\\\"] == 2011]\",\n \"rows\": 5,\n \"fields\": [\n {\n \"column\": \"Muestra\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 3,\n \"samples\": [\n \"Nacional\",\n \"Regional\",\n \"Comunal\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Descripcion\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"XV Regi\\u00f3n Arica y Parinacota\",\n \"PUTRE\",\n \"ARICA\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Anio\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 0,\n \"min\": 2011,\n \"max\": 2011,\n \"num_unique_values\": 1,\n \"samples\": [\n 2011\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Estado\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 1,\n \"samples\": [\n \"Muertos\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Cantidad\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 696,\n \"min\": 2,\n \"max\": 1573,\n \"num_unique_values\": 4,\n \"samples\": [\n 33\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}"}},"metadata":{},"execution_count":31}],"source":["# Keep the rows whose year is 2011 and show all columns (note that the result is now rows, not True/False)\n","afectados[afectados[\"Anio\"] == 2011].head()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"YAx3vB1DaDoU","outputId":"53ea4ffc-b69f-4dd5-c1e7-ac1de6b1a253"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","4 Regional XV Región Arica y Parinacota 2011 Muertos 33\n","14 Regional I Región de Tarapacá 2011 Muertos 56\n","30 Regional II Región de Antofagasta 2011 Muertos 87\n","50 Regional III Región de Atacama 2011 Muertos 53\n","70 Regional IV Región de Coquimbo 2011 Muertos 73"]},"execution_count":52,"metadata":{},"output_type":"execute_result"}],"source":["# Keep the rows where Anio is 2011 and Muestra is Regional; each condition needs its own parentheses when combined with &. All columns are shown.\n","afectados[(afectados[\"Anio\"] == 2011) & (afectados[\"Muestra\"] == \"Regional\")].head()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"fIkihurDaDoU","outputId":"bcfd53d8-ca05-4df4-a1c1-5c63aec9a172"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Descripcion Cantidad\n","4 XV Región Arica y Parinacota 33\n","14 I Región de Tarapacá 56\n","30 II Región de Antofagasta 87\n","50 III Región de Atacama 53\n","70 IV Región de Coquimbo 73"]},"execution_count":53,"metadata":{},"output_type":"execute_result"}],"source":["# Same filter (Anio is 2011 and Muestra is Regional), now keeping only the Descripcion and Cantidad columns\n","afectados[(afectados[\"Anio\"] == 2011) & (afectados[\"Muestra\"] == \"Regional\")][[\"Descripcion\", \"Cantidad\"]].head()"]},{"cell_type":"markdown","metadata":{"id":"19y6M6e4aDoV"},"source":["### Operations on data frames"]},{"cell_type":"markdown","metadata":{"id":"l7L6bOp9aDoV"},"source":["#### Aggregate\n","To find out how many people died in accidents across Chile, we can group the rows with groupby and aggregate each group. This is similar to GROUP BY in SQL."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"N2aGKe6saDoV","outputId":"8f1b3cc3-bf58-4752-8c12-392e874b3a02"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Estado Cantidad\n","0 Graves 40869\n","1 Leves 254334\n","2 MenosGraves 26325\n","3 Muertos 9504"]},"execution_count":57,"metadata":{},"output_type":"execute_result"}],"source":["# Sum the Cantidad column within each value of Estado\n","afectados.groupby('Estado')['Cantidad'].sum().reset_index()"]},{"cell_type":"markdown","metadata":{"id":"9kk1XyzNaDoV"},"source":["groupby splits afectados into groups, one per distinct value of Estado, each holding the Cantidad values of its rows. The aggregation function, here sum, is then applied to every group, so the result is the total Cantidad for each Estado.\n","\n","Keep in mind that afectados holds national, regional and communal rows, so this sum appears to count each person once per level: the Muertos total of 9504 is three times the national figure (1595 + 1573 = 3168). Filter on Muestra first to get a true country-wide total.\n","\n","We could also be more specific and sum the Cantidad column grouped by both Estado and Anio."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"P1OrhHtvaDoV","outputId":"6627f6a2-d918-45e3-94d3-f9cd7ff2bd3b"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Estado Anio Cantidad\n","0 Graves 2010 20697\n","1 Graves 2011 20172\n","2 Leves 2010 125232\n","3 Leves 2011 129102\n","4 MenosGraves 2010 12963\n","5 MenosGraves 2011 13362\n","6 Muertos 2010 4785\n","7 Muertos 2011 4719"]},"execution_count":58,"metadata":{},"output_type":"execute_result"}],"source":["afectados.groupby(['Estado', 'Anio'])['Cantidad'].sum().reset_index()"]},{"cell_type":"markdown","metadata":{"id":"ag2n2j_BaDoV"},"source":["#### Unique and drop_duplicates\n","With unique and drop_duplicates we can get the distinct values of a column. unique returns a NumPy array, while drop_duplicates returns a Series that keeps the index label of each value's first occurrence."]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"q5TbWf4KaDoW","executionInfo":{"status":"ok","timestamp":1710964589834,"user_tz":180,"elapsed":246,"user":{"displayName":"Fran Antonie Zautzik Rojas","userId":"06455364071528604762"}},"outputId":"6a968107-8378-4921-d13f-406bcf9e5810"},"outputs":[{"output_type":"execute_result","data":{"text/plain":["array(['Atropello', 'Caida', 'Colision', 'Choque', 'Volcadura', 'Otros'],\n"," dtype=object)"]},"metadata":{},"execution_count":11}],"source":["tipos['TipoAccidente'].unique()"]},{"cell_type":"code","source":["tipos['TipoAccidente'].drop_duplicates()"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"s3vrBZIhbw-o","executionInfo":{"status":"ok","timestamp":1710964561701,"user_tz":180,"elapsed":307,"user":{"displayName":"Fran Antonie Zautzik Rojas","userId":"06455364071528604762"}},"outputId":"3ca856f6-63a8-4416-faad-18ed9e3ec089"},"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["1 Atropello\n","717 Caida\n","1433 Colision\n","2149 Choque\n","2865 Volcadura\n","3581 Otros\n","Name: TipoAccidente, dtype: object"]},"metadata":{},"execution_count":10}]},{"cell_type":"markdown","metadata":{"id":"au_aTX2FaDoW"},"source":["#### Sort\n","At some point we will need to sort the rows by one or more columns. For example:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"KRtchsQOaDoX","outputId":"36233bfa-99ad-4651-f52a-40315db6bb0a"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","7 Comunal CAMARONES 2010 Muertos 2\n","8 Comunal CAMARONES 2011 Muertos 2\n","9 Comunal PUTRE 2010 Muertos 2\n","10 Comunal PUTRE 2011 Muertos 2\n","5 Comunal ARICA 2010 Muertos 24\n","3 Regional XV Región Arica y Parinacota 2010 Muertos 28\n","6 Comunal ARICA 2011 Muertos 29\n","4 Regional XV Región Arica y Parinacota 2011 Muertos 33\n","2 Nacional Nacional 2011 Muertos 1573\n","1 Nacional Nacional 2010 Muertos 1595"]},"execution_count":60,"metadata":{},"output_type":"execute_result"}],"source":["# Take the first 10 rows of afectados\n","afectados_reducido = afectados.head(10)\n","\n","# Sort in ascending order by the 'Cantidad' column\n","afectados_reducido.sort_values(by='Cantidad')\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5TpoGoMmaDoX","outputId":"ecd1d56e-f7e9-43ec-c4d6-ad98454590fe"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Muestra Descripcion Anio Estado Cantidad\n","1 Nacional Nacional 2010 Muertos 1595\n","2 Nacional Nacional 2011 Muertos 1573\n","4 Regional XV Región Arica y Parinacota 2011 Muertos 33\n","6 Comunal ARICA 2011 Muertos 29\n","3 Regional XV Región Arica y Parinacota 2010 Muertos 28\n","5 Comunal ARICA 2010 Muertos 24\n","7 Comunal CAMARONES 2010 Muertos 2\n","8 Comunal CAMARONES 2011 Muertos 2\n","9 Comunal PUTRE 2010 Muertos 2\n","10 Comunal PUTRE 2011 Muertos 2"]},"execution_count":61,"metadata":{},"output_type":"execute_result"}],"source":["# Sort in descending order by the 'Cantidad' column\n","afectados_reducido.sort_values(by='Cantidad', ascending=False)"]},{"cell_type":"markdown","metadata":{"id":"yQY8PupFaDoX"},"source":["#### Merge\n","As we saw at the beginning of this document, a new data frame is created with pd.DataFrame(). For example, we build two small data frames, a and b, to join on their shared column x1:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"q1AbUjSkaDoX","outputId":"40214852-2d8e-4759-c7fa-6b84e97d9883"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" x1 y1\n","0 0 10\n","1 1 20\n","2 2 40\n","3 3 60\n","4 4 80\n","5 5 100\n","6 6 120\n","7 7 140\n","8 8 160"]},"execution_count":63,"metadata":{},"output_type":"execute_result"}],"source":["\n","# DataFrame 'a'\n","a = pd.DataFrame({'x1': range(9), 'y1': [10, 20, 40, 60, 80, 100, 120, 140, 160]})\n","\n","# DataFrame 'b'\n","b = pd.DataFrame({'x1': [1, 2, 4, 6, 8, 10], 'y2': [0, 3, 5, 7, 9, 11]})\n","\n","a\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"3XjYRBDyaDoY","outputId":"6e943787-78ac-43fc-e5a9-acd1f338b72b"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" x1 y2\n","0 1 0\n","1 2 3\n","2 4 5\n","3 6 7\n","4 8 9\n","5 10 11"]},"execution_count":64,"metadata":{},"output_type":"execute_result"}],"source":["b"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"cyhGlA6raDoY","outputId":"5731378e-69d9-48b5-cede-23f0e7fe9a90"},"outputs":[{"data":{"text/html":["
"],"text/plain":[" x1 y1 y2\n","0 1 20 0\n","1 2 40 3\n","2 4 80 5\n","3 6 120 7\n","4 8 160 9"]},"execution_count":65,"metadata":{},"output_type":"execute_result"}],"source":["# Inner join\n","pd.merge(a, b, on='x1')"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"UfdqvoWzaDoY","outputId":"6b265f5f-9182-49bd-8449-7bfb44161a3d"},"outputs":[{"data":{"text/html":["
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>x1</th><th>y1</th><th>y2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>0</td><td>10.0</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>20.0</td><td>0.0</td></tr>\n",
"    <tr><th>2</th><td>2</td><td>40.0</td><td>3.0</td></tr>\n",
"    <tr><th>3</th><td>3</td><td>60.0</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>4</td><td>80.0</td><td>5.0</td></tr>\n",
"    <tr><th>5</th><td>5</td><td>100.0</td><td>NaN</td></tr>\n",
"    <tr><th>6</th><td>6</td><td>120.0</td><td>7.0</td></tr>\n",
"    <tr><th>7</th><td>7</td><td>140.0</td><td>NaN</td></tr>\n",
"    <tr><th>8</th><td>8</td><td>160.0</td><td>9.0</td></tr>\n",
"    <tr><th>9</th><td>10</td><td>NaN</td><td>11.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
"],"text/plain":[" x1 y1 y2\n","0 0 10.0 NaN\n","1 1 20.0 0.0\n","2 2 40.0 3.0\n","3 3 60.0 NaN\n","4 4 80.0 5.0\n","5 5 100.0 NaN\n","6 6 120.0 7.0\n","7 7 140.0 NaN\n","8 8 160.0 9.0\n","9 10 NaN 11.0"]},"execution_count":66,"metadata":{},"output_type":"execute_result"}],"source":["# Full outer join\n","pd.merge(a, b, on='x1', how='outer')"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"u2o4P3O_aDob","outputId":"61579bff-160c-4c53-8f98-9b4fd2526e72"},"outputs":[{"data":{"text/html":["
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>x1</th><th>y1</th><th>y2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>0</td><td>10</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>20</td><td>0.0</td></tr>\n",
"    <tr><th>2</th><td>2</td><td>40</td><td>3.0</td></tr>\n",
"    <tr><th>3</th><td>3</td><td>60</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>4</td><td>80</td><td>5.0</td></tr>\n",
"    <tr><th>5</th><td>5</td><td>100</td><td>NaN</td></tr>\n",
"    <tr><th>6</th><td>6</td><td>120</td><td>7.0</td></tr>\n",
"    <tr><th>7</th><td>7</td><td>140</td><td>NaN</td></tr>\n",
"    <tr><th>8</th><td>8</td><td>160</td><td>9.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
"],"text/plain":[" x1 y1 y2\n","0 0 10 NaN\n","1 1 20 0.0\n","2 2 40 3.0\n","3 3 60 NaN\n","4 4 80 5.0\n","5 5 100 NaN\n","6 6 120 7.0\n","7 7 140 NaN\n","8 8 160 9.0"]},"execution_count":67,"metadata":{},"output_type":"execute_result"}],"source":["# Left outer join\n","pd.merge(a, b, on='x1', how='left')"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"2iU9dWyqaDob","outputId":"0bfbb044-f2d0-4bf1-a43b-da775a075422"},"outputs":[{"data":{"text/html":["
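<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>x1</th><th>y1</th><th>y2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>20.0</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>40.0</td><td>3</td></tr>\n",
"    <tr><th>2</th><td>4</td><td>80.0</td><td>5</td></tr>\n",
"    <tr><th>3</th><td>6</td><td>120.0</td><td>7</td></tr>\n",
"    <tr><th>4</th><td>8</td><td>160.0</td><td>9</td></tr>\n",
"    <tr><th>5</th><td>10</td><td>NaN</td><td>11</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>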
"],"text/plain":[" x1 y1 y2\n","0 1 20.0 0\n","1 2 40.0 3\n","2 4 80.0 5\n","3 6 120.0 7\n","4 8 160.0 9\n","5 10 NaN 11"]},"execution_count":68,"metadata":{},"output_type":"execute_result"}],"source":["# Right outer join\n","pd.merge(a, b, on='x1', how='right')"]},{"cell_type":"markdown","metadata":{"id":"RFtTetUFaDob"},"source":["#### Summing rows and columns\n","\n","To sum the values in each row of a data frame (one total per row):"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"kOA_TF3kaDob","outputId":"e1975b1b-f31a-43a5-f4d9-cb15214dad47"},"outputs":[{"data":{"text/html":["
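<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>x1</th><th>y1</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>1</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>4</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>5</td></tr>\n",
"    <tr><th>5</th><td>6</td><td>6</td></tr>\n",
"    <tr><th>6</th><td>7</td><td>7</td></tr>\n",
"    <tr><th>7</th><td>8</td><td>8</td></tr>\n",
"    <tr><th>8</th><td>9</td><td>9</td></tr>\n",
"    <tr><th>9</th><td>10</td><td>10</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>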
"],"text/plain":[" x1 y1\n","0 1 1\n","1 2 2\n","2 3 3\n","3 4 4\n","4 5 5\n","5 6 6\n","6 7 7\n","7 8 8\n","8 9 9\n","9 10 10"]},"execution_count":69,"metadata":{},"output_type":"execute_result"}],"source":["df = pd.DataFrame({'x1': range(1, 11), 'y1': range(1, 11)})\n","df"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"iLLcRVVcaDoc","outputId":"6814443d-e132-4cf1-b374-c6c0dbc4f132"},"outputs":[{"data":{"text/plain":["0 2\n","1 4\n","2 6\n","3 8\n","4 10\n","5 12\n","6 14\n","7 16\n","8 18\n","9 20\n","dtype: int64"]},"execution_count":70,"metadata":{},"output_type":"execute_result"}],"source":["df.sum(axis=1)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"qcwET6SNaDoc","outputId":"7b8f2da0-49db-4111-a1c0-366ba7fc19f8"},"outputs":[{"data":{"text/plain":["5 12\n","6 14\n","7 16\n","8 18\n","9 20\n","dtype: int64"]},"execution_count":71,"metadata":{},"output_type":"execute_result"}],"source":["# row sums for the rows where x1 is greater than 5\n","df[df['x1'] > 5].sum(axis=1)"]},{"cell_type":"markdown","metadata":{"id":"NUf0hKrbaDoc"},"source":["To sum the values in each column of a data frame (one total per column):"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"nA55fO2raDoc","outputId":"07b1f838-7dbd-44d3-aae3-485a49af2b56"},"outputs":[{"data":{"text/plain":["x1 55\n","y1 55\n","dtype: int64"]},"execution_count":72,"metadata":{},"output_type":"execute_result"}],"source":["df.sum(axis=0)"]},{"cell_type":"markdown","metadata":{"id":"BFLq0yFJaDod"},"source":["#### Melt\n","\n","`melt()` reshapes a data frame from wide format to long format.\n","\n","\n","Consider the following data frame, which records the goals scored by Colo-Colo (CC) and Universidad de Chile (U) in the first and second matchdays of the Chilean league:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"bSTfSCAwaDod","outputId":"1156d429-ced0-405a-f653-df1c65890ed1"},"outputs":[{"data":{"text/html":["
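<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>jornada</th><th>equipo</th><th>favor</th><th>contra</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>CC</td><td>3</td><td>0</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>CC</td><td>2</td><td>1</td></tr>\n",
"    <tr><th>2</th><td>1</td><td>U</td><td>1</td><td>2</td></tr>\n",
"    <tr><th>3</th><td>2</td><td>U</td><td>5</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>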
"],"text/plain":[" jornada equipo favor contra\n","0 1 CC 3 0\n","1 2 CC 2 1\n","2 1 U 1 2\n","3 2 U 5 1"]},"execution_count":1,"metadata":{},"output_type":"execute_result"}],"source":["\n","# Create the DataFrame\n","d = pd.DataFrame({'jornada': [1, 2, 1, 2],\n"," 'equipo': ['CC', 'CC', 'U', 'U'],\n"," 'favor': [3, 2, 1, 5],\n"," 'contra': [0, 1, 2, 1]})\n","\n","# Display the DataFrame\n","d\n"]},{"cell_type":"markdown","metadata":{"id":"x3AwEhw6aDoe"},"source":["For example, in matchday 1, Colo-Colo scored 3 goals (favor) and conceded none (contra). In the same matchday, Universidad de Chile scored 1 goal (favor) and conceded 2 (contra).\n","\n","To get the total number of goals in the first matchday, add up the favor and contra columns:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"sWCzOpgPaDoe","outputId":"b41a2045-a0ac-49e9-f222-8838e1eefc36"},"outputs":[{"name":"stdout","output_type":"stream","text":["6\n"]}],"source":["# Keep the rows where the 'jornada' column equals 1\n","f1 = d[d['jornada'] == 1]\n","\n","# Sum the third and fourth columns by position (indices 2 and 3, since Python counts from 0)\n","suma = f1.iloc[:, [2, 3]].sum()\n","\n","print(suma.sum())\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"sYTVRDMeaDoe","outputId":"0ab56306-c37f-4bc0-da81-b887c1fdaf1b"},"outputs":[{"name":"stdout","output_type":"stream","text":["6\n"]}],"source":["# Sum the \"favor\" and \"contra\" columns of the filtered DataFrame f1, selected by name\n","suma = f1[[\"favor\", \"contra\"]].sum()\n","\n","print(suma.sum())\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"NYeMi_IRaDoe","outputId":"cd66ebca-c49b-475a-9f27-8aee9049af2b"},"outputs":[{"name":"stdout","output_type":"stream","text":["6\n"]}],"source":["# Filter the rows where 'jornada' equals 1 and select the 'favor' and 'contra' columns in a single .loc call, then sum them\n","suma = d.loc[d['jornada'] == 1, ['favor', 
'contra']].sum()\n","\n","print(suma.sum())\n"]},{"cell_type":"markdown","metadata":{"id":"x_G4FvReaDoe"},"source":["A more sophisticated approach is to use `melt()`. This function reshapes the table so that all the goals end up in a single column."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"SIkIoCmuaDoe","outputId":"4ad31eb9-913c-4e8f-f010-adbe5b18b54e"},"outputs":[{"data":{"text/html":["
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>jornada</th><th>equipo</th><th>variable</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>CC</td><td>favor</td><td>3</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>CC</td><td>favor</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>1</td><td>U</td><td>favor</td><td>1</td></tr>\n",
"    <tr><th>3</th><td>2</td><td>U</td><td>favor</td><td>5</td></tr>\n",
"    <tr><th>4</th><td>1</td><td>CC</td><td>contra</td><td>0</td></tr>\n",
"    <tr><th>5</th><td>2</td><td>CC</td><td>contra</td><td>1</td></tr>\n",
"    <tr><th>6</th><td>1</td><td>U</td><td>contra</td><td>2</td></tr>\n",
"    <tr><th>7</th><td>2</td><td>U</td><td>contra</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
"],"text/plain":[" jornada equipo variable value\n","0 1 CC favor 3\n","1 2 CC favor 2\n","2 1 U favor 1\n","3 2 U favor 5\n","4 1 CC contra 0\n","5 2 CC contra 1\n","6 1 U contra 2\n","7 2 U contra 1"]},"execution_count":15,"metadata":{},"output_type":"execute_result"}],"source":["d2 = pd.melt(d, id_vars=[\"jornada\", \"equipo\"]) # jornada and equipo stay fixed; one record is created for each remaining value\n","\n","d2 # note what melt() does"]},{"cell_type":"markdown","metadata":{"id":"JBgZh52TaDoe"},"source":["This gives the data a different layout. The call tells `melt()` to keep the jornada and equipo columns fixed and to create one record for each favor value and another for each contra value. Also note the names of the new columns, `variable` and `value`.\n","\n","With this layout, adding up all the goals of the first matchday is simpler:"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"yFL9vTFDaDof","outputId":"7d0d0a13-9b9a-40aa-82c0-bc1a466beeaf"},"outputs":[{"data":{"text/plain":["6"]},"execution_count":20,"metadata":{},"output_type":"execute_result"}],"source":["# Keep the rows of d2 where the 'jornada' column equals 1\n","f2 = d2[d2['jornada'] == 1]\n","\n","f2['value'].sum()\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.11.4"},"colab":{"provenance":[]}},"nbformat":4,"nbformat_minor":0}