{ "cells": [ { "cell_type": "markdown", "id": "228d57af-2bd1-458d-860b-e17b2fe7d445", "metadata": {}, "source": [ "## Example: Bar Ordering\n", "\n", "Don't neglect the role of ordering when working with categorical variables. The order of the categories gives you an additional opportunity to emphasize or highlight.\n", "\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "279c3be4-53f3-4db5-a4e3-8f55019529af", "metadata": {}, "outputs": [], "source": [ "import altair as alt\n", "import polars as pl" ] }, { "cell_type": "code", "execution_count": 4, "id": "b071e9be-1fad-4852-87b8-38157be6de66", "metadata": {}, "outputs": [], "source": [ "countries = {\n", " \"AA\": 13, \"AB\": 45, \"AC\": 30, \"AD\": 14, \"AE\": 21,\n", " \"BA\": 17, \"BB\": 25, \"BC\": 29, \"BD\": 16, \"BE\": 21,\n", " \"CA\": 20, \"CB\": 22, \"CC\": 28, \"CD\": 18, \"CE\": 24,\n", " \"DA\": 40, \"DB\": 33, \"DC\": 30, \"DD\": 13, \"DE\": 28,\n", " \"EA\": 16, \"EB\": 45, \"EC\": 27, \"ED\": 80, \"EE\": 33,\n", " }" ] }, { "cell_type": "code", "execution_count": 6, "id": "f9d54624-a50e-414e-a29c-c8371fe5b98f", "metadata": {}, "outputs": [], "source": [ "df = pl.DataFrame(countries).unpivot(variable_name=\"country\")" ] }, { "cell_type": "code", "execution_count": 7, "id": "334fe526-2536-478c-9457-aad9e166eb1d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alt.Chart(df).mark_bar().encode(x=\"country\", y=\"value\")" ] }, { "cell_type": "markdown", "id": "e163a712-3674-48bf-b0d3-f1d279c5ba2b", "metadata": {}, "source": [ "This alphabetical ordering may serve you well if you want people to be able to find their country quickly. But consider other orderings and groupings that might make your point better." ] }, { "cell_type": "code", "execution_count": 10, "id": "7883c59e-0bb4-4713-8772-aca9441e6800", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ordering by value to emphasize outliers\n", "alt.Chart(df).mark_bar().encode(\n", " alt.X(\"country\", sort=alt.EncodingSortField(field=\"value\", order=\"ascending\")\n", " ),\n", " y=\"value\",\n", ")" ] }, { "cell_type": "code", "execution_count": 23, "id": "1b965691-7e32-45d7-b1db-833e353c230d", "metadata": {}, "outputs": [], "source": [ "# adding a grouping column and using it for coloring/grouping\n", "# using first letter of country for a pretend grouping -- \n", "# in reality you could group by region/characteristics\n", "df_grouped = df.with_columns(grouping=pl.col(\"country\").str.slice(0, 1))" ] }, { "cell_type": "code", "execution_count": 24, "id": "c7735212-e6aa-405d-9b76-87a92565e88b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sort the bars by the grouping column so each group's bars sit together\n", "alt.Chart(df_grouped).mark_bar().encode(\n", " alt.X(\"country\", sort=alt.EncodingSortField(field=\"grouping\", order=\"ascending\")),\n", " y=\"value\",\n", " color=\"grouping:N\",\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "a721eddb-a9b1-457a-9648-e9606b62f044", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.15" } }, "nbformat": 4, "nbformat_minor": 5 }