GroupBy
=======

.. currentmodule:: pandasCore

GroupBy operations allow you to split data into groups based on some criteria, apply a function to each group independently, and combine the results.

Example
-------

.. code-block:: python

   import pandasCore as pd

   df = pd.DataFrame({
       'A': ['foo', 'bar', 'foo', 'bar'],
       'B': [1, 2, 3, 4],
       'C': [2.0, 4.0, 6.0, 8.0]
   })

   # Group by single column
   grouped = df.groupby('A')

   # Aggregation
   df.groupby('A').sum()
   df.groupby('A').mean()
   df.groupby('A').agg(['sum', 'mean'])

Parameters
----------

.. list-table::
   :widths: 15 10 10 65
   :header-rows: 1

   * - Parameter
     - Type
     - Default
     - Description
   * - by
     - str/list
     - required
     - Column(s) to group by
   * - axis
     - int
     - 0
     - Split along rows (0) or columns (1)
   * - level
     - int/str
     - None
     - Group by index level
   * - as_index
     - bool
     - True
     - Return with group labels as index
   * - sort
     - bool
     - True
     - Sort group keys
   * - group_keys
     - bool
     - True
     - Add group keys to index
   * - observed
     - bool
     - False
     - Only show observed values for categoricals
   * - dropna
     - bool
     - True
     - Drop groups with NA values

Aggregation Methods
-------------------

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - Method
     - Description
   * - count()
     - Count of values
   * - sum()
     - Sum of values
   * - mean()
     - Mean of values
   * - median()
     - Median of values
   * - std()
     - Standard deviation
   * - var()
     - Variance
   * - min()
     - Minimum
   * - max()
     - Maximum
   * - first()
     - First value
   * - last()
     - Last value
   * - prod()
     - Product
   * - size()
     - Group sizes
   * - sem()
     - Standard error of mean
   * - describe()
     - Descriptive statistics
   * - nunique()
     - Count unique values

Transformation Methods
----------------------

.. list-table::
   :widths: 20 80
   :header-rows: 1

   * - Method
     - Description
   * - apply(func)
     - Apply function to each group
   * - transform(func)
     - Transform with function
   * - filter(func)
     - Filter groups with function
   * - agg(func)
     - Aggregate with function(s)

Iteration
---------

.. code-block:: python

   # Iterate over groups
   for name, group in df.groupby('A'):
       print(name)
       print(group)
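
Iteration yields one ``(name, group)`` pair per distinct key, which is the split step made visible. The sketch below mimics the split, apply, and combine steps in plain Python on the same rows as the example above. It is only an illustration of the idea, not how pandasCore implements it; ``split_by`` and ``sum_groups`` are hypothetical helper names.

.. code-block:: python

   from collections import defaultdict

   # Same rows as the example DataFrame, held as plain dicts.
   rows = [
       {'A': 'foo', 'B': 1, 'C': 2.0},
       {'A': 'bar', 'B': 2, 'C': 4.0},
       {'A': 'foo', 'B': 3, 'C': 6.0},
       {'A': 'bar', 'B': 4, 'C': 8.0},
   ]

   def split_by(rows, key):
       """Split: collect the rows that share each distinct value of ``key``."""
       groups = defaultdict(list)
       for row in rows:
           groups[row[key]].append(row)
       # Return (name, group) pairs in sorted key order, analogous to sort=True.
       return sorted(groups.items())

   def sum_groups(rows, key, columns):
       """Apply and combine: sum ``columns`` within each group, keyed by group name."""
       result = {}
       for name, group in split_by(rows, key):
           result[name] = {col: sum(r[col] for r in group) for col in columns}
       return result

   totals = sum_groups(rows, 'A', ['B', 'C'])
   print(totals)

Under these assumptions, ``totals`` holds one entry per group, which corresponds to the one-row-per-group shape of an aggregation such as ``df.groupby('A').sum()``.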