SparseArray
===========

.. cpp:class:: pandas::SparseArray

   Extension array type for sparse data: only the values that differ from ``fill_value`` are stored (``sp_values``), together with their positions (``sp_index``).

Example
-------

.. code-block:: cpp

   #include
   using namespace pandas;

   // Use SparseArray
   SparseArray obj;
   // ... operations ...

Constructors
------------

.. list-table::
   :widths: 55 25 20
   :header-rows: 1

   * - Signature
     - Location
     - Example
   * - ``SparseArray(const numpy::NDArray& dense, T fill_value)``
     - pd_sparse_array.h:81
     -
   * - ``explicit SparseArray(const numpy::NDArray& dense)``
     - pd_sparse_array.h:94
     -
   * - ``SparseArray(const numpy::NDArray& sp_values, const numpy::NDArray& sp_index, T fill_value, size_t length)``
     - pd_sparse_array.h:110
     -
   * - ``SparseArray(const std::vector& values, T fill_value)``
     - pd_sparse_array.h:125
     -

Construction
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``static SparseArray from_dense(const numpy::NDArray& dense, T fill_value)``
     - SparseArray
     - pd_sparse_array.h:808
     - :ref:`View <example-sparsearray-from_dense-0>`
   * - ``static SparseArray from_dense(const numpy::NDArray& dense)``
     - SparseArray
     - pd_sparse_array.h:815
     - :ref:`View <example-sparsearray-from_dense-1>`

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``T at(size_t index) const``
     - T
     - pd_sparse_array.h:269
     - :ref:`View <example-sparsearray-at-2>`

Missing Data
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``SparseArray fillna(T value) const``
     - SparseArray
     - pd_sparse_array.h:378
     - :ref:`View <example-sparsearray-fillna-3>`
   * - ``numpy::NDArray isna() const``
     - numpy::NDArray
     - pd_sparse_array.h:334
     - :ref:`View <example-sparsearray-isna-4>`
   * - ``numpy::NDArray notna() const``
     - numpy::NDArray
     - pd_sparse_array.h:365
     - :ref:`View <example-sparsearray-notna-5>`

Statistics
----------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``size_t count() const``
     - size_t
     - pd_sparse_array.h:425
     - :ref:`View <example-sparsearray-count-6>`
   * - ``std::optional max() const``
     - std::optional
     - pd_sparse_array.h:667
     - :ref:`View <example-sparsearray-max-7>`
   * - ``double mean() const``
     - double
     - pd_sparse_array.h:595
     - :ref:`View <example-sparsearray-mean-8>`
   * - ``std::optional min() const``
     - std::optional
     - pd_sparse_array.h:630
     - :ref:`View <example-sparsearray-min-9>`
   * - ``std::optional std(int ddof = 1) const``
     - std::optional
     - pd_sparse_array.h:704
     - :ref:`View <example-sparsearray-std-10>`
   * - ``T sum() const``
     - T
     - pd_sparse_array.h:570
     - :ref:`View <example-sparsearray-sum-11>`
   * - ``std::optional var(int ddof = 1) const``
     - std::optional
     - pd_sparse_array.h:713
     - :ref:`View <example-sparsearray-var-12>`

Comparison
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::NDArray compare_op(const SparseArray& other, Op op) const``
     - numpy::NDArray
     - pd_sparse_array.h:1043
     -
   * - ``size_t len() const``
     - size_t
     - pd_sparse_array.h:195
     - :ref:`View <example-sparsearray-len-13>`

Combining
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``static SparseArray concat(const std::vector>& arrays)``
     - SparseArray
     - pd_sparse_array.h:822
     - :ref:`View <example-sparsearray-concat-14>`

I/O
---

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``numpy::NDArray to_dense() const``
     - numpy::NDArray
     - pd_sparse_array.h:298
     - :ref:`View <example-sparsearray-to_dense-15>`
   * - ``std::string to_string() const``
     - std::string
     - pd_sparse_array.h:903
     - :ref:`View <example-sparsearray-to_string-16>`

Conversion
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``SparseArray copy() const``
     - SparseArray
     - pd_sparse_array.h:318
     - :ref:`View <example-sparsearray-copy-17>`

Type Checking
-------------

..
.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool is_fill(size_t index) const``
     - bool
     - pd_sparse_array.h:276
     - :ref:`View <example-sparsearray-is_fill-18>`
   * - ``bool is_fill_value(T val) const``
     - bool
     - pd_sparse_array.h:48
     -

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool all() const``
     - bool
     - pd_sparse_array.h:778
     - :ref:`View <example-sparsearray-all-19>`
   * - ``bool any() const``
     - bool
     - pd_sparse_array.h:752
     - :ref:`View <example-sparsearray-any-20>`
   * - ``SparseArray binary_op(const SparseArray& other, Op op) const``
     - SparseArray
     - pd_sparse_array.h:1010
     -
   * - ``void build_from_dense(const numpy::NDArray& dense)``
     - void
     - pd_sparse_array.h:947
     -
   * - ``double density() const``
     - double
     - pd_sparse_array.h:235
     - :ref:`View <example-sparsearray-density-21>`
   * - ``dtype_type dtype() const``
     - dtype_type
     - pd_sparse_array.h:145
     - :ref:`View <example-sparsearray-dtype-22>`
   * - ``bool empty() const``
     - bool
     - pd_sparse_array.h:188
     - :ref:`View <example-sparsearray-empty-23>`
   * - ``T fill_value() const``
     - T
     - pd_sparse_array.h:220
     - :ref:`View <example-sparsearray-fill_value-24>`
   * - ``T find_value_at(size_t index) const``
     - T
     - pd_sparse_array.h:988
     -
   * - ``size_t nbytes() const``
     - size_t
     - pd_sparse_array.h:159
     - :ref:`View <example-sparsearray-nbytes-25>`
   * - ``size_t nbytes_dense() const``
     - size_t
     - pd_sparse_array.h:167
     -
   * - ``constexpr int ndim() const``
     - int
     - pd_sparse_array.h:174
     - :ref:`View <example-sparsearray-ndim-26>`
   * - ``size_t npoints() const``
     - size_t
     - pd_sparse_array.h:227
     - :ref:`View <example-sparsearray-npoints-27>`
   * - ``std::string repr() const``
     - std::string
     - pd_sparse_array.h:934
     - :ref:`View <example-sparsearray-repr-28>`
   * - ``numpy::NDArray scalar_compare(T scalar, Op op) const``
     - numpy::NDArray
     - pd_sparse_array.h:1060
     -
   * - ``SparseArray scalar_op(T scalar, Op op) const``
     - SparseArray
     - pd_sparse_array.h:1031
     -
   * - ``std::vector shape() const``
     - std::vector
     - pd_sparse_array.h:181
     - :ref:`View <example-sparsearray-shape-29>`
   * - ``size_t size() const``
     - size_t
     - pd_sparse_array.h:152
     - :ref:`View <example-sparsearray-size-30>`
   * - ``const numpy::NDArray& sp_index() const``
     - const numpy::NDArray&
     - pd_sparse_array.h:213
     - :ref:`View <example-sparsearray-sp_index-31>`
   * - ``const numpy::NDArray& sp_values() const``
     - const numpy::NDArray&
     - pd_sparse_array.h:206
     - :ref:`View <example-sparsearray-sp_values-32>`
   * - ``std::string sparse_dtype_footer() const``
     - std::string
     - pd_sparse_array.h:890
     -
   * - ``double sparsity() const``
     - double
     - pd_sparse_array.h:244
     - :ref:`View `
   * - ``void validate_sparse_data()``
     - void
     - pd_sparse_array.h:968
     -

Internal Methods
----------------

*1 internal method (prefixed with underscore)*

Code Examples
-------------

The following examples are extracted from the test suite.

.. _example-sparsearray-from_dense-0:

.. dropdown:: from_dense (pd_test_1_all.cpp:3164)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3154
      :emphasize-lines: 11

      }

      void pd_test_sparse_array_from_dense() {
          std::cout << "========= SparseArray: from_dense ======================= ";

          numpy::NDArray dense(std::vector{10});
          for (size_t i = 0; i < 10; ++i) {
              dense.setElementAt({i}, (i == 3 || i == 7) ? 5.0 : 0.0);
          }

          auto sparse = pandas::SparseArray::from_dense(dense, 0.0);

          if (sparse.size() != 10) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : size != 10" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: size != 10");
          }

          if (sparse.npoints() != 2) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : npoints != 2" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: npoints != 2");
          }

.. _example-sparsearray-from_dense-1:

.. dropdown:: from_dense (pd_test_1_all.cpp:3164)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 3154
      :emphasize-lines: 11

      }

      void pd_test_sparse_array_from_dense() {
          std::cout << "========= SparseArray: from_dense ======================= ";

          numpy::NDArray dense(std::vector{10});
          for (size_t i = 0; i < 10; ++i) {
              dense.setElementAt({i}, (i == 3 || i == 7) ? 5.0 : 0.0);
          }

          auto sparse = pandas::SparseArray::from_dense(dense, 0.0);

          if (sparse.size() != 10) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : size != 10" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: size != 10");
          }

          if (sparse.npoints() != 2) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : npoints != 2" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: npoints != 2");
          }

.. _example-sparsearray-at-2:

.. dropdown:: at (pd_test_1_all.cpp:6581)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6571
      :emphasize-lines: 11

          // Test isna/notna with float data
          {
              std::map> float_data;
              float_data["X"] = {1.0, std::nan(""), 3.0};
              float_data["Y"] = {4.0, 5.0, std::nan("")};
              pandas::DataFrame df_na(float_data);

              auto na_mask = df_na.isna();
              // Row 1, col 0 (X) should be NA
              if (!na_mask.getElementAt({1, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (1,0) should be true" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (1,0)");
              }
              // Row 2, col 1 (Y) should be NA
              if (!na_mask.getElementAt({2, 1})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (2,1) should be true" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (2,1)");
              }
              // Row 0, col 0 should NOT be NA
              if (na_mask.getElementAt({0, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (0,0) should be false" << std::endl;

.. _example-sparsearray-fillna-3:

.. dropdown:: fillna (pd_test_1_all.cpp:537)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 527
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
          }

          // Test dropna
          pandas::CategoricalArray dropped = arr.dropna();
          if (dropped.size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");
          }

          // Test fillna (fill with existing category)
          pandas::CategoricalArray filled = arr.fillna("a");  // 'a' is in categories
          if (filled.has_na()) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : fillna should have no NA" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: fillna should have no NA");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_array_add_categories() {

.. _example-sparsearray-isna-4:

.. dropdown:: isna (pd_test_1_all.cpp:524)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 514
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_categorical_array_na_handling failed: has_na() should be true");
          }

          // Test count (non-NA)
          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : count() != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: count() != 2");
          }

          // Test isna array
          numpy::NDArray na_mask = arr.isna();
          if (na_mask.getSize() != 4) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : isna size != 4" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: isna size != 4");
          }

          // Test dropna
          pandas::CategoricalArray dropped = arr.dropna();
          if (dropped.size() != 2) {
              std::cout << " [FAIL] : in pd_test_categorical_array_na_handling() : dropna size != 2" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_na_handling failed: dropna size != 2");

..
.. _example-sparsearray-notna-5:

.. dropdown:: notna (pd_test_1_all.cpp:6595)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6585
      :emphasize-lines: 11

              if (!na_mask.getElementAt({2, 1})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (2,1) should be true" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (2,1)");
              }
              // Row 0, col 0 should NOT be NA
              if (na_mask.getElementAt({0, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : isna at (0,0) should be false" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: isna at (0,0)");
              }

              auto notna_mask = df_na.notna();
              if (notna_mask.getElementAt({1, 0})) {
                  std::cout << " [FAIL] : in pd_test_dataframe_manipulation() : notna at (1,0) should be false" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_manipulation failed: notna at (1,0)");
              }
          }

          // Test fillna
          {
              std::map> float_data;
              float_data["X"] = {1.0, std::nan(""), 3.0};

.. _example-sparsearray-count-6:

.. dropdown:: count (pd_test_1_all.cpp:66)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 56
      :emphasize-lines: 11

          if (arr.is_na(0)) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false");
          }

          if (!arr.has_na()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true");
          }

          if (arr.count() != 2) {
              std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_kleene_and() {
          std::cout << "========= BooleanArray: Kleene AND ======================= ";

.. _example-sparsearray-max-7:

.. dropdown:: max (pd_test_1_all.cpp:771)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 761
      :emphasize-lines: 11

          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true);  // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");
          }

          // Test unordered throws for min/max
          pandas::CategoricalArray unordered = arr.as_unordered();
          bool threw = false;
          try {
              unordered.min();

.. _example-sparsearray-mean-8:

..
.. dropdown:: mean (pd_test_1_all.cpp:282)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 272
      :emphasize-lines: 11

              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

.. _example-sparsearray-min-9:

.. dropdown:: min (pd_test_1_all.cpp:764)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 754
      :emphasize-lines: 11

      }

      void pd_test_categorical_array_ordered_operations() {
          std::cout << "========= CategoricalArray: ordered operations (min/max) ======================= ";

          std::vector cats = {"low", "medium", "high"};
          std::vector codes = {0, 2, 1, 0, -1};  // low, high, medium, low, NA
          pandas::CategoricalArray arr = pandas::CategoricalArray::from_codes(codes, cats, true);  // ordered

          // Test min
          std::optional min_val = arr.min();
          if (!min_val.has_value() || *min_val != "low") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : min != 'low'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: min != 'low'");
          }

          // Test max
          std::optional max_val = arr.max();
          if (!max_val.has_value() || *max_val != "high") {
              std::cout << " [FAIL] : in pd_test_categorical_array_ordered_operations() : max != 'high'" << std::endl;
              throw std::runtime_error("pd_test_categorical_array_ordered_operations failed: max != 'high'");

..
.. _example-sparsearray-std-10:

.. dropdown:: std (pd_test_1_all.cpp:4526)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 4516
      :emphasize-lines: 11

      #include "../pandas/pd_series.h"

      namespace dataframe_tests {
      namespace dataframe_tests_aggregation {

      void pd_test_aggregation_series_sem() {
          std::cout << "========= Series sem ============================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto sem_val = s.sem();
          // std(ddof=1) = sqrt(2.5), sem = sqrt(2.5)/sqrt(5) ≈ 0.707
          bool passed = sem_val.has_value() && std::abs(*sem_val - 0.707) < 0.01;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_aggregation_series_sem() : sem value incorrect" << std::endl;
              throw std::runtime_error("pd_test_aggregation_series_sem failed: sem value incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_aggregation_series_quantile() {

.. _example-sparsearray-sum-11:

.. dropdown:: sum (pd_test_1_all.cpp:276)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 266
      :emphasize-lines: 11

          }

          // Test sum/mean
          pandas::BooleanArray arr({
              std::optional(true),
              std::optional(false),
              std::optional(true),
              std::optional(true)
          });

          auto s = arr.sum();
          if (!s.has_value() || s.value() != 3) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: sum");
          }

          auto m = arr.mean();
          if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

.. _example-sparsearray-var-12:

.. dropdown:: var (pd_test_1_all.cpp:20890)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 20880
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_expanding_std failed: expanding std values incorrect");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_expanding_var() {
          std::cout << "========= Expanding var =========================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});
          auto result = s.expanding().var();
          // Expanding var (ddof=1): NaN, 0.5, 1.0, 1.6667, 2.5
          bool passed = std::isnan(result[0]) &&
                        std::abs(result[1] - 0.5) < 0.001 &&
                        std::abs(result[2] - 1.0) < 0.001 &&
                        std::abs(result[3] - 1.6667) < 0.001 &&
                        std::abs(result[4] - 2.5) < 0.001;

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_expanding_var() : expanding var values incorrect" << std::endl;
              throw std::runtime_error("pd_test_expanding_var failed: expanding var values incorrect");

.. _example-sparsearray-len-13:

.. dropdown:: len (pd_test_3_all.cpp:20867)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20857
      :emphasize-lines: 11

          auto title_result = s.str().title();
          if (title_result[0] != "Hello World" || title_result[1] != "Hello World" || title_result[2] != "Hello World") {
              std::cout << " [FAIL] : title() failed" << std::endl;
              throw std::runtime_error("pd_test_str_capitalize_title: title() failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      // ============================================================================
      // Test str().len()
      // ============================================================================
      void pd_test_str_len() {
          std::cout << "========= Series.str().len() ============================";

          pandas::Series s({"a", "bb", "ccc", ""});
          auto lens = s.str().len();

          if (lens[0] != 1 || lens[1] != 2 || lens[2] != 3 || lens[3] != 0) {
              std::cout << " [FAIL] : len() failed" << std::endl;

.. _example-sparsearray-concat-14:

.. dropdown:: concat (pd_test_1_all.cpp:17717)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 17707
      :emphasize-lines: 11

      }

      void pd_test_period_index_concat() {
          std::cout << "========= concat factory ==============================";

          std::vector ordinals1 = {0, 1};
          std::vector ordinals2 = {2, 3};
          pandas::PeriodIndex idx1(ordinals1, "D");
          pandas::PeriodIndex idx2(ordinals2, "D");

          pandas::PeriodIndex concatenated = pandas::PeriodIndex::concat({idx1, idx2});

          bool passed = (concatenated.size() == 4);
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_period_index_concat()" << std::endl;
              throw std::runtime_error("pd_test_period_index_concat failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-sparsearray-to_dense-15:

.. dropdown:: to_dense (pd_test_1_all.cpp:3272)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3262
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_sparse_array_to_dense() {
          std::cout << "========= SparseArray: to_dense ======================= ";

          std::vector data = {0.0, 1.0, 0.0, 2.0, 0.0};

          pandas::SparseArray arr(data, 0.0);

          auto dense = arr.to_dense();

          if (dense.getSize() != 5) {
              std::cout << " [FAIL] : in pd_test_sparse_array_to_dense() : dense size != 5" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_to_dense failed: dense size != 5");
          }

          if (dense.getElementAt({0}) != 0.0 || dense.getElementAt({1}) != 1.0 ||
              dense.getElementAt({2}) != 0.0 || dense.getElementAt({3}) != 2.0 ||
              dense.getElementAt({4}) != 0.0) {

.. _example-sparsearray-to_string-16:

.. dropdown:: to_string (pd_test_1_all.cpp:2693)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 2683
      :emphasize-lines: 11

          pandas::PeriodArray arr_m(std::vector{
              "2020-01",
              "NaT",
              "2025-06"
          }, "M");

          // Year
          auto years = arr_m.year();
          auto y0 = years[0];
          if (!y0.has_value() || y0.value() != 2020) {
              std::cout << " [FAIL] : year[0] should be 2020, got " << (y0.has_value() ? std::to_string(y0.value()) : "NA") << std::endl;
              throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[0]");
          }

          auto y1 = years[1];
          if (y1.has_value()) {
              std::cout << " [FAIL] : year[1] should be NA (NaT)" << std::endl;
              throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[1] should be NA");
          }

          auto y2 = years[2];

.. _example-sparsearray-copy-17:

.. dropdown:: copy (pd_test_1_all.cpp:5798)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 5788
      :emphasize-lines: 11

      // ============================================================================
      // Copy/Rename Tests
      // ============================================================================

      void pd_test_categorical_index_copy() {
          std::cout << "========= copy ========================================";

          pandas::CategoricalArray arr({"a", "b", "c"});
          pandas::CategoricalIndex idx(arr, "original");

          pandas::CategoricalIndex copied = idx.copy();

          bool passed = (copied.size() == idx.size() && copied.name() == idx.name() &&
                         copied.categories() == idx.categories() && copied.ordered() == idx.ordered());
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_copy()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_copy failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-sparsearray-is_fill-18:

.. dropdown:: is_fill (pd_test_1_all.cpp:3314)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 3304
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_sparse_array_element_access failed: arr[3] != 10.0");
          }

          // Access fill values
          if (arr[0] != 0.0) {
              std::cout << " [FAIL] : in pd_test_sparse_array_element_access() : arr[0] != fill_value" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_element_access failed: arr[0] != fill_value");
          }

          // Test is_fill
          if (!arr.is_fill(0)) {
              std::cout << " [FAIL] : in pd_test_sparse_array_element_access() : is_fill(0) should be true" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_element_access failed: is_fill(0) should be true");
          }

          if (arr.is_fill(1)) {
              std::cout << " [FAIL] : in pd_test_sparse_array_element_access() : is_fill(1) should be false" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_element_access failed: is_fill(1) should be false");
          }

          std::cout << " -> tests passed" << std::endl;

.. _example-sparsearray-all-19:

.. dropdown:: all (pd_test_1_all.cpp:247)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 237
      :emphasize-lines: 11

          pandas::BooleanArray has_true({
              std::optional(false),
              std::optional(true)
          });
          any_result = has_true.any();
          if (!any_result.has_value() || !any_result.value()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() with True" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: any() with True");
          }

          // Test all()
          pandas::BooleanArray all_true({
              std::optional(true),
              std::optional(true)
          });
          auto all_result = all_true.all();
          if (!all_result.has_value() || !all_result.value()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : all() of all True" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: all() all True");
          }

.. _example-sparsearray-any-20:

.. dropdown:: any (pd_test_1_all.cpp:226)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 216
      :emphasize-lines: 11

              std::cout << " [FAIL] : in pd_test_boolean_array_kleene_not() : ~NA should be NA" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_kleene_not failed: ~NA");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_reductions() {
          std::cout << "========= BooleanArray: reductions ======================= ";

          // Test any()
          pandas::BooleanArray all_false({
              std::optional(false),
              std::optional(false)
          });
          auto any_result = all_false.any();
          if (!any_result.has_value() || any_result.value()) {
              std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() of all False" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_reductions failed: any() all False");
          }

.. _example-sparsearray-density-21:

.. dropdown:: density (pd_test_1_all.cpp:3247)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3237
      :emphasize-lines: 11

              std::cout << " [FAIL] : in pd_test_sparse_array_fill_value_property() : default float fill_value should be NaN" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_fill_value_property failed: default float fill_value should be NaN");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_sparse_array_density() {
          std::cout << "========= SparseArray: density ======================= ";

          // 20% density (2 non-fill out of 10)
          std::vector data = {0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0};
          pandas::SparseArray arr(data, 0.0);

          double density = arr.density();
          if (std::abs(density - 0.2) > 0.001) {
              std::cout << " [FAIL] : in pd_test_sparse_array_density() : density != 0.2, got " << density << std::endl;
              throw std::runtime_error("pd_test_sparse_array_density failed: density != 0.2");
          }

          double sparsity = arr.sparsity();

.. _example-sparsearray-dtype-22:

.. dropdown:: dtype (pd_test_1_all.cpp:295)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 285
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

          pandas::BooleanArray arr;
          if (arr.dtype().name() != "boolean") {
              std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype name should be 'boolean'" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype name");
          }

          if (arr.dtype().kind() != "b") {
              std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype kind should be 'b'" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype kind");
          }

          std::cout << " -> tests passed" << std::endl;

.. _example-sparsearray-empty-23:

.. dropdown:: empty (pd_test_1_all.cpp:941)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 931
      :emphasize-lines: 11

      #include "../pandas/pd_config.h"

      namespace dataframe_tests {
      namespace dataframe_tests_config {

      void pd_test_config_version() {
          std::cout << "========= df_config: version info ======================= ";

          const char* version = pandas::DataFrameInfo::version();

          if (version == nullptr || std::string(version).empty()) {
              std::cout << "[FAIL] : in pd_test_config_version() : version is null or empty" << std::endl;
              throw std::runtime_error("pd_test_config_version failed: version is null or empty");
          }
          std::cout << "-> tests passed" << std::endl;
      }

      void pd_test_config_na_repr() {
          std::cout << "========= df_config: NA representation ======================= ";
          const char* na_repr = pandas::DataFrameConfig::get_na_repr();
          if (na_repr == nullptr) {

.. _example-sparsearray-fill_value-24:

.. dropdown:: fill_value (pd_test_1_all.cpp:3229)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 3219
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_sparse_array_fill_value_property() {
          std::cout << "========= SparseArray: fill_value property ======================= ";

          std::vector data = {-1, 5, -1, 10, -1};

          pandas::SparseArray arr(data, static_cast(-1));

          if (arr.fill_value() != -1) {
              std::cout << " [FAIL] : in pd_test_sparse_array_fill_value_property() : fill_value != -1" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_fill_value_property failed: fill_value != -1");
          }

          // Test default fill_value for float (NaN)
          pandas::SparseArray arr_float;
          if (!std::isnan(arr_float.fill_value())) {
              std::cout << " [FAIL] : in pd_test_sparse_array_fill_value_property() : default float fill_value should be NaN" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_fill_value_property failed: default float fill_value should be NaN");
          }

.. _example-sparsearray-nbytes-25:

.. dropdown:: nbytes (pd_test_1_all.cpp:6214)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6204
      :emphasize-lines: 11

          }

          // Test empty DataFrame
          pandas::DataFrame empty_df;
          if (!empty_df.empty()) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : should be empty" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: should be empty");
          }

          // Test nbytes > 0 for non-empty
          if (df.nbytes() == 0) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : nbytes should be > 0" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: nbytes should be > 0");
          }

          // Test columns index
          if (df.columns().size() != 3) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : columns size != 3" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: columns size != 3");
          }

.. _example-sparsearray-ndim-26:

.. dropdown:: ndim (pd_test_1_all.cpp:6195)
   :class-title: example-dropdown

   ..
   .. code-block:: cpp
      :linenos:
      :lineno-start: 6185
      :emphasize-lines: 11

          pandas::DataFrame df(data);

          // Test shape
          auto shape = df.shape();
          if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch");
          }

          // Test ndim
          if (df.ndim() != 2) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2");
          }

          // Test empty
          if (df.empty()) {
              std::cout << " [FAIL] : in pd_test_dataframe_properties() : should not be empty" << std::endl;
              throw std::runtime_error("pd_test_dataframe_properties failed: should not be empty");
          }

.. _example-sparsearray-npoints-27:

.. dropdown:: npoints (pd_test_1_all.cpp:3171)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 3161
      :emphasize-lines: 11

              dense.setElementAt({i}, (i == 3 || i == 7) ? 5.0 : 0.0);
          }

          auto sparse = pandas::SparseArray::from_dense(dense, 0.0);

          if (sparse.size() != 10) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : size != 10" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: size != 10");
          }

          if (sparse.npoints() != 2) {
              std::cout << " [FAIL] : in pd_test_sparse_array_from_dense() : npoints != 2" << std::endl;
              throw std::runtime_error("pd_test_sparse_array_from_dense failed: npoints != 2");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_sparse_array_sp_values_property() {
          std::cout << "========= SparseArray: sp_values property ======================= ";

.. _example-sparsearray-repr-28:

.. dropdown:: repr (pd_test_1_all.cpp:10906)
   :class-title: example-dropdown

   ..
code-block:: cpp :linenos: :lineno-start: 10896 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_repr() { std::cout << "========= repr ========================="; pandas::CategoricalArray arr({"a", "b", "c"}); // Use ExtensionIndex directly to test base class repr pandas::ExtensionIndex idx(arr, "test"); std::string repr_str = idx.repr(); bool passed = (!repr_str.empty() && repr_str.find("ExtensionIndex") != std::string::npos); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_repr() : repr check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_repr failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-sparsearray-shape-29: .. dropdown:: shape (pd_test_1_all.cpp:6188) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 6178 :emphasize-lines: 11 std::cout << "========= properties ======================="; std::map> data; data["A"] = {1.0, 2.0, 3.0, 4.0}; data["B"] = {5.0, 6.0, 7.0, 8.0}; data["C"] = {9.0, 10.0, 11.0, 12.0}; pandas::DataFrame df(data); // Test shape auto shape = df.shape(); if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch"); } // Test ndim if (df.ndim() != 2) { std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl; throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2"); } .. _example-sparsearray-size-30: .. dropdown:: size (pd_test_1_all.cpp:22) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 12 :emphasize-lines: 11 #include "../pandas/pd_boolean_array.h" namespace dataframe_tests { namespace dataframe_tests_boolean_array { void pd_test_boolean_array_constructors() { std::cout << "========= BooleanArray: constructors ======================= "; // Default constructor pandas::BooleanArray arr1; if (arr1.size() != 0) { std::cout << " [FAIL] : in pd_test_boolean_array_constructors() : default constructor size != 0" << std::endl; throw std::runtime_error("pd_test_boolean_array_constructors failed: default constructor size != 0"); } // Initializer list constructor pandas::BooleanArray arr2({ std::optional(true), std::optional(false), std::nullopt, std::optional(true) .. _example-sparsearray-sp_index-31: .. dropdown:: sp_index (pd_test_1_all.cpp:3207) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3197 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_sparse_array_sp_index_property() { std::cout << "========= SparseArray: sp_index property ======================= "; std::vector data = {0.0, 1.0, 0.0, 2.0, 0.0, 3.0}; pandas::SparseArray arr(data, 0.0); const auto& sp_idx = arr.sp_index(); if (sp_idx.getSize() != 3) { std::cout << " [FAIL] : in pd_test_sparse_array_sp_index_property() : sp_index size != 3" << std::endl; throw std::runtime_error("pd_test_sparse_array_sp_index_property failed: sp_index size != 3"); } if (sp_idx.getElementAt({0}) != 1 || sp_idx.getElementAt({1}) != 3 || sp_idx.getElementAt({2}) != 5) { std::cout << " [FAIL] : in pd_test_sparse_array_sp_index_property() : sp_index content mismatch" << std::endl; throw std::runtime_error("pd_test_sparse_array_sp_index_property failed: sp_index content mismatch"); .. _example-sparsearray-sp_values-32: .. dropdown:: sp_values (pd_test_1_all.cpp:3185) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 3175 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_sparse_array_sp_values_property() { std::cout << "========= SparseArray: sp_values property ======================= "; std::vector data = {0.0, 1.0, 0.0, 2.0, 0.0, 3.0}; pandas::SparseArray arr(data, 0.0); const auto& sp_vals = arr.sp_values(); if (sp_vals.getSize() != 3) { std::cout << " [FAIL] : in pd_test_sparse_array_sp_values_property() : sp_values size != 3" << std::endl; throw std::runtime_error("pd_test_sparse_array_sp_values_property failed: sp_values size != 3"); } if (sp_vals.getElementAt({0}) != 1.0 || sp_vals.getElementAt({1}) != 2.0 || sp_vals.getElementAt({2}) != 3.0) { std::cout << " [FAIL] : in pd_test_sparse_array_sp_values_property() : sp_values content mismatch" << std::endl; throw std::runtime_error("pd_test_sparse_array_sp_values_property failed: sp_values content mismatch"); .. _example-sparsearray-sparsity-33: .. dropdown:: sparsity (pd_test_1_all.cpp:3257) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 3247 :emphasize-lines: 11 // 20% density (2 non-fill out of 10) std::vector data = {0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0}; pandas::SparseArray arr(data, 0.0); double density = arr.density(); if (std::abs(density - 0.2) > 0.001) { std::cout << " [FAIL] : in pd_test_sparse_array_density() : density != 0.2, got " << density << std::endl; throw std::runtime_error("pd_test_sparse_array_density failed: density != 0.2"); } double sparsity = arr.sparsity(); if (std::abs(sparsity - 0.8) > 0.001) { std::cout << " [FAIL] : in pd_test_sparse_array_density() : sparsity != 0.8, got " << sparsity << std::endl; throw std::runtime_error("pd_test_sparse_array_density failed: sparsity != 0.8"); } std::cout << " -> tests passed" << std::endl; } void pd_test_sparse_array_to_dense() { std::cout << "========= SparseArray: to_dense ======================= ";