NDFrameBase
===========

.. cpp:class:: pandas::NDFrameBase

   Core data container class in the pandas namespace.

Example
-------

.. code-block:: cpp

   #include 
   using namespace pandas;

   // NDFrameBase declares pure virtual members, so it cannot be
   // instantiated directly; use it through a reference to a concrete type.
   // (use_frame is an illustrative name, not part of the library.)
   void use_frame(const NDFrameBase& obj) {
       // ... operations ...
   }

Construction
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual std::unique_ptr create_nan_filled(size_t n) const = 0``
     - std::unique_ptr
     - pd_ndframe_base.h:477
     -

Indexing / Selection
--------------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual const Attrs& attrs() const = 0``
     - const Attrs&
     - pd_ndframe_base.h:173
     - :ref:`View `
   * - ``virtual Attrs& attrs() = 0``
     - Attrs&
     - pd_ndframe_base.h:179
     - :ref:`View `
   * - ``virtual const std::vector& get_cat_categories() const``
     - const std::vector&
     - pd_ndframe_base.h:422
     - :ref:`View `
   * - ``virtual std::string get_cat_categories_dtype() const``
     - std::string
     - pd_ndframe_base.h:446
     -
   * - ``virtual bool get_value_bool(size_t idx) const = 0``
     - bool
     - pd_ndframe_base.h:351
     - :ref:`View `
   * - ``virtual double get_value_double(size_t idx) const = 0``
     - double
     - pd_ndframe_base.h:282
     - :ref:`View `
   * - ``virtual std::string get_value_str(size_t idx) const = 0``
     - std::string
     - pd_ndframe_base.h:256
     - :ref:`View `
   * - ``virtual bool mask_at(size_t) const``
     - bool
     - pd_ndframe_base.h:145
     - :ref:`View `
   * - ``virtual void set_value_double(size_t idx, double value) = 0``
     - void
     - pd_ndframe_base.h:364
     -
   * - ``virtual void set_value_nan(size_t idx) = 0``
     - void
     - pd_ndframe_base.h:357
     - :ref:`View `
   * - ``virtual void set_value_str(size_t idx, const std::string& value)``
     - void
     - pd_ndframe_base.h:371
     -
   * - ``virtual std::unique_ptr take_indices(const std::vector& indices) const = 0``
     - std::unique_ptr
     - pd_ndframe_base.h:494
     -

Data Manipulation
-----------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual std::unique_ptr reindex_with_indexer(const numpy::NDArray& indexer) const = 0``
     - std::unique_ptr
     - pd_ndframe_base.h:502
     - :ref:`View `
   * - ``virtual std::unique_ptr reindex_with_indexer_as(const std::string& target_dtype, const numpy::NDArray& indexer) const``
     - std::unique_ptr
     - pd_ndframe_base.h:527
     -
   * - ``virtual void replace_value(double to_replace, double value) = 0``
     - void
     - pd_ndframe_base.h:344
     -
   * - ``virtual void set_index(std::unique_ptr new_index) = 0``
     - void
     - pd_ndframe_base.h:232
     - :ref:`View `

Missing Data
------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual void fillna_double(double value) = 0``
     - void
     - pd_ndframe_base.h:330
     -
   * - ``virtual void fillna_string(const std::string& value)``
     - void
     - pd_ndframe_base.h:337
     - :ref:`View `

Statistics
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual size_t count() const = 0``
     - size_t
     - pd_ndframe_base.h:317
     - :ref:`View `
   * - ``virtual int max_decimal_places() const``
     - int
     - pd_ndframe_base.h:268
     -
   * - ``double sum() const``
     - double
     - pd_ndframe_base.h:288
     - :ref:`View `

Reshaping
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``DataFrame unstack(int level = -1) const``
     - DataFrame
     - pd_ndframe_base.h:470
     - :ref:`View `

Combining
---------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual std::unique_ptr concat_with(const NDFrameBase& other) const = 0``
     - std::unique_ptr
     - pd_ndframe_base.h:487
     -

I/O
---

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual std::string to_string() const = 0``
     - std::string
     - pd_ndframe_base.h:544
     - :ref:`View `
   * - ``virtual std::vector to_string_vector() const = 0``
     - std::vector
     - pd_ndframe_base.h:301
     - :ref:`View `

Conversion
----------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual std::unique_ptr astype_dtype(const std::string& dtype_str) const``
     - std::unique_ptr
     - pd_ndframe_base.h:513
     - :ref:`View `
   * - ``void copy_frame_flags_from(const NDFrameBase& src)``
     - void
     - pd_ndframe_base.h:591
     -
   * - ``virtual void copy_value_from(size_t src_idx, size_t dst_idx) = 0``
     - void
     - pd_ndframe_base.h:384
     -

Type Checking
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``bool is_bool_dtype() const``
     - bool
     - pd_ndframe_base.h:110
     -
   * - ``virtual bool is_na_at(size_t idx) const = 0``
     - bool
     - pd_ndframe_base.h:324
     - :ref:`View `

Other Methods
-------------

.. list-table::
   :widths: 40 20 15 25
   :header-rows: 1

   * - Signature
     - Return Type
     - Location
     - Example
   * - ``virtual bool all(int axis = 0, bool bool_only = false, bool skipna = true) const = 0``
     - bool
     - pd_ndframe_base.h:397
     - :ref:`View `
   * - ``virtual bool all_values_whole_number() const``
     - bool
     - pd_ndframe_base.h:262
     - :ref:`View `
   * - ``virtual bool any(int axis = 0, bool bool_only = false, bool skipna = true) const = 0``
     - bool
     - pd_ndframe_base.h:406
     - :ref:`View `
   * - ``virtual std::vector axes() const = 0``
     - std::vector
     - pd_ndframe_base.h:245
     - :ref:`View `
   * - ``size_t cache_memory_usage() const override``
     - size_t
     - pd_ndframe_base.h:569
     -
   * - ``virtual bool cat_ordered() const``
     - bool
     - pd_ndframe_base.h:436
     - :ref:`View `
   * - ``void clear_cache() const override = 0``
     - void
     - pd_ndframe_base.h:559
     - :ref:`View `
   * - ``virtual void clear_dtype_override()``
     - void
     - pd_ndframe_base.h:124
     -
   * - ``virtual std::unique_ptr clone() const = 0``
     - std::unique_ptr
     - pd_ndframe_base.h:461
     - :ref:`View `
   * - ``virtual std::string dtype_name() const = 0``
     - std::string
     - pd_ndframe_base.h:99
     - :ref:`View `
   * - ``virtual std::string dtype_name_full() const``
     - std::string
     - pd_ndframe_base.h:105
     - :ref:`View `
   * - ``virtual bool empty() const = 0``
     - bool
     - pd_ndframe_base.h:93
     - :ref:`View `
   * - ``virtual const Flags& flags() const = 0``
     - const Flags&
     - pd_ndframe_base.h:191
     - :ref:`View `
   * - ``bool has_cached_values() const override = 0``
     - bool
     - pd_ndframe_base.h:564
     - :ref:`View `
   * - ``virtual bool has_cat_categories() const``
     - bool
     - pd_ndframe_base.h:416
     - :ref:`View `
   * - ``virtual bool has_mask() const``
     - bool
     - pd_ndframe_base.h:140
     - :ref:`View `
   * - ``virtual bool hasnans() const = 0``
     - bool
     - pd_ndframe_base.h:311
     - :ref:`View `
   * - ``virtual const IndexBase& index() const = 0``
     - const IndexBase&
     - pd_ndframe_base.h:225
     - :ref:`View `
   * - ``virtual std::optional name() const``
     - std::optional
     - pd_ndframe_base.h:209
     - :ref:`View `
   * - ``virtual size_t nbytes() const = 0``
     - size_t
     - pd_ndframe_base.h:163
     - :ref:`View `
   * - ``virtual size_t ndim() const = 0``
     - size_t
     - pd_ndframe_base.h:157
     - :ref:`View `
   * - ``virtual std::string repr() const = 0``
     - std::string
     - pd_ndframe_base.h:550
     - :ref:`View `
   * - ``virtual void set_attrs(const Attrs& attrs) = 0``
     - void
     - pd_ndframe_base.h:185
     -
   * - ``virtual void set_cat_categories(const std::vector& /*cats*/)``
     - void
     - pd_ndframe_base.h:431
     - :ref:`View `
   * - ``virtual void set_cat_categories_dtype(const std::string& /*dtype*/)``
     - void
     - pd_ndframe_base.h:451
     -
   * - ``virtual void set_cat_ordered(bool /*ordered*/)``
     - void
     - pd_ndframe_base.h:441
     - :ref:`View `
   * - ``virtual void set_dtype_override(const std::string&)``
     - void
     - pd_ndframe_base.h:119
     - :ref:`View `
   * - ``virtual void set_flags(const Flags& flags, bool copy = true, bool allows_duplicate_labels = true) = 0``
     - void
     - pd_ndframe_base.h:199
     - :ref:`View `
   * - ``virtual void set_name(const std::optional& /*name*/)``
     - void
     - pd_ndframe_base.h:215
     - :ref:`View `
   * - ``virtual void set_sparse_fill_value(double)``
     - void
     - pd_ndframe_base.h:135
     -
   * - ``void set_string_na_sentinel_disabled(bool v)``
     - void
     - pd_ndframe_base.h:583
     - :ref:`View `
   * - ``virtual std::vector shape() const = 0``
     - std::vector
     - pd_ndframe_base.h:151
     - :ref:`View `
   * - ``virtual size_t size() const = 0``
     - size_t
     - pd_ndframe_base.h:87
     - :ref:`View `
   * - ``bool string_na_sentinel_disabled() const``
     - bool
     - pd_ndframe_base.h:584
     - :ref:`View `

Code Examples
-------------

The following examples are extracted from the test suite.

.. _example-ndframebase-attrs-0:

.. dropdown:: attrs (pd_test_1_all.cpp:16361)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 16351
      :emphasize-lines: 11

      // =====================================================================
      // Series Attrs Integration Tests
      // =====================================================================

      void pd_test_ndframe_series_attrs() {
          std::cout << "========= series attrs integration =============================" << std::endl;

          pandas::Series s({1.0, 2.0, 3.0});

          // Test setting attrs on Series
          s.attrs().set("source", std::string("test_data"));
          s.attrs().set("timestamp", 1234567890);

          bool passed = s.attrs().get("source") == "test_data";
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_attrs() : set/get source" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_attrs failed: set/get source");
          }

          passed = s.attrs().get("timestamp") == 1234567890;
          if (!passed) {

.. _example-ndframebase-attrs-1:

.. dropdown:: attrs (pd_test_1_all.cpp:16361)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 16351
      :emphasize-lines: 11

      // =====================================================================
      // Series Attrs Integration Tests
      // =====================================================================

      void pd_test_ndframe_series_attrs() {
          std::cout << "========= series attrs integration =============================" << std::endl;

          pandas::Series s({1.0, 2.0, 3.0});

          // Test setting attrs on Series
          s.attrs().set("source", std::string("test_data"));
          s.attrs().set("timestamp", 1234567890);

          bool passed = s.attrs().get("source") == "test_data";
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_attrs() : set/get source" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_attrs failed: set/get source");
          }

          passed = s.attrs().get("timestamp") == 1234567890;
          if (!passed) {

.. _example-ndframebase-get_cat_categories-2:

.. dropdown:: get_cat_categories (pd_test_2_all.cpp:20374)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20364
      :emphasize-lines: 11

          auto cs = std::make_unique>(svals, "cat");
          cs->set_dtype_override("category");
          cs->set_cat_categories({"a", "b", "c"});
          cs->set_cat_ordered(true);
          df.insert(0, "cat", std::move(cs), true);

          auto s = df.get_column_as_string_series("cat");
          check(s.dtype_name() == "category", "cat dtype");
          check(s.has_cat_categories(), "cat has_categories");
          check(s.cat_ordered() == true, "cat ordered");
          auto cats = s.get_cat_categories();
          check(cats.size() == 3, "cat categories size");
          std::set cat_set(cats.begin(), cats.end());
          check(cat_set.count("a") && cat_set.count("b") && cat_set.count("c"), "cat categories content");
      }

      void pd_test_getitem_dispatch_index_propagation() {
          std::cout << "pd_test_getitem_dispatch_index_propagation" << std::endl;

          // Test DatetimeIndex freq propagation
          pandas::DataFrame df;

.. _example-ndframebase-get_value_bool-3:

.. dropdown:: get_value_bool (pd_test_5_all.cpp:35197)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 35187
      :emphasize-lines: 11

          df.add_column_nullable("X", {true, pandas::NA_BOOL, false});
          pandas_tests::check(df["X"].get_value_double(0) == 1.0, "case_2.idx0_one", local_fail);
          pandas_tests::check(std::isnan(df["X"].get_value_double(1)),
                              "case_2.idx1_nan", local_fail);
          pandas_tests::check(df["X"].get_value_double(2) == 0.0, "case_2.idx2_zero", local_fail);
      }

      void bool_nullable_826495_case_3_get_value_bool_mask_aware(int& local_fail) {
          pandas::DataFrame df;
          df.add_column_nullable("X", {true, pandas::NA_BOOL, false});
          pandas_tests::check(df["X"].get_value_bool(0) == true, "case_3.idx0_true", local_fail);
          pandas_tests::check(df["X"].get_value_bool(1) == false, "case_3.idx1_NA_false", local_fail);
          pandas_tests::check(df["X"].get_value_bool(2) == false, "case_3.idx2_false", local_fail);
      }

      void bool_nullable_826495_case_4_is_na_at_mask_aware(int& local_fail) {
          pandas::DataFrame df;
          df.add_column_nullable("X", {true, pandas::NA_BOOL, false});
pandas_tests::check(df["X"].is_na_at(0) == false, "case_4.idx0_not_na", local_fail); pandas_tests::check(df["X"].is_na_at(1) == true, "case_4.idx1_is_na", local_fail); pandas_tests::check(df["X"].is_na_at(2) == false, "case_4.idx2_not_na", local_fail); .. _example-ndframebase-get_value_double-4: .. dropdown:: get_value_double (pd_test_2_all.cpp:19160) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 19150 :emphasize-lines: 11 std::map col_funcs; col_funcs["a"] = "sum"; col_funcs["b"] = "mean"; pandas::Series result = df.agg_to_series(col_funcs); // a.sum() = 10.0, b.mean() = 25.0 check(result.size() == 2, "result_size_2"); // std::map iterates in alphabetical order: a, b check(std::abs(result.get_value_double(0) - 10.0) < 1e-9, "a_sum_10"); check(std::abs(result.get_value_double(1) - 25.0) < 1e-9, "b_mean_25"); // Check index labels check(result.index().get_value_str(0) == "a", "index_0_a"); check(result.index().get_value_str(1) == "b", "index_1_b"); } void pd_test_agg_dispatch_dict_simple_single_col() { std::cout << " -- pd_test_agg_dispatch_dict_simple_single_col --" << std::endl; .. _example-ndframebase-get_value_str-5: .. dropdown:: get_value_str (pd_test_1_all.cpp:4665) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 4655 :emphasize-lines: 11 auto corr_df = df.corr(); // Check dimensions bool passed = corr_df.nrows() == 2 && corr_df.ncols() == 2; if (!passed) { std::cout << " [FAIL] : in pd_test_aggregation_dataframe_corr() : corr should be 2x2" << std::endl; throw std::runtime_error("pd_test_aggregation_dataframe_corr failed: corr should be 2x2"); } // Diagonal should be 1.0 std::string aa = corr_df["A"].get_value_str(0); passed = std::abs(std::stod(aa) - 1.0) < 0.001; if (!passed) { std::cout << " [FAIL] : in pd_test_aggregation_dataframe_corr() : diagonal should be 1.0" << std::endl; throw std::runtime_error("pd_test_aggregation_dataframe_corr failed: diagonal should be 1.0"); } // A-B correlation should be 1.0 (perfect correlation) std::string ab = corr_df["B"].get_value_str(0); passed = std::abs(std::stod(ab) - 1.0) < 0.001; if (!passed) { .. _example-ndframebase-mask_at-6: .. dropdown:: mask_at (pd_test_3_all.cpp:27712) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 27702 :emphasize-lines: 11 fail++; } else { if (bool_s->dtype_name() != "boolean") { std::cout << " FAIL: dtype should be boolean, got " << bool_s->dtype_name() << std::endl; fail++; } if (!bool_s->has_mask()) { std::cout << " FAIL: should have mask for NA" << std::endl; fail++; } else { if (!bool_s->mask_at(2)) { std::cout << " FAIL: position 2 should be masked (NA)" << std::endl; fail++; } } } if (fail == 0) std::cout << " OK" << std::endl; } void pd_test_astype_to_string() { .. _example-ndframebase-set_value_nan-7: .. dropdown:: set_value_nan (pd_test_5_all.cpp:18478) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 18468 :emphasize-lines: 11 "0 a\n" "1 NaN\n" "2 c"; bool ok = (actual == expected); pandas_tests::check(ok, "where_mask_dtype_promotion_2_503514_case_10_str_col_where_default.to_string", local_fail); if (!ok) dump_diff("case_10", expected, actual); } void where_mask_dtype_promotion_2_503514_case_11_get_value_str_mask_int_renders_NaN(int& local_fail) { pandas::Series s({10, 20, 30}); s.set_value_nan(0); std::string actual = s.get_value_str(0); std::string expected = "NaN"; bool ok = (actual == expected); pandas_tests::check(ok, "where_mask_dtype_promotion_2_503514_case_11_get_value_str_mask_int_renders_NaN (got " + actual + ")", local_fail); bool ok1 = (s.get_value_str(1) == "20"); bool ok2 = (s.get_value_str(2) == "30"); pandas_tests::check(ok1, "case_11.kept_idx1_eq_20", local_fail); .. _example-ndframebase-reindex_with_indexer-8: .. dropdown:: reindex_with_indexer (pd_test_5_all.cpp:40388) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 40378 :emphasize-lines: 11 s.set_dtype_override("boolean"); s.set_freq(std::optional("D")); s.set_string_na_sentinel_disabled(true); // Indexer: identity over the 3 source positions. numpy::NDArray indexer(std::vector{3}); indexer.setElementAt({0}, 0); indexer.setElementAt({1}, 1); indexer.setElementAt({2}, 2); auto base = s.reindex_with_indexer(indexer); pandas_tests::check(base != nullptr, "case7.reindex_with_indexer_nonnull", local_fail); if (!base) return; auto* r = dynamic_cast*>(base.get()); pandas_tests::check(r != nullptr, "case7.reindex_with_indexer_is_Series_int64", local_fail); if (!r) return; // dtype_override propagates (oracle says yes). pandas_tests::check(r->dtype_override().has_value() && .. _example-ndframebase-set_index-9: .. dropdown:: set_index (pd_test_1_all.cpp:20318) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 20308 :emphasize-lines: 11 // Set datetime index std::vector dates = { "2020-01-01 00:00:00", "2020-01-01 12:00:00", "2020-01-02 00:00:00", "2020-01-02 12:00:00", "2020-01-03 00:00:00", "2020-01-03 12:00:00" }; df.set_index(std::make_unique>(dates)); // Resample to daily auto resampler = df.resample("D"); pandas::DataFrame result = resampler.sum(); // Check that we got aggregated results bool passed = (result.nrows() <= df.nrows()); if (!passed) { std::cout << " [FAIL] : in pd_test_timeseries_resample_basic() : resample didn't reduce rows" << std::endl; .. _example-ndframebase-fillna_string-10: .. dropdown:: fillna_string (pd_test_5_all.cpp:47965) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 47955 :emphasize-lines: 11 "NaT", "null", "", "x", ""}); auto& col = df["col"]; for (size_t r = 0; r < df.nrows(); ++r) { std::cout << tag << " [" << r << "] val=\"" << col.get_value_str(r) << "\" is_na_at=" << col.is_na_at(r) << "\n"; } // CROSS-REFERENCE: pd_series.h:1938 lists only ""/None/nan/NaN as NA // for Series; "NA"/"NaT"/"null"/"" are NOT treated // as NA by is_na_at. This interacts with the fillna bug (item #1): // fillna_string (pd_series.h:1995) shares the SAME list. } catch (const std::exception& e) { std::cout << tag << " exception: " << e.what() << "\n"; } std::cout << tag << " === end ===\n"; } static void P33_forced_object_sentinels() { const std::string tag = "[P33]"; std::cout << "\n" << tag << " === dtype='object' with 'NaT'/'null' literals (residual bug?) ===\n"; .. _example-ndframebase-count-11: .. dropdown:: count (pd_test_1_all.cpp:66) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 56 :emphasize-lines: 11 if (arr.is_na(0)) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : is_na(0) should be false" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: is_na(0) should be false"); } if (!arr.has_na()) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : has_na() should be true" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: has_na() should be true"); } if (arr.count() != 2) { std::cout << " [FAIL] : in pd_test_boolean_array_na_handling() : count() should be 2" << std::endl; throw std::runtime_error("pd_test_boolean_array_na_handling failed: count() should be 2"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_kleene_and() { std::cout << "========= BooleanArray: Kleene AND ======================= "; .. _example-ndframebase-sum-12: .. dropdown:: sum (pd_test_1_all.cpp:276) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 266 :emphasize-lines: 11 } // Test sum/mean pandas::BooleanArray arr({ std::optional(true), std::optional(false), std::optional(true), std::optional(true) }); auto s = arr.sum(); if (!s.has_value() || s.value() != 3) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : sum should be 3" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: sum"); } auto m = arr.mean(); if (!m.has_value() || std::abs(m.value() - 0.75) > 0.001) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : mean should be 0.75" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: mean"); } .. _example-ndframebase-unstack-13: .. dropdown:: unstack (pd_test_3_all.cpp:1739) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 1729 :emphasize-lines: 11 } if (s.size() != 3) { std::cout << " [FAIL] : in pd_test_3_all_chainable_mutators() : Case H size" << std::endl; throw std::runtime_error("pd_test_3_all_chainable_mutators failed: Case H size"); } std::cout << " -> tests passed" << std::endl; } void pd_test_3_all_dataframe_unstack() { std::cout << "========= DataFrame.unstack() ========================"; std::map> data = { {"A", {1.0, 2.0, 3.0}}, {"B", {4.0, 5.0, 6.0}} }; pandas::DataFrame df(data); // Without MultiIndex, unstack() returns self (matches pandas behavior) pandas::DataFrame result = df.unstack(); .. _example-ndframebase-to_string-14: .. dropdown:: to_string (pd_test_1_all.cpp:2693) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 2683 :emphasize-lines: 11 pandas::PeriodArray arr_m(std::vector{ "2020-01", "NaT", "2025-06" }, "M"); // Year auto years = arr_m.year(); auto y0 = years[0]; if (!y0.has_value() || y0.value() != 2020) { std::cout << " [FAIL] : year[0] should be 2020, got " << (y0.has_value() ? std::to_string(y0.value()) : "NA") << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[0]"); } auto y1 = years[1]; if (y1.has_value()) { std::cout << " [FAIL] : year[1] should be NA (NaT)" << std::endl; throw std::runtime_error("pd_test_period_array_year_month_quarter failed: year[1] should be NA"); } auto y2 = years[2]; .. _example-ndframebase-to_string_vector-15: .. dropdown:: to_string_vector (pd_test_1_all.cpp:10871) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 10861 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_to_string_vector() { std::cout << "========= to_string_vector ========================="; pandas::CategoricalArray arr({"a", std::nullopt, "c"}); pandas::CategoricalIndex idx(arr); auto str_vec = idx.to_string_vector(); bool passed = (str_vec.size() == 3 && str_vec[0] == "a" && str_vec[1] == "NA" && str_vec[2] == "c"); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_to_string_vector() : to_string_vector check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_to_string_vector failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-ndframebase-astype_dtype-16: .. dropdown:: astype_dtype (pd_test_5_all.cpp:43633) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 43623 :emphasize-lines: 11 "0 a\n" "1 b\n" "2 c"; check_case("dtype_extension_dt_complex_fallback_925116_case_6", df, actual, expected, "string", local_fail); } void f_dtype_extension_dt_complex_fallback_925116_case_7_series_string_astype_string_drops_override(int& local_fail) { std::cout << "-- case_7_series_string_astype_string_drops_override\n"; pandas::Series s({"a", "b", "c"}); auto r_box = s.astype_dtype("string"); auto* r = dynamic_cast*>(r_box.get()); if (r == nullptr) { pandas_tests::check(false, "case_7.astype_returned_non_string_series", local_fail); return; } pandas::DataFrame df = r->to_frame(std::optional("v")); std::string actual = df.to_string(); std::cout << " src_dtype=" << show_dtype(s) << " astype_result_dtype=" << show_dtype(*r) << "\n"; .. _example-ndframebase-is_na_at-17: .. dropdown:: is_na_at (pd_test_5_all.cpp:35205) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 35195 :emphasize-lines: 11 pandas::DataFrame df; df.add_column_nullable("X", {true, pandas::NA_BOOL, false}); pandas_tests::check(df["X"].get_value_bool(0) == true, "case_3.idx0_true", local_fail); pandas_tests::check(df["X"].get_value_bool(1) == false, "case_3.idx1_NA_false", local_fail); pandas_tests::check(df["X"].get_value_bool(2) == false, "case_3.idx2_false", local_fail); } void bool_nullable_826495_case_4_is_na_at_mask_aware(int& local_fail) { pandas::DataFrame df; df.add_column_nullable("X", {true, pandas::NA_BOOL, false}); pandas_tests::check(df["X"].is_na_at(0) == false, "case_4.idx0_not_na", local_fail); pandas_tests::check(df["X"].is_na_at(1) == true, "case_4.idx1_is_na", local_fail); pandas_tests::check(df["X"].is_na_at(2) == false, "case_4.idx2_not_na", local_fail); } void bool_nullable_826495_case_5_fillna_preserves_dtype(int& local_fail) { pandas::DataFrame df; df.add_column_nullable("X", {true, pandas::NA_BOOL, false}); pandas_tests::check(df["X"].dtype_name() == "boolean", "case_5.pre_dtype", local_fail); auto df_filled = df.fillna(1.0); pandas_tests::check(df_filled["X"].dtype_name() == "boolean", .. _example-ndframebase-all-18: .. dropdown:: all (pd_test_1_all.cpp:247) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 237 :emphasize-lines: 11 pandas::BooleanArray has_true({ std::optional(false), std::optional(true) }); any_result = has_true.any(); if (!any_result.has_value() || !any_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() with True" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: any() with True"); } // Test all() pandas::BooleanArray all_true({ std::optional(true), std::optional(true) }); auto all_result = all_true.all(); if (!all_result.has_value() || !all_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : all() of all True" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: all() all True"); } .. _example-ndframebase-all_values_whole_number-19: .. dropdown:: all_values_whole_number (pd_test_5_all.cpp:30090) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 30080 :emphasize-lines: 11 !src_map_ov.empty() ? src_map_ov : src_ser_dt; bool is_int_like = (src_effective.find("int") != std::string::npos || src_effective.find("uint") != std::string::npos); bool comb_has_col = combined.has_column(flat); bool comb_hasnans = false, comb_allwhole = false; std::string comb_dt = ""; if (comb_has_col) { const pandas::NDFrameBase& c = combined[flat]; comb_hasnans = c.hasnans(); comb_allwhole = c.all_values_whole_number(); comb_dt = c.dtype_name(); } bool would_apply = is_int_like && comb_has_col && !comb_hasnans && comb_allwhole; std::cout << tag << " flat=" << flat << " src_effective=" << (src_effective.empty() ? "" : src_effective) << " is_int_like=" << is_int_like << " comb_dt=" << comb_dt << " comb_hasnans=" << comb_hasnans << " comb_allwhole=" << comb_allwhole .. _example-ndframebase-any-20: .. dropdown:: any (pd_test_1_all.cpp:226) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 216 :emphasize-lines: 11 std::cout << " [FAIL] : in pd_test_boolean_array_kleene_not() : ~NA should be NA" << std::endl; throw std::runtime_error("pd_test_boolean_array_kleene_not failed: ~NA"); } std::cout << " -> tests passed" << std::endl; } void pd_test_boolean_array_reductions() { std::cout << "========= BooleanArray: reductions ======================= "; // Test any() pandas::BooleanArray all_false({ std::optional(false), std::optional(false) }); auto any_result = all_false.any(); if (!any_result.has_value() || any_result.value()) { std::cout << " [FAIL] : in pd_test_boolean_array_reductions() : any() of all False" << std::endl; throw std::runtime_error("pd_test_boolean_array_reductions failed: any() all False"); } .. _example-ndframebase-axes-21: .. dropdown:: axes (pd_test_1_all.cpp:16602) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 16592 :emphasize-lines: 11 // ===================================================================== // Axes Tests // ===================================================================== void pd_test_ndframe_axes() { std::cout << "========= axes =================================================" << std::endl; pandas::Series s({1.0, 2.0, 3.0}); auto axes = s.axes(); bool passed = axes.size() == 1; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_axes() : axes count" << std::endl; throw std::runtime_error("pd_test_ndframe_axes failed: axes count"); } passed = axes[0]->size() == 3; if (!passed) { std::cout << " [FAIL] : in pd_test_ndframe_axes() : axis size" << std::endl; .. _example-ndframebase-cat_ordered-22: .. dropdown:: cat_ordered (pd_test_2_all.cpp:20373) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 20363 :emphasize-lines: 11 std::vector svals = {"a", "b", "a", "c"}; auto cs = std::make_unique>(svals, "cat"); cs->set_dtype_override("category"); cs->set_cat_categories({"a", "b", "c"}); cs->set_cat_ordered(true); df.insert(0, "cat", std::move(cs), true); auto s = df.get_column_as_string_series("cat"); check(s.dtype_name() == "category", "cat dtype"); check(s.has_cat_categories(), "cat has_categories"); check(s.cat_ordered() == true, "cat ordered"); auto cats = s.get_cat_categories(); check(cats.size() == 3, "cat categories size"); std::set cat_set(cats.begin(), cats.end()); check(cat_set.count("a") && cat_set.count("b") && cat_set.count("c"), "cat categories content"); } void pd_test_getitem_dispatch_index_propagation() { std::cout << "pd_test_getitem_dispatch_index_propagation" << std::endl; // Test DatetimeIndex freq propagation .. _example-ndframebase-clear_cache-23: .. dropdown:: clear_cache (pd_test_1_all.cpp:19413) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 19403 :emphasize-lines: 11 s.mean(); s.min(); s.max(); passed = s.has_cached_values() == true; if (!passed) { std::cout << " [FAIL] : in pd_test_series_cache() : cache not populated" << std::endl; throw std::runtime_error("pd_test_series_cache failed: cache not populated"); } s.clear_cache(); passed = s.has_cached_values() == false; if (!passed) { std::cout << " [FAIL] : in pd_test_series_cache() : cache not cleared" << std::endl; throw std::runtime_error("pd_test_series_cache failed: cache not cleared"); } std::cout << " -> tests passed" << std::endl; } void pd_test_series_string_repr() { .. _example-ndframebase-clone-24: .. dropdown:: clone (pd_test_1_all.cpp:5776) :class-title: example-dropdown .. 
code-block:: cpp :linenos: :lineno-start: 5766 :emphasize-lines: 11 std::cout << " -> tests passed" << std::endl; } void pd_test_categorical_index_clone() { std::cout << "========= clone ======================================="; pandas::CategoricalArray arr({"p", "q", "r"}); pandas::CategoricalIndex idx(arr, "original"); std::unique_ptr cloned = idx.clone(); bool passed = (cloned != nullptr && cloned->size() == idx.size() && cloned->name() == idx.name()); if (!passed) { std::cout << " [FAIL] : in pd_test_categorical_index_clone()" << std::endl; throw std::runtime_error("pd_test_categorical_index_clone failed"); } std::cout << " -> tests passed" << std::endl; } .. _example-ndframebase-dtype_name-25: .. dropdown:: dtype_name (pd_test_1_all.cpp:10104) :class-title: example-dropdown .. code-block:: cpp :linenos: :lineno-start: 10094 :emphasize-lines: 11 } void pd_test_extension_index_array_constructor() { std::cout << "========= array constructor ========================="; pandas::CategoricalArray arr({"apple", "banana", "apple", "cherry"}); pandas::CategoricalIndex idx(arr, "fruits"); bool passed = (idx.size() == 4 && !idx.empty() && idx.name().has_value() && *idx.name() == "fruits" && idx.dtype_name() == "category"); if (!passed) { std::cout << " [FAIL] : in pd_test_extension_index_array_constructor() : array constructor check failed" << std::endl; throw std::runtime_error("pd_test_extension_index_array_constructor failed"); } std::cout << " -> tests passed" << std::endl; } void pd_test_extension_index_copy_constructor() { std::cout << "========= copy constructor ========================="; .. _example-ndframebase-dtype_name_full-26: .. dropdown:: dtype_name_full (pd_test_5_all.cpp:26384) :class-title: example-dropdown .. 
code-block:: cpp
      :linenos:
      :lineno-start: 26374
      :emphasize-lines: 11

      pandas::DataFrame df;
      df.add_column("group", {"A", "A", "B"});
      df.add_column("flag", {true, false, true});
      // Promote the column's dtype override to the PandasPython-origin sub-type.
      df.set_column_dtype("flag", "object:bool");

      // Pre-check: dtype_name strips the colon, dtype_name_full keeps it.
      pandas_tests::check(df["flag"].dtype_name() == "object",
          "b21.pre: df[flag].dtype_name()==object (got '" + df["flag"].dtype_name() + "')",
          local_fail);
      pandas_tests::check(df["flag"].dtype_name_full() == "object:bool",
          "b21.pre: df[flag].dtype_name_full()==object:bool (got '" + df["flag"].dtype_name_full() + "')",
          local_fail);

      auto gg = df.groupby("group").get_group("A");
      // FIX VERIFIED: Option 2 via iloc_rows + take_indices preserves the
      // dtype_override ("object:bool"); dtype_name() strips the colon and
      // returns "object".
      std::string gg_dt = gg["flag"].dtype_name();
      std::string gg_dt_full = gg["flag"].dtype_name_full();

.. _example-ndframebase-empty-27:

.. dropdown:: empty (pd_test_1_all.cpp:941)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 931
      :emphasize-lines: 11

      #include "../pandas/pd_config.h"

      namespace dataframe_tests {
      namespace dataframe_tests_config {

      void pd_test_config_version() {
          std::cout << "========= df_config: version info ======================= ";

          const char* version = pandas::DataFrameInfo::version();

          if (version == nullptr || std::string(version).empty()) {
              std::cout << "[FAIL] : in pd_test_config_version() : version is null or empty" << std::endl;
              throw std::runtime_error("pd_test_config_version failed: version is null or empty");
          }
          std::cout << "-> tests passed" << std::endl;
      }

      void pd_test_config_na_repr() {
          std::cout << "========= df_config: NA representation ======================= ";
          const char* na_repr = pandas::DataFrameConfig::get_na_repr();
          if (na_repr == nullptr) {

.. _example-ndframebase-flags-28:

..
dropdown:: flags (pd_test_1_all.cpp:16397)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 16387
      :emphasize-lines: 11

      // =====================================================================
      // Series Flags Integration Tests
      // =====================================================================

      void pd_test_ndframe_series_flags() {
          std::cout << "========= series flags integration =============================" << std::endl;

          pandas::Series s({1, 2, 3});

          // Test default flags
          bool passed = s.flags().allows_duplicate_labels == true;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_flags() : default allows_duplicate_labels" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_flags failed: default allows_duplicate_labels");
          }

          passed = s.flags().copy_on_write == false;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_flags() : default copy_on_write" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_flags failed: default copy_on_write");
          }

.. _example-ndframebase-has_cached_values-29:

.. dropdown:: has_cached_values (pd_test_1_all.cpp:19395)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 19385
      :emphasize-lines: 11

          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_series_cache() {
          std::cout << "========= cache management =========================================";

          pandas::Series s({1.0, 2.0, 3.0, 4.0, 5.0});

          bool passed = s.has_cached_values() == false;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_series_cache() : initial cache not empty" << std::endl;
              throw std::runtime_error("pd_test_series_cache failed: initial cache not empty");
          }

          // Trigger cache
          s.sum();
          s.mean();
          s.min();
          s.max();

.. _example-ndframebase-has_cat_categories-30:

.. dropdown:: has_cat_categories (pd_test_2_all.cpp:20372)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 20362
      :emphasize-lines: 11

          pandas::DataFrame df;
          std::vector svals = {"a", "b", "a", "c"};
          auto cs = std::make_unique>(svals, "cat");
          cs->set_dtype_override("category");
          cs->set_cat_categories({"a", "b", "c"});
          cs->set_cat_ordered(true);
          df.insert(0, "cat", std::move(cs), true);

          auto s = df.get_column_as_string_series("cat");
          check(s.dtype_name() == "category", "cat dtype");
          check(s.has_cat_categories(), "cat has_categories");
          check(s.cat_ordered() == true, "cat ordered");
          auto cats = s.get_cat_categories();
          check(cats.size() == 3, "cat categories size");
          std::set cat_set(cats.begin(), cats.end());
          check(cat_set.count("a") && cat_set.count("b") && cat_set.count("c"), "cat categories content");
      }

      void pd_test_getitem_dispatch_index_propagation() {
          std::cout << "pd_test_getitem_dispatch_index_propagation" << std::endl;

.. _example-ndframebase-has_mask-31:

.. dropdown:: has_mask (pd_test_3_all.cpp:27708)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 27698
      :emphasize-lines: 11

      auto* bool_s = dynamic_cast*>(result.get());
      if (!bool_s) {
          std::cout << " FAIL: expected Series" << std::endl;
          fail++;
      } else {
          if (bool_s->dtype_name() != "boolean") {
              std::cout << " FAIL: dtype should be boolean, got "
                        << bool_s->dtype_name() << std::endl;
              fail++;
          }
          if (!bool_s->has_mask()) {
              std::cout << " FAIL: should have mask for NA" << std::endl;
              fail++;
          } else {
              if (!bool_s->mask_at(2)) {
                  std::cout << " FAIL: position 2 should be masked (NA)" << std::endl;
                  fail++;
              }
          }
      }

.. _example-ndframebase-hasnans-32:

.. dropdown:: hasnans (pd_test_1_all.cpp:5363)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 5353
      :emphasize-lines: 11

      void pd_test_categorical_index_from_codes() {
          std::cout << "========= from_codes =================================";

          std::vector codes = {0, 1, 0, 2, -1}; // -1 = NA
          std::vector categories = {"low", "medium", "high"};

          pandas::CategoricalIndex idx = pandas::CategoricalIndex::from_codes(codes, categories, true, "level");

          bool passed = (idx.size() == 5 && idx.num_categories() == 3 &&
                         idx.ordered() && idx.name().has_value() && *idx.name() == "level" &&
                         idx.hasnans()); // has NA from code -1
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_categorical_index_from_codes()" << std::endl;
              throw std::runtime_error("pd_test_categorical_index_from_codes failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_categorical_index_simple_new() {
          std::cout << "========= _simple_new =================================";

.. _example-ndframebase-index-33:

.. dropdown:: index (pd_test_1_all.cpp:6680)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6670
      :emphasize-lines: 11

      void pd_test_dataframe_index_ops() {
          std::cout << "========= index operations =================";

          // Test set_axis (rows)
          {
              std::map> data;
              data["A"] = {1, 2, 3};
              pandas::DataFrame df(data);

              auto renamed = df.set_axis({"x", "y", "z"}, 0);
              std::string idx0 = renamed.index().get_value_str(0);
              if (idx0 != "x") {
                  std::cout << " [FAIL] : in pd_test_dataframe_index_ops() : set_axis first label should be 'x'" << std::endl;
                  throw std::runtime_error("pd_test_dataframe_index_ops failed: set_axis");
              }
          }

          // Test set_axis (columns)
          {
              std::map> data;
              data["A"] = {1, 2};

.. _example-ndframebase-name-34:

.. dropdown:: name (pd_test_1_all.cpp:295)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 285
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_boolean_array_reductions failed: mean");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_boolean_array_dtype() {
          std::cout << "========= BooleanArray: dtype ======================= ";

          pandas::BooleanArray arr;
          if (arr.dtype().name() != "boolean") {
              std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype name should be 'boolean'" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype name");
          }

          if (arr.dtype().kind() != "b") {
              std::cout << " [FAIL] : in pd_test_boolean_array_dtype() : dtype kind should be 'b'" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_dtype failed: dtype kind");
          }

          std::cout << " -> tests passed" << std::endl;

.. _example-ndframebase-nbytes-35:

.. dropdown:: nbytes (pd_test_1_all.cpp:6214)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6204
      :emphasize-lines: 11

      }

      // Test empty DataFrame
      pandas::DataFrame empty_df;
      if (!empty_df.empty()) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : should be empty" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: should be empty");
      }

      // Test nbytes > 0 for non-empty
      if (df.nbytes() == 0) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : nbytes should be > 0" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: nbytes should be > 0");
      }

      // Test columns index
      if (df.columns().size() != 3) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : columns size != 3" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: columns size != 3");
      }

.. _example-ndframebase-ndim-36:

.. dropdown:: ndim (pd_test_1_all.cpp:6195)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 6185
      :emphasize-lines: 11

      pandas::DataFrame df(data);

      // Test shape
      auto shape = df.shape();
      if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch");
      }

      // Test ndim
      if (df.ndim() != 2) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2");
      }

      // Test empty
      if (df.empty()) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : should not be empty" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: should not be empty");
      }

.. _example-ndframebase-repr-37:

.. dropdown:: repr (pd_test_1_all.cpp:10906)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 10896
      :emphasize-lines: 11

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_extension_index_repr() {
          std::cout << "========= repr =========================";

          pandas::CategoricalArray arr({"a", "b", "c"});
          // Use ExtensionIndex directly to test base class repr
          pandas::ExtensionIndex idx(arr, "test");

          std::string repr_str = idx.repr();
          bool passed = (!repr_str.empty() &&
                         repr_str.find("ExtensionIndex") != std::string::npos);
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_extension_index_repr() : repr check failed" << std::endl;
              throw std::runtime_error("pd_test_extension_index_repr failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

.. _example-ndframebase-set_cat_categories-38:

.. dropdown:: set_cat_categories (pd_test_2_all.cpp:20366)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 20356
      :emphasize-lines: 11

          check(sub.columns().get_value_str(0) == "col", "dup col0 name");
          check(sub.columns().get_value_str(1) == "col", "dup col1 name");
      }

      void pd_test_getitem_dispatch_category_metadata() {
          std::cout << "pd_test_getitem_dispatch_category_metadata" << std::endl;
          pandas::DataFrame df;
          std::vector svals = {"a", "b", "a", "c"};
          auto cs = std::make_unique>(svals, "cat");
          cs->set_dtype_override("category");
          cs->set_cat_categories({"a", "b", "c"});
          cs->set_cat_ordered(true);
          df.insert(0, "cat", std::move(cs), true);

          auto s = df.get_column_as_string_series("cat");
          check(s.dtype_name() == "category", "cat dtype");
          check(s.has_cat_categories(), "cat has_categories");
          check(s.cat_ordered() == true, "cat ordered");
          auto cats = s.get_cat_categories();
          check(cats.size() == 3, "cat categories size");
          std::set cat_set(cats.begin(), cats.end());

.. _example-ndframebase-set_cat_ordered-39:

.. dropdown:: set_cat_ordered (pd_test_2_all.cpp:20367)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20357
      :emphasize-lines: 11

          check(sub.columns().get_value_str(1) == "col", "dup col1 name");
      }

      void pd_test_getitem_dispatch_category_metadata() {
          std::cout << "pd_test_getitem_dispatch_category_metadata" << std::endl;
          pandas::DataFrame df;
          std::vector svals = {"a", "b", "a", "c"};
          auto cs = std::make_unique>(svals, "cat");
          cs->set_dtype_override("category");
          cs->set_cat_categories({"a", "b", "c"});
          cs->set_cat_ordered(true);
          df.insert(0, "cat", std::move(cs), true);

          auto s = df.get_column_as_string_series("cat");
          check(s.dtype_name() == "category", "cat dtype");
          check(s.has_cat_categories(), "cat has_categories");
          check(s.cat_ordered() == true, "cat ordered");
          auto cats = s.get_cat_categories();
          check(cats.size() == 3, "cat categories size");
          std::set cat_set(cats.begin(), cats.end());
          check(cat_set.count("a") && cat_set.count("b") && cat_set.count("c"), "cat categories content");

..
_example-ndframebase-set_dtype_override-40:

.. dropdown:: set_dtype_override (pd_test_2_all.cpp:20225)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 20215
      :emphasize-lines: 11

          std::vector vals = {1.0, 2.0, 3.0};
          df.insert(0, "A", std::make_unique>(vals, "A"), true);

          auto t = df.classify_column_access("A");
          check(t == pandas::DataFrame::ColumnAccessType::NumericColumn, "float64 -> NumericColumn");

          // int64 column
          pandas::DataFrame df2;
          std::vector ivals = {10, 20, 30};
          auto iseries = std::make_unique>(ivals, "B");
          iseries->set_dtype_override("int64");
          df2.insert(0, "B", std::move(iseries), true);
          auto t2 = df2.classify_column_access("B");
          check(t2 == pandas::DataFrame::ColumnAccessType::NumericColumn, "int64 -> NumericColumn");
      }

      void pd_test_getitem_dispatch_classify_bool() {
          std::cout << "pd_test_getitem_dispatch_classify_bool" << std::endl;
          pandas::DataFrame df;
          std::vector bvals = {true, false, true};
          df.insert(0, "flag", std::make_unique>(bvals, "flag"), true);

.. _example-ndframebase-set_flags-41:

.. dropdown:: set_flags (pd_test_1_all.cpp:16410)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 16400
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_ndframe_series_flags failed: default allows_duplicate_labels");
          }

          passed = s.flags().copy_on_write == false;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_flags() : default copy_on_write" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_flags failed: default copy_on_write");
          }

          // Test set_flags
          s.set_flags(pandas::Flags(false, true));
          passed = s.flags().allows_duplicate_labels == false;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_flags() : set allows_duplicate_labels" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_flags failed: set allows_duplicate_labels");
          }

          passed = s.flags().copy_on_write == true;
          if (!passed) {
              std::cout << " [FAIL] : in pd_test_ndframe_series_flags() : set copy_on_write" << std::endl;
              throw std::runtime_error("pd_test_ndframe_series_flags failed: set copy_on_write");

.. _example-ndframebase-set_name-42:

.. dropdown:: set_name (pd_test_1_all.cpp:11798)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 11788
      :emphasize-lines: 11

              throw std::runtime_error("pd_test_index_vector_constructor failed");
          }

          std::cout << " -> tests passed" << std::endl;
      }

      void pd_test_index_copy_constructor() {
          std::cout << "========= copy constructor ============================";

          pandas::Index idx1{1, 2, 3};
          idx1.set_name("original");
          pandas::Index idx2(idx1);

          bool passed = (idx2.size() == 3);
          passed = passed && (idx2.name().value() == "original");
          passed = passed && idx2.equals(idx1);

          if (!passed) {
              std::cout << " [FAIL] : in pd_test_index_copy_constructor() : copy failed" << std::endl;
              throw std::runtime_error("pd_test_index_copy_constructor failed");

.. _example-ndframebase-set_string_na_sentinel_disabled-43:

.. dropdown:: set_string_na_sentinel_disabled (pd_test_5_all.cpp:40315)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 40305
      :emphasize-lines: 11

              pandas_tests::check(false, "case3.mask_pos2_false", local_fail);
          }

          std::cout << " source has_mask=" << s.has_mask()
                    << " result has_mask=" << r.has_mask() << "\n";
      }

      void case_4_frame_flags_propagate(int& local_fail) {
          std::cout << "----- case_4_frame_flags_propagate -----\n";
          auto s = make_series_3({"x", "y", "z"});
          s.set_string_na_sentinel_disabled(true);
          auto r = s.reindex({"0", "1", "2"});
          pandas_tests::check(r.string_na_sentinel_disabled() == true,
              "case4.string_na_sentinel_disabled_propagates", local_fail);
          std::cout << " source flag=" << s.string_na_sentinel_disabled()
                    << " result flag=" << r.string_na_sentinel_disabled() << "\n";
      }

.. _example-ndframebase-shape-44:

.. dropdown:: shape (pd_test_1_all.cpp:6188)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 6178
      :emphasize-lines: 11

      std::cout << "========= properties =======================";

      std::map> data;
      data["A"] = {1.0, 2.0, 3.0, 4.0};
      data["B"] = {5.0, 6.0, 7.0, 8.0};
      data["C"] = {9.0, 10.0, 11.0, 12.0};

      pandas::DataFrame df(data);

      // Test shape
      auto shape = df.shape();
      if (shape.size() != 2 || shape[0] != 4 || shape[1] != 3) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : shape mismatch" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: shape mismatch");
      }

      // Test ndim
      if (df.ndim() != 2) {
          std::cout << " [FAIL] : in pd_test_dataframe_properties() : ndim != 2" << std::endl;
          throw std::runtime_error("pd_test_dataframe_properties failed: ndim != 2");
      }

.. _example-ndframebase-size-45:

.. dropdown:: size (pd_test_1_all.cpp:22)
   :class-title: example-dropdown

   ..
code-block:: cpp
      :linenos:
      :lineno-start: 12
      :emphasize-lines: 11

      #include "../pandas/pd_boolean_array.h"

      namespace dataframe_tests {
      namespace dataframe_tests_boolean_array {

      void pd_test_boolean_array_constructors() {
          std::cout << "========= BooleanArray: constructors ======================= ";

          // Default constructor
          pandas::BooleanArray arr1;
          if (arr1.size() != 0) {
              std::cout << " [FAIL] : in pd_test_boolean_array_constructors() : default constructor size != 0" << std::endl;
              throw std::runtime_error("pd_test_boolean_array_constructors failed: default constructor size != 0");
          }

          // Initializer list constructor
          pandas::BooleanArray arr2({
              std::optional(true),
              std::optional(false),
              std::nullopt,
              std::optional(true)

.. _example-ndframebase-string_na_sentinel_disabled-46:

.. dropdown:: string_na_sentinel_disabled (pd_test_5_all.cpp:40319)
   :class-title: example-dropdown

   .. code-block:: cpp
      :linenos:
      :lineno-start: 40309
      :emphasize-lines: 11

                    << " result has_mask=" << r.has_mask() << "\n";
      }

      void case_4_frame_flags_propagate(int& local_fail) {
          std::cout << "----- case_4_frame_flags_propagate -----\n";
          auto s = make_series_3({"x", "y", "z"});
          s.set_string_na_sentinel_disabled(true);
          auto r = s.reindex({"0", "1", "2"});
          pandas_tests::check(r.string_na_sentinel_disabled() == true,
              "case4.string_na_sentinel_disabled_propagates", local_fail);
          std::cout << " source flag=" << s.string_na_sentinel_disabled()
                    << " result flag=" << r.string_na_sentinel_disabled() << "\n";
      }

      void case_5_index_name_propagates(int& local_fail) {
          std::cout << "----- case_5_index_name_propagates -----\n";
          auto s = make_series_3({10, 20, 30});
          s.index_mut().set_name(std::optional("idx_name"));