pandasCore C++ Documentation#

Welcome to the pandasCore C++ API documentation. This library provides a NumPy/pandas-compatible C++ implementation for high-performance data analysis.

Overview#

pandasCore C++ is a header-only library that provides:

  • DataFrame: Two-dimensional labeled data structure with columns of potentially different types

  • Series: One-dimensional labeled array capable of holding any data type

  • Index types: RangeIndex, DatetimeIndex, TimedeltaIndex, PeriodIndex, MultiIndex, etc.

  • Window operations: Rolling, Expanding, and EWM (exponentially weighted) windows for time series analysis

  • GroupBy: Split-apply-combine operations

  • I/O: CSV, JSON, Parquet, Excel support
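To make two of the listed features concrete, here is a minimal standard-library sketch of what a rolling mean and a split-apply-combine sum compute. It does not use pandasCore and does not show how the library implements them. The function names `rolling_mean` and `group_sum` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: mean over a trailing window of fixed size.
// Positions that do not yet have `window` values stay empty, which plays
// the role of the NaN padding pandas shows for incomplete windows.
std::vector<std::optional<double>> rolling_mean(const std::vector<double>& values,
                                                std::size_t window) {
    std::vector<std::optional<double>> out(values.size());
    if (window == 0) return out;

    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sum += values[i];                            // value entering the window
        if (i >= window) sum -= values[i - window];  // value leaving the window
        if (i + 1 >= window) out[i] = sum / static_cast<double>(window);
    }
    return out;
}

// Hypothetical helper: split rows by key, apply a sum to each group,
// and combine the per-group results into one keyed table.
std::map<std::string, double> group_sum(const std::vector<std::string>& keys,
                                        const std::vector<double>& values) {
    std::map<std::string, double> totals;
    const std::size_t n = std::min(keys.size(), values.size());
    for (std::size_t i = 0; i < n; ++i) {
        totals[keys[i]] += values[i];  // operator[] value-initializes to 0.0
    }
    return totals;
}
```

Calling `rolling_mean({1, 2, 3, 4, 5}, 3)` leaves the first two positions empty and fills the rest with the mean of the three most recent values. `group_sum` returns one total per distinct key, ordered by key because it uses `std::map`.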

Quick Start#

#include <iostream>
#include <pandas/pandas.h>
using namespace pandas;

int main() {
    // Create a DataFrame
    DataFrame df;
    df["A"] = {1, 2, 3, 4, 5};
    df["B"] = {1.1, 2.2, 3.3, 4.4, 5.5};
    df["C"] = {"a", "b", "c", "d", "e"};

    // Basic operations
    std::cout << df.head(3) << std::endl;
    std::cout << df.describe() << std::endl;

    // Statistics
    auto mean_A = df["A"].mean();
    auto std_B = df["B"].std();

    // GroupBy
    auto grouped = df.groupby("C");
    auto agg_result = grouped.sum();

    // I/O
    df.to_csv("output.csv");

    return 0;
}
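For reference, the statistics requested in the example can be reproduced by hand. The sketch below uses only the standard library and assumes that `std()` follows the pandas convention of a sample standard deviation (divisor n - 1); whether pandasCore uses the same default is an assumption, not something this page states. The names `column_mean` and `sample_std` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean, NaN for an empty column.
double column_mean(const std::vector<double>& x) {
    if (x.empty()) return std::numeric_limits<double>::quiet_NaN();
    return std::accumulate(x.begin(), x.end(), 0.0) / static_cast<double>(x.size());
}

// Hypothetical helper: sample standard deviation (divisor n - 1, the
// pandas default). Assumed, not confirmed, to match pandasCore's std().
double sample_std(const std::vector<double>& x) {
    if (x.size() < 2) return std::numeric_limits<double>::quiet_NaN();
    const double m = column_mean(x);
    double squared_deviations = 0.0;
    for (double v : x) {
        squared_deviations += (v - m) * (v - m);
    }
    return std::sqrt(squared_deviations / static_cast<double>(x.size() - 1));
}
```

For column A, `column_mean({1, 2, 3, 4, 5})` is 3. For column B, the squared deviations from the mean 3.3 sum to 12.1, so under the n - 1 convention the variance is 3.025 and the standard deviation is its square root.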

Installation#

pandasCore C++ is header-only, so the library itself has no separate build step. Include the headers you need:

#include <pandas/pd_dataframe.h>
#include <pandas/pd_series.h>
// or include everything:
#include <pandas/pandas.h>

Requirements#

  • C++20-compatible compiler (MSVC 2022, GCC 11+, Clang 14+)

  • Optional: Intel MKL for optimized linear algebra

  • Optional: Intel IPP for performance optimizations

  • Optional: Boost Math Library for statistical distributions
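A project can check these requirements at compile time before including the library. The sketch below is one possible approach using standard preprocessor facilities; the `k...` constant names are hypothetical, and it is an assumption that the optional dependencies are reachable through their usual headers (`mkl.h`, `ipp.h`, `boost/math/distributions.hpp`). It does not reflect how pandasCore itself detects them.

```cpp
// MSVC reports the active language level through _MSVC_LANG; its
// __cplusplus stays at 199711L unless /Zc:__cplusplus is passed.
#if defined(_MSVC_LANG)
constexpr long kLanguageStandard = _MSVC_LANG;
#else
constexpr long kLanguageStandard = __cplusplus;
#endif

// True when the compiler is running in C++20 mode or later.
constexpr bool kHasCpp20 = kLanguageStandard >= 202002L;

// Probe for the optional dependencies by header name (assumed locations).
#if __has_include(<mkl.h>)
constexpr bool kHaveMkl = true;
#else
constexpr bool kHaveMkl = false;
#endif

#if __has_include(<ipp.h>)
constexpr bool kHaveIpp = true;
#else
constexpr bool kHaveIpp = false;
#endif

#if __has_include(<boost/math/distributions.hpp>)
constexpr bool kHaveBoostMath = true;
#else
constexpr bool kHaveBoostMath = false;
#endif
```

Adding `static_assert(kHasCpp20, "C++20 or later is required");` turns the language check into a hard build error. The dependency flags only report whether a header is on the include path; they do not enable anything in the library.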

API Reference#

See the API Reference for the complete documentation.
