Building Your First Model
This tutorial will guide you through building a basic machine learning pipeline with PhilanthroPy and scikit-learn.
1. Installation
You can install PhilanthroPy using pip:
2. Generating Synthetic Data
Let's begin by generating some synthetic donor data. This simulates a real CRM export.
from philanthropy.datasets import generate_synthetic_donor_data
df = generate_synthetic_donor_data(n_samples=500, random_state=42)
X = df[["total_gift_amount", "years_active", "event_attendance_count"]].to_numpy()
y = df["is_major_donor"].to_numpy()
3. Creating a Basic Model
We can pass this data directly into the DonorPropensityModel.
from philanthropy.models import DonorPropensityModel
model = DonorPropensityModel(n_estimators=200, random_state=0)
model.fit(X, y)
4. Predicting Affinity
Instead of raw probabilities, the model provides an affinity score from 0-100, which is much more actionable for gift officers.
5. Using Pipelines
PhilanthroPy components drop directly into scikit-learn pipelines.
from sklearn.pipeline import Pipeline
from philanthropy.preprocessing import FiscalYearTransformer, WealthScreeningImputer, DischargeToSolicitationWindowTransformer
from philanthropy.models import DonorPropensityModel
pipe = Pipeline([
("fy", FiscalYearTransformer(date_col="gift_date")),
("wealth", WealthScreeningImputer(wealth_cols=["estimated_net_worth"])),
("window", DischargeToSolicitationWindowTransformer()),
("model", DonorPropensityModel(n_estimators=200, random_state=0)),
])
# Assuming X_train, y_train, X_test are populated with appropriate data:
# pipe.fit(X_train, y_train)
# scores = pipe.predict_proba(X_test)[:, 1]