# Python Optimization Tutorial | Marketing Budget Allocation

The overall goal is we were trying to maximize sales through understanding of our the total channel contribution mix based on our budget constraints. Let take a look at the process.

## Why is This an Important Technique?

Its completely data driven as opposed to simple guessing techniques. This approach can lead to improved targeting, increased brand awareness, higher customer engagement, and ultimately, higher sales and revenue. Moreover, by using Python to perform these analyses, businesses can automate and scale their data analytics and decision-making processes, and stay competitive in a rapidly changing market. This can be done by deploying this whole process in a cloud.

## What are the Steps to Building Our Optimization Model?

**Linear Model**: We create a Pandas DataFrame from the advertising data and fits a linear regression model to the data to obtain the coefficients and intercept.**Optimization Model**: The code defines the objective function and constraint function for the optimization problem, and uses the`minimize()`

function from the`scipy.optimize`

library to solve the optimization problem and find the ideal allocation of the budget across the channels that maximizes the total sales. The code calculates the actual percentage for each channel contribution using the coefficients and the optimized percentages and interprets the results in light of the data and the optimization problem.

Download the dataset by clicking Here

You can find the dataset here: Where to Find Data and select Marketing Channels

## Coding the Budget Optimization Model

```
#Load in the the essentials
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error,r2_score
from scipy.optimize import minimize
```

### Load and Visualize the data

```
df =pd.read_csv('Advertising.csv')
df.head()
```

```
# create subplots to look at the linearity of the data with the target variable
fig, axes = plt.subplots(nrows=1,ncols=3,figsize=(12,4))
for i,channel in enumerate(['TV','Radio','Newspaper']):
axes[i] = sns.regplot(data=df,x=channel,y='Sales',ax=axes[i])
```

The resulting plot will show three subplots, each depicting the relationship between Sales and one of the three advertising channels: TV, Radio, and Newspaper. The regression lines will show the trend and strength of the linear relationship between the advertising channel and sales, while the scatter plot points will represent the individual observations. Note that these observation to not predict which variable will be the most impact in a linear model. The models will take into account the interaction between the variables which might affect the coefficetn.

Insights that could be gained from this visualization include:

- The relative importance of each advertising channel in driving sales
- The linearity and strength of the relationship between each advertising channel and sales
- Whether there are any outliers or non-linear relationships that may warrant further investigation.

```
# lets check the correlation between the variables
sns.heatmap(df.corr(),annot=True)
```

We can see that the variables are correlated with each other. The coefficients from the linear regression model should be able to capture these relationship.

### Build the Linear Regression Model

```
#Define your independent and target variables
X = df[['TV', 'Radio', 'Newspaper']]
y = df['Sales']
# split the data and train the model
X_train,X_test, y_train, y_test = train_test_split(X,y)
model = LinearRegression()
model.fit(X_train, y_train)
#get the coefficients, predictions, and intercept
predictions = model.predict(X_test)
coefficients = model.coef_
intercept = model.intercept_
#create a dataframe
pd.DataFrame(zip(X.columns,coefficients),columns=['channel','coefficients'])
```

### Create the Optimization Model

The following code performs an optimization to find the ideal allocation of a budget across three advertising channels (TV, radio, and newspaper) that maximizes the total sales. It defines the objective function as the negative of the total sales, and the constraint function as the remaining budget after subtracting the total investment in the channels.

The optimization is performed using the `minimize()`

function from the `scipy.optimize`

library, which takes the objective function, the initial guess, the bounds on the allocation of the budget, and the constraint function as inputs. The code then calculates the actual percentage for each channel contribution using the coefficients and the optimized percentages, and interprets the results in light of the data and the optimization problem. The initial guess for the model is that there are equal contribution across 3 channels for 1/3 or 33.33% at a budget of $60,000.

```
# define the objective function for optimization
def objective(x):
return -1 * (coefficients[0] * x[0] + coefficients[1] * x[1] + coefficients[2] * x[2] + intercept)
# define the constraint function for optimization
def constraint(x):
return 60000 - (coefficients[0] * x[0] + coefficients[1] * x[1] + coefficients[2] * x[2] + intercept)
# set the initial guess for the optimization
x0 = np.array([1/3, 1/3, 1/3])
# perform the optimization
solution = minimize(objective, x0, bounds=[(0, np.inf)]*3, constraints=[{'type': 'eq', 'fun': constraint}])
```

After finding the optimal allocation of the budget across the three advertising channels that maximizes total sales, the code calculates the actual percentage for each channel contribution using the coefficients and the optimized percentages. It first calculates the total sales, then computes the percentage of the total sales that can be attributed to each channel by multiplying the corresponding coefficient and the optimized percentage, and dividing the result by the total sales. Finally, the code prints the results, including the coefficients, intercept, the ideal channel contribution percentage, and the actual percentage for each channel contribution.

```
# calculate the actual percentage for each channel contribution
total_sales = -(solution.fun)
tv_percentage = coefficients[0] * solution.x[0] / total_sales
radio_percentage = coefficients[1] * solution.x[1] / total_sales
newspaper_percentage = coefficients[2] * solution.x[2] / total_sales
# print the results
print('Coefficients:', coefficients)
print('Intercept:', intercept)
print('Ideal channel contribution percentage:', solution.x)
print('TV percentage:', tv_percentage)
print('Radio percentage:', radio_percentage)
print('Newspaper percentage:', newspaper_percentage)
```

```
#create pie chart
labels =X.columns
explode = [0.2, 0.2, 0.1]
plt.pie(channel_split['percentage'], labels = labels, autopct='%.0f%%',explode=explode)
```