Monte Carlo Simulation

Assign probability distributions to the inputs to obtain probability distributions of the outputs. This enables a much deeper understanding of model results, especially the risk of a result.

Resources

Introduction to Monte Carlo Simulations

Notes

Monte Carlo simulation is the external counterpart to internal randomness
The core model is still (probably) deterministic, but then we add randomness into the model by randomizing the inputs to the model and running it many times
Adding this randomness allows us to answer deeper questions about the problem, such as what is the chance of some outcome occurring
The process is the same as for sensitivity or external scenario analysis: run the model multiple times with different inputs. The only difference is that here the inputs are drawn randomly from distributions
We will also do some additional analysis on the results from the MC simulation

Transcript

00:02: hey everyone
00:03: nick derobertis here teaching you
00:05: financial modeling today we're going to
00:07: be doing an
00:08: introduction to monte carlo simulation
00:11: and this
00:11: is part of our new lecture segment on
00:14: the same monte carlo simulation
00:17: so we've explored in this course already
00:21: a couple other ways of exploring the
00:23: parameter space
00:24: looking at different inputs and how they
00:26: affect our model
00:28: we've already looked at sensitivity
00:30: analysis and scenario analysis
00:32: and now monte carlo simulation comes to
00:36: round out that set of possibilities
00:40: so monte carlo simulation
00:44: is unique in that it allows you to take
00:47: a deterministic model which has no
00:50: notion of randomness or
00:52: probability and be able to make
00:55: conclusions
00:56: about the chance of certain outcomes
00:59: occurring
00:59: in your model so you're able to take
01:01: this model with no probability
01:03: in it at all and add it externally
01:06: without having to change the core model
01:08: itself
01:10: um so that will give you a good
01:13: understanding of
01:15: not just what is kind of the expected
01:17: outcome from the model
01:18: but also what's the full range of
01:20: possible outcomes and
01:22: what is the chance of each of these
01:24: outcomes occurring
01:26: and certainly in finance this is an
01:28: important thing to consider
01:30: because we're always concerned about the
01:32: risk return
01:33: trade-off and we can
01:36: think of in a sense the baseline output
01:40: from our model as
01:41: being kind of the return or you know
01:44: whatever objective
01:45: that we're uh looking at evaluating in
01:48: the model
01:49: and then the chance of of getting
01:52: different
01:54: outcomes we can think of that as the
01:56: risk so
01:59: running the model without monte carlo
02:01: simulation we're kind of just looking at
02:02: one side of the risk return trade-off
02:05: and so it can be very helpful to bring
02:07: this in to fully evaluate the problem
02:11: so i think it's useful to motivate this
02:13: by an example
02:15: um so let's talk about a potential bet
02:19: that you can place um so
02:22: you have an opportunity to place a bet
02:25: for one dollar
02:26: and if you win that bet you're going to
02:28: get two dollars
02:30: but if you lose the bet then you're
02:32: going to lose 750,000
02:35: and there's no way to avoid this payment
02:38: through legal means
02:39: you're gonna have to be obligated to pay
02:41: that no matter what
02:44: and then looking at the odds of this bet
02:48: uh one in a million you lose that
02:52: 750,000 and every other time
02:55: 999,999 times out of a million
02:59: you're going to win the two dollars
03:03: and if you just go and you take the
03:05: expected value
03:06: of this bet then the expected profit is
03:10: 25 cents so just looking
03:12: at kind of the expected outcome you
03:15: should definitely take this bet
03:17: but uh you know any
03:20: reasonable person looking at this bet
03:24: would think twice and at least carefully
03:27: consider should i really take this bet
03:29: because that downside of losing 750
03:33: thousand dollars
03:35: is so severely bad that
03:38: even though it has such a low
03:40: probability
03:41: you might not want to take the bet
03:43: because you only have such a small
03:45: amount to gain in the bet
03:48: so if you make decisions 100 percent on
03:50: expected value
03:51: you would take the bet but
03:55: more than just the expected value
03:57: matters here the probabilities of
03:59: different outcomes
03:59: occurring and what possible outcomes can
04:02: occur
04:03: still matter even beyond expected value
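
The arithmetic behind the 25-cent figure can be checked with a short sketch. It assumes the two dollars is the gross payout on a win and that the one-dollar stake is subtracted afterwards; that reading is an assumption, chosen because it reproduces the expected profit quoted in the lecture. All variable names are illustrative.

```python
# Bet from the lecture: pay $1 to play, a win pays $2, a loss costs $750,000.
# Assumption: the $2 is a gross payout and the $1 stake is subtracted after.
stake = 1.0
win_payout = 2.0
loss_amount = 750_000.0

p_loss = 1 / 1_000_000   # one in a million
p_win = 1 - p_loss       # 999,999 in a million

expected_payoff = p_win * win_payout - p_loss * loss_amount
expected_profit = expected_payoff - stake

print(f"Expected profit: ${expected_profit:.2f}")
print(f"Worst case: ${-loss_amount - stake:,.0f}")
```

The expected value is positive, yet the worst case is a loss of roughly three quarters of a million dollars, which is the part a single expected-value number hides.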
04:08: so this is to some extent the
04:12: kind of concepts we're trying to get at
04:14: with monte carlo simulation
04:17: you want to see what are the different
04:18: possible outcomes from the model
04:20: and consider what that means for your
04:23: particular situation
04:27: and then as far as running monte carlo
04:29: simulation
04:30: this is a visualization of how that
04:32: looks
04:34: and you may notice that this is
04:37: uh very similar to what we had seen
04:40: for sensitivity analysis
04:43: in fact it's the same image here
04:46: describing monte carlo simulation
04:48: and that's because the process to run
04:50: each of them is almost exactly the same
04:55: so the same for external scenario
04:58: analysis
04:59: really all three techniques you follow
05:01: the same pattern
05:03: uh basically just run the model a bunch
05:04: of times passing in different inputs
05:06: each time
05:08: and associating those inputs with the
05:09: outputs and putting it all together in
05:12: some sort of analysis and visualization
05:14: at the end
05:16: um so the the difference
05:20: because the process is so similar the
05:22: difference with monte carlo simulations
05:24: is that we are assigning distributions
05:28: probability distributions
05:30: to each of the inputs and we're randomly
05:32: drawing
05:33: the values of those inputs to put into
05:36: our model
05:37: whereas with external scenario analysis
05:40: we said
05:40: what are all the inputs that make sense
05:43: for this situation
05:44: we're manually picking those values and
05:47: also with sensitivity analysis we said
05:49: we want to look at
05:50: you know investment rates between one
05:52: and five percent
05:53: and so you're manually saying well i
05:55: want to look at one two three four five
05:57: percent
05:58: um but with monte carlo simulation you
06:00: just say well the
06:01: interest rate is going to have a mean of
06:03: three it's
06:05: going to have a standard deviation of 2
06:06: percent on a normal distribution
06:08: and then each time that you run the
06:10: model you don't know what interest rate
06:12: is going to go in the model it's just
06:14: randomly picked from the distribution it
06:16: could be 5%
06:17: this time 1% the next time and so on
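
The contrast between manually chosen inputs and randomly drawn inputs might look like the following sketch. It uses `random.normalvariate` from the Python standard library with the 3% mean and 2% standard deviation mentioned above; the variable names and the fixed seed are assumptions added for illustration.

```python
import random

# Sensitivity analysis: the modeler picks the input values by hand.
sensitivity_rates = [0.01, 0.02, 0.03, 0.04, 0.05]

# Monte Carlo simulation: each input value is drawn from a distribution.
random.seed(42)  # fixed seed only so this sketch is repeatable
monte_carlo_rates = [random.normalvariate(0.03, 0.02) for _ in range(5)]

for rate in monte_carlo_rates:
    print(f"{rate:.2%}")
```

Each run without the seed would print a different set of rates, which is exactly the behavior the simulation relies on.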
06:23: and then what this allows you to do
06:26: which is unique
06:27: to monte carlo simulation is
06:30: because we have probability
06:33: distributions of the inputs
06:35: that allows you to assign a probability
06:37: distribution to the outputs
06:39: as well um so that's where
06:42: we're going to be able to get into
06:44: talking about the chance
06:46: of a certain outcome occurring uh you
06:48: know maybe it's a capital budgeting
06:50: setting and you're
06:51: looking at the the probability that
06:53: you're going to have a positive
06:55: npv uh or it could be a portfolio
06:58: setting where you're saying uh you know
07:01: there's
07:02: a 10% chance that we're gonna lose as
07:04: much
07:05: as three hundred thousand dollars uh
07:08: there are a lot of different situations
07:09: where you wanna think about the
07:11: probability
07:12: and monte carlo simulation is a nice way
07:15: to
07:16: be able to come to those conclusions
07:18: with an otherwise
07:20: deterministic model
07:23: so that's the quick intro of monte carlo
07:26: simulation
07:28: we're gonna do a different order for
07:31: uh exposing everyone to this we're going
07:33: to go
07:34: next to actually running the simulation
07:37: um and then we're gonna come back and
07:39: discuss
07:40: more formally what we're doing i think
07:43: it's easiest to learn all this by
07:45: example
07:47: because when we talk about it formally
07:49: it can sound a little bit complicated
07:51: but really it's not
07:52: and just seeing it by an example really
07:55: drives that home
07:56: so we'll come back next time to look at
07:58: that example of
07:59: running a monte carlo simulation so
08:02: thanks for listening
08:03: and see you next time

Monte Carlo Investment Returns

Notes

Running Monte Carlo simulations in Excel without the use of an add-in is complex
Running Monte Carlo simulations in Python takes just a few lines of code
If you want to add Monte Carlo simulation to an Excel model, it is easiest to use xlwings so that Python can run the simulations on your Excel model
After running the simulations, you must analyze and visualize the output
A histogram is a good choice for showing the output distribution
A table of the percentiles of the distribution and values corresponding to those percentiles is a more quantitative way to show the output distribution
If we have some specific objective or loss in mind, we can determine the probability of achieving the objective/loss

Resources

Monte Carlo Investment Returns

Transcript

00:03: hey everyone
00:04: nick derobertis here teaching you
00:05: financial modeling
00:07: today we're going to be looking at an
00:09: example of how to apply
00:11: monte carlo simulation and this is going
00:14: to be
00:14: in the context of a portfolio model
00:17: where we're looking to
00:18: allocate resources between two
00:21: assets this is part of our lecture
00:24: series on
00:24: monte carlo simulation so
00:28: we introduced the whole idea of what
00:32: monte carlo simulation
00:33: is and why we would want to go about it
00:36: and
00:37: i mentioned that we're going to look at
00:39: an example of how to do it before
00:41: getting into the more formal definition
00:44: of
00:44: what it is that we're doing i think it's
00:47: easiest to learn this by example
00:50: before we look at that example let's
00:52: just quickly talk about
00:54: how you would actually go and run this
00:57: so monte carlo simulation can be applied
01:00: to
01:01: any model it doesn't matter if it's in
01:03: python or excel
01:05: but it is definitely easier to run monte
01:08: carlo simulations
01:10: in python if you want to do a pure
01:14: excel monte carlo simulation
01:17: unless you're getting some kind of
01:19: add-in uh
01:20: then you're going to be looking at doing
01:22: some kind of
01:23: data table approach which is going to be
01:26: fairly complicated to work out
01:28: for more than two inputs
01:31: um so i generally wouldn't recommend
01:34: that
01:35: you can also go to vba but
01:38: it's generally easier to just handle
01:40: this with python
01:42: uh because in python you just do a loop
01:45: over the number of iterations each time
01:47: you draw the random inputs you run the
01:49: model
01:50: and then you collect the output so it's
01:52: really not more than a few lines of code
01:54: to
01:55: make this happen in python
01:59: so we're going to look first at the
02:01: python example
02:03: of how to do this
02:06: because it's more straightforward since
02:08: ultimately we're going to run the monte
02:10: carlo simulation in python so if the
02:12: model is already in python
02:14: it's very straightforward and then when
02:16: we go over to
02:18: running a monte carlo simulation on an
02:20: excel model
02:22: in a future video then we're going to
02:25: use
02:25: xlwings to go back and forth between
02:28: python and excel and have python
02:30: orchestrate the monte carlo simulation
02:33: on the excel
02:34: model
02:38: so the problem that we're going to look
02:40: at here
02:41: which is a good application for monte
02:43: carlo simulation
02:45: is an investment problem we have a
02:48: thousand dollars now
02:49: we need a thousand and fifty a year from
02:51: now
02:52: and we have a choice of investing in two
02:54: different assets a risk-free asset
02:57: and a stock and the risk-free asset
03:00: being that it's risk-free it always
03:02: returns the same percentage
03:04: whereas the stock the return is going to
03:08: be drawn
03:08: from a normal distribution
03:12: with a 10% average and 20% standard
03:15: deviation
03:17: so the question here is how much should
03:20: we allocate
03:21: to the risk free and to the stock
03:24: in order to maximize our chance of
03:27: meeting our objective
03:28: of having 1050 in one year
03:33: i'm gonna intentionally set up the
03:35: values of the inputs this way so that
03:37: you couldn't just put 100% in the risk
03:39: free
03:39: that is not going to be able to get you
03:41: to the goal of
03:42: a thousand and fifty dollars so you do
03:44: have to put some amount into the stock
03:47: but how much should you put into the
03:49: stock
03:51: so the way we're going to approach this
03:52: is first we're going to
03:54: build out the basic model
03:58: to get the portfolio value for given
04:02: returns
04:03: and then we're going to
04:07: get that stock return from a normal
04:10: distribution
04:11: and re-run the model for a number of
04:14: iterations
04:15: and put this all together to analyze and
04:18: visualize
04:20: and then uh because we're trying to
04:23: evaluate
04:24: what's the best weight we're going to
04:25: then repeat
04:27: this whole process for a number of
04:31: different weights
04:32: into the two assets and then pick the
04:35: one
04:35: which has the highest probability of
04:38: achieving the objective of a thousand
04:40: fifty dollars
04:43: so let's go ahead and jump over to the
04:46: jupyter notebook
04:47: example for this there's already a
04:48: completed example
04:50: on the course site that you can download
04:52: and take a look at
04:59: so this is that jupyter notebook and
05:01: here at the beginning
05:02: it again describes uh the problem
05:06: and our general approach for going about
05:09: it
05:11: so let's go ahead and jump into the code
05:13: since we've
05:14: already kind of covered this well
05:16: quickly i'll just talk about the math
05:17: going on here
05:20: so in order to get the return on the
05:24: portfolio of these two assets
05:26: we take a weighted average of the
05:28: returns on each of the assets
05:30: so it's the weight in the
05:32: risk free times the return in the risk
05:34: free
05:35: plus the weight in the stock times the
05:37: return on the stock
05:40: um but we can simplify this a little bit
05:44: to remove
05:44: one input from the model uh because we
05:47: only have two assets
05:49: the weights must sum up to one we must
05:51: be 100% invested
05:52: in total and so then the weight on the
05:55: risk free is one minus
05:56: the weight on the stock so we can
05:59: eliminate that weight on the risk free
06:01: and we have the return on the portfolio
06:04: being the risk-free
06:05: times one minus the stock weight plus
06:08: the
06:08: stock return times the weight on the
06:11: stock
06:13: so translating that over into python
06:16: code we can define some variables here
06:19: for the stock return the risk free and
06:21: the weight on the stock
06:23: and we can create this formula here
06:26: which gets us the portfolio return
06:28: so that's just taking this formula and
06:30: implementing it
06:31: in python risk-free times one minus
06:34: stock weight
06:35: uh plus the stock return times the stock
06:37: weight
06:39: and with these baseline values we get a
06:41: 6.5 percent portfolio return
06:47: so then we can take the initial value on
06:51: the portfolio
06:52: and multiply it by one plus the
06:54: portfolio return this is a one-year
06:56: model so we can multiply by one plus the
06:59: portfolio return
07:00: and get the ending value on the
07:02: portfolio
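
The math described above can be written out as follows. The symbols are assumptions introduced here for illustration: r_f and r_s are the risk-free and stock returns, w_f and w_s are the corresponding weights, r_p is the portfolio return, and V_0 and V_1 are the beginning and ending portfolio values.

```latex
\begin{aligned}
r_p &= w_f \, r_f + w_s \, r_s, \qquad w_f + w_s = 1 \\
r_p &= r_f \, (1 - w_s) + r_s \, w_s \\
V_1 &= V_0 \, (1 + r_p)
\end{aligned}
```

Substituting w_f = 1 - w_s in the first line gives the second, which is why the weight on the risk-free asset drops out as a model input.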
07:04: so this is basically our entire
07:07: core model here i wanted to keep this as
07:11: as very simple so we can focus on the
07:14: monte carlo simulation
07:16: so we can just put that all into a
07:20: function
07:22: we want to definitely have a function
07:24: that runs our core model
07:26: as we go into the monte carlo simulation
07:30: so this is just taking that logic we
07:32: worked out getting the portfolio return
07:34: and then applying that return to the
07:36: initial value on the portfolio
07:39: to get the end value and we can pass any
07:42: of these
07:43: as inputs so we can say what if the
07:47: uh stock return was instead 20%
07:50: then we would be getting even more money
07:54: so yeah here just showing trying with
07:57: some other inputs
08:00: so our core model is done we have one
08:02: function that runs the core model
08:05: and you would want to have this kind of
08:08: structure
08:09: for any model any python model that
08:11: you're going to go and do monte carlo
08:12: simulations on you want to have one
08:14: function
08:15: which can take the model inputs and
08:17: return the outputs from your model
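
A minimal sketch of such a core-model function is shown below. The function and parameter names are hypothetical, and the 3% risk-free rate and 50% stock weight are assumptions: the transcript does not state them, but together with the 10% stock return they reproduce the 6.5% baseline portfolio return mentioned above.

```python
def portfolio_end_value(stock_ret=0.10, risk_free=0.03, stock_weight=0.5,
                        initial_value=1000):
    """Return the portfolio value after one year.

    Defaults are illustrative assumptions, not values confirmed by the source.
    """
    # Weighted average of the two asset returns; the risk-free weight
    # is implied as one minus the stock weight.
    port_ret = risk_free * (1 - stock_weight) + stock_ret * stock_weight
    return initial_value * (1 + port_ret)


baseline = portfolio_end_value()
higher = portfolio_end_value(stock_ret=0.20)  # what if the stock returned 20%?

print(f"Baseline end value: {baseline:.2f}")           # 1065.00
print(f"End value with 20% stock return: {higher:.2f}")  # 1115.00
```

Keeping the whole model behind one function call is what lets the simulation code later treat it as a black box.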
08:22: so we want to get into running the
08:25: simulation now
08:27: and recall that the first step in the
08:30: monte carlo simulation process
08:32: is to randomly pick the values of our
08:36: inputs and so we've got to set it up so
08:38: that we can randomly pick those
08:40: values so we already covered in the
08:43: continuous random variable material
08:46: that we can use the python's random
08:48: library and random.normalvariate
08:51: to pull numbers from a normal
08:53: distribution
08:54: so you'll see as i've run this multiple
08:55: times i get different values
08:57: uh according to pulling random numbers
09:00: from
09:01: a distribution which has a 10 percent
09:04: mean and a 20% standard deviation
09:11: so then we can put this together
09:15: with the
09:18: function which runs our model to now
09:21: run the model along with a random
09:24: uh rate of return on the stock so
09:28: here's just that same uh logic to get
09:31: the random stock return then i'm just
09:33: printing out
09:34: what that stock return is uh before
09:38: running it through the model so you can
09:41: see each time that i run it we get
09:43: different stock returns
09:45: and we get different portfolio end values
09:47: as a result
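
A single simulation run, as described above, might be sketched like this. The model function is a hypothetical stand-in with an assumed 3% risk-free rate and 50% stock weight; only the 10% mean and 20% standard deviation of the stock return come from the lecture.

```python
import random


def portfolio_end_value(stock_ret=0.10, risk_free=0.03, stock_weight=0.5,
                        initial_value=1000):
    """Hypothetical core model: one-year portfolio end value."""
    port_ret = risk_free * (1 - stock_weight) + stock_ret * stock_weight
    return initial_value * (1 + port_ret)


# Draw one stock return from a normal distribution (mean 10%, std 20%).
stock_ret = random.normalvariate(0.10, 0.20)
print(f"Stock return: {stock_ret:.2%}")

# Run the deterministic model with the randomly drawn input.
end_value = portfolio_end_value(stock_ret=stock_ret)
print(f"Portfolio end value: {end_value:.2f}")
print(f"Met the 1050 objective: {end_value >= 1050}")
```

Re-running the cell gives a different stock return and end value each time, and some runs fall short of 1050.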
09:49: and this kind of gets into the core
09:52: of why we need to do a model like this
09:55: you can see that
09:56: we're getting quite different portfolio
09:58: values each time
09:59: and a number of them are not meeting our
10:01: goal of 1050 and so this definitely
10:04: is um something we need to carefully
10:07: consider
10:08: if we have that goal of getting a
10:10: thousand and fifty dollars in a year
10:14: so that's how you can run one simulation
10:17: at a time
10:18: now let's run as many simulations as we
10:22: want
10:23: so we basically take that same exact
10:26: thing that we just did
10:28: uh getting the random stock return and
10:31: then
10:32: running the model with that random stock
10:34: return and we just put it into a loop
10:36: a loop over the number of iterations
10:39: we set up an outputs list we append our
10:41: result to that outputs list
10:45: and that allows us to run the model as
10:48: many times
10:49: as we want or as many simulations as we
10:51: want on the model
10:54: so we can see here's three simulations
10:56: we could change it to five
10:57: and have five simulations there
11:00: so we'll just take that same logic and
11:03: just
11:04: wrap it up into a function of course you
11:06: know if you're doing this in your own
11:07: model you wouldn't want to have both
11:09: still sitting around
11:10: you would just convert this into that
11:11: but they're shown separately to
11:13: show how you can kind of build it out
11:16: um and so this is just that same logic
11:19: now in a function
11:20: format so
11:24: and this now runs with a thousand
11:26: iterations by default
11:28: and here we're just showing uh how many
11:30: results there were
11:31: and the first five results so now
11:35: this function we are running a thousand
11:38: simulations
11:39: on the model and we could change that
11:43: uh number of simulations to whatever we
11:46: want
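
The loop over iterations, wrapped into a function with 1,000 iterations by default, could look roughly like the sketch below. The function names, the keyword-argument pass-through, and the model defaults (3% risk-free rate, 50% stock weight) are assumptions for illustration rather than the notebook's actual code.

```python
import random


def portfolio_end_value(stock_ret=0.10, risk_free=0.03, stock_weight=0.5,
                        initial_value=1000):
    """Hypothetical core model: one-year portfolio end value."""
    port_ret = risk_free * (1 - stock_weight) + stock_ret * stock_weight
    return initial_value * (1 + port_ret)


def portfolio_simulations(n_iter=1000, stock_mean=0.10, stock_std=0.20,
                          **model_kwargs):
    """Run the model n_iter times, each with a random stock return."""
    outputs = []
    for _ in range(n_iter):
        # Draw the random input, run the model, collect the output.
        stock_ret = random.normalvariate(stock_mean, stock_std)
        outputs.append(portfolio_end_value(stock_ret=stock_ret, **model_kwargs))
    return outputs


results = portfolio_simulations()
print(f"There are {len(results)} results. First five:")
print(results[:5])
```

Passing `n_iter=5` instead would run five simulations, and extra keyword arguments such as `stock_weight=0.7` are forwarded to the model.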
11:50: so um you know this is where i mentioned
11:54: that uh you can run that you can run the
11:57: simulation and you get some results but
11:58: you really have to
12:00: analyze and visualize them to get any
12:02: kind of meaning out of that
12:04: so a good first
12:08: approach for the visualization is to
12:10: create a histogram
12:12: of the outputs
12:15: so i'm just going to make a data frame
12:18: and then
12:20: put those results into the data frame as
12:23: the portfolio end values
12:25: and then do a histogram over those
12:28: values
12:30: and we can now see
12:35: a distribution of the outputs the
12:36: outputs also themselves
12:38: look normal the portfolio end values we
12:41: can see that it's kind of centered
12:43: around
12:44: this is maybe somewhere around
12:47: 1050
12:50: and we can see some values are
12:53: as low as almost down to 600 more
12:56: commonly down to 800
12:59: and we're getting as high as over 1500
13:02: or over 1400 and more commonly we're
13:06: getting high values in the
13:07: 1200 to 1400 range and
13:10: you can also the kde is an alternative
13:13: to the histogram
13:15: which basically gets at the same concept
13:19: but it just tries to kind of smooth out
13:21: the curve
13:25: so another way to look at this
13:27: distribution
13:28: is to say well you know five percent
13:32: of the time we're going to uh get at
13:36: least this value and 95%
13:38: of the time we're going to get at least
13:39: this value um
13:42: so we can do that with a table
13:45: of the probabilities and the percentiles
13:49: of the distribution
13:51: so to get there first we can form the
13:55: percentiles that we want to look at here
13:56: just doing five percent increments
13:59: throughout the whole range of
14:01: percentiles
14:03: and we can use the uh quantile
14:07: method on the data frame
14:11: column in order to see that so we just
14:14: pass
14:15: these percentiles this list of
14:18: uh the percentiles to the quantile
14:21: method and with that we get uh these
14:25: results
14:26: saying that um in 95%
14:30: of cases you're going to have less than
14:34: 1231. so only in five percent of cases
14:37: do you get
14:38: at least 1231. um
14:42: and in five percent of cases you get
14:44: lower than
14:45: 900 so we're already starting to
14:49: get out what we want here we can see
14:52: you know somewhere in the 40s chance
14:56: of being lower than our objective
14:59: so there's a 50-something percent chance
15:01: of meeting our objective
15:03: um but there's a more direct way to get
15:06: at this probability of meeting the
15:08: objective
15:09: another note is you don't have to do
15:11: this on the column you can do it on the
15:13: data frame
15:14: itself um and that achieves the same
15:17: thing and then if you have
15:19: other if you have your input values in
15:21: the data frame as well then you'll see
15:23: percentiles of those as well
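
The percentile table might be produced along these lines. This is a sketch that assumes pandas is installed; the column name, the stand-in simulation results (50% stock weight, 3% risk-free rate), and the seed are illustrative assumptions. `Series.quantile` does accept a list of fractions between 0 and 1.

```python
import random

import pandas as pd

random.seed(0)  # fixed seed only so this sketch is repeatable

# Stand-in simulation results: assumed 50% stock weight, 3% risk-free rate.
results = [
    1000 * (1 + 0.03 * 0.5 + random.normalvariate(0.10, 0.20) * 0.5)
    for _ in range(1000)
]
df = pd.DataFrame({"Portfolio End Values": results})

# Percentiles from 5% to 95% in 5% increments, expressed as fractions.
percentiles = [i / 20 for i in range(1, 20)]

# One value per requested percentile, indexed by the percentile.
prob_table = df["Portfolio End Values"].quantile(percentiles)
print(prob_table)

# Calling quantile on the whole DataFrame covers every numeric column.
full_table = df.quantile(percentiles)
```

Reading the table: the value next to 0.95 is the level that 95% of simulated outcomes fall below, and the value next to 0.05 is the level that only 5% fall below.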
15:28: um so now we're getting to the
15:30: probability of achieving a certain
15:32: objective which makes a lot of sense to
15:34: evaluate in this model where our
15:37: objective is to have a thousand and
15:39: fifty and we're trying
15:40: to allocate to meet that objective and
15:43: so this is kind of the main output for
15:44: this model
15:46: now in some other models you might not
15:48: have an objective
15:49: specific objective that makes sense
15:53: in which case you don't need to do this
15:55: analysis
15:56: but it's useful in cases where you have
15:58: some kind of target number that you're
16:00: trying to achieve
16:01: and you want to evaluate the probability
16:03: of achieving that
16:06: um so unfortunately there's not like a
16:10: direct
16:10: function in pandas to do this analysis
16:13: but
16:14: here's a simple one-liner which will
16:16: accomplish that
16:17: for you um so you can feel free to
16:21: just copy paste this
16:22: into your model and change out you know
16:26: whatever
16:26: output you're looking at in the data
16:28: frame
16:31: but we'll break this down now so you can
16:34: understand
16:35: what's going on here
16:39: so the first part going on here
16:42: is we're checking which of the values
16:46: which of our simulations met the
16:48: objective
16:49: that we're trying to achieve so we can
16:52: compare
16:53: a column of data frame to a number or
16:56: whatever else
16:57: and it will give us true a column of
17:00: true false
17:01: um representing we did meet the
17:04: objective
17:06: or we didn't meet the objective and just
17:08: to see
17:09: what values that's based on um
17:12: you can see for these first four rows
17:16: indeed they're above 1050. we met the
17:18: objective
17:19: and then we get to this uh fifth row
17:22: which it was below 1050 and so we did
17:25: not meet the objective
17:29: so we we can't
17:33: immediately do math with um boolean true
17:36: false
17:37: so then we convert this into ones and
17:40: zeros
17:41: so that we can do math with it um
17:44: so now this is the same thing
17:48: uh but we've just converted it into one
17:50: means
17:52: uh that we have met the objective and
17:54: zero means we have not met the objective
17:56: so now that covers this part of the
17:58: expression
18:00: and the last part is then just taking
18:02: the average
18:03: so when we take the average
18:07: of one for yes we met the objective zero
18:11: for no we did not meet the objective
18:14: that gets us to the
18:17: probability based on these simulation
18:21: results
18:22: of meeting the objective so that'll be
18:25: explained
18:26: in greater detail in the next video
18:29: where we formally go over
18:30: everything that we're doing um but just
18:33: know for now that taking the average of
18:35: this
18:35: um one for we met the objective zero we
18:39: didn't meet the objective
18:40: will get you to an estimate of the
18:42: probability of meeting the objective
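
The three steps just described (compare, convert to ones and zeros, average) can be sketched as below, assuming pandas is installed. The five sample values are made up to mirror the walkthrough, with four rows above 1050 and one below; the column name and the use of `>=` rather than `>` are also assumptions.

```python
import pandas as pd

# Made-up results: the first four meet the objective, the fifth does not.
df = pd.DataFrame(
    {"Portfolio End Values": [1100.0, 1075.5, 1210.0, 1051.0, 990.0]}
)
objective = 1050

# Step 1: compare the column to the objective, giving True/False per row.
met = df["Portfolio End Values"] >= objective

# Step 2: convert True/False into 1/0.
as_int = met.astype(int)

# Step 3: the average of the 1s and 0s estimates the probability.
prob = as_int.mean()

# The same thing as a one-liner.
prob_one_liner = (df["Portfolio End Values"] >= objective).astype(int).mean()

print(f"Probability of meeting the objective: {prob:.0%}")
```

pandas also treats True and False as 1 and 0 when averaging, so calling `.mean()` directly on the boolean column gives the same number; the explicit conversion mainly makes the step visible.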
18:48: so now we have a few different results
18:52: and analysis and visualization of the
18:56: simulation
18:58: but we mentioned in our approach to this
19:00: that now we're going to have to evaluate
19:02: this for the different
19:03: weights that we can put into the stock
19:06: so let's go ahead and wrap this up into
19:09: functions
19:10: so that we can easily reuse that logic
19:13: across a number of different uh weights
19:15: in the stock
19:18: so here i've created functions for all
19:21: these things
19:22: that we just looked at um so
19:25: first um here you'll see i made a
19:28: function that
19:29: takes the results from the simulation
19:31: and creates a data frame
19:33: uh from them i created a function
19:37: which takes the data frame and
19:40: does the histogram
19:43: of the results and
19:47: as you go to start putting
19:49: visualizations into functions
19:51: you should get familiar with this plt
19:54: dot
19:54: show like normally when you're doing a
19:57: visualization out in a cell it doesn't
19:58: matter
19:59: but when you're running it in a function
20:02: you generally want the visualization to
20:04: show up
20:04: as soon as you run it so that all the
20:06: output can stay in order
20:08: and this plt.show is going to ensure
20:11: that
20:12: so you just import matplotlib.pyplot as
20:15: plt that's the convention
20:18: remember that matplotlib is the plotting
20:20: library under the hood for pandas
20:24: and then we just put this plt.show after
20:27: any spot that we're doing
20:28: a plot and this is a general practice
20:31: that you should follow for
20:33: any visualizations you have in functions
20:38: and then we have another function here
20:41: to produce that
20:42: table of the probabilities that creates
20:44: the percentile and then
20:45: does quantile on the data frame with
20:47: those percentiles
20:49: and we have a function which gets us the
20:51: probability
20:52: of the objective and
20:55: then another function which
20:59: puts all of this together all the
21:01: outputs
21:03: so it takes the results creates the data
21:05: frame it visualizes those results of the
21:07: histogram
21:08: it creates the probability table and it
21:10: creates the probability of achieving the
21:12: objective
21:13: and then it returns the probability
21:15: table and
21:16: probability of achieving the objective
21:18: finally
21:19: one last function which does the whole
21:22: summarization
21:23: of the analysis it calls this function
21:25: to get the
21:26: probability table and probability
21:28: objective
21:30: and the plot is already going to be
21:32: shown
21:33: the histogram is going to be shown while
21:35: we call this
21:37: and then we can
21:40: add some other output here to kind of
21:42: separate things out so a header here to
21:44: show the probability table and then
21:46: formatting
21:47: the probability table and then some
21:49: space
21:50: and then a sentence about the
21:52: probability of
21:53: meeting the objective and another space
21:57: uh when i call this you can see
22:00: now i'm taking the results from our
22:03: prior simulation
22:05: then we can see it shows the histogram
22:07: it shows the probability table
22:09: and it shows the probability of meeting
22:12: our objective
22:13: all in one function call
22:18: so next we're going to get into okay now
22:21: let's run this with the different stock
22:22: weights and
22:23: try to pick which is the most
22:27: appropriate allocation based upon the
22:29: simulation results
22:31: um but before we get there i just want
22:34: to show
22:34: one other thing we can do to format the
22:37: output
22:39: and that's that um jupyter notebooks
22:43: work uh by using html and css
22:47: and we can use that to our advantage we
22:49: can actually put html
22:51: in the notebook and display it and have
22:54: whatever kind of formatting of the
22:55: output that we want
22:57: so i'm showing you a little
23:01: snippet here we can use the ipython
23:04: library
23:05: and the display module of that library
23:09: and we can import HTML and display from
23:12: there and then when you do display
23:13: HTML then it's going to
23:17: show it's going to format as html
23:20: whatever you have in there
23:22: so you can um
23:26: create whatever kind of output you want
23:28: with html
23:30: um here and this is not
23:33: we're not teaching html in this class
23:37: but just know this h2 thing that's a
23:39: level two header
23:42: and so you can use this function you can
23:44: just copy paste this into your own model
23:47: you can change these twos to anything
23:49: from one to six
23:50: for different levels of headings and if
23:53: you just use this function
23:54: then you just now have a function which
23:58: displays a header
23:59: so you can just pass whatever string
24:01: that you want there and it's going to
24:03: show a header
24:05: that's going to be useful because i mean
24:07: you see how much output we had just from
24:09: one
24:09: run here now we're going to have a bunch
24:11: of different runs and we want to make
24:13: sure we know
24:14: what output corresponds to what
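
A header helper of the kind described might be sketched as follows. `HTML` and `display` are real names in `IPython.display`; the two function names and the split into a string builder plus a display wrapper are assumptions made here so the string part can run outside a notebook.

```python
def header_html(title, level=2):
    """Build an HTML heading string such as <h2>title</h2>."""
    if not 1 <= level <= 6:
        raise ValueError("HTML heading levels run from 1 to 6")
    return f"<h{level}>{title}</h{level}>"


def display_header(title, level=2):
    """Render the heading as formatted output in a Jupyter notebook."""
    # Imported here so header_html still works where IPython is not installed.
    from IPython.display import HTML, display

    display(HTML(header_html(title, level)))


print(header_html("Results with 50% in the stock"))
```

In a notebook, `display_header("Results with 50% in the stock")` would show a level-two heading above the output that follows it.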
24:18: so now we're coming to choosing
24:22: uh the appropriate weight and so let's
24:24: look at
24:26: uh 10% increments in the stock weight
24:28: going from 10% to 90%
24:30: um and then we're just going to do a
24:34: loop
24:34: over those weights we're going to use
24:37: that display header
24:38: functionality we just built out to
24:41: separate it out
24:42: um to whatever weight that we're looking
24:45: at
24:46: and then use our simulation function to
24:49: get the results
24:50: with whatever stock weight and then
24:53: display the results
24:54: and because we've wrapped everything
24:56: nicely up in functions
24:58: this code ultimately becomes very simple
25:01: and that's how you can kind of build
25:03: layers on layers
25:04: with python and do quite complicated
25:06: things without ever having to write
25:09: complicated code
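The loop over stock weights could look roughly like the sketch below. The function `simulate_prob_of_objective` is a hypothetical stand-in for the lecture's simulation function, and the return assumptions inside it (a $1,000 investment, normally distributed stock and bond returns with the means and standard deviations shown) are made up for illustration. A plain `print` stands in for the notebook header.

```python
import random


def simulate_prob_of_objective(stock_weight, n_iter=1000, objective=1050, seed=0):
    """Hypothetical stand-in for the lecture's simulation function.

    Assumed inputs (not from the lecture): $1,000 invested for one year,
    stock return ~ N(10%, 20%), bond return ~ N(4%, 5%).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        stock_ret = rng.gauss(0.10, 0.20)
        bond_ret = rng.gauss(0.04, 0.05)
        portfolio_ret = stock_weight * stock_ret + (1 - stock_weight) * bond_ret
        if 1000 * (1 + portfolio_ret) >= objective:
            hits += 1
    return hits / n_iter


stock_weights = [w / 10 for w in range(1, 10)]  # 10%, 20%, ..., 90%
results = {}
for weight in stock_weights:
    # In a notebook, a header display function would be called here instead.
    print(f"Results with {weight:.0%} in the stock")
    results[weight] = simulate_prob_of_objective(weight)
    print(f"  Probability of meeting objective: {results[weight]:.1%}")
```

Because the simulation is wrapped in one function, the loop body stays at three lines regardless of how involved the model itself is.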
25:10: so now we see we have the results for
25:13: all these different weights on the stock
25:17: so we can kind of look through that and
25:20: see
25:21: i mean really the most important output
25:23: here is the probability of getting the
25:24: objective
25:26: but you can also get a better
25:27: understanding of what's going on by
25:29: looking at the other
25:30: values um so when we're 10% in the stock
25:34: you can see there's not a very big range
25:37: of
25:37: the possible values uh but we also only
25:41: have a 20%
25:41: chance of meeting our objective whereas
25:44: we come all the way to 90%
25:46: in the stock and you can see we have a
25:48: much larger range here
25:50: uh in the distribution but we do have a
25:53: higher
25:54: probability of meeting that objective
25:57: then we go back to eighty percent and
25:59: the probability goes
26:00: up and seventy percent it went down a
26:04: little bit
26:05: sixty percent it was back up um so
26:08: it looks like you know between um
26:13: and by the time we get to uh 40%
26:16: it's going substantially down um so we
26:19: know that the proper range in the stock
26:21: is going to be somewhere between 50 and
26:23: 80 percent
26:24: and you can run this with additional
26:28: simulations maybe run it with 10,000
26:30: or 50,000 simulations instead of a
26:33: thousand
26:34: and that will get you better stability
26:36: in the results
26:40: but this is the main idea here of
26:43: we wanted to evaluate different weights
26:45: the probability of getting the objective
26:47: with each of those weights
26:49: and setting up the functions and
26:53: everything in a clean way
26:54: that we can repeat all this analysis for
26:58: the different weights without having to
27:00: repeat the code
27:03: so that's an overview of how we can
27:07: apply monte carlo simulation in a simple
27:10: model
27:11: um in the other videos in this segment
27:13: we'll look at applying monte carlo
27:15: simulation to an existing model
27:17: and also applying it to an excel model
27:20: so thanks for listening and see you next
27:24: time

Monte Carlo Dividend Discount Model (DDM) Lab Exercise¶

Notes¶

This is an example of applying Monte Carlo simulations to a typical model just to better understand the probability distribution of the results
Be careful: the model is only valid when the discount rate is greater than the growth rate, so some conditions in the model may be needed to handle draws where the growth rate meets or exceeds the discount rate

Transcript¶

00:03: hey everyone this is nick derobertis
00:05: teaching you financial modeling
00:06: today we're going to be going over the
00:09: lab exercise
00:10: on applying monte carlo simulation to
00:13: the dividend discount model
00:15: this is part of our lecture series on
00:17: monte carlo simulation
00:19: so we have already introduced what monte
00:23: carlo simulation is and then we went
00:25: and applied it in the context of a
00:28: portfolio model choosing between two
00:30: different assets
00:31: and now we're reaching the lab exercise
00:34: at the end of that material
00:36: and it's focused on the dividend
00:38: discount model
00:40: so the situation here is that you're
00:43: trying to value
00:44: a mature company they have stable
00:46: dividend
00:47: growth and so this is a reasonable
00:50: model to look at for valuation
00:54: and the model is defined as you see here
00:57: in the second point the price is equal
00:59: to the next dividend
01:01: over the cost of capital or
01:05: discount rate of the stock minus
01:08: the growth rate of the dividends on
01:11: stock
01:14: and i gave you the initial
01:17: values to use for the inputs so the next
01:20: dividend is going to be a dollar
01:22: uh the discount rate cost of capital
01:25: is gonna be nine percent and the growth
01:28: rate
01:29: is going to be four percent so the first
01:31: step here is just to build out the core
01:33: model
01:34: which is able to take these inputs and
01:36: produce the price
01:38: from those inputs
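As a rough sketch, the core model described here (price equals next dividend over discount rate minus growth rate) fits in a single function. The function and parameter names are illustrative choices, not the lab's required interface.

```python
def ddm_price(next_dividend, discount_rate, growth_rate):
    """Dividend discount model: P = D1 / (r - g)."""
    return next_dividend / (discount_rate - growth_rate)


# Baseline inputs from the exercise: $1 dividend, 9% discount rate, 4% growth.
price = ddm_price(1.00, 0.09, 0.04)
print(f"${price:.2f}")  # $20.00
```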
01:41: but then as the modeler you're concerned
01:44: that some of these inputs could have
01:45: been mis-estimated
01:47: maybe the growth isn't four percent
01:49: maybe the cost of capital isn't nine
01:51: percent
01:52: so how can we evaluate changing these
01:56: and understand what are the chances of
01:58: achieving different
01:59: prices uh based on the possibility that
02:02: these values could be different
02:04: that's where the monte carlo simulation
02:06: comes in
02:07: so for the level one exercise
02:10: uh we're going to take the growth rate
02:13: and now draw that from a normal
02:15: distribution with a mean of four percent
02:17: standard deviation of one percent and
02:21: run that through the simulations
02:24: and ultimately visualize and summarize
02:27: the
02:27: resulting probability distribution of
02:30: the price
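One possible shape for the level 1 simulation, using only the standard library. The number of iterations, the seed, and the choice of 5% steps for the summary table are assumptions made for this sketch.

```python
import random
import statistics


def ddm_price(next_dividend, discount_rate, growth_rate):
    """Dividend discount model: P = D1 / (r - g)."""
    return next_dividend / (discount_rate - growth_rate)


def simulate_level_1(n_iter=10_000, seed=0):
    """Draw the growth rate from N(4%, 1%) and price the stock each time."""
    rng = random.Random(seed)
    prices = []
    for _ in range(n_iter):
        growth = rng.gauss(0.04, 0.01)
        prices.append(ddm_price(1.00, 0.09, growth))
    return prices


prices = simulate_level_1()

# Summarize the distribution: 19 cut points at 5%, 10%, ..., 95%.
cut_points = statistics.quantiles(prices, n=20)
for pct, value in zip(range(5, 100, 5), cut_points):
    print(f"{pct:>3}%  ${value:,.2f}")
```

For the visualization step, the `prices` list could be passed to a histogram function in a plotting library such as matplotlib.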
02:33: and then coming to the level 2 exercise
02:35: it's going to be continuing on from the
02:37: first
02:38: but here you're just also concerned that
02:40: the cost of capital
02:42: could be misestimated so for that
02:45: we'll also be drawing from normal
02:47: distributions using a mean of nine
02:49: percent
02:50: standard deviation of two percent and
02:52: the growth is also being randomly drawn
02:55: at the same time
02:58: and then you want to run through the
03:00: simulations and
03:01: visualize and summarize the resulting
03:04: probability distribution of the price
03:08: and now you have to be careful in this
03:11: level 2
03:11: exercise that there is a condition
03:16: in the dividend discount model for it to
03:17: be valid
03:19: when we look at the dividend discount
03:20: model we have this denominator
03:23: the uh cost of capital minus the growth
03:26: rate
03:27: for the model to be valid the cost of
03:30: capital has to be greater than
03:32: the growth rate otherwise this
03:35: denominator becomes negative the price
03:37: becomes negative which is nonsensical
03:40: so that's actually an assumption of the
03:43: dividend discount model
03:44: that the cost of capital should be
03:46: greater than the growth rate
03:49: and as you're doing the level one it's
03:51: it's pretty unlikely
03:52: that that situation would occur
03:55: that the growth rate would be greater
03:57: than the cost of capital uh because
04:00: we're drawing with a mean of four
04:01: percent standard deviation of one
04:03: percent it's pretty unlikely that it's
04:04: going to hit nine percent
04:06: but then once we start also varying the
04:08: cost of capital
04:10: now if we happen to get a low cost of
04:12: capital at the same time we're getting a
04:13: high growth rate
04:15: then that would lead to this condition
04:19: or violation a violation of the
04:21: assumptions of the model
04:23: uh that the cost of capital should be
04:25: greater than the growth rate
04:28: then you'll get negative crazy prices in
04:30: your model
04:31: so what you need to do in addition to
04:35: building the base model and building the
04:37: simulation is
04:38: you need to be able to check the
04:40: simulation inputs
04:42: before passing it through the model and
04:45: you want to check
04:46: that the cost of capital is indeed
04:48: greater than the growth rate
04:50: and if not just reject that simulation
04:53: you want to draw new inputs because it's
04:55: not a valid
04:56: run of the model
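The reject-and-redraw check described above could be sketched like this. The helper name `draw_valid_inputs` and the tuple layout of the results are illustrative assumptions.

```python
import random


def ddm_price(next_dividend, discount_rate, growth_rate):
    """Dividend discount model: P = D1 / (r - g)."""
    return next_dividend / (discount_rate - growth_rate)


def draw_valid_inputs(rng):
    """Redraw both inputs until the cost of capital exceeds the growth rate."""
    while True:
        growth = rng.gauss(0.04, 0.01)
        cost_of_capital = rng.gauss(0.09, 0.02)
        if cost_of_capital > growth:
            return cost_of_capital, growth


def simulate_level_2(n_iter=10_000, seed=0):
    """Run the DDM with both inputs random, keeping inputs next to each price."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_iter):
        cost_of_capital, growth = draw_valid_inputs(rng)
        price = ddm_price(1.00, cost_of_capital, growth)
        rows.append((cost_of_capital, growth, price))
    return rows


rows = simulate_level_2()
```

Note that a draw which only barely passes the check (cost of capital just above growth) still produces a very large price, so the resulting distribution can have a long right tail.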
04:59: so that's the overview of the lab
05:02: exercise
05:03: on adding monte carlo simulation to a
05:05: dividend discount model
05:07: so thanks for listening and see you next
05:09: time

Formal Introduction to Monte Carlo Simulations¶

Notes¶

The process described here to run Monte Carlo simulations may sound very similar to the process for running sensitivity analysis, and that’s because it is. The only difference is that you randomly pick the input values from distributions with each run of the model rather than having fixed input ranges
Running the Monte Carlo simulation is not enough. You will have a bunch of outputs, but you must analyze them and visualize them to extract meaning
The main insights we can draw from analyzing a Monte Carlo simulation relate to the probabilities of certain outcomes in the model. We can also get a deeper picture of the relationships between inputs and outputs in a more complex model where that may not be clear
The probability table is the quantitative version of plotting the data on a histogram. I would generally recommend including both as the histogram allows quick understanding of the shape of the entire distribution whereas the probability table helps in quantifying the distribution
The Value at Risk (VaR) is the loss that should not be exceeded at a given confidence level, e.g. in 95% of periods the portfolio should not lose more than $1,000. The probability table can be interpreted in the same way if the outcome you are analyzing is the gain/loss
The probability of a certain outcome makes sense when you have some kind of goal in mind, then you can evaluate the probability of achieving that goal. If there is no specific goal in mind, there is no need to carry out this analysis

Transcript¶

00:03: hey everyone
00:04: this is nick derobertis teaching you
00:05: financial modeling
00:07: today we're going to be doing a formal
00:09: introduction
00:10: to monte carlo simulation and the
00:14: analysis
00:15: of the outputs this is part of our
00:17: lecture series
00:18: on monte carlo simulation
00:21: so we already ran through a general
00:24: introduction
00:25: of what monte carlo simulation is why we
00:28: might want to go about it
00:30: and then we kind of flip the structure
00:34: on its head to first go through an
00:36: example
00:37: of how to do monte carlo simulation
00:43: and i flipped it because i think it's
00:45: easier to see by example
00:47: um before getting formally introduced to
00:51: it because it can sound a little bit
00:53: complicated when you look at it formally
00:56: but really it's a fairly simple process
00:58: and seeing the example makes that clear
01:00: so if you haven't viewed the video on
01:02: the example go back and look at that
01:04: first
01:07: so we're now looking
01:10: at theoretically what is monte carlo
01:14: simulation and what is the process
01:17: and as you look at this if you've seen
01:20: the prior videos
01:21: on sensitivity analysis you'll
01:24: notice that the process here and the
01:27: setup is almost exactly the same
01:31: we have some model here we're
01:34: representing the model
01:35: mathematically as we get some output
01:39: model is some function which converts
01:40: the inputs to the output
01:43: and we have multiple different inputs
01:48: and in order to run these simulations
01:52: we first assign a probability
01:54: distribution
01:55: to each input and then
01:59: for each input we're going to randomly
02:03: pick values from their distributions
02:07: and we're going to repeat that previous
02:10: step
02:11: for n times n is our number of
02:14: iterations number of simulations
02:17: um and so then we're going to have all
02:20: the random inputs and
02:22: then you want to calculate the model run
02:25: the model
02:27: with those simulated input values
02:30: so for each simulation
02:34: you've got each random input you pass it
02:36: into the model
02:38: and you get the result then we want to
02:42: keep the inputs associated to the
02:45: outputs so you know which inputs
02:46: produced which outputs and the final
02:49: step
02:50: is to visualize and analyze the results
02:54: so basically
02:58: all these last steps are
03:02: all these last three steps are the same
03:06: as sensitivity analysis running the
03:07: model with each of the inputs
03:09: keeping the model uh the inputs
03:10: associated with the outputs
03:12: analyzing the resulting outputs um
03:17: the only thing that's different here is
03:18: this random piece so
03:20: assigning a probability distribution to
03:22: each input and we're going to randomly
03:24: pick the value of the input from the
03:26: distribution
03:27: each time that we want to run the model
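The steps above can be sketched generically as follows. Everything here is illustrative: `run_monte_carlo`, `toy_model`, and the two input distributions are hypothetical names and values chosen only to show the structure (assign a distribution to each input, draw, run the model, keep inputs with outputs).

```python
import random


def run_monte_carlo(model, input_distributions, n_iter=10_000, seed=0):
    """Run `model` n_iter times with inputs drawn from their distributions.

    `input_distributions` maps each input name to a function that takes a
    random.Random instance and returns one random draw for that input.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        # Randomly pick a value for every input.
        inputs = {name: draw(rng) for name, draw in input_distributions.items()}
        # Run the model and keep the inputs associated with the output.
        results.append({**inputs, "output": model(**inputs)})
    return results


def toy_model(x1, x2):
    """Stand-in deterministic model: any function of the inputs works."""
    return x1 + 2 * x2


distributions = {
    "x1": lambda rng: rng.gauss(10, 1),
    "x2": lambda rng: rng.uniform(0, 5),
}
results = run_monte_carlo(toy_model, distributions, n_iter=1000)
```

The list of dictionaries in `results` is what the final step, visualizing and analyzing, would start from.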
03:34: so let's say now we've run the
03:37: simulation
03:38: we've got our 10,000 different results
03:41: from the model
03:43: now what can we do with those results so
03:46: there are a few outputs
03:47: from the analysis and visualization
03:50: that we can gather
03:54: so the first category here is
03:57: probability distributions of the output
04:01: and we looked at two different ways in
04:03: the prior example
04:04: of how we can get at this
04:09: one is with a histogram over all the
04:11: results
04:12: and then the other is with a table of
04:15: the
04:15: probabilities the percentiles of the
04:18: probability distribution
04:19: um and the value
04:23: of the variable at that percentile in
04:26: the
04:26: distribution um
04:29: so in the investment returns example
04:31: that was you know saying that
04:33: 45% of uh
04:36: cases we got less than a thousand and
04:39: twenty dollars and having that
04:41: for the range of different percentiles
04:46: then we also have the probability of a
04:49: certain outcome
04:50: so this i mean it is kind of within this
04:53: idea of looking at the probability
04:55: distribution of the outputs
04:57: but it's just looking at a one
04:58: particular point on the probability
05:00: distribution
05:02: and that particular point is some
05:05: goal or objective that we care about and
05:08: so we want to
05:09: evaluate the probability of achieving
05:12: that objective or outcome
05:17: so in our investment returns model that
05:19: was what's the chance that we're gonna
05:21: get the thousand fifty dollars that we
05:22: need to satisfy our
05:24: obligation and then the last
05:27: uh main output which we haven't looked
05:30: at calculating yet and we're going to do
05:33: in the next video where we add monte
05:36: carlo simulation to the dynamic salary
05:38: retirement model
05:40: is we can look at the relationship
05:42: between the inputs and the outputs
05:46: so monte carlo simulation
05:50: we can use similarly to
05:54: sensitivity analysis where we're trying
05:55: to see what changing an input does to
05:58: the model
05:59: we can get at the same kinds of
06:00: questions with monte carlo simulation as
06:03: well trying to understand
06:04: how the inputs affect the outputs
06:07: and in order to
06:11: analyze this the two main methods that
06:14: we'll look at are
06:15: visualizing it via a scatter plot
06:19: and using a multivariate regression to
06:22: get at it quantitatively
06:27: so digging into the outcome probability
06:30: distributions
06:32: so uh you can see on the left here
06:35: uh the kind of outputs that we had from
06:38: our
06:39: investment returns example we have a
06:41: histogram
06:42: distribution of the
06:45: whole of all the outputs all the
06:48: different
06:48: portfolio end values and then we have
06:51: this probability table
06:53: as well as a quantitative way of showing
06:55: that information
06:58: these are both getting at just the
07:00: chance of having different outcomes
07:02: in your model so
07:06: uh you can visualize this with a plot and
07:08: a table and with a plot
07:10: usually use a histogram you can also use
07:12: a kde
07:14: kernel density estimation plot as the
07:16: other
07:17: potential way to visualize this uh
07:20: which just gives a smoother looking
07:22: output
07:24: um and this helps you understand
07:28: at a high level what the distribution
07:30: looks like is it basically a normal
07:32: distribution
07:33: as is the case here what's kind of the
07:36: the range of the distribution
07:38: um and just understanding at a high
07:40: level are there heavy tails
07:42: is it you know non-normal et cetera
07:46: and then the probability table helps you
07:48: quantify some of this and think about
07:50: the chance
07:50: of hitting certain values in the model
07:54: so what this probability table says is
07:57: that
07:58: in 25 percent of cases we're gonna have
08:00: less than
08:01: a thousand and twenty dollars and fifty
08:03: percent of cases we're going to have
08:04: less than
08:05: 1039 dollars and in 75 percent of cases
08:09: we're going to have less than
08:10: 1053 dollars
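A small sketch of how such a probability table could be computed from simulated outcomes with the standard library. The function name `probability_table` and the choice of the `"inclusive"` interpolation method are assumptions for illustration; other percentile conventions give slightly different values.

```python
import statistics


def probability_table(values, percentiles=(5, 25, 50, 75, 95)):
    """Map each whole-number percentile to the value below which that share falls."""
    # 99 cut points: index 0 is the 1st percentile, index 98 the 99th.
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return {pct: cut_points[pct - 1] for pct in percentiles}


# Toy data 0..100 so each percentile is easy to verify by eye.
table = probability_table(list(range(101)), percentiles=(25, 50, 75))
for pct, value in table.items():
    print(f"{pct}% of cases are below {value}")
```

With real simulation output, `values` would be the list of model results, for example the portfolio values from each run.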
08:17: and then i just wanted to note here that
08:22: the value at risk is a common
08:26: measure that's used in the industry and
08:29: the value at risk
08:30: is typically looking at a portfolio
08:34: but it evaluates at a certain
08:38: confidence level or certain probability
08:41: uh the the minimum amount
08:45: that you're going to lose so
08:48: it's saying like we're in 95 percent of
08:52: days
08:52: we're not going to lose more than a
08:54: thousand dollars
08:56: or yeah at 95 percent of days
09:00: we're going to lose less than a thousand
09:02: dollars
09:04: and those other five percent of days it
09:06: doesn't say anything about that it could
09:07: be
09:08: 10,000 could be a million loss just that
09:12: 95% of the time the loss is going to be
09:14: less than
09:15: uh a thousand dollars so this
09:18: probability table
09:20: um it actually gets at the same
09:24: concept so the probability table is
09:27: actually a more
09:28: general version of the value at risk
09:31: measure which
09:32: probability table tells you all these
09:34: different
09:35: values and is general to whatever kind
09:37: of output you want to look at
09:38: whereas value at risk is specifically
09:41: talking about some kind of loss
09:43: in your model and usually
09:48: most commonly to look at 90 95 or 99
09:51: um percentiles so if you need to
09:55: calculate the var
09:56: from a monte carlo simulation look no
10:00: further
10:00: than the probability table there are
10:03: other ways to calculate var
10:05: in other situations but it's getting at
10:09: the same concept as this probability
10:10: table
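To make the link between the probability table and VaR concrete, here is a hedged sketch that reads VaR off the percentiles of simulated gains and losses. The function name, the sign convention (VaR reported as a positive loss), and the restriction to whole-percent confidence levels are choices made for this example.

```python
import statistics


def value_at_risk(pnl, confidence=0.95):
    """Loss that is not exceeded in `confidence` share of the simulated cases.

    `pnl` holds simulated gains (positive) and losses (negative).
    """
    tail_pct = round((1 - confidence) * 100)  # e.g. 5 for 95% confidence
    if not 1 <= tail_pct <= 99:
        raise ValueError("confidence must correspond to a whole percent from 1 to 99")
    cut_points = statistics.quantiles(pnl, n=100, method="inclusive")
    # The tail percentile of P&L is a loss, so flip the sign to report it.
    return -cut_points[tail_pct - 1]


# Toy P&L outcomes from -50 to +50 in steps of 1.
var_95 = value_at_risk(list(range(-50, 51)), confidence=0.95)
print(var_95)  # 45.0
```

Here the 5th percentile of the outcomes is -45, so in 95% of cases the loss is no more than 45.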
10:14: so then coming to the probability of a
10:16: certain
10:17: outcome um
10:20: so this was where we were saying uh
10:23: what's the chance of getting a thousand
10:25: fifty dollars
10:26: in our portfolio
10:29: so in order to understand how this works
10:33: let's look at a very simple example
10:37: so you have some box here and this box
10:41: has red and blue balls on the inside
10:44: and you can't see what's inside the box
10:48: you don't know
10:49: how many red balls and how many blue
10:51: balls there are in the box
10:54: and what you want to get is an estimate
10:57: of what is the chance
10:59: when i reach in to pull out a ball that
11:01: i'm going to get a blue ball
11:02: when i do that
11:06: so what's the process you might go
11:09: through
11:09: to figure this out
11:12: well just reach in grab a ball so i get
11:15: red or blue let me write it down
11:18: put it back in take the box up mix it up
11:21: whatever so it's random
11:23: and pull another one out write down its
11:25: color and just keep doing the same thing
11:27: a thousand times
11:31: and so you get a blue ball 350
11:34: out of 1000 times so then what is your
11:38: probability
11:39: of getting a blue ball and so
11:43: a decent amount of people will probably
11:45: just kind of intuitively
11:48: understand this and say well that's
11:50: that's 35%
11:51: chance of getting a blue ball
11:55: how do we get there so
12:00: we can estimate the probability of
12:03: some outcome by basically trying a bunch
12:07: of times and
12:10: seeing how many of those times we
12:12: achieve the objective that we want
12:15: and you just divide the number of times
12:17: you hit the objective
12:18: by the total number of times and that's
12:20: going to be an estimate of the
12:22: probability of achieving that objective
12:25: so in the ball situation three 350 blue
12:29: balls
12:30: um when whenever we draw a blue ball
12:33: that's meeting our objective
12:35: of getting a blue ball whenever we draw
12:37: a red ball that's
12:38: not meeting that objective and so it
12:40: doesn't enter into the numerator
12:42: so uh we get 350
12:46: times where the trial was successful we
12:48: pulled the blue ball
12:49: and so that's the numerator and we had a
12:51: thousand times that we pulled balls
12:53: in total and so that's the denominator
12:55: and so we get that 35 percent
12:58: estimate of the probability of getting a
13:01: blue ball
13:04: and when we apply this in the investment
13:06: investment example
13:07: it was the same exact logic
13:11: it may have looked a little bit
13:12: complicated what we did in pandas
13:14: but all that we did was convert it into
13:17: this kind of format
13:18: where we assigned a one or a zero
13:23: based on whether we met the objective if
13:25: we got at least a thousand and fifty
13:27: dollars it became a one
13:29: if we got less than that then it became
13:30: a zero so that's the same
13:32: as here when we get a blue ball we make
13:36: it
13:36: one and when we get a red ball we make
13:39: it zero
13:41: and then sum it all up and that's going
13:44: to be
13:46: just the count of how many times that
13:48: you got the blue ball
13:49: or just the count of how many times that
13:51: we got a thousand and fifty dollars
13:55: so then that's the numerator and the
13:58: denominator is just the total
14:01: number of trials
14:04: number of simulations
14:07: and that you know taking
14:11: the sum of all that divided by the count
14:15: that is an average right so really
14:18: it's just getting a one for the positive
14:20: outcome a zero for the negative outcome
14:22: and then taking the average
14:24: of that and that will give you an
14:27: estimate of the probability of achieving
14:29: the objective
14:30: that you desire
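The one-or-zero averaging described above can be written as a short function. The name `probability_of_outcome` and the sample portfolio values are illustrative; the ball counts are the ones from the example.

```python
def probability_of_outcome(outcomes, meets_objective):
    """Estimate a probability as the average of a 1/0 success indicator."""
    indicators = [1 if meets_objective(outcome) else 0 for outcome in outcomes]
    return sum(indicators) / len(indicators)


# Ball example: 350 blue draws out of 1,000.
draws = ["blue"] * 350 + ["red"] * 650
p_blue = probability_of_outcome(draws, lambda color: color == "blue")
print(p_blue)  # 0.35

# Investment example with a few made-up ending values and a $1,050 objective.
end_values = [1020, 1039, 1053, 1060]
p_objective = probability_of_outcome(end_values, lambda value: value >= 1050)
print(p_objective)  # 0.5
```

In pandas the same idea is usually a boolean comparison on a column followed by `.mean()`, since `True` and `False` average as 1 and 0.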
14:33: so that's a formal introduction
14:37: to monte carlo simulation
14:40: we're going to come back next time to
14:41: discuss how we can analyze
14:44: the relationship between the inputs and
14:47: the outputs
14:49: so thanks for listening and see you next
14:51: time

Analyzing Relationships with Monte Carlo Simulations¶

Notes¶

The results from the Monte Carlo simulation can be run through multivariate regression or another empirical method to better understand the relationship between inputs and outputs
Sensitivity analysis gets at the same goal, but sensitivity analysis is a bit more narrow because at most one other input is changing at the same time. With Monte Carlo simulation, all inputs are changing with each run and so if inputs have complex interactions in the model they will be better understood through MC simulation
The multivariate regression results give the quantitative interpretation of the relationship while scatter plots can help visualize the relationship

Transcript¶

00:02: hey everyone this is nick derobertis
00:04: teaching you financial modeling
00:06: today we're going to be looking at how
00:07: we can analyze the relationship
00:09: between inputs and outputs in our models
00:12: by using
00:13: monte carlo simulation this is part of
00:16: our lecture segment on
00:17: monte carlo simulation so
00:21: we've already covered an intro to monte
00:24: carlo we've run
00:25: monte carlo on a model and then we went
00:28: through the
00:29: formal introduction of what we're doing
00:32: in monte carlo simulation so now we're
00:36: going to
00:37: dig into how we can
00:41: establish relationships and quantify the
00:43: relationships
00:45: between our inputs and outputs using
00:47: monte carlo simulation
00:49: so we
00:53: the monte carlo simulation is not the
00:54: only tool that we can use
00:56: to achieve this objective we've already
00:59: looked at
01:00: sensitivity analysis which can get at
01:03: the same basic idea what is the
01:05: relationship
01:06: between the inputs and the outputs with
01:08: sensitivity analysis we're changing one
01:10: or two inputs at once
01:12: and seeing what happens to the
01:16: output of our model with those changing
01:18: input values
01:21: with monte carlo simulation then
01:24: we're uh running the model a bunch of
01:27: different times with all these
01:28: randomized inputs
01:30: and that allows us to get a more
01:34: full picture of how the inputs relate to
01:37: the outputs
01:38: because when you change just one or two
01:40: inputs at a time
01:42: you may be leaving out that
01:45: when other inputs are at different
01:47: values
01:49: these inputs that you're changing have
01:51: different effects
01:53: so thinking about the retirement model
01:58: when we're changing the interest rate
01:59: and seeing how that changes our years to
02:02: retirement
02:03: well the interest rate is going to have
02:06: a bigger impact
02:07: on the model if the initial salary
02:11: was higher to begin with so you might
02:14: you know thinking about this in advance
02:16: you might do a sensitivity analysis
02:18: of interest rate versus the initial
02:20: salary so you can see how these two
02:22: interplay together
02:25: but you may not have thought about this
02:26: relationship at the get-go
02:28: or there may be other relationships
02:30: which
02:31: matter as well i mean with a higher
02:33: savings rate
02:34: also the interest rate is going to be
02:36: more impactful
02:38: so uh doing it the monte carlo
02:42: simulation route to analyze the
02:43: relationship
02:45: you're kind of bringing all these
02:46: different relationships together
02:48: and not having to explicitly think about
02:51: them as the modeler
02:52: you just kind of throw everything in
02:54: with random inputs
02:56: and the simulation is going to reveal
02:59: those relationships
03:04: so by changing all the inputs each time
03:06: that you run the model
03:08: then you're going to get cases uh
03:11: with each different values of the inputs
03:13: so
03:15: some cases you're going to have a high
03:16: interest rate and within that in some
03:18: cases you're going to have a high salary
03:20: some cases you're going to have a low
03:21: salary
03:22: some uh simulations you're going to have
03:25: a low interest rate and within that
03:27: you're going to have some that have a
03:28: high salary and some that have a low
03:30: salary
03:30: as well as different values of the other
03:32: inputs as well
03:34: so as long as you do enough simulations
03:36: a high enough number of iterations
03:39: then you're going to capture cases of
03:41: all these different values of the inputs
03:44: interplaying together and that gives you
03:46: a much more
03:47: fuller picture of the relationship
03:50: between the inputs and outputs
03:54: now the issue with this with sensitivity
03:57: analysis it was fairly straightforward
03:59: to understand uh how we can just look at
04:02: those results we just see the result
04:04: from the model
04:05: we can visualize it using conditional
04:08: formatting
04:09: or a hexbin plot and it's fairly
04:12: straightforward
04:13: it's a little more complicated to take
04:17: okay well now we've got 10,000 results
04:20: from this model
04:21: how do we get an understanding of the
04:24: relationship between the inputs and
04:25: outputs
04:27: so i'll discuss two different approaches
04:30: that we can use here
04:31: to get an understanding the first is
04:35: by visualizing uh via a scatter plot
04:40: so the scatter plot
04:43: shows the relationship between two
04:45: variables one that gets plotted on the
04:47: x-axis and one that gets plotted on the
04:49: y-axis
04:50: and each point is the values of the x
04:54: and the y's together
04:56: so this could be looking at that
05:00: investment rate on the x and looking at
05:02: the years to retirement
05:04: on the y dimension and you would say
05:05: well one time i ran it
05:07: and i had a 2.2% interest rate and i got
05:12: 22 years to retirement uh
05:15: et cetera and
05:19: um the disadvantage of
05:22: scatter plots is that it only does look
05:24: at one variable at a time so you do have
05:26: to have
05:26: like one scatter plot for each variable
05:30: but then each graph is is very focused
05:33: on that variable
05:36: and so what you're looking for when you
05:38: look at these scatter plots
05:40: is you want to see some kind of pattern
05:44: um if the points are just kind of in a
05:47: cloud
05:48: as we see in the bottom picture here
05:51: there's no kind of linear
05:53: or shape in here it's just kind of an
05:57: ambiguous cloud that
06:00: is supportive of there not being a
06:02: strong relationship
06:03: between the variables whereas
06:07: when the points seem to kind of fit
06:09: along a line
06:10: or like a u shape or some other kind of
06:13: very defined shape
06:17: then that's evidence in support
06:20: of there being a relationship between
06:23: the two variables
06:27: so
06:31: as far as quantifying this then we're
06:34: going to
06:35: look at regressions multivariate
06:38: regressions
06:39: to accomplish that the scatter plot just
06:41: gives you a quick picture
06:43: visualization that you can quickly see
06:46: the relationship and
06:48: how the relationship changes throughout
06:49: the range of the input
06:53: but the multivariate regression is going
06:56: to be able to give you
06:57: quantitatively what is the impact of the
07:00: input on the output
07:02: so that allows you to answer questions
07:04: like if i earned
07:06: ten thousand dollars more for my
07:07: starting salary how much sooner
07:10: would i be able to retire uh
07:13: so of course in order to answer that
07:15: question
07:16: an easy attempt at it is to just go to
07:19: your model inputs
07:20: and increase the salary by 10,000 and
07:23: see what happens
07:24: to the years to retirement that's kind
07:26: of
07:27: that's you know basically the
07:28: sensitivity analysis approach
07:31: um but it is a simplistic way of looking
07:34: at it
07:35: it doesn't take into account how all the
07:38: other inputs
07:39: in the model could change you're still
07:41: just assuming that those other inputs
07:43: are at their baseline values
07:46: so by doing the monte carlo simulation
07:48: we take into account
07:49: all these cases of all the other inputs
07:52: being at different values
07:55: so multivariate regression basically we
07:59: put
07:59: whatever output we're trying to analyze
08:02: as our y variable
08:04: and then each of our inputs as the x
08:05: variables
08:07: and this will be able to tell us
08:09: quantitatively
08:10: what is the relationship what's the
08:13: strength
08:13: magnitude of the relationship and
08:15: direction
08:17: uh between the input and the output
08:22: so the process or how you interpret the
08:27: uh results of that as we run a
08:29: multivariate
08:30: regression we get some fit statistics
08:33: and then the part that really matters
08:36: for this
08:37: is the coefficients and the
08:40: p-values and if the p-value is high
08:45: that's evidence of there not being a
08:47: relationship
08:49: if it's low then there is at least some
08:52: relationship it doesn't necessarily mean
08:54: that the relationship is strong or
08:56: even meaningful in your model but it
08:58: does mean that there is evidence that
09:00: there is a relationship
09:02: and then you look at the coefficient to
09:05: assess the strength or magnitude of that
09:08: relationship so the coefficient
09:12: in multivariate regression is
09:16: how much does the outcome variable
09:18: change when there
09:19: is a one unit increase in
09:22: the input variable or x variable
09:26: so to give an example of that say we're
09:29: working still with this
09:31: retirement model and you get a
09:33: coefficient of negative .0002
09:37: on starting salary when years to
09:40: retirement is your y variable
09:43: so what that means is a one unit
09:44: increase in our x
09:46: is associated with a negative point zero
09:49: zero zero
09:49: two unit increase or decrease in the y
09:54: um so our salary is in dollars and so
09:58: that means that's a one dollar increase
10:00: in salary
10:01: and then our years to retirement is in
10:04: years
10:05: and so that's a one dollar increase in
10:07: salary associated with a
10:08: decrease in years to retirement of 0.0002
10:12: years
10:13: of course that's not
10:16: a nice way to interpret it right like
10:19: who cares about a one dollar increase in
10:20: salary that's not
10:22: gonna be a meaningful thing uh but the
10:24: nice thing about these
10:26: uh relationships is you can just
10:27: multiply them up
10:29: in order to get it in terms of something
10:31: which is meaningful
10:33: so we can multiply both sides here by
10:35: ten thousand
10:36: to now change it into a ten thousand
10:39: dollar increase in salary
10:41: is associated with a decrease in years to
10:44: retirement by
10:45: two years so whenever you interpret the
10:48: coefficients
10:50: you want to put them in terms of
10:52: something which is meaningful
10:54: for your model no one cares about a one
10:56: dollar change in salary how about a ten
10:58: thousand dollar change
11:00: and let's interpret it in that context
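The rescaling described here is plain multiplication. A minimal sketch using the coefficient from the lecture example (the variable names are illustrative and not taken from the course code):

```python
# Coefficient from the lecture example: change in years to retirement
# per one-dollar increase in starting salary.
coef_per_dollar = -0.0002

# Multiply by a meaningful change in the input to rescale the effect.
salary_change = 10_000
effect_in_years = coef_per_dollar * salary_change

print(f"A ${salary_change:,} salary increase is associated with "
      f"{effect_in_years:+.1f} years to retirement")
```

The same multiplication works for any size of change, as long as the change is expressed in the units the input had when the regression was run.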
11:05: an important thing as you go to
11:07: interpret the results from the
11:08: regression
11:10: um and this can definitely be a point of
11:13: confusion
11:14: is that these coefficients are
11:17: all else constant so that means
11:22: that this you know ten thousand dollar
11:25: increase in salary decreases years to
11:26: retirement by two years
11:28: that is not taking into account um
11:33: that you know when you're able to earn
11:35: more money you're probably able to save
11:37: more money as well and so the savings
11:39: rate is going to be higher for an
11:40: individual who makes
11:42: more money that is not being captured
11:45: in the coefficient now
11:49: a big reason that i've been saying we
11:51: should use this approach rather than
11:53: just
11:53: sensitivity analysis is because it takes
11:55: into account that all the other
11:56: variables are changing
11:58: so that's where students can get
12:00: confused by this
12:02: because now here we're saying oh but
12:05: we're basically treating all the other
12:06: inputs as constant
12:08: um but it's
12:12: even though we're kind of isolating the
12:14: effect to this one
12:16: uh input with this coefficient
12:19: the regression model is still
12:21: considering all these different cases of
12:23: the input values so you can kind of
12:25: think of it as an
12:26: average across you know thinking about
12:29: um earning salary uh you can think of it
12:33: as an average across
12:34: like sometimes the investment rate was
12:35: high sometimes the investment rate was
12:37: low
12:38: sometimes the saving rate was high
12:39: sometimes the saving rate was low
12:42: taking the average across all these
12:44: different cases
12:45: what was the overall effect of just the
12:49: salary portion
12:53: so if you know that two inputs in your
12:56: model
12:57: are linked as is the case
13:00: uh potentially here with starting salary
13:02: and savings rate you know that
13:04: if you have a higher starting salary
13:06: you're gonna have a higher savings rate
13:09: then you can basically combine the
13:12: coefficients and say well i know when
13:15: the
13:15: salary goes up by ten thousand the
13:17: savings rate is going to go up by five
13:18: percent and then you can add those two
13:21: effects together
13:22: to get the total effect on the years to
13:24: retirement so you can still use these
13:27: regression results to get at the full
13:29: relationship
13:30: it's just that each coefficient is
13:33: interpreting
13:34: just the effect of that variable
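Combining the effects of two linked inputs can be sketched as follows. The salary coefficient is the one from the lecture example; the savings rate coefficient is a made-up value chosen only for illustration, and the variable names are hypothetical:

```python
# Regression coefficients (years to retirement per one unit of each input).
coef_salary = -0.0002       # per $1 of starting salary, from the lecture example
coef_savings_rate = -40.0   # per 1.00 (100%) of savings rate; assumed value

# Assumed link between the inputs: +$10,000 salary comes with +5% savings rate.
salary_change = 10_000
savings_rate_change = 0.05

# Each coefficient isolates one input, so the linked effects are added together.
salary_effect = coef_salary * salary_change
savings_effect = coef_savings_rate * savings_rate_change
total_effect = salary_effect + savings_effect

print(f"salary: {salary_effect:+.1f}, savings: {savings_effect:+.1f}, "
      f"total: {total_effect:+.1f} years")
```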
13:39: and another thing to be careful about
13:41: here is the units we already talked
13:43: about
13:43: you know how this is a one unit increase
13:45: and you're probably
13:47: in a lot of cases going to need to
13:48: adjust the units
13:50: in order to get that to a meaningful
13:53: number
13:54: uh and that's definitely the case when
13:56: you think about decimals versus
13:59: percentages so you know we're always
14:02: representing our investment returns in
14:04: decimal format
14:06: in our python models and
14:09: a one unit change in decimal format is
14:12: actually a 100%
14:13: change it's going from zero to one which
14:15: is zero to one hundred percent
14:18: um and so the coefficient that you'll
14:20: get
14:21: for the investment rate or any other
14:23: decimal
14:25: um decimal number that is actually a
14:27: percentage
14:29: um then basically
14:32: the coefficient is going to be much
14:35: larger
14:36: uh it's going to be a hundred times as
14:38: large as
14:39: the value would be for a 1% change so you
14:42: have to divide by 100
14:44: to get it to a one percent change
14:48: um and if you you know have things in
14:51: percentages it could go the other way
14:53: around in the model
14:55: and so you just need to be careful about
14:56: the units and thinking through
14:58: the coefficients and what
15:02: makes sense to have everything in the
15:04: proper units
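A small sketch of the decimal versus percentage adjustment. The coefficient value is made up for illustration; only the division by 100 comes from the lecture:

```python
# Hypothetical coefficient on an interest rate stored as a decimal (0.05 == 5%).
# A one-unit change here means going from 0 to 1, that is, from 0% to 100%.
coef_per_unit = -150.0  # assumed value, for illustration only

# Divide by 100 to express the effect of a one-percentage-point change.
coef_per_percentage_point = coef_per_unit / 100

print(f"A 1 percentage point increase is associated with "
      f"{coef_per_percentage_point:+.1f} years to retirement")
```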
15:07: so that's an overview of theoretically
15:10: how and why we're going to analyze the
15:13: relationship
15:14: of inputs and outputs through monte
15:16: carlo simulations
15:18: we'll come back next time to apply monte
15:21: carlo simulation
15:22: to the dynamic salary retirement model
15:25: and within that we're going to see an
15:26: example of
15:27: how we can do all this analysis of
15:29: relating inputs to outputs
15:31: so thanks for listening and see you next
15:35: time

Applying Monte Carlo Simulation to a Python Model¶

Notes¶

It can make sense to set up a separate dataclass for your simulation-specific inputs, or you may add them to the existing dataclass
Once you start running large numbers of simulations, some unexpected situations may occur in your model such as inputs going negative that were supposed to only be positive, or one input being greater than another when it is supposed to be less. To solve this, we can build functions which produce the random inputs according to the necessary conditions in our model
Create a function which runs a single simulation, then call that function in a loop over the number of iterations to run all the simulations
Because we typically have multiple changing inputs and may even have multiple outputs, it is useful to store data as a list of tuples and then create a DataFrame at the end
It doesn’t hurt to take the quantile of the entire DataFrame to see the distributions of the inputs as well. It can be a nice check to make sure your random inputs are working appropriately
After running a multivariate regression, be sure to add some text interpreting the results
We can check the standardized coefficients (coef * std) to understand which inputs have the greatest impact on the outputs. Be careful that these results are influenced by your choice of the input distributions. If your input distributions are not reasonable, neither will be the results
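The standardized coefficient check in the last note can be sketched in a few lines. The coefficients below are hypothetical; in practice they would come from the fitted regression, and the standard deviations from the simulated inputs:

```python
# Hypothetical regression coefficients (years to retirement per unit of input).
coefs = {"Starting Salary": -0.0002, "Savings Rate": -40.0, "Interest Rate": -150.0}

# Standard deviations of the input distributions used in the simulation.
stds = {"Starting Salary": 10_000, "Savings Rate": 0.07, "Interest Rate": 0.01}

# Standardized coefficient: effect of a one-standard-deviation change in the input.
standardized = {name: coefs[name] * stds[name] for name in coefs}

# Rank inputs by the absolute size of their standardized effect.
ranked = sorted(standardized, key=lambda name: abs(standardized[name]), reverse=True)

for name in ranked:
    print(f"{name}: {standardized[name]:+.2f} years per one std change")
```

Because each value is multiplied by the standard deviation chosen for that input, widening or narrowing an input distribution changes the ranking, which is why the note warns about reasonable input distributions.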

Resources¶

Dynamic Salary Retirement Model with Monte Carlo

Transcript¶

00:03: hey everyone
00:04: nick derobertis here teaching you
00:05: financial modeling today
00:07: we're going to be looking at an example
00:09: of how to add
00:10: monte carlo simulation to an existing
00:13: python
00:14: model this is part of our lecture series
00:16: on monte carlo simulation
00:19: so we introduced monte carlo we looked
00:22: at how to build out a model with monte
00:24: carlo
00:25: and we went through a more formal
00:27: introduction and an
00:29: explanation of everything that we're
00:30: doing and now
00:32: it's time to go and apply monte carlo
00:35: simulation
00:36: to our existing dynamic salary
00:38: retirement model
00:40: and you can find the full completed
00:43: exercise there on the course site so
00:46: that you can
00:48: take from that example to build out your
00:50: own monte carlo simulations
00:54: so let's jump over here to the dynamic
00:57: salary
00:58: retirement model and i'm just going to
01:00: go ahead and
01:01: restart kernel run all cells so that we
01:04: can
01:04: get everything defined to get ready to
01:08: do our monte carlo simulations
01:12: so we can add a new section here monte
01:16: carlo simulation
01:18: and you would want to describe
01:22: what you're doing here what's the goal
01:24: of the simulation etc i'm going to skip
01:26: over that
01:27: for brevity in the video and go right to
01:30: the code you can see all of that
01:32: in the completed example so
01:37: we're going to have some additional
01:39: inputs now
01:40: from the simulation
01:44: we're going to need to draw all the
01:46: different inputs from normal
01:47: distributions
01:50: and so we're going to have to have means
01:51: and standard deviations of those
01:53: distributions
01:57: and so we can
02:00: use our existing baseline
02:04: input as the mean
02:07: so we don't have to add all the means
02:10: we do need to go and add these standard
02:12: deviations though
02:14: and we're also going to need to have a
02:15: number of iterations for the simulations
02:18: as an input so we've got a number of
02:21: different inputs to
02:22: manage here because we've got a
02:25: bunch of inputs to manage it makes sense
02:27: to create a data class
02:29: to manage them there's a number of ways
02:32: you could set this up you don't
02:33: necessarily need to use the data class
02:34: you could go and add these inputs to the
02:36: existing model inputs data class
02:38: but i'm just going to create a separate
02:41: simulation inputs data class
02:48: and so in that i'm going to put the
02:50: number of iterations
02:52: uh that would be an integer let's
02:55: default it to 10 000. while
02:58: we're building this out i'll put it at
03:00: 100 then i'll go back and change it to a
03:02: thousand
03:03: later uh we're gonna have the
03:06: starting salaries so let's look at what
03:08: we
03:10: have in the model data
03:13: starting salary
03:17: so we want a standard deviation for that
03:21: um and let's make that ten thousand
03:24: dollars
03:25: um and as you go to pick
03:29: a standard deviation for your
03:30: distribution
03:32: so the mean you know whatever the kind
03:34: of expected or most likely value is
03:37: should be fine for the mean we already
03:39: have that from our baseline
03:40: values so that's fine the standard
03:44: deviations you want to think about
03:46: uh one standard deviation changes in
03:48: either direction should happen often so
03:50: going
03:50: between 50 and 70 000
03:53: salary happens often that makes sense
03:56: two standard deviations in either
03:59: direction
04:00: should be not happening very often but
04:03: not
04:03: rare either so that's
04:06: going from a 40 000 to an 80 000 salary
04:09: that
04:10: seems reasonable three standard
04:12: deviation changes should be rare
04:14: um so going from thirty thousand
04:19: to uh ninety thousand yeah those those
04:22: outer
04:22: thirty to forty and uh eighty to ninety
04:25: seem pretty rare for a starting salary
04:28: and outside like four times standard
04:31: deviation should like almost never
04:33: happen
04:34: um so 20 000 starting salary or 100 000
04:37: starting salary
04:38: for you know if this is some just
04:41: undergraduate getting a job
04:42: both of those almost never going to
04:44: happen
04:46: so that seems like a reasonable standard
04:49: deviation and that's how you can think
04:51: through
04:52: what standard deviation should i pick
04:53: for my distribution
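One way to sanity check a chosen standard deviation is to compute how often draws land within one to four standard deviations of the mean. This is a sketch using the standard library's `statistics.NormalDist`; the 60,000 mean is inferred from the 50,000 to 70,000 range mentioned above rather than stated outright:

```python
from statistics import NormalDist

# Starting salary distribution: assumed mean of 60,000 with the 10,000
# standard deviation chosen in the lecture.
salary_dist = NormalDist(mu=60_000, sigma=10_000)

for k in range(1, 5):
    low = salary_dist.mean - k * salary_dist.stdev
    high = salary_dist.mean + k * salary_dist.stdev
    # Probability that a single draw falls inside [low, high].
    prob_within = salary_dist.cdf(high) - salary_dist.cdf(low)
    print(f"within {k} std (${low:,.0f} to ${high:,.0f}): {prob_within:.2%}")
```

If the probability of landing outside a range looks too high or too low for the input being modeled, the standard deviation can be adjusted and the check repeated.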
04:56: so then we can go
04:59: to create the rest of our standard
05:02: deviations promos every n years
05:04: std um
05:07: let's put that at 1.5
05:11: um the cost of living raise
05:17: let's put that at a half of a percent
05:24: um the savings rate
05:31: let's put that at seven percent
05:35: and the interest rate
05:41: let's put that at one percent
05:46: okay and then we can create an instance
05:48: of our
05:50: simulation inputs
05:53: and we have everything there
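A sketch of what such a simulation inputs dataclass might look like. The class and field names are assumptions, since the transcript does not spell them out; the values are the ones chosen in the lecture, including the promotion raise standard deviation that is added a little later:

```python
from dataclasses import dataclass


@dataclass
class SimulationInputs:
    """Simulation-specific inputs; names are assumed for illustration."""
    n_iterations: int = 10_000            # the lecture lowers this while building
    starting_salary_std: float = 10_000   # dollars
    promos_every_n_years_std: float = 1.5
    cost_of_living_raise_std: float = 0.005
    promo_raise_std: float = 0.05
    savings_rate_std: float = 0.07
    interest_rate_std: float = 0.01


sim_data = SimulationInputs()
print(sim_data)
```

Keeping these separate from the model inputs means the baseline model can still be run on its own without any simulation settings.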
05:58: so the first step in the monte carlo
06:01: simulation
06:02: is to draw the random values
06:05: of the inputs in order to run them
06:07: through the model
06:11: but looking at the inputs into our model
06:15: before we go and draw random values we
06:17: want to think about
06:19: what are valid ranges of these inputs
06:22: in our model is it possible that we're
06:24: going to hit some
06:25: invalid numbers by pulling these
06:28: random values um
06:32: so salary how often you're getting
06:35: promotions
06:37: cost of living raise promotion raise
06:39: savings rate
06:41: um all these things really they need to
06:44: be positive they don't make sense
06:46: if they're negative
06:50: and the interest rate i would say you
06:52: know if this was
06:54: each individual year we were getting a
06:55: random interest rate sure that can go
06:57: negative
06:58: but if we're talking about a long-term
07:00: interest rate
07:01: that also should be positive so really
07:05: all these inputs that we're randomizing
07:07: should be positive
07:08: in the model so
07:11: knowing that knowing the conditions that
07:13: we need to have on our inputs
07:16: we can write functions to draw the
07:19: random inputs
07:20: that are always going to satisfy these
07:23: conditions
07:25: so
07:29: well first i'm gonna i'm gonna go back
07:30: up to the top and import random
07:33: uh because we're definitely going to
07:36: need that
07:38: to draw the values from normal
07:39: distributions
07:41: um and so
07:44: you recall from the um
07:47: continuous random variable material
07:51: random.normalvariate is able to draw
07:53: values from a normal distribution
07:56: um and so let's just
07:59: take an example mean here
08:02: of two and a standard deviation
08:06: of one then i set these up because i
08:10: know
08:10: that this is going to go negative in
08:12: some cases it's only two standard
08:14: deviations away from zero and so that
08:16: should happen decently often
08:18: um so putting the mean and standard
08:20: deviation
08:22: then we get random values from that
08:24: normal distribution
08:26: um and most of them are going to be
08:27: positive but some of them are going to
08:30: come up negative i saw one that was
08:31: negative there
08:34: um so
08:37: what we can do um we
08:40: want to figure out a way so that every
08:43: value that we draw is going to be
08:44: positive
08:45: and let me actually just increase this
08:47: so that it's a lot more likely
08:49: to get negative numbers in here
08:52: so that it's really clear that this is
08:55: working appropriately
08:58: so what we can do is basically
09:01: pick the value and then if we didn't
09:05: get a value that meets our conditions in
09:07: this case that is a positive number
09:10: then we're just going to keep drawing
09:11: values until we do
09:13: so what we can do is use a while loop
09:18: for this because the while loop executes
09:20: until
09:22: some condition evaluates to false as
09:24: long as it's
09:25: true it's going to keep executing and so
09:28: this is the perfect fit here
09:29: because we want to keep drawing random
09:31: values until
09:33: we meet our condition of
09:36: it being positive so
09:41: that condition so let's um
09:45: call this drawn value um
09:49: our condition would be while the drawn
09:52: value
09:54: is less than zero so as long as we're
09:56: getting a negative number
09:57: keep going so it's basically the
09:59: opposite
10:00: of the condition that you want we want
10:03: the drawn value to be greater than zero
10:05: so as long as greater than or equal to
10:08: zero
10:09: so as long as it's not the case that it
10:12: satisfies that condition
10:14: as long as it's a negative number then
10:16: we're going to keep
10:17: drawing additional values um but then
10:21: you go and run this and you'll get the
10:22: name error that
10:23: drawn value is not defined because we
10:25: don't define it until here
10:27: so we also need to initialize it so just
10:30: initialize it to some value which is
10:33: going to satisfy
10:37: the reverse condition so basically put
10:40: it
10:41: at a value which is not acceptable for
10:44: your model
10:45: and that will make sure that it goes
10:47: into the while loop
10:49: and so then we just show the drawn value
10:51: at the end
10:53: and then you'll notice that no matter
10:55: how many times i run this
10:56: it's going to come up positive every
10:58: time
11:00: um even though we saw it was decently
11:03: often
11:04: that we were getting negative numbers
11:05: before
11:08: so now we have a function we can call
11:11: this
11:12: uh random normal
11:15: positive which takes a mean and a
11:19: standard deviation
11:21: um and returns that drawn value at the
11:25: end
11:25: so now we can just do random normal
11:27: positive
11:29: with whatever mean and standard
11:31: deviation
11:32: and it's going to uh give us
11:36: values from the normal distribution
11:38: basically but just
11:39: chop off any of those ones which are
11:42: negative and
11:42: try again
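The redraw loop described above can be sketched like this. The function name follows the spoken name in the lecture; the exact spelling and the type hints are assumptions:

```python
import random


def random_normal_positive(mean: float, std: float) -> float:
    """Draw from a normal distribution, redrawing while the value is negative."""
    # Start at an unacceptable value so the loop body runs at least once.
    drawn_value = -1.0
    while drawn_value < 0:
        drawn_value = random.normalvariate(mean, std)
    return drawn_value


# A wide distribution relative to its mean, so negative draws would be common
# without the redraw loop.
draws = [random_normal_positive(2, 5) for _ in range(1_000)]
print(min(draws))
```

Note that discarding negative draws truncates the distribution, so the average of the accepted values ends up above the `mean` argument. The effect is small when the mean is several standard deviations above zero.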
11:46: so we can apply this function across all
11:48: the different inputs that we're
11:50: randomizing
11:50: in the model
11:54: so um and of course you would add a doc
11:57: string to explain
11:58: what this does i'm just skipping that
12:00: for
12:01: keep the video short but definitely take
12:03: a look at the completed example
12:05: for having all the doc strings and
12:07: everything filled out
12:10: so what we want to do next is we want to
12:14: pick the random values of all these
12:16: inputs
12:17: so
12:20: all these different inputs here we want
12:24: to
12:24: randomly draw them um
12:28: so i'm just going to copy these
12:31: to just make my life easier to type this
12:34: out
12:35: um so then i can get
12:39: all the names of the different inputs
12:40: there
12:44: delete off these commas
12:50: and then um
12:55: we can then um
12:58: that's not gonna work delete off these
13:01: values as well
13:05: we're going to use the random normal
13:06: positive function that we just created
13:09: in order to um
13:13: let me put a space there random normal
13:17: positive
13:18: um and
13:21: we want to
13:25: do the mean there
13:28: as the original input value and we want
13:31: to get from the sim data
13:33: the std of
13:36: that value so then
13:40: we have drawn all these different inputs
13:45: now let me add a data equals model data
13:53: um and then i named one of these
13:57: oh i did i forgot to put promotion
14:00: raise standard deviation
14:01: so um add that in here as well
14:05: promo raise std
14:10: and what's the reasonable value for that
14:13: let's say 5% on that
14:17: so now hopefully this will work yep um
14:20: and so now we have all these different
14:23: random values interest rate
14:25: promotion promotional raise etc
14:28: we're getting random values for each and
14:30: they're always positive
14:34: um so then we can
14:38: make a function out of this um so
14:41: we can call this
14:44: years to retirement
14:48: simulation inputs
14:52: i'm going to take the data and the sim
14:56: data
15:00: and
15:03: then we can return all of these values
15:15: so we want to return all these different
15:18: values and so it's doing it as
15:21: a tuple
15:28: where we're returning all these at once
15:33: then we can call this years to
15:37: retirement simulation inputs
15:39: with the data and the sim data and we're
15:42: going to get
15:43: all these different random values and
15:45: you can see they're changing each time
15:47: first one corresponds to salary second
15:49: promotions every n years
15:50: and so on
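A self-contained sketch of the input-drawing function. The two dataclasses are stand-ins: their field names and the baseline values in `ModelInputs` are assumptions, not copied from the course model:

```python
import random
from dataclasses import dataclass


@dataclass
class ModelInputs:
    """Stand-in for the course's model inputs; names and defaults are assumed."""
    starting_salary: float = 60_000
    promos_every_n_years: float = 5
    cost_of_living_raise: float = 0.02
    promo_raise: float = 0.15
    savings_rate: float = 0.25
    interest_rate: float = 0.05


@dataclass
class SimulationInputs:
    """Standard deviations chosen in the lecture; field names are assumed."""
    starting_salary_std: float = 10_000
    promos_every_n_years_std: float = 1.5
    cost_of_living_raise_std: float = 0.005
    promo_raise_std: float = 0.05
    savings_rate_std: float = 0.07
    interest_rate_std: float = 0.01


def random_normal_positive(mean, std):
    """Draw from a normal distribution, redrawing while the value is negative."""
    drawn_value = -1.0
    while drawn_value < 0:
        drawn_value = random.normalvariate(mean, std)
    return drawn_value


def years_to_retirement_simulation_inputs(data, sim_data):
    """Draw one random value per input, using each baseline value as the mean."""
    starting_salary = random_normal_positive(
        data.starting_salary, sim_data.starting_salary_std)
    promos_every_n_years = random_normal_positive(
        data.promos_every_n_years, sim_data.promos_every_n_years_std)
    cost_of_living_raise = random_normal_positive(
        data.cost_of_living_raise, sim_data.cost_of_living_raise_std)
    promo_raise = random_normal_positive(
        data.promo_raise, sim_data.promo_raise_std)
    savings_rate = random_normal_positive(
        data.savings_rate, sim_data.savings_rate_std)
    interest_rate = random_normal_positive(
        data.interest_rate, sim_data.interest_rate_std)
    # Returned as a tuple, in a fixed order that later column names must match.
    return (starting_salary, promos_every_n_years, cost_of_living_raise,
            promo_raise, savings_rate, interest_rate)


print(years_to_retirement_simulation_inputs(ModelInputs(), SimulationInputs()))
```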
15:54: so now we're able to draw the random
15:57: values
15:58: of all our inputs and so the next step
16:01: is then to get to
16:02: running a single simulation
16:07: so we
16:11: are going to call this function
16:14: um and we want to save
16:18: the results of it so we can do
16:21: we can take the same thing to split it
16:22: back out into the
16:25: individual variable values
16:30: so you see i run that and now all these
16:33: things
16:33: are defined individually
16:39: and then we want to create the data
16:43: so create an instance of the model
16:45: inputs
16:46: with these values
16:51: so i'm going to grab these again
17:02: and let them equal put a comma
17:06: and now we should have that new data
17:08: created appropriately
17:11: so i run this and i get the model data
17:14: being created
17:15: with random values now
17:21: now that we have the data into the model
17:24: inputs data class
17:25: now we can run the model so we can do
17:28: years to retirement equals
17:29: we have the years to retirement function
17:32: um
17:33: we want to pass the new data and we want
17:36: to make sure
17:37: we don't need to print out the do i have
17:40: the print output in this version of the
17:41: model
17:42: i might not
17:45: okay it seems it's not there let me
17:48: quickly add it um otherwise we're going
17:51: to have
17:52: a huge amount of output coming out of
17:55: this
17:56: so if print output and just wrap
17:59: all the print statements in that
18:06: three print statements here
18:14: um and then coming back over to here
18:18: now print output equals false
18:23: so with that we get the years to
18:24: retirement and we're going to get
18:28: it should be different years to retirement
18:29: but we're getting the same
18:31: years to retirement so if the print output
18:33: wasn't there i think i might have used
18:35: the version yep which had this model
18:37: data mistake
18:38: so make sure it flows all the way
18:40: through redefine that
18:43: and now hopefully we'll get different
18:45: years to retirement with each run
18:47: of the model
18:50: and we're still not getting that that's
18:54: odd
18:58: let me just restart this and run all the
19:00: way through while i
19:01: um go back and take another look
19:05: um oh we have
19:13: oh i was editing this wellsdf function
19:16: okay so this is the one that also had
19:19: that model data mistake
19:20: your data data okay we're good now
19:24: hopefully it should come through now yes
19:27: okay now we're getting different years
19:29: to retirement with each run of the model
19:31: so you definitely want to do these
19:33: checks on your own with your own model
19:35: as you build it out
19:36: if one simulation does not work properly
19:38: then certainly
19:39: 10 000 are not going to work properly
19:41: either
19:44: um and now that we have the logic to
19:46: produce
19:47: one simulation then we can wrap that up
19:50: into a function
19:53: so i'm going to call this years to
19:56: retirement
19:58: single simulation it takes the data
20:02: and the sim data
20:05: and then all this and return the years
20:09: or we want to return more than just the
20:11: years to retirement though
20:13: um we want to actually
20:16: return all the inputs as well so we can
20:19: return all the inputs
20:21: and the years to retirement um
20:24: so that we have all the inputs
20:27: associated with the output
20:29: so then when i call this we then get
20:33: all those inputs again but also the
20:35: output the years to retirement as well and
20:38: that's all
20:38: associated together so now we can
20:42: run a single monte carlo simulation with
20:46: a single line of code
20:49: so now that we can do that we want to
20:51: get to running the full
20:53: monte carlo simulation process with
20:56: however many iterations that we want
21:00: so um we want to basically call this in a
21:04: loop
21:05: over the number of iterations and
21:09: all we're doing is just calling this a
21:10: bunch of times and putting it into a
21:11: list so i'm going to use a list
21:13: comprehension
21:15: to simplify that loop so just calling
21:18: the function
21:19: for i in range
21:22: sim data dot number of iterations
21:29: call that all results and then
21:32: we can look at let's just look at the
21:36: first
21:36: five because there's going to be a lot
21:37: in there and we can see we're getting
21:39: multiple runs of this with the inputs
21:41: associated with the outputs
21:45: so then we can put this into a data
21:47: frame
21:48: and if i imported pandas uh
21:51: yep uh so put this all into a data frame
21:56: pd.dataframe
21:59: of all results
22:03: and the columns then we want to
22:07: name these columns so we're going to
22:09: have starting salary
22:11: first and you want to go in the same
22:12: order as whatever you have in the tuple
22:15: the starting salary and then promos
22:19: every n years
22:28: then the cost of living raise
22:32: and then the promotion raise
22:36: and then the savings rate
22:40: the interest rate and finally the years
22:43: to retirement
22:46: and you don't want to have really long
22:48: cells like this it's just really
22:50: difficult to read so i'm going to split
22:52: this
22:52: onto multiple lines it's within
22:54: parentheses and so i can split it
22:59: and this is going to make the code
23:01: easier to read
23:06: so then we should have
23:10: our data frame created
23:13: and we see that here so i have it set at
23:16: 100
23:16: simulations right now that's why we have
23:18: 100 rows in the data frame each row is
23:20: one simulation
23:22: and we see all the input values
23:23: associated with
23:25: the output so
23:29: now we're able to run all the
23:31: simulations so let's make a function for
23:33: that years to retirement
23:36: monte carlo takes the data and sim data
23:41: and let's indent all this and return it
23:46: so now i can call this
23:50: and we should get the same thing
23:54: and of course i could you know change it
23:58: and
23:58: run for
24:02: say a thousand iterations and then we
24:04: would see a thousand rows in the data
24:05: frame so everything
24:06: seems to be flowing through properly
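The single simulation and the loop over iterations can be sketched end to end as below. This is a reduced illustration, not the course code: it randomizes only three inputs, and `years_to_retirement` here is a toy stand-in model (constant salary, fixed savings share, yearly interest) with an assumed `desired_cash` target.

```python
import random
from dataclasses import dataclass


@dataclass
class ModelInputs:
    """Reduced stand-in for the course model inputs (assumed names and values)."""
    starting_salary: float = 60_000
    savings_rate: float = 0.25
    interest_rate: float = 0.05
    desired_cash: float = 1_500_000


@dataclass
class SimulationInputs:
    n_iterations: int = 100
    starting_salary_std: float = 10_000
    savings_rate_std: float = 0.07
    interest_rate_std: float = 0.01


def random_normal_positive(mean, std):
    """Draw from a normal distribution, redrawing while the value is negative."""
    drawn_value = -1.0
    while drawn_value < 0:
        drawn_value = random.normalvariate(mean, std)
    return drawn_value


def years_to_retirement(data):
    """Toy model: save a fixed share of a constant salary and earn interest."""
    wealth, year = 0.0, 0
    # Cap at 200 years so a draw with almost no savings cannot loop forever.
    while wealth < data.desired_cash and year < 200:
        year += 1
        wealth = wealth * (1 + data.interest_rate) + data.starting_salary * data.savings_rate
    return year


def years_to_retirement_single_simulation(data, sim_data):
    """Draw random inputs, run the model once, return inputs plus the output."""
    starting_salary = random_normal_positive(data.starting_salary, sim_data.starting_salary_std)
    savings_rate = random_normal_positive(data.savings_rate, sim_data.savings_rate_std)
    interest_rate = random_normal_positive(data.interest_rate, sim_data.interest_rate_std)
    new_data = ModelInputs(starting_salary, savings_rate, interest_rate, data.desired_cash)
    return starting_salary, savings_rate, interest_rate, years_to_retirement(new_data)


def years_to_retirement_monte_carlo(data, sim_data):
    """Run all iterations, collecting one tuple per simulation."""
    return [years_to_retirement_single_simulation(data, sim_data)
            for _ in range(sim_data.n_iterations)]


all_results = years_to_retirement_monte_carlo(ModelInputs(), SimulationInputs())
print(all_results[:5])
```

With pandas available, the list of tuples can be turned into a table with `pd.DataFrame(all_results, columns=[...])`, listing the column names in the same order as the tuple.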
24:13: so um
24:17: now we've got the simulation results and
24:20: we can get them with a single function
24:24: uh let's go ahead and save those results
24:26: into a data frame
24:32: so now we've got this data frame
24:35: but it doesn't have great formatting we
24:39: might want to apply some formatting to
24:41: it
24:43: um so style format um
24:47: so starting salary and i'm just gonna
24:51: i want to probably format all of them so
24:53: i'm just gonna copy these to get started
24:55: with
24:56: um and then starting salary
25:00: um that's going to be uh dollars
25:04: and i wanted to have commas and zero
25:06: decimal places
25:08: promotions every n years um that can
25:11: just have
25:12: one decimal places one decimal place
25:17: um cost of living raise
25:20: that's going to be a percentage
25:23: we can give it up to two decimal places
25:27: promotion raise same thing really all
25:30: the
25:31: percentages same thing promotion raise
25:34: uh savings rate and interest rate
25:39: and then years to retirement
25:42: we can make it zero decimal places
25:49: um
25:54: okay so now we see that with proper
25:57: formatting
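The formats described above map onto ordinary Python format strings. A sketch with assumed column names and one made-up row of values, formatted with plain `str.format` so it runs without pandas:

```python
# Format strings matching the styles described above; column names are assumed.
formats = {
    "Starting Salary": "${:,.0f}",      # dollars, commas, zero decimal places
    "Promos Every N Years": "{:.1f}",   # one decimal place
    "Cost of Living Raise": "{:.2%}",   # percentage, two decimal places
    "Promotion Raise": "{:.2%}",
    "Savings Rate": "{:.2%}",
    "Interest Rate": "{:.2%}",
    "Years to Retirement": "{:.0f}",    # zero decimal places
}

# One simulated row with made-up values.
row = {
    "Starting Salary": 61234.56,
    "Promos Every N Years": 4.96,
    "Cost of Living Raise": 0.0213,
    "Promotion Raise": 0.1487,
    "Savings Rate": 0.2731,
    "Interest Rate": 0.0512,
    "Years to Retirement": 27.0,
}

formatted = {name: formats[name].format(value) for name, value in row.items()}
for name, text in formatted.items():
    print(f"{name}: {text}")
```

A dictionary of this shape can also be passed to the pandas styler, as in `df.style.format(formats)`, which applies each format string to its column.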
25:59: um and the other thing we might want to
26:02: do is add some coloring to it
26:04: so i'm going to add the background
26:05: gradient with the
26:08: red yellow green color map
26:11: on just the years to retirement
26:15: column
26:20: i see that coming there and
26:25: you be careful that you don't style your
26:27: data frame which has
26:29: 10 000 rows in it because it is going to
26:31: show all of it
26:32: um
26:35: and we'll notice that um this is going
26:38: the opposite of the direction that we
26:39: want right it's showing green for high
26:41: values but really green is
26:43: our low is good in our model so we want
26:45: to reverse the color map and so we can
26:47: add
26:48: underscore r and now uh
26:51: when the years to retirement are low we're
26:53: seeing the dark green
26:54: and when they're high we're seeing the
26:56: red
27:01: so then we can wrap this in a function
27:04: style df takes the data frame
27:08: and then returns this
27:11: so then we can just use head the shortcut
27:15: which can see the top five rows so just
27:16: look
27:17: at the data we can then apply style df
27:20: to that
27:21: to just get a look into our data in
27:24: the model
27:29: and it's useful to say
27:32: how many simulations were run so we can
27:36: do the length
27:36: of data frame simulations were run
27:46: so now we have the results from the
27:48: simulation and we want to
27:50: visualize and analyze them
27:54: so let's visualize the results
27:58: um well so this
28:03: file data frame that's the first part
28:06: um just example results
28:13: um this this can go at the end of the
28:15: results there we go
28:18: um the next what we want to visualize
28:21: is the distribution of the output
28:24: years to retirement
28:28: um so we can take
28:32: this data frame user retirement
28:35: and do a histogram
28:38: um and see the output there
28:42: um let me go ahead and just
28:45: run this with more iterations at this
28:47: point
28:50: so that we'll have a good idea what the
28:52: output
28:53: is going to look like
28:57: um 50 is not going to be enough bins for
29:00: that let's try
29:01: 100 that's a little bit more
29:04: reasonable um
29:08: so then
29:12: um we want to create the
29:15: um probability table
29:18: so uh probability table
29:23: um and we can get the quantiles
29:26: uh we're gonna do this i develop
29:29: five percent um percentiles
29:33: i over twenty for i in range uh from
29:35: one to twenty
29:42: so in range so we get that five percent
29:46: to 95 percent
29:48: and then we can do df.quantile
29:51: on that
29:56: so here's another advantage of the
30:00: data frame styler function pattern now
30:03: we have
30:04: a different data frame which is in the
30:06: same structure
30:08: we can apply the same function to that
30:11: so now this
30:15: probability table is nicely formatted as
30:18: well
30:20: and this is telling us you know only
30:22: five percent of the time will you be
30:24: able to retire in less than 21 years
30:28: and five percent of the time it could
30:30: take longer than 39 years to
30:31: retire based on the distributions that
30:34: we have assigned
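The probability table can be sketched with the standard library alone. The lecture uses `df.quantile` on the results DataFrame; here `statistics.quantiles` stands in for it, and the years to retirement values are made-up random data rather than real model output:

```python
import random
import statistics

# The 5%, 10%, ..., 95% levels: i over twenty for i from one to nineteen.
percentiles = [i / 20 for i in range(1, 20)]

# Made-up stand-in for the simulated years to retirement.
random.seed(0)
years = [random.normalvariate(30, 5) for _ in range(1_000)]

# With n=20, statistics.quantiles returns the 19 cut points at those levels.
cut_points = statistics.quantiles(years, n=20, method="inclusive")

for level, value in zip(percentiles, cut_points):
    print(f"{level:.0%}: {value:.1f} years")
```

Each row reads as "this share of simulations finished in at most this many years", so the first and last rows bound the middle 90% of outcomes.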
30:38: so next
30:42: now we're going to get into analyzing
30:45: the relationship of the inputs versus
30:48: the outputs so the first that we can do
30:51: is uh plots of inputs
30:54: versus years to retirement
30:59: um so if we do df.plot.scatter
31:06: and we tell it the y is years to
31:09: retirement
31:11: and then the x is whatever
31:17: input that we want to look at then we're
31:18: going to get a
31:20: scatter plot as a result
31:24: so we want to do this but we want to do
31:26: it for all the different possible inputs
31:29: so i'm going to go back i'm going to
31:30: grab this list
31:34: so we can call this the
31:38: input columns
31:45: and then for each
31:48: column in the input columns then we want
31:52: to do this
31:53: scatter with that particular column
31:58: then we now have all the scatter plots
32:02: for each of the different inputs so we
32:05: can see the relationships
32:07: um and we don't need years to retirement
32:12: we only want the inputs not the output
32:14: here
32:15: so i'm going to remove that one
32:19: um and you know you can see
32:23: some of these have clearer patterns than
32:25: others like here with
32:26: savings rate you can see it's kind of a
32:29: curve here
32:29: that's a fairly defined pattern um
32:33: and with cost of living raise it's a
32:34: little more of an ambiguous cloud
32:37: here so just based on the scatter plots
32:41: it suggests a fairly strong relationship
32:43: between savings rate and years to
32:44: retirement
32:45: and not a very strong relationship
32:48: between the cost of living raise and
32:50: years to retirement
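A rough sketch of the scatter-plot loop just described. The data and the relationship between savings rate and the output are invented for illustration; only the pattern (exclude the output column, then call `df.plot.scatter` once per input) follows the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Made-up simulation results: two inputs and one output
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Savings Rate": rng.normal(0.25, 0.05, n),
    "Cost of Living Raise": rng.normal(0.02, 0.01, n),
})
# Invented relationship, not the retirement model itself
df["Years to Retirement"] = (
    10 / df["Savings Rate"].clip(lower=0.05) + rng.normal(0, 2, n)
)

output_col = "Years to Retirement"
# Only the inputs go on the x axis, so drop the output column from the list
input_cols = [col for col in df.columns if col != output_col]

axes = []
for col in input_cols:
    ax = df.plot.scatter(x=col, y=output_col)
    axes.append(ax)
```

In a Jupyter notebook the plots render inline, so the `matplotlib.use("Agg")` line can be dropped there.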
32:53: so then we can go on to the quantitative
32:58: analysis of the relationship between the
33:00: inputs and the outputs
33:03: and that is through the multivariate
33:08: regression
33:12: so we're going to use the statsmodels
33:15: package in order to run the regression
33:19: so i'm going to import statsmodels uh
33:23: .api as sm
33:26: and this is another one of those
33:29: conventions
33:30: just take this import and use it as is
33:32: and then we'll use
33:33: sm to interact with the statsmodels library
33:39: so what we want to do is
33:43: i'm going to say that our output column
33:45: is years to retirement
33:49: and we already had our input columns
33:51: defined here
33:55: so ultimately what we're doing
34:00: is we're going to get our x variables
34:04: as the uh input columns
34:08: from the data frame so then we have just
34:11: the inputs no years to retirement on
34:13: here
34:14: we're going to get uh the
34:18: um years to retirement here
34:21: as our y variable um
34:25: and then we're going to create the
34:27: regression
34:28: model object so we're going to do an
34:32: ordinary least squares regression ols
34:34: regression uh which is the standard
34:38: and we're going to put the y first and
34:41: then the x
34:43: and then in order to get results from
34:45: that we're going to
34:47: fit the model call .fit on the model
34:52: and then we call the summary method on
34:54: the result object
34:55: in order to produce this summary that
34:58: you see here
35:01: so now the top part is the general fit
35:05: statistics
35:06: not too important for this what we're
35:08: really concerned about is the p-values
35:10: and the coefficients
35:12: so all the p-values are low and so
35:15: there's
35:15: no evidence from the p-values that any
35:18: of these
35:19: uh inputs are unrelated to the outputs
35:22: it seems that there is evidence of a
35:25: relationship
35:26: with each one of them and we can look at
35:29: the coefficients
35:31: in order to interpret the um strength of
35:34: that relationship
35:37: but there is one other thing that we
35:39: need to do here
35:40: um which is that
35:44: the you'll notice here that there's no
35:47: constant or intercept if you're familiar
35:50: with
35:51: running regressions you typically have a
35:53: constant or intercept
35:55: as one of the x variables and that is
35:57: not included by default
35:59: in statsmodels you do have to add it
36:02: explicitly
36:03: so in order to do that we do
36:07: sm.add_constant
36:10: and then in here we do hasconst equals
36:13: true
36:14: and then when we run this again now
36:16: we'll see we have this constant
36:18: in there um and when we look at the x
36:22: that basically added a column of ones
36:25: into the model
36:27: and that's how it works with the
36:29: constant
36:30: um we're not going to be diving into the
36:33: theory of ols regressions why you should
36:35: have this constant
36:37: but just in general you should probably
36:40: have the constant and so make sure to
36:42: add it
36:42: so you know you can just copy paste this
36:45: code snippet
36:46: for your own model and just switch out
36:48: the output column
36:49: and the input
36:52: columns
36:56: so now we have the regression results
37:01: and we want to interpret them
37:06: so we can go ahead and already look at
37:07: these and start doing some
37:09: interpretations you'll notice
37:11: um that for promotions every n years we
37:14: have a 1.2648
37:16: coefficient so what that's saying is if
37:19: we get uh if it takes one year longer
37:23: to get a promotion on average
37:27: that's going to lead to a 1.26
37:31: additional years it takes until we get
37:34: to
37:35: retirement
37:40: and so another question is you know
37:42: which of these
37:44: inputs is most impactful which matters
37:46: the most and so you might think well
37:48: just whichever have the biggest
37:50: coefficients those should be
37:52: the most impactful but that's not the
37:54: case you have to also consider
37:56: the standard deviation of the
37:59: inputs so we can evaluate that by
38:02: looking at the standard deviation on the
38:04: data frame that will tell us the
38:05: standard deviation of each of our inputs
38:07: and those should basically be the
38:09: standard deviations that we set out
38:11: in our simulation data um
38:15: which they are um
38:19: so we take that standard deviation
38:23: and then on this result
38:27: object we have result.params that gives
38:30: us
38:31: a pandas series which has all those
38:34: coefficients that we saw up here in the
38:36: nice summary output
38:38: so what we can do is we can actually
38:40: multiply these two things together
38:43: and that gives us what's called
38:45: standardized coefficients
38:49: so what that is saying is now it's
38:50: instead of a one unit
38:52: increase in the input variable it's a
38:54: one standard deviation
38:56: increase in the input variable
38:59: so that's saying that a one standard
39:01: deviation
39:03: increase in the cost of living raise
39:07: decreases years to retirement by 0.9
39:10: years so these coefficients
39:14: are comparable in terms of which has the
39:16: biggest impact
39:18: so you can basically think in the
39:20: absolute value of these whichever are
39:22: the biggest
39:23: are going to have the biggest impact on
39:24: the model so here
39:26: it is saying that savings rate has the
39:29: biggest
39:29: impact on the years to retirement
39:32: followed by the starting salary
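To see why raw coefficients are not comparable but standardized ones are, here is a small sketch on invented data. As an assumption made purely to keep the example self-contained, the coefficients are fitted with NumPy's least squares; in the lecture they come from `result.params` on the statsmodels result.

```python
import numpy as np
import pandas as pd

# Made-up inputs on very different scales
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "Savings Rate": rng.normal(0.25, 0.05, n),
    "Starting Salary": rng.normal(60_000, 10_000, n),
})
df["Years to Retirement"] = (
    45 - 40 * df["Savings Rate"] - 0.0001 * df["Starting Salary"]
    + rng.normal(0, 0.5, n)
)

input_cols = ["Savings Rate", "Starting Salary"]

# Ordinary least squares with a constant column, via NumPy
X = np.column_stack([np.ones(n), df[input_cols].to_numpy()])
coefs, *_ = np.linalg.lstsq(X, df["Years to Retirement"].to_numpy(), rcond=None)
params = pd.Series(coefs[1:], index=input_cols)  # drop the constant

# Effect of a one standard deviation increase in each input
standardized = params * df[input_cols].std()

# Rank by absolute size to see which input matters most
ranked = standardized.abs().sort_values(ascending=False)
print(params)
print(standardized)
print(ranked)
```

In this fake data the raw salary coefficient is tiny (about -0.0001 per dollar) while the savings rate coefficient is large (about -40), yet after multiplying by the standard deviations the two are on the same footing: roughly -2 years versus -1 year per standard deviation.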
39:37: and in your model you're going to want
39:39: to
39:40: include some text at the bottom that
39:43: interprets this
39:45: and draws conclusions from the
39:46: coefficients talking about the original
39:48: coefficients as well as the standardized
39:51: coefficients
39:53: so that it's very clear for the reader
39:56: of the model basically what was
39:57: important
39:58: from doing all of this analysis
40:04: so that's the general process of adding
40:07: monte carlo simulation to an existing
40:09: model
40:10: and analyzing the relationship between
40:13: the inputs and the outputs
40:16: now to go along with this
40:20: there is an analogous lab exercise here
40:25: so the lab exercise for this is then to
40:28: do something very similar
40:30: for the project one model project one
40:33: python model
40:34: now i'm not asking you to do it with
40:36: every input there here in the level one
40:39: just do it with the interest rate
40:42: just randomize that and then
40:46: run ten thousand simulations get the
40:48: years or
40:49: the npv results visualize
40:53: and then create this table of
40:55: probabilities and
40:58: get the chance that the npv will be more
41:00: than 400 million
41:02: and then in the level two um
41:05: then you're going to be doing the same
41:07: thing continuing on but then also
41:09: drawing the number of phones from a
41:11: normal distribution as well
41:13: and doing the same kind of analysis but
41:15: then following it up
41:17: with analyzing the relationship between
41:20: the inputs and the outputs so doing the
41:22: scatter plots
41:24: and the multivariate regression and then
41:26: interpreting
41:27: the results of that
41:30: so that wraps up um
41:34: adding monte carlo simulation to python
41:36: models
41:37: thanks for listening and see you next
41:40: time

Applying Monte Carlo Simulation to an Excel Model¶

Notes¶

The process for running Monte Carlo simulations in Excel is nearly the same as that in Python when we use Python to run the simulations on the Excel model using xlwings
The main difference is that we write the inputs into Excel and extract the results using xlwings rather than running Python logic for the core model
Excel recalculates whenever an input is changed. So writing the inputs in is enough to get the result calculated
For the analysis, you can either keep the results in Python and follow the process for analyzing the results in Python, or you can output them back to Excel and analyze the outputs there
Keep in mind that if you visualize the outputs in Excel, next time you run the simulation it will go slow due to the visualizations. Because of this it may be a better idea in general to do the analysis in Python if you have a choice

Resources¶

Transcript¶

00:02: hey everyone
00:03: nick derobertis here teaching you
00:05: financial modeling so today
00:07: we're going to be talking about how we
00:08: can add monte carlo simulation
00:11: to an existing excel model this is part
00:14: of our lecture series on monte carlo
00:16: simulation
00:18: so this video is going to wrap up the
00:21: lecture series we already talked about
00:24: what monte carlo simulation is why we
00:26: would want to do it
00:27: looked at an example of running it on a new
00:29: python model
00:30: i did a more formal introduction of
00:33: monte carlo and all the parts of it
00:35: and the analysis of it and then went and
00:38: applied it to an existing python model
00:41: so all that's left is to apply it to an
00:43: existing
00:44: excel model so
00:48: if you're thinking about just using pure
00:50: excel
00:51: for monte carlo simulations it is
00:55: definitely a
00:56: challenge there are add-ins
00:59: which are able to do this for you but i
01:02: don't know
01:02: of any um good really flexible
01:06: free add-ins for this mostly good ones
01:09: you're gonna have to pay a substantial
01:11: premium to get that add-in
01:15: and without the add-in then with only
01:18: excel
01:19: pretty much you're going to be going to
01:20: vba to complete this
01:22: there are ways to hack it with data
01:25: tables
01:26: but it can get quite complicated to do
01:29: that
01:31: especially if you only have one or two
01:33: inputs varying at a given time
01:36: that's not too bad with a data table um
01:40: but as soon as you want to change more
01:42: than two then it starts to get
01:44: uh to be quite a hacky kind of approach
01:47: to make that happen with some kind of
01:49: lookup in another table
01:51: in order to make that happen
01:55: and you might be able to hack it some
01:58: way
01:58: or you're going to be using vba
02:01: or python so
02:05: you know generally i would recommend
02:08: just to use python to be able to
02:11: run your monte carlo simulation in excel
02:16: so you know we've already learned how to
02:18: combine
02:19: excel and python in the prior lecture
02:22: series
02:23: and so we can leverage that knowledge to
02:25: take our excel model
02:27: and use python to run monte carlo
02:29: simulations
02:30: on it
02:33: and the process that we're going to
02:35: follow there is
02:37: extremely similar to the one that we
02:39: just carried out
02:40: in python all we're doing is
02:43: changing the inputs running the model
02:45: and storing the output
02:47: each time the difference here is just
02:50: that
02:50: instead of running python code to um
02:53: change the inputs and uh run the model
02:57: we're going to use xlwings to take
03:00: the
03:00: inputs from python put them into excel
03:04: and then get the result that we want
03:06: from excel
03:07: back into python
03:11: um so same exact kind of flow
03:14: but just having the excel model hooked
03:16: up instead of the python core model
03:20: and then so at the end of that process
03:22: you'll have all your simulation results
03:25: in
03:25: python and it's up to you at that point
03:28: whether you want to just go ahead and
03:30: analyze them in python
03:31: and then you'll do the exact same kind
03:33: of analysis
03:34: that we showed in adding
03:38: a monte carlo simulation to an existing
03:41: python model
03:43: or you can take all those simulation
03:45: results and output them back
03:47: into excel and then do an analysis on
03:50: them
03:50: in excel
03:54: so let's look at an example of how we
03:57: would actually go about this
04:01: so i've got the dynamic salary
04:02: retirement model up here on the left and
04:04: a fresh
04:05: jupyter notebook up here on the right
04:09: so you know this model is already set up
04:12: so that everything flows through we give
04:15: it different
04:16: inputs and it's going to change the
04:18: output
04:20: so first thing we want to do is import
04:23: xlwings as xw
04:27: and we're going to need pandas as well
04:32: so just add those imports there
04:37: and you can look on the course site
04:41: to see a fully built out example of this
04:45: uh which has all the proper um
04:49: explanations and formatting of
04:50: everything
04:52: but i'm going to go ahead and just get
04:55: right to the code
04:56: here so we're going to now use xlwings
04:58: to get a connection
04:59: to the workbook
05:04: so this is dynamic salary
05:07: retirement model i just realized that
05:11: these are not in the same folder
05:16: so let me
05:22: let me move that into the same folder
05:25: just do that over here
05:26: on my other screen here
05:29: give me a moment for that okay
05:33: now they're in the same folder
05:36: so that's the potential pitfall as you
05:38: try to do this you want to make sure
05:39: they're in the same folder or otherwise
05:41: you're going to have to put the full
05:42: file path
05:44: of the
05:47: excel model
05:50: so this is uh copy
05:53: two um
05:58: copy two um
06:02: file not found so i must not have gotten
06:04: that name right
06:05: dynamic salary retirement model
06:08: copy two oh right i need to put the .xlsx
06:12: okay now i have the connection to the
06:14: book and so now i can get
06:17: the inputs and outputs sheet
06:22: and ultimately we're going to use two
06:23: different sheets so i'm going to call
06:25: this io sheet
06:28: and this is going to be book.sheets
06:32: inputs and outputs to reference our
06:35: inputs and outputs worksheet here
06:39: um so
06:42: now we're going to run a single
06:46: simulation
06:48: um so
06:51: um all that we need to do to run a
06:55: simulation in excel
06:56: is change the input that's going to
06:58: automatically trigger excel to
07:00: recalculate the model
07:02: and so then the output will change as
07:04: well
07:05: so let's look at just varying the
07:08: interest rate
07:08: so here in b10
07:12: we have the interest rate so io sheet
07:16: dot range uh b10
07:20: value and let's just try it out by
07:23: putting a value in there eight percent
07:25: let's run that and we see this has
07:27: updated to eight percent and the years
07:29: to retirement
07:30: has similarly updated so then the other
07:33: side of this is then just
07:34: getting that out the io sheet
07:37: uh we want to get the output here b18
07:40: is the output range so b18
07:44: value and we can see that gets us the
07:47: years to retirement
07:49: so we can save that as years to retirement
07:56: and that's basically it um you know
07:59: we've got to add the random part to do
08:00: the simulation but
08:02: just you know running these two cells is
08:04: how we can run the excel model from
08:09: python um
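The two cells just described (write the input, read the output) can be wrapped in one small helper. The sketch below runs without Excel by using a toy stand-in for the sheet; `FakeSheet`, `FakeRange`, and the formula inside them are hypothetical and exist only so the example executes. With a real workbook you would pass an xlwings sheet instead, as shown in the trailing comments. The cell addresses B10 and B18 are the ones used in the lecture; the file and sheet names in the comments are assumptions.

```python
def run_model_once(io_sheet, interest_rate, input_cell="B10", output_cell="B18"):
    """Write one input into the workbook and read back the recalculated output."""
    io_sheet.range(input_cell).value = interest_rate  # Excel recalculates on write
    return io_sheet.range(output_cell).value


class FakeRange:
    """Toy stand-in for an xlwings Range with a readable/writable .value."""

    def __init__(self, sheet, address):
        self._sheet = sheet
        self._address = address

    @property
    def value(self):
        return self._sheet.read(self._address)

    @value.setter
    def value(self, new_value):
        self._sheet.cells[self._address] = new_value


class FakeSheet:
    """Toy stand-in for an xlwings sheet: B18 is 'recalculated' from B10."""

    def __init__(self):
        self.cells = {"B10": 0.05}

    def range(self, address):
        return FakeRange(self, address)

    def read(self, address):
        if address == "B18":
            # Made-up formula, not the real retirement model
            return round(1 / self.cells["B10"], 2)
        return self.cells[address]


years_to_retirement = run_model_once(FakeSheet(), 0.08)
print(years_to_retirement)

# With a live workbook (requires Excel and xlwings; names are assumed):
#   import xlwings as xw
#   book = xw.Book("Dynamic Salary Retirement Model.xlsx")
#   io_sheet = book.sheets["Inputs and Outputs"]
#   years_to_retirement = run_model_once(io_sheet, 0.08)
```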
08:12: and then we're going to show in this
08:15: example
08:15: analyzing the outputs in excel so we'll
08:17: ultimately need to get the outputs back
08:19: to excel
08:21: so i'm going to go and create a new
08:23: worksheet here
08:25: call this simulations
08:29: and then i'm going to create
08:32: a reference to that sim sheet
08:34: book.sheets
08:37: simulations um and now
08:40: i can do the sim_sheet.range
08:44: uh a1 value equals years to
08:48: retirement
08:50: and now we see that came into there so
08:52: now we have recorded the result
08:54: of that simulation back into excel
08:59: so that's just a single
09:02: run of the model not even really a
09:04: simulation because
09:06: this wasn't random but let's now
09:09: go to uh running multiple simulations
09:16: so we're gonna need the random module as
09:20: well
09:23: and let's put a mean of the interest
09:26: five percent let's put a standard
09:29: deviation of the interest three percent
09:33: um and now we can do
09:35: random.normalvariate
09:38: to get a random interest rate drawn
09:42: from a normal distribution so then the
09:45: interest rate
09:46: um we can see we run it multiple times
09:49: we get different values of the interest
09:51: rate
09:56: so then
09:59: we want to basically
10:02: do this but in a loop over the number of
10:04: iterations
10:06: so i'm going to add a number of
10:09: iterations
10:10: as another variable there
10:13: and then
10:17: we're going to go through the range of
10:20: the number of iterations
10:23: and um we're going to get the interest
10:27: rate
10:29: and then we're going to
10:33: put that interest rate into the model
10:39: and then we're going to extract the
10:41: years to retirement from the model
10:45: and that would be running the
10:48: simulations
10:49: so then the other thing is just to save
10:51: the results
10:52: so all retirement years
10:58: uh all retirement years dot append
11:02: years to retirement
11:09: so now i run this and we can see we get
11:11: 10 different
11:12: years to retirement and if you look
11:14: over at the excel model while this
11:16: happens you can see
11:18: that the interest rate is changing
11:19: around and it actually changes around
11:21: more than you can even see because it's
11:23: going really fast
11:24: but you do see it changing around as we
11:26: run this
11:30: um so we want to bring these values back
11:33: into excel
11:35: um and if you recall we probably want
11:38: these in a column that generally makes
11:39: more sense in excel
11:42: you recall we had this trick where we
11:44: wrap each
11:45: item into its own list in order to get
11:48: it to output vertically
11:50: so we can do vertical retirement years
11:53: do list comprehension uh
11:57: just putting a list around
12:00: um each of the retirement years
12:04: so that we have something that
12:08: looks like that and now that um
12:11: we're able to write back into excel
12:16: um in a column format so i'm going to
12:20: go to the same spot as before but i'm
12:22: going to put the
12:24: vertical retirement years and now you
12:26: can see that
12:27: each time i run this it's going to bring
12:30: this into there
12:31: so i run these two together we're going
12:33: to get new simulation results
12:35: coming in each time and back into excel
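The list-wrapping trick mentioned above, in isolation. The numbers are placeholders; the final xlwings line is left as a comment because it needs a live workbook, and `sim_sheet` is the assumed name of the sheet reference.

```python
# Placeholder results from a few simulation runs
all_retirement_years = [27.0, 31.0, 24.0]

# A flat list is written across a row; a list of one-item lists is written
# down a column, so wrap each value in its own list.
vertical_retirement_years = [[years] for years in all_retirement_years]
print(vertical_retirement_years)  # [[27.0], [31.0], [24.0]]

# With xlwings and an open workbook:
#   sim_sheet.range("A1").value = vertical_retirement_years
```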
12:40: so then it makes sense to wrap all this
12:42: up in a function
12:44: um so
12:47: uh retirement simulations
12:52: it takes the number of iterations the
12:54: interest
12:55: mean and the interest standard deviation
13:01: does all this and then does this as well
13:09: and we can have it also return the all
13:12: retirement years just in case
13:14: uh we later wanted to do analysis in
13:17: python
13:17: as well um and then
13:21: we can do uh retirement simulations for
13:24: the results let's
13:25: go up to a thousand iterations this time
13:26: with a
13:28: 10 percent mean and a 5 percent standard deviation
13:33: and then look at top 10 results
13:36: and we'll see that run for a while we
13:39: can
13:40: um
13:44: excel kind of froze up while it was
13:46: running but now we can see that we have
13:49: a thousand different results here from
13:52: 1000 different simulations
13:54: and we also have those same
13:59: results in python as well
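A sketch of the simulation function, with one deliberate change from the lecture so that it can run without Excel: the step that writes to and reads from the workbook is passed in as a callable (`run_model`, a hypothetical parameter). `toy_model` is a made-up stand-in and is not the retirement model. The optional `seed` argument is also an addition, for reproducibility.

```python
import random


def retirement_simulations(run_model, num_iter, interest_mean, interest_std, seed=None):
    """Draw a random interest rate per iteration, run the model, collect the results."""
    rng = random.Random(seed)
    all_retirement_years = []
    for _ in range(num_iter):
        interest_rate = rng.normalvariate(interest_mean, interest_std)
        years_to_retirement = run_model(interest_rate)
        all_retirement_years.append(years_to_retirement)
    return all_retirement_years


def toy_model(interest_rate):
    """Made-up stand-in for the workbook: not the lecture's retirement model."""
    return 40 - 100 * interest_rate


results = retirement_simulations(toy_model, 1000, 0.10, 0.05, seed=0)
print(results[:10])

# With xlwings, run_model would write the rate into B10 and return the value
# of B18, and the results could be written back as a column:
#   sim_sheet.range("A1").value = [[years] for years in results]
```

Returning the list, as the lecture does, keeps the option of analyzing the results in Python even when they are also written to Excel.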
14:05: so now we have our results in excel
14:08: and we have them in python so you could
14:10: go and do your analysis
14:12: in either at this point but
14:15: we've already seen how to do the
14:16: analysis in python so i'm going to show
14:18: doing the rest in excel
14:22: so we have all these results here
14:25: the first thing that we might want to do
14:28: is
14:29: a histogram to see the distribution of
14:31: the results
14:32: i just highlighted all of that i'm going
14:35: to go
14:35: and insert chart um
14:38: and then i'm going to go to histogram
14:42: and add the histogram and we can see the
14:46: basic distribution here
14:52: and
14:56: see uh we can change the number of bins
15:00: here
15:01: uh generally better to have more bins
15:04: for these simulations because you've got
15:06: so many different
15:07: cases
15:11: so 100 is maybe too many bins because
15:14: now it looks really sparse
15:15: let me go with let's try 25 on that
15:21: which looks a little bit more reasonable
15:25: this would be um
15:30: probability distribution
15:38: of years to retirement
15:45: okay um the next thing that we'll want
15:47: to look at
15:48: is the percentile table
15:52: so in order to do that first you want to
15:54: set up your
15:56: uh percentiles so i'm just going to
15:58: start with five ten percent and then
15:59: i'll be able to drag for the rest of the
16:01: range
16:03: um this is going to be years to
16:06: retirement
16:09: and then excel has the percentile
16:12: function which is like the quantile
16:15: in pandas and then we're going to grab
16:19: all that data and then the
16:22: percentile is going to be the one which
16:24: is there to the left
16:26: and make sure that you fix the range on
16:28: the
16:29: data because you don't want that to move
16:30: as you drag down but we do want the
16:32: percentile to move
16:35: so then we can complete that
16:38: and we can see that it looks right um
16:42: that you know five percent of the time
16:43: we can retire in less than 20 years
16:46: and 10 percent of the time it takes at least 40
16:48: years
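For checking the Excel percentile table by hand, here is a pure-Python sketch of a linear-interpolation percentile. To my understanding this is the same "inclusive" method that Excel's PERCENTILE function and the pandas `quantile` default use, but treat that equivalence as an assumption and compare against your own workbook. The data values are invented.

```python
def percentile_inc(data, p):
    """Linear-interpolation percentile over the sorted data, with p in [0, 1]."""
    ordered = sorted(data)
    position = (len(ordered) - 1) * p  # fractional index into the sorted data
    lower = int(position)
    fraction = position - lower
    if lower + 1 < len(ordered):
        return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])
    return ordered[lower]


# Invented years-to-retirement results
years = [18, 22, 25, 27, 30, 33, 35, 38, 41, 45, 50]

# 5%, 10%, ..., 95%, matching the rows dragged down in the worksheet
table = {p / 100: percentile_inc(years, p / 100) for p in range(5, 100, 5)}
for p, value in table.items():
    print(f"{p:.0%}: {value:.1f}")
```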
16:52: and then the
16:56: last thing that we can do here is get
16:59: the probability of a certain
17:01: outcome so
17:04: for that then um we can recreate what we
17:07: have done in pandas
17:09: by um
17:12: let's just say our objective is retiring
17:15: in 25 years
17:16: so objective
17:19: 25 or objective
17:22: years to retirement to be more clear
17:31: um and then
17:34: we just do equals if uh remember we want
17:38: to check
17:38: did the simulation meet the condition
17:41: so is the years to retirement
17:47: less than the objective
17:50: years to retirement that means we met the
17:51: objective and make sure we fix
17:53: that objective um
17:57: and if we met the objective we get a 1
17:59: otherwise we get a zero
18:01: and then we can just complete that for
18:03: all the results we can see whenever it's
18:05: less than 25
18:07: um well really it should be less than or
18:10: equal to
18:12: because 25 is also fine um
18:17: so yeah anything which is less than or
18:19: equal to 25 is now showing up as a one
18:21: and anything greater is zero so then
18:26: the probability
18:33: of uh years to retirement
18:38: less than or equal to the objective
18:42: is going to equal the average sorry it's
18:46: average in excel
18:48: average of this column that we just
18:50: created
18:53: so we get a 33 percent chance
18:56: that we're going to be able to retire in
18:58: 25 years
18:59: or less so that's
19:03: the basic monte carlo
19:06: analysis in excel
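The probability calculation above (an indicator column of ones and zeros, then its average) has a direct Python equivalent. The values below are invented; the worksheet formulas in the comments use hypothetical cell addresses.

```python
# Invented simulation results
years_to_retirement = [22, 25, 31, 24, 40, 28, 25, 36, 19, 33]
objective_years = 25

# 1 if the simulation met the objective, 0 otherwise.
# Worksheet analogue (hypothetical cells): =IF(A2<=$E$1, 1, 0)
met_objective = [1 if years <= objective_years else 0 for years in years_to_retirement]

# The share of ones is the estimated probability.
# Worksheet analogue (hypothetical range): =AVERAGE(B2:B11)
probability = sum(met_objective) / len(met_objective)
print(probability)  # 0.5
```

Using less than or equal to, rather than strictly less than, counts retiring in exactly 25 years as meeting the objective, which matches the correction made in the lecture.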
19:09: so that wraps up our example on how to
19:13: add monte carlo
19:14: simulation to an existing excel model so
19:18: thanks for listening and see you next
19:22: time

Relationship of Inputs and Outputs in Excel Monte Carlo Simulation¶

Notes¶

This continues off the prior lecture to keep the inputs associated with the outputs in the Excel output, and then to do the analysis of how the inputs relate to the outputs
It is easier to write a DataFrame into Excel, which keeps the inputs and outputs together
We create scatter plots and run a multivariate regression, just as in Python
You may need to enable the Analysis ToolPak add-in in Excel to get access to multivariate regression

Transcript¶

00:03: hey everyone this is nick derobertis
00:05: teaching you financial modeling
00:06: today we're going to be talking about
00:08: how to analyze the relationship
00:11: between inputs and outputs in our model
00:14: using monte carlo simulation in the
00:17: context
00:17: of an existing excel model
00:20: this is part of our lecture segment on
00:23: monte carlo simulation
00:26: so we left off last time
00:30: we were working on this example of how
00:34: to run monte carlo simulation on an
00:36: existing excel model
00:38: and we went ahead and got it to where we
00:40: were able to run the simulations
00:42: and output the results of those
00:44: simulations into excel
00:46: and be able to get the probability of a
00:50: particular objective
00:52: a histogram of the distribution
00:56: and a table of the percentiles of the
00:59: distribution so definitely watch that
01:02: prior video
01:04: before coming to this one what we're
01:06: doing in this video
01:07: is we're going to modify the simulation
01:10: a little bit so that it keeps
01:12: the interest rate associated
01:16: with the uh years to retirement
01:20: and that will allow us to then
01:25: analyze the relationship between the
01:27: interest rate
01:28: and the years to retirement
01:32: so i'm going to come over to here um
01:36: and we already have this function
01:39: which is getting us the random interest
01:42: rate putting into the model
01:43: getting the result from the excel model
01:45: which has recalculated at that point and
01:47: saving it
01:48: so what we need to do is
01:53: now we're gonna save not just the years
01:54: to retirement but also
01:57: the um interest rate as well
02:02: so i'm going to rename this list to all
02:04: data so it's more indicative of what
02:07: we're doing
02:08: and then here i'm going to append not
02:11: just the years to retirement but also
02:13: the
02:13: interest rate um
02:18: and let me just quickly grab that logic
02:25: so that we now
02:29: have those results in a list here
02:33: and then what we can do is we can create
02:35: a data frame
02:36: from those results so data frame
02:40: and then the columns are going to be
02:44: interest and years to retirement
02:53: and then we have that data frame
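A minimal sketch of keeping each input paired with its output and turning the pairs into a DataFrame. The numbers and the exact column names are placeholders. The commented xlwings line shows one way to write the DataFrame without its index, assuming `sim_sheet` refers to an open worksheet.

```python
import pandas as pd

# Placeholder (interest rate, years to retirement) pairs, one per simulation
simulated_pairs = [(0.04, 31.0), (0.07, 26.0), (0.10, 23.0)]

all_data = []
for interest_rate, years_to_retirement in simulated_pairs:
    # Append the input and the output together so they stay associated
    all_data.append((interest_rate, years_to_retirement))

df = pd.DataFrame(all_data, columns=["Interest Rate", "Years to Retirement"])
print(df)

# With xlwings and an open workbook, writing without the index column:
#   sim_sheet.range("A1").options(pd.DataFrame, index=False).value = df
```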
02:57: um and then instead of
03:00: i'm gonna go ahead and move this over
03:03: one
03:04: um instead of writing using the
03:08: column and uh you know list within list
03:11: approach we can actually just
03:13: write the data frame back into
03:16: excel um
03:19: so then we're going to do uh this same
03:23: assignment but we're going to
03:28: assign the data frame instead so
03:35: but i know if i do this right now it's
03:36: going to bring the index over and so
03:37: it'll be three columns and it's going to
03:39: overwrite what we have here
03:42: so i'm going to also put options
03:45: data frame uh index equals false
03:50: um and then we should get it coming in
03:54: it will have the headers um but that's
03:57: that's fine
03:58: so let's give that a try um
04:01: and now we do see coming over here what
04:04: we expected to see
04:05: of course we only have it at 10
04:07: iterations right now instead of the full
04:09: um instead of the full thousand
04:14: now we can see that works so let's bring
04:17: that back into our function
04:19: rather than what we had before
04:24: of doing the list within list approach
04:28: so bring all that into here and then we
04:32: can return the data frame instead
04:39: and now we should be able to run this
04:41: thousand simulations
04:43: ten percent mean five percent interest
04:46: or
04:46: standard deviation and
04:50: we want to see uh
04:53: the first few results out of that so
04:56: let's give that a try
04:58: it's going to take a little bit to run
05:00: through the model
05:05: and oh i didn't redefine this that would
05:11: cause it to not actually change
05:15: i've accidentally got this still in here
05:18: definitely don't want that
05:21: that was the old code so now let's
05:24: redefine this okay now let's try this
05:27: again
05:29: um so again it's going to take a little
05:31: bit to run
05:32: but now we do see all the interest rates
05:35: associated with the years to retirement
05:37: coming in
05:39: and now we can modify this
05:42: because now the years to retirement have
05:44: moved
05:46: and so we can get back to our
05:48: probability of achieving the objective
05:56: and so now and then this also we want to
06:02: move over
06:05: and same thing with the percentile
06:07: because we did have to move
06:09: um that column
06:13: so just uh carrying those results all
06:16: over
06:17: great so
06:20: we have everything we need in excel now
06:23: let's
06:23: go to do the analysis of how the inputs
06:26: relate to the outputs
06:28: so the first thing that you want to do
06:30: is a scatter plot so we can just
06:32: highlight
06:32: all of these data
06:36: back up to the top and we want to insert
06:41: and we can see that the scatter plot
06:43: comes up as the first
06:45: recommended chart there so let's add
06:47: that
06:48: and we can definitely see a very clear
06:50: relationship here
06:51: between the years to retirement and
06:55: the interest rate
07:00: it also highlights that we have some
07:01: issues here we are getting some
07:03: negative interest rates and so we would
07:05: probably want to
07:07: deal with that in the model as well
07:11: um let's come over to python to quickly
07:13: fix that
07:17: so what we can do is instead of
07:20: random.normalvariate we can
07:28: call that but um we're going to do
07:32: while the value is less than zero we're
07:34: going to keep drawing more
07:35: values um and initialize the value at a
07:40: negative number
07:43: and we can call this random normal
07:45: positive you can see i explained this
07:47: in more detail
07:49: in the adding uh
07:52: monte carlo simulation to a python model
07:54: video
07:55: uh we built the same function over
07:58: there
07:59: um and then that can take the mean and
08:02: the
08:03: standard deviation
08:08: and then finally return the value
08:13: so then we can use that instead of
08:15: random.normalvariate
08:18: redefine that let's go ahead and try
08:21: this again
08:22: and then after this finishes we'll see
08:25: the numbers update
08:26: yep and now we see that we don't have
08:28: those negative numbers in the
08:29: distribution
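The redraw-until-non-negative helper described above, as a short sketch. The function name follows the lecture; the seed and the sample size are arbitrary choices for the demonstration.

```python
import random


def random_normal_positive(mean, std):
    """Draw from a normal distribution, redrawing until the value is not negative."""
    value = -1  # start negative so the loop draws at least once
    while value < 0:
        value = random.normalvariate(mean, std)
    return value


random.seed(0)  # arbitrary seed so the demonstration is repeatable
draws = [random_normal_positive(0.10, 0.05) for _ in range(1000)]
print(min(draws))
```

One thing to keep in mind: discarding the negative draws truncates the distribution, so the average of the accepted values ends up slightly above the mean you passed in.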
08:30: so doing plots of your output is a great
08:33: way to
08:34: check and understand everything that's
08:36: going on in your
08:38: simulations so now we can see there's a
08:41: clear relationship
08:42: just from the scatter plot as the
08:44: interest rate goes up years to retirement
08:46: goes down
08:46: and it's non-linear it's
08:49: a steeper decrease at first and then it
08:51: flattens out as you get to higher
08:53: interest rates
08:55: but we can quantify this relationship
08:59: using the regression so
09:02: regression in excel is going to live on the
09:04: data tab
09:05: and then it would be over here
09:09: you can notice that i have nothing over
09:10: here right now and that's because i need
09:12: to enable
09:14: the add-in or the analysis
09:16: toolpak
09:17: for that to show up so you might already
09:20: have
09:20: data analysis showing up over here but
09:22: if you don't
09:23: it is built into excel you just have to
09:25: enable it so in order to do that you do
09:28: file and then options
09:32: and then add-ins
09:36: um and then you want to manage excel
09:38: add-ins
09:40: and then
09:43: that's where it does say that i have it
09:45: enabled
09:47: let me try disabling it and then
09:49: re-enabling it to see
09:51: if that will allow it to come up um
09:56: hopefully this will come up okay good
09:59: good so now we have the data analysis
10:02: section
10:02: showing up over here and so we can go to
10:06: do
10:06: our regression so just click data
10:10: analysis it brings up a lot of options
10:11: the one we want to use here
10:13: is regression and then it's going to ask
10:15: for your y's and your x's
10:17: so the y is always going to be
10:20: whatever your output is here years to
10:23: retirement
10:25: and the x you can add multiple x
10:28: variables
10:29: and you should if you're changing
10:31: multiple things in your simulation
10:33: here we just have one variable so i'm
10:35: going to add that
10:38: and you'll notice with both of these
10:39: that i picked up the
10:41: label as well and so i'm going to check
10:43: that they have labels and that will
10:44: allow that to come through into the
10:46: regression results as well
10:49: and i'm going to output that analysis
10:53: um right here you can also put it on a
10:56: new sheet
10:57: if you'd like um and then hit okay to
11:01: run it
11:03: and then we see the regression output
11:05: coming up here
11:08: um so then we
11:11: um get the result here of a negative
11:15: 95.8 coefficient with a very low
11:17: p value so it definitely is
11:19: significantly related
11:22: and negative 95.8 that's saying a one
11:25: unit increase
11:26: in the interest rate decreases years to
11:28: retirement by 95.8 years
11:31: now you might say whoa that's huge why
11:32: is that so huge
11:34: that's because a one unit increase here
11:36: is going from
11:37: zero to one hundred percent interest so
11:40: that's obviously uh
11:42: much larger than a realistic interest
11:44: rate change
11:45: uh so to get it to a one percent
11:48: increase in interest rate which is a
11:50: more reasonable thing to talk about we
11:51: just divide by 100
11:53: um and so that would be a one percent
11:56: increase in interest rate
11:58: decreases years to retirement by almost
12:00: a year
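
The same single-input regression and the divide-by-100 interpretation can be sketched in Python. The data below is made up purely for illustration and is not the simulation output from the video; only the -95.8 coefficient comes from the transcript. Note that "one percent" here means one percentage point, since the interest rate is stored as a decimal.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for one input."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up simulation results for illustration only
interest_rates = [0.02, 0.04, 0.06, 0.08, 0.10]
years_to_retirement = [33.0, 30.0, 28.0, 26.5, 25.5]

slope, intercept = ols_slope_intercept(interest_rates, years_to_retirement)

# A one unit change is 0% -> 100% interest, so divide by 100
# to get the effect of a one percentage point change
slope_per_point = slope / 100

# Applying the same conversion to the coefficient quoted in the video
video_coefficient = -95.8
video_per_point = video_coefficient / 100  # roughly -0.96 years per percentage point
print(slope, slope_per_point, video_per_point)
```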
12:03: um so
12:06: that um we can use that to interpret it
12:10: um and then you know if you had
12:13: other inputs they would just show up as
12:15: additional lines here and you could
12:17: interpret those
12:17: coefficients in a similar fashion
12:21: um and then the one other part of the
12:24: analysis
12:25: is when you do have multiple inputs you
12:27: can't just directly compare the
12:29: coefficients to determine
12:31: which is the most impactful you have to
12:34: compare these standardized coefficients
12:36: and to get the standardized coefficients
12:39: um
12:40: so you would do this for each one of
12:42: your um
12:44: you don't care about the intercept uh
12:47: you would do this for
12:48: each one of uh your inputs
12:52: you would go and you would calculate the
12:55: standard deviation
12:57: uh of that input
13:02: and then
13:06: the standardized coefficients so this
13:08: would be standard deviation
13:10: and then standardized coefficients
13:16: so again interest it's just going to be
13:20: standard deviation multiplied by the
13:22: coefficient
13:24: so that's now saying that a one standard
13:26: deviation increase
13:28: in the interest rate leads to
13:32: uh is associated with a decrease in
13:35: years to retirement of
13:37: almost four and a half years and a one
13:39: standard deviation increase in interest
13:41: rate
13:41: is close to five percent
13:45: um and then when you have other
13:46: coefficients here you can just
13:48: pick the largest in absolute value and
13:50: those are going to be the ones
13:52: which are the most impactful inputs in
13:54: your
13:55: model so that shows
13:58: how we can do this analysis of the
14:01: relationship between the inputs and
14:02: outputs
14:03: in excel and then
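
The standardized coefficient calculation and the ranking by absolute value can be sketched as below. Everything except the -95.8 coefficient is hypothetical: the input draws, the second input `savings_rate`, and its coefficient are invented for illustration, and the sample standard deviation is an assumed choice since the video does not say which one is used. As a rough check against the video's numbers, -95.8 multiplied by a standard deviation of about 0.047 gives about -4.5 years.

```python
from statistics import stdev


def standardized_coefficients(coefficients, inputs):
    """Multiply each regression coefficient by the standard deviation of its input."""
    return {name: coefficients[name] * stdev(inputs[name]) for name in coefficients}


def rank_by_impact(std_coefs):
    """Order input names from most to least impactful by absolute standardized coefficient."""
    return sorted(std_coefs, key=lambda name: abs(std_coefs[name]), reverse=True)


# Hypothetical simulated inputs; the intercept is left out on purpose
inputs = {
    "interest_rate": [0.01, 0.05, 0.09, 0.03, 0.07],
    "savings_rate": [0.20, 0.25, 0.30, 0.22, 0.28],
}
# -95.8 is the coefficient from the video; -40.0 is made up
coefficients = {"interest_rate": -95.8, "savings_rate": -40.0}

std_coefs = standardized_coefficients(coefficients, inputs)
ranking = rank_by_impact(std_coefs)
print(std_coefs, ranking)
```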
14:06: to wrap up all this material on monte
14:09: carlo simulation
14:11: there's also a lab exercise here on
14:14: doing this process for
14:16: your project one model in excel
14:20: it's going to be very similar to the
14:23: python
14:25: project one extension exercise that was
14:27: mentioned
14:28: in the prior video on extending the
14:31: dynamic salary retirement model in
14:32: python
14:34: um you're just going to go through this
14:36: process of adding
14:38: a monte carlo simulation to your excel
14:40: model
14:42: so here in the level one you're going to
14:44: be varying the number of phones
14:47: um and then analyze the results
14:50: table of probabilities um chance of
14:53: reaching 800 million npv
14:55: et cetera and in level two then you're going to
14:58: do the same
15:00: keep that varying as it was but also
15:02: vary the
15:04: lifespan of the machines
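
The table of probabilities and the chance of reaching a target NPV can be sketched as follows. The `toy_npv` function is a made-up stand-in for the project model, and the distribution of the number of phones is an assumption; only the 800 million target comes from the transcript.

```python
import random


def probability_table(outcomes, percentiles=(5, 25, 50, 75, 95)):
    """Map each percentile to the outcome found at that position in the sorted results."""
    ordered = sorted(outcomes)
    n = len(ordered)
    return {p: ordered[min(n - 1, int(p / 100 * n))] for p in percentiles}


def chance_of_reaching(outcomes, target):
    """Fraction of simulated outcomes that are at least the target."""
    return sum(1 for o in outcomes if o >= target) / len(outcomes)


def toy_npv(num_phones):
    """Hypothetical placeholder model: NPV grows linearly with the number of phones."""
    return 8_000 * num_phones - 100_000_000


# Assumed input distribution: mean 100,000 phones, standard deviation 20,000
random.seed(0)
npvs = [toy_npv(random.normalvariate(100_000, 20_000)) for _ in range(10_000)]

table = probability_table(npvs)
chance = chance_of_reaching(npvs, 800_000_000)
print(table, chance)
```

For level two, the same helpers would apply unchanged: only the model call would take a second randomly drawn input, such as the machine lifespan.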
15:08: and then in addition to
15:11: the visualization we just talked about
15:13: then you want to go through this
15:14: analysis
15:15: of what's the relationship between the
15:18: inputs
15:19: and the outputs so
15:23: that wraps up this segment on monte
15:25: carlo
15:26: simulation so thanks for listening and
15:28: see you next time