Monte Carlo Simulation

Assign probability distributions to the inputs of a model to obtain probability distributions of its outputs. This enables a much deeper understanding of model results, especially the risk associated with a result.

Introduction to Monte Carlo Simulations


Notes

  • Monte Carlo simulation is the external counterpart to internal randomness

  • The core model is still (probably) deterministic, but then we add randomness into the model by randomizing the inputs to the model and running it many times

  • Adding this randomness allows us to answer deeper questions about the problem, such as what is the chance of some outcome occurring

  • The process is the same as for sensitivity or external scenario analysis: run the model multiple times with different inputs. The only difference is that here we randomly draw the inputs from distributions

  • We will also do some additional analysis on the results from the MC simulation
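
The notes above can be illustrated with a minimal sketch, which is not the course's own code. It leaves a deterministic model untouched and adds randomness from the outside by drawing the model's input on every run. The names `future_value` and `simulate`, the starting principal of 1,000, and the 1,050 threshold are hypothetical choices for illustration. The normally distributed interest rate with a 3% mean and 2% standard deviation follows the example given in the lecture.

```python
import random


def future_value(principal, interest_rate, years=1):
    """Deterministic core model: there is no randomness inside it."""
    return principal * (1 + interest_rate) ** years


def simulate(n_iter=1000, seed=None):
    """Run the core model n_iter times, drawing a random input each time."""
    rng = random.Random(seed)  # seeded generator so runs are reproducible
    outputs = []
    for _ in range(n_iter):
        # Assumed input distribution: normal, mean 3%, standard deviation 2%
        rate = rng.normalvariate(0.03, 0.02)
        outputs.append(future_value(1000, rate))
    return outputs


results = simulate(n_iter=10_000, seed=0)

# Share of simulations at or above a hypothetical threshold of 1,050
prob_above_1050 = sum(value >= 1050 for value in results) / len(results)
print(f"Estimated chance of ending with at least 1,050: {prob_above_1050:.1%}")
```

Because `future_value` itself never changes, the same function could just as easily be driven by a hand-picked list of rates, which is the sensitivity analysis pattern.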

Transcript

  • 00:02: hey everyone
  • 00:03: nick derobertis here teaching you
  • 00:05: financial modeling today we're going to
  • 00:07: be doing an
  • 00:08: introduction to monte carlo simulation
  • 00:11: and this
  • 00:11: is part of our new lecture segment on
  • 00:14: the same monte carlo simulation
  • 00:17: so we've explored in this course already
  • 00:21: a couple other ways of exploring the
  • 00:23: parameter space
  • 00:24: looking at different inputs and how they
  • 00:26: affect our model
  • 00:28: we've already looked at sensitivity
  • 00:30: analysis and scenario analysis
  • 00:32: and now monte carlo simulation comes to
  • 00:36: round out that set of possibilities
  • 00:40: so monte carlo simulation
  • 00:44: is unique in that it allows you to take
  • 00:47: a deterministic model which has no
  • 00:50: notion of randomness or
  • 00:52: probability and be able to make
  • 00:55: conclusions
  • 00:56: about the chance of certain outcomes
  • 00:59: occurring
  • 00:59: in your model so you're able to take
  • 01:01: this model with no probability
  • 01:03: in it at all and add it externally
  • 01:06: without having to change the core model
  • 01:08: itself
  • 01:10: um so that will give you a good
  • 01:13: understanding of
  • 01:15: not just what is kind of the expected
  • 01:17: outcome from the model
  • 01:18: but also what's the full range of
  • 01:20: possible outcomes and
  • 01:22: what is the chance of each of these
  • 01:24: outcomes occurring
  • 01:26: and certainly in finance this is an
  • 01:28: important thing to consider
  • 01:30: because we're always concerned about the
  • 01:32: risk return
  • 01:33: trade-off and we can
  • 01:36: think of in a sense the baseline output
  • 01:40: from our model as
  • 01:41: being kind of the return or you know
  • 01:44: whatever objective
  • 01:45: that we're uh looking at evaluating in
  • 01:48: the model
  • 01:49: and then the chance of of getting
  • 01:52: different
  • 01:54: outcomes we can think of that as the
  • 01:56: risk so
  • 01:59: running the model without monte carlo
  • 02:01: simulation we're kind of just looking at
  • 02:02: one side of the risk return trade-off
  • 02:05: and so it can be very helpful to bring
  • 02:07: this in to fully evaluate the problem
  • 02:11: so i think it's useful to motivate this
  • 02:13: by an example
  • 02:15: um so let's talk about a potential
  • 02:19: bet that you can place um so
  • 02:22: you have an opportunity to place a bet
  • 02:25: for one dollar
  • 02:26: and if you win that bet you're going to
  • 02:28: get two dollars
  • 02:30: but if you lose the bet then you're
  • 02:32: going to lose 750,000 dollars
  • 02:35: and there's no way to avoid this payment
  • 02:38: through legal means
  • 02:39: you're gonna have to be obligated to pay
  • 02:41: that no matter what
  • 02:44: and then looking at the odds of this bet
  • 02:48: uh one in a million you lose that
  • 02:52: 750,000 and every other time
  • 02:55: 999,999 times out of a million
  • 02:59: you're going to win the two dollars
  • 03:03: and if you just go and you take the
  • 03:05: expected value
  • 03:06: of this bet then the expected profit is
  • 03:10: 25 cents so just looking
  • 03:12: at kind of the expected outcome you
  • 03:15: should definitely take this bet
  • 03:17: but uh you know any
  • 03:20: reasonable person looking at this bet
  • 03:24: would think twice and at least carefully
  • 03:27: consider should i really take this bet
  • 03:29: because that downside of losing 750
  • 03:33: thousand dollars
  • 03:35: is so severely bad that
  • 03:38: even though it has such a low
  • 03:40: probability
  • 03:41: you might not want to take the bet
  • 03:43: because you only have such a small
  • 03:45: amount to gain in the bet
  • 03:48: so if you make decisions 100 percent on
  • 03:50: expected value
  • 03:51: you would take the bet but
  • 03:55: more than just the expected value
  • 03:57: matters here the probabilities of
  • 03:59: different outcomes
  • 03:59: occurring and what possible outcomes can
  • 04:02: occur
  • 04:03: still matter even beyond expected value
  • 04:08: so this is to some extent the
  • 04:12: kind of concepts we're trying to get at
  • 04:14: with monte carlo simulation
  • 04:17: you want to see what are the different
  • 04:18: possible outcomes from the model
  • 04:20: and consider what that means for your
  • 04:23: particular situation
  • 04:27: and then as far as running monte carlo
  • 04:29: simulation
  • 04:30: this is a visualization of how that
  • 04:32: looks
  • 04:34: and you may notice that this is
  • 04:37: uh very similar to what we had seen
  • 04:40: for sensitivity analysis
  • 04:43: in fact it's the same image here
  • 04:46: describing monte carlo simulation
  • 04:48: and that's because the process to run
  • 04:50: each of them is almost exactly the same
  • 04:55: so the same for external scenario
  • 04:58: analysis
  • 04:59: really all three techniques you follow
  • 05:01: the same pattern
  • 05:03: uh basically just run the model a bunch
  • 05:04: of times passing in different inputs
  • 05:06: each time
  • 05:08: and associating those inputs with the
  • 05:09: outputs and putting it all together in
  • 05:12: some sort of analysis and visualization
  • 05:14: at the end
  • 05:16: um so the the difference
  • 05:20: because the process is so similar the
  • 05:22: difference with monte carlo simulations
  • 05:24: is that we are assigning distributions
  • 05:28: probability distributions
  • 05:30: to each of the inputs and we're randomly
  • 05:32: drawing
  • 05:33: the values of those inputs to put into
  • 05:36: our model
  • 05:37: whereas with external scenario analysis
  • 05:40: we said
  • 05:40: what are all the inputs that make sense
  • 05:43: for this situation
  • 05:44: we're manually picking those values and
  • 05:47: also with sensitivity analysis we said
  • 05:49: we want to look at
  • 05:50: you know investment rates between one
  • 05:52: and five percent
  • 05:53: and so you're manually saying well i
  • 05:55: want to look at one two three four five
  • 05:57: percent
  • 05:58: um but with monte carlo simulation you
  • 06:00: just say well the
  • 06:01: interest rate is going to have a mean of
  • 06:03: three percent it's
  • 06:05: going to have a standard deviation of 2
  • 06:06: percent on a normal distribution
  • 06:08: and then each time that you run the
  • 06:10: model you don't know what interest rate
  • 06:12: is going to go in the model it's just
  • 06:14: randomly picked from the distribution it
  • 06:16: could be 5%
  • 06:17: this time 1% the next time and so on
  • 06:23: and then what this allows you to do
  • 06:26: which is unique
  • 06:27: to monte carlo simulation is
  • 06:30: because we have probability
  • 06:33: distributions of the inputs
  • 06:35: that allows you to assign a probability
  • 06:37: distribution to the outputs
  • 06:39: as well um so that's where
  • 06:42: we're going to be able to get into
  • 06:44: talking about the chance
  • 06:46: of a certain outcome occurring uh you
  • 06:48: know maybe it's a capital budgeting
  • 06:50: setting and you're
  • 06:51: looking at the the probability that
  • 06:53: you're going to have a positive
  • 06:55: npv uh or it could be a portfolio
  • 06:58: setting where you're saying uh you know
  • 07:01: there's
  • 07:02: a 10% chance that we're gonna lose as
  • 07:04: much
  • 07:05: as three hundred thousand dollars uh
  • 07:08: there are a lot of different situations
  • 07:09: where you wanna think about the
  • 07:11: probability
  • 07:12: and monte carlo simulation is a nice way
  • 07:15: to
  • 07:16: be able to come to those conclusions
  • 07:18: with an otherwise
  • 07:20: deterministic model
  • 07:23: so that's the quick intro of monte carlo
  • 07:26: simulation
  • 07:28: we're gonna do a different order for
  • 07:31: uh exposing everyone to this we're going
  • 07:33: to go
  • 07:34: next to actually running the simulation
  • 07:37: um and then we're gonna come back and
  • 07:39: discuss
  • 07:40: more formally what we're doing i think
  • 07:43: it's easiest to learn all this by
  • 07:45: example
  • 07:47: because when we talk about it formally
  • 07:49: it can sound a little bit complicated
  • 07:51: but really it's not
  • 07:52: and just seeing it by an example really
  • 07:55: drives that home
  • 07:56: so we'll come back next time to look at
  • 07:58: that example of
  • 07:59: running a monte carlo simulation so
  • 08:02: thanks for listening
  • 08:03: and see you next time
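
The bet described in the lecture can be checked with a few lines of arithmetic. This is a sketch under one stated assumption: a win pays 2 dollars on a 1 dollar stake, for a net gain of 1 dollar, and a loss costs 750,000 dollars. The variable names are hypothetical. It shows why the expected value alone hides the risk.

```python
from fractions import Fraction
import math

p_lose = Fraction(1, 1_000_000)   # one in a million
p_win = 1 - p_lose                # 999,999 in a million

net_win = 2 - 1        # assumption: 2 dollars received minus the 1 dollar stake
net_loss = -750_000    # the loss stated in the lecture

# Expected profit: probability-weighted average of the two outcomes
expected_profit = p_win * net_win + p_lose * net_loss

# Standard deviation of the profit, as a simple measure of risk
variance = p_win * net_win**2 + p_lose * net_loss**2 - expected_profit**2
std_dev = math.sqrt(float(variance))

print(f"Expected profit: {float(expected_profit):.6f} dollars")
print(f"Standard deviation of profit: {std_dev:.2f} dollars")
```

The expected profit comes out just under 25 cents, matching the figure in the lecture, while the standard deviation is roughly 750 dollars. That gap between a tiny expected gain and a very wide spread of outcomes is what expected value on its own does not show.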

Monte Carlo Investment Returns


Notes

  • Running Monte Carlo simulations in Excel without the use of an add-in is complex

  • Running Monte Carlo simulations in Python takes just a few lines of code

  • If you want to add Monte Carlo simulation to an Excel model, it is easiest to use xlwings to connect Python to the workbook and have Python run the simulations on your Excel model

  • After running the simulations, you must analyze and visualize the output

  • A histogram is a good choice for showing the output distribution

  • A table of the percentiles of the distribution and values corresponding to those percentiles is a more quantitative way to show the output distribution

  • If we have some specific objective or loss in mind, we can determine the probability of achieving the objective/loss
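
The workflow in these notes can be sketched with the standard library alone. This is an illustration, not the notebook from the course site, which uses pandas and a histogram. All function names are hypothetical. The 3% risk-free rate and 0.5 stock weight are assumptions, chosen because they reproduce the 6.5% baseline portfolio return mentioned in the lecture. The stock return distribution (10% mean, 20% standard deviation), the 1,000 starting value, and the 1,050 objective come from the lecture.

```python
import random
import statistics


def portfolio_end_value(stock_ret=0.10, stock_weight=0.5,
                        risk_free=0.03, initial_value=1000):
    """Core deterministic model: one-year ending value of the portfolio."""
    port_ret = risk_free * (1 - stock_weight) + stock_ret * stock_weight
    return initial_value * (1 + port_ret)


def run_simulations(stock_weight=0.5, n_iter=1000,
                    stock_mean=0.10, stock_std=0.20, seed=None):
    """Draw a random stock return per iteration and collect model outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_iter):
        stock_ret = rng.normalvariate(stock_mean, stock_std)
        outputs.append(portfolio_end_value(stock_ret=stock_ret,
                                           stock_weight=stock_weight))
    return outputs


def probability_of_objective(values, objective=1050):
    """Average of 1 (objective met) and 0 (not met) across simulations."""
    return sum(1 if v >= objective else 0 for v in values) / len(values)


results = run_simulations(seed=42)

# Percentile table in 5% steps (19 cut points: 5%, 10%, ..., 95%)
cut_points = statistics.quantiles(results, n=20)
percentile_table = {5 * (i + 1): round(v, 2) for i, v in enumerate(cut_points)}
for pct, value in percentile_table.items():
    print(f"{pct:>3}%  {value:>10.2f}")

prob = probability_of_objective(results)
print(f"Probability of reaching 1,050: {prob:.1%}")

# Repeat the whole process for several stock weights and compare
prob_by_weight = {
    w: probability_of_objective(run_simulations(stock_weight=w, seed=42))
    for w in (0.0, 0.25, 0.5, 0.75, 1.0)
}
print(prob_by_weight)
```

With pandas available, the percentile table corresponds to calling `quantile` on the column of results with a list of percentiles, and a histogram of the same column gives the visual version of the distribution.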

Transcript

  • 00:03: hey everyone
  • 00:04: nick derobertis here teaching you
  • 00:05: financial modeling
  • 00:07: today we're going to be looking at an
  • 00:09: example of how to apply
  • 00:11: monte carlo simulation and this is going
  • 00:14: to be
  • 00:14: in the context of a portfolio model
  • 00:17: where we're looking to
  • 00:18: allocate resources between two
  • 00:21: assets this is part of our lecture
  • 00:24: series on
  • 00:24: monte carlo simulation so
  • 00:28: we introduced the whole idea of what
  • 00:32: monte carlo simulation
  • 00:33: is and why we would want to go about it
  • 00:36: and
  • 00:37: i mentioned that we're going to look at
  • 00:39: an example of how to do it before
  • 00:41: getting into the more formal definition
  • 00:44: of
  • 00:44: what it is that we're doing i think it's
  • 00:47: easiest to learn this by example
  • 00:50: before we look at that example let's
  • 00:52: just quickly talk about
  • 00:54: how you would actually go and run this
  • 00:57: so monte carlo simulation can be applied
  • 01:00: to
  • 01:01: any model it doesn't matter if it's in
  • 01:03: python or excel
  • 01:05: but it is definitely easier to run monte
  • 01:08: carlo simulations
  • 01:10: in python if you want to do a pure
  • 01:14: excel monte carlo simulation
  • 01:17: unless you're getting some kind of
  • 01:19: add-in uh
  • 01:20: then you're going to be looking at doing
  • 01:22: some kind of
  • 01:23: data table approach which is going to be
  • 01:26: fairly complicated to work out
  • 01:28: for more than two inputs
  • 01:31: um so i generally wouldn't recommend
  • 01:34: that
  • 01:35: you can also go to vba but
  • 01:38: it's generally easier to just handle
  • 01:40: this with python
  • 01:42: uh because in python you just do a loop
  • 01:45: over the number of iterations each time
  • 01:47: you draw the random inputs you run the
  • 01:49: model
  • 01:50: and then you collect the output so it's
  • 01:52: really not more than a few lines of code
  • 01:54: to
  • 01:55: make this happen in python
  • 01:59: so we're going to look first at the
  • 02:01: python example
  • 02:03: of how to do this
  • 02:06: because it's more straightforward since
  • 02:08: ultimately we're going to run the monte
  • 02:10: carlo simulation in python so if the
  • 02:12: model is already in python
  • 02:14: it's very straightforward and then when
  • 02:16: we go over to
  • 02:18: running a monte carlo simulation on an
  • 02:20: excel model
  • 02:22: in a future video then we're going to
  • 02:25: use
  • 02:25: xlwings to go back and forth between
  • 02:28: python and excel and have python
  • 02:30: orchestrate the monte carlo simulation
  • 02:33: on the excel
  • 02:34: model
  • 02:38: so the problem that we're going to look
  • 02:40: at here
  • 02:41: which is a good application for monte
  • 02:43: carlo simulation
  • 02:45: is an investment problem we have a
  • 02:48: thousand dollars now
  • 02:49: we need a thousand and fifty a year from
  • 02:51: now
  • 02:52: and we have a choice of investing in two
  • 02:54: different assets a risk-free asset
  • 02:57: and a stock and the risk-free asset
  • 03:00: being that it's risk-free it always
  • 03:02: returns the same percentage
  • 03:04: whereas the stock the return is going to
  • 03:08: be drawn
  • 03:08: from a normal distribution
  • 03:12: with a 10% average and 20% standard
  • 03:15: deviation
  • 03:17: so the question here is how much should
  • 03:20: we allocate
  • 03:21: to the risk free and to the stock
  • 03:24: in order to maximize our chance of
  • 03:27: meeting our objective
  • 03:28: of having 1050 in one year
  • 03:33: i'm gonna intentionally set up the
  • 03:35: values of the inputs this way so that
  • 03:37: you couldn't just put 100% in the risk
  • 03:39: free
  • 03:39: that is not going to be able to get you
  • 03:41: to the goal of
  • 03:42: a thousand and fifty dollars so you do
  • 03:44: have to put some amount into the stock
  • 03:47: but how much should you put into the
  • 03:49: stock
  • 03:51: so the way we're going to approach this
  • 03:52: is first we're going to
  • 03:54: build out the basic model
  • 03:58: to get the portfolio value for given
  • 04:02: returns
  • 04:03: and then we're going to
  • 04:07: get that stock return from a normal
  • 04:10: distribution
  • 04:11: and re-run the model for a number of
  • 04:14: iterations
  • 04:15: and put this all together to analyze and
  • 04:18: visualize
  • 04:20: and then uh because we're trying to
  • 04:23: evaluate
  • 04:24: what's the best weight we're going to
  • 04:25: then repeat
  • 04:27: this whole process for a number of
  • 04:31: different weights
  • 04:32: into the two assets and then pick the
  • 04:35: one
  • 04:35: which has the highest probability of
  • 04:38: achieving the objective of a thousand
  • 04:40: fifty dollars
  • 04:43: so let's go ahead and jump over to the
  • 04:46: jupyter notebook
  • 04:47: example for this there's already a
  • 04:48: completed example
  • 04:50: on the course site that you can download
  • 04:52: and take a look at
  • 04:59: so this is that jupyter notebook and
  • 05:01: here at the beginning
  • 05:02: it again describes uh the problem
  • 05:06: and our general approach for going about
  • 05:09: it
  • 05:11: so let's go ahead and jump into the code
  • 05:13: since we've
  • 05:14: already kind of covered this well
  • 05:16: quickly i'll just talk about the math
  • 05:17: going on here
  • 05:20: so in order to get the return on the
  • 05:24: portfolio of these two assets
  • 05:26: we take a weighted average of the
  • 05:28: returns on each of the assets
  • 05:30: so it's the weight in the
  • 05:32: risk free times the return on the risk
  • 05:34: free
  • 05:35: plus the weight in the stock times the
  • 05:37: return on the stock
  • 05:40: um but we can simplify this a little bit
  • 05:44: to remove
  • 05:44: one input from the model uh because we
  • 05:47: only have two assets
  • 05:49: the weights must sum up to one we must
  • 05:51: be 100% invested
  • 05:52: in total and so then the weight on the
  • 05:55: risk free is one minus
  • 05:56: the weight on the stock so we can
  • 05:59: eliminate that weight on the risk free
  • 06:01: and we have the return on the portfolio
  • 06:04: being the risk-free
  • 06:05: times one minus the stock weight plus
  • 06:08: the
  • 06:08: stock return times the weight on the
  • 06:11: stock
  • 06:13: so translating that over into python
  • 06:16: code we can define some variables here
  • 06:19: for the stock return the risk free and
  • 06:21: the weight on the stock
  • 06:23: and we can create this formula here
  • 06:26: which gets us the portfolio return
  • 06:28: so that's just taking this formula and
  • 06:30: implementing it
  • 06:31: in python risk-free times one minus
  • 06:34: stock weight
  • 06:35: uh plus the stock return times the stock
  • 06:37: weight
  • 06:39: and with these baseline values we get a
  • 06:41: 6.5 percent portfolio return
  • 06:47: so then we can take the initial value on
  • 06:51: the portfolio
  • 06:52: and multiply it by one plus the
  • 06:54: portfolio return this is a one-year
  • 06:56: model so we can multiply by one plus the
  • 06:59: portfolio return
  • 07:00: and get the ending value on the
  • 07:02: portfolio
  • 07:04: so this is basically our entire
  • 07:07: core model here i wanted to keep this as
  • 07:11: very simple so we can focus on the
  • 07:14: monte carlo simulation
  • 07:16: so we can just put that all into a
  • 07:20: function
  • 07:22: we want to definitely have a function
  • 07:24: that runs our core model
  • 07:26: as we go into the monte carlo simulation
  • 07:30: so this is just taking that logic we
  • 07:32: worked out getting the portfolio return
  • 07:34: and then applying that return to the
  • 07:36: initial value on the portfolio
  • 07:39: to get the end value and we can pass any
  • 07:42: of these
  • 07:43: as inputs so we can say what if the
  • 07:47: uh stock return was instead 20%
  • 07:50: then we would be getting even more money
  • 07:54: so yeah here just showing trying with
  • 07:57: some other inputs
  • 08:00: so our core model is done we have one
  • 08:02: function that runs the core model
  • 08:05: and you would want to have this kind of
  • 08:08: structure
  • 08:09: for any model any python model that
  • 08:11: you're going to go and do monte carlo
  • 08:12: simulations on you want to have one
  • 08:14: function
  • 08:15: which can take the model inputs and
  • 08:17: return the outputs from your model
  • 08:22: so we want to get into running the
  • 08:25: simulation now
  • 08:27: and recall that the first step in the
  • 08:30: monte carlo simulation process
  • 08:32: is to randomly pick the values of our
  • 08:36: inputs and so we've got to set it up so
  • 08:38: that we can randomly pick those
  • 08:40: values so we already covered in the
  • 08:43: continuous random variable material
  • 08:46: that we can use python's random
  • 08:48: library and random.normalvariate
  • 08:51: to pull numbers from a normal
  • 08:53: distribution
  • 08:54: so you'll see as i've run this multiple
  • 08:55: times i get different values
  • 08:57: uh according to pulling random numbers
  • 09:00: from
  • 09:01: a distribution which has a 10 percent
  • 09:04: mean and a 20 percent standard deviation
  • 09:11: so then we can put this together
  • 09:15: with the
  • 09:18: function which runs our model to now
  • 09:21: run the model along with a random
  • 09:24: uh rate of return on the stock so
  • 09:28: here's just that same uh logic to get
  • 09:31: the random stock return then i'm just
  • 09:33: printing out
  • 09:34: what that stock return is uh before
  • 09:38: running it through the model so you can
  • 09:41: see each time that i run it we get
  • 09:43: different stock returns
  • 09:45: and we get different portfolio end values
  • 09:47: as a result
  • 09:49: and this kind of gets into the core
  • 09:52: of why we need to do a model like this
  • 09:55: you can see that
  • 09:56: we're getting quite different portfolio
  • 09:58: values each time
  • 09:59: and a number of them are not meeting our
  • 10:01: goal of 1050 and so this definitely
  • 10:04: is um something we need to carefully
  • 10:07: consider
  • 10:08: if we have that goal of getting a
  • 10:10: thousand and fifty dollars in a year
  • 10:14: so that's how you can run one simulation
  • 10:17: at a time
  • 10:18: now let's run as many simulations as we
  • 10:22: want
  • 10:23: so we basically take that same exact
  • 10:26: thing that we just did
  • 10:28: uh getting the random stock return and
  • 10:31: then
  • 10:32: running the model with that random stock
  • 10:34: return and we just put it into a loop
  • 10:36: a loop over the number of iterations
  • 10:39: we set up an outputs list we append our
  • 10:41: result to that outputs list
  • 10:45: and that allows us to run the model as
  • 10:48: many times
  • 10:49: as we want or as many simulations as we
  • 10:51: want on the model
  • 10:54: so we can see here's three simulations
  • 10:56: we could change it to five
  • 10:57: and have five simulations there
  • 11:00: so we'll just take that same logic and
  • 11:03: just
  • 11:04: wrap it up into a function of course you
  • 11:06: know if you're doing this in your own
  • 11:07: model you wouldn't want to have both
  • 11:09: still sitting around
  • 11:10: you would just convert this into that
  • 11:11: but they're shown separately to
  • 11:13: show how you can kind of build it out
  • 11:16: um and so this is just that same logic
  • 11:19: now in a function
  • 11:20: format so
  • 11:24: and this now runs with a thousand
  • 11:26: iterations by default
  • 11:28: and here we're just showing uh how many
  • 11:30: results there were
  • 11:31: and the first five results so now
  • 11:35: this function we are running a thousand
  • 11:38: simulations
  • 11:39: on the model and we could change that
  • 11:43: uh number of simulations to whatever we
  • 11:46: want
  • 11:50: so um you know this is where i mentioned
  • 11:54: that uh you can run that you can run the
  • 11:57: simulation and you get some results but
  • 11:58: you really have to
  • 12:00: analyze and visualize them to get any
  • 12:02: kind of meaning out of that
  • 12:04: so a good first
  • 12:08: approach for the visualization is to
  • 12:10: create a histogram
  • 12:12: of the outputs
  • 12:15: so i'm just going to make a data frame
  • 12:18: and then
  • 12:20: put those results into the data frame as
  • 12:23: the portfolio end values
  • 12:25: and then do a histogram over those
  • 12:28: values
  • 12:30: and we can now see
  • 12:35: a distribution of the outputs the
  • 12:36: outputs also themselves
  • 12:38: look normal the portfolio end values we
  • 12:41: can see that it's kind of centered
  • 12:43: around
  • 12:44: this is maybe somewhere around
  • 12:47: 1050
  • 12:50: and we can see some values are
  • 12:53: as low as almost down to 600 more
  • 12:56: commonly down to 800
  • 12:59: and we're getting as high as over 1500
  • 13:02: or over 1400 and more commonly we're
  • 13:06: getting high values in the
  • 13:07: 1200 to 1400 range and
  • 13:10: you can also use the kde as an alternative
  • 13:13: to the histogram
  • 13:15: which basically gets at the same concept
  • 13:19: but it just tries to kind of smooth out
  • 13:21: the curve
  • 13:25: so another way to look at this
  • 13:27: distribution
  • 13:28: is to say well you know five percent
  • 13:32: of the time we're going to uh get at
  • 13:36: least this value and 95%
  • 13:38: of the time we're going to get at least
  • 13:39: this value um
  • 13:42: so we can do that with a table
  • 13:45: of the probabilities and the percentiles
  • 13:49: of the distribution
  • 13:51: so to get there first we can form the
  • 13:55: percentiles that we want to look at here
  • 13:56: just doing five percent increments
  • 13:59: throughout the whole range of
  • 14:01: percentiles
  • 14:03: and we can use the uh quantile
  • 14:07: method on the data frame
  • 14:11: column in order to see that so we just
  • 14:14: pass
  • 14:15: these percentiles this list of
  • 14:18: uh the percentiles to the quantile
  • 14:21: method and with that we get uh these
  • 14:25: results
  • 14:26: saying that um in 95%
  • 14:30: of cases you're going to have less than
  • 14:34: 1231. so only in five percent of cases
  • 14:37: do you get
  • 14:38: at least 1231. um
  • 14:42: and in five percent of cases you get
  • 14:44: lower than
  • 14:45: 900 so we're already starting to
  • 14:49: get out what we want here we can see
  • 14:52: you know somewhere in the 40s chance
  • 14:56: of being lower than our objective
  • 14:59: so there's a 50-something percent chance
  • 15:01: of meeting our objective
  • 15:03: um but there's a more direct way to get
  • 15:06: at this probability of meeting the
  • 15:08: objective
  • 15:09: another note is you don't have to do
  • 15:11: this on the column you can do it on the
  • 15:13: data frame
  • 15:14: itself um and that achieves the same
  • 15:17: thing and then if you have
  • 15:19: other if you have your input values in
  • 15:21: the data frame as well then you'll see
  • 15:23: percentiles of those as well
  • 15:28: um so now we're getting to the
  • 15:30: probability of achieving a certain
  • 15:32: objective which makes a lot of sense to
  • 15:34: evaluate in this model where our
  • 15:37: objective is to have a thousand and
  • 15:39: fifty and we're trying
  • 15:40: to allocate to meet that objective and
  • 15:43: so this is kind of the main output for
  • 15:44: this model
  • 15:46: now in some other models you might not
  • 15:48: have an objective
  • 15:49: specific objective that makes sense
  • 15:53: in which case you don't need to do this
  • 15:55: analysis
  • 15:56: but it's useful in cases where you have
  • 15:58: some kind of target number that you're
  • 16:00: trying to achieve
  • 16:01: and you want to evaluate the probability
  • 16:03: of achieving that
  • 16:06: um so unfortunately there's not like a
  • 16:10: direct
  • 16:10: function in pandas to do this analysis
  • 16:13: but
  • 16:14: here's a simple one-liner which will
  • 16:16: accomplish that
  • 16:17: for you um so you can feel free to
  • 16:21: just copy paste this
  • 16:22: into your model and change out you know
  • 16:26: whatever
  • 16:26: output you're looking at in the data
  • 16:28: frame
  • 16:31: but we'll break this down now so you can
  • 16:34: understand
  • 16:35: what's going on here
  • 16:39: so the first part going on here
  • 16:42: is we're checking which of the values
  • 16:46: which of our simulations met the
  • 16:48: objective
  • 16:49: that we're trying to achieve so we can
  • 16:52: compare
  • 16:53: a column of a data frame to a number or
  • 16:56: whatever else
  • 16:57: and it will give us a column of
  • 17:00: true false
  • 17:01: um representing we did meet the
  • 17:04: objective
  • 17:06: or we didn't meet the objective and just
  • 17:08: to see
  • 17:09: what values that's based on um
  • 17:12: you can see for these first four rows
  • 17:16: indeed they're above 1050. we met the
  • 17:18: objective
  • 17:19: and then we get to this uh fifth row
  • 17:22: which was below 1050 and so we did
  • 17:25: not meet the objective
  • 17:29: so we we can't
  • 17:33: immediately do math with um boolean true
  • 17:36: false
  • 17:37: so then we convert this into ones and
  • 17:40: zeros
  • 17:41: so that we can do math with it um
  • 17:44: so now this is the same thing
  • 17:48: uh but we've just converted it into one
  • 17:50: means
  • 17:52: uh that we have met the objective and
  • 17:54: zero means we have not met the objective
  • 17:56: so now that covers this part of the
  • 17:58: expression
  • 18:00: and the last part is then just taking
  • 18:02: the average
  • 18:03: so when we take the average
  • 18:07: of one for yes we met the objective zero
  • 18:11: for no we did not meet the objective
  • 18:14: that gets us to the
  • 18:17: probability based on these simulation
  • 18:21: results
  • 18:22: of meeting the objective so that'll be
  • 18:25: explained
  • 18:26: in greater detail in the next video
  • 18:29: where we formally go over
  • 18:30: everything that we're doing um but just
  • 18:33: know for now that taking the average of
  • 18:35: this
  • 18:35: um one for we met the objective zero we
  • 18:39: didn't meet the objective
  • 18:40: will get you to an estimate of the
  • 18:42: probability of meeting the objective
  • 18:48: so now we have a few different results
  • 18:52: and analysis and visualization of the
  • 18:56: simulation
  • 18:58: but we mentioned in our approach to this
  • 19:00: that now we're going to have to evaluate
  • 19:02: this for the different
  • 19:03: weights that we can put into the stock
  • 19:06: so let's go ahead and wrap this up into
  • 19:09: functions
  • 19:10: so that we can easily reuse that logic
  • 19:13: across a number of different uh weights
  • 19:15: in the stock
  • 19:18: so here i've created functions for all
  • 19:21: these things
  • 19:22: that we just looked at um so
  • 19:25: first um here you'll see i made a
  • 19:28: function that
  • 19:29: takes the results from the simulation
  • 19:31: and creates a data frame
  • 19:33: uh from them i created a function
  • 19:37: which takes the data frame and
  • 19:40: does the histogram
  • 19:43: of the results and
  • 19:47: as you go to start putting
  • 19:49: visualizations into functions
  • 19:51: you should get familiar with this plt
  • 19:54: dot
  • 19:54: show like normally when you're doing a
  • 19:57: visualization out in a cell it doesn't
  • 19:58: matter
  • 19:59: but when you're running it in a function
  • 20:02: you generally want the visualization to
  • 20:04: show up
  • 20:04: as soon as you run it so that all the
  • 20:06: output can stay in order
  • 20:08: and this plt.show is going to ensure
  • 20:11: that
  • 20:12: so you just import matplotlib.pyplot as
  • 20:15: plt that's the convention
  • 20:18: remember that matplotlib is the plotting
  • 20:20: library under the hood for pandas
  • 20:24: and then we just put this plt.show after
  • 20:27: any spot that we're doing
  • 20:28: a plot and this is a general practice
  • 20:31: that you should follow for
  • 20:33: any visualizations you have in functions
  • 20:38: and then we have another function here
  • 20:41: to produce that
  • 20:42: table of the probabilities that creates
  • 20:44: the percentile and then
  • 20:45: does quantile on the data frame with
  • 20:47: those percentiles
  • 20:49: and we have a function which gets us the
  • 20:51: probability
  • 20:52: of the objective and
  • 20:55: then another function which
  • 20:59: puts all of this together all the
  • 21:01: outputs
  • 21:03: so it takes the results creates the data
  • 21:05: frame it visualizes those results of the
  • 21:07: histogram
  • 21:08: it creates the probability table and it
  • 21:10: creates the probability of achieving the
  • 21:12: objective
  • 21:13: and then it returns the probability
  • 21:15: table and
  • 21:16: probability of achieving the objective
  • 21:18: finally
  • 21:19: one last function which does the whole
  • 21:22: summarization
  • 21:23: of the analysis it calls this function
  • 21:25: to get the
  • 21:26: probability table and probability
  • 21:28: objective
  • 21:30: and the plot is already going to be
  • 21:32: shown
  • 21:33: the histogram is going to be shown while
  • 21:35: we call this
  • 21:37: and then we can
  • 21:40: add some other output here to kind of
  • 21:42: separate things out so a header here to
  • 21:44: show the probability table and then
  • 21:46: formatting
  • 21:47: the probability table and then some
  • 21:49: space
  • 21:50: and then a sentence about the
  • 21:52: probability of
  • 21:53: meeting the objective and another space
  • 21:57: uh when i call this you can see
  • 22:00: now i'm taking the results from our
  • 22:03: prior simulation
  • 22:05: then we can see it shows the histogram
  • 22:07: it shows the probability table
  • 22:09: and it shows the probability of meeting
  • 22:12: our objective
  • 22:13: all in one function call
  • 22:18: so next we're going to get into okay now
  • 22:21: let's run this with the different stock
  • 22:22: weights and
  • 22:23: try to pick which is the most
  • 22:27: appropriate allocation based upon the
  • 22:29: simulation results
  • 22:31: um but before we get there i just want
  • 22:34: to show
  • 22:34: one other thing we can do to format the
  • 22:37: output
  • 22:39: and that's that um Jupyter notebooks
  • 22:43: work uh by using html and css
  • 22:47: and we can use that to our advantage we
  • 22:49: can actually put html
  • 22:51: in the notebook and display it and have
  • 22:54: whatever kind of formatting of the
  • 22:55: output that we want
  • 22:57: so i'm showing you a little
  • 23:01: snippet here we can use the IPython
  • 23:04: library
  • 23:05: and the display module of that library
  • 23:09: and we can import HTML and display from
  • 23:12: there and then when you do display
  • 23:13: HTML then it's going to
  • 23:17: show it's going to format as html
  • 23:20: whatever you have in there
  • 23:22: so you can um
  • 23:26: create whatever kind of output you want
  • 23:28: with html
  • 23:30: um here and this is not
  • 23:33: we're not teaching html in this class
  • 23:37: but just know this h2 thing that's a
  • 23:39: level two header
  • 23:42: and so you can use this function you can
  • 23:44: just copy paste this into your own model
  • 23:47: you can change these twos to anything
  • 23:49: from one to six
  • 23:50: for different levels of headings and if
  • 23:53: you just use this function
  • 23:54: then you just now have a function which
  • 23:58: displays a header
  • 23:59: so you can just pass whatever string
  • 24:01: that you want there and it's going to
  • 24:03: show a header
  • 24:05: that's going to be useful because i mean
  • 24:07: you see how much output we had just from
  • 24:09: one
  • 24:09: run here now we're going to have a bunch
  • 24:11: of different runs and we want to make
  • 24:13: sure we know
  • 24:14: what output corresponds to what
  • 24:18: so now we're coming to choosing
  • 24:22: uh the appropriate weight and so let's
  • 24:24: look at
  • 24:26: uh 10% increments and the stock weight
  • 24:28: going from 10% to 90%
  • 24:30: um and then we're just going to do a
  • 24:34: loop
  • 24:34: over those weights we're going to use
  • 24:37: that display header
  • 24:38: functionality we just built out to
  • 24:41: separate it out
  • 24:42: um to whatever weight that we're looking
  • 24:45: at
  • 24:46: and then use our simulation function to
  • 24:49: get the results
  • 24:50: with whatever stock weight and then
  • 24:53: display the results
  • 24:54: and because we've wrapped everything
  • 24:56: nicely up in functions
  • 24:58: this code ultimately becomes very simple
  • 25:01: and that's how you can kind of build
  • 25:03: layers on layers
  • 25:04: with python and do quite complicated
  • 25:06: things without ever having to write
  • 25:09: complicated code
  • 25:10: so now we see we have the results for
  • 25:13: all these different weights on the stock
  • 25:17: so we can kind of look through that and
  • 25:20: see
  • 25:21: i mean really the most important output
  • 25:23: here is the probability of getting the
  • 25:24: objective
  • 25:26: but you can also get a better
  • 25:27: understanding of what's going on by
  • 25:29: looking at the other
  • 25:30: values um so when we're 10% in the stock
  • 25:34: you can see there's not a very big range
  • 25:37: of
  • 25:37: the possible values uh but we also only
  • 25:41: have a 20%
  • 25:41: chance of meeting our objective whereas
  • 25:44: we come all the way to 90%
  • 25:46: in the stock and you can see we have a
  • 25:48: much larger range here
  • 25:50: uh in the distribution but we do have a
  • 25:53: higher
  • 25:54: probability of meeting that objective
  • 25:57: then we go back to eighty percent and
  • 25:59: the probability goes
  • 26:00: up and seventy percent it went down a
  • 26:04: little bit
  • 26:05: sixty percent it was back up um so
  • 26:08: it looks like you know between um
  • 26:13: and by the time we get to uh 40%
  • 26:16: it's going substantially down um so we
  • 26:19: know that the proper range in the stock
  • 26:21: is going to be somewhere between 50 and
  • 26:23: 80 percent
  • 26:24: and you can run this with additional
  • 26:28: simulations maybe run it with 10,000
  • 26:30: or 50,000 simulations instead of a
  • 26:33: thousand
  • 26:34: and that will get you better stability
  • 26:36: in the results
  • 26:40: but this is the main idea here of
  • 26:43: we wanted to evaluate different weights
  • 26:45: the probability of getting the objective
  • 26:47: with each of those weights
  • 26:49: and setting up the functions and
  • 26:53: everything in a clean way
  • 26:54: that we can repeat all this analysis for
  • 26:58: the different weights without having to
  • 27:00: repeat the code
  • 27:03: so that's an overview of how we can
  • 27:07: apply monte carlo simulation in a simple
  • 27:10: model
  • 27:11: um in the other videos in this segment
  • 27:13: we'll look at applying monte carlo
  • 27:15: simulation to an existing model
  • 27:17: and also applying it to an excel model
  • 27:20: so thanks for listening and see you next
  • 27:24: time
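
The function layering described in this transcript can be sketched roughly as follows. This is a minimal standard-library sketch rather than the lecture's own pandas code: the return distributions, the $1,000 starting value, and every function name here are assumptions chosen for illustration. Only the $1,050 objective, the 10% to 90% stock weights, the 1,000 iterations, and the use of plt.show and IPython's HTML display come from the lecture.

```python
import random
import statistics


def simulate_portfolio(stock_weight, n_iter=1000, seed=0, initial=1000.0,
                       stock_mean=0.08, stock_std=0.20,  # hypothetical stock returns
                       bond_mean=0.04, bond_std=0.05):   # hypothetical bond returns
    """Return a list of simulated ending portfolio values."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        stock_ret = rng.gauss(stock_mean, stock_std)
        bond_ret = rng.gauss(bond_mean, bond_std)
        port_ret = stock_weight * stock_ret + (1 - stock_weight) * bond_ret
        results.append(initial * (1 + port_ret))
    return results


def show_histogram(results, bins=30):
    """Plot inside a function; plt.show() keeps the output in order."""
    import matplotlib.pyplot as plt  # imported lazily so the rest runs without it
    plt.hist(results, bins=bins)
    plt.show()


def probability_table(results, percentiles=(5, 25, 50, 75, 95)):
    """Map each percentile to the simulated value at that percentile."""
    cuts = statistics.quantiles(results, n=100, method="inclusive")  # 99 cut points
    return {p: cuts[p - 1] for p in percentiles}


def probability_of_objective(results, objective=1050.0):
    """Share of simulations that reached at least the objective."""
    return sum(1 for r in results if r >= objective) / len(results)


def header_html(text, level=2):
    """Build an HTML heading; level can be 1 through 6."""
    return f"<h{level}>{text}</h{level}>"


def display_header(text, level=2):
    """Show a heading in a notebook, falling back to print elsewhere."""
    try:
        from IPython.display import HTML, display
    except ImportError:
        print(text)
    else:
        display(HTML(header_html(text, level)))


def summarize(results, objective=1050.0, plot=False):
    """Show all outputs for one simulation run and return the numbers."""
    if plot:
        show_histogram(results)
    table = probability_table(results)
    prob = probability_of_objective(results, objective)
    print("Probability table")
    for pct, value in table.items():
        print(f"  {pct:>2}%: ${value:,.2f}")
    print()
    print(f"Probability of reaching ${objective:,.0f}: {prob:.1%}")
    print()
    return table, prob


# Loop over stock weights from 10% to 90% in 10% increments
for pct in range(10, 100, 10):
    display_header(f"Results with {pct}% in the stock")
    summarize(simulate_portfolio(pct / 100))
```

Passing plot=True to summarize adds the histogram if matplotlib is installed. Raising n_iter to 10,000 or 50,000 should make the probabilities more stable from run to run.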

Monte Carlo Dividend Discount Model (DDM) Lab Exercise


Notes

  • This is an example of applying Monte Carlo simulations to a typical model just to better understand the probability distribution of the results

  • Be careful: if the growth rate equals or exceeds the discount rate, the model becomes invalid, so the simulation may need a condition to handle this case
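
For reference, the constant-growth dividend discount model behind this lab can be written as below, together with its validity condition. The symbols D_1 (next dividend), r (discount rate) and g (growth rate) are notation chosen here; the base-case inputs of $1, 9% and 4% are the ones given in the lab.

```latex
\[
P = \frac{D_1}{r - g}, \qquad \text{valid only when } r > g
\]
\[
P_{\text{base}} = \frac{1.00}{0.09 - 0.04} = 20.00
\]
```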

Transcript

  • 00:03: hey everyone this is Nick DeRobertis
  • 00:05: teaching you financial modeling
  • 00:06: today we're going to be going over the
  • 00:09: lab exercise
  • 00:10: on applying monte carlo simulation to
  • 00:13: the dividend discount model
  • 00:15: this is part of our lecture series on
  • 00:17: monte carlo simulation
  • 00:19: so we have already introduced what monte
  • 00:23: carlo simulation is and then we went
  • 00:25: and applied it in the context of a
  • 00:28: portfolio model choosing between two
  • 00:30: different assets
  • 00:31: and now we're reaching the lab exercise
  • 00:34: at the end of that material
  • 00:36: and it's focused on the dividend
  • 00:38: discount model
  • 00:40: so the situation here is that you're
  • 00:43: trying to value
  • 00:44: a mature company they have stable
  • 00:46: dividend
  • 00:47: growth and so this is a reasonable
  • 00:50: model to look at for valuation
  • 00:54: and the model is defined as you see here
  • 00:57: in the second point the price is equal
  • 00:59: to the next dividend
  • 01:01: over the cost of capital or
  • 01:05: discount rate of the stock minus
  • 01:08: the growth rate of the dividends on
  • 01:11: stock
  • 01:14: and i gave you the initial
  • 01:17: values to use for the inputs so the next
  • 01:20: dividend is going to be a dollar
  • 01:22: uh the discount rate cost of capital
  • 01:25: is gonna be nine percent and the growth
  • 01:28: rate
  • 01:29: is going to be four percent so the first
  • 01:31: step here is just to build out the core
  • 01:33: model
  • 01:34: which is able to take these inputs and
  • 01:36: produce the price
  • 01:38: from those inputs
  • 01:41: but then as the modeler you're concerned
  • 01:44: that some of these inputs could have
  • 01:45: been mis-estimated
  • 01:47: maybe the growth isn't four percent
  • 01:49: maybe the cost of capital isn't nine
  • 01:51: percent
  • 01:52: so how can we evaluate changing these
  • 01:56: and understand what are the chances of
  • 01:58: achieving different
  • 01:59: prices uh based on the possibility that
  • 02:02: these values could be different
  • 02:04: that's where the monte carlo simulation
  • 02:06: comes in
  • 02:07: so for the level one exercise
  • 02:10: uh we're going to take the growth rate
  • 02:13: and now draw that from a normal
  • 02:15: distribution with a mean of four percent
  • 02:17: standard deviation of one percent and
  • 02:21: run that through the simulations
  • 02:24: and ultimately visualize and summarize
  • 02:27: the
  • 02:27: resulting probability distribution of
  • 02:30: the price
  • 02:33: and then coming to the level 2 exercise
  • 02:35: it's going to be continuing on from the
  • 02:37: first
  • 02:38: but here you're just also concerned that
  • 02:40: the cost of capital
  • 02:42: could be misestimated so for that
  • 02:45: we'll also be drawing from normal
  • 02:47: distributions using a mean of nine
  • 02:49: percent
  • 02:50: standard deviation of two percent and
  • 02:52: the growth is also being randomly drawn
  • 02:55: at the same time
  • 02:58: and then you want to run through the
  • 03:00: simulations and
  • 03:01: visualize and summarize the resulting
  • 03:04: probability distribution of the price
  • 03:08: and now you have to be careful in this
  • 03:11: level 2
  • 03:11: exercise that there is a condition
  • 03:16: in the dividend discount model for it to
  • 03:17: be valid
  • 03:19: when we look at the dividend discount
  • 03:20: model we have this denominator
  • 03:23: the uh cost of capital minus the growth
  • 03:26: rate
  • 03:27: for the model to be valid the cost of
  • 03:30: capital has to be greater than
  • 03:32: the growth rate otherwise this
  • 03:35: denominator becomes negative the price
  • 03:37: becomes negative which is nonsensical
  • 03:40: so that's actually an assumption of the
  • 03:43: dividend discount model
  • 03:44: that the cost of capital should be
  • 03:46: greater than the growth rate
  • 03:49: and as you're doing the level one it's
  • 03:51: it's pretty unlikely
  • 03:52: that that situation would occur
  • 03:55: that the growth rate would be greater
  • 03:57: than the cost of capital uh because
  • 04:00: we're drawing with a mean of four
  • 04:01: percent standard deviation of one
  • 04:03: percent it's pretty unlikely that it's
  • 04:04: going to hit nine percent
  • 04:06: but then once we start also varying the
  • 04:08: cost of capital
  • 04:10: now if we happen to get a low cost of
  • 04:12: capital at the same time we're getting a
  • 04:13: high growth rate
  • 04:15: then that would lead to this condition
  • 04:19: or violation a violation of the
  • 04:21: assumptions of the model
  • 04:23: uh that the cost of capital should be
  • 04:25: greater than the growth rate
  • 04:28: then you'll get negative crazy prices in
  • 04:30: your model
  • 04:31: so what you need to do in addition to
  • 04:35: building the base model and building the
  • 04:37: simulation is
  • 04:38: you need to be able to check the
  • 04:40: simulation inputs
  • 04:42: before passing it through the model and
  • 04:45: you want to check
  • 04:46: that the cost of capital is indeed
  • 04:48: greater than the growth rate
  • 04:50: and if not just reject that simulation
  • 04:53: you want to draw new inputs because it's
  • 04:55: not a valid
  • 04:56: run of the model
  • 04:59: so that's the overview of the lab
  • 05:02: exercise
  • 05:03: on adding monte carlo simulation to a
  • 05:05: dividend discount model
  • 05:07: so thanks for listening and see you next
  • 05:09: time
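
One possible shape for the level 2 exercise is sketched below, using only the standard library. The function names and the choice to redraw rejected inputs in a while loop are assumptions for illustration, not the lab's official solution. The means and standard deviations are the ones stated in the exercise.

```python
import random


def ddm_price(dividend, discount_rate, growth_rate):
    """Constant-growth DDM price; only valid when discount_rate > growth_rate."""
    if discount_rate <= growth_rate:
        raise ValueError("discount rate must exceed growth rate")
    return dividend / (discount_rate - growth_rate)


def simulate_ddm(n_iter=10000, seed=0, dividend=1.0,
                 r_mean=0.09, r_std=0.02, g_mean=0.04, g_std=0.01):
    """Return (discount_rate, growth_rate, price) for n_iter valid draws."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < n_iter:
        r = rng.gauss(r_mean, r_std)
        g = rng.gauss(g_mean, g_std)
        if r <= g:
            continue  # invalid run of the model: reject it and draw new inputs
        rows.append((r, g, ddm_price(dividend, r, g)))
    return rows


rows = simulate_ddm()
prices = sorted(price for _, _, price in rows)
print(f"Base case price: {ddm_price(1.0, 0.09, 0.04):.2f}")
print(f"Median simulated price: {prices[len(prices) // 2]:.2f}")
```

Setting r_std=0 holds the discount rate at 9%, which gives the level 1 exercise. Draws where the discount rate is only slightly above the growth rate pass the check but produce very large prices, so the simulated distribution can have a long right tail.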

Formal Introduction to Monte Carlo Simulations


Notes

  • The process described here to run Monte Carlo simulations may sound very similar to the process for running sensitivity analysis, and that’s because it is. The only difference is that you randomly pick the input values from distributions with each run of the model rather than having fixed input ranges

  • Running the Monte Carlo simulation is not enough. You will have a bunch of outputs, but you must analyze them and visualize them to extract meaning

  • The main insights we can draw from analyzing a Monte Carlo simulation relate to the probabilities of certain outcomes in the model. We can also get a deeper picture of the relationships between inputs and outputs in a more complex model where that may not be clear

  • The probability table is the quantitative version of plotting the data on a histogram. I would generally recommend including both as the histogram allows quick understanding of the shape of the entire distribution whereas the probability table helps in quantifying the distribution

  • The Value at Risk (VaR) is the loss that should not be exceeded with a given degree of confidence, e.g. in 95% of periods the portfolio should not lose more than $1,000. The probability table can be interpreted in the same way if the outcome you are analyzing is the gain/loss

  • The probability of a certain outcome makes sense when you have some kind of goal in mind, then you can evaluate the probability of achieving that goal. If there is no specific goal in mind, there is no need to carry out this analysis
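
To make the link between the probability table and VaR concrete, here is a small sketch that reads VaR off the percentiles of simulated gains and losses. The profit and loss distribution and the function name are hypothetical, and the function only handles whole-percent confidence levels such as 90%, 95% and 99%.

```python
import random
import statistics


def value_at_risk(pnl, confidence=0.95):
    """Loss that is not exceeded in `confidence` of the simulated periods."""
    cuts = statistics.quantiles(pnl, n=100, method="inclusive")  # percentiles 1..99
    tail_pct = round((1 - confidence) * 100)  # e.g. 5 for 95% confidence
    # The 5th percentile of gain/loss is the cutoff for the worst 5% of periods;
    # flipping the sign expresses it as a positive loss amount.
    return -cuts[tail_pct - 1]


rng = random.Random(0)
# Hypothetical simulated gains/losses, standing in for model output
pnl = [rng.gauss(50, 600) for _ in range(10000)]

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} VaR: ${value_at_risk(pnl, level):,.0f}")
```

The value returned is the same number a probability table would show in its 5th percentile row (for 95% confidence), just with the sign flipped.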

Transcript

  • 00:03: hey everyone
  • 00:04: this is Nick DeRobertis teaching you
  • 00:05: financial modeling
  • 00:07: today we're going to be doing a formal
  • 00:09: introduction
  • 00:10: to monte carlo simulation and the
  • 00:14: analysis
  • 00:15: of the outputs this is part of our
  • 00:17: lecture series
  • 00:18: on monte carlo simulation
  • 00:21: so we already ran through a general
  • 00:24: introduction
  • 00:25: of what monte carlo simulation is why we
  • 00:28: might want to go about it
  • 00:30: and then we kind of flip the structure
  • 00:34: on its head to first go through an
  • 00:36: example
  • 00:37: of how to do monte carlo simulation
  • 00:43: and i flipped it because i think it's
  • 00:45: easier to see by example
  • 00:47: um before getting formally introduced to
  • 00:51: it because it can sound a little bit
  • 00:53: complicated when you look at it formally
  • 00:56: but really it's a fairly simple process
  • 00:58: and seeing the example makes that clear
  • 01:00: so if you haven't viewed the video on
  • 01:02: the example go back and look at that
  • 01:04: first
  • 01:07: so we're now looking
  • 01:10: at theoretically what is monte carlo
  • 01:14: simulation and what is the process
  • 01:17: and as you look at this if you've seen
  • 01:20: the prior videos
  • 01:21: on sensitivity analysis you'll
  • 01:24: notice that the process here and the
  • 01:27: setup is almost exactly the same
  • 01:31: we have some model here we're
  • 01:34: representing the model
  • 01:35: mathematically as we get some output
  • 01:39: model is some function which converts
  • 01:40: the inputs to the output
  • 01:43: and we have multiple different inputs
  • 01:48: and in order to run these simulations
  • 01:52: we first assign a probability
  • 01:54: distribution
  • 01:55: to each input and then
  • 01:59: for each input we're going to randomly
  • 02:03: pick values from their distributions
  • 02:07: and we're going to repeat that previous
  • 02:10: step
  • 02:11: for n times n is our number of
  • 02:14: iterations number of simulations
  • 02:17: um and so then we're going to have all
  • 02:20: the random inputs and
  • 02:22: then you want to calculate the model run
  • 02:25: the model
  • 02:27: with those simulated input values
  • 02:30: so for each simulation
  • 02:34: you've got each random input you pass it
  • 02:36: into the model
  • 02:38: and you get the result then we want to
  • 02:42: keep the inputs associated to the
  • 02:45: outputs so you know which inputs
  • 02:46: produced which outputs and the final
  • 02:49: step
  • 02:50: is to visualize and analyze the results
  • 02:54: so basically
  • 02:58: all these last steps are
  • 03:02: all these last three steps are the same
  • 03:06: as sensitivity analysis running the
  • 03:07: model with each of the inputs
  • 03:09: keeping the model uh the inputs
  • 03:10: associated with the outputs
  • 03:12: analyzing the resulting outputs um
  • 03:17: the only thing that's different here is
  • 03:18: this random piece so
  • 03:20: assigning a probability distribution to
  • 03:22: each input and we're going to randomly
  • 03:24: pick the value of the input from the
  • 03:26: distribution
  • 03:27: each time that we want to run the model
  • 03:34: so let's say now we've run the
  • 03:37: simulation
  • 03:38: we've got our 10 000 different results
  • 03:41: from the model
  • 03:43: now what can we do with those results so
  • 03:46: there are a few outputs
  • 03:47: from the analysis and visualization
  • 03:50: that we can gather
  • 03:54: so the first category here is
  • 03:57: probability distributions of the output
  • 04:01: and we looked at two different ways in
  • 04:03: the prior example
  • 04:04: of how we can get at this
  • 04:09: one is with a histogram over all the
  • 04:11: results
  • 04:12: and then the other is with a table of
  • 04:15: the
  • 04:15: probabilities the percentiles of the
  • 04:18: probability distribution
  • 04:19: um and the value
  • 04:23: of the variable at that percentile in
  • 04:26: the
  • 04:26: distribution um
  • 04:29: so in the investment returns example
  • 04:31: that was you know saying that
  • 04:33: 25% of uh
  • 04:36: cases we got less than a thousand and
  • 04:39: twenty dollars and having that
  • 04:41: for the range of different percentiles
  • 04:46: then we also have the probability of a
  • 04:49: certain outcome
  • 04:50: so this i mean it is kind of within this
  • 04:53: idea of looking at the probability
  • 04:55: distribution of the outputs
  • 04:57: but it's just looking at a one
  • 04:58: particular point on the probability
  • 05:00: distribution
  • 05:02: and that particular point is some
  • 05:05: goal or objective that we care about and
  • 05:08: so we want to
  • 05:09: evaluate the probability of achieving
  • 05:12: that objective or outcome
  • 05:17: so in our investment returns model that
  • 05:19: was what's the chance that we're gonna
  • 05:21: get the thousand fifty dollars that we
  • 05:22: need to satisfy our
  • 05:24: obligation and then the last
  • 05:27: uh main output which we haven't looked
  • 05:30: at calculating yet and we're going to do
  • 05:33: in the next video where we add monte
  • 05:36: carlo simulation to the dynamic salary
  • 05:38: retirement model
  • 05:40: is we can look at the relationship
  • 05:42: between the inputs and the outputs
  • 05:46: so monte carlo simulation
  • 05:50: we can use similarly to
  • 05:54: sensitivity analysis where we're trying
  • 05:55: to see what changing an input does to
  • 05:58: the model
  • 05:59: we can get at the same kinds of
  • 06:00: questions with monte carlo simulation as
  • 06:03: well trying to understand
  • 06:04: how the inputs affect the outputs
  • 06:07: and in order to
  • 06:11: analyze this the two main methods that
  • 06:14: we'll look at are
  • 06:15: visualizing it via a scatter plot
  • 06:19: and using a multivariate regression to
  • 06:22: get at it quantitatively
  • 06:27: so digging into the outcome probability
  • 06:30: distributions
  • 06:32: so uh you can see on the left here
  • 06:35: uh the kind of outputs that we had from
  • 06:38: our
  • 06:39: investment returns example we have a
  • 06:41: histogram
  • 06:42: distribution of the
  • 06:45: whole of all the outputs all the
  • 06:48: different
  • 06:48: portfolio end values and then we have
  • 06:51: this probability table
  • 06:53: as well as a quantitative way of showing
  • 06:55: that information
  • 06:58: these are both getting at just the
  • 07:00: chance of having different outcomes
  • 07:02: in your model so
  • 07:06: uh you can visualize this with a plot and
  • 07:08: a table and with a plot
  • 07:10: usually use a histogram you can also use
  • 07:12: a kde
  • 07:14: kernel density estimation plot as the
  • 07:16: other
  • 07:17: potential way to visualize this uh
  • 07:20: which just gives a smoother looking
  • 07:22: output
  • 07:24: um and this helps you understand
  • 07:28: at a high level what the distribution
  • 07:30: looks like is it basically a normal
  • 07:32: distribution
  • 07:33: as is the case here what's kind of the
  • 07:36: the range of the distribution
  • 07:38: um and just understanding at a high
  • 07:40: level are there heavy tails
  • 07:42: is it you know non-normal et cetera
  • 07:46: and then the probability table helps you
  • 07:48: quantify some of this and think about
  • 07:50: the chance
  • 07:50: of hitting certain values in the model
  • 07:54: so what this probability table says is
  • 07:57: that
  • 07:58: in 25 percent of cases we're gonna have
  • 08:00: less than
  • 08:01: a thousand and twenty dollars and fifty
  • 08:03: percent of cases we're going to have
  • 08:04: less than
  • 08:05: 1039 dollars and in 75 percent of cases
  • 08:09: we're going to have less than
  • 08:10: 1053 dollars
  • 08:17: and then i just wanted to note here that
  • 08:22: the value at risk is a common
  • 08:26: measure that's used in the industry and
  • 08:29: the value at risk
  • 08:30: is typically looking at a portfolio
  • 08:34: but it evaluates at a certain
  • 08:38: confidence level or certain probability
  • 08:41: uh the maximum amount
  • 08:45: that you're going to lose so
  • 08:48: it's saying like in 95 percent of
  • 08:52: days
  • 08:52: we're not going to lose more than a
  • 08:54: thousand dollars
  • 08:56: or yeah at 95 percent of days
  • 09:00: we're going to lose less than a thousand
  • 09:02: dollars
  • 09:04: and those other five percent of days it
  • 09:06: doesn't say anything about that it could
  • 09:07: be
  • 09:08: 10,000 could be a million loss just that
  • 09:12: 95% of the time the loss is going to be
  • 09:14: less than
  • 09:15: uh a thousand dollars so this
  • 09:18: probability table
  • 09:20: um it actually gets at the same
  • 09:24: concept so the probability table is
  • 09:27: actually a more
  • 09:28: general version of the value at risk
  • 09:31: measure which
  • 09:32: probability table tells you all these
  • 09:34: different
  • 09:35: values and is general to whatever kind
  • 09:37: of output you want to look at
  • 09:38: whereas value at risk is specifically
  • 09:41: talking about some kind of loss
  • 09:43: in your model and usually
  • 09:48: most commonly you look at the 90th 95th or 99th
  • 09:51: um percentiles so if you need to
  • 09:55: calculate the VaR
  • 09:56: from a monte carlo simulation look no
  • 10:00: further
  • 10:00: than the probability table there are
  • 10:03: other ways to calculate VaR
  • 10:05: in other situations but it's getting at
  • 10:09: the same concept as this probability
  • 10:10: table
  • 10:14: so then coming to the probability of a
  • 10:16: certain
  • 10:17: outcome um
  • 10:20: so this was where we were saying uh
  • 10:23: what's the chance of getting a thousand
  • 10:25: fifty dollars
  • 10:26: in our portfolio
  • 10:29: so in order to understand how this works
  • 10:33: let's look at a very simple example
  • 10:37: so you have some box here and this box
  • 10:41: has red and blue balls on the inside
  • 10:44: and you can't see what's inside the box
  • 10:48: you don't know
  • 10:49: how many red balls and how many blue
  • 10:51: balls there are in the box
  • 10:54: and what you want to get is an estimate
  • 10:57: of what is the chance
  • 10:59: when i reach in to pull out a ball that
  • 11:01: i'm going to get a blue ball
  • 11:02: when i do that
  • 11:06: so what's the process you might go
  • 11:09: through
  • 11:09: to figure this out
  • 11:12: well just reach in grab a ball so i get
  • 11:15: red or blue let me write it down
  • 11:18: put it back in take the box up mix it up
  • 11:21: whatever so it's random
  • 11:23: and pull another one out write down its
  • 11:25: color and just keep doing the same thing
  • 11:27: a thousand times
  • 11:31: and so you get a blue ball 350
  • 11:34: out of 1000 times so then what is your
  • 11:38: probability
  • 11:39: of getting a blue ball and so
  • 11:43: a decent amount of people will probably
  • 11:45: just kind of intuitively
  • 11:48: understand this and say well that's
  • 11:50: that's a 35%
  • 11:51: chance of getting a blue ball
  • 11:55: how do we get there so
  • 12:00: we can estimate the probability of
  • 12:03: some outcome by basically trying a bunch
  • 12:07: of times and
  • 12:10: seeing how many of those times we
  • 12:12: achieve the objective that we want
  • 12:15: and you just divide the number of times
  • 12:17: you hit the objective
  • 12:18: by the total number of times and that's
  • 12:20: going to be an estimate of the
  • 12:22: probability of achieving that objective
  • 12:25: so in the ball situation 350 blue
  • 12:29: balls
  • 12:30: um when whenever we draw a blue ball
  • 12:33: that's meeting our objective
  • 12:35: of getting a blue ball whenever we draw
  • 12:37: a red ball that's
  • 12:38: not meeting that objective and so it
  • 12:40: doesn't enter into the numerator
  • 12:42: so uh we get 350
  • 12:46: times where the trial was successful we
  • 12:48: pulled the blue ball
  • 12:49: and so that's the numerator and we had a
  • 12:51: thousand times that we pulled balls
  • 12:53: in total and so that's the denominator
  • 12:55: and so we get that 35 percent
  • 12:58: estimate of the probability of getting a
  • 13:01: blue ball
  • 13:04: and when we apply this in the investment
  • 13:06: example
  • 13:07: it was the same exact logic
  • 13:11: it may have looked a little bit
  • 13:12: complicated what we did in pandas
  • 13:14: but all that we did was convert it into
  • 13:17: this kind of format
  • 13:18: where we assigned a one or a zero
  • 13:23: based on whether we met the objective if
  • 13:25: we got at least a thousand and fifty
  • 13:27: dollars it became a one
  • 13:29: if we got less than that then it became
  • 13:30: a zero so that's the same
  • 13:32: as here when we get a blue ball we make
  • 13:36: it
  • 13:36: one and when we get a red ball we make
  • 13:39: it zero
  • 13:41: and then sum it all up and that's going
  • 13:44: to be
  • 13:46: just the count of how many times that
  • 13:48: you got the blue ball
  • 13:49: or just the count of how many times that
  • 13:51: we got a thousand and fifty dollars
  • 13:55: so then that's the numerator and the
  • 13:58: denominator is just the total
  • 14:01: number of trials
  • 14:04: number of simulations
  • 14:07: and that you know taking
  • 14:11: the sum of all that divided by the count
  • 14:15: that is an average right so really
  • 14:18: it's just getting a one for the positive
  • 14:20: outcome a zero for the negative outcome
  • 14:22: and then taking the average
  • 14:24: of that and that will give you an
  • 14:27: estimate of the probability of achieving
  • 14:29: the objective
  • 14:30: that you desire
  • 14:33: so that's a formal introduction
  • 14:37: to monte carlo simulation
  • 14:40: we're going to come back next time to
  • 14:41: discuss how we can analyze
  • 14:44: the relationship between the inputs and
  • 14:47: the outputs
  • 14:49: so thanks for listening and see you next
  • 14:51: time
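
The formal process and the ball example from this transcript can be sketched in a few lines of standard-library Python. The function names, the hidden box contents and the toy model below are assumptions made for illustration. The technique itself (draw inputs, run the model, keep inputs with outputs, then average a 1/0 indicator) is the one described above.

```python
import random


def run_monte_carlo(model, samplers, n_iter=1000, seed=0):
    """Draw each input from its distribution, run the model, keep inputs with outputs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_iter):
        inputs = {name: draw(rng) for name, draw in samplers.items()}
        rows.append({**inputs, "result": model(**inputs)})
    return rows


def estimate_probability(outcomes, is_success):
    """Assign 1 to each success and 0 otherwise, then take the average."""
    indicators = [1 if is_success(outcome) else 0 for outcome in outcomes]
    return sum(indicators) / len(indicators)


# Ball example: the contents are hidden from the person drawing (hypothetical mix)
box = ["blue"] * 35 + ["red"] * 65
rng = random.Random(42)
draws = [rng.choice(box) for _ in range(1000)]  # draw, record, put back, repeat
print("Estimated P(blue):", estimate_probability(draws, lambda c: c == "blue"))

# Same logic applied to a toy model with two random inputs
rows = run_monte_carlo(
    model=lambda x, y: x + y,
    samplers={"x": lambda r: r.gauss(0, 1), "y": lambda r: r.gauss(0, 1)},
)
results = [row["result"] for row in rows]
print("Estimated P(result >= 1):", estimate_probability(results, lambda v: v >= 1.0))
```

Because each row keeps the inputs next to the result, the same list can later be used to study how the inputs relate to the output.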

Analyzing Relationships with Monte Carlo Simulations


Notes

  • The results from the Monte Carlo simulation can be run through multivariate regression or another empirical method to better understand the relationship between inputs and outputs

  • Sensitivity analysis gets at the same goal, but sensitivity analysis is a bit more narrow because at most one other input is changing at the same time. With Monte Carlo simulation, all inputs are changing with each run and so if inputs have complex interactions in the model they will be better understood through MC simulation

  • The multivariate regression results give the quantitative interpretation of the relationship while scatter plots can help visualize the relationship
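
As a rough illustration of the regression step, the sketch below randomizes two inputs of a stand-in model and fits a multivariate regression to the simulated results. Everything specific here is an assumption: the toy model, the uniform input distributions, and the hand-written least-squares solver, which is used only to keep the example dependency-free. In practice a statistics library would normally do the fitting.

```python
import random


def ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def toy_model(interest_rate, savings_rate, salary=60000, years=30):
    """Hypothetical stand-in: wealth after saving a fixed share of salary each year."""
    annuity_factor = ((1 + interest_rate) ** years - 1) / interest_rate
    return salary * savings_rate * annuity_factor


rng = random.Random(0)
inputs, outputs = [], []
for _ in range(2000):
    interest = rng.uniform(0.03, 0.07)  # hypothetical input distribution
    savings = rng.uniform(0.10, 0.30)   # hypothetical input distribution
    inputs.append((interest, savings))
    outputs.append(toy_model(interest, savings))

intercept, b_interest, b_savings = ols(inputs, outputs)
print(f"Wealth change per +1 point of interest rate: {b_interest / 100:,.0f}")
print(f"Wealth change per +1 point of savings rate:  {b_savings / 100:,.0f}")
```

Each coefficient estimates the average effect of one input while the other input is also varying. Plotting each input against the output in a scatter plot would give the visual counterpart.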

Transcript

  • 00:02: hey everyone this is Nick DeRobertis
  • 00:04: teaching you financial modeling
  • 00:06: today we're going to be looking at how
  • 00:07: we can analyze the relationship
  • 00:09: between inputs and outputs in our models
  • 00:12: by using
  • 00:13: monte carlo simulation this is part of
  • 00:16: our lecture segment on
  • 00:17: monte carlo simulation so
  • 00:21: we've already covered an intro to monte
  • 00:24: carlo we've run
  • 00:25: monte carlo on a model and then we went
  • 00:28: through the
  • 00:29: formal introduction of what we're doing
  • 00:32: in monte carlo simulation so now we're
  • 00:36: going to
  • 00:37: dig into how we can
  • 00:41: establish relationships and quantify the
  • 00:43: relationships
  • 00:45: between our inputs and outputs using
  • 00:47: monte carlo simulation
  • 00:49: so we
  • 00:53: the monte carlo simulation is not the
  • 00:54: only tool that we can use
  • 00:56: to achieve this objective we've already
  • 00:59: looked at
  • 01:00: sensitivity analysis which can get at
  • 01:03: the same basic idea what is the
  • 01:05: relationship
  • 01:06: between the inputs and the outputs with
  • 01:08: sensitivity analysis we're changing one
  • 01:10: or two inputs at once
  • 01:12: and seeing what happens to the
  • 01:16: output of our model with those changing
  • 01:18: input values
  • 01:21: with monte carlo simulation then
  • 01:24: we're uh running the model a bunch of
  • 01:27: different times with all these
  • 01:28: randomized inputs
  • 01:30: and that allows us to get a more
  • 01:34: full picture of how the inputs relate to
  • 01:37: the outputs
  • 01:38: because when you change just one or two
  • 01:40: inputs at a time
  • 01:42: you may be leaving out that
  • 01:45: when other inputs are at different
  • 01:47: values
  • 01:49: these inputs that you're changing have
  • 01:51: different effects
  • 01:53: so thinking about the retirement model
  • 01:58: when we're changing the interest rate
  • 01:59: and seeing how that changes our years to
  • 02:02: retirement
  • 02:03: well the interest rate is going to have
  • 02:06: a bigger impact
  • 02:07: on the model if the initial salary
  • 02:11: was higher to begin with so you might
  • 02:14: you know thinking about this in advance
  • 02:16: you might do a sensitivity analysis
  • 02:18: of interest rate versus the initial
  • 02:20: salary so you can see how these two
  • 02:22: interplay together
  • 02:25: but you may not have thought about this
  • 02:26: relationship at the get-go
  • 02:28: or there may be other relationships
  • 02:30: which
  • 02:31: matter as well i mean with a higher
  • 02:33: savings rate
  • 02:34: also the interest rate is going to be
  • 02:36: more impactful
  • 02:38: so uh doing it the monte carlo
  • 02:42: simulation route to analyze the
  • 02:43: relationship
  • 02:45: you're kind of bringing all these
  • 02:46: different relationships together
  • 02:48: and not having to explicitly think about
  • 02:51: them as the modeler
  • 02:52: you just kind of throw everything in
  • 02:54: with random inputs
  • 02:56: and the simulation is going to reveal
  • 02:59: those relationships
  • 03:04: so by changing all the inputs each time
  • 03:06: that you run the model
  • 03:08: then you're going to get cases uh
  • 03:11: with each different values of the inputs
  • 03:13: so
  • 03:15: some cases you're going to have a high
  • 03:16: interest rate and within that in some
  • 03:18: cases you're going to have a high salary
  • 03:20: some cases you're going to have a low
  • 03:21: salary
  • 03:22: some uh simulations you're going to have
  • 03:25: a low interest rate and within that
  • 03:27: you're going to have some that have a
  • 03:28: high salary and some that have a low
  • 03:30: salary
  • 03:30: as well as different values of the other
  • 03:32: inputs as well
  • 03:34: so as long as you do enough simulations
  • 03:36: a high enough number of iterations
  • 03:39: then you're going to capture cases of
  • 03:41: all these different values of the inputs
  • 03:44: interplaying together and that gives you
  • 03:46: a much more
  • 03:47: fuller picture of the relationship
  • 03:50: between the inputs and outputs
  • 03:54: now the issue with this with sensitivity
  • 03:57: analysis it was fairly straightforward
  • 03:59: to understand uh how we can just look at
  • 04:02: those results we just see the result
  • 04:04: from the model
  • 04:05: we can visualize it using conditional
  • 04:08: formatting
  • 04:09: or a hexbin plot and it's fairly
  • 04:12: straightforward
  • 04:13: it's a little more complicated to take
  • 04:17: okay well now we've got 10 000 results
  • 04:20: from this model
  • 04:21: how do we get an understanding of the
  • 04:24: relationship between the inputs and
  • 04:25: outputs
  • 04:27: so i'll discuss two different approaches
  • 04:30: that we can use here
  • 04:31: to get an understanding the first is
  • 04:35: by visualizing uh via a scatter plot
  • 04:40: so the scatter plot
  • 04:43: shows the relationship between two
  • 04:45: variables one that gets plotted on the
  • 04:47: x-axis and one that gets plotted on the
  • 04:49: y-axis
  • 04:50: and each point is the values of the x
  • 04:54: and the y's together
  • 04:56: so this could be looking at that
  • 05:00: investment rate on the x and looking at
  • 05:02: the years to retirement
  • 05:04: on the y dimension and you would say
  • 05:05: well one time i ran it
  • 05:07: and i had a 2.2% interest rate and i got
  • 05:12: 22 years to retirement uh
  • 05:15: et cetera and
  • 05:19: um the disadvantage of
  • 05:22: scatter plots is that it only does look
  • 05:24: at one variable at a time so you do have
  • 05:26: to have
  • 05:26: like one scatter plot for each variable
  • 05:30: but then each graph is is very focused
  • 05:33: on that variable
  • 05:36: and so what you're looking for when you
  • 05:38: look at these scatter plots
  • 05:40: is you want to see some kind of pattern
  • 05:44: um if the points are just kind of in a
  • 05:47: cloud
  • 05:48: as we see in the bottom picture here
  • 05:51: there's no kind of linear
  • 05:53: or shape in here it's just kind of an
  • 05:57: ambiguous cloud that
  • 06:00: is supportive of there not being a
  • 06:02: strong relationship
  • 06:03: between the variables whereas
  • 06:07: when the points seem to kind of fit
  • 06:09: along a line
  • 06:10: or like a u shape or some other kind of
  • 06:13: very defined shape
  • 06:17: then that's evidence in support
  • 06:20: of there being a relationship between
  • 06:23: the two variables
  • 06:27: so
  • 06:31: as far as quantifying this then we're
  • 06:34: going to
  • 06:35: look at regressions multivariate
  • 06:38: regressions
  • 06:39: to accomplish that the scatter plot just
  • 06:41: gives you a quick picture
  • 06:43: visualization that you can quickly see
  • 06:46: the relationship and
  • 06:48: how the relationship changes throughout
  • 06:49: the range of the input
  • 06:53: but the multivariate regression is going
  • 06:56: to be able to give you
  • 06:57: quantitatively what is the impact of the
  • 07:00: input on the output
  • 07:02: so that allows you to answer questions
  • 07:04: like if i earned
  • 07:06: ten thousand dollars more for my
  • 07:07: starting salary how much sooner
  • 07:10: would i be able to retire uh
  • 07:13: so of course in order to answer that
  • 07:15: question
  • 07:16: an easy attempt at it is to just go to
  • 07:19: your model inputs
  • 07:20: and increase the salary by 10 000 and
  • 07:23: see what happens
  • 07:24: to the years to retirement that's kind
  • 07:26: of
  • 07:27: that's you know basically the
  • 07:28: sensitivity analysis approach
  • 07:31: um but it is a simplistic way of looking
  • 07:34: at it
  • 07:35: it doesn't take into account how all the
  • 07:38: other inputs
  • 07:39: in the model could change you're still
  • 07:41: just assuming that those other inputs
  • 07:43: are at their baseline values
  • 07:46: so by doing the monte carlo simulation
  • 07:48: we take into account
  • 07:49: all these cases of all the other inputs
  • 07:52: being at different values
  • 07:55: so multivariate regression basically we
  • 07:59: put
  • 07:59: whatever output we're trying to analyze
  • 08:02: as our y variable
  • 08:04: and then each of our inputs as the x
  • 08:05: variables
  • 08:07: and this will be able to tell us
  • 08:09: quantitatively
  • 08:10: what is the relationship what's the
  • 08:13: strength
  • 08:13: magnitude of the relationship and
  • 08:15: direction
  • 08:17: uh between the input and the output
  • 08:22: so the process or how you interpret the
  • 08:27: uh results of that as we run a
  • 08:29: multivariate
  • 08:30: regression we get some fit statistics
  • 08:33: and then the part that really matters
  • 08:36: for this
  • 08:37: is the coefficients and the
  • 08:40: p-values and if the p-value is high
  • 08:45: that's evidence of there not being a
  • 08:47: relationship
  • 08:49: if it's low then there is at least some
  • 08:52: relationship it doesn't necessarily mean
  • 08:54: that the relationship is strong or
  • 08:56: even meaningful in your model but it
  • 08:58: does mean that there is evidence that
  • 09:00: there is a relationship
  • 09:02: and then you look at the coefficient to
  • 09:05: assess the strength or magnitude of that
  • 09:08: relationship so the coefficient
  • 09:12: in multivariate regression is
  • 09:16: how much does the outcome variable
  • 09:18: change when there
  • 09:19: is a one unit increase in
  • 09:22: the input variable or x variable
  • 09:26: so to give an example of that say we're
  • 09:29: working still with this
  • 09:31: retirement model and you get a
  • 09:33: coefficient of negative .0002
  • 09:37: on starting salary when years to
  • 09:40: retirement is your y variable
  • 09:43: so what that means is a one unit
  • 09:44: increase in our x
  • 09:46: is associated with a negative point zero
  • 09:49: zero zero
  • 09:49: two unit increase or decrease in the y
  • 09:54: um so our salary is in dollars and so
  • 09:58: that means that's a one dollar increase
  • 10:00: in salary
  • 10:01: and then our years to retirement is in
  • 10:04: years
  • 10:05: and so that's a one dollar increase in
  • 10:07: salary associated with a
  • 10:08: decrease in years to retirement of 0.0002
  • 10:12: years
  • 10:13: of course that's not
  • 10:16: a nice way to interpret it right like
  • 10:19: who cares about a one dollar increase in
  • 10:20: salary that's not
  • 10:22: gonna be a meaningful thing uh but the
  • 10:24: nice thing about these
  • 10:26: uh relationships is you can just
  • 10:27: multiply them up
  • 10:29: in order to get it in terms of something
  • 10:31: which is meaningful
  • 10:33: so we can multiply both sides here by
  • 10:35: ten thousand
  • 10:36: to now change it into a ten thousand
  • 10:39: dollar increase in salary
  • 10:41: is associated with a decrease in years to
  • 10:44: retirement by
  • 10:45: two years so whenever you interpret the
  • 10:48: coefficients
  • 10:50: you want to put them in terms of
  • 10:52: something which is meaningful
  • 10:54: for your model no one cares about a one
  • 10:56: dollar change in salary how about a ten
  • 10:58: thousand dollar change
  • 11:00: and let's interpret it in that context
  • 11:05: an important thing as you go to
  • 11:07: interpret the results from the
  • 11:08: regression
  • 11:10: um and this can definitely be a point of
  • 11:13: confusion
  • 11:14: is that these coefficients are
  • 11:17: all else constant so that means
  • 11:22: that this you know ten thousand dollar
  • 11:25: increase in salary decreases years to
  • 11:26: retirement by two years
  • 11:28: that is not taking into account um
  • 11:33: that you know when you're able to earn
  • 11:35: more money you're probably able to save
  • 11:37: more money as well and so the savings
  • 11:39: rate is going to be higher for an
  • 11:40: individual who makes
  • 11:42: more money that is not being captured
  • 11:45: in the coefficient now
  • 11:49: a big reason that i've been saying we
  • 11:51: should use this approach rather than
  • 11:53: just
  • 11:53: sensitivity analysis is because it takes
  • 11:55: into account that all the other
  • 11:56: variables are changing
  • 11:58: so that's where students can get
  • 12:00: confused by this
  • 12:02: because now here we're saying oh but
  • 12:05: we're basically treating all the other
  • 12:06: inputs as constant
  • 12:08: um but it's
  • 12:12: even though we're kind of isolating the
  • 12:14: effect to this one
  • 12:16: uh input with this coefficient
  • 12:19: the regression model is still
  • 12:21: considering all these different cases of
  • 12:23: the input values so you can kind of
  • 12:25: think of it as an
  • 12:26: average across you know thinking about
  • 12:29: um earning salary uh you can think of it
  • 12:33: as an average across
  • 12:34: like sometimes the investment rate was
  • 12:35: high sometimes the investment rate was
  • 12:37: low
  • 12:38: sometimes the saving rate was high
  • 12:39: sometimes the saving rate was low
  • 12:42: taking the average across all these
  • 12:44: different cases
  • 12:45: what was the overall effect of just the
  • 12:49: salary portion
  • 12:53: so if you know that two inputs in your
  • 12:56: model
  • 12:57: are linked as is the case
  • 13:00: uh potentially here with starting salary
  • 13:02: and savings rate you know that
  • 13:04: if you have a higher starting salary
  • 13:06: you're gonna have a higher savings rate
  • 13:09: then you can basically combine the
  • 13:12: coefficients and say well i know when
  • 13:15: the
  • 13:15: salary goes up by ten thousand the
  • 13:17: savings rate is going to go up by five
  • 13:18: percent and then you can add those two
  • 13:21: effects together
  • 13:22: to get the total effect on the years to
  • 13:24: retirement so you can still use these
  • 13:27: regression results to get at the full
  • 13:29: relationship
  • 13:30: it's just that each coefficient is
  • 13:33: interpreting
  • 13:34: just the effect of that variable
  • 13:39: and another thing to be careful about
  • 13:41: here is the units we already talked
  • 13:43: about
  • 13:43: you know how this is a one unit increase
  • 13:45: and you're probably
  • 13:47: in a lot of cases going to need to
  • 13:48: adjust the units
  • 13:50: in order to get that to a meaningful
  • 13:53: number
  • 13:54: uh and that's definitely the case when
  • 13:56: you think about decimals versus
  • 13:59: percentages so you know we're always
  • 14:02: representing our investment returns in
  • 14:04: decimal format
  • 14:06: in our python models and
  • 14:09: a one unit change in decimal format is
  • 14:12: actually a 100%
  • 14:13: change it's going from zero to one which
  • 14:15: is zero to one hundred percent
  • 14:18: um and so the coefficient that you'll
  • 14:20: get
  • 14:21: for the investment rate or any other
  • 14:23: decimal
  • 14:25: um decimal number that is actually a
  • 14:27: percentage
  • 14:29: um then basically
  • 14:32: the coefficient is going to be much
  • 14:35: larger
  • 14:36: uh it's going to be a hundred times as
  • 14:38: large as
  • 14:39: the value would be for a 1% change so you
  • 14:42: have to divide by 100
  • 14:44: to get it to a one percent change
  • 14:48: um and if you you know have things in
  • 14:51: percentages it could go the other way
  • 14:53: around in the model
  • 14:55: and so you just need to be careful about
  • 14:56: the units and thinking through
  • 14:58: the coefficients and what
  • 15:02: makes sense to have everything in the
  • 15:04: proper units
  • 15:07: so that's an overview of theoretically
  • 15:10: how and why we're going to analyze the
  • 15:13: relationship
  • 15:14: of inputs and outputs through monte
  • 15:16: carlo simulations
  • 15:18: we'll come back next time to apply monte
  • 15:21: carlo simulation
  • 15:22: to the dynamic salary retirement model
  • 15:25: and within that we're going to see an
  • 15:26: example of
  • 15:27: how we can do all this analysis of
  • 15:29: relating inputs to outputs
  • 15:31: so thanks for listening and see you next
  • 15:35: time
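
The regression step described above can be sketched in plain Python. This is a minimal illustration, not the course's implementation: the linear "model" with noise is a made-up stand-in for the retirement model, the coefficient values (-0.0002, -30, -150) and input distributions are assumptions chosen to echo the lecture's example, and the helper names `solve_linear_system` and `ols_coefficients` are hypothetical. A regression library would normally be used instead and would also report p-values, which this sketch omits.

```python
import random
import statistics


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_coefficients(x_columns, y):
    """Multivariate OLS with an intercept (intercept itself not returned).

    Returns (per-unit coefficients, standardized coefficients), where
    standardized coefficient = per-unit coefficient * std of that input.
    """
    means = [statistics.fmean(col) for col in x_columns]
    stds = [statistics.pstdev(col) for col in x_columns]
    y_mean = statistics.fmean(y)
    # Standardize inputs and center the output; this absorbs the intercept
    z = [[(v - mu) / sd for v in col] for col, mu, sd in zip(x_columns, means, stds)]
    yc = [v - y_mean for v in y]
    k = len(z)
    ztz = [[sum(p * q for p, q in zip(z[i], z[j])) for j in range(k)] for i in range(k)]
    zty = [sum(p * q for p, q in zip(z[i], yc)) for i in range(k)]
    std_coefs = solve_linear_system(ztz, zty)
    coefs = [b / sd for b, sd in zip(std_coefs, stds)]
    return coefs, std_coefs


random.seed(0)
salaries, savings_rates, interest_rates, years = [], [], [], []
for _ in range(2000):
    salary = random.normalvariate(60_000, 10_000)
    savings = random.normalvariate(0.25, 0.07)
    interest = random.normalvariate(0.05, 0.01)
    # Made-up stand-in for the model output (assumption, for illustration only)
    ytr = 45 - 0.0002 * salary - 30 * savings - 150 * interest
    ytr += random.normalvariate(0, 0.5)
    salaries.append(salary)
    savings_rates.append(savings)
    interest_rates.append(interest)
    years.append(ytr)

coefs, std_coefs = ols_coefficients([salaries, savings_rates, interest_rates], years)
salary_coef, savings_coef, interest_coef = coefs

# Rescale to meaningful units, as discussed in the lecture
effect_per_10k_salary = salary_coef * 10_000   # years per $10,000 of salary
effect_per_1pct_interest = interest_coef / 100  # years per 1 percentage point
print(f"$10,000 more salary: {effect_per_10k_salary:.2f} years")
print(f"1 point more interest: {effect_per_1pct_interest:.2f} years")
```

Each coefficient is a per-unit, all-else-constant effect, so the salary coefficient is multiplied by 10,000 and the decimal-format interest coefficient is divided by 100 before interpreting them.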

Applying Monte Carlo Simulation to a Python Model


Notes

  • It can make sense to set up a separate dataclass for your simulation-specific inputs, or you may add them to the existing dataclass

  • Once you start running large numbers of simulations, some unexpected situations may occur in your model such as inputs going negative that were supposed to only be positive, or one input being greater than another when it is supposed to be less. To solve this, we can build functions which produce the random inputs according to the necessary conditions in our model

  • Create a function which runs a single simulation, then call that function in a loop over the number of iterations to run all the simulations

  • Because we typically have multiple changing inputs and may even have multiple outputs, it is useful to store data as a list of tuples and then create a DataFrame at the end

  • It doesn’t hurt to take the quantile of the entire DataFrame to see the distributions of the inputs as well. It can be a nice check to make sure your random inputs are working appropriately

  • After running a multivariate regression, be sure to add some text interpreting the results

  • We can check the standardized coefficients (coef * std) to understand which inputs have the greatest impact on the outputs. Be careful that these results are influenced by your choice of the input distributions. If your input distributions are not reasonable, neither will be the results
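
The workflow in these notes can be sketched as follows. This is a simplified illustration under stated assumptions, not the completed course example: `ModelInputs` here has only a few hypothetical fields, the `years_to_retirement` function is a toy stand-in (constant salary, no promotions or raises), and the 200-year cap is an added safeguard. The redraw loop follows the approach described in the lecture, using `random.normalvariate` from the standard library.

```python
import random
from dataclasses import dataclass


@dataclass
class ModelInputs:
    # Hypothetical, reduced set of baseline inputs (used as the means)
    starting_salary: float = 60_000
    savings_rate: float = 0.25
    interest_rate: float = 0.05
    desired_cash: float = 1_500_000


@dataclass
class SimulationInputs:
    # Simulation-specific inputs kept in a separate dataclass
    n_iter: int = 1_000
    starting_salary_std: float = 10_000
    savings_rate_std: float = 0.07
    interest_rate_std: float = 0.01


def random_normal_positive(mean, std):
    """Draw from a normal distribution, redrawing until the value is not negative."""
    drawn_value = -1  # invalid starting value so the loop runs at least once
    while drawn_value < 0:
        drawn_value = random.normalvariate(mean, std)
    return drawn_value


def years_to_retirement(data):
    """Toy stand-in model: save a fixed share of a constant salary each year."""
    wealth = 0.0
    year = 0
    # The 200-year cap is an assumption guarding against extreme draws
    while wealth < data.desired_cash and year < 200:
        year += 1
        wealth = wealth * (1 + data.interest_rate) + data.starting_salary * data.savings_rate
    return year


def years_to_retirement_simulation_inputs(data, sim_data):
    """Draw one set of random inputs, using the baseline values as the means."""
    starting_salary = random_normal_positive(data.starting_salary, sim_data.starting_salary_std)
    savings_rate = random_normal_positive(data.savings_rate, sim_data.savings_rate_std)
    interest_rate = random_normal_positive(data.interest_rate, sim_data.interest_rate_std)
    return starting_salary, savings_rate, interest_rate


def years_to_retirement_single_simulation(data, sim_data):
    """Run the model once and return the inputs together with the output."""
    starting_salary, savings_rate, interest_rate = years_to_retirement_simulation_inputs(
        data, sim_data
    )
    new_data = ModelInputs(
        starting_salary=starting_salary,
        savings_rate=savings_rate,
        interest_rate=interest_rate,
        desired_cash=data.desired_cash,
    )
    ytr = years_to_retirement(new_data)
    return starting_salary, savings_rate, interest_rate, ytr


def years_to_retirement_mc(data, sim_data):
    """Run all simulations, collecting the results as a list of tuples."""
    return [
        years_to_retirement_single_simulation(data, sim_data)
        for _ in range(sim_data.n_iter)
    ]


random.seed(42)
results = years_to_retirement_mc(ModelInputs(), SimulationInputs(n_iter=500))
# With pandas installed, the list of tuples converts directly to a DataFrame:
# pd.DataFrame(results, columns=["Starting Salary", "Savings Rate", "Interest Rate", "Years to Retirement"])
print(results[:3])
```

Returning the inputs alongside the output from each single simulation is what makes the later scatter plots and regression possible, since every row ties one set of drawn inputs to its result.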

Transcript

  • 00:03: hey everyone
  • 00:04: nick derobertis here teaching you
  • 00:05: financial modeling today
  • 00:07: we're going to be looking at an example
  • 00:09: of how to add
  • 00:10: monte carlo simulation to an existing
  • 00:13: python
  • 00:14: model this is part of our lecture series
  • 00:16: on monte carlo simulation
  • 00:19: so we introduced monte carlo we looked
  • 00:22: at how to build out a model with monte
  • 00:24: carlo
  • 00:25: and we went through a more formal
  • 00:27: introduction and an
  • 00:29: explanation of everything that we're
  • 00:30: doing and now
  • 00:32: it's time to go and apply monte carlo
  • 00:35: simulation
  • 00:36: to our existing dynamic salary
  • 00:38: retirement model
  • 00:40: and you can find the full completed
  • 00:43: exercise there on the course site so
  • 00:46: that you can
  • 00:48: take from that example to build out your
  • 00:50: own monte carlo simulations
  • 00:54: so let's jump over here to the dynamic
  • 00:57: salary
  • 00:58: retirement model and i'm just going to
  • 01:00: go ahead and
  • 01:01: restart kernel run all cells so that we
  • 01:04: can
  • 01:04: get everything defined to get ready to
  • 01:08: do our monte carlo simulations
  • 01:12: so we can add a new section here monte
  • 01:16: carlo simulation
  • 01:18: and you would want to describe
  • 01:22: what you're doing here what's the goal
  • 01:24: of the simulation etc i'm going to skip
  • 01:26: over that
  • 01:27: for brevity in the video and go right to
  • 01:30: the code you can see all of that
  • 01:32: in the completed example so
  • 01:37: we're going to have some additional
  • 01:39: inputs now
  • 01:40: from the simulation
  • 01:44: we're going to need to draw all the
  • 01:46: different inputs from normal
  • 01:47: distributions
  • 01:50: and so we're going to have to have means
  • 01:51: and standard deviations of those
  • 01:53: distributions
  • 01:57: and so we can
  • 02:00: use our existing baseline
  • 02:04: input as the mean
  • 02:07: so we don't have to add all the means
  • 02:10: we do need to go and add these standard
  • 02:12: deviations though
  • 02:14: and we're also going to need to have a
  • 02:15: number of iterations for the simulations
  • 02:18: as an input so we've got a number of
  • 02:21: different inputs to
  • 02:22: manage here because we've got a
  • 02:25: bunch of inputs to manage it makes sense
  • 02:27: to create a data class
  • 02:29: to manage them there's a number of ways
  • 02:32: you could set this up you don't
  • 02:33: necessarily need to use the data class
  • 02:34: you could go and add these inputs to the
  • 02:36: existing model inputs data class
  • 02:38: but i'm just going to create a separate
  • 02:41: simulation inputs data class
  • 02:48: and so in that i'm going to put the
  • 02:50: number of iterations
  • 02:52: uh that would be an integer let's
  • 02:55: default it to 10 000. oh
  • 02:58: we're building this out i'll put it at
  • 03:00: 100 then i'll go back and change it to a
  • 03:02: thousand
  • 03:03: later uh we're gonna have the
  • 03:06: starting salaries so let's look at what
  • 03:08: we
  • 03:10: have in the model data
  • 03:13: starting salary
  • 03:17: so we want a standard deviation for that
  • 03:21: um and let's make that ten thousand
  • 03:24: dollars
  • 03:25: um and as you go to pick
  • 03:29: a standard deviation for your
  • 03:30: distribution
  • 03:32: so the mean you know whatever the kind
  • 03:34: of expected or most likely value is
  • 03:37: should be fine for the mean we already
  • 03:39: have that from our baseline
  • 03:40: values so that's fine the standard
  • 03:44: deviations you want to think about
  • 03:46: uh one standard deviation changes in
  • 03:48: either direction should happen often so
  • 03:50: going
  • 03:50: between 50 and 70 000
  • 03:53: salary happens often that makes sense
  • 03:56: two standard deviations in either
  • 03:59: direction
  • 04:00: should be not happening very often but
  • 04:03: not
  • 04:03: rare either so that's
  • 04:06: going from a 40 000 to an 80 000 salary
  • 04:09: that
  • 04:10: seems reasonable three standard
  • 04:12: deviation changes should be rare
  • 04:14: um so going from thirty thousand
  • 04:19: to uh ninety thousand yeah those those
  • 04:22: outer
  • 04:22: thirty to forty and uh eighty to ninety
  • 04:25: seem pretty rare for a starting salary
  • 04:28: and outside like four times standard
  • 04:31: deviation should like almost never
  • 04:33: happen
  • 04:34: um so 20 000 starting salary or 100 000
  • 04:37: starting salary
  • 04:38: for you know if this is some just
  • 04:41: undergraduate getting a job
  • 04:42: both of those almost never going to
  • 04:44: happen
  • 04:46: so that seems like a reasonable standard
  • 04:49: deviation and that's how you can think
  • 04:51: through
  • 04:52: what standard deviation should i pick
  • 04:53: for my distribution
  • 04:56: so then we can go
  • 04:59: to create the rest of our standard
  • 05:02: deviations promo every n years
  • 05:04: std um
  • 05:07: let's put that at 1.5
  • 05:11: um the cost of living raise
  • 05:17: let's put that at a half of a percent
  • 05:24: um the savings rate
  • 05:31: let's put that at seven percent
  • 05:35: and the interest rate
  • 05:41: let's put that at one percent
  • 05:46: okay and then we can create an instance
  • 05:48: of our
  • 05:50: simulation inputs
  • 05:53: and we have everything there
  • 05:58: so the first step in the monte carlo
  • 06:01: simulation
  • 06:02: is to draw the random values
  • 06:05: of the inputs in order to run them
  • 06:07: through the model
  • 06:11: but looking at the inputs into our model
  • 06:15: before we go and draw random values we
  • 06:17: want to think about
  • 06:19: what are valid ranges of these inputs
  • 06:22: in our model is it possible that we're
  • 06:24: going to hit some
  • 06:25: invalid numbers by pulling these
  • 06:28: random values um
  • 06:32: so salary how often you're getting
  • 06:35: promotions
  • 06:37: cost of living raise promotion raise
  • 06:39: savings rate
  • 06:41: um all these things really they need to
  • 06:44: be positive they don't make sense
  • 06:46: if they're negative
  • 06:50: and the interest rate i would say you
  • 06:52: know if this was
  • 06:54: each individual year we were getting a
  • 06:55: random interest rate sure that can go
  • 06:57: negative
  • 06:58: but if we're talking about a long-term
  • 07:00: interest rate
  • 07:01: that also should be positive so really
  • 07:05: all these inputs that we're randomizing
  • 07:07: should be positive
  • 07:08: in the model so
  • 07:11: knowing that knowing the conditions that
  • 07:13: we need to have on our inputs
  • 07:16: we can write functions to draw the
  • 07:19: random inputs
  • 07:20: that are always going to satisfy these
  • 07:23: conditions
  • 07:25: so
  • 07:29: well first i'm gonna i'm gonna go back
  • 07:30: up to the top and import random
  • 07:33: uh because we're definitely going to
  • 07:36: need that
  • 07:38: to draw the values from normal
  • 07:39: distributions
  • 07:41: um and so
  • 07:44: you recall from the um
  • 07:47: continuous random variable material
  • 07:51: random.normalvariate is able to draw
  • 07:53: values from a normal distribution
  • 07:56: um and so let's just
  • 07:59: take an example mean here
  • 08:02: of two and a standard deviation
  • 08:06: of one then i set these up because i
  • 08:10: know
  • 08:10: that this is going to go negative in
  • 08:12: some cases it's only two standard
  • 08:14: deviations away from zero and so that
  • 08:16: should happen decently often
  • 08:18: um so putting the mean and standard
  • 08:20: deviation
  • 08:22: then we get random values from that
  • 08:24: normal distribution
  • 08:26: um and most of them are going to be
  • 08:27: positive but some of them are going to
  • 08:30: come up negative i saw one that was
  • 08:31: negative there
  • 08:34: um so
  • 08:37: what we can do um we
  • 08:40: want to figure out a way so that every
  • 08:43: value that we draw is going to be
  • 08:44: positive
  • 08:45: and let me actually just increase this
  • 08:47: so that it's a lot more likely
  • 08:49: to get negative numbers in here
  • 08:52: so that it's really clear that this is
  • 08:55: working appropriately
  • 08:58: so what we can do is basically
  • 09:01: pick the value and then if we didn't
  • 09:05: get a value that meets our conditions in
  • 09:07: this case that is a positive number
  • 09:10: then we're just going to keep drawing
  • 09:11: values until we do
  • 09:13: so what we can do is use a while loop
  • 09:18: for this because the while loop executes
  • 09:20: until
  • 09:22: some condition evaluates to false as
  • 09:24: long as it's
  • 09:25: true it's going to keep executing and so
  • 09:28: this is the perfect fit here
  • 09:29: because we want to keep drawing random
  • 09:31: values until
  • 09:33: we meet our condition of
  • 09:36: it being positive so
  • 09:41: that condition so let's um
  • 09:45: call this drawn value um
  • 09:49: our condition would be while the drawn
  • 09:52: value
  • 09:54: is less than zero so as long as we're
  • 09:56: getting a negative number
  • 09:57: keep going so it's basically the
  • 09:59: opposite
  • 10:00: of the condition that you want we want
  • 10:03: the drawn value to be greater than zero
  • 10:05: so as long as greater than or equal to
  • 10:08: zero
  • 10:09: so as long as it's not the case that it
  • 10:12: satisfies that condition
  • 10:14: as long as it's a negative number then
  • 10:16: we're going to keep
  • 10:17: drawing additional values um but then
  • 10:21: you go and run this and you'll get the
  • 10:22: name error that
  • 10:23: drawn value is not defined because we
  • 10:25: don't define it until here
  • 10:27: so we also need to initialize it so just
  • 10:30: initialize it to some value which is
  • 10:33: going to satisfy
  • 10:37: the reverse condition so basically put
  • 10:40: it
  • 10:41: at a value which is not acceptable for
  • 10:44: your model
  • 10:45: and that will make sure that it goes
  • 10:47: into the while loop
  • 10:49: and so then we just show the drawn value
  • 10:51: at the end
  • 10:53: and then you'll notice that no matter
  • 10:55: how many times i run this
  • 10:56: it's going to come up positive every
  • 10:58: time
  • 11:00: um even though we saw it was decently
  • 11:03: often
  • 11:04: that we were getting negative numbers
  • 11:05: before
  • 11:08: so now we have a function we can call
  • 11:11: this
  • 11:12: uh random normal
  • 11:15: positive which takes a mean and a
  • 11:19: standard deviation
  • 11:21: um and returns that drawn value at the
  • 11:25: end
  • 11:25: so now we can just do random normal
  • 11:27: positive
  • 11:29: with whatever mean and standard
  • 11:31: deviation
  • 11:32: and it's going to uh give us
  • 11:36: values from the normal distribution
  • 11:38: basically but just
  • 11:39: chop off any of those ones which are
  • 11:42: negative and
  • 11:42: try again
  • 11:46: so we can apply this function across all
  • 11:48: the different inputs that we're
  • 11:50: randomizing
  • 11:50: in the model
  • 11:54: so um and of course you would add a doc
  • 11:57: string to explain
  • 11:58: what this does i'm just skipping that
  • 12:00: to
  • 12:01: keep the video short but definitely take
  • 12:03: a look at the completed example
  • 12:05: for having all the doc strings and
  • 12:07: everything filled out
  • 12:10: so what we want to do next is we want to
  • 12:14: pick the random values of all these
  • 12:16: inputs
  • 12:17: so
  • 12:20: all these different inputs here we want
  • 12:24: to
  • 12:24: randomly draw them um
  • 12:28: so i'm just going to copy these
  • 12:31: to just make my life easier to type this
  • 12:34: out
  • 12:35: um so then i can get
  • 12:39: all the names of the different inputs
  • 12:40: there
  • 12:44: delete off these commas
  • 12:50: and then um
  • 12:55: we can then um
  • 12:58: that's not gonna work delete off these
  • 13:01: values as well
  • 13:05: we're going to use the random normal
  • 13:06: positive function that we just created
  • 13:09: in order to um
  • 13:13: let me put a space there random normal
  • 13:17: positive
  • 13:18: um and
  • 13:21: we want to
  • 13:25: do the mean there
  • 13:28: as the original input value and we want
  • 13:31: to get from the sim data
  • 13:33: the std of
  • 13:36: that value so then
  • 13:40: we have drawn all these different inputs
  • 13:45: now let me add a data equals model data
  • 13:53: um and then i named one of these
  • 13:57: oh i did i forgot to put promotion
  • 14:00: raise standard deviation
  • 14:01: so um add that in here as well
  • 14:05: promo raise std
  • 14:10: and what's the reasonable value for that
  • 14:13: let's say 5 on that
  • 14:17: so now hopefully this will work yep um
  • 14:20: and so now we have all these different
  • 14:23: random values interest rate
  • 14:25: promotion promotional raise etc
  • 14:28: we're getting random values for each and
  • 14:30: they're always positive
  • 14:34: um so then we can
  • 14:38: make a function out of this um so
  • 14:41: we can call this
  • 14:44: years to retirement
  • 14:48: simulation inputs
  • 14:52: i'm going to take the data and the sim
  • 14:56: data
  • 15:00: and
  • 15:03: then we can return all of these values
  • 15:15: so we want to return all these different
  • 15:18: values and so it's doing it as
  • 15:21: a tuple
  • 15:28: where we're returning all these at once
  • 15:33: then we can call this years to
  • 15:37: retirement simulation inputs
  • 15:39: with the data and the sim data and we're
  • 15:42: going to get
  • 15:43: all these different random values and
  • 15:45: you can see they're changing each time
  • 15:47: first one corresponds to salary second
  • 15:49: promotions every nears
  • 15:50: and so on
  • 15:54: so now we're able to draw the random
  • 15:57: values
  • 15:58: of all our inputs and so the next step
  • 16:01: is then to get to
  • 16:02: running a single simulation
  • 16:07: so we
  • 16:11: are going to call this function
  • 16:14: um and we want to save
  • 16:18: the results of it so we can do
  • 16:21: we can take the same thing to split it
  • 16:22: back out into the
  • 16:25: individual variable values
  • 16:30: so you see i run that and now all these
  • 16:33: things
  • 16:33: are defined individually
  • 16:39: and then we want to create the data
  • 16:43: so create an instance of the model
  • 16:45: inputs
  • 16:46: with these values
  • 16:51: so i'm going to grab these again
  • 17:02: and let them equal put a comma
  • 17:06: and now we should have that new data
  • 17:08: created appropriately
  • 17:11: so i run this and i get the model data
  • 17:14: being created
  • 17:15: with random values now
  • 17:21: now that we have the data into the model
  • 17:24: inputs data class
  • 17:25: now we can run the model so we can do
  • 17:28: years to retirement equals
  • 17:29: we have the years to retirement function
  • 17:32: um
  • 17:33: we want to pass the new data and we want
  • 17:36: to make sure
  • 17:37: we don't need to print out the do i have
  • 17:40: the print output in this version of the
  • 17:41: model
  • 17:42: i might not
  • 17:45: okay it seems it's not there let me
  • 17:48: quickly add it um otherwise we're going
  • 17:51: to have
  • 17:52: a huge amount of output coming out of
  • 17:55: this
  • 17:56: so if print output and just wrap
  • 17:59: all the print statements in that
  • 18:06: three print statements here
  • 18:14: um and then coming back over to here
  • 18:18: now print output equals false
  • 18:23: so with that we get the years to
  • 18:24: retirement and we're going to get
  • 18:28: it should be different years to retirement
  • 18:29: but we're getting the same
  • 18:31: years to retirement so if the print output
  • 18:33: wasn't there i think i might have used
  • 18:35: the version yep which had this model
  • 18:37: data mistake
  • 18:38: so make sure it flows all the way
  • 18:40: through redefine that
  • 18:43: and now hopefully we'll get different
  • 18:45: years to retirement with each run
  • 18:47: of the model
  • 18:50: and we're still not getting that that's
  • 18:54: odd
  • 18:58: let me just restart this and run all the
  • 19:00: way through while i
  • 19:01: um go back and take another look
  • 19:05: um oh we have
  • 19:13: oh i was editing this wellsdf function
  • 19:16: okay so this is the one that also had
  • 19:19: that model data mistake
  • 19:20: your data data okay we're good now
  • 19:24: hopefully it should come through now yes
  • 19:27: okay now we're getting different years
  • 19:29: to retirement with each run of the model
  • 19:31: so you definitely want to do these
  • 19:33: checks on your own with your own model
  • 19:35: as you build it out
  • 19:36: if one simulation does not work properly
  • 19:38: then certainly
  • 19:39: 10,000 are not going to work properly
  • 19:41: either
  • 19:44: um and now that we have the logic to
  • 19:46: produce
  • 19:47: one simulation then we can wrap that up
  • 19:50: into a function
  • 19:53: so i'm going to call this years to
  • 19:56: retirement
  • 19:58: single simulation it takes the data
  • 20:02: and the sim data
  • 20:05: and then all this and return the years
  • 20:09: or we want to return more than just the
  • 20:11: years to retirement though
  • 20:13: um we want to actually
  • 20:16: return all the inputs as well so we can
  • 20:19: return all the inputs
  • 20:21: and the years to retirement um
  • 20:24: so that we have all the inputs
  • 20:27: associated with the output
  • 20:29: so then when i call this we then get
  • 20:33: all those inputs again but also the
  • 20:35: output the years to retirement as well and
  • 20:38: that's all
  • 20:38: associated together so now we can
  • 20:42: run a single monte carlo simulation with
  • 20:46: a single line of code
  • 20:49: so now that we can do that we want to
  • 20:51: get to running the full
  • 20:53: monte carlo simulation process with
  • 20:56: however many iterations that we want
  • 21:00: so um we want to basically call this in a
  • 21:04: loop
  • 21:05: over the number of iterations and
  • 21:09: all we're doing is just calling this a
  • 21:10: bunch of times and putting it into a
  • 21:11: list so i'm going to use a list
  • 21:13: comprehension
  • 21:15: to simplify that loop so just calling
  • 21:18: the function
  • 21:19: for i in range
  • 21:22: sim data dot number of iterations
  • 21:29: call that all results and then
  • 21:32: we can look at let's just look at the
  • 21:36: first
  • 21:36: five because there's going to be a lot
  • 21:37: in there and we can see we're getting
  • 21:39: multiple runs of this with the inputs
  • 21:41: associated with the outputs
  • 21:45: so then we can put this into a data
  • 21:47: frame
  • 21:48: and if i imported pandas uh
  • 21:51: yep uh so put this all into a data frame
  • 21:56: pd.dataframe
  • 21:59: of all results
  • 22:03: and the columns then we want to
  • 22:07: name these columns so we're going to
  • 22:09: have starting salary
  • 22:11: first and you want to go in the same
  • 22:12: order as whatever you have in the tuple
  • 22:15: the starting salary and then promos
  • 22:19: every n years
  • 22:28: then the cost of living raise
  • 22:32: and then the promotion raise
  • 22:36: and then the savings rate
  • 22:40: the interest rate and finally the years
  • 22:43: to retirement
  • 22:46: and you don't want to have really long
  • 22:48: cells like this it's just really
  • 22:50: difficult to read so i'm going to split
  • 22:52: this
  • 22:52: onto multiple lines it's within
  • 22:54: parentheses and so i can split it
  • 22:59: and this is going to make the code
  • 23:01: easier to read
  • 23:06: so then we should have
  • 23:10: our data frame created
  • 23:13: and we see that here so i have it set at
  • 23:16: 100
  • 23:16: simulations right now that's why we have
  • 23:18: 100 rows in the data frame each row is
  • 23:20: one simulation
  • 23:22: and we see all the input values
  • 23:23: associated with
  • 23:25: the output so
  • 23:29: now we're able to run all the
  • 23:31: simulations so let's make a function for
  • 23:33: that years to retirement
  • 23:36: monte carlo takes the data and sim data
  • 23:41: and let's indent all this and return it
  • 23:46: so now i can call this
  • 23:50: and we should get the same thing
  • 23:54: and of course i could you know change it
  • 23:58: and
  • 23:58: run for
  • 24:02: say a thousand iterations and then we
  • 24:04: would see a thousand rows in the data
  • 24:05: frame so everything
  • 24:06: seems to be flowing through properly
  • 24:13: so um
  • 24:17: now we've got the simulation results and
  • 24:20: we can get them with a single function
  • 24:24: uh let's go ahead and save those results
  • 24:26: into a data frame
  • 24:32: so now we've got this data frame
  • 24:35: but it doesn't have great formatting we
  • 24:39: might want to apply some formatting to
  • 24:41: it
  • 24:43: um so style format um
  • 24:47: so starting salary and i'm just gonna
  • 24:51: i want to probably format all of them so
  • 24:53: i'm just gonna copy these to get started
  • 24:55: with
  • 24:56: um and then starting salary
  • 25:00: um that's going to be uh dollars
  • 25:04: and i wanted to have commas and zero
  • 25:06: decimal places
  • 25:08: promotions every n years um that can
  • 25:11: just have
  • 25:12: one decimal places one decimal place
  • 25:17: um cost of living raise
  • 25:20: that's going to be a percentage
  • 25:23: we can give it up to two decimal places
  • 25:27: promotion raise same thing really all
  • 25:30: the
  • 25:31: percentages same thing promotion raise
  • 25:34: uh savings rate and interest rate
  • 25:39: and then years to retirement
  • 25:42: we can make it zero decimal places
  • 25:49: um
  • 25:54: okay so now we see that with proper
  • 25:57: formatting
  • 25:59: um and the other thing we might want to
  • 26:02: do is add some coloring to it
  • 26:04: so i'm going to add the background
  • 26:05: gradient with the
  • 26:08: red yellow green color map
  • 26:11: on just the years to retirement
  • 26:15: column
  • 26:20: i see that coming there and
  • 26:25: you be careful that you don't style your
  • 26:27: data frame which has
  • 26:29: 10,000 rows in it because it is going to
  • 26:31: show all of it
  • 26:32: um
  • 26:35: and we'll notice that um this is going
  • 26:38: the opposite of the direction that we
  • 26:39: want right it's showing green for high
  • 26:41: values but really green is
  • 26:43: our low is good in our model so we want
  • 26:45: to reverse the color map and so we can
  • 26:47: add
  • 26:48: underscore r and now uh
  • 26:51: when the years to retirement are low we're
  • 26:53: seeing the dark green
  • 26:54: and when they're high we're seeing the
  • 26:56: red
  • 27:01: so then we can wrap this in a function
  • 27:04: style df takes the data frame
  • 27:08: and then returns this
  • 27:11: so that we can just use head, the shortcut
  • 27:15: which shows the top five rows so just
  • 27:16: look
  • 27:17: at the data we can then apply style df
  • 27:20: to that
  • 27:21: to just get a peek into our data in
  • 27:24: the model
  • 27:29: and it's useful to say
  • 27:32: how many simulations were run so we can
  • 27:36: do the length
  • 27:36: of data frame simulations were run
  • 27:46: so now we have the results from the
  • 27:48: simulation and we want to
  • 27:50: visualize and analyze them
  • 27:54: so let's visualize the results
  • 27:58: um well so this
  • 28:03: file data frame that's the first part
  • 28:06: um just example results
  • 28:13: um this this can go at the end of the
  • 28:15: results there we go
  • 28:18: um the next what we want to visualize
  • 28:21: is the distribution of the output
  • 28:24: years to retirement
  • 28:28: um so we can take
  • 28:32: this data frame years to retirement
  • 28:35: and do a histogram
  • 28:38: um and see the output there
  • 28:42: um let me go ahead and just
  • 28:45: run this with more iterations at this
  • 28:47: point
  • 28:50: so that we'll have a good idea what the
  • 28:52: output
  • 28:53: is going to look like
  • 28:57: um 50 is not going to be enough bins for
  • 29:00: that let's try
  • 29:01: 100 that's a little bit more
  • 29:04: reasonable um
  • 29:08: so then
  • 29:12: um we want to create the
  • 29:15: um probability table
  • 29:18: so uh probability table
  • 29:23: um and we can get the quantiles
  • 29:26: uh we're gonna do this at every
  • 29:29: five percent um percentiles
  • 29:33: i over twenty for i in range uh from
  • 29:35: one to twenty
  • 29:42: so in range so we get that five percent
  • 29:46: to 95 percent
  • 29:48: and then we can do df.quantile
  • 29:51: on that
  • 29:56: so here's another advantage of the
  • 30:00: data frame styler function pattern now
  • 30:03: we have
  • 30:04: a different data frame which is in the
  • 30:06: same structure
  • 30:08: we can apply the same function to that
  • 30:11: so now this
  • 30:15: probability table is nicely formatted as
  • 30:18: well
  • 30:20: and this is telling us you know only
  • 30:22: five percent of the time will you be
  • 30:24: able to retire in less than 21 years
  • 30:28: and five percent of the time it could
  • 30:30: take longer than 39 years to
  • 30:31: retire based on the distributions that
  • 30:34: we have assigned
  • 30:38: so next
  • 30:42: now we're going to get into analyzing
  • 30:45: the relationship of the inputs versus
  • 30:48: the outputs so the first that we can do
  • 30:51: is uh plots of inputs
  • 30:54: versus years to retirement
  • 30:59: um so if we do df.plot.scatter
  • 31:06: and we tell it the y is years to
  • 31:09: retirement
  • 31:11: and then the x is whatever
  • 31:17: input that we want to look at then we're
  • 31:18: going to get a
  • 31:20: scatter plot as a result
  • 31:24: so we want to do this but we want to do
  • 31:26: it for all the different possible inputs
  • 31:29: so i'm going to go back i'm going to
  • 31:30: grab this list
  • 31:34: so we can call this the
  • 31:38: input columns
  • 31:45: and then for each
  • 31:48: column in the input columns then we want
  • 31:52: to do this
  • 31:53: scatter with that particular column
  • 31:58: then we now have all the scatter plots
  • 32:02: for each of the different inputs so we
  • 32:05: can see the relationships
  • 32:07: um and we don't need years to retirement
  • 32:12: we only want the inputs not the output
  • 32:14: here
  • 32:15: so i'm going to remove that one
  • 32:19: um and you know you can see
  • 32:23: some of these have clearer patterns than
  • 32:25: others like here with
  • 32:26: savings rate you can see it's kind of a
  • 32:29: curve here
  • 32:29: that's a fairly defined pattern um
  • 32:33: and with cost of living raise it's a
  • 32:34: little more of an ambiguous cloud
  • 32:37: here so just based on the scatter plots
  • 32:41: it suggests a fairly strong relationship
  • 32:43: between savings rate and years to
  • 32:44: retirement
  • 32:45: and not a very strong relationship
  • 32:48: between the cost of living raise and
  • 32:50: years to retirement
  • 32:53: so then we can go on to the quantitative
  • 32:58: analysis of the relationship between the
  • 33:00: inputs and the outputs
  • 33:03: and that is through the multivariate
  • 33:08: regression
  • 33:12: so we're going to use the statsmodels
  • 33:15: package in order to run the regression
  • 33:19: so i'm going to import statsmodels uh
  • 33:23: dot api as sm
  • 33:26: and this is another one of those
  • 33:29: conventions
  • 33:30: just take this import and use it as is
  • 33:32: and then we'll use
  • 33:33: sm to interact with the statsmodels library
  • 33:39: so what we want to do is
  • 33:43: i'm going to say that our output column
  • 33:45: is years to retirement
  • 33:49: and we already had our input columns
  • 33:51: defined here
  • 33:55: so ultimately what we're doing
  • 34:00: is we're going to get our x variables
  • 34:04: as the uh input columns
  • 34:08: from the data frame so then we have just
  • 34:11: the inputs no years to retirement on
  • 34:13: here
  • 34:14: we're going to get uh the
  • 34:18: um years to retirement here
  • 34:21: as our y variable um
  • 34:25: and then we're going to create the
  • 34:27: regression
  • 34:28: model object so we're going to do an
  • 34:32: ordinary least squares regression ols
  • 34:34: regression uh which is the standard
  • 34:38: and we're going to put the y first and
  • 34:41: then the x
  • 34:43: and then in order to get results from
  • 34:45: that we're going to
  • 34:47: fit the model, call .fit on the model
  • 34:52: and then we call the summary method on
  • 34:54: the result object
  • 34:55: in order to produce this summary that
  • 34:58: you see here
  • 35:01: so now the top part is the general fit
  • 35:05: statistics
  • 35:06: not too important for this what we're
  • 35:08: really concerned about is the p-values
  • 35:10: and the coefficients
  • 35:12: so all the p-values are low and so
  • 35:15: there's
  • 35:15: no evidence from the p-values that any
  • 35:18: of these
  • 35:19: uh inputs are unrelated to the outputs
  • 35:22: it seems that there is evidence of a
  • 35:25: relationship
  • 35:26: with each one of them and we can look at
  • 35:29: the coefficients
  • 35:31: in order to interpret the um strength of
  • 35:34: that relationship
  • 35:37: but there is one other thing that we
  • 35:39: need to do here
  • 35:40: um which is that
  • 35:44: the you'll notice here that there's no
  • 35:47: constant or intercept if you're familiar
  • 35:50: with
  • 35:51: running regressions you typically have a
  • 35:53: constant or intercept
  • 35:55: as one of the x variables and that is
  • 35:57: not included by default
  • 35:59: in statsmodels you do have to add it
  • 36:02: explicitly
  • 36:03: so in order to do that we do
  • 36:07: sm.add_constant
  • 36:10: and then in here we do hasconst equals
  • 36:13: true
  • 36:14: and then when we run this again now
  • 36:16: we'll see we have this constant
  • 36:18: in there um and when we look at the x
  • 36:22: that basically added a column of ones
  • 36:25: into the model
  • 36:27: and that's how it works with the
  • 36:29: constant
  • 36:30: um we're not going to be diving into the
  • 36:33: theory of ols regressions why you should
  • 36:35: have this constant
  • 36:37: but just in general you should probably
  • 36:40: have the constant and so make sure to
  • 36:42: add it
  • 36:42: so you know you can just copy paste this
  • 36:45: code snippet
  • 36:46: for your own model and just switch out
  • 36:48: the output columns
  • 36:49: and the output column and the input
  • 36:52: columns
  • 36:56: so now we have the regression results
  • 37:01: and we want to interpret them
  • 37:06: so we can go ahead and already look at
  • 37:07: these and start doing some
  • 37:09: interpretations you'll notice
  • 37:11: um that for promotions every n years we
  • 37:14: have a 1.2648
  • 37:16: coefficient so what that's saying is if
  • 37:19: we get uh if it takes one year longer
  • 37:23: to get a promotion on average
  • 37:27: that's going to lead to a 1.26
  • 37:31: additional years it takes until we get
  • 37:34: to
  • 37:35: retirement
  • 37:40: and so another question is you know
  • 37:42: which of these
  • 37:44: inputs is most impactful which matters
  • 37:46: the most and so you might think well
  • 37:48: just whichever have the biggest
  • 37:50: coefficients those should be
  • 37:52: the most impactful but that's not the
  • 37:54: case you have to also consider
  • 37:56: the standard deviation of the
  • 37:59: inputs so we can evaluate that by
  • 38:02: looking at the standard deviation on the
  • 38:04: data frame that will tell us the
  • 38:05: standard deviation of each of our inputs
  • 38:07: and those should basically be the
  • 38:09: standard deviations that we set out
  • 38:11: in our simulation data um
  • 38:15: which they are um
  • 38:19: so we take that standard deviation
  • 38:23: and then on this result
  • 38:27: object we have result.params that gives
  • 38:30: us
  • 38:31: a pandas series which has all those
  • 38:34: coefficients that we saw up here in the
  • 38:36: nice summary output
  • 38:38: so what we can do is we can actually
  • 38:40: multiply these two things together
  • 38:43: and that gives us what's called
  • 38:45: standardized coefficients
  • 38:49: so what that is saying is now it's
  • 38:50: instead of a one unit
  • 38:52: increase in the input variable it's a
  • 38:54: one standard deviation
  • 38:56: increase in the input variable
  • 38:59: so that's saying that a one standard
  • 39:01: deviation
  • 39:03: increase in the cost of living raise
  • 39:07: decreases years to retirement by 0.9
  • 39:10: years so these coefficients
  • 39:14: are comparable in terms of which has the
  • 39:16: biggest impact
  • 39:18: so you can basically think in the
  • 39:20: absolute value of these whichever are
  • 39:22: the biggest
  • 39:23: are going to have the biggest impact on
  • 39:24: the model so here
  • 39:26: is saying that savings rate has the
  • 39:29: biggest
  • 39:29: impact on the years to retirement
  • 39:32: followed by the starting salary
  • 39:37: and in your model you're going to want
  • 39:39: to
  • 39:40: include some text at the bottom that
  • 39:43: interprets this
  • 39:45: and draws conclusions from the
  • 39:46: coefficients talking about the original
  • 39:48: coefficients as well as the standardized
  • 39:51: coefficients
  • 39:53: so that it's very clear for the reader
  • 39:56: of the model basically what was
  • 39:57: important
  • 39:58: from doing all of this analysis
  • 40:04: so that's the general process of adding
  • 40:07: monte carlo simulation to an existing
  • 40:09: model
  • 40:10: and analyzing the relationship between
  • 40:13: the inputs and the outputs
  • 40:16: now to go along with this
  • 40:20: there is an analogous lab exercise here
  • 40:25: so the lab exercise for this is then to
  • 40:28: do something very similar
  • 40:30: for the project one model project one
  • 40:33: python model
  • 40:34: now i'm not asking you to do it with
  • 40:36: every input there here in the level one
  • 40:39: just do it with the interest rate
  • 40:42: just randomize that and then
  • 40:46: run ten thousand simulations get the
  • 40:48: years or
  • 40:49: the npv results visualize
  • 40:53: and then create this table of
  • 40:55: probabilities and
  • 40:58: get the chance that the npv will be more
  • 41:00: than 400 million
  • 41:02: and then in the level two um
  • 41:05: then you're going to be doing the same
  • 41:07: thing continuing on but then also
  • 41:09: drawing the number of phones from a
  • 41:11: normal distribution as well
  • 41:13: and doing the same kind of analysis but
  • 41:15: then following it up
  • 41:17: with analyzing the relationship between
  • 41:20: the inputs and the outputs so doing the
  • 41:22: scatter plots
  • 41:24: and the multivariate regression and then
  • 41:26: interpreting
  • 41:27: the results of that
  • 41:30: so that wraps up um
  • 41:34: adding monte carlo simulation to python
  • 41:36: models
  • 41:37: thanks for listening and see you next
  • 41:40: time
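
The single-simulation function, the list-comprehension loop, and the probability table described in the transcript above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the lecture's code: `SimulationInputs`, the toy `years_to_retirement` model (flat salary, only two random inputs), and every default value are made up for the example. Only the overall pattern follows the lecture: draw inputs, run the model, return the inputs together with the output, collect one row per run in a DataFrame, then take quantiles at 5% steps.

```python
import random
from dataclasses import dataclass

import pandas as pd


@dataclass
class SimulationInputs:
    # Hypothetical settings: a mean and standard deviation per random input
    n_iterations: int = 1000
    savings_rate_mean: float = 0.25
    savings_rate_std: float = 0.05
    interest_rate_mean: float = 0.05
    interest_rate_std: float = 0.01


def years_to_retirement(savings_rate, interest_rate,
                        salary=60_000, desired_cash=1_500_000):
    """Toy stand-in for the core model: save a fixed share of a flat salary."""
    wealth, year = 0.0, 0
    # The 100-year cap guards against draws where retirement is never reached
    while wealth < desired_cash and year < 100:
        wealth = wealth * (1 + interest_rate) + salary * savings_rate
        year += 1
    return year


def single_simulation(sim):
    """Draw random inputs, run the model once, return inputs plus the output."""
    savings_rate = max(
        random.normalvariate(sim.savings_rate_mean, sim.savings_rate_std), 0.01
    )
    interest_rate = random.normalvariate(
        sim.interest_rate_mean, sim.interest_rate_std
    )
    years = years_to_retirement(savings_rate, interest_rate)
    return savings_rate, interest_rate, years


def monte_carlo(sim):
    """Run every iteration and collect one row per simulation."""
    all_results = [single_simulation(sim) for _ in range(sim.n_iterations)]
    # Column names must follow the same order as the returned tuple
    return pd.DataFrame(
        all_results,
        columns=['Savings Rate', 'Interest Rate', 'Years to Retirement'],
    )


random.seed(0)  # fixed seed so the sketch is reproducible
df = monte_carlo(SimulationInputs())

percentiles = [i / 20 for i in range(1, 20)]  # 5%, 10%, ..., 95%
probability_table = df.quantile(percentiles)
print(probability_table)
```

Each row of `df` is one simulation, so `len(df)` reports how many were run, and `df['Years to Retirement'].hist(bins=100)` would give the histogram discussed above (that call needs matplotlib installed).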

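The regression step from the transcript above (add a constant, fit OLS, then multiply each coefficient by its input's standard deviation) can be illustrated with a small sketch. The lecture uses statsmodels (`sm.add_constant`, `sm.OLS(...).fit()`, `result.params`); this example swaps in NumPy's least-squares solver so that the column of ones is visible, and it runs on synthetic data. The variable names, the distributions, and the "true" coefficients (40, -50, -80) are assumptions invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical simulated inputs (stand-ins for columns of the results data frame)
savings_rate = rng.normal(0.25, 0.07, n)
interest_rate = rng.normal(0.05, 0.03, n)

# Synthetic output with a known linear relationship plus a little noise
years = 40 - 50 * savings_rate - 80 * interest_rate + rng.normal(0, 0.5, n)

X = np.column_stack([savings_rate, interest_rate])

# Adding a constant means adding a column of ones to the inputs
X_const = np.column_stack([np.ones(n), X])

# Ordinary least squares: coefficients that minimize the squared error
coefs, *_ = np.linalg.lstsq(X_const, years, rcond=None)
intercept, betas = coefs[0], coefs[1:]

# Coefficient times the input's standard deviation: the effect of a
# one-standard-deviation increase (what the lecture calls standardized coefficients)
std_coefs = betas * X.std(axis=0, ddof=1)

for name, beta, std_coef in zip(['Savings Rate', 'Interest Rate'], betas, std_coefs):
    print(f'{name}: coefficient={beta:.2f}, per-std effect={std_coef:.2f}')
```

In this synthetic setup the interest rate has the larger raw coefficient, but the savings rate varies more, so its one-standard-deviation effect is larger. That is the reason for comparing the scaled values rather than the raw coefficients.
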
Applying Monte Carlo Simulation to an Excel Model


Notes

  • The process for running Monte Carlo simulations in Excel is nearly the same as that in Python when we use Python to run the simulations on the Excel model using xlwings

  • The main difference is that we write the inputs into Excel and extract the results using xlwings rather than running Python logic for the core model

  • Excel recalculates whenever an input is changed, so writing the inputs in is enough to get the result calculated

  • For the analysis, you can either keep the results in Python and follow the process for analyzing the results in Python, or you can output them back to Excel and analyze the outputs there

  • Keep in mind that if you visualize the outputs in Excel, the next time you run the simulation it will run slowly due to the visualizations. Because of this, it may be a better idea in general to do the analysis in Python if you have a choice
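
The write-inputs, read-output loop summarized in the notes above might look roughly like the sketch below. It assumes `io_sheet` and `sim_sheet` are xlwings `Sheet` objects and that the interest rate sits in cell B10 and years to retirement in B18, as in the example workbook used in this lecture. The function name, parameter names, and the file name in the docstring are illustrative assumptions.

```python
import random


def retirement_simulations(io_sheet, sim_sheet, num_iterations,
                           interest_mean, interest_std,
                           input_cell='B10', output_cell='B18'):
    """Run the Excel model repeatedly with random interest rates.

    io_sheet and sim_sheet are assumed to be xlwings Sheet objects, e.g.:
        import xlwings as xw
        book = xw.Book('model.xlsx')  # hypothetical file name
        io_sheet = book.sheets['Inputs and Outputs']
        sim_sheet = book.sheets['Simulations']
    """
    all_retirement_years = []
    for _ in range(num_iterations):
        interest_rate = random.normalvariate(interest_mean, interest_std)
        # Writing the input is enough: Excel recalculates the model
        io_sheet.range(input_cell).value = interest_rate
        # Reading the output cell picks up the recalculated result
        all_retirement_years.append(io_sheet.range(output_cell).value)

    # Wrap each value in its own list so the results are written down a column
    vertical_retirement_years = [[years] for years in all_retirement_years]
    sim_sheet.range('A1').value = vertical_retirement_years

    # Also return the results in case the analysis is done in Python
    return all_retirement_years
```

With a workbook open, a call such as `retirement_simulations(io_sheet, sim_sheet, 1000, 0.10, 0.05)` would fill column A of the simulations sheet and return the same values as a Python list.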

Transcript

  • 00:02: hey everyone
  • 00:03: nick derobertis here teaching you
  • 00:05: financial modeling so today
  • 00:07: we're going to be talking about how we
  • 00:08: can add monte carlo simulation
  • 00:11: to an existing excel model this is part
  • 00:14: of our lecture series on monte carlo
  • 00:16: simulation
  • 00:18: so this video is going to wrap up the
  • 00:21: lecture series we already talked about
  • 00:24: what monte carlo simulation is why we
  • 00:26: would want to do it
  • 00:27: looked at an example of running it on a new
  • 00:29: python model
  • 00:30: i did a more formal introduction of
  • 00:33: monte carlo and all the parts of it
  • 00:35: and the analysis of it and then went and
  • 00:38: applied it to an existing python model
  • 00:41: so all that's left is to apply it to an
  • 00:43: existing
  • 00:44: excel model so
  • 00:48: if you're thinking about just using pure
  • 00:50: excel
  • 00:51: for monte carlo simulations it is
  • 00:55: definitely a
  • 00:56: challenge there are add-ins
  • 00:59: which are able to do this for you but i
  • 01:02: don't know
  • 01:02: of any um good really flexible
  • 01:06: free add-ins for this mostly good ones
  • 01:09: you're gonna have to pay a substantial
  • 01:11: premium to get that add-on
  • 01:15: and without the add-on then with only
  • 01:18: excel
  • 01:19: pretty much you're going to be going to
  • 01:20: vba to complete this
  • 01:22: there are ways to hack it with data
  • 01:25: tables
  • 01:26: but it can get quite complicated to do
  • 01:29: that
  • 01:31: especially if you only have one or two
  • 01:33: inputs varying at a given time
  • 01:36: that's not too bad with a data table um
  • 01:40: but as soon as you want to change more
  • 01:42: than two then it starts to get
  • 01:44: uh to be quite a hacky kind of approach
  • 01:47: to make that happen with some kind of
  • 01:49: lookup in another table
  • 01:51: in order to make that happen
  • 01:55: and you might be able to hack it some
  • 01:58: way
  • 01:58: or you're going to be using vba
  • 02:01: or python so
  • 02:05: you know generally i would recommend
  • 02:08: just to use python to be able to
  • 02:11: run your monte carlo simulation in excel
  • 02:16: so you know we've already learned how to
  • 02:18: combine
  • 02:19: excel and python in the prior lecture
  • 02:22: series
  • 02:23: and so we can leverage that knowledge to
  • 02:25: take our excel model
  • 02:27: and use python to run monte carlo
  • 02:29: simulations
  • 02:30: on it
  • 02:33: and the process that we're going to
  • 02:35: follow there is
  • 02:37: extremely similar to the one that we
  • 02:39: just carried out
  • 02:40: in python all we're doing is
  • 02:43: changing the inputs running the model
  • 02:45: and storing the output
  • 02:47: each time the difference here is just
  • 02:50: that
  • 02:50: instead of running python code to um
  • 02:53: change the inputs and uh run the model
  • 02:57: we're going to use xlwings to take
  • 03:00: the
  • 03:00: inputs from python put them into excel
  • 03:04: and then get the result that we want
  • 03:06: from excel
  • 03:07: back into python
  • 03:11: um so same exact kind of flow
  • 03:14: but just having the excel model hooked
  • 03:16: up instead of the python core model
  • 03:20: and then so at the end of that process
  • 03:22: you'll have all your simulation results
  • 03:25: in
  • 03:25: python and it's up to you at that point
  • 03:28: whether you want to just go ahead and
  • 03:30: analyze them in python
  • 03:31: and then you'll do the exact same kind
  • 03:33: of analysis
  • 03:34: that we showed in adding
  • 03:38: a monte carlo simulation to an existing
  • 03:41: python model
  • 03:43: or you can take all those simulation
  • 03:45: results and output them back
  • 03:47: into excel and then do an analysis on
  • 03:50: them
  • 03:50: in excel
  • 03:54: so let's look at an example of how we
  • 03:57: would actually go about this
  • 04:01: so i've got the dynamic salary
  • 04:02: retirement model up here on the left and
  • 04:04: a fresh
  • 04:05: jupyter notebook up here on the right
  • 04:09: so you know this model is already set up
  • 04:12: so that everything flows through we give
  • 04:15: it different
  • 04:16: inputs and it's going to change the
  • 04:18: output
  • 04:20: so first thing we want to do is import
  • 04:23: xlwings as xw
  • 04:27: and we're going to need pandas as well
  • 04:32: so just add those imports there
  • 04:37: and you can look on the course site
  • 04:41: to see a fully built out example of this
  • 04:45: uh which has all the proper um
  • 04:49: explanations and formatting of
  • 04:50: everything
  • 04:52: but i'm going to go ahead and just get
  • 04:55: right to the code
  • 04:56: here so we're going to now use
  • 04:58: xlwings to get a connection
  • 04:59: to the workbook
  • 05:04: so this is dynamic salary
  • 05:07: retirement model i just realized that
  • 05:11: these are not in the same folder
  • 05:16: so let me
  • 05:22: let me move that into the same folder
  • 05:25: just do that over here
  • 05:26: on my other screen here
  • 05:29: give me a moment for that okay
  • 05:33: now they're in the same folder
  • 05:36: so that's the potential pitfall as you
  • 05:38: try to do this you want to make sure
  • 05:39: they're in the same folder or otherwise
  • 05:41: you're going to have to put the full
  • 05:42: file path
  • 05:44: of the
  • 05:47: excel model
  • 05:50: so this is uh copy
  • 05:53: two um
  • 05:58: copy two um
  • 06:02: file not found so i must not have gotten
  • 06:04: that name right
  • 06:05: dynamic salary retirement model
  • 06:08: copy two oh right i need to put the xlsx
  • 06:12: okay now i have the connection to the
  • 06:14: book and so now i can get
  • 06:17: the inputs and outputs sheet
  • 06:22: and ultimately we're going to use two
  • 06:23: different sheets so i'm going to call
  • 06:25: this io sheet
  • 06:28: and this is going to be book.sheets
  • 06:32: inputs and outputs to reference our
  • 06:35: inputs and outputs worksheet here
  • 06:39: um so
  • 06:42: now we're going to run a single
  • 06:46: simulation
  • 06:48: um so
  • 06:51: um all that we need to do to run a
  • 06:55: simulation in excel
  • 06:56: is change the input that's going to
  • 06:58: automatically trigger excel to
  • 07:00: recalculate the model
  • 07:02: and so then the output will change as
  • 07:04: well
  • 07:05: so let's look at just varying the
  • 07:08: interest rate
  • 07:08: so here in b10
  • 07:12: we have the interest rate so io sheet
  • 07:16: dot range uh b10
  • 07:20: value and let's just try it out by
  • 07:23: putting a value in there eight percent
  • 07:25: let's run that and we see this has
  • 07:27: updated to eight percent and the years
  • 07:29: for retirement
  • 07:30: as similarly updated so then the other
  • 07:33: side of this is then just
  • 07:34: getting that out the io sheet
  • 07:37: uh we want to get the output here b18
  • 07:40: is the output range so b18
  • 07:44: value and we can see that gets us the
  • 07:47: years to retirement
  • 07:49: so we can save that as years to retirement
  • 07:56: and that's basically it um you know
  • 07:59: we've got to add the random part to do
  • 08:00: the simulation but
  • 08:02: just you know running these two cells is
  • 08:04: how we can run the excel model from
  • 08:09: python um
  • 08:12: and then we're going to show in this
  • 08:15: example
  • 08:15: analyzing the outputs in excel so we'll
  • 08:17: ultimately need to get the outputs back
  • 08:19: to excel
  • 08:21: so i'm going to go and create a new
  • 08:23: worksheet here
  • 08:25: call this simulations
  • 08:29: and then i'm going to create
  • 08:32: a reference to that sim sheet
  • 08:34: book.sheets
  • 08:37: simulations um and now
  • 08:40: i can do the simsheet.range
  • 08:44: uh a1 value equals years to
  • 08:48: retirement
  • 08:50: and now we see that came into there so
  • 08:52: now we have recorded the result
  • 08:54: of that simulation back into excel
  • 08:59: so that's just a single
  • 09:02: run of the model not even really a
  • 09:04: simulation because
  • 09:06: this wasn't random but let's now
  • 09:09: go to uh running multiple simulations
  • 09:16: so we're gonna need the random module as
  • 09:20: well
  • 09:23: and let's put a mean of the interest
  • 09:26: five percent let's put a standard
  • 09:29: deviation of the interest three percent
  • 09:33: um and now we can do
  • 09:35: random.normalvariate
  • 09:38: to get a random interest rate drawn
  • 09:42: from a normal distribution so then the
  • 09:45: interest rate
  • 09:46: um we can see we run it multiple times
  • 09:49: we get different values of the interest
  • 09:51: rate
  • 09:56: so then
  • 09:59: we want to basically
  • 10:02: do this but in a loop over the number of
  • 10:04: iterations
  • 10:06: so i'm going to add a number of
  • 10:09: iterations
  • 10:10: as another variable there
  • 10:13: and then
  • 10:17: we're going to go through the range of
  • 10:20: the number of iterations
  • 10:23: and um we're going to get the interest
  • 10:27: rate
  • 10:29: and then we're going to
  • 10:33: put that interest rate into the model
  • 10:39: and then we're going to extract the
  • 10:41: years to retirement from the model
  • 10:45: and that would be running the
  • 10:48: simulations
  • 10:49: so then the other thing is just to save
  • 10:51: the results
  • 10:52: so all retirement years
  • 10:58: uh all retirement years dot append
  • 11:02: years to retirement
  • 11:09: so now i run this and we can see we get
  • 11:11: 10 different
  • 11:12: years to retirement and if you look
  • 11:14: over at the excel model while this
  • 11:16: happens you can see
  • 11:18: that the interest rate is changing
  • 11:19: around and it actually changes around
  • 11:21: more than you can even see because it's
  • 11:23: going really fast
  • 11:24: but you do see it changing around as we
  • 11:26: run this
  • 11:30: um so we want to bring these values back
  • 11:33: into excel
  • 11:35: um and if you recall we probably want
  • 11:38: these in a column that generally makes
  • 11:39: more sense in excel
  • 11:42: you recall we had this trick where we
  • 11:44: wrap each
  • 11:45: item into its own list in order to get
  • 11:48: it to output vertically
  • 11:50: so we can do vertical retirement years
  • 11:53: do list comprehension uh
  • 11:57: just putting a list around
  • 12:00: um each of the retirement years
  • 12:04: so that we have something that
  • 12:08: looks like that and now that um
  • 12:11: we're able to write back into excel
  • 12:16: um in a column format so i'm going to
  • 12:20: go to the same spot as before but i'm
  • 12:22: going to put the
  • 12:24: vertical retirement years and now you
  • 12:26: can see that
  • 12:27: each time i run this it's going to bring
  • 12:30: this into there
  • 12:31: so i run these two together we're going
  • 12:33: to get new simulation results
  • 12:35: coming in each time and back into excel
  • 12:40: so then it makes sense to wrap all this
  • 12:42: up in a function
  • 12:44: um so
  • 12:47: uh retirement simulations
  • 12:52: it takes the number of iterations the
  • 12:54: interest
  • 12:55: mean and the interest standard deviation
  • 13:01: does all this and then does this as well
  • 13:09: and we can have it also return the all
  • 13:12: retirement years just in case
  • 13:14: uh we later wanted to do analysis in
  • 13:17: python
  • 13:17: as well um and then
  • 13:21: we can do uh retirement simulations for
  • 13:24: the results let's
  • 13:25: go up to a thousand iterations this time
  • 13:26: with a
  • 13:28: 10% mean and a 5% standard deviation
  • 13:33: and then look at top 10 results
  • 13:36: and we'll see that run for a while we
  • 13:39: can
  • 13:40: um
  • 13:44: excel kind of froze up while it was
  • 13:46: running but now we can see that we have
  • 13:49: a thousand different results here from
  • 13:52: 1000 different simulations
  • 13:54: and we also have those same
  • 13:59: results in python as well
  • 14:05: so now we have our results in excel
  • 14:08: and we have them in python so you could
  • 14:10: go and do your analysis
  • 14:12: in either at this point but
  • 14:15: we've already seen how to do the
  • 14:16: analysis in python so i'm going to show
  • 14:18: doing the rest in excel
  • 14:22: so we have all these results here
  • 14:25: the first thing that we might want to do
  • 14:28: is
  • 14:29: a histogram to see the distribution of
  • 14:31: the results
  • 14:32: i just highlighted all of that i'm going
  • 14:35: to go
  • 14:35: and insert chart um
  • 14:38: and then i'm going to go to histogram
  • 14:42: and add the histogram and we can see the
  • 14:46: basic distribution here
  • 14:52: and
  • 14:56: see uh we can change the number of bins
  • 15:00: here
  • 15:01: uh generally better to have more bins
  • 15:04: for these simulations because you've got
  • 15:06: so many different
  • 15:07: cases
  • 15:11: so 100 is maybe too many bins because
  • 15:14: now it looks really sparse
  • 15:15: let me go with let's try 25 on that
  • 15:21: which looks a little bit more reasonable
  • 15:25: this would be um
  • 15:30: probability distribution
  • 15:38: of years to retirement
  • 15:45: okay um the next thing that we'll want
  • 15:47: to look at
  • 15:48: is the percentile table
  • 15:52: so in order to do that first you want to
  • 15:54: set up your
  • 15:56: uh percentiles so i'm just going to
  • 15:58: start with five ten percent and then
  • 15:59: i'll be able to drag for the rest of the
  • 16:01: range
  • 16:03: um this is going to be years to
  • 16:06: retirement
  • 16:09: and then excel has the percentile
  • 16:12: function which is like the quantile
  • 16:15: in pandas and then we're going to grab
  • 16:19: all that data and then the
  • 16:22: percentile is going to be the one which
  • 16:24: is there to the left
  • 16:26: and make sure that you fix the range on
  • 16:28: the
  • 16:29: data because you don't want that to move
  • 16:30: as you drag down but we do want the
  • 16:32: percentile to move
  • 16:35: so then we can complete that
  • 16:38: and we can see that it looks right um
  • 16:42: that you know five percent of the time
  • 16:43: we can retire in less than 20 years
  • 16:46: and 10 percent of the time it takes at least 40
  • 16:48: years
  • 16:52: and then the
  • 16:56: last thing that we can do here is get
  • 16:59: the probability of a certain
  • 17:01: outcome so
  • 17:04: for that then um we can recreate what we
  • 17:07: have done in pandas
  • 17:09: by um
  • 17:12: let's just say our objective is retiring
  • 17:15: in 25 years
  • 17:16: so objective
  • 17:19: 25 or objective
  • 17:22: years to retirement to be more clear
  • 17:31: um and then
  • 17:34: we just do equals if uh remember we want
  • 17:38: to check
  • 17:38: did the simulation meet the condition
  • 17:41: so is the years to retirement
  • 17:47: less than the objective
  • 17:50: years to retirement that means we met the
  • 17:51: objective and make sure we fix
  • 17:53: that objective um
  • 17:57: and if we met the objective we get a 1
  • 17:59: otherwise we get a zero
  • 18:01: and then we can just complete that for
  • 18:03: all the results we can see whenever it's
  • 18:05: less than 25
  • 18:07: um well really it should be less than or
  • 18:10: equal to
  • 18:12: because 25 is also fine um
  • 18:17: so yeah anything which is less than or
  • 18:19: equal to 25 is now showing up as a one
  • 18:21: and anything greater is zero so then
  • 18:26: the probability
  • 18:33: of uh years to retirement
  • 18:38: less than or equal to the objective
  • 18:42: is going to equal the mean sorry it's
  • 18:46: average in excel
  • 18:48: average of this column that we just
  • 18:50: created
  • 18:53: so we get a 33% chance
  • 18:56: that we're going to be able to retire in
  • 18:58: 25 years
  • 18:59: or less so that's
  • 19:03: the basic monte carlo
  • 19:06: analysis in excel
  • 19:09: so that wraps up our example on how to
  • 19:13: add monte carlo
  • 19:14: simulation to an existing excel model so
  • 19:18: thanks for listening and see you next
  • 19:22: time
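
The loop, the column trick, the percentile table, and the probability of an objective described in this lecture can be sketched in plain Python. This is only an illustration under stated assumptions: `years_to_retirement` is a hypothetical stand-in for the Excel model (its savings and goal figures are invented), the Excel read and write steps are left out because they need a live workbook, and `percentile` is intended to mirror the linear interpolation of Excel's PERCENTILE rather than call it.

```python
import random


def years_to_retirement(interest_rate, annual_savings=15_000,
                        desired_cash=1_500_000, max_years=100):
    """Hypothetical stand-in for the Excel model (figures are assumptions)."""
    wealth = 0.0
    for year in range(1, max_years + 1):
        wealth = wealth * (1 + interest_rate) + annual_savings
        if wealth >= desired_cash:
            return year
    return max_years  # cap so very low or negative rates still terminate


def retirement_simulations(num_iter, interest_mean, interest_std, seed=None):
    """Draw a random interest rate per iteration, run the model, save the result."""
    rng = random.Random(seed)
    all_retirement_years = []
    for _ in range(num_iter):
        interest = rng.normalvariate(interest_mean, interest_std)
        all_retirement_years.append(years_to_retirement(interest))
    return all_retirement_years


def percentile(values, p):
    """Linear-interpolation percentile for p between 0 and 1."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def probability_of_objective(values, objective):
    """Average of a 1/0 column: 1 when the result is <= the objective."""
    met = [1 if value <= objective else 0 for value in values]
    return sum(met) / len(met)


results = retirement_simulations(1000, 0.10, 0.05, seed=0)
# Wrap each item in its own list so it would be written as a column in Excel.
vertical_retirement_years = [[years] for years in results]
# Percentile table at 5%, 10%, ..., 95%.
percentile_table = {p / 100: percentile(results, p / 100) for p in range(5, 100, 5)}
chance = probability_of_objective(results, 25)
```

The numbers this produces will not match the lecture's, since the stand-in model is not the Excel model; the point is only the shape of each step.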

Relationship of Inputs and Outputs in Excel Monte Carlo Simulation


Notes

  • This continues from the prior lecture: we keep the inputs associated with the outputs in the Excel output, and then analyze how the inputs relate to the outputs

  • It is easier to write a DataFrame into Excel, as that keeps the inputs and outputs together

  • We create scatter plots and run a multivariate regression, just as in Python

  • You may need to enable the Analysis ToolPak add-in in Excel to get access to multivariate regression
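
As a rough sketch of keeping the inputs associated with the outputs, each iteration can save the drawn interest rate next to the resulting years to retirement. The code below is an assumption-laden illustration: `years_to_retirement` is a hypothetical stand-in for the Excel model, and `to_table` is a made-up helper that builds plain rows in the same shape a DataFrame written without its index would take (a header row followed by one row per simulation).

```python
import random


def years_to_retirement(interest_rate, annual_savings=15_000,
                        desired_cash=1_500_000, max_years=100):
    """Hypothetical stand-in for the Excel model (figures are assumptions)."""
    wealth = 0.0
    for year in range(1, max_years + 1):
        wealth = wealth * (1 + interest_rate) + annual_savings
        if wealth >= desired_cash:
            return year
    return max_years  # cap so very low or negative rates still terminate


def retirement_simulations(num_iter, interest_mean, interest_std, seed=None):
    """Save the input (interest) together with the output (years) each iteration."""
    rng = random.Random(seed)
    all_data = []
    for _ in range(num_iter):
        interest = rng.normalvariate(interest_mean, interest_std)
        years = years_to_retirement(interest)
        all_data.append((interest, years))
    return all_data


def to_table(all_data, columns=("Interest", "Years to Retirement")):
    """Header row plus one row per simulation, with no index column."""
    return [list(columns)] + [list(row) for row in all_data]


all_data = retirement_simulations(10, 0.10, 0.05, seed=0)
table = to_table(all_data)
# In the lecture these rows become a pandas DataFrame that is written to the
# sheet without its index; that step is omitted here because it needs a live
# Excel workbook.
```

Because every row carries both values, the scatter plot and regression can be run directly on the two columns.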

Transcript

  • 00:03: hey everyone this is nick derobertis
  • 00:05: teaching you financial modeling
  • 00:06: today we're going to be talking about
  • 00:08: how to analyze the relationship
  • 00:11: between inputs and outputs in our model
  • 00:14: using monte carlo simulation in the
  • 00:17: context
  • 00:17: of an existing excel model
  • 00:20: this is part of our lecture segment on
  • 00:23: monte carlo simulation
  • 00:26: so we left off last time
  • 00:30: we were working on this example of how
  • 00:34: to run monte carlo simulation on an
  • 00:36: existing excel model
  • 00:38: and we went ahead and got it to where we
  • 00:40: were able to run the simulations
  • 00:42: and output the results of those
  • 00:44: simulations into excel
  • 00:46: and be able to get the probability of a
  • 00:50: particular objective
  • 00:52: a histogram of the distribution
  • 00:56: and a table of the percentiles of the
  • 00:59: distribution so definitely watch that
  • 01:02: prior video
  • 01:04: before coming to this one what we're
  • 01:06: doing in this video
  • 01:07: is we're going to modify the simulation
  • 01:10: a little bit so that it keeps
  • 01:12: the interest rate associated
  • 01:16: with the uh years to retirement
  • 01:20: and that will allow us to then
  • 01:25: analyze the relationship between the
  • 01:27: interest rate
  • 01:28: and the years to retirement
  • 01:32: so i'm going to come over to here um
  • 01:36: and we already have this function
  • 01:39: which is getting us the random interest
  • 01:42: rate putting into the model
  • 01:43: getting the result from the excel model
  • 01:45: which has recalculated at that point and
  • 01:47: saving it
  • 01:48: so what we need to do is
  • 01:53: now we're gonna save not just the years
  • 01:54: to retirement but also
  • 01:57: the um interest rate as well
  • 02:02: so i'm going to rename this list to all
  • 02:04: data so it's more indicative of what
  • 02:07: we're doing
  • 02:08: and then here i'm going to append not
  • 02:11: just the years to retirement but also
  • 02:13: the
  • 02:13: interest rate um
  • 02:18: and let me just quickly grab that logic
  • 02:25: that we now
  • 02:29: have those results in a list here
  • 02:33: and then what we can do is we can create
  • 02:35: a data frame
  • 02:36: from those results so data frame
  • 02:40: and then the columns are going to be
  • 02:44: interest and years to retirement
  • 02:53: and then we have that data frame
  • 02:57: um and then instead of
  • 03:00: i'm gonna go ahead and move this over
  • 03:03: one
  • 03:04: um instead of writing using the
  • 03:08: column and uh you know list within list
  • 03:11: approach we can actually just
  • 03:13: write the data frame back into
  • 03:16: excel um
  • 03:19: so then we're going to do uh this same
  • 03:23: assignment but we're going to
  • 03:28: assign the data frame instead so
  • 03:35: but i know if i do this right now it's
  • 03:36: going to bring the index over and so
  • 03:37: it'll be three columns and it's going to
  • 03:39: overwrite what we have here
  • 03:42: so i'm going to also put options
  • 03:45: data frame uh index equals false
  • 03:50: um and then we should get it coming in
  • 03:54: it will have the headers um but that's
  • 03:57: that's fine
  • 03:58: so let's give that a try um
  • 04:01: and now we do see coming over here what
  • 04:04: we expected to see
  • 04:05: of course we only have it at 10
  • 04:07: iterations right now instead of the full
  • 04:09: um instead of the full thousand
  • 04:14: now we can see that works so let's bring
  • 04:17: that back into our function
  • 04:19: rather than what we had before
  • 04:24: of doing the list within list approach
  • 04:28: so bring all that into here and then we
  • 04:32: can return the data frame instead
  • 04:39: and now we should be able to run this
  • 04:41: thousand simulations
  • 04:43: ten percent mean five percent interest
  • 04:46: or
  • 04:46: standard deviation and
  • 04:50: we want to see uh
  • 04:53: the first few results out of that so
  • 04:56: let's give that a try
  • 04:58: it's going to take a little bit to run
  • 05:00: through the model
  • 05:05: and oh i didn't redefine this that would
  • 05:11: cause it to not actually change
  • 05:15: i've accidentally got this still in here
  • 05:18: definitely don't want that
  • 05:21: that was the old code so now let's
  • 05:24: redefine this okay now let's try this
  • 05:27: again
  • 05:29: um so again it's going to take a little
  • 05:31: bit to run
  • 05:32: but now we do see all the interest rates
  • 05:35: associated with the years to retirement
  • 05:37: coming in
  • 05:39: and now we can modify this
  • 05:42: because now the years to retirement have
  • 05:44: moved
  • 05:46: and so we can get back to our
  • 05:48: probability of achieving the objective
  • 05:56: and so now and then this also we want to
  • 06:02: move over
  • 06:05: and same thing with the percentile
  • 06:07: because we did have to move
  • 06:09: um that column
  • 06:13: so just uh carrying those results all
  • 06:16: over
  • 06:17: great so
  • 06:20: we have everything we need in excel now
  • 06:23: let's
  • 06:23: go to do the analysis of how the inputs
  • 06:26: relate to the outputs
  • 06:28: so the first thing that you want to do
  • 06:30: is a scatter plot so we can just
  • 06:32: highlight
  • 06:32: all of these data
  • 06:36: back up to the top and we want to insert
  • 06:41: and we can see that the scatter plot
  • 06:43: comes up as the first
  • 06:45: recommended chart there so let's add
  • 06:47: that
  • 06:48: and we can definitely see a very clear
  • 06:50: relationship here
  • 06:51: between the years to retirement and
  • 06:55: the interest rate
  • 07:00: it also highlights that we have some
  • 07:01: issues here we are getting some
  • 07:03: negative interest rates and so we would
  • 07:05: probably want to
  • 07:07: deal with that in the model as well
  • 07:11: um let's come over to python to quickly
  • 07:13: fix that
  • 07:17: so what we can do is instead of
  • 07:20: random.normalvariate we can
  • 07:28: call that but um we're going to do
  • 07:32: while the value is less than zero we're
  • 07:34: going to keep drawing more
  • 07:35: values um and initialize the value at a
  • 07:40: negative number
  • 07:43: and we can call this random normal
  • 07:45: positive you can see i explained this
  • 07:47: in more detail
  • 07:49: in the adding uh
  • 07:52: monte carlo simulation to a python model
  • 07:54: video
  • 07:55: uh we built the same function over
  • 07:58: there
  • 07:59: um and then that can take the mean and
  • 08:02: the
  • 08:03: standard deviation
  • 08:08: and then finally return the value
  • 08:13: so then we can use that instead of
  • 08:15: random.normalvariate
  • 08:18: redefine that let's go ahead and try
  • 08:21: this again
  • 08:22: and then after this finishes we'll see
  • 08:25: the numbers update
  • 08:26: yep and now we see that we don't have
  • 08:28: those negative numbers in the
  • 08:29: distribution
  • 08:30: so doing plots of your output is a great
  • 08:33: way to
  • 08:34: check and understand everything that's
  • 08:36: going on in your
  • 08:38: simulations so now we can see there's a
  • 08:41: clear relationship
  • 08:42: just from the scatter plot as the
  • 08:44: interest rate goes up years to retirement
  • 08:46: goes down
  • 08:46: and it's non-linear it's
  • 08:49: a steeper decrease at first and then it
  • 08:51: flattens out as you get to higher
  • 08:53: interest rates
  • 08:55: but we can quantify this relationship
  • 08:59: using the regression so
  • 09:02: regression in excel is going to live on the
  • 09:04: data tab
  • 09:05: and then it would be over here
  • 09:09: you can notice that i have nothing over
  • 09:10: here right now and that's because i need
  • 09:12: to enable
  • 09:14: the add-in or the data analysis tool
  • 09:16: pack
  • 09:17: for that to show up so you might already
  • 09:20: have
  • 09:20: data analysis showing up over here but
  • 09:22: if you don't
  • 09:23: it is built into excel you just have to
  • 09:25: enable it so in order to do that you do
  • 09:28: file and then options
  • 09:32: and then add-ins
  • 09:36: um and then you want to manage excel
  • 09:38: add-ins
  • 09:40: and then
  • 09:43: that's where it does say that i have it
  • 09:45: enabled
  • 09:47: let me try disabling it and then
  • 09:49: re-enabling it to see
  • 09:51: if that will allow it to come up um
  • 09:56: hopefully this will come up okay good
  • 09:59: good so now we have the data analysis
  • 10:02: section
  • 10:02: showing up over here and so we can go to
  • 10:06: do
  • 10:06: our regression so just click data
  • 10:10: analysis it brings up a lot of options
  • 10:11: the one we want to use here
  • 10:13: is regression and then it's going to ask
  • 10:15: for your y's and your x's
  • 10:17: so the y is always going to be
  • 10:20: whatever your output is here years to
  • 10:23: retirement
  • 10:25: and the x you can add multiple x
  • 10:28: variables
  • 10:29: and you should if you're changing
  • 10:31: multiple things in your simulation
  • 10:33: here we just have one variable so i'm
  • 10:35: going to add that
  • 10:38: and you'll notice with both of these
  • 10:39: that i picked up the
  • 10:41: label as well and so i'm going to check
  • 10:43: that they have labels and that will
  • 10:44: allow that to come through into the
  • 10:46: regression results as well
  • 10:49: and i'm going to output that analysis
  • 10:53: um right here you can also put it on a
  • 10:56: new sheet
  • 10:57: if you'd like um and then hit okay to
  • 11:01: run it
  • 11:03: and then we see the regression output
  • 11:05: coming up here
  • 11:08: um so then we
  • 11:11: um get the result here of a negative
  • 11:15: 95.8 coefficient with a very low
  • 11:17: p value so it definitely is
  • 11:19: significantly related
  • 11:22: and negative 95.8 that's saying a one
  • 11:25: unit increase
  • 11:26: in the interest rate decreases years to
  • 11:28: retirement by 95.8 years
  • 11:31: now you might say whoa that's huge why
  • 11:32: is that so huge
  • 11:34: that's because a one unit increase here
  • 11:36: is going from
  • 11:37: zero to one hundred percent interest so
  • 11:40: that's obviously uh
  • 11:42: much larger than a realistic interest
  • 11:44: rate change
  • 11:45: uh so to get it to a one percent
  • 11:48: increase in interest rate which is a
  • 11:50: more reasonable thing to talk about we
  • 11:51: just divide by 100
  • 11:53: um and so that would be a one percent
  • 11:56: increase in interest rate
  • 11:58: decreases years to retirement by almost
  • 12:00: a year
  • 12:03: um so
  • 12:06: that um we can use that to interpret it
  • 12:10: um and then you know if you had
  • 12:13: other inputs they would just show up as
  • 12:15: additional lines here and you could
  • 12:17: interpret those
  • 12:17: coefficients in a similar fashion
  • 12:21: um and then the one other part of the
  • 12:24: analysis
  • 12:25: is when you do have multiple inputs you
  • 12:27: can't just directly compare the
  • 12:29: coefficients to determine
  • 12:31: which is the most impactful you have to
  • 12:34: compare these standardized coefficients
  • 12:36: and to get the standardized coefficients
  • 12:39: um
  • 12:40: so you would do this for each one of
  • 12:42: your um
  • 12:44: you don't care about the intercept uh
  • 12:47: you would do this for
  • 12:48: each one of uh your inputs
  • 12:52: you would go and you would calculate the
  • 12:55: standard deviation
  • 12:57: uh of that input
  • 13:02: and then
  • 13:06: the standardized coefficients so this
  • 13:08: would be standard deviation
  • 13:10: and then standardized coefficients
  • 13:16: so again interest it's just going to be
  • 13:20: standard deviation multiplied by the
  • 13:22: coefficient
  • 13:24: so that's now saying that a one standard
  • 13:26: deviation increase
  • 13:28: in the interest rate leads to
  • 13:32: uh is associated with a decrease in
  • 13:35: years to retirement of
  • 13:37: almost four and a half years and a one
  • 13:39: standard deviation increase in interest
  • 13:41: rate
  • 13:41: is close to five percent
  • 13:45: um and then when you have other
  • 13:46: coefficients here you can just
  • 13:48: pick the largest in absolute value and
  • 13:50: those are going to be the ones
  • 13:52: which are the most impactful inputs in
  • 13:54: your
  • 13:55: model so that shows
  • 13:58: how we can do this analysis of the
  • 14:01: relationship between the inputs and
  • 14:02: outputs
  • 14:03: in excel and then
  • 14:06: to wrap up all this material on monte
  • 14:09: carlo simulation
  • 14:11: there's also a lab exercise here on
  • 14:14: doing this process for
  • 14:16: your project one model in excel
  • 14:20: it's going to be very similar to the
  • 14:23: python
  • 14:25: project one extension exercise that was
  • 14:27: mentioned
  • 14:28: in the prior video on extending the
  • 14:31: dynamic salary retirement model in
  • 14:32: python
  • 14:34: um you're just going to go through this
  • 14:36: process of adding
  • 14:38: a monte carlo simulation to your excel
  • 14:40: model
  • 14:42: so here in the level one you're going to
  • 14:44: be varying the number of phones
  • 14:47: um and then analyze the results
  • 14:50: table of probabilities um chance of
  • 14:53: reaching 800 million npv
  • 14:55: et cetera and in level two then you're going to
  • 14:58: do the same
  • 15:00: keep that varying as it was but also
  • 15:02: vary the
  • 15:04: lifespan of the machines
  • 15:08: and then in addition to
  • 15:11: the visualization we just talked about
  • 15:13: then you want to go through this
  • 15:14: analysis
  • 15:15: of what's the relationship between the
  • 15:18: inputs
  • 15:19: and the outputs so
  • 15:23: that wraps up this segment on monte
  • 15:25: carlo
  • 15:26: simulation so thanks for listening and
  • 15:28: see you next time
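
The three calculations from this lecture (redrawing until the interest rate is not negative, regressing the output on the input, and scaling the coefficient by the input's standard deviation) can be sketched with the Python standard library. This is a hedged illustration, not the lecture's exact code: `years_to_retirement` is a hypothetical stand-in for the Excel model, `simple_regression` handles only a single input rather than the multivariate regression of Excel's add-in, and the sample standard deviation is an assumption about which one the lecture used.

```python
import random
import statistics


def random_normal_positive(mean, std, rng=random):
    """Redraw from the normal distribution until the value is not negative."""
    value = -1.0  # start negative so the loop draws at least once
    while value < 0:
        value = rng.normalvariate(mean, std)
    return value


def years_to_retirement(interest_rate, annual_savings=15_000,
                        desired_cash=1_500_000, max_years=100):
    """Hypothetical stand-in for the Excel model (figures are assumptions)."""
    wealth = 0.0
    for year in range(1, max_years + 1):
        wealth = wealth * (1 + interest_rate) + annual_savings
        if wealth >= desired_cash:
            return year
    return max_years


def simple_regression(x, y):
    """Ordinary least squares with one input: returns (intercept, coefficient)."""
    x_mean, y_mean = statistics.mean(x), statistics.mean(y)
    covariance = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    variance = sum((xi - x_mean) ** 2 for xi in x)
    coefficient = covariance / variance
    return y_mean - coefficient * x_mean, coefficient


def standardized_coefficient(coefficient, x):
    """Coefficient times the input's sample standard deviation."""
    return coefficient * statistics.stdev(x)


rng = random.Random(0)
interest_rates = [random_normal_positive(0.10, 0.05, rng) for _ in range(1000)]
years = [years_to_retirement(rate) for rate in interest_rates]

intercept, coefficient = simple_regression(interest_rates, years)
per_one_percent = coefficient / 100  # effect of a one percentage point increase
standardized = standardized_coefficient(coefficient, interest_rates)
```

With several inputs, the standardized coefficients (rather than the raw ones) are what would be compared in absolute value to rank which input is most impactful.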