Data Pipelines¶
Covers how to collect, clean, and structure the data for your model in an automated way, using Python.
Creating a Data Loading Module for Capital IQ Debt Details¶
Notes¶
For the DCF model we have explored using the market value of individual debt instruments to estimate the market value of debt for the whole company
For Project 3, we looked at using Capital IQ as the source of those debt details
Here I examine how to automate the loading and cleaning of the debt details from Capital IQ, so they are ready to be worked with in the model
I also look at taking the resulting code and making a Python module, so it can be reused in future models
To clean up the data, I use a combination of Pandas methods, string methods, and regular expressions (regex)
Transcript¶
- hey everyone this is nick dear burtis
- teaching you financial modeling
- today we're going to be looking at how
- to load in capital iq
- details into python to work with them
- and also how to
- structure our code into a python module
- such that this code can be
- reused anytime that we want to load in
- debt details from capital iq
- regardless of the model this is part of
- our lecture segment
- on data pipelines so
- the data that we're looking at loading
- in here
- comes in the form of this excel
- spreadsheet
- and within that spreadsheet we have
- basically some
- header area which doesn't really matter
- to us
- we don't care about that what we want to
- get out of this
- is this table which in which is in here
- and there's a few things that we're
- going to have to do in order to clean up
- this data so we can
- actually work with it in our model
- before we get to that
- just want to note that there is
- typically a second table
- in here that represents older data so we
- don't want to use
- any of the data from that table only the
- top table
- and then some of the issues that we're
- going to have to deal with is the fact
- that
- these coupon rates actually have the
- percentage sign
- in the string it's not
- just formatted as a percent it actually
- has the percent sign in there
- uh so when we load it into python we
- will also see that percent sign in there
- and the um floating rates
- we have live or plus some rate
- so we're going to want to be able to
- take that and
- plug in a libor rate and get what the
- resulting rate is
- and then we also have these coupon rate
- ranges
- where for the purposes of our model
- we'll just take the midpoint of that
- range
- and use that as the coupon for the bond
- then one other thing is that in order to
- do value calculations with these bonds
- we're going to want to get
- uh that into a number of years or
- maturity
- and so we're going to be converting
- these dates into a number of years
- from today um and then
- some of the maturities only give a year
- and so we're going to assume those are
- at the end of the year
- and handle those that way
- so that's an overview of all the issues
- that we face
- in loading this in so now let's look at
- some python code
- to make that happen and um
- throughout doing this then we're going
- to try to make sure that it's standard
- and reusable such that even though this
- is for walmart we can then do the same
- thing
- or we can then use the same code to load
- any other company's
- details and have them all cleaned up and
- ready to use for our model
- so coming over to the jupiter notebook
- we've talked about two approaches in the
- course which will be useful
- to load this information in we can load
- it in using pandas
- or we can use excel wings
- i will generally go to pandas for any
- usage that
- doesn't require back and forth between
- python
- and excel wings for the most part unless
- i'm just trying to grab a single value
- out of excel then sometimes i'll still
- use excel wings
- but pandas generally is a lot more
- compatible
- you can run the code on linux or
- somewhere else that excel is not
- installed
- and still get the same result so i
- generally will prefer pandas and that's
- what i'll use here in this video
- so import pandas and then i'm going to
- load uh
- the file into a data frame
- the walmart that details i have it here
- in the same folder
- um and then we'll see what we get
- so we can see right away that this does
- not look good right we got
- a bunch of junk in here this is not at
- all the table that we wanted to get
- we've got all these on names we got a
- bunch of missing values in here
- um basically what we're seeing here
- corresponds to
- this top part where indeed there are no
- column headers and we've just got some
- junk up here we don't care about
- the actual table here starts in row
- 13. so
- one thing i can do to
- cut off this entire top area is there a
- lot of
- options for the read excel command as
- you can see
- um one such of those options is skip
- rows
- so we can tell it we want to skip
- 12 rows while loading it in and just
- start from the 13th row which is where
- we know the data starts
- so now we can see that we got the actual
- names of the columns in that table that
- looks good and it looks like we got the
- values of that table as well
- um now what we can also see is we have
- you know these last two rows are junk so
- let's say um
- you know 127 rows counting the zero
- and then let's hop over and look at how
- many rows
- are in this top table
- and we have 62 rows so clearly we're
- picking up
- this bottom table as well which we don't
- want
- [Music]
- so there's a number of ways that we can
- choose to
- handle that
- [Music]
- i one thing that we can do
- i see
- that well let's look at how the data
- frame looks
- in that middle area so
- of course the with the data frame
- display the middle gets cut out
- so we can use iloke to
- look at some range of rows so we know
- that was
- 63 rows so let's just look at row 60 to
- 70.
- um so now we can clearly see here
- the table ending and then we can see
- a number of um blank rows
- um and then we have the next table
- starting
- um so there's a number of ways
- we can deal with this to be able to say
- let's cut off everything after
- the table the first table
- so ultimately it's really easy to drop
- rows which have all missing values
- so i'm not too worried about these ones
- once i drop the missing values these
- will be gone
- um it's once we're getting to here
- and we want to try and pick something
- that's going to be consistent
- as we load in data for different
- companies or different time periods
- so you know this is the first kind of
- distinctive marker that we see
- for the second table but we can see this
- is very specific to the date
- so perhaps not a good choice to try and
- target
- now we do have this description here in
- the next row
- signifying that we're starting the next
- table
- so that seems like an appropriate thing
- to target
- um because we can assume that that is
- going to be the same
- for other time periods and other
- companies as well
- um so we can try to target
- that row
- so that row is going to be
- what is going to be where the
- description
- is equal to description
- right um so by that
- we're now able to target this second
- header row
- um and that way we know we've gotten
- into the next table
- but we don't want to cut off just that
- we want to start cutting off
- from the row before um
- so if we
- take the index from that we get
- row 67
- [Music]
- we pull the first value out of that we
- get actually the number 67
- and then if we subtract 1 now we get row
- 66
- which is the first row that we don't
- want to keep
- so there then we can
- [Music]
- say that that's our
- fiscal year row index
- because now we're talking about row 66
- here
- um and now we can
- [Music]
- do we can get all of the data frame up
- until
- that fiscal year row index
- so now we'll have those blanks on the
- end but we do not have that fistful
- year row and we do not have the second
- table
- at all
- so that's a good step we've removed that
- second table
- but now we also want to remove these
- blank rows thankfully pandas has
- a convenient method for this drop n a
- um but we want to make sure that we're
- still keeping
- these even if there's no floating rate
- it can still be a valid
- row so we only want to um drop
- when all are missing so we put how
- equals all
- and now we can see that those other rows
- get dropped as well and yet we are
- keeping the
- rows which just have some missing values
- so we can do df equals
- that in order to save it back into the
- result
- um and that way we've got
- now just the first table
- so
- just kind of looking through um one
- other issue that we see in here
- is we have these dashes um
- and pandas this nan represents a missing
- value
- a dash represents that's actually the
- string dash
- and so if we try and do any math with
- these columns which have the dash in
- them
- it is not going to work for example
- so this is the principal do usd column
- so if we try to sum that we're going to
- get an error
- and it's actually going to say you can't
- add a float in a string and that string
- that it's talking about
- is that dash which is in there
- so in order to fix that we can use
- replace
- on the data frame so
- with this same look at the data
- if we do a replace of dash
- with uh we do have to go import numpy as
- mp
- and have to uh
- run that
- um then
- uh the missing value that pandas uses is
- numpy dot nan
- so we're replacing the dash which we're
- seeing here in row
- 16 with a nand let's see how that looks
- now row 16 we have a man in each of
- those spots which previously had dashes
- so let's save that back into the data
- frame
- so df equals df.replace
- um
- let me just load it and then clean it
- again
- so now um that looks good we have the
- right number of rows
- and now let's try to do this sum again
- now the sum actually
- works because we replace the dash with
- the missing
- um so i think
- this represents a nice concrete step
- in the process um so let's go ahead and
- wrap that up
- into a function so
- um load details
- we can give that a file path
- um and it's going to
- use read excel and do all these other
- things
- which we just did
- and ultimately return the data frame
- instead of
- that specific file we're going to use
- the file path
- um so now we can do df equals load
- that details on that file path
- and we should have the same thing which
- we do
- great so now we have to start tackling
- some of those other cleanup pieces that
- i mentioned
- so
- let's next tackle um
- the um
- coupons so
- for the coupons we have
- a couple things to deal with let's look
- back at the
- sheet over here so we have
- the fact that we have this percent sign
- on everything and then we also have
- these
- ranges to deal with
- so there's a lot of different ways to
- tackle
- this problem um
- you know you could just uh
- use like string methods like we can
- um we have a string
- 4.0 percent we can do a strip
- and strip off the percent sign and then
- you're left without it
- and then you can take float of
- all that and we get back to the number
- and then divide by 100
- and we get that actual um percentage
- so we can use this kind of approach but
- this is not going to work
- for the range uh where we have the range
- like 4.5 percent to 5.5 percent
- um so we could use similar
- approaches along with split split
- will let us take this string and split
- it into a list
- on a delimitator so here we're putting
- the space dash space as a delimiter and
- now we have these two
- uh coupon rates which we can do that
- kind of approach with
- and then um take the midpoint
- of the range um
- so that works perfectly fine but then
- you have to
- come up with a way to detect which of
- these two styles of strings you got
- um you could do a split and then check
- the length
- uh because if you just split that you're
- gonna
- get a list of length one if you split
- that you have a list of length two so
- you could
- compare off that to make the distinction
- um so that's one possible approach
- um the approach that i'm going to look
- at
- in this video is actually using regular
- expressions
- so regular expressions are kind of a
- mini language within python
- and they also exist in many other
- programming languages
- and actually have nearly the same syntax
- across
- different programming languages and
- they're a way of
- parsing strings they sleela you match
- any kind of pattern and strings and
- extract any parts out of them
- that you want the downside being that
- they do tend to be a little bit
- complicated to write
- and read because they're their own
- little mini language um
- but they're extremely powerful so i do
- recommend
- um taking some time to try and learn
- them
- and i think i will be releasing at some
- point in the future a video
- dedicated to a regular expression so i'm
- not going to spend a whole lot of time
- on them here
- but we want to write a regular
- expression to be able to match
- this pattern of a percent
- and a dash another percent and then
- extract the two numbers
- out of that um
- so in order to write regular expressions
- i like to use
- a little tool to help me in that
- tool so reg x is is the abbreviation for
- regular expressions
- and regexpal is the tool
- that i like to use to help out
- so this tool is nice because you put
- your whatever string you're trying to
- match
- down here and then you start typing your
- regular expression
- up here and
- it will show what you're matching so
- again this is not
- a video to teach you regular expressions
- so the syntax is
- um you know not going to make any sense
- to you but
- you'll see kind of how this works um
- so backslash d will match numbers
- and so now you can see it's highlighting
- the numbers here
- and if i put a dot
- then you can see it's matching the dots
- as well number
- and then dot
- well actually uh dot means any single
- character
- if i escape it with a backslash then it
- means the literal dot
- um and when you highlight um
- over this and regexpal it also explains
- to you what you have here
- and so we then
- [Music]
- can make a character group out of this
- and then say that it repeats one or more
- times
- and now we're matching uh you know
- any number here regardless of how many
- decimal places are there on it
- and then we're going to have a percent
- sign and then space dash space
- and then uh you know basically the same
- thing
- over here yeah entire same thing over
- here
- um
- so what am i missing here
- there we go and then percent and now
- it's matching clearly
- so this is our regular expression
- and then one more thing that we want to
- do is put groups in here
- and groups are going to allow us to
- extract certain parts
- of that string so ultimately you want to
- extract
- this here and this here so let's put
- those into groups with parentheses
- um so that whole thing is grouped not
- the percent sign
- and over here uh same thing not the
- percent sign so now we have our regular
- expression and we can bring that
- back over into python
- so this would be
- the range pattern
- um and i'm going to import re for
- regular expressions
- and it's going to be re.compile is what
- we're going to do here
- again i'm really glossing over the
- regular expression stuff i plan to do a
- whole video on that later
- but you're just seeing quickly how it
- can be used here
- so then we have uh our string
- so let's just find that as a and then
- uh or let's say that's coupon string
- give it a better name
- um so then we can do
- uh range pattern dot match
- on the coupon string and
- when we get a match we're gonna see
- something like that
- um when we don't
- uh then you're going to get none
- as your result um
- so what you can do
- is then say that's the match
- um and then if there's a match um
- then we can extract the result from it
- so then
- um the bottom percent uh
- is going to be match.group uh one group
- zero is the entire thing group one is
- the first spot you put parentheses
- um and then the top percent is going to
- be match.group
- two um and let's just make sure that we
- have those
- uh correct yep we got those
- and we can then take the float
- of each yep so now we have the numbers
- and then the midpoint is going to be
- equal to
- the top percent plus the bottom uh
- percent divided by two
- so the midpoint we should get uh five
- for that which we do
- so now we can make this into a function
- um so we can call this standardized
- fixed
- coupons fix the coupon string as an
- argument
- okay
- and so we're going to return the
- midpoint in that case
- but what about in the case where we
- don't have a match so we basically have
- two alternatives here
- and the other alternative is this where
- um we already looked at
- um for that we just do this kind of
- approach so
- um if it's not a match then we're going
- to
- turn this on the coupon string
- let's just save the divided by 100 for
- later
- so now we have a function where we can
- give it say 2.5 it gets the 2.5 out of
- that or we can give it
- a range and it's going to get the
- midpoint
- of that range
- so now we can
- [Music]
- apply this to the correct column
- so that's the coupon base right column
- so we're going to apply standardizing
- the fixed coupons and we get an
- error um
- [Music]
- expected string or bytes like object
- um so
- [Music]
- i'm just going to quickly look at what
- that error was
- using the built-in debugger uh and we
- can see it's because we're getting a
- nand here there is a missing value so we
- do have to handle that
- in here as well if pd dot
- is null coupon string
- then we want to return the coupon string
- okay now this is working appropriately
- okay um so
- we can
- [Music]
- um then have this
- um as part of our
- cleanup the coupon base rate
- we're going to apply this
- okay so um
- now we want to handle next
- um let's look at the floating rate
- coupons next so for the floating rate
- we have a similar thing here
- where for the floating rate
- most the time here we just have this
- um n a and
- that has been converted to nam over here
- so
- it already has missing values
- appropriately
- but what we're going to need to handle
- is these libor plus
- so we can use a similar regular
- expression
- approach so i'll bring that over to
- regex pal
- um let's now put
- that as our test string and so what
- we're looking for
- is the string libor plus
- but plus is a special character in regex
- so you gotta put a backslash
- um and then we're going to have numbers
- as well as uh decimals and
- uh one or more of those and
- then a percent sign so that's our
- regular expression
- for that um
- so that's going to be the libor pattern
- okay and i'll move this up here as well
- since it's also done
- um and now we
- can use a similar approach so libor
- let's
- [Music]
- give that example coupon string
- live or plus
- and try a similar approach so
- we have the libor pattern that match
- the coupon string we can see that's a
- match
- um so that's the match and then if match
- um and we got to add our group in here
- so here the group is is going to be
- around the number again
- um so if match
- then the percentage
- is going to be the float of match.group
- one
- and then let's just say that libor
- was one percent
- um then we're going to
- return
- the library rate
- times 100 because this is in whole
- percentage points
- plus the percentage and
- not quite returning yet we're not
- writing our function
- um so this would be the
- you know the overall rate and then we
- can look at that rate
- libor just libor
- and we get 1.75 so that appears to be
- working correctly
- so let's uh wrap that into a function
- so standardize
- loading coupons
- takes the coupon spring as well
- as libor um
- and it does this and
- you know just in case there are just
- normal rates in there
- we can throw this in there
- and we'll also throw our catch for
- missing values
- as well
- okay so now
- we should be able to take our floating
- rate column
- and apply standardized floating coupons
- and pass the libor rate
- and it seems it did not work
- appropriately um for some reason
- our string was getting through without a
- match oh because we never returned
- okay so then we want to return the rate
- and
- now it seems to be working uh let's just
- drop an a in there so we can see the two
- yeah they're both live or plus
- uh 0.75 and so that's correct if we put
- a
- two percent rate now we're getting 2.75
- percent so
- everything looks correct
- um so we can now add this into our
- cleaning approach
- so yeah floating rate
- he pulls that
- um so now this is starting to get
- fairly cleaned up we have the coupon and
- the floating rates
- both handled well
- now we want to um
- combine the two um
- so we can create a coupon column
- so that's going to be equal to
- the coupon base rate column
- and then we can say well wherever that's
- missing we can fill it in
- with the floating rate column
- so now we have the coupon right here um
- and it should be taking the values um
- yeah so i got the rose or floating rates
- not missing and we can see
- it is indeed getting filled in for the
- coupon um
- so everything is working appropriately
- um and then in order to do calculations
- with it we want to get it to decimal
- so df coupon equals df coupon
- divided by 100.
- and now that looks good
- so now the coupons are cleaned up and we
- just have the
- maturities to deal with now
- um let me just get all this group
- together
- and uh bring this up with my other
- functions that are now
- working well
- okay um
- [Music]
- so now we've got to deal with the
- maturities
- um and so with the maturities uh we
- talked about
- how um well
- one we got to get it to date time
- instead of just a string is that how it
- is now let me look at the data types
- maturities object so yes it's coming as
- a string
- um and not a date time
- um and then we've got to deal with this
- year
- issue as well um
- so let's see
- um what is the column name
- so maturity
- um
- so
- trying to look for
- where one of these years
- so let's grab 20 that's where we have a
- year
- so what is the type of that
- that's actually coming as an integer
- and then one that's a date
- is actually that is actually coming as a
- date time not as a string
- so
- we want to let these basically
- pass through as they are they're already
- in date formats that's what we want
- um but if we're getting an integer that
- means we have just the year
- and then we've got to convert it um
- so we can use pandas's two date time to
- do that conversion
- for us um so if we just say you know
- december 31st
- of may 20 it will be able to take that
- and turn it into
- a timestamp which will work for the
- daytime purposes
- and so if we you know just have our year
- then we can just format the string to
- drop in the year
- and it will do that conversion for us
- so we can do that along with some type
- checking
- in order to standardize the maturities
- so um you know if
- is instance that's a way we can check
- the type
- of um and here let's make this the more
- general maturity because we're not going
- to
- um know coming in whether it's
- a year or a date time so if it's an
- integer
- then um that's going to be our result
- um and
- otherwise basically it's passing through
- let's just go ahead and make the
- function for that
- standardized maturity dates
- um so if it's an editor then return the
- date time of the end of the year
- there needs to be maturity there
- um if it's not
- uh
- you know i guess it could conceivably
- conceivably get loaded
- as a string so if it's not a string or
- an integer then we're just going to
- return the maturity so if it's a date
- time or if it's missing then just return
- it
- um and then basically we get here if
- it's just a regular string
- and then we can just return it
- um so now
- we give it a year that will give us the
- end of that year
- um if we give it a date time
- then well let's just throw it at the
- data frame and see
- um so df maturity
- dot apply standardized maturity dates
- and we can see this indeed has worked
- for us
- so we didn't get any errors uh we
- remember row 20
- had that year we can see now we're
- getting the end of the year for that
- um and the missing values are passing
- through just fine
- so that becomes the next step here in
- our cleanup process
- we're going to overwrite the maturity
- column with that
- [Music]
- and that has allowed it to become a date
- time type as well now that everything is
- date
- um so then we just have to
- um convert that i'll just you know put
- this over here because we're done with
- it
- then we have to convert the
- um year
- or the the maturity date into a number
- of years from today
- so i'm going to import date time and i
- already covered this
- um in a prior video
- um on on handling the market value of
- debt
- in python so i'm just going to copy the
- function that we had
- from that this uh year until from date
- so it gets today's date and then looks
- the difference between
- the date we're passing to it and today
- and ultimately gives you a number of
- years
- so then we can just apply that
- [Music]
- so we take maturity now that it's
- been all cleaned up and we apply the
- year until from date um
- then we get the number of years so we
- can add that in here
- as df maturity
- years equals that
- and now we should have that coming into
- the data frame as well
- so one last step is to determine
- whether we have enough uh information to
- be able to
- value the bond um
- so there's a few things that will make
- it invalid
- um so
- um if
- um
- uh if the coupon
- is missing um which
- here we can see the rows that have a
- missing coupon then we're not able to do
- anything
- um if the
- maturity years is missing
- that also is invalid we have no way to
- know
- the value if we don't know how long it's
- outstanding
- um and then uh
- if the maturity years is less than
- zero that means the bond is already um
- expired and so those are going to be
- invalid as well so we can
- um use loc in order to
- [Music]
- catch all these three things at once um
- so if we just
- wrap this in parentheses and then put
- this pipe
- operator for or and then we can put
- these other conditions in there as well
- wrap it in parentheses pipe
- and then if the coupon is null
- so now that should catch all three of
- those cases
- um so then we can
- use that to create a column
- which represents whether the row is
- valid
- um so yeah we can see we've got the
- negative maturities we can see we've got
- the missing coupons
- and we can see we've got the missing
- maturities all caught here in one filter
- so what we can do is we can say um
- create a validity column and just set it
- to true
- but wherever any of those things uh
- conditions are meant for it to be
- invalid then we set it to false
- so then we have this validity column
- and you can
- [Music]
- oh i accidentally had it set
- the entire row of false we want to then
- put a comma
- um and then i've just said only the
- valid column
- let me get a new data frame and then do
- this again and
- yes now it was false the rest of the row
- still stays there so we can do other
- things with it like just take it out as
- principle instead of doing an evaluation
- but now we can differentiate between the
- two so let's just create a function out
- of this
- so we can call this a sine validity
- of bonds
- and this is going to just work in place
- so we don't even have to return anything
- um i'm going to put this up there
- um and now we can just call this on the
- data frame add in
- um and now we have that in there
- as well so this is all working
- everything is all cleaned up now it's
- ready for you to go do evaluation from
- here
- but let's just wrap all these steps into
- one further function
- so we can call this load and
- clean uh statement or
- that details and that's going to take
- the file path it's also going to take
- the libor rate
- and it's going to do all these things
- and this is going to be the file path
- and it's already using a library in
- there and then we're going to return the
- data frame
- at the end so now we can just do load
- and clean.details
- on the file path and give it a library
- rate
- and we'll have the data frame all
- cleaned up and ready to go
- so now we have our code working
- to be able to do all this cleanup
- but it's kind of you know a decent
- number of lines of code here
- it's definitely not ideal to to copy
- paste this into every project and if you
- did
- and you needed to make some updates to
- it
- that could cause some issues because
- then you've got to go
- back and update it and every place that
- you've copy pasted it so really you just
- want to have it once
- and be able to use it everywhere so
- that's the perfect application
- of a python module um so
- we can create a module
- um in order to
- use this very easily in any of our
- models
- but real quick before we do that i do
- want to try
- just loading in different dead details
- to make sure it works with pfizer's
- details as well to see if anything is
- different there
- so let's give that a try
- um and there did not work so what is
- going on here
- you can see you got a key error for the
- floating rate um
- so that suggests maybe that the um
- floating rate column is not there
- but let's see let me open these up
- so we can take a look
- and we can see in the pfizer data that
- the floating rate column is there so why
- are we getting an
- error so let's go back over to python
- and we can use the built-in debugger
- in order to check this out so let's get
- back up to our code
- uh so let's look at the columns of the
- data frame
- loading rate is right there isn't that
- weird
- um so this is definitely
- a gotcha that you may face and loading
- in
- [Music]
- um data from uh
- different sources now let me um
- look at that directly oh it looks a
- little bit different now that i pull it
- out
- um it's floating backslash xaoxa zero
- rate what does that mean so uh
- it's uh you know a little outside of the
- score outside of the scope of the course
- to talk in detail about what's going on
- here but basically
- this is another way that white space can
- be represented
- in a file and so
- what looks to us like a space is being
- represented by this backslash
- xa0 so what we can do here while we're
- loading in the columns
- is just replace any of these xa zeros
- with a space and then we shouldn't have
- to worry about it
- so i'm just going to try adding that in
- here uh df.columns
- equals um column.replace
- uh this oh uh backslash xa0
- with a space or call in the update
- columns
- and hopefully that will allow it to work
- awesome
- this time it loaded it and now
- everything seems to be working properly
- here as well
- and then let's just check we can go back
- to walmart and everything is still
- working there as well
- awesome so this is ready to go um
- so now let's let's take this over and
- put it into a python module
- so that we can reuse this in multiple
- different
- models so
- we can go and create a text file
- and then i'm going to rename this file
- uh you want to pick a name without
- spaces
- so i'm going to call this cap iq debt
- and you want the file extension to be
- py that's what makes it a python module
- and then we can just uh take
- all of this that's why it was good to
- gather it all in one cell
- and put it over here um
- and now save this and so now this is
- just a text file
- or the dot py extension which is in the
- same folder
- and now let me totally
- let me just totally start a new jupiter
- notebook
- um so over here we've never
- defined anything um
- and so we don't have any of those that
- loading functions ready
- um but let's try import app iq dat
- so we do that and we can see it was
- successful
- and now when we do cap iq dot we can see
- that we have
- all these functions that we defined
- in what was previously the jupiter cell
- and we brought over to the python module
- so now we can do load and clean debt
- details um
- so df equals and we're going to do that
- on the pfizer
- um that details
- and say our libor rate let's just say
- it's two percent now
- um that's the live one
- and there we go now we have our data
- frame cleaned up ready to go
- and we don't have all that code in this
- jupyter notebook
- so now um we're able to take all this
- detail
- of all this loading cleaning process for
- the data pipeline
- and just offload that so that the reader
- of our model does not have to care
- about how all this works it happens in
- the background
- through this cap iq depth module and now
- this
- module can be reused in as many uh
- different um notebooks as you want
- and if you go and make some changes like
- we had to do
- to be able to handle the father details
- then that's going to go and apply for
- all your different models
- so maybe you decide okay well i'm going
- to add some functionality which prints
- out some summary statistics and
- totals up the principle to make sure
- that you have the right amount of
- principle
- well you just go and change it in this
- module and then
- all your different models which use this
- module are going to get that
- functionality
- so there's a lot of power in
- restructuring
- your code to use modules
- so that's an overview of how we can
- clean up
- capital iq debt details
- 100 percent automated in python so now
- it's ready to use to value
- all the bonds and calculate the total
- market value of debt
- um and we also covered how to
- take that code and offload it into a
- module such that it can be easily reused
- in multiple different projects so
- thanks for listening and see you next
- time