Retrieving Public Financial Data Using Python
Kenneth See, ACCA
Jan 04, 21
Using Alpha Vantage's free API to automate the retrieval of company financial data for analysis  *Missing Thumbnail*

Financial data on publicly traded companies are easily accessible on the World Wide Web. However, downloading financial statements and manually parsing them could get tedious, especially when you are looking at dozens of companies. In this article, I will be demonstrating how I used Python to seamlessly retrieve company financials and following which, run a DuPont analysis on the parsed data.

Requesting data from an APIAlpha Vantage Logo

I used the stock APIs provided by Alpha Vantage to request company fundamental data. There are a few financial data APIs available to use, but I found Alpha Vantage to be the simplest to work with that is free. With the following three pieces of information, you can essentially pull any publicly-available financial data for listed companies:

 

  1. API Key
    • This will be unique to your application.
  2. Function
    • The type of data you are requesting (e.g. Intraday price, Statement of Cash Flow).
    • The various functions available and their exact names to be used can be found in the Alpha Vantage documentation.
  3. Stock Ticker
    • The ticker (e.g. BABA) will be used to identify the specific company you are requesting the data on.

An important thing to note is that Alpha Vantage's free tier has a limitation of 5 API calls a minute and 500 calls a day. It should be sufficient for testing and simple use cases.

Generating an API key

The very first step to using Alpha Vantage's API is to generate a free API key. Be careful to store your API key somewhere once it is generated as only one key is created per email address.

Writing the code to make the API calls

I stored the API key in a variable called "api_key". That variable will be included in the payload to all following calls to Alpha Vantage's API. I also imported the "json" and "requests" python libraries to make my requests.

 

import json
 import requests

I then created a generic function, called "retrieve_data", that will allow me to specify which stock I want (symbol) and what kind of data I would like to request (function). The three pieces of information that I mentioned are needed to make any API call - function, symbol, and api_key - were included as my function's arguments.

 

def retrieve_data(function: str, symbol: str, api_key: str) -> dict:
    """
    Retrieves data from AlphaVantage's open API.
    Documentation located at: https://www.alphavantage.co/documentation
    """
    # query from API
    url = f'https://www.alphavantage.co/query?function={function}&symbol={symbol}&apikey={api_key}'
    response = requests.get(url)
    # read output
    data = response.text
    # parse output
    parsed = json.loads(data)
    
    return parsed
    

Alpha Vantage provides a wide variety of data, including stock time series and currency exchange rate data. In my case for conducting a DuPont analysis, I will only need balance sheet and income statement data.

From printing out the retrieved results of a company's balance sheet, I could observe that the balance sheet information I needed can be accessed by the key "annualReports".

Retrieved Balance Sheet for Apple

The value of the "annualReports" key is a list where each element is a period's balance sheet, ordered by the ending fiscal date in descending order. With this information, I could then create two functions to retrieve the latest and second latest balance sheets of a company.

 

def get_latest_balance_sheet(symbol: str, api_key: str) -> dict:
    """
    Retrieves latest balance sheet of company specified in symbol.
    """
    balance_sheets = retrieve_data('BALANCE_SHEET', symbol, api_key)['annualReports']
    latest_balance_sheet = balance_sheets[0]
    return latest_balance_sheet

def get_second_latest_balance_sheet(symbol: str, api_key: str) -> dict:
    """
    Retrieves second latest balance sheet of company specified in symbol.
    """
    balance_sheets = retrieve_data('BALANCE_SHEET', symbol, api_key)['annualReports']
    second_latest_balance_sheet = balance_sheets[1]
    return second_latest_balance_sheet

Printing the retrieved results of a company's income statement showed a similar structure, so I could use similar logic to create a function to retrieve the latest income statement.

 

def get_latest_income_statement(symbol: str, api_key: str) -> dict:
    """
    Retrieves latest income statement of company specified in symbol.
    """
    income_statements = retrieve_data('INCOME_STATEMENT', symbol, api_key)['annualReports']
    latest_income_statement = income_statements[0]
    return latest_income_statement

Using the Retrieved Data to Perform Analysis

Now that I can retrieve all the data I need to perform a DuPont analysis, I can then write the logic to calculate the ratios.

I created an analyzer python class that would establish the blueprint for the calculations required for a DuPont analysis. This includes class functions to calculate the Net Profit Margin, Asset Turnover, Equity Multiplier, and ROE.

 

class FinancialStatementAnalyzer:
    def __init__(self, balance_sheet: dict, prior_balance_sheet: dict, income_statement: dict):
        self.balance_sheet = balance_sheet
        self.prior_balance_sheet = prior_balance_sheet
        self.income_statement = income_statement
        
    def net_profit_margin(self):
        net_income = float(self.income_statement['netIncome'])
        revenue = float(self.income_statement['totalRevenue'])
        return round(net_income/revenue, 2)
    
    def asset_turnover(self):
        revenue = float(self.income_statement['totalRevenue'])
        average_total_assets = (float(self.balance_sheet['totalAssets']) + 
                                float(self.prior_balance_sheet['totalAssets']))/2
        return round(revenue/average_total_assets, 2)
    
    def equity_multiplier(self):
        average_total_assets = (float(self.balance_sheet['totalAssets']) + 
                                float(self.prior_balance_sheet['totalAssets']))/2
        average_shareholders_equity = (float(self.balance_sheet['totalShareholderEquity']) + 
                                       float(self.prior_balance_sheet['totalShareholderEquity']))/2
        return round(average_total_assets/average_shareholders_equity, 2)
    
    def roe(self):
        net_income = float(self.income_statement['netIncome'])
        average_shareholders_equity = (float(self.balance_sheet['totalShareholderEquity']) + 
                                       float(self.prior_balance_sheet['totalShareholderEquity']))/2
        return round(net_income/average_shareholders_equity, 2)

Notice that each class function is essentially looking for a specific data point in either of the balance sheets or the income statement and then using these data points to calculate the corresponding ratio.


Putting it all together

At this point, I have both the code to retrieve the data and the code to analyze the data. Now, I am ready to put them all together to analyze a small portfolio of stocks.

I created a dummy portfolio consisting of the following stocks:

 

Company Ticker
Zoom ZM
Facebook FB
Google (Alphabet) GOOGL
Alibaba BABA
Exxon Mobil XOM
General Electric GE
Lockheed Martin LMT
Eli Lilly and Company LLY
Procter & Gamble PG

In a loop, I then retrieved each stock's financials, put them through the analyzer class, and printed out the DuPont analysis ratios.

 

import time

stocks = ['ZM', 'FB', 'GOOGL', 'BABA', 'XOM', 'GE', 'LMT', 'LLY', 'PG', ]

# simple dupont analysis
print('DuPont Analysis')
for stock in stocks:
    latest_bs = get_latest_balance_sheet(stock, api_key)
    prior_bs = get_second_latest_balance_sheet(stock, api_key)
    latest_is = get_latest_income_statement(stock, api_key)
    analyzer = FinancialStatementAnalyzer(latest_bs, prior_bs, latest_is)
    
    # print ROE and DuPont analysis ratios
    print(f'{stock}\tProfit Margin: {analyzer.net_profit_margin()}\tAsset Turnover: {analyzer.asset_turnover()}\t\
    Equity Multiplier: {analyzer.equity_multiplier()}\tROE: {analyzer.roe()}')
    time.sleep(60)

I had to use the time.sleep() function to intentionally delay the API calls between the requests for each company to avoid hitting the limit of 5 calls per minute. The output of this code block is a nice simple breakdown of each company's ROE.

DuPont analysis

Click here if you'd like to download the source code for free or watch a video covering this tutorial.