Data Analysis in Excel - Using Excel in Tandem With Code
Excel can speed up R and SQL workflows. Why not leverage this tool?
This is the third article in a series, “Data Analysis in Excel”. The initial articles can be found below:
Code-based analysis provides much more flexibility than Excel, provided you have the right tools to get there. Complex WHERE
statements, data exploration, and a litany of other tasks can be sped up in Excel. Today’s article, using the two examples laid out above, will walk you through Excel use-cases in tandem with R and SQL. As always, resources used in this article will be available on my Github.
Drafting WHERE Statements
My day-to-day requires close collaboration with our investment management and accounting teams, and most of their requests pertain to our income statement and balance sheet. Financial statements, by their nature, are organized by account; and in large and complex organizations, the chart of accounts is often sprawling. Querying all of our financial data is a non-starter, the dataset is simply too large to work with efficiently. We need to query the financial data where a series of conditions are met.
Let’s walk through a real life example:
The investment management team needs to calculate net cap rate1 for the most recent twelve months. To derive our net cap rate we need revenues and expenses from our income statement:
Rental Revenue
Fee Revenue
Bad Debt
Revenue Reductions
Maintenance Expenses
Turnover Expenses
Utility Expense
Insurance Expense
Property Taxes
We also need our cost basis from the balance sheet:
Purchase Price
Closing Costs
Initial Capital Expenditure
Other Capitalized Costs
We could, by hand, write a drawn out WHERE
clause with a condition for each account, or we can do this in Excel and paste the results into our SQL query. To do so, we need a couple pieces of information. I have provided the answers for our example in italics:
What is the table name (and abbreviation) we need to query? financialdata f
What is the column name where we store the accounts? AccountName
We can feed this into an Excel template. Please see the query-editor.xlsx file saved to my excel-resources repository. Feel free to modify the file as needed.
Now, we have one more thing to consider. For our income statement accounts, we want all twelve months worth of data. For our balance sheet accounts we want just the most recent period. Let’s take a look at our final query:
---Sample Net Cap Rate Query
SELECT
f.*
FROM
[SampleDatabase].[dbo].[financialdata] f
WHERE ((f.[AccountName] = 'Rental Revenue' OR
f.[AccountName] = 'Fee Revenue' OR
f.[AccountName] = 'Bad Debt' OR
f.[AccountName] = 'Revenue Reductions' OR
f.[AccountName] = 'Maintenance Expenses' OR
f.[AccountName] = 'Turnover Expenses' OR
f.[AccountName] = 'Utility Expense' OR
f.[AccountName] = 'Insurance Expense' OR
f.[AccountName] = 'Property Taxes') AND
f.[FinancialPeriod] >= '2022-03-01' AND
f.[FinancialPeriod] <= '2023-03-01') OR
((f.[AccountName] = 'Purchase Price' OR
f.[AccountName] = 'Closing Costs' OR
f.[AccountName] = 'Initial Capital Expenditure' OR
f.[AccountName] = 'Other Capitalized Costs') AND
f.[FinancialPeriod] = '2023-03-01')
As you can see, we bifurcated our income statement and balance sheet data to specify different financial periods. This query will generate an output we can use to calculate our net cap rates per investment management’s request.
Data Exploration
As you know, I love R Studio. It is intuitive, fast, and flexible. Of all the data analysis tools I have used, this IDE takes the cake. However, sometimes I just want to look at the raw data. An Excel or .csv output allows me to quickly apply filters and find out what questions I need to ask the business.
I could do this in R, and sometimes it would even be faster to do so. However, when communicating with the business, it is a lot easier to explain an Excel file. Creating Excel exports in R is easy, as I have previously discussed in my Adding Credibility article series.
Again, let’s use the net cap rate example. Investment management has a roundabout way of getting to the same result. They need me and my team to create a pipeline that will free up their time for more pressing matters. However, they know enough to be dangerous, and any variance between my results and theirs will be questioned at length.
In my experience, the best approach is to preempt these questions with a reconciliation that I send in conjunction with my deliverable. Let’s imagine we wrote the query from above to a SQL Server, and accessed the table via R. This way we can do our data analysis in a tool that is best suited for the task. At various stages in my workflow, I’d export the data to Excel using openxlsx
, and reconcile it with the investment management team’s result. Upon discovering discrepancies, we can point the business to where the disconnect is and determine the cause.
Closing Thoughts
Excel is a great tool. We can speed up manual query writing processes and communicate data reconciliations with non-code business users. The latter in particular is invaluable in moving a data asset into production. Based on where the discrepancies arise, you may need to return to the drawing board with your analysis. On the other hand, maybe you have discovered a flaw in existing methodologies or with the underlying data. By exporting this data to Excel, you can communicate the possible flaw to decision makers who can rectify the error
Alec A. Nathan is the Director of Research and Data Analytics at Sylvan Road Capital. A graduate of the University of Georgia, Alec specializes in real estate investment and data science. Please direct any inquiries to anathan@sylvanroad.com.
Net cap rate looks at our annualized Net Operating Income (NOI) and divides it by our total cost basis.
NOI is a simple measure of the profitability of revenue generated by your day-to-day business. The calculation is Net Revenue - Operating Expenses. This metric is an industry standard. All real estate businesses, and in our case SFR, are measured on NOI performance.