By Josh Panknin | December 25, 2020

In a previous article, we discussed the current state of data in real estate and outlined some steps firms can take to achieve better data integration and analysis capabilities. In this article we’ll focus on one of those steps specifically: how to work with third-party data providers to supplement your internal data.

To start, though, we’ll briefly discuss how to identify what data you need and how to get it. Getting it could mean obtaining the data from third-party providers, or it could mean developing your own internal resources. The conversations you have with data providers need to be framed in terms of what you’re paying versus what you’re getting.

Identifying the Data You Need

The first and most important step in obtaining data is identifying what you need in the first place. This will serve two purposes:

  1. It clarifies what you have and what you’re missing. The key is a deep analysis that works backwards from the capabilities you want, or the problems you need to solve, to the data needed to achieve the desired outcome. You’ll not only have a better idea of what data is still needed, but you’ll likely develop a much better understanding of the process and workflow you’ll need to implement with the additional data. You’ll often find that you have some pieces of the data you need for the objective but are missing others. Becoming clear on which of your existing data is high enough quality for more advanced analytics, and what additional data you need, will guide where you look to obtain it.
  2. It keeps you from buying data you don’t need. We’ve all heard the stories of people going to Target to buy toothpaste and coming out having spent $250. By working backwards from use case to data needed, you can avoid the “shiny object” syndrome that data providers may present you with. When firms spend money and time on data that they can’t or shouldn’t immediately use, frustration and discouragement often start to creep into the technology development process, and this can kill any positive movement towards better analytics.

Working with Data Providers

Before we start, a quick caveat: this is not an article on data science or artificial intelligence, so we’ve made an attempt to provide enough information without going into topics that those without a technology background may not be familiar with. Also, there are likely hundreds of questions that need to be asked of data providers, but those below should be a good place to start the process.

Geography covered

Some data providers offer data only on select geographical areas, such as major metropolitan areas, and do not offer data on smaller areas. If the coverage of the data provider matches the geography of your interest, then it could be a good fit. If the data only covers a portion of the area of your interest, you need to think very carefully about how much benefit you’ll get from the partial coverage and whether or not partial coverage will present any challenges to smooth operations and analytics across your entire business.
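For readers who want to put a number on partial coverage, here is a minimal Python sketch. It assumes you can list your markets of interest and the provider’s covered markets as plain names; the function name `coverage_share` and the sample markets are hypothetical and for illustration only.

```python
def coverage_share(markets_of_interest, provider_markets):
    """Return (share of your markets covered, sorted list of uncovered markets)."""
    # Normalize names so "New York" and "new york " compare as equal.
    wanted = {m.strip().lower() for m in markets_of_interest}
    offered = {m.strip().lower() for m in provider_markets}
    if not wanted:
        raise ValueError("markets_of_interest is empty")
    missing = sorted(wanted - offered)
    share = 1 - len(missing) / len(wanted)
    return share, missing


# Hypothetical inputs, not real provider data.
my_markets = ["New York", "Austin", "Boise", "Tulsa"]
provider_list = ["new york", "Austin", "Chicago"]

share, missing = coverage_share(my_markets, provider_list)
print(f"{share:.0%} of markets covered; not covered: {missing}")
```

A simple count treats every market as equally important. In practice you may want to weight each market by something like the share of your portfolio located there.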

Cleanliness of data

Real estate data is notoriously dirty (meaning values are missing, data is in the wrong columns, and there are formatting errors, misspellings, etc.) and difficult to work with. If the data coming from the provider is still dirty and requires significant effort on the part of your firm to clean it and make it available for analysis, then you’ll be spending a lot more time and money than you probably planned to. In many cases this may be unavoidable, as it’s difficult to find and correct many of the idiosyncrasies of real estate data without knowledge of a specific property. If, however, the data is dirty because the provider is not taking additional steps to clean it, you may want to explore other providers with data that is further along in the preparation process.
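One way to gauge how dirty a provider’s sample file is before you commit is to count the obvious problems field by field. The sketch below is a rough first pass, not a full data-quality audit. It assumes the sample has already been loaded as a list of dictionaries (for example with Python’s `csv.DictReader`), and the field names and sample rows are hypothetical.

```python
def profile_rows(rows, numeric_fields):
    """Count missing values per field and non-numeric values in numeric fields."""
    report = {}
    for row in rows:
        for field, value in row.items():
            stats = report.setdefault(field, {"missing": 0, "not_numeric": 0})
            text = "" if value is None else str(value).strip()
            if text == "":
                stats["missing"] += 1
            elif field in numeric_fields:
                try:
                    # Allow thousands separators such as "10,500".
                    float(text.replace(",", ""))
                except ValueError:
                    # Text in a numeric field often signals data in the wrong column.
                    stats["not_numeric"] += 1
    return report


# Hypothetical sample rows, not real provider data.
sample = [
    {"address": "12 Main St", "sq_ft": "10,500", "year_built": "1987"},
    {"address": "", "sq_ft": "N/A", "year_built": "1999"},
    {"address": "9 Elm Ave", "sq_ft": None, "year_built": "Office"},
]

report = profile_rows(sample, numeric_fields={"sq_ft", "year_built"})
for field, stats in report.items():
    print(field, stats)
```

Counts like these will not catch misspellings or subtly wrong values, but they give you a concrete starting point for the conversation with a provider about how much cleaning is left to do.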

Availability of data

There are many different ways that data gets from one place to another. Some providers require that you access information by individually extracting each set of information you want. Others provide files that must be manually downloaded, with additional steps to follow to integrate the data into your system. Still others offer highly structured APIs (application programming interfaces) that allow seamless communication between one system and another. Depending on your use case and internal technical capabilities, you’ll need to decide which of these options best fits your needs.

Where they get their data

Much of the data available on real estate is from public sources. You want to be sure you’re not paying a data provider for data that you could easily collect and integrate on your own. As discussed above, however, many times this data is very dirty and the data provider spends a significant amount of time fixing this data. Cleaning the data often requires data scientists with specific skills in data collection and cleaning, so it may be worth it to go through a provider even for publicly available data. Other times it may be more cost-effective to explore collecting and integrating the data on your own through either internal hires or consultants with expertise in data collection. This will require a cost analysis of each option.

Format of the data

When it’s clear that data will be combined with other data, either your own internal data or external data from multiple third-party providers, you need to make sure that the format of all the data is consistent. For example, if your internal data is organized by census tract and the data provider’s is organized by zip code, you may have difficulty merging the two data sets effectively and accurately. Another formatting issue is the time interval of collection. If all of your data is aggregated monthly, but a data provider only provides quarterly aggregated data, it might not suit your purposes. Ensuring consistency of geography, periodicity (time interval), and other formats is key to a smooth transition from external provider to internal use.
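To make the periodicity issue concrete: monthly figures can be rolled up to quarters to match a quarterly provider, but quarterly figures cannot be split back into months without making assumptions. The Python sketch below shows the roll-up direction only. It assumes the values are counts or dollar totals that can be summed (rates and averages need different handling), and the function name and sample figures are hypothetical.

```python
from collections import defaultdict


def monthly_to_quarterly(monthly):
    """Sum monthly values into quarters, keeping only quarters with all 3 months."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), value in monthly.items():
        quarter = (year, (month - 1) // 3 + 1)  # months 1-3 -> Q1, 4-6 -> Q2, ...
        totals[quarter] += value
        counts[quarter] += 1
    # Drop partial quarters so an incomplete period is not mistaken for a low total.
    return {q: totals[q] for q in sorted(totals) if counts[q] == 3}


# Hypothetical monthly lease counts keyed by (year, month).
monthly_leases = {(2020, 1): 10, (2020, 2): 12, (2020, 3): 8, (2020, 4): 5}

quarterly = monthly_to_quarterly(monthly_leases)
print(quarterly)
```

Rolling up this way lets the two data sets line up, at the cost of the monthly detail you already had. Whether that trade-off is acceptable depends on the use case you identified at the start.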


After reviewing all the factors above, you finally need to decide whether the price of the data from the provider is worth what you’re getting. Paying for dirty, incomplete data for the sake of having data hurts your firm more than it will help. Paying too much for data that provides only marginal benefit will also hamper your technology development efforts and eventually will become a risk to the entire digital transformation process if it leads to discouragement in the firm. Understand what you’re getting and what it’s worth to your firm. Then decide if that benefit is worth the cost. Note: this may involve bringing in outside expertise to evaluate some of the more technical aspects of a data provider transaction if your firm does not have the expertise internally.

Data providers are unlikely to understand your business or goals as well as you do, so they probably won’t be able to vet what data is right for you. Only by understanding the nuances of your needs and the data available will you be able to effectively choose the right provider.

What to Expect from Data Providers

Throughout the vetting process, providers should be willing and able to answer your questions directly, show you examples and formats of data, and be as transparent as possible. This doesn’t mean exposing trade secrets, but they should be extremely open about how their data and processes will help your company. They should also be transparent about what challenges you may experience with their data or what shortfalls their data has relative to your needs. If questions are not openly addressed or the process becomes difficult, treat it as a red flag for how the relationship, and the usefulness of the data, could evolve.

Working with as few data providers as possible

One final note on choosing data providers: it is usually best to partner with as few as possible. All of the factors above (geographic coverage, availability, cleanliness, format, etc.) have to be dealt with for every single data provider you use. Eventually, firms begin having difficulty integrating each provider’s data with their own data and with every other provider’s data. This can quickly become a mess and stall any forward movement you could have made towards implementing more advanced analysis in your firm.


Firms need to either adapt or face the risks associated with stagnation, but adapting ineffectively is just as dangerous as not adapting at all. Answering questions on strategy and backing into what you need to accomplish that strategy is the first step in the process. Choosing appropriate partners along the way can make or break the potential of your firm in the future. By thinking more deeply about the topics above, hopefully you’ll be able to make decisions that better position your company for success.

About the Author

Josh Panknin is a Visiting Assistant Professor of Real Estate at New York University’s Schack Institute of Real Estate and an adjunct professor in the school of engineering at Columbia University. Prior to academics, Josh was Head of Credit Modeling and Analytics at Deutsche Bank’s secondary CMBS trading desk where he helped develop and implement automated models for valuing CMBS loans and bonds. He also spent time at the Ackman-Ziff Real Estate Group and in various other roles in research, acquisitions, and redevelopment. Josh has a master’s degree in finance from San Diego State University and a master’s degree in real estate finance from New York University’s Schack Institute of Real Estate.

