Data Visualization at Orchard
A picture is worth a thousand words, but what if that picture contains a thousand data points? What if it contains one hundred thousand? Data visualization combines science and art. Much like the editing of a photograph or the post-production of an audio-visual work, a thoughtful and compelling visualization of a dataset can capture and convey the insight generated by an analysis.
At Orchard, the graphical presentation of data plays a big part in what we do: in our product, on our blog, and in the analytical research used by our clients. This week, we are releasing a major update to the software our investor clients use to manage and analyze their marketplace lending portfolios. Included in this update are new graphical capabilities for investors – the most advanced in Orchard’s history. While the product itself is only available to Orchard clients, we wanted to take this occasion to share Orchard’s approach to data visualization in general.
- Clarity: An effective graph is self-explanatory, easy to read, and adds value to the presentation of data. Overly cluttered graphs, or the use of visual elements that don’t convey meaning, will detract from the clarity of the presentation.
- Right Graph For The Job: There are many types of graphs, ranging from the simple (e.g. bar charts, line charts) to the complex (e.g. bubble charts, force-directed graphs, chord diagrams, word clouds). Think carefully about which type of graph will most effectively present the data you want to emphasize, as well as whether the additional complexity of certain graph types will enhance or detract from the presentation.
- Emphasis and Persuasiveness: An effective graph should emphasize the point you are trying to make and assist in telling the right story from the data.
- Orchard Product: The Orchard investor portal, particularly in its newest incarnation, makes ample use of various types of graphs, the majority of which are highly customizable and responsive to user interaction. Graphs are chosen and built based on what would most effectively convey information to Orchard’s clients.
- Orchard Blog: The Orchard Blog has always been home to a wide variety of analysis, generally supported by many graphs. While many points can be effectively made with a bar chart or line chart, more complex analyses have called for violin plots, heatmaps, word clouds, geographic maps, and jitter plots. Our post on the Top Data Visualizations of 2014 is the most popular Orchard Blog post of all time.
- Client-Focused Analytics: Orchard captures a massive amount of data, and our analytics team uses this information to conduct analysis that will benefit our clients. These analyses often use graphs to show tradeoffs between risk, return, and volume, as well as many other factors.
- Technical Operations: If you visit the Orchard office, you’ll see plenty of monitors showing dashboards packed with graphs. These dashboards are used by our technical operations team to monitor our trading systems and other core infrastructure. Graphs here are selected based on their ability to quickly expose a problem if it should arise.
Technologies Used in Visualization
Data visualization is not new, and the desire to add an aesthetic dimension to numbers has existed for a long time. Of course, the technology available to those working with data has had a great effect on the visualizations that are possible.
Below, we see a graph published in 1821 by William Playfair (1759-1823), whose work on statistical graphics has remained relevant nearly 200 years after his death.
Next, we see a graph made in VisiCalc (an early spreadsheet program) on the Apple IIe in 1981.
Below, we see an incredibly detailed interactive visualization of the 2013 United States federal budget. The image below is just a screenshot, so be sure to click here to play with the data yourself.
Luckily, we find ourselves at a time in history when we have both ample computing power and a whole constellation of graphics libraries to give us great flexibility in visualizing data. Here are a few technologies we use at Orchard:
- R: The “R” statistical programming language has gained impressive popularity among statisticians and data scientists across many disciplines. R is open-source and enjoys the support of a very large global community. R is used extensively at Orchard, including in model development, investment strategy development, internal analytics, performance tracking, and blog-focused research.
- ggplot2: This is the name of an R package that is so useful as to warrant its own mention. Built by Hadley Wickham, this bit of software allows R programmers to create incredibly compelling and customizable graphs. Nearly every chart shown on the Orchard Blog is built in R using the ggplot2 package. ggplot2 allows us to create amazingly detailed graphs with hundreds of thousands of data points, such as the one below:
The availability and transparency of data are major factors in the advancement of marketplace lending. The ability to capture, analyze, and display these data has contributed significantly to the industry and its growth. We look forward to bringing you more information on how we work with data at Orchard in future blog posts.