A Q&A with Christian Lemp, Director of Data Engineering at Arbol

September 7, 2023

Welcome to the first in a series of conversations with the people at Arbol who are helping to invigorate the risk management sector.

To kick things off, here is a conversation with Christian Lemp who leads Arbol’s Data team. Christian has 7+ years in data engineering, analytics, and operations research in the risk management and insurance industry. Prior to joining Arbol, he led teams at Travelers and The Hartford focused on automating internal processes like claims, business life cycles, and customer journeys, then founded a blockchain graph analytics company. He is passionate about complex systems research and is working toward completing a PhD in Systems Science. He lives in Santa Fe, NM with his wife, child, and Weimaraner – all of whom love hiking and being outdoors.

Q: What’s the most impactful project or initiative you’ve worked on at Arbol?
Establishing a vision for the team and setting up the internal systems to accomplish big goals. My position was created because Arbol had matured beyond a small startup into a growth company, and the Data team needed to organize to match that growth. The vision is clear: accelerate the growth of a competitive insurance and derivatives business, support a data-driven culture at Arbol, be model citizens of dClimate, and create a culture of accountability, diversity, and career opportunities on the Data team.

This vision laid the foundation for a number of ambitious projects, from re-architecting our climate data warehouse, to creating a Business Intelligence function, to establishing better-defined career tracks for the team. The engineers at Arbol are truly extraordinary and I’m happy I got to join the team and work with them.
Q: How is Arbol's tech stack uniquely positioned to address climate risks?
Arbol uses a unique combination of traditional cloud and cutting-edge web3 technologies. This hybrid approach allows us to offer highly scalable and efficient data solutions for climate risk forecasting, specifically tailored for our Risk and Pricing teams. Additionally, our infrastructure is designed to be transparent and well-structured, making it ideal for blockchain applications such as dRe Lifecycle, which is managed by our Office of Innovation.

Moreover, we have open sourced some of our internal tools, gridded-etl-tools and nettle, which transform climate data from a variety of non-standard formats into standardized climate datasets optimized for analysis and building applications.
Q: How does Arbol use machine learning and artificial intelligence to assess risk and price policies?
Arbol’s risk and pricing engine uses large amounts of climate data to identify key trends and hidden patterns with machine learning and AI algorithms. The Data team supports this work by ensuring our climate datasets are accurate, complete, and optimized for high performance analytics. The trends these models discover give Arbol a perspective on future weather events, which is incorporated into pricing and portfolio risk management.
Q: Would you describe some of the challenges you’ve faced developing Arbol's tech stack and how they were overcome?
One challenge the team recently faced was how to serve very large gridded datasets of climate activity - which can be up to 5 terabytes in size - efficiently over IPFS, one of our core data protocols. The team ended up combining several technologies and standards - Zarr and IPLD - into a solution that uses specialized data formats which integrate very well with the IPFS distributed storage protocol. The result was a 90% improvement in query performance in some cases. You can read more about this work in the blog post Introducing Zarrchitecture on dClimate, or you can check out our open source toolkit, which bundles this custom solution into an open package for climate data developers.
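As a rough illustration of why chunked array formats such as Zarr help with this kind of query, the sketch below counts how many chunks a small query has to fetch from a large grid. It is not Arbol's or dClimate's code: the array shape, the chunk shape, and the helper names `chunks_for_query` and `total_chunks` are all assumptions made for the example, and it uses only the Python standard library rather than the Zarr or IPFS APIs.

```python
from itertools import product
from math import ceil


def chunks_for_query(query, chunk_shape):
    """Return the grid index of every chunk that overlaps a query.

    query: one half-open (start, stop) index range per dimension.
    chunk_shape: number of elements per chunk along each dimension.
    """
    per_dim = []
    for (start, stop), size in zip(query, chunk_shape):
        first = start // size        # chunk holding the first requested index
        last = (stop - 1) // size    # chunk holding the last requested index
        per_dim.append(range(first, last + 1))
    return list(product(*per_dim))


def total_chunks(array_shape, chunk_shape):
    """Count the chunks needed to tile the whole array."""
    count = 1
    for length, size in zip(array_shape, chunk_shape):
        count *= ceil(length / size)
    return count


# Hypothetical daily global grid laid out as (time, latitude, longitude).
array_shape = (3650, 720, 1440)
chunk_shape = (365, 90, 90)

# Ten years of values for a single grid cell.
query = [(0, 3650), (400, 401), (1000, 1001)]

needed = chunks_for_query(query, chunk_shape)
print(f"{len(needed)} of {total_chunks(array_shape, chunk_shape)} chunks fetched")
```

Under these assumed shapes, a reader only needs the handful of chunks that overlap the request rather than the whole dataset, which is the general property that makes chunked, content-addressed storage workable for multi-terabyte grids.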
Q: How does Arbol's platform integrate with other systems used by clients to manage risk, such as weather monitoring or satellite data?
Arbol has partnered with the RiskStream Collaborative, a network of insurers collaborating on blockchain solutions for risk innovation. Arbol’s dRe Lifecycle dashboard application provides real time parametric loss calculations to participants using smart contracts hosted on the Canopy network. The weather data my team maintains on IPFS is used in smart contracts to send updates based on real world events. This use of IPFS allows for external data to be shared and stored in a secure manner with immutability guarantees.
Q: What is one complex data problem you’ve solved that you think our audience would benefit from learning about?
Managing a variety of climate data for different uses - research, pricing, applications - is a complex problem in itself. The sheer volume and variety of data is a challenge: we currently maintain over 70 terabytes of data, with some single datasets being more than one terabyte. The size of this data requires us to use specialized tools (and even build our own) to process and transform raw data into a format optimized for high power analytics. Additionally, we receive station data from multiple sources and have had to develop our own in-house schema to standardize these formats so that we can create custom, blended datasets for climate analysis that spans the entire earth.
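To make the idea of a standardizing schema concrete, here is a minimal sketch of mapping two differently shaped station feeds onto one common record type. Everything in it is hypothetical: the `Observation` fields, the two source layouts, and the function names are invented for illustration and do not describe Arbol's actual in-house schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical common record that every source is mapped onto."""
    station_id: str
    day: date
    variable: str
    value: float
    unit: str


def from_source_a(row: dict) -> Observation:
    # Assumed layout: compact YYYYMMDD date, temperature in Fahrenheit.
    d = row["DATE"]
    celsius = (float(row["TMAX_F"]) - 32.0) * 5.0 / 9.0
    return Observation(
        station_id=f"A:{row['STN']}",
        day=date(int(d[:4]), int(d[4:6]), int(d[6:8])),
        variable="tmax",
        value=round(celsius, 2),
        unit="degC",
    )


def from_source_b(row: dict) -> Observation:
    # Assumed layout: ISO date, temperature in tenths of a degree Celsius.
    return Observation(
        station_id=f"B:{row['id']}",
        day=date.fromisoformat(row["date"]),
        variable="tmax",
        value=round(row["tmax_tenths_c"] / 10.0, 2),
        unit="degC",
    )


# Two raw rows describing the same kind of measurement in different shapes.
obs_a = from_source_a({"STN": "72365", "DATE": "20230907", "TMAX_F": "86.0"})
obs_b = from_source_b({"id": "X1", "date": "2023-09-07", "tmax_tenths_c": 300})

# Once normalized, records from both sources can be blended and compared.
blended = sorted([obs_a, obs_b], key=lambda o: (o.day, o.station_id))
```

Keeping the source in the station identifier prefix is one possible design choice; it preserves provenance after records from different feeds are merged into a single dataset.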
Q: What industry trends or technologies have recently caught your attention?
I still see big opportunities for blockchain technology as a tool to create efficiencies and trust within complex multi-party transactions. Looking beyond the consumer hype that has dominated the crypto space over the past few years, I’ve seen a number of small, private blockchain networks developed by consortia of institutions that all rely on each other to complete transactions - like reinsurance, collateralized securities, or carbon markets.

Then, of course, the deployment of large language model (LLM) AI is a huge step forward for discovering and synthesizing information based on collective knowledge. Again, I see an opportunity to build bespoke applications with AI trained on specific domain knowledge to create custom search and discovery tooling.

And generally, I find any creative visualization of complex systems (like weather patterns) that tells a story and illustrates how interactions of individual components give rise to macro scenarios interesting for research and analysis. Ultimately these micro conditions give rise to the events that Arbol helps our customers and partners protect against.
Q: Looking ahead, what new technologies or capabilities is Arbol exploring to further improve its platform and better serve clients?
Currently, Arbol is working closely with the dClimate team to create developer tools capable of sophisticated climate analytics, and to make source climate data accessible over IPFS. As part of the process, we are reworking our internal codebase – which we’ve developed over the past three years – as well as releasing bundles of open source microservice toolkits.

Looking ahead, we’re focused on AI in a couple of different areas:
  1. Building extremely high powered computing environments to analyze climate patterns across the entire planet for many variables to tune and simulate scenarios for our risk and pricing models.
  2. Using LLMs like ChatGPT and similar technologies to help identify exactly which data source is best suited for a particular climate risk scenario.

To keep abreast of all of the exciting initiatives underway at Arbol, we invite you to follow us on LinkedIn and X.

