Jiahao Chen is a Vice President and Research Lead at JPMorgan AI Research in New York, with research focusing on explainability and fairness in machine learning, as well as semantic knowledge management. He was previously a Senior Manager of Data Science at Capital One focusing on machine learning research for credit analytics and retail operations.
While in academia, Jiahao was a Research Scientist at MIT CSAIL, where he co-founded and led the Julia Lab, focusing on applications of the Julia programming language to data science, scientific computing, and machine learning. Jiahao organized JuliaCon, the annual Julia conference, from 2014 to 2016, and has organized workshops at NeurIPS, SIAM CSE, and the American Chemical Society National Meetings. Jiahao has authored over 120 packages for numerical computation, data science, and machine learning in the Julia programming language, in addition to numerous contributions to the base language itself.
PhD in Chemical Physics / Computational Science & Engineering, 2009
University of Illinois at Urbana-Champaign
MS in Applied Mathematics, 2008
University of Illinois at Urbana-Champaign
BS in Chemistry, 2002
University of Illinois at Urbana-Champaign
ELI5: I led a team studying how we can use machine learning fairly, to improve customer service and experience, and change banking for good.
Technical Lead for the Banking on Explainable AI Research Group within Card Machine Learning. Compliance analytics for fair lending, natural language processing for customer service analytics, and customer segmentation.
ELI5: I started and ran a research lab to prove that the Julia programming language was useful for big data and data science work.
Started and managed the Julia Lab together with Professor Alan Edelman, which provided the main academic funding for the development, growth, and adoption of the Julia programming language. Applied Julia to problems in high performance computing, computational genomics, and statistical computing. The lab comprised 16 students and postdocs at its peak.
ELI5: I coded up a model for studying how drug molecules dissolve in water, and shipped it in commercial software.
Productionized and shipped the 1D- and 3D-RISM (Reference Interaction Site Model) codes that are now available in Accelrys Discovery Studio.
ELI5: I made new computer models of how molecules trap light and turn it into electricity. I used these models to study new materials for OLEDs and solar cells.
Computational chemistry research on organic semiconductors, using new techniques of random matrix theory blended with new molecular models for describing atomic charge excitations and transfer.
ELI5: I made new materials that would distort light and/or blow up, and shoot them with lasers to see how they would react.
Synthesized and characterized novel materials for nonlinear optics and energetic materials (explosives), with organic and inorganic chemical synthesis techniques and nonlinear laser spectroscopy.
I am a core contributor to the Julia programming language. In addition to starting and running the Julia Lab at MIT CSAIL from …
Assessing the fairness of a decision-making system with respect to a protected class, such as gender or race, is challenging when class membership labels are unavailable. Probabilistic models for predicting the protected class based on observable proxies, such as surname and geolocation for race, are sometimes used to impute these missing labels for compliance assessments. Empirically, these methods are observed to exaggerate disparities, but the reason why is unknown. In this paper, we decompose the biases in estimating outcome disparity via threshold-based imputation into multiple interpretable bias sources, allowing us to explain when over- or underestimation occurs. We also propose an alternative weighted estimator that uses soft classification, and show that its bias arises simply from the conditional covariance of the outcome with the true class membership. Finally, we illustrate our results with numerical simulations and a public dataset of mortgage applications, using geolocation as a proxy for race. We confirm that the bias of threshold-based imputation is generally upward, but its magnitude varies strongly with the threshold chosen. Our new weighted estimator tends to have a negative bias that is much simpler to analyze and reason about.
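The two estimators the abstract contrasts can be sketched on simulated data. The following toy example (all distributions and parameters are made up for illustration and are not from the paper) computes an outcome disparity three ways: with the true labels, with hard labels imputed by thresholding a proxy probability, and with the proxy probabilities used as soft weights:

```python
import numpy as np

# Toy simulation (hypothetical parameters, not from the paper):
# compare threshold-based imputation with a weighted (soft) estimator
# of an outcome disparity when true class labels are unobserved.
rng = np.random.default_rng(0)
n = 100_000

# True protected-class membership (unobserved in practice)
a = rng.binomial(1, 0.3, n)

# Proxy probability p_i ~ P(A=1 | proxy), e.g. derived from geolocation
p = np.clip(np.where(a == 1, 0.75, 0.25) + rng.normal(0.0, 0.15, n),
            0.01, 0.99)

# Binary outcome whose rate differs by true class
y = rng.binomial(1, np.where(a == 1, 0.4, 0.6))

# Ground-truth disparity (requires the true labels)
disp_true = y[a == 1].mean() - y[a == 0].mean()

# Threshold-based imputation: hard-assign class by thresholding the proxy
a_hat = p > 0.5
disp_thresh = y[a_hat].mean() - y[~a_hat].mean()

# Weighted estimator: use the proxy probabilities as soft weights
disp_weighted = ((p * y).sum() / p.sum()
                 - ((1 - p) * y).sum() / (1 - p).sum())

print(f"true: {disp_true:.3f}  "
      f"threshold: {disp_thresh:.3f}  "
      f"weighted: {disp_weighted:.3f}")
```

In this simplistic symmetric-noise setting both proxy-based estimates shrink the true disparity toward zero; the paper's decomposition explains the realistic conditions under which threshold-based imputation instead overstates it.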
The financial services industry has unique explainability and fairness challenges arising from compliance and ethical considerations in credit decisioning. These challenges complicate the use of modern machine learning and artificial intelligence methods in business decision processes.